Scientists at Istanbul Technical University (İTÜ) have developed software that uses artificial intelligence to identify errors made by foreigners in Turkish.
A project led by professor Gülşen Eryiğit, head of the Department of Artificial Intelligence and Data Engineering at İTÜ, and the Natural Language Processing Research Group aims to detect and correct Turkish language errors made by foreigners using artificial intelligence. Eryiğit and her team developed the AI-powered software behind it.
Eryiğit told Anadolu Agency (AA) that many researchers from various fields are involved in the project under İTÜ’s Turkish Language Teaching Application and Research Center (TÖMER).
She explained that the team at the İTÜ Department of Artificial Intelligence and Data Engineering initially began developing a mobile application to facilitate the teaching of Turkish to foreigners. “Then we began identifying the needs for this project. One of the biggest gaps we found was determining the order of topics, which topics should be prioritized and which topics should be expanded for foreign students coming from different demographic backgrounds,” she said.
Emphasizing that technologies based on natural language processing and large language models are advancing rapidly, Eryiğit noted: “These models are primarily trained on specialized data for widely studied languages such as English and Chinese. Therefore, open-source models have very low success rates when applied to languages other than these main languages. Software that corrects errors at the word or grammar level in different languages does not exist for languages like Turkish.”
After conducting research, Eryiğit said the need to label the errors collected from students' data emerged. "We found that there was no standard worldwide for error labeling in language usage. Therefore, we developed a labeling standard for language errors that covers all languages. We published this internationally. This is not a standard specific to Turkish; it covers all languages," she said.
Regarding the process of detecting errors in Turkish usage by foreigners, Eryiğit said: “Our teachers use this taxonomy and these standards to continuously label data collected from foreign students. At the same time, our artificial intelligence software, which learns from our experts, automatically detects errors in students’ work. Currently, we can identify the types of errors students make. Teachers or any automatic error correction system can correct the student's mistakes. Then, the software automatically detects where in the sentence or text the error is located, and at what level it is.”
She explained that the software matches flawed texts with their corrected versions, determines the severity of errors, and automatically assigns them to error categories. Initially, this process is done manually by teachers and experts. “This way, we generate test datasets and example data for our systems. Later, automatic software is developed to start automating the process. For example, the software corrects a sentence like 'Ben kendimi nefret ediyorum' to 'Ben kendimden nefret ediyorum' using artificial intelligence. After detecting errors such as nonexistent letters or sounds in a language, sentence structure and word order, statistics from the collective data allow for changes to be made to curricula according to different demographic groups, or teachers can rapidly receive analysis on which areas their class is struggling with. This paves the way for improvement,” she said.
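As a rough illustration of what matching a flawed sentence against its correction can look like, the sketch below aligns the two versions word by word and reports where they differ. It is not the İTÜ team's method: it assumes a simple token-level diff using Python's standard difflib module, and the function name locate_edits and the output fields are hypothetical. It only locates the change and does not classify its type or severity.

```python
from difflib import SequenceMatcher


def locate_edits(learner: str, corrected: str) -> list[dict]:
    """Return word-level differences between a learner sentence and its correction.

    Hypothetical helper for illustration only; it uses a plain token diff,
    not a trained model.
    """
    src = learner.split()
    tgt = corrected.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)

    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # identical stretches carry no error
        edits.append(
            {
                "operation": tag,          # "replace", "delete" or "insert"
                "position": i1,            # word index in the learner sentence
                "learner": " ".join(src[i1:i2]),
                "corrected": " ".join(tgt[j1:j2]),
            }
        )
    return edits


# The sentence pair quoted by Eryiğit
edits = locate_edits(
    "Ben kendimi nefret ediyorum",
    "Ben kendimden nefret ediyorum",
)
print(edits)
```

For the quoted pair, the diff points to the second word, where "kendimi" was replaced by "kendimden". Deciding that this is a case-marking error would be the job of the labeling step the article describes.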
Eryiğit highlighted that the artificial intelligence software would benefit teachers. She stated that the software would provide interfaces that allow teachers to query common errors made by students from different groups. "For instance, a Syrian student may have trouble pronouncing certain sounds, or a French student may struggle with verb endings or word order. Artificial intelligence will show this to the teacher. It will analyze which demographic groups make what types of mistakes based on big data, and language learning flows can be updated accordingly. Teachers will be able to add special lessons or sessions for students who need extra support in areas of difficulty. This will lead to a major leap in the learning of Turkish," she concluded.
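The kind of query a teacher might run over labeled errors can be sketched in a few lines. Everything here is an assumption made for illustration: the records are invented placeholders rather than project data, the group names and error labels are hypothetical and do not reflect the team's published taxonomy, and the real interfaces are not described in the article.

```python
from collections import Counter, defaultdict


def errors_by_group(records):
    """Count error types per learner group from (group, error_type) pairs."""
    counts = defaultdict(Counter)
    for group, error_type in records:
        counts[group][error_type] += 1
    return counts


def most_common_error(counts, group):
    """Return the error type seen most often for one group."""
    return counts[group].most_common(1)[0][0]


# Invented placeholder records, not real student data
records = [
    ("group_a", "case_marking"),
    ("group_a", "case_marking"),
    ("group_a", "word_order"),
    ("group_b", "word_order"),
]

counts = errors_by_group(records)
print(most_common_error(counts, "group_a"))
```

A summary like this, computed over a whole class, is the sort of signal that could tell a teacher which topic deserves an extra session.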
Eryiğit also mentioned that the artificial intelligence software was developed locally by scientists at İTÜ and that the research teams include foreign researchers, numerous undergraduate and graduate students, and researchers from other universities.
She also noted that they had recently received two patents related to language learning and emphasized that interest in the Turkish language was growing and that research funding in this field should be increased.