A copy of an Iberian "document" found in the city of Ulstreet, Gerona, in northeastern Spain (Wikipedia Commons)
Researchers at the Massachusetts Institute of Technology’s Computer Science and Artificial Intelligence Laboratory (MIT CSAIL) have developed a new artificial intelligence algorithm that can automatically decode a lost language no longer understood without knowing its relationship to other languages.
Researchers developed the algorithm in response to the rapid disappearance of human languages, as most languages that once existed are no longer spoken, and half of the remaining languages are expected to perish in the next 100 years.
The new system could, according to the next web, help restore it. Most importantly, it can preserve our understanding of the cultures and wisdom of their speakers.
The algorithm works by harnessing basic principles from historical linguistics, such as the expected ways in which languages use phonemic alternatives. The researchers provided an example of a word with a “p” in the native language that would likely change to “b” in its lineage, but perhaps not to “k” due to the difference in pronunciation.
These patterns are then converted into mathematical determinants, and this allows the model to segment words from an ancient language and maps them to a related language.
The algorithm can also identify different language families. For example, their method suggested that the Iberian language had nothing to do with Basque (Basque, an ethnic group from southern Europe distinguished by the Basque language), thus supporting modern academic studies.
The project was spearheaded by MIT Professor Regina Barzilai, who last month won a $ 1 million prize from the world’s largest AI association for her pioneering work on drug development and breast cancer screening.
The researcher now wants to expand the scope of work to determine the words’ semantic meaning even if we do not know how to read them.
“For example, we may identify all references to people or locations in the document which can then be investigated further in light of known historical evidence,” Barzilai said in a statement.
She added that “these methods of identifying entities are commonly used in many word processing applications today and are very accurate, but the main research question is whether the task is feasible without any training data for the old language.”