The Cross-evaluation Crux for Computational Phylogenetic Linguistics

Afanasev I.

The growth of computational phylogenetic linguistics has made cross-evaluation of the existing
classification approaches more pressing, since these approaches are often imperfect,
whether performed by a human or a computer. We present a study of cross-evaluation
approaches for both methods (including interdisciplinary evidence against which linguistic
findings can be tested) and data (complementing traditional word lists with linguistic atlases,
surveys and databases). The research focuses on insufficient cross-evaluation, which
leads to misleading conclusions about methods. We perform a case study of cross-evaluation
misuse in computational phylogenetic research on South American languages that is
based on Levenshtein distance measurement between Swadesh list items. The conclusion
presents the prospects for implementing language outgroup comparison, a possible new
cross-evaluation method that joins method cross-evaluation and data cross-evaluation.
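
As an illustration of the kind of measurement the case study refers to, the following is a minimal sketch of a Levenshtein-based distance between two aligned word lists. It is not the chapter's own implementation: the length normalisation, the function names, and the toy word forms (invented placeholder strings, not real language data) are all assumptions made for the example.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def mean_normalised_distance(list_a: Sequence[str], list_b: Sequence[str]) -> float:
    """Average per-concept distance, each divided by the longer form's length.

    Assumes both lists are aligned by concept (item i in each list
    expresses the same meaning), as in a Swadesh-style list.
    """
    if len(list_a) != len(list_b) or not list_a:
        raise ValueError("word lists must be non-empty and of equal length")
    total = 0.0
    for word_a, word_b in zip(list_a, list_b):
        longest = max(len(word_a), len(word_b))
        total += levenshtein(word_a, word_b) / longest if longest else 0.0
    return total / len(list_a)


# Invented placeholder forms for two hypothetical lects (not real data).
lect_a = ["mano", "pira", "tata"]
lect_b = ["mana", "pira", "tutu"]
distance = mean_normalised_distance(lect_a, lect_b)
```

A single score of this kind says nothing about whether the resulting classification is right, which is the gap the abstract points to: the same pair of lects would also need to be checked with other methods and other data sources.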

Language: English
Keywords: language classification; automatic classification; Levenshtein distance; automatic language distance measurement; computational phylogenetic linguistics; cross-evaluation

In book

Digital Geography: Proceedings of the International Conference on Internet and Modern Society (IMS 2022)
Springer, 2024.
Similar publications
The application of corpus-based language distance measurement to the diatopic variation study (on the material of the Old Novgorodian birchbark letters)
Afanasev I., Lyashevskaya O., in: Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025). Tartu: University of Tartu Library, 2025. P. 153–164.
The paper presents a computer-assisted exploration of a set of texts, where qualitative analysis complements the linguistically-aware vector-based language distance measurements, interpreting them through close reading and thus proving or disproving their conclusions. It proposes using a method designed for small raw corpora to explore the individual, chronological, and gender-based differences within an extinct single ...
Added: July 17, 2025
Basic vocabulary of Yupik languages: a lexicostatistical analysis
Yuri B. Koryakov, Journal of Language Relationship 2024 Vol. 22 No. 3–4 P. 296–341
This article presents a lexicostatistical classification of Yupik languages included in the Eskaleut family, using 110-word lists as the basis for comparison. The study aims to refine and expand upon previous lexicostatistical work on Yupik languages, focusing on semantic clarifications and contextual considerations in compiling the word lists. The study includes new data from recent ...
Added: March 7, 2025
String Similarity Measures for Evaluating the Lemmatisation in Old Church Slavonic
Afanasev I., Lyashevskaya O., in: Structuring Lexical Data and Digitising Dictionaries: Grammatical Theory, Language Processing and Databases in Historical Linguistics. Boston, Leiden: Brill, 2024. P. 13–35.
Added: January 7, 2025
Cipher, transform, get lost: an anti-transparent system for distance measurement in East Slavic lects
Afanasev I., Journal of Language Relationship 2023 Vol. 21 No. 3-4 P. 159–177
Recent advances in computational historical linguistics have inspired a discussion on newly implemented quantitative methods — mainly, it is about their lack of transparency, and the ways to overcome it. This paper aims to demonstrate the advantages of transparency for such tools. The study compares two types of language distance measurement systems used in classification. ...
Added: May 15, 2024
The Use of Khislavichi Lect Morphological Tagging to Determine its Position in the East Slavic Group
Afanasev I., in: Proceedings of Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023). Association for Computational Linguistics, 2023. P. 174–186.
The study of low-resourced East Slavic lects is becoming increasingly relevant as they face the prospect of extinction under the pressure of standard Russian while being treated by academia as an inferior part of this lect. The Khislavichi lect, spoken in a settlement on the border of Russia and Belarus, is a perfect example of ...
Added: May 15, 2023
Proceedings of Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023)
Association for Computational Linguistics, 2023.
These proceedings include the 23 papers presented at the 10th Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), co-located with the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Both EACL and VarDial were held in Dubrovnik, Croatia, in a hybrid format, allowing participants to attend on-site or ...
Added: May 15, 2023
The Functioning of Native Languages in the Modern World: Questions and Answers
Ханова А. Ф., Славина Л. Р., Kazan: G. Ibragimov Institute of Language, Literature and Art of the Tatarstan Academy of Sciences, 2021.
This popular-science edition presents up-to-date information on the functioning of native languages in the modern world, the origin and genealogical classification of languages, and the relations between language and society, language and culture, and language and ethnos, set out in a question-and-answer format. The book is intended for a general readership. ...
Added: January 24, 2023
Lexical Innovations and the Classification of the Uralic Languages
Zhivlov M., in: Hämeenmaalta Jamalille: kirja Tapani Salmiselle. Helda Open Books, 2022. P. 361–375.
The traditional classification of the Uralic languages assumes successive binary branching: first into Samoyedic and Finno-Ugric, then Finno-Ugric into Ugric and Finno-Permic, then Finno-Permic into Permic and Finno-Volgaic, and finally Finno-Volgaic into Volgaic and Finno-Saamic. Granting the intermediate nodes of this tree genetic status (i.e. the existence of corresponding proto-languages) implies that these proto-languages must have ...
Added: April 8, 2022
A New Scheme for Applying Automatic Classification to the Analysis of Socio-economic Systems
Rubchinskiy A., in: XVII April International Academic Conference on Economic and Social Development: in 4 books. Book 4. Moscow: HSE Publishing House, 2017. P. 517–526.
The aim of this paper is to present the proposed approach to automatic classification (AC) problems and the results obtained within it. The first part sets out an algorithm for constructing a family of classifications and using it to determine the complexity of the AC problem under consideration. The second part considers the possibility of applying the proposed approach to the analysis of stock markets. ...
Added: December 14, 2017
FAMILY OF GRAPH DECOMPOSITIONS AND ITS APPLICATIONS TO DATA ANALYSIS
Rubchinskiy A. / Series WP7 "Mathematical Methods for Decision Making in Economics, Business and Politics". 2016. No. WP7/2016/09.
A new decomposition approach to complex systems analysis is suggested. The conventional approach deals with the construction of a single, “the most correct”, decomposition of the considered system. Meanwhile the suggested approach is oriented to the construction of a family of decompositions, whose properties reveal some important meaningful features of the initial system. The expedience ...
Added: October 20, 2017
Divisive-Agglomerative Algorithm and Complexity of Automatic Classification Problems
Rubchinskiy A. / NRU Higher School of Economics. Series WP7 "Mathematical Methods for Decision Making in Economics, Business and Politics". 2015. No. WP7/2015/09.
An algorithm of solution of the Automatic Classification (AC for brevity) problem is set forth in the paper. In the AC problem, it is required to find one or several partitions, starting with the given pattern matrix or dissimilarity / similarity matrix. The three-level scheme of the algorithm is suggested. The output of the procedure ...
Added: October 19, 2017
A Mobile Application for Synchronising the Use of Audio and Text Files of E-books
Alexandrov D., Нестеркина А. О., in: University Science for the Region: Proceedings of the XV All-Russian Scientific Conference with International Participation. Vologda: VoGU, 2017. P. 84–86.
The paper is devoted to the creation of the Android application "Read&Listen" for reading electronic text books and listening to audiobooks. The program allows users to combine these two processes and automates switching between the two types of books. The search for matches between the transcription of the audiobooks and the e-books is carried out ...
Added: September 27, 2017
Using a Probability Distribution over the Set of Classes in the Arabic Dialect Classification Problem
Durandin O., Zolotykh N., Хилал Н. Р. et al., Scientific and Technical Journal of Information Technologies, Mechanics and Optics 2017 No. 1(107) P. 110–116
Subject of Research. We propose an approach for solving machine learning classification problems that uses information about the probability distribution over the class label set of the training data. The algorithm is illustrated on a complex natural language processing task: classification of Arabic dialects. Method. Each object in the training set is associated with a probability distribution over ...
Added: February 8, 2017
Intellectualisation of Digital Library Services Based on a Self-learning Content Classification System
Kharlamov A. A., Жонин А. А., Сергиевский Н. А. et al., Vestnik of Moscow State Linguistic University. Linguistics. An Interdisciplinary Approach in Theoretical and Applied Linguistics 2013 No. 1 P. 81–91
Trends in the development of digital libraries and their services are reviewed. It is shown that the main direction in the development of adaptive digital library services is the introduction of personalisation, which improves the quality of their functions by adapting to the user's interests. An approach to automatic classification based on TextAnalyst, a technology for automatic semantic text analysis, is proposed as the basis for a personalisation mechanism. The paper describes the implementation of a software ...
Added: November 12, 2016