Baselines and Symbol N-Grams: Simple Part-Of-Speech Tagging of Russian?

P. 9–19.
Arefyev N. V., Ermolaev P.

We propose using NB-SVM over a bag-of-character-n-grams input representation to determine part-of-speech tags and grammatical categories such as gender and number for words in Russian texts. We compare several methods, including CRF (Conditional Random Fields), SVM (Support Vector Machines), and NB-SVM (Naive Bayes SVM), and show that NB-SVM outperforms the other classifiers. The proposed model is the 5th best among 12 other models in the first shared task of the MorphoRuEval-17 challenge. We also experimented with category grouping, in which a single classifier determines several grammatical categories, and showed that it improves model performance even further.
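As an illustration of the representation named in the abstract, the following is a minimal sketch of character n-gram features combined with the Naive Bayes log-count ratio used in NB-SVM-style models. It is not the authors' implementation: the function names, the `^`/`$` boundary markers, the n-gram range of 1 to 3, the smoothing value `alpha`, and the toy word list are all assumptions made for this example. The sketch stops at the scaled features; in a full NB-SVM, a linear SVM would then be trained on them.

```python
import math
from collections import Counter


def char_ngrams(word, n_min=1, n_max=3):
    """Return the set of character n-grams of a word.

    The word is padded with '^' and '$' so that prefixes and suffixes
    become distinct features (an assumption of this sketch).
    """
    padded = f"^{word}$"
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }


def log_count_ratios(words, labels, positive, alpha=1.0):
    """Compute the smoothed log-count ratio r for one class versus the rest.

    For each n-gram g: r[g] = log((p[g] / sum(p)) / (q[g] / sum(q))),
    where p and q are smoothed counts of words containing g in the
    positive class and in all other classes, respectively.
    """
    pos, neg = Counter(), Counter()
    for word, label in zip(words, labels):
        (pos if label == positive else neg).update(char_ngrams(word))
    vocab = set(pos) | set(neg)
    pos_total = sum(pos[g] + alpha for g in vocab)
    neg_total = sum(neg[g] + alpha for g in vocab)
    return {
        g: math.log(((pos[g] + alpha) / pos_total) / ((neg[g] + alpha) / neg_total))
        for g in vocab
    }


def scaled_features(word, ratios):
    """Scale the binary n-gram indicators of a word by r.

    N-grams never seen in training are dropped. The resulting sparse
    vector is what a linear SVM would be trained on.
    """
    return {g: ratios[g] for g in char_ngrams(word) if g in ratios}


# Toy training data (hypothetical, for illustration only).
words = ["красная", "синяя", "большая", "читает", "пишет", "играет"]
labels = ["ADJ", "ADJ", "ADJ", "VERB", "VERB", "VERB"]

ratios = log_count_ratios(words, labels, positive="ADJ")
features = scaled_features("новая", ratios)
```

On this toy data, the suffix n-gram `ая$` receives a positive ratio for the ADJ class and `ет$` a negative one, which is the kind of morphological cue a character-level representation is meant to expose.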

Language: English
Keywords: POS-tagging, multilabel classification, Support Vector Machines (SVM)

In book

Supplementary Proceedings of the Sixth International Conference on Analysis of Images, Social Networks and Texts (AIST-SUP 2017), Moscow, Russia, July 27-29, 2017
Vol. 1975. Aachen: CEUR-WS.org, 2017.
Similar publications
Hybrid Fault Detection in Three-Phase Induction Motors
Ali S., Khizhik A., Ryzhikov A. et al., in: 2025 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT), 12-13 May 2025. IEEE, 2025. P. 357–360.
Three-phase induction motors play a crucial role in industrial applications due to their efficiency, durability, and reliability. However, effective fault detection remains challenging, primarily due to the scarcity of labeled failure data, which limits the performance of traditional machine learning (ML)-based diagnostic models and increases the risk of overfitting and poor generalization. Conventional methods, such ...
Added: July 3, 2025
Assembly, storage, and preprocessing of a document collection for training a multi-label classifier of natural-language Russian texts
Krayushkin O., Smirnov M., Chernobay Yu., in: 1st conference on Software Engineering and Information Management (SEIM-2016). St. Petersburg: [s.n.], 2016.
This work identifies the main features of organizing the assembly, storage, and preprocessing of a dataset used to form the training set of a multi-label classifier of natural-language Russian texts ...
Added: November 4, 2021
Native Language Identification for Russian
Remnev N., in: 2019 International Conference on Data Mining Workshops (ICDMW). IEEE, 2019. P. 1–7.
Native Language Identification (NLI) is the task of automatically recognizing an author's native language (L1) from texts written in a language that is not native to the author. The NLI task was studied in detail for the English language, and two shared tasks ...
Added: October 18, 2021
Native Language Identification For Russian Using Errors Types
Remnev N., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, June 17–20, 2020). Issue 19(26): supplementary volume. 2020. P. 1123–1133.
Native Language Identification (NLI) is the task of automatically recognizing an author's native language (L1) from texts written in a language that is non-native to the author. The NLI task was studied in detail for the English language, and two shared tasks were conducted in 2013 and 2017, where ...
Added: October 18, 2021
Intelligent Data Processing 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers
Switzerland: Springer, 2019.
This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...
Added: February 8, 2020
Multilabel Classification for Inflow Profile Monitoring
Ignatov D. I., Spesivtsev P., Kurgansky D. et al., in: Proceedings of the MACSPro Workshop 2019. Vol. 2478: CEUR Workshop Proceedings. CEUR-WS.org, 2019. P. 177–184.
The purpose of this study is to identify the position of non-performing inflow zones (sources) in a wellbore by means of machine learning techniques. The training data are obtained using the transient multiphase simulators and represented as the following time-series: bottom-hole pressure, well-head pressure, flowrates of gas, oil, and water along with a ...
Added: November 1, 2019
A cross-genre morphological tagging and lemmatization of the Russian poetry: distinctive test sets and evaluation
Starchenko A., Lyashevskaya O., in: Digital Transformation and Global Society. Fourth International Conference, DTGS 2019, St. Petersburg, Russia, June 19–21, 2019, Revised Selected Papers. Springer, 2019. P. 732–743.
The poetic texts pose a challenge to full morphological tagging and lemmatization since the authors seek to extend the vocabulary, employ morphologically and semantically deficient forms, go beyond standard syntactic templates, use non-projective constructions and non-standard word order, among other techniques of the creative language game. In this paper we evaluate a number of probabilistic ...
Added: June 12, 2019
Morphological analysis for Russian: Integration and comparison of taggers
Kuzmenko E., in: Analysis of Images, Social Networks and Texts. 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7-9, 2016, Revised Selected Papers. Communications in Computer and Information Science. Vol. 661. Switzerland: Springer, 2017. P. 162–161.
In this paper we present a comparison of three morphological taggers for Russian with regard to the quality of morphological disambiguation performed by these taggers. We test the quality of the analysis in three different ways: lemmatization, POS-tagging and assigning full morphological tags. We analyze the mistakes made by the taggers, outline their strengths and ...
Added: April 22, 2017