Data Visualization for a Catalogue of Russian Lexical Constructions (Based on the Russian National Corpus)

Mitrofanova O. A.; Panicheva P. V.

P. 465–477.

Our research aims at the automatic identification of constructions associated with particular lexical items and their subsequent use in building a catalogue of Russian lexical constructions. The study is based on data extracted from the Russian National Corpus (RNC, http://ruscorpora.ru). The main emphasis is on extensive use of the morphological and lexico-semantic data drawn from the multi-level corpus annotation. Lexical constructions are regarded as the most frequent combinations of a target word and corpus tags which regularly occur within a certain left and/or right context and mark a given meaning of the target word. We focus on nominal constructions with target lexemes that refer to speech acts, emotions, and instruments. We describe the toolkit that processes corpus samples and learns the constructions. We analyse the structure and content of the extracted constructions (e.g. r:ord der:num t:ord r:qual|pervyj ‘first’ + LJUBOV’ ‘love’; LJUBOV’ ‘love’ + PR|s ‘from’ + ANUM m sg gen|pervyj ‘first’ + S f inan sg gen|vzgljad ‘sight’ = love at first sight). Structurally, constructions may be considered n-grams (n ranges from 2 to 5). The representation of constructions is bipartite, as they may combine either morphological and lemma tags or lexical-semantic and lemma tags. We discuss the use of the visualization module PATTERN.GRAPH, which represents the inner structure of the extracted constructions.
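The idea of treating constructions as frequent n-grams built from a target word and corpus tags can be illustrated with a minimal Python sketch. It is not the authors' toolkit: the function name `extract_constructions`, the input format (sentences as lists of lemma/tag pairs), and the toy sample are all assumptions made for illustration. Only the tag labels PR, ANUM and S and the n-gram range 2 to 5 come from the abstract.

```python
from collections import Counter

def extract_constructions(sentences, target, min_n=2, max_n=5):
    """Count n-grams (min_n..max_n items) that contain the target lemma.

    Hypothetical input format: each sentence is a list of (lemma, tag) pairs.
    Context words are represented by their tag; the target word is kept as
    an upper-case lemma.
    """
    marker = target.upper()
    counts = Counter()
    for sentence in sentences:
        items = [marker if lemma == target else tag for lemma, tag in sentence]
        for n in range(min_n, max_n + 1):
            for start in range(len(items) - n + 1):
                window = tuple(items[start:start + n])
                if marker in window:
                    counts[window] += 1
    return counts

# Toy sample invented for illustration; it is not real corpus output.
sample = [
    [("pervyj", "ANUM"), ("ljubov'", "S")],
    [("ljubov'", "S"), ("s", "PR"), ("pervyj", "ANUM"), ("vzgljad", "S")],
    [("pervyj", "ANUM"), ("ljubov'", "S"), ("projti", "V")],
]

counts = extract_constructions(sample, "ljubov'")
top = counts.most_common(1)
```

Ranking the counted n-grams by frequency (here with `Counter.most_common`) gives candidate constructions. A real pipeline would also need the full tag sets and some frequency threshold, which this sketch leaves out.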

Language: Russian

Keywords: Russian National Corpus; nominal constructions; word co-occurrence; lexico-semantic annotation; lexico-grammatical annotation; data visualization

In book

Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (Bekasovo, May 29 – June 2, 2013). In 2 vols.

Vol. 1: Main Conference Programme. Issue 12 (19). Moscow: RGGU, 2013.

The Oberiuts in Mikhail Kuzmin's Circle (A Network Analysis)

Pakhomova A., Moscow University Bulletin. Series 9: Philology 2026 No. 1 P. 162–177

Beginning in the mid-1920s, Aleksandr Vvedensky, Daniil Kharms, and Konstantin Vaginov became acquainted with the circle of the poet, writer, and playwright Mikhail Kuzmin, and by the close of the decade they had become regular visitors to his residence. The interactions between Kuzmin and the Oberiuts have been sufficiently studied; however, numerous studies have shifted the focus to the pragmatics of ...

Added: April 1, 2026

Designing Data Visualization Tools by Integrating Domain-Specific Modeling and Generative Artificial Intelligence

Dzheiranian A. D., Larionova Ya. A., Lyadova L. N., In: GraphiCon 2025: Proceedings of the 35th International Conference on Computer Graphics and Machine Vision (Russia, Yoshkar-Ola, September 30 – October 2, 2025). Yoshkar-Ola: Volga State University of Technology, 2025. P. 353–366.

Data visualization tools are key analytics tools that make it easier to identify dependencies, trends, and patterns. These tools are used by a broad range of specialists (analysts, scientists, business leaders, managers, teachers, and specialists in other fields where accurate and understandable presentation of information is critically important to improve the effectiveness of data analysis). ...

Added: February 21, 2026

Overcoming the Limitations of Modern Data Visualization Tools: A New Methodology

Dzheiranian A. D., Lyadova L. N., In: BIG DATA and Advanced Analytics: Collected Papers of the XI International Scientific and Practical Conference (Republic of Belarus, Minsk, April 23–24, 2025). Minsk: BSUIR, 2025. P. 386–396.

An analysis of existing data visualization methods and tools was conducted, revealing their limitations associated with the need for advanced programming skills or insufficient configuration flexibility. A methodology for developing data visualization tools based on expert knowledge is proposed, enabling the effective creation of customizable visual representations. The methodology comprises three key approaches: (1) the ...

Added: February 21, 2026

Dynamics of the Perception of Squares in Urban Space by Native Speakers of Russian (A Comparative Analysis Based on RNC Data)

Belova P., In: Current Issues in Linguistics and Literary Studies: Collection of Scientific Articles from the International Scientific Conference in Memory of Doctor of Philology, Professor L. A. Araeva (February 6–8, 2025). Kemerovo State University, 2025. P. 155–160.

This article presents research results on how the perception of squares in urban space has changed over time in the Russian linguistic picture of the world, from the second half of the 20th century to the present. Turning to the subcorpus of literary texts of the second half of the 20th century and the 21st ...

Added: February 4, 2026

Linguistic Conceptualization of Space in Literary Texts (Based on RNC Data)

Belova P., In: Cognitive Studies of Language. Issue No. 1 (62): Proceedings of the International Scientific Conference on Cognitive Linguistics, June 5–7, 2025. Part 2. Book 62. Issue 1. TyumGU-Press, 2025. P. 56–60.

This article presents the results of a study of the content of the concept SPACE in Russian linguistic consciousness, based on literary prose texts of various genres written in the second half of the 20th century and in the 21st century and represented in the RNC. The analysis takes into account such cultural-linguistic filters as propositional attitudes, object-concept correlations, and metaphorical transformations. ...

Added: February 4, 2026

Integrating an Ontology-Driven Approach to Data Visualization and AI Based Visualization with Plotly

A.D. Dzheiranian, L.N. Lyadova, Proceedings of the Institute for System Programming of the RAS 2025 Vol. 37 No. 4 P. 191–206

This study introduces an AI-driven assistant prototype that automates the generation of data visualization scripts from natural language queries, eliminating the need for users to have programming skills. The article examines research aimed at developing tools for effective data visualization, compares data visualization systems based on the use of artificial intelligence, and shows the limitations ...

Added: September 25, 2025

Automation of Data Collection from Real Estate Websites and Analysis of the Moscow Housing Market Using the Local Average Price per Square Meter

Churbanov R. R., Legal Informatics 2025 No. 3 P. 79–89

The article describes an integrated solution for automated collection of detailed data on apartment listings in the primary and secondary housing markets of Moscow and their analytical processing. The solution combines web scraping tools (parsing HTML code of real estate websites using Python) with a data warehouse based on Microsoft SQL Server and an interactive ...

Added: August 28, 2025

Corpus Linguistics at the Present Stage

Plungian V., Herald of the Russian Academy of Sciences 2024 Vol. 94 No. 9 P. 787–794

The article gives a general overview of corpus linguistics, its history, its methods, and its influence on contemporary ideas about the study of language, an influence usually referred to as the “corpus revolution”. ...

Added: December 16, 2024

A Language-Oriented Approach to Developing Data Visualization Tools: Code Generation for Interactive Visualization

Proskuryakov K. A., Lyadova L. N., In: Tool Development Technologies (TRIS-2024): Proceedings of the XIV International Scientific and Technical Conference. Taganrog: Southern Federal University Press, 2024. P. 207–219.

Abstract: The goal of the project is to validate an approach to generating code that implements user data visualization models based on metamodels of visual domain-specific languages (DSL), created to describe visualization models, and descriptions of formal grammars of target textual programming languages presented in a multi-aspect ontology. The ontology also includes descriptions of “Model-Text” transformation ...

Added: December 11, 2024

A Language-Oriented Approach to Developing Data Visualization Tools: Automating DSL Development

Ermakov I. D., Lyadova L. N., In: Tool Development Technologies (TRIS-2024): Proceedings of the XIV International Scientific and Technical Conference. Taganrog: Southern Federal University Press, 2024. P. 195–206.

Abstract: The purpose of the project is to validate an approach to developing automation tools for designing domain-specific languages (DSL) in order to create data visualization tools customized to user needs. The main idea of the language-oriented approach is that visual domain-specific languages should be developed to describe new visualization models, the use of which should ensure ...

Added: December 11, 2024

On an approach to data analysis and visualization in the domain of employee-organization relationships

Kishankumar Bhimani, Saradva K., Systems and Means of Informatics 2024 Vol. 34 No. 4 P. 115–136

An increasing number of domains in science and industry rely on the intensive use of data. In such domains, obtaining new knowledge is almost impossible without the use of modern methods of data analysis and visualization. A typical example is the domain of human resource (HR) management. This paper proposes an approach to the application ...

Added: December 9, 2024

An Approach to Developing Data Visualization Tools Based on Domain Specific Modeling

A. D. Dzheiranian, Ermakov I. D., Proskuryakov K. A. et al., Scientific Visualization 2024 Vol. 16 No. 4 P. 82–101

An approach to the development of data visualization tools is described that provides the ability to customize to the needs of users and the specifics of the domains in which they work, based on domain-specific modeling. The results of the analysis of data visualization tools and the possibility of customizing them to subject area based ...

Added: November 24, 2024

Development of Data Visualization Tools Based on Domain-Specific Modeling

Dzheiranian A. D., Ermakov I. D., Proskuryakov K. A. et al., In: GraphiCon 2024: Proceedings of the 34th International Conference on Computer Graphics and Machine Vision (Russia, Omsk, September 17–19, 2024). Omsk: OmSTU Publishing House, 2024. P. 300–314.

Added: November 15, 2024

Divergence of ESG Ratings: International and Russian Experience

Murach A., Storchevoy M., Magdalena Alejandra Gaete Sepúlveda, Economic Policy 2024 Vol. 19 No. 4 P. 84–121

ESG rating is an important indicator of a company’s social responsibility, which is taken into account by investors and regulators. However, if the ESG rating is calculated incorrectly, investors and regulators will make erroneous decisions and companies will be given improper guidance about how to modify their operations. Currently, there is a serious methodological problem ...

Added: October 7, 2024

Merging Directly-Follows Graphs and Sankey Diagrams for Visualizing Acyclic Processes

Derezovskiy I., Shaimov N., Lomazova I. A. et al., Proceedings of the Institute for System Programming of the RAS 2024 Vol. 36 No. 4 P. 155–168

This paper proposes a method to visualize models of acyclic processes based on merging Directly-Follows Graphs (DFG) and Sankey diagrams. DFG is a popular graphical model to visualize discrete process models, while Sankey diagrams are used to represent flows of any kind. Our approach, based on flow diagrams, allows us to highlight individual cases or ...

Added: October 3, 2024

Designing Data Visualization System Based on Language-Oriented Approach

A. D. Dzheiranian, Ermakov I. D., Proskuryakov K. A. et al., Proceedings of the Institute for System Programming of the RAS 2024 Vol. 36 No. 2 P. 127–140

The data visualization method based on a language-oriented approach is proposed. An analysis of data visualization tools and their customizability for subject areas based on user needs has been carried out. It is noted that these tools require highly qualified users to customize the data visualization format (users must have programming skills). It is proposed ...

Added: July 29, 2024

Effective removal of global tilt from atomically-resolved topography images of vicinal surfaces with narrow terraces

Aladyshkin A. Y., Chaika A. N., Semenov V. N. et al., Ultramicroscopy 2024 Vol. 267 Article 114053

The main feature of vicinal surfaces of crystals characterized by the Miller indices (hhm) is the rather small width (less than 10 nm) and substantially large length (more than 200 nm) of atomically flat terraces. This makes it difficult to apply standard methods of image processing and to visualize correctly the crystalline lattices at the terraces and multiatomic steps. Here we consider two procedures ...

Added: June 5, 2024

Urban Squares: The Relationship between Functions and Space in the Russian Linguistic Picture of the World

Belova P. E., In: Meaning as a Phenomenon of the Actual Linguistic Consciousness of a Native Speaker. Issue 9. Moscow: Ritm Publishing House LLC, 2023. P. 84–93.

This article studies how squares are perceived in urban space in the Russian linguistic picture of the world, drawing on the contexts in which the lemma is used in the Russian National Corpus. During the study the distributional possibilities of the word square were considered, subject-conceptual correlations were established, and propositional ...

Added: May 6, 2024

Russian National Corpus 2.0: New Opportunities and Development Prospects

Savchuk S. O., Arkhangelskiy T. A., Bonch-Osmolovskaya A. A. et al., Voprosy Jazykoznanija 2024 No. 2 P. 7–34

The paper provides an overview of the results of the fundamental reconstruction and modernization project of the National Corpus of the Russian Language platform, carried out from 2020 to 2023. The focus of the paper is on the new opportunities that are opening up for linguists and a wider audience. This includes improving the representativeness ...

Added: March 21, 2024

English Adjectives of Size: Cognitive Models of Collocation Formation

Antonova M., Tomsk State University Journal 2023 No. 488 P. 91–100

The article analyzes from the cognitive point of view the linguistic system factors that cause differences in lexical combinability of English parametric adjectives ample, extended, expanded, wide and broad. It is hypothesized that the combinability of these adjectives is conditioned by deep cognitive models underlying their semantics. It is shown that these models are inherited ...

Added: September 22, 2023

Disambiguation in context in the Russian National Corpus: 20 years later

Lyashevskaya O., Afanasev I., Stefan Rebrikov et al., In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue”. Issue 22. [s.n.], 2023. P. 307–318.

An updated annotation of the Main, Media, and some other corpora of the Russian National Corpus (RNC) features the part-of-speech and other morphological information, lemmas, dependency structures, and constituency types. Transformer-based architectures are used to resolve the homonymy in context according to a schema based on the manually disambiguated subcorpus of the Main corpus (morphology ...

Added: September 15, 2023