Integrating Additional Automatic Text Processing Tools into TXM

Лаврентьев А. М., Соловьев Ф. Н., Чеповский А. М.


pp. 55–62.


Language: Russian


Keywords: stemming, automated text analysis, TXM platform, noun phrases, verbal dependencies

In book

Proceedings of the International Conference "Corpus Linguistics 2019"

St. Petersburg: Издательство Санкт-Петербургского университета, 2019.

Comparative Analysis of Translations of Poetic Texts

Аванесян Н. Л., Chepovskiy A., Правовая информатика 2025 No. 3 pp. 37–44

The previously developed methodology for frequency analysis of various linguistic characteristics in Russian-language texts is shown to be applicable to translations of poetic texts. The methodology makes it possible to compare different texts and to identify distinguishing features that help identify natural-language texts, using pairwise rank correlation coefficients of frequency ...

Added: October 13, 2025

The impact of innovation news coverage on illiquid stocks: the case of US market

Fedorova E., Stepanov V., European Journal of Innovation Management 2023 Vol. 27 No. 5 P. 1767–1792

The purpose of this study is to determine stock market reactions to news about innovations and other types of publications for illiquid stocks. Design/methodology/approach: (1) The authors opt for machine learning techniques and expert analysis and propose their own lexicon of innovations based on the news articles published on the professional website; (2) the dataset consists ...

Added: April 29, 2025

Analysis of a Corpus of Poetic Texts on the TXM Platform

Fokina A., Chepovskiy A., In: Proceedings of the International Conference "Corpus Linguistics 2023", June 21–23, 2023, St. Petersburg. St. Petersburg: Издательство Санкт-Петербургского университета, 2024. pp. 224–231.

The paper considers the results of correspondence analysis performed on the TXM corpus analysis platform. A corpus of Silver Age poetry was created and studied; it includes subcorpora for the main literary movements, the authors belonging to these movements, and the periods when the poems were written. This helps to analyze the influence of historical events of ...

Added: December 2, 2024

Applying Computational Methods of Corpus Analysis to the Study of Literary Texts

Аванесян Н. Л., Губина О. В., Chepovskiy A., Труды Института системного анализа Российской академии наук 2024 Vol. 74 No. 2 pp. 25–32

This article is devoted to the application of mathematical methods of corpus analysis to the study of Russian fiction. A corpus of 19th-century Russian prose fiction, consisting of five subcorpora, was created for the research. Each subcorpus contains texts by a single author. Using the example of the created corpus, the ...

Added: July 4, 2024

Applying the TXM Platform to the Analysis of Texts of Various Types

Fokina A., Бурба А. В., In: Inter-University Scientific and Technical Conference of Students, Postgraduates and Young Specialists named after E. V. Armensky, 2023. МИЭМ НИУ ВШЭ, 2023. pp. 283–285.

The paper presents the results of a study of texts using the correspondence analysis method of the TXM corpus analysis platform. Unrelated corpora of unlawful and poetic texts were examined to test the applicability and effectiveness of the technique on dissimilar text collections. The results show that correspondence analysis is effective for corpora of different types. The authors conclude that this TXM tool can be used to assess the quality of ...

Added: May 6, 2024

Using the TXM Corpus Analysis Platform to Analyze Texts of Social Network Communities

Fokina A., Chepovskiy A., Chepovskiy A., Вестник Новосибирского государственного университета. Серия: Информационные технологии 2023 Vol. 21 No. 2 pp. 29–38

When graphs of interacting objects are built from data imported from social networks and instant messaging services, text data also serve as vertex attributes. In this paper, the authors describe a text research methodology based on corpus analysis procedures. The purpose of this article is to test the methodological tools provided by the TXM software for the ...

Added: October 9, 2023

Analysis of Texts of Social Network Communities

Аванесян Н. Л., Зенькова В. В., Chepovskiy A. et al., Успехи кибернетики 2023 Vol. 4 No. 2 pp. 33–39

In this paper the authors describe a methodology for the statistical analysis of texts in a network of Telegram channels, based on comparing automatically generated frequency dictionaries using correlation analysis. Pairwise rank correlation coefficients are considered for comparing the frequency characteristics of natural-language texts. The method is proposed to ...

Added: July 19, 2023

Applying Corpus Linguistics Methods to the Analysis of Telegram Texts

Асеева Я. О., Fokina A., In: Information and Telecommunication Technologies and Mathematical Modeling of High-Tech Systems: Proceedings of the All-Russian Conference with International Participation, Moscow, RUDN, April 17–21, 2023. Moscow: Российский университет дружбы народов, 2023. pp. 290–294.

The number of monthly Telegram users worldwide recently exceeded 700 million and continues to grow every day. Telegram is used not only to exchange personal messages; it has also become a leading platform for political, cultural and news channels, an alternative to traditional media. The purpose of this ...

Added: June 8, 2023

Effort versus performance tradeoff in lemmatisation for Uralic languages

Tyers F. M., Bibaeva M., / Series 2020.iwclul-1.2 "Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages". 2020.

Lemmatisers in Uralic languages are required for dictionary lookup, an important task for language learners. We explore how to decide which of the rule-based and unsupervised categories is more efficient to invest in. We present a comparison of rule-based and unsupervised lemmatisers, derived from the Giellatekno finite-state morphology project and the Morfessor surface segmenter trained on Wikipedia, respectively. The comparison spanned ...

Added: April 20, 2021

Characteristics of Texts of Social Network Communities

Аванесян Н. Л., Соловьев Ф. Н., Chepovskiy A., Вестник Новосибирского государственного университета. Серия: Информационные технологии 2021 Vol. 19 No. 1 pp. 5–14

In this paper the authors describe a methodology for the statistical analysis of texts in social networks, based on comparing automatically generated frequency dictionaries using correlation analysis. Psycholinguistic characteristics and pairwise rank correlation coefficients are considered for comparing the frequency characteristics of natural-language texts ...

Added: April 14, 2021

Identifying Significant Features of Unlawful Texts

Аванесян Н. Л., Соловьев Ф. Н., Тихомирова Е. А. et al., Вопросы кибербезопасности 2020 No. 4(38) pp. 76–84

A methodology for frequency analysis of the vocabulary of unlawful texts has been developed; it allows different sets of texts to be compared by their frequency dictionaries and distinguishing features to be identified. A method is given for computing the pairwise rank correlation coefficient used to compare frequency dictionaries of various lexical characteristics. A comparative analysis of thematically different collections of unlawful texts has been carried out. The possibility of using frequency lexical characteristics to study the properties of texts for the purpose of ...

Added: December 4, 2020

Personal Brands of Esports Athletes: An Exploration of Evaluation Mechanisms

Musabirov I., Bulygin D., Марченко Е. Ю., / Series WP BRP "Basic research program". 2019. No. 90.

Added: December 12, 2019