A method of ontology-aided expertise matching for facilitating knowledge exchange

N. Karpov; T. S. Babkina; E. Babkin

P. 43–54.

The paper proposes a new method for facilitating knowledge exchange by finding relevant university experts to comment on current information events arising in the open environment of a modern economic cluster. The method is based on a new mathematical model of ontology concept matching, with a new modification of Latent Dirichlet Allocation at its formal core. The method and the mathematical model of ontology matching were validated in a software solution: a newly designed decision support system called EXPERTIZE. The system regularly monitors various text sources on the Internet, analyzes documents, and provides university employees with critical information about relevant events according to the developed matching algorithm. The proposed solution makes several contributions to knowledge processing, including new modifications of a topic modeling method suited to expert-finding tasks and the integration of new algorithms with existing ontology services to show the feasibility of the solution.
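To make the matching step concrete, here is a minimal sketch of ranking experts against an incoming event by comparing topic distributions. It is not the paper's modified Latent Dirichlet Allocation model or the EXPERTIZE implementation: it assumes that topic distributions for the event and for each expert have already been inferred by some topic model, and it uses plain cosine similarity as a stand-in for the matching algorithm. All names (`cosine`, `rank_experts`, `expert_profiles`) and all numbers are hypothetical.

```python
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rank_experts(event_topics, expert_profiles):
    """Return (expert, score) pairs sorted from most to least similar."""
    scores = [
        (name, cosine(event_topics, topics))
        for name, topics in expert_profiles.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic distributions over three topics (assumed to come
# from a topic model such as LDA; the values are illustrative only).
expert_profiles = {
    "expert_a": [0.7, 0.2, 0.1],
    "expert_b": [0.1, 0.1, 0.8],
    "expert_c": [0.3, 0.4, 0.3],
}
event_topics = [0.6, 0.3, 0.1]

ranking = rank_experts(event_topics, expert_profiles)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the expert whose profile concentrates on the same topic as the event is ranked first. A real system would replace the hard-coded vectors with distributions inferred from the experts' publications and from the monitored documents.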

Language: English

Keywords: modeling; natural language processing; topic modeling; expert finding

In book

CEUR Workshop Proceedings. 2nd International Workshop on Ontologies and Information Systems, WOIS 2014; Lund, Sweden; 22–24 September 2014

Vol. 1230. Lund: CEUR Workshop Proceedings, 2014.

Development of an expert-finding service for current information events

Karpov N., Shadrina E. V., Алгоритмы, методы и системы обработки данных 2015 No. 4(33) P. 33–47

In this paper, we propose a new way to develop a service for sharing knowledge in the university cluster by searching for appropriate experts. The method is based on a modern approach to the search for experts with the help of topic modeling. The service has been implemented in the form of a decision support ...

Added: February 4, 2016

Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

Berlin: Association for Computational Linguistics, 2016.

The 2016 Conference on Computational Natural Language Learning is the twentieth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL 2016 will be held on August 11-12, 2016, and is co-located with the 54th annual meeting of the Association for Computational Linguistics (ACL) in Berlin, ...

Added: November 12, 2016

Exploration of register-dependent lexical semantics using word embeddings

Kutuzov A. B., Kuzmenko E., Marakasova A., in: Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH). Osaka: [s.n.], 2016. P. 26–34.

We present an approach to detect differences in lexical semantics across English language registers, using word embedding models from distributional semantics paradigm. Models trained on register-specific subcorpora of the BNC corpus are employed to compare lists of nearest associates for particular words and draw conclusions about their semantic shifts depending on register in which they ...

Added: November 12, 2016

The Tatar language corpus "Туган тел"

Arkhangelskiy T., Гильмуллин Р. А., Невзорова О. А. et al., Научно-техническая информация. Серия 2: Информационные процессы и системы 2013

The article describes an electronic corpus of the Tatar language created within the "Corpus Linguistics" program of fundamental research of the Presidium of the Russian Academy of Sciences, and the methods the authors used to build it. In particular, it describes the text composition and genre structure of the corpus, the authors' decisions on identifying morphological features, automatic morphological annotation of texts using a two-level morphology model and the PC-KIMMO analyzer, and the placement ...

Added: October 25, 2013

NRU-HSE at SemEval-2017 Task 4: Tweet Quantification Using Deep Learning Architecture

Karpov N., in: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Vancouver: Association for Computational Linguistics, 2017. P. 683–688.

In many areas, such as social science, politics or market research, people need to deal with dataset shifting over time. Distribution drift phenomenon usually appears in the field of sentiment analysis, when proportions of instances are changing over time. In this case, the task is to correctly estimate proportions of each sentiment expressed in the ...

Added: November 14, 2017

Language Exercise Generation: Emulating Cambridge Open Cloze

Malafeev A., International Journal of Conceptual Structures and Smart Applications (IJCSSA) 2014 Vol. 2 No. 2 P. 20–35

This article presents an approach to the automatic generation of open cloze exercises based on arbitrary English text. The exercise format is similar to the open cloze test used in Cambridge English certificate exams (FCE, CAE, CPE). The presented method also makes it possible to adjust the difficulty of the resulting exercises to better suit ...

Added: November 29, 2014

Daba: a model and tools for Manding corpora

Kirill Maslinsky, in: TALN-RECITAL 2014 Workshop TALAf 2014 : Traitement Automatique des Langues Africaines (TALAf 2014: African Language Processing). Marseille: Association pour le Traitement Automatique des Langues, 2014. P. 114–122.

This article provides a brief overview of the Daba software package, created in the course of building corpora for Manding languages. Key software features are motivated by the tasks and problems characteristic of many African languages. The corpus-building model proposed here was initially developed for the Bambara Reference Corpus, which is available online and freely accessible. ...

Added: March 26, 2015

Analysis of Images, Social Networks and Texts

NY: Springer, 2014.

This book constitutes the proceedings of the Third International Conference on Analysis of Images, Social Networks and Texts, AIST 2014, held in Yekaterinburg, Russia, in April 2014. The 11 full and 10 short papers were carefully reviewed and selected from 74 submissions. They are presented together with 3 short industrial papers, 4 invited papers and ...

Added: November 4, 2014

Redefining part-of-speech classes with distributional semantic models

Kutuzov A. B., Velldal E., Øvrelid L., in: Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning. Berlin: Association for Computational Linguistics, 2016. P. 115–125.

This paper studies how word embeddings trained on the British National Corpus interact with part of speech boundaries. Our work targets the Universal PoS tag set, which is currently actively being used for annotation of a range of languages. We experiment with training classifiers for predicting PoS tags for words based on their embeddings. The ...

Added: November 12, 2016

Authorship Attribution in Russian with New High-Performing and Fully Interpretable Morpho-Syntactic Features

Pimonova E., Durandin O., Malafeev A., in: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832. Cham: Springer, 2019. P. 193–204.

This work tackles the problem of modeling author style in Russian. In particular, we solve the task of authorship attribution using a collected dataset of 30 authors and 1506 texts written from the 18th to the 21st century. We apply various approaches to solving the attribution problem: Random Forest, Logistic Regression, SVM Classifier. In terms ...

Added: November 7, 2019

Extracting script information from texts. Part 1. Problem statement and review of methods

Суворова М. И., Кобозева М. В., Toldova S. et al., Искусственный интеллект и принятие решений 2020 No. 1 P. 17–26

The article discusses the importance of automatic script analysis for understanding natural-language texts. It gives a broad survey of methods and approaches to describing and extracting scripts, and considers theoretical approaches to formalizing scripts. It lists the tasks that rely on information about the script structure of a text. It also presents popular approaches to automatic script extraction from texts and methods for evaluating their ...

Added: April 22, 2020

Improving Distributional Semantic Models Using Anaphora Resolution during Linguistic Preprocessing

Kutuzov A. B., Козлова О. С., in: Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог» (Москва, 1–4 июля 2016 г.). Issue 15. Moscow: Изд-во РГГУ, 2016. P. 288–300.

In natural language processing, distributional semantic models are known as an efficient data driven approach to word and text representation, which allows computing meaning directly from large text corpora into word embeddings in a vector space. This paper addresses the role of linguistic preprocessing in enhancing performance of distributional models, and particularly studies pronominal anaphora ...

Added: November 12, 2016

Using the information theory of speech perception for speech quality analysis

Karpov N., in: Современные проблемы информатизации в анализе и синтезе технологических и программно-телекоммуникационных систем: Сборник трудов. Issue 17. Voronezh: Научная книга, 2012. P. 264–266.

Added: November 7, 2012

Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)

Osaka: [s.n.], 2016.

Language resources are increasingly used not only in Language Technology (LT), but also in other subject fields, such as the digital humanities (DH) and in the field of education. Applying LT tools and data for such fields implies new perspectives on these resources regarding domain adaptation, interoperability, technical requirements, documentation, and usability of user interfaces. ...

Added: November 12, 2016

TALN-RECITAL 2014 Workshop TALAf 2014 : Traitement Automatique des Langues Africaines (TALAf 2014: African Language Processing)

Marseille: Association pour le Traitement Automatique des Langues, 2014.

Following the first TALAf workshop, held on 8 June 2012 in Grenoble during the JEP-TALN-RECITAL 2012 conference (see the proceedings: http://aclweb.org/anthology//W/W12/#1300), we propose a new edition of this workshop at the TALN 2014 conference on 1 July in Marseille. This second edition shows the value of a French-language workshop on the processing ...

Added: March 26, 2015

Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022)

Marseille: European Language Resources Association (ELRA), 2022.

The proceedings are organised on the basis of the 22 Tracks of the Conference on Language Resources and Evaluation (LREC) held in Marseille, France, from 20 to 25 June 2022. Major topics include corpora and annotation (including tools, systems, treebanks), information extraction and information retrieval (including ner, qa, text mining, document classification, text categorisation), applications involving lrs and evaluation (including ...

Added: February 22, 2023

8th Russian Summer School in Information Retrieval (RuSSIR 2014)

Braslavski P., Karpov Nikolay, Worring M. et al., ACM SIGIR Forum 2014 Vol. 48 No. 2 P. 105–110

The 8th Russian Summer School in Information Retrieval (RuSSIR 2014) was held on August 18-22, 2014 in Nizhniy Novgorod, Russia. The school was co-organized by the National Research University Higher School of Economics and the Russian Information Retrieval Evaluation Seminar (ROMIP) ...

Added: August 22, 2015

Think about what you’ve learned: aspect-based sentiment analysis for modeling user experience in online education

Kirina M., Человек: образ и сущность. Гуманитарные аспекты 2023 No. 2(58) P. 176–204

The article focuses on the application of opinion mining techniques to evaluate user experience on the Hyperskill educational platform, using Python, Java, and Kotlin programming projects as the basis of analysis. The study utilizes sentiment analysis and keyword extraction methods to gauge users' attitudes towards the platform, learning process, and topics covered. To achieve this, ...

Added: December 9, 2023

Universal Dependencies for Russian: A New Syntactic Dependencies Tagset

Lyashevskaya O., Droganova K., Zeman D. et al. / NRU HSE. Series WP BRP "Linguistics". 2016. No. 44.

This paper presents the Universal Dependencies tagset (UD v1) as a new annotation scheme for Russian treebanks. The universal list of dependency relations was adopted and extended to comply with certain language-specific syntactic constructions. The tagset was validated, converting two Russian treebanks into the UD format, UD-Russian-SynTagRus and UD-Russian-Google. ...

Added: December 14, 2016

Latent Dirichlet Allocation: Stability and Applications to Studies of User-Generated content

Koltsov S., Koltsova O., Nikolenko S. I., in: Proceedings of WebSci '14 ACM Web Science Conference, Bloomington, IN, USA, June 23-26, 2014. NY: ACM, 2014. P. 161–165.

Topic modeling, in particular the Latent Dirichlet Allocation (LDA) model, has recently emerged as an important tool for understanding large datasets, in particular, user-generated datasets in social studies of the Web. In this work, we investigate the instability of LDA inference, propose a new metric of similarity between topics and a criterion of vocabulary reduction. ...

Added: October 17, 2014

Optimal planning of enterprise resource loading: a basic continuous-time problem statement and its extensions

Lazarev A. A., Некрасов И. В., Правдивец Н. А., in: Танаевские чтения. Доклады Седьмой Международной научной конференции (28-29 марта 2016 года, Минск). Minsk: Объединенный институт проблем информатики НАН Беларуси, 2016. P. 108–113.

The paper considers the problem of volume planning of an industrial enterprise's production output. An integer model for solving the problem is constructed, and extensions are proposed in the form of additional linear constraints that account for some typical resource-loading scenarios. The formulated problem is solvable in polynomial time, since it is a linear programming (LP) problem. ...

Added: October 19, 2016

Modeling the transport of sputtered atoms in gas-discharge sputtering systems

Бондаренко Г.Г., Коржавый А. П., Кристя В. И. et al., Металлы 1997 No. 3 P. 154–157

An analytical expression is obtained that describes the concentration distribution of sputtered atoms in the cylindrical chamber of a gas-discharge sputtering device. The fluxes of sputtered atoms onto the chamber wall calculated from it agree satisfactorily with the results of experimental measurements. ...

Added: December 6, 2013

The field principle of mental lexicon organization and field activation scenarios

Белоусов К. И., Leshchenko Y., Вопросы психолингвистики 2018 Vol. 1 No. 35 P. 39–53

The mental lexicon is a complex system that reflects, in linguistic form, the processes by which a person structures the surrounding reality. The mental lexicon can be represented as a multidimensional network whose structural elements are nodes (fragments of information fixed in consciousness) and inter-node links (ways in which elements of information interact with one another). Links between nodes of the lexicon can have different degrees of activation and ...

Added: January 15, 2022

Mathematical models as a tool for the program manager and the program management team

Frolkina E., Управление проектами и программами 2015 Vol. 44 No. 4 P. 264–278

Modeling in program management is a tool that can improve a company's efficiency and help it achieve its strategic goals. In this article the author analyzes and summarizes various mathematical models of program management and proposes an original classification of existing models, which makes it possible to identify gaps in this area and determine directions requiring further research. ...

Added: November 14, 2015