Information Extraction Based on Deep Syntactic-Semantic Analysis

Skorinkin D.A.; Budnikov E. A.; Stepanova M. E.; Matavina P. V.; Chelombeeva A. N.


Компьютерная лингвистика и интеллектуальные технологии. 2016. No. 15. P. 721–733.


This paper presents a rule-based approach to the Information Extraction (IE) task within the FactRuEval-2016 competition. Our system is based on ABBYY Compreno technology. The technology uses the results of deep syntactic-semantic analysis, which significantly reduces the number of rules needed and keeps them concise. The evaluation was conducted on the FactRuEval dataset. FactRuEval is an open evaluation of IE systems in which participants could take part in three tracks. The first track required detecting the boundaries and types of named entities in a text. The second track required extracting normalized attributes and performing local identification of named entities. The third track required extracting facts of certain types from a text. We took part in all three tracks under the nickname violet. Our method proved successful: we achieved high F-measures in the Named Entity Recognition tracks and the highest F-measure in the Fact Extraction track.

Research target: Computer Science; Philology and Linguistics

Priority areas: humanities; IT and mathematics

Language: English


Keywords: computational linguistics; natural language processing; automatic fact extraction; information extraction; information extraction from text; knowledge extraction from texts

Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог» (Москва, 29 мая — 1 июня 2019 г.)

М.: Издательский центр «Российский государственный гуманитарный университет», 2019.

The book includes 64 papers submitted to the International conference in computer linguistics and intellectual technologies Dialogue 2019 and presents a broad spectrum of theoretical and applied research of natural language description, language simulation, and creation of applied computer technologies. ...

Added: October 16, 2019

Проблемы обработки естественного языка в диалоговых системах

Klyshinskiy E., Жеребцова Ю., Чижик А., Системный администратор 2019 № 10 С. 82–91

The field of dialogue systems and conversational agents is one of the most rapidly growing research areas in artificial intelligence applications. Business and industry are showing increasing interest in implementing intelligent conversational agents in their products. Many recent studies have tended to focus on the possibility of developing task-oriented systems which are able to have long ...

Added: October 26, 2019

Computational Linguistics and Intellectual Technologies Papers from the Annual International Conference “Dialogue” (2019)

M.: Russian State University for the Humanities, 2019.

Added: October 16, 2019

Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной Международной конференции «Диалог» (Бекасово, 29 мая - 2 июня 2013 г.). В 2-х т.

М.: РГГУ, 2013.

The collection includes 84 papers from the international conference on computational linguistics and intellectual technologies "Dialogue 2013", presenting a broad range of theoretical and applied research in natural language description, modeling of language processes, and the creation of practically applicable computational linguistic technologies. Intended for specialists in theoretical and applied linguistics and intellectual technologies. ...

Added: May 13, 2013

Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)

Osaka: [s.n.], 2016.

Language resources are increasingly used not only in Language Technology (LT), but also in other subject fields, such as the digital humanities (DH) and in the field of education. Applying LT tools and data for such fields implies new perspectives on these resources regarding domain adaptation, interoperability, technical requirements, documentation, and usability of user interfaces. ...

Added: November 12, 2016

Национальный корпус русского языка как основа новаторских электронных учебников

Sibirtseva V., Khomenko A., Baranova J., Образовательные технологии и общество 2013 Т. 16 № 3 С. 508–521

The article reports on the work of the student and faculty research group "Corplingui (Nizhny Novgorod-Moscow)" at the National Research University Higher School of Economics. The work concerns research in computational and corpus linguistics. Development focuses primarily on creating interactive resources based on materials from the Russian National Corpus. The ...

Added: October 4, 2013

Universal Dependencies for Russian: A New Syntactic Dependencies Tagset

Lyashevskaya O., Droganova K., Zeman D. et al. / NRU HSE. Series WP BRP "Linguistics". 2016. No. 44.

This paper presents the Universal Dependencies tagset (UD v1) as a new annotation scheme for Russian treebanks. The universal list of dependency relations was adopted and extended to comply with certain language-specific syntactic constructions. The tagset was validated by converting two Russian treebanks into the UD format: UD-Russian-SynTagRus and UD-Russian-Google. ...

Added: December 14, 2016

Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing

Stroudsburg, PA: Association for Computational Linguistics, 2017.

This volume contains the papers presented at BSNLP-2017: the Sixth Workshop on Balto-Slavic Natural Language Processing. The Workshop is organized by SIGSLAV—Special Interest Group on NLP in Slavic Languages of the Association for Computational Linguistics. The Workshops have been convening for over a decade, with a clear vision and purpose. On one hand, the languages from ...

Added: June 13, 2017

Computational Linguistics and Intellectual Technologies

M.: Russian State University for the Humanities, 2019.

The book includes 61 reports from the International conference on computer and intellectual technology "Dialogue-2019", representing a wide range of theoretical and applied research in natural language description, modeling of language processes, and the creation of practically applicable computational linguistic technologies. Intended for specialists in theoretical and applied linguistics and intellectual technologies. ...

Added: June 12, 2019

CLLS 2016. Computational Linguistics and Language Science. Proceedings of the Workshop on Computational Linguistics and Language Science. Moscow, Russia, April 26, 2016

Aachen: CEUR Workshop Proceedings, 2017.

As the number of digital texts increases rapidly, there is a pressing need for more advanced and diverse tools of natural language processing. While purely statistical approaches proved powerful and efficient for many NLP tasks, there are many applications that would benefit from the formal models and approaches traditional language science has to offer. With ...

Added: June 25, 2017

Компьютерная лингвистика и интеллектуальные технологии 2013: Доклады, принятые к публикации на сайте

[s.n.], 2013.

The site dialog-21.ru hosts the texts of papers accepted for electronic publication. They cover a broad range of theoretical and applied research in natural language description, modeling of language processes, and the creation of practically applicable computational linguistic technologies. Intended for specialists in theoretical and applied linguistics and intellectual technologies. ...

Added: September 23, 2013

Actes de la conférence conjointe JEP-TALN-RECITAL

P.: [s.n.], 2016.

Added: October 5, 2017

Корпус татарского языка "Туган тел"

Arkhangelskiy T., Гильмуллин Р. А., Невзорова О. А. et al., Научно-техническая информация. Серия 2: Информационные процессы и системы 2013

The article describes an electronic corpus of the Tatar language, created under the "Corpus Linguistics" fundamental research program of the Presidium of the Russian Academy of Sciences, and the methods the authors used to build it. In particular, it describes the text composition and genre structure of the corpus, the authors' decisions on identifying morphological features, automatic morphological annotation of texts using a two-level morphology model and the PC-KIMMO analyzer, and the placement ...

Added: October 25, 2013

Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной Международной конференции «Диалог» (Бекасово, 4 — 8 июня 2014 г.)

М.: Изд-во РГГУ, 2014.

Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference “Dialogue” (2014) ...

Added: August 20, 2014

Proceedings of the 4th workshop on NLP for Computer Assisted Language Learning at NODALIDA 2015, Vilnius, 11th May, 2015

Linköping University Electronic Press, 2015.

The workshop series on Natural Language Processing (NLP) for Computer-Assisted Language Learning (CALL) – NLP4CALL – is a meeting place for researchers working on the integration of Natural Language Processing and Speech Technologies in CALL systems and exploring the theoretical and methodological issues arising in this connection. ...

Added: May 31, 2015

Извлечение сценарной информации из текстов. Часть 1. Постановка задачи и обзор методов

Суворова М. И., Кобозева М. В., Toldova S. et al., Искусственный интеллект и принятие решений 2020 № 1 С. 17–26

The article discusses the importance of automatic script analysis for understanding natural language texts. It gives a broad overview of methods and approaches to describing and extracting scripts. Theoretical approaches to formalizing scripts are considered. A list of tasks that rely on information about the script structure of a text is provided. Popular approaches to automatic script extraction from texts and methods for evaluating their ...

Added: April 22, 2020

Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог» (Москва, 17 июня — 20 июня 2020 г.)

М.: Изд-во РГГУ, 2020.

Papers from the Annual International Conference “Dialogue” (2020). Issue 19 ...

Added: June 26, 2020

Прикладная и компьютерная лингвистика

М.: Ленанд, 2017.

This book offers readers the first practical introduction in Russian to modern linguistic technologies. It shows how knowledge about language is applied to solve practical problems. The monograph answers the basic questions a beginning researcher faces: how modern linguistic technologies work, where to obtain the main software components, and what to read next for a deeper understanding. Many complex scientific and technical problems will become much ...

Added: December 31, 2017

Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference “Dialogue” (2015)

M.: Russian State University for the Humanities, 2015.

Added: April 28, 2015

Количественная оценка грамматической неоднозначности некоторых европейских языков

Klyshinskiy E., Логачёва В. К., Карпик О. В. et al., Вестник Новосибирского государственного университета. Серия: Лингвистика и межкультурная коммуникация 2020 Т. 18 № 1 С. 5–21

The grammatical ambiguity (multiple sets of grammatical features for one word form or coinciding surface forms of different words) can be of different types. We describe six classes of grammatical ambiguity: unambiguous, ambiguous by grammatical features, by part of speech, by lemma, by lemma and part of speech, and out-of-vocabulary words. These classes are presented ...

Added: December 11, 2019

Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science

Springer, 2015.

16th International Conference, CICLing 2015, Cairo, Egypt, April 14-20, 2015, Proceedings, Part I ISBN: 978-3-319-18110-3 (Print) 978-3-319-18111-0 (Online) ...

Added: April 23, 2015

Современные проблемы и тенденции компьютерной лингвистики

Toldova S., Lyashevskaya O., Вопросы языкознания 2014 № 1 С. 120–145

This paper is an overview of current issues and tendencies in computational linguistics. The overview is based on the materials of the COLING’2012 conference on computational linguistics. Modern approaches to traditional NLP domains such as POS tagging, syntactic parsing, and machine translation are discussed. The highlights of automated information extraction, such as fact extraction, ...

Added: October 15, 2013

Applying statistical tagging to Russian poetry

Starchenko A., Kazakevich L., Lyashevskaya O. / NRU HSE. Series WP BRP "Linguistics". 2018. No. 76.

Poetic texts pose a challenge to full morphological tagging and lemmatization, since authors seek to extend the vocabulary, employ morphologically and semantically deficient forms, go beyond standard syntactic templates, and use non-projective constructions and non-standard word order, among other techniques of the creative language game. In this paper we evaluate a number of probabilistic ...

Added: December 12, 2018

Computational Linguistics and Intellectual Technologies: papers from the Annual conference “Dialogue 2014”

M.: ., 2014.

The Conference Proceedings contain 64 papers from the international conference on Computational Linguistics and Intellectual Technologies “Dialogue 2014”, representing a large range of theoretical and applied research in the area of natural language description, language process description, creation of applied computer-linguistic technologies. For specialists in the field of theoretical and applied linguistics and intellectual technologies. ...

Added: July 7, 2014