Detecting Distorted Information: An Approach Using Discourse Relations

B. Galitsky; D. Ilvovsky

P. 23–32.

We propose a linguistic method for determining whether a given text is a rumor or disinformation. The method combines web mining with a linguistic technique for comparing two text fragments. We hypothesize a family of content generation algorithms capable of producing deceptive text from a portion of genuine, original text. We then propose a disinformation detection algorithm that finds a candidate source of the text on the web and compares it with the given text using parse thicket technology. A parse thicket is a graph that combines a sequence of parse trees and augments them with inter-sentence relations for anaphora and rhetorical structure. We evaluate the algorithm in the domain of customer reviews, treating a product review as a possible instance of deception. The evaluation confirms that the approach is a plausible way to detect rumor and deception in a web document.
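The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' parse thicket method: as a loudly simplified stand-in, it aligns sentences by word overlap (Jaccard similarity) and flags sentences in the given text that closely match a source sentence but differ in a few words. The web search for a candidate source is omitted, and the function names and the `low`/`high` thresholds are hypothetical choices made for illustration.

```python
import re


def tokens(sentence):
    """Return the lowercased word tokens of a sentence."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def split_sentences(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def jaccard(a, b):
    """Jaccard similarity of two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def find_altered_sentences(given_text, source_text, low=0.5, high=1.0):
    """Flag sentences that nearly, but not exactly, match a source sentence.

    Returns (given_sentence, source_sentence, changed_words) triples.
    Word overlap is an assumed stand-in for parse thicket comparison.
    """
    source_sents = split_sentences(source_text)
    results = []
    for sent in split_sentences(given_text):
        toks = tokens(sent)
        best, best_score = None, 0.0
        for src in source_sents:
            score = jaccard(toks, tokens(src))
            if score > best_score:
                best, best_score = src, score
        # A near copy (similar, but not identical) is a candidate distortion.
        if best is not None and low <= best_score < high:
            changed = sorted(set(toks) ^ set(tokens(best)))
            results.append((sent, best, changed))
    return results


if __name__ == "__main__":
    source = "The camera has a great lens. The battery lasts a full day."
    given = "The camera has a terrible lens. The battery lasts a full day."
    for sent, src, changed in find_altered_sentences(given, source):
        print(sent, "<-", src, "changed words:", changed)
```

In this toy review, only the first sentence is flagged, with the changed words `great` and `terrible`. A word-overlap measure cannot see syntax, anaphora, or rhetorical relations, which is the gap the parse thicket representation is meant to address.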

Language: English

Keywords: web mining, parse thicket (чаща разбора)

Publication based on the results of:

Mining Data with Complex Structure and Semantic Technologies (2016)

In book

Fifteenth National Conference on Artificial Intelligence with International Participation КИИ-2016 (October 3-7, 2016, Smolensk, Russia): Conference Proceedings

Vol. 1. Smolensk: Universum, 2016.

Text Mining in Logistics and Supply Chain Management

Morozova Yu. A., Логистика и управление цепями поставок (Logistics and Supply Chain Management) 2018 No. 4 (87) P. 10–18

Today, a company's competitiveness largely depends on how it uses the opportunities offered by modern information technologies. The Internet of Things, big data, blockchain, and artificial intelligence technologies all bring companies to a new level of interaction and competition, open new opportunities for building logistics processes, and adjust supply chain management. It ...

Added: October 15, 2018

12th International Summer School on Reasoning Web Summer School, RW 2016

[s.n.], 2017.

Added: September 18, 2017

Proceedings of the ISMW-FRUCT 2016

[s.n.], 2016.

Added: January 17, 2017

On the Project of Development of Global Processes Monitoring System Based on Internet News

Shalyaeva I., Lanin V., Lyadova L. N., in: Information Systems Development Technologies - ТРИС-2016: Proceedings of the VII International Scientific and Technical Conference. Vol. 1. Taganrog: Southern Federal University Press, 2016. P. 166–170.

On the Project of Development of Global Processes Monitoring System Based on Internet News. An approach to process analysis based on event data mined from newsfeeds is described. The retrieved data are processed by means of Process Mining, which allows formal models of the processes to be constructed. ...

Added: November 3, 2016

Text integrity assessment: Sentiment profile vs rhetoric structure

Galitsky B., Ilvovsky D., Kuznetsov S., in: Computational Linguistics and Intelligent Text Processing. 16th International Conference, CICLing 2015, Cairo, Egypt, April 14-20, 2015, Proceedings, Part II. Vol. 9042. Berlin: Springer, 2015. P. 126–139.

We formulate the problem of text integrity assessment as learning the discourse structure of text given the dataset of texts with high integrity and low integrity. We use two approaches to formalizing the discourse structures, sentiment profile and rhetoric structures, relying on sentence-level sentiment classifier and rhetoric structure parsers respectively. To learn discourse structures, we use the graph-based nearest neighbor ...

Added: November 7, 2015

Using Semantically Linked Syntactic Parse Trees to Find Answers to Multi-Sentence Questions

Ilvovsky D., Научно-техническая информация. Серия 2: Информационные процессы и системы (Scientific and Technical Information. Series 2: Information Processes and Systems) 2014 No. 2 P. 28–37

The problem of finding relevant answers to multi-sentence questions is popular and in demand in many applied areas. In particular, it arises in industrial systems oriented toward providing goods and services. One of the main approaches to this problem is to take the set of candidate answers obtained by keyword search and re-rank it with ...

Added: June 9, 2014

A Web Mining Tool for Assistance with Creative Writing

Galitsky B., Kuznetsov S., in: Proc. 35th European Conference on Information Retrieval (ECIR 2013): Advances in Information Retrieval. Vol. 7814. Springer, 2013. P. 828–831.

Added: November 18, 2013

Parse thicket representations of text paragraphs

Galitsky B., Ilvovsky D., Kuznetsov S. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Bekasovo, May 29 - June 2, 2013). In 2 vols. Vol. 1: Main Conference Program. Issue 12 (19). Moscow: RSUH, 2013. P. 239–255.

We develop a graph representation and learning technique for parse structures for sentences and paragraphs of text. We introduce parse thicket as a set of syntactic parse trees augmented by a number of arcs for intersentence word-word relations such as coreference and taxonomies. These arcs are also derived from other sources, including Rhetoric Structure and Speech Act theory. We introduce ...

Added: November 1, 2013

Diagnostic Test Approaches to Machine Learning and Commonsense Reasoning Systems

Naidenova X., Ignatov D. I., Hershey: IGI Global, 2012.

The consideration of symbolic machine learning algorithms as an entire class will make it possible, in the future, to generate algorithms, with the aid of some parameters, depending on the initial users’ requirements and the quality of solving targeted problems in domain applications. Diagnostic Test Approaches to Machine Learning and Commonsense Reasoning Systems surveys, analyzes, and ...

Added: December 3, 2012