Exploring Pattern Structures of Syntactic Trees for Relation Extraction

Leeuwenberg A.; A. V. Buzmakov; Toussaint Y.; Napoli A.

?

Exploring Pattern Structures of Syntactic Trees for Relation Extraction

P. 153-168.

Leeuwenberg A., Buzmakov A. V., Toussaint Y., Napoli A.

In this paper we explore the possibility of defining an original pattern structure for managing syntactic trees. More precisely, we are interested in the extraction of relations such as drug-drug interactions (DDIs) in medical texts where sentences are represented as syntactic trees. In this specific pattern structure, called STPS, the similarity operator is based on rooted tree intersection. Moreover, we introduce “Lazy Pattern Structure Classification” (LPSC), which is a symbolic method able to extract and classify DDI sentences w.r.t. STPS. To decrease computation time, a projection and a set of tree-simplification operations are proposed. We evaluated the method by means of a 10-fold cross validation on the corpus of the DDI extraction challenge 2011, and we obtained very encouraging results that are reported at the end of the paper.

Language: English

Full text

Text on another site

Keywords: анализ формальных понятий FCA (Formal Concept Analysis)pattern structures automatic text analysis автоматический анализ текста узорные структуры relation extraction выделение отношений

In book

Formal Concept Analysis. 13th International Conference, ICFCA 2015, Nerja, Spain, June 23-26, 2015, Proceedings

Vol. 9113. , Springer, 2015

On Scaling of Fuzzy FCA to Pattern Structures?

Buzmakov A. V., Napoli A., , in : CLA 2016: Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications. CEUR Workshop Proceedings. Vol. 1624.: M. : Higher School of Economics, National Research University, 2016. P. 85-96.

FCA is a mathematical formalism having many applications in data mining and knowledge discovery. Originally it deals with binary data tables. However, there is a number of extensions that enrich stan- dard FCA. In this paper we consider two important extensions: fuzzy FCA and pattern structures, and discuss the relation between them. In particular we ...

Added: October 14, 2016

Exploratory Knowledge Discovery over Web of Data

Alam M., Buzmakov A. V., Napoli A., Discrete Applied Mathematics 2018 Vol. 249 P. 2-17

With an increased interest in machine processable data and with the progress of semantic technologies, many datasets are now published in the form of RDF triples for constituting the so-called Web of Data. Data can be queried using SPARQL but there are still needs for integrating, classifying and exploring the data for data analysis and ...

Added: September 26, 2017

On Projections of Sequential Pattern Structures (with an application on care trajectories)

Buzmakov A. V., Egho E., Jay N. et al., , in : CLA 2013 Proceedings of the Tenth International Conference on Concept Lattices and Their Applications. : La Rochelle : Laboratory L3i, University of La Rochelle, 2013. P. 199-210.

In this paper, we are interested in the analysis of sequential data and we propose an original framework based on FCA. For that, we introduce sequential pattern structures, an original specification of pattern structures for dealing with sequential data. Sequential pattern structures are given by a subsumption operation between set of sequences, based on subsequence ...

Added: October 22, 2015

Fast Generation of Best Interval Patterns for Nonmonotonic Constraints

Buzmakov A. V., Kuznetsov S., Napoli A., , in : Machine Learning and Knowledge Discovery in Databases. European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings. * 2. Vol. 9285.: Dordrecht, L., Cham, Heidelberg, NY : Springer, 2015. P. 157-172.

In pattern mining, the main challenge is the exponential explosion of the set of patterns. Typically, to solve this problem, a constraint for pattern selection is introduced. One of the first constraints proposed in pattern mining is support (frequency) of a pattern in a dataset. Frequency is an anti-monotonic function, i.e., given an infrequent pattern, ...

Added: October 22, 2015

Pattern Structures and Concept Lattices for Data Mining and Knowledge Processing

Kaytoue M., Codocedo V., Buzmakov A. V. et al., , in : Machine Learning and Knowledge Discovery in Databases. European Conference, ECML PKDD 2015, Porto, Portugal, September 7-11, 2015, Proceedings. * III. Vol. 9286.: Dordrecht, L., Heidelberg, NY, Cham : Springer, 2015. P. 227-231.

This article aims at presenting recent advances in Formal Concept Analysis (2010-2015), especially when the question is dealing with complex data (numbers, graphs, sequences, etc.) in domains such as databases (functional dependencies), data-mining (local pattern discovery), information retrieval and information fusion. As these advances are mainly published in artificial intelligence and FCA dedicated venues, a ...

Added: October 23, 2015

On mining complex sequential data by means of FCA and pattern structures

Buzmakov A. V., Egho E., Jay N. et al., International Journal of General Systems 2016 Vol. 45 No. 2 P. 135-159

Nowadays data-sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of “complex” sequential data by means of interesting sequential patterns. We approach the problem using the elegant mathematical framework of ...

Added: February 25, 2016

Efficient Mining of Subsample-Stable Graph Patterns

Buzmakov A. V., Kuznetsov S., Napoli A., , in : 2017 IEEE 17th International Conference on Data Mining (ICDM). : New Orleans : IEEE, 2017. Ch. 89. P. 757-762.

A scalable method for mining graph patterns stable under subsampling is proposed. The existing subsample stability and robustness measures are not antimonotonic according to definitions known so far. We study a broader notion of antimonotonicity for graph patterns, so that measures of subsample stability become antimonotonic. Then we propose gSOFIA for mining the most subsample-stable graph patterns. The ...

Added: September 26, 2017

How Fuzzy FCA and Pattern Structures are Connected?

Buzmakov A. V., Napoli A., , in : Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at ECAI 2016). : M. : [б.и.], 2016. P. 89-96.

FCA is a mathematical formalism having many applications in data mining and knowledge discovery. Originally it deals with binary data tables. However, there is a number of extensions that enrich stan dard FCA. In this paper we consider two important extensions: fuzzy FCA and pattern structures, and discuss the relation between them. In particular we introduce a scaling procedure that ...

Added: October 14, 2016

Mining Definitions from RDF Annotations Using Formal Concept Analysis

Alam M., Buzmakov A. V., Codocedo V. et al., , in : Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015. : Palo Alto : AAAI Press, 2015. P. 823-829.

The popularization and quick growth of Linked Open Data (LOD) has led to challenging aspects regarding quality assessment and data exploration of the RDF triples that shape the LOD cloud. Particularly, we are interested in the completeness of data and its potential to provide concept definitions in terms of necessary and sufficient conditions. In this ...

Added: October 22, 2015

Ensemble Techniques for Lazy Classification Based on Pattern Structures

Ilya Semenkov, Sergei O. Kuznetsov, , in : Proceedings of the 9th International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2021). Vol. 2972.: CEUR-WS, 2021. P. 105-112.

This paper presents different versions of classification ensemble methods based on pattern structures. Each of these methods is described and tested on multiple datasets (including datasets with exclusively numerical and exclusively nominal features). As a baseline model Random Forest generation is used. For some classification tasks the classification algorithms based on pattern structures showed better ...

Added: December 19, 2022

RAPS: A Recommender Algorithm Based on Pattern Structures

Ignatov D. I., Корнилов Д. И., , in : Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at IJCAI 2015). : Buenos Aires : [б.и.], 2015. P. 87-98.

We propose a new algorithm for recommender systems with numeric ratings which is based on Pattern Structures (RAPS). As the input the algorithm takes rating matrix, e.g., such that it contains movies rated by users. For a target user, the algorithm returns a rated list of items (movies) based on its previous ratings and ratings ...

Added: October 23, 2015

An FCA-based Boolean Matrix Factorisation for Collaborative Filtering

Nenova Elena, Ignatov D. I., Konstantinov A. V., , in : Formal Concept Analysis Meets Information Retrieval 2013. Vol. 977.: Aachen : CEUR Workshop Proceedings, 2013. P. 57-73.

We propose a new approach for Collaborative ltering which is based on Boolean Matrix Factorisation (BMF) and Formal Concept Analysis. In a series of experiments on real data (Movielens dataset) we compare the approach with the SVD- and NMF-based algorithms in terms of Mean Average Error (MAE). One of the experimental con- sequences is that ...

Added: October 10, 2013

Metric Generalization and Modification of Classification Algorithms Based on Formal Concept Analysis

Kolmakov E.A., Computational Mathematics and Modeling 2015 Vol. 26 No. 4 P. 566-576

Added: December 5, 2018

Методы анализа текста: методологические основания и программная реализация

Митина О. В., Evdokimenko A., Вестник Южно-Уральского государственного университета. Серия: Психология 2010 № 40 (216) С. 29-48

Изложена систематизация представлений о методологических принципах анализа текста и программной реализации уже разработанных методик. Методики анализа текста были систематизированы в 10 групп: интент-анализ, контент-анализ, фоносемантический анализ, дискурс-анализ, нарративный анализ, экспертная оценка текста, графематический анализ, морфологический анализ, синтаксический анализ, семантический анализ. Для каждой из групп приведены примеры программной реализации. ...

Added: November 14, 2013

Multimodal Clustering of Boolean Tensors on MapReduce: Experiments Revisited

Ignatov D. I., Egurnov D., Точилкин Д. С., , in : Supplementary Proceedings ICFCA 2019 Conference and Workshops. Vol. 2378.: CEUR Workshop Proceedings, 2019. P. 137-151.

This paper presents further development of distributed multimodal clustering. We introduce a new version of multimodal clustering algorithm for distributed processing in Apache Hadoop on computer clusters. Its implementation allows a user to conduct clustering on data with modality greater than two. We provide time and space complexity of the algorithm and justify its relevance. ...

Added: October 31, 2019

Some Musings on New FCA-based Consensus Clustering Algorithm

Ignatov D. I., Shestakoff A., Lecture Notes in Computer Science 2013

We propose a new FCA-based algorithm for consensus clustering FCA-Consensus. As the input the algorithm takes $n$ partitions of a certain set of objects obtained by k-means algorithm after its $n$ different executions. The resulting consensus partition is extracted from a (partial) antichain of the concept lattice built on formal context $objects \times classes$, where ...

Added: October 26, 2013

Compyter-based processing of literary works and study of literature

Sibirtseva V., / НИУ ВШЭ. Series WP BRP "Linguistics". 2014. No. 7.

Currently many software applications, enabling text analysis, are being created for different purposes (semantic reference tools, concordancers, sentiment analysis etc.), but not used by literary researchers. Computer software allows to facilitate the search of the required information and to save time considerably. With such approach to the field of linguistic and literary analysis, a comparative ...

Added: April 25, 2014

Summation of Decision Trees

Dudyrev E., Kuznetsov S., , in : Proceedings of the 9th International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2021). Vol. 2972.: CEUR-WS, 2021. Ch. 9. P. 99-104.

Ensembles of decision trees, like Random Forests are efficient machine learning models with state-of-the-art prediction quality. However, their predictions are much less transparent than those of a single decision tree. In this paper, we describe a prediction model based on a single decision tree in terms of Formal Concept Analysis. We define a differential way ...

Added: December 8, 2021

Formal Concept Analysis Meets Information Retrieval 2013

Aachen : CEUR Workshop Proceedings, 2013

Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classication, introduced and detailed in the book of Bernhard Ganter and Rudolf Wille, \Formal Concept Analysis", Springer 1999. The area came into being in the early 1980s and has since then spawned over 10000 scientic publications and a variety of practically ...

Added: October 10, 2013

Узорные структуры для анализа сложных последовательностей

Buzmakov A. V., Научно-техническая информация. Серия 2: Информационные процессы и системы 2013 № 10 С. 27-39

Представлен метод поиска интересных паттернов в данных, описываемых сложными последовательностями, т. е. таких, у которых символы имеют структуру. Анализ формальных понятий и узорные структуры, относящиеся к прикладной теории решёток, позволяют решать рассматриваемую задачу в общем виде. Построение решётки узорных понятий представляет большую вычислительную сложность, поэтому исследуются возможности приближённого описания узорных структур, задаваемые через проекции. В ...

Added: October 24, 2015

Concept-based chatbot for interactive query refinement in product search

Goncharova E., Ilvovsky D., Galitsky B., , in : Proceedings of the 9th International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2021). Vol. 2972.: CEUR-WS, 2021. P. 51-58.

Added: October 28, 2021

Роль общей и специфической лексики при извлечении информации из текста на примере анализа события «Ввод новых технологий»

Klintsov V., Bonch-Osmolovskaya A. A., Kuznetsov I. et al., Вестник Новосибирского государственного университета. Серия: Информационные технологии 2012 Т. 10 № 4 С. 74-80

This paper discusses approaches to the selection of keywords, used for information extraction of event frames. In particular, the innovation event is associated with different lexical items in different areas of knowledge. The paper evaluated the contribution of general and specific vocabulary in the representation of the frame in a particular subject area. ...

Added: September 5, 2013

Is Concept Stability a Measure for Pattern Selection?

Buzmakov A. V., Kuznetsov S., Napoli A., Procedia Computer Science 2014 Vol. 31 P. 918-927

There is a lot of usefulness measures of patterns in data mining. This paper is focused on the measures used in Formal Concept Analysis (FCA). In particular, concept stability is a popular relevancy measure in FCA. Experimental results of this paper show that high stability of a pattern in a given dataset derived from the ...

Added: October 22, 2015

Visual analytics in FCA-based triclustering

Kashnitsky Y., , in : Supplementary Proceedings of the 3rd International Conference on Analysis of Images, Social Networks and Texts (AIST 2014). Vol. 1197: Supplementary Proceedings of AIST 2014.: Ekaterinburg : CEUR Workshop Proceedings, 2014. Ch. 12. P. 69-80.

Visual analytics is a subdomain of data analysis which combines both human and machine analytical abilities and is applied mostly in decision-making and data mining tasks. Triclustering, based on Formal Concept Analysis (FCA), was developed to detect groups of objects with similar properties under similar conditions. It is used in Social Network Analysis (SNA) and ...

Added: August 28, 2014