Family Matters: Company Relations Extraction from Wikipedia

Kuznetsov A., Braslavski P., Ivanov V.

doi:10.1007/978-3-319-45880-9_7

P. 81–92.

The study described in the paper deals with the extraction of relations between organizations from the Russian Wikipedia. We experiment with two data sources for supervised methods – manual annotations made from scratch and relations from infoboxes with subsequent sentence matching, as well as different feature sets and learning methods – SVM, CRF, and UIMA Ruta. Results show that the automatically obtained training data delivers worse results than manually annotated data, but the former approach is promising due to its scalability. Evaluation of relations extracted from a subset of Wikipedia pages that are mapped to the Russian state company registry proves that external sources can enrich and complement official databases.

Language: English

Keywords: relation extraction

In book

Knowledge Engineering and Semantic Web

Springer, 2016.

Cross-Domain Limitations of Neural Models on Biomedical Relation Classification

Alimova I., Tutubalina E., Nikolenko S. I., IEEE Access 2022 Vol. 10 P. 1432–1439

Relation extraction (RE) aims to extract relational facts from plain text, which is essential to the biomedical research field with the rapid growth of biomedical literature and generally large volumes of biomedicine-related text coming from various sources. Numerous annotated corpora and state-of-the-art models have been introduced in the past five years. However, there are no ...

Added: April 10, 2023

Multiple features for clinical relation extraction: A machine learning approach

Alimova I., Tutubalina E., Journal of Biomedical Informatics 2020 Vol. 103 P. 1–9

Relation extraction aims to discover relational facts about entity mentions from plain texts. In this work, we focus on clinical relation extraction; namely, given a medical record with mentions of drugs and their attributes, we identify relations between these entities. We propose a machine learning model with a novel set of knowledge-based and BioSentVec embedding ...

Added: October 28, 2020

RUREBUS-2020 Shared Task: Russian Relation Extraction for Business

Ivanin V., Artemova E., Batura T. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, June 17–20, 2020). Issue 19(26): supplementary volume. 2020. P. 401–416.

In this paper, we present a shared task on core information extraction problems, named entity recognition and relation extraction. In contrast to popular shared tasks on related problems, we try to move away from strictly academic rigor and rather model a business case. As a source for textual data we choose the corpus ...

Added: June 21, 2020

RuREBus-2020 Shared Task: Russian Relation Extraction for Business

Artemova E., Batura T., Sarkisyan V. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, June 17–20, 2020). Issue 19(26). Moscow: RGGU Publishing House, 2020. P. 416–432.

The paper presents the results of a shared task on named entity recognition and relation extraction. The goal of the shared task is to compare methods for entity and relation extraction in Russian in a setting close to industrial tasks. The source text collection was a corpus from the Ministry of Economic Development of the Russian Federation containing strategic development programs. The corpus was annotated according to guidelines developed by the authors of the paper. In the process ...

Added: June 11, 2020

FactRuEval 2016: Evaluation of Named Entity Recognition and Fact Extraction Systems for Russian

Starostin A. S., Bocharov V. V., Alexeeva S. V. et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, July 1–4, 2016). Issue 15. Moscow: RGGU Publishing House, 2016. P. 688–705.

In this paper, we describe the rules and results of the FactRuEval information extraction competition held in 2016 as part of the Dialogue Evaluation initiative in the run-up to Dialogue 2016. The systems were to extract information from Russian texts and competed in two named entity extraction tracks and one fact extraction track. ...

Added: October 7, 2016

Exploring Pattern Structures of Syntactic Trees for Relation Extraction

Leeuwenberg A., Buzmakov A. V., Toussaint Y. et al., in: Formal Concept Analysis. 13th International Conference, ICFCA 2015, Nerja, Spain, June 23–26, 2015, Proceedings. Vol. 9113. Springer, 2015. P. 153–168.

In this paper we explore the possibility of defining an original pattern structure for managing syntactic trees. More precisely, we are interested in the extraction of relations such as drug-drug interactions (DDIs) in medical texts where sentences are represented as syntactic trees. In this specific pattern structure, called STPS, the similarity operator is based on ...

Added: October 22, 2015