CDUD 2016 – The 3rd International Workshop on Concept Discovery in Unstructured Data

CEUR Workshop Proceedings, 2016.

Edited by: Baixeries J., Ignatov D. I., Ilvovsky D., Panchenko A.

Concept discovery is a subdomain of Knowledge Discovery (KDD) that uses human-centered techniques such as Formal Concept Analysis (FCA), Topic Modeling, Visual Text Representations, and Conceptual Graphs to gain insight into the underlying conceptual structure of the data. Traditional machine learning techniques focus mainly on structured data, whereas most available data resides in unstructured, often textual, form. Compared to traditional data mining techniques, human-centered instruments actively engage the domain expert in the discovery process.

This volume contains the papers presented at the 3rd International Workshop on Concept Discovery in Unstructured Data (CDUD 2016), held on July 18, 2016 at the National Research University Higher School of Economics, Moscow, Russia. The workshop welcomes papers describing innovative research on data discovery in complex data. In particular, it provides a forum for researchers and developers of text mining instruments whose research is related to the analysis of linguistic and text data.

This year 15 papers were submitted. Each submission was reviewed by at least two Program Committee members. Seven papers were accepted for regular publication in the proceedings, and three more submissions were accepted for publication as project proposals or abstracts.

Papers included in this volume cover a wide range of topics related to text mining and structures for text representation: text navigation, statistical learning models, automatic author or field identification in texts, among others.

An invited talk given by Natalia Loukachevitch from Moscow State University opened the workshop program. She surveyed modern tasks and approaches in sentiment analysis of Twitter messages.

Our deep gratitude goes to all the authors of submitted papers, as well as to the Program Committee members for their commitment. We would also like to thank our invited speaker and our sponsors: National Research University Higher School of Economics (Moscow, Russia), Russian Foundation for Basic Research, and ExactPro. Finally, we would like to acknowledge the EasyChair system, which helped us manage the reviewing process.

Verb-noun collocation and government model extraction from large corpora

Dereza O., Тушканов В. Н., in: CDUD 2016 – The 3rd International Workshop on Concept Discovery in Unstructured Data. CEUR Workshop Proceedings, 2016. P. 72.

A study on verb-noun collocation and government model extraction from large corpora ...

Added: October 5, 2017

Priority areas: IT and mathematics

Language: English

Keywords: natural language processing, machine learning, concept discovery

8th Russian Summer School in Information Retrieval (RuSSIR 2014)

Braslavski P., Karpov Nikolay, Worring M. et al., ACM SIGIR Forum 2014 Vol. 48 No. 2 P. 105-110

The 8th Russian Summer School in Information Retrieval (RuSSIR 2014) was held on August 18-22, 2014 in Nizhniy Novgorod, Russia. The school was co-organized by the National Research University Higher School of Economics and the Russian Information Retrieval Evaluation Seminar (ROMIP) ...

Added: August 22, 2015

Analysis of Images, Social Networks and Texts. Third International Conference, AIST 2014, Yekaterinburg, Russia, April 10-12, 2014, Revised Selected Papers

Berlin : Springer, 2014

This book constitutes the proceedings of the Third International Conference on Analysis of Images, Social Networks and Texts, AIST 2014, held in Yekaterinburg, Russia, in April 2014. The 11 full and 10 short papers were carefully reviewed and selected from 74 submissions. They are presented together with 3 short industrial papers, 4 invited papers and ...

Added: November 13, 2014

Supplementary Proceedings of the 3rd International Conference on Analysis of Images, Social Networks and Texts (AIST 2014)

Ekaterinburg : CEUR Workshop Proceedings, 2014

AIST'2014 is an international data science conference on Analysis of Images, Social Networks, and Texts. Traditionally, the conference is held annually in Yekaterinburg, Russia. The conference is intended for computer scientists and practitioners whose research interests involve Internet mathematics and other related fields of data science. LIST OF TOPICS (NON EXHAUSTIVE) Applications of Data Mining and Machine ...

Added: August 28, 2014

Analysis of Images, Social Networks and Texts. 4th International Conference, AIST 2015, Yekaterinburg, Russia, April 9–11, 2015, Revised Selected Papers

Switzerland : Springer, 2015

This book constitutes the proceedings of the Fourth International Conference on Analysis of Images, Social Networks and Texts, AIST 2015, held in Yekaterinburg, Russia, in April 2015. The 24 full and 8 short papers were carefully reviewed and selected from 140 submissions. The papers are organized in topical sections on analysis of images and videos; ...

Added: October 12, 2015

Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

Berlin : Association for Computational Linguistics, 2016

The 2016 Conference on Computational Natural Language Learning is the twentieth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL 2016 will be held on August 11-12, 2016, and is co-located with the 54th annual meeting of the Association for Computational Linguistics (ACL) in Berlin, ...

Added: November 12, 2016

Recent Trends in Analysis of Images, Social Networks and Texts. 9th International Conference, AIST 2020, Skolkovo, Moscow, Russia, October 15–16, 2020 Revised Supplementary Proceedings

Springer, 2021

This book constitutes revised selected papers from the 9th International Conference on Analysis of Images, Social Networks and Texts, AIST 2020, held during October 15-16, 2020. The conference was planned to take place in Moscow, Russia, but changed to an online format due to the COVID-19 pandemic. The 27 full papers and 4 short papers presented ...

Added: October 7, 2020

Texterra: an infrastructure for text analysis

Денис Турдаков, Астраханцев Н. А., Недумов Я. Р. et al., Труды Института системного программирования РАН 2014 Vol. 26 P. 421-438

The paper presents a framework for fast text analytics developed during the Texterra project. Texterra is a technology for multilingual text mining based on novel text processing methods that exploit knowledge extracted from user-generated content. It delivers a fast, scalable solution for text mining without expensive customization. Depending on use-cases Texterra could be utilized ...

Added: November 6, 2017

Breaking Sticks and Ambiguities with Adaptive Skip-gram

Bartunov S., Кондрашкин Д. А., Osokin A. et al. / Arxiv.org. Series arXiv:1502.07257 "Computation and language". 2015.

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take into account word ambiguity and maintains only a single representation per word. Although a number of Skip-gram modifications were proposed to ...

Added: November 5, 2015

Analysis of Images, Social Networks and Texts. 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7-9, 2016, Revised Selected Papers. Communications in Computer and Information Science

Switzerland : Springer, 2017

This book constitutes the proceedings of the 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016, held in Yekaterinburg, Russia, in April 2016. The 23 full papers, 7 short papers, and 3 industrial papers were carefully reviewed and selected from 142 submissions. The papers are organized in topical sections on machine ...

Added: October 19, 2016

Generalized approach to sentiment analysis of short text messages in natural language processing

Polyakov E. V., Voskov L., Abramov P. et al., Informatsionno-upravliaiushchie sistemy [Information and Control Systems] 2020 No. 1 P. 2-14

Introduction: Sentiment analysis is a complex problem whose solution essentially depends on the context, field of study and amount of text data. Analysis of publications shows that the authors often do not use the full range of possible data transformations and their combinations. Only a part of the transformations is used, limiting the ways to ...

Added: February 20, 2020

Artificial Intelligence and Natural Language. AINL 2020. Communications in Computer and Information Science

Springer, 2020

Added: September 8, 2020

Intelligent Data Processing 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers

Switzerland : Springer, 2019

This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...

Added: February 8, 2020

The Applications of Sentiment Analysis for Russian Language Texts: Current Challenges and Future Perspectives

Smetanin S., IEEE Access 2020 Vol. 8 P. 110693-110719

Sentiment analysis has become a powerful tool in processing and analysing expressed opinions on a large scale. While the application of sentiment analysis on English-language content has been widely examined, its application to the Russian language remains less well studied. In this survey, we comprehensively reviewed the applications of sentiment analysis of Russian-language content and ...

Added: June 24, 2020

Texterra: A framework for text analysis.

S.D. Kuznetsov, D.Yu. Turdakov, Астраханцев Н. А. et al., Programming and Computer Software 2014 Vol. 40 No. 5 P. 288-295

A framework for fast text analysis, which is developed as a part of the Texterra project, is described. Texterra provides a scalable solution for the fast text processing on the basis of novel methods that exploit knowledge extracted from the Web and text documents. For the developed tools, details of the project, use cases, and ...

Added: November 26, 2017

Classification of Short Scientific Texts

I. K. Kusakin, Fedorets O. V., A. Y. Romanov, Scientific and Technical Information Processing 2023 Vol. 50 No. 3 P. 176-183

This paper discusses modern approaches to natural language processing and the application of machine learning models to the task of classifying short scientific texts in Russian. This study is devoted to the analysis of methods for vectorization of textual information, selection of a model for scientific paper classification, and training of linguistic model BERT ...

Added: November 4, 2023

Knowledge Engineering and Semantic Web

Switzerland : Springer, 2015

This book constitutes the refereed proceedings of the 6th Conference on Knowledge Engineering and the Semantic Web, KESW 2015, held in Moscow, Russia, in September/October 2015. The 17 revised full papers presented together with 6 short system descriptions were carefully reviewed and selected from 35 submissions. The papers address research issues related to semantic web, ...

Added: September 16, 2015

Artificial Intelligence. RCAI 2021. Lecture Notes in Computer Science

Springer, 2021

This book constitutes the proceedings of the 19th Russian Conference on Artificial Intelligence, RCAI 2021, held in Moscow, Russia, in October 2021. The 19 full papers and 7 short papers presented in this volume were carefully reviewed and selected from 80 submissions. The conference deals with a wide range of topics, categorized into the following topical ...

Added: October 28, 2021

Proceedings of the International Conference «Wave Electronics and its Application in Information and Telecommunication Systems (WECONF)». IEEE # 47647. Saint Petersburg State University of Aerospace Instrumentation. June 03-07, 2019

Nazarov A., Сычев А. К., Voronkov I. M., IEEE, 2019

The article describes the shortcomings of the modern datasets used in the development of next-generation intrusion detection systems and proposes new requirements for datasets. Based on these requirements, a new software architecture is proposed that makes it possible to model modern computer attacks and at the same time “mark up” logs generated on hosts and by network ...

Added: May 8, 2020

Multiple features for clinical relation extraction: A machine learning approach

Alimova I., Tutubalina E., Journal of Biomedical Informatics 2020 Vol. 103 P. 1-9

Relation extraction aims to discover relational facts about entity mentions from plain texts. In this work, we focus on clinical relation extraction; namely, given a medical record with mentions of drugs and their attributes, we identify relations between these entities. We propose a machine learning model with a novel set of knowledge-based and BioSentVec embedding ...

Added: October 28, 2020

Array DBMS: Past, Present, and (Near) Future

Rodriges Zalipynis R. A., Proceedings of the VLDB Endowment 2021 Vol. 14 No. 12 P. 3186-3189

Array DBMSs strive to be the best systems for managing, processing, and even visualizing big N-d arrays. The last decade blossomed with R&D in array DBMS, making it a young and fast-evolving area. We present the first comprehensive tutorial on array DBMS R&D. We start from past impactful results that are still relevant today, then ...

Added: June 4, 2021

Models, Algorithms, and Technologies for Network Analysis. Springer Proceedings in Mathematics & Statistics

Springer, 2017

This valuable source for graduate students and researchers provides a comprehensive introduction to current theories and applications in optimization methods and network models. Contributions to this book are focused on new efficient algorithms and rigorous mathematical theories, which can be used to optimize and analyze mathematical graph structures with massive size and high density induced ...

Added: June 26, 2017

Machine Learning and Data Mining in Pattern Recognition

Springer, 2014

This book constitutes the refereed proceedings of the 10th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2014, held in St. Petersburg, Russia in July 2014. The 40 full papers presented were carefully reviewed and selected from 128 submissions. The topics range from theoretical topics for classification, clustering, association rule and ...

Added: September 30, 2014

Meta-Learning with Memory-Augmented Neural Networks

Santoro A., Bartunov S., Botvinick M. et al., Journal of Machine Learning Research 2016 Vol. 48

Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information ...

Added: October 19, 2016

Faster variational inducing input Gaussian process classification

Izmailov P., Kropotov D., Journal of machine learning and data analysis 2017 Vol. 3 No. 1 P. 20-35

Background: Gaussian processes (GP) provide an elegant and effective approach to learning in kernel machines. This approach leads to a highly interpretable model and allows using the Bayesian framework for model adaptation and incorporating the prior knowledge about the problem. The GP framework is successfully applied to regression, classification, and dimensionality reduction problems. Unfortunately, the ...

Added: December 6, 2018