Book
Analysis of Images, Social Networks and Texts: 8th International Conference, AIST 2019
Creating a process model (PM) is a convenient means to depict the behavior of an information system. However, user behavior is not static and tends to change over time. In order to remain relevant, PMs have to be adjusted to this ever-changing behavior. Sometimes the existing PM is of high value (e.g., it is well-structured, or it has been continuously developed by experts), which makes creating a brand-new model with discovery algorithms less preferable. In this case, a different and better-suited approach is to adjust the existing model by repairing only those PM fragments that do not fit the actual behavior recorded in the sub-log. This article presents a method for efficient decomposition of PMs for their subsequent repair, which aims to improve the accuracy of model repair. Unlike previously introduced algorithms, it finds a minimum spanning tree over a subset of the vertices of an undirected graph. This reduces the size of the fragment to be repaired and enhances the quality of the repaired model according to various conformance metrics.
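To make the spanning-tree idea concrete, here is a minimal sketch (not the paper's implementation) of localizing a connected repair fragment. It assumes the model is given as an undirected networkx graph and that bad_nodes marks the ill-fitting elements (at least two); both are illustrative assumptions. The ill-fitting nodes are connected through a minimum spanning tree of their metric closure, and the tree edges are expanded back into model paths.

    import itertools
    import networkx as nx

    def minimal_repair_fragment(model: nx.Graph, bad_nodes: set) -> nx.Graph:
        # Build the metric closure restricted to the ill-fitting nodes:
        # a complete graph whose edge weights are shortest-path distances.
        closure = nx.Graph()
        paths = {}
        for u, v in itertools.combinations(bad_nodes, 2):
            path = nx.shortest_path(model, u, v)
            paths[(u, v)] = path
            closure.add_edge(u, v, weight=len(path) - 1)

        # The minimum spanning tree of this closure picks the cheapest way
        # to connect all ill-fitting nodes into a single fragment.
        mst = nx.minimum_spanning_tree(closure)

        # Expand the MST edges back into paths of the original model to
        # obtain the connected fragment that has to be repaired.
        fragment_nodes = set()
        for u, v in mst.edges():
            fragment_nodes.update(paths.get((u, v)) or paths[(v, u)])
        return model.subgraph(fragment_nodes)

Keeping the fragment small in this way is what limits the repair to the misfitting part of the model instead of rediscovering it wholesale.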
A large number of methods have recently been developed in the area of deep reinforcement learning, but the scope of their application is limited. The available environments do not always allow for a comprehensive assessment of a new agent training algorithm. The main purpose of this article is to present an extensible environment for the Match-3 game that has a connection to a real business setting. Results for the most popular deep reinforcement learning algorithms are presented as baselines.
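For illustration only, below is a minimal Gym-style skeleton of the interface such a Match-3 environment exposes. The board size, action encoding, reward scheme, and the Match3Env name are assumptions made for this sketch, not the environment released with the paper.

    import numpy as np

    class Match3Env:
        def __init__(self, rows=9, cols=9, n_colors=6, seed=0):
            self.rows, self.cols, self.n_colors = rows, cols, n_colors
            self.rng = np.random.default_rng(seed)

        def reset(self):
            self.board = self.rng.integers(1, self.n_colors + 1,
                                           size=(self.rows, self.cols))
            return self.board.copy()

        def step(self, action):
            # An action swaps two adjacent tiles, encoded as ((r, c), direction)
            # with direction in {0: up, 1: right, 2: down, 3: left}.
            (r, c), d = action
            dr, dc = [(-1, 0), (0, 1), (1, 0), (0, -1)][d]
            r2, c2 = r + dr, c + dc
            if not (0 <= r2 < self.rows and 0 <= c2 < self.cols):
                return self.board.copy(), 0.0, False, {}  # invalid swap: no-op
            self.board[r, c], self.board[r2, c2] = self.board[r2, c2], self.board[r, c]
            reward = float(self._count_matched())         # tiles in runs of >= 3
            done = reward == 0.0                          # illustrative episode rule
            return self.board.copy(), reward, done, {}

        def _count_matched(self):
            # Count tiles belonging to a horizontal or vertical run of >= 3.
            b, matched = self.board, np.zeros_like(self.board, dtype=bool)
            for r in range(self.rows):
                for c in range(self.cols - 2):
                    if b[r, c] == b[r, c + 1] == b[r, c + 2]:
                        matched[r, c:c + 3] = True
            for c in range(self.cols):
                for r in range(self.rows - 2):
                    if b[r, c] == b[r + 1, c] == b[r + 2, c]:
                        matched[r:r + 3, c] = True
            return int(matched.sum())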
In this paper, we present the first gold-standard corpus of Russian noun compounds annotated with compositionality information. We used Universal Dependencies treebanks to collect noun compounds according to part-of-speech patterns, such as ADJ-NOUN or NOUN-NOUN, and annotated them according to the following schema: a phrase can be compositional, non-compositional, or ambiguous (i.e., depending on the context it can be interpreted as either compositional or non-compositional). Next, we conduct a series of experiments to evaluate both unsupervised and supervised methods for predicting compositionality. To expand this manually annotated dataset with more non-compositional compounds and to streamline the annotation process, we use active learning. We show that the methods previously proposed for English are not only easily adapted to Russian but can also be exploited in an active learning paradigm, which increases the efficiency of the annotation process.
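As an illustration of the active learning component, here is a minimal pool-based uncertainty-sampling round, assuming compounds are already encoded as feature vectors (e.g., distributional features); it is a generic sketch, not the paper's exact pipeline.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def active_learning_round(X_labeled, y_labeled, X_pool, batch_size=10):
        """Pick the pool examples the current model is least sure about."""
        clf = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
        proba = clf.predict_proba(X_pool)
        # Uncertainty = 1 - max class probability; the most uncertain
        # examples are sent to the annotators next.
        uncertainty = 1.0 - proba.max(axis=1)
        return np.argsort(uncertainty)[-batch_size:]

Each round retrains the classifier on the labels collected so far, so annotation effort concentrates on the compounds the model finds hardest, such as rare non-compositional phrases.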
We investigate the performance of sentence embedding models on several tasks for the Russian language. In our comparison, we include tasks such as multiple-choice question answering, next sentence prediction, and paraphrase identification. We employ FastText embeddings as a baseline and compare them to ELMo and BERT embeddings. We conduct two series of experiments, using both unsupervised (i.e., based on a similarity measure only) and supervised approaches to the tasks. Finally, we present datasets for multiple-choice question answering and next sentence prediction in Russian.
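A minimal sketch of the unsupervised, similarity-only setup follows: for a multiple-choice question, the candidate whose embedding is closest to the question embedding is selected. The embed function stands in for any of the compared encoders (averaged FastText, ELMo, BERT) and is an assumed interface.

    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def choose_answer(embed, question: str, candidates: list) -> int:
        # Return the index of the candidate most similar to the question.
        q = embed(question)
        scores = [cosine(q, embed(c)) for c in candidates]
        return int(np.argmax(scores))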
Sign language is the main means of communication for people in the deaf community. However, most hearing people do not know sign language. In this paper, we overview several real-time sign language dactyl (fingerspelling) recognition systems based on deep convolutional neural networks. These systems are able to recognize dactylized words, spelled with a separate sign for each letter. We evaluate our approach on American (ASL) and Russian (RSL) sign languages. This solution may help speed up communication for deaf people. In addition, we present an algorithm for generating sign animation from text using a text-to-sign video vocabulary; this helps to integrate sign language into dubbed TV and, combined with a speech recognition tool, provides full translation from natural language to sign language.
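To give a flavor of the recognition side, below is a minimal convolutional classifier mapping a cropped hand image to a fingerspelled letter; the architecture, input size, and class count are illustrative assumptions, not the networks evaluated in the paper.

    import torch
    import torch.nn as nn

    class DactylCNN(nn.Module):
        def __init__(self, n_letters=33):  # e.g., 33 for RSL, 26 for ASL
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            # Three 2x poolings shrink a 64x64 input to 8x8 feature maps.
            self.classifier = nn.Linear(128 * 8 * 8, n_letters)

        def forward(self, x):              # x: (batch, 3, 64, 64)
            h = self.features(x)
            return self.classifier(h.flatten(1))

Running such a network per frame on the detected hand region and concatenating the per-letter predictions yields the dactylized word.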

This book constitutes the refereed proceedings of the 10th International Conference on Formal Concept Analysis, ICFCA 2012, held in Leuven, Belgium, in May 2012. The 20 revised full papers presented together with 6 invited talks were carefully reviewed and selected from 68 submissions. The topics covered in this volume range from recent advances in machine learning and data mining, through mining terrorist networks and revealing criminals and concept-based process mining, to scalability issues in FCA and rough sets.
In this paper we propose the software system CORDIET-Healthcare, which we are currently developing in collaboration with the Katholieke Universiteit Leuven, the Moscow Higher School of Economics, and the GZA hospital group located in Antwerp. The main aim of this system is to offer healthcare management staff a user-friendly and powerful data analysis environment. Using state-of-the-art techniques from computer science and mathematics, we show how CORDIET-Healthcare can be used to gain insight into existing care processes and to reveal actionable knowledge which can be used to improve the current way of working.
This book constitutes the second part of the refereed proceedings of the 10th International Conference on Formal Concept Analysis, ICFCA 2012, held in Leuven, Belgium, in May 2012. The topics covered in this volume range from recent advances in machine learning and data mining, through mining terrorist networks and revealing criminals and concept-based process mining, to scalability issues in FCA and rough sets.
This is a textbook in data analysis. Its contents are heavily influenced by the idea that data analysis should help in enhancing and augmenting knowledge of the domain as represented by the concepts and statements of relation between them. According to this view, two main pathways for data analysis are summarization, for developing and augmenting concepts, and correlation, for enhancing and establishing relations. Visualization, in this context, is a way of presenting results in a cognitively comfortable way. The term summarization is understood quite broadly here to embrace not only simple summaries like totals and means, but also more complex summaries such as the principal components of a set of features or cluster structures in a set of entities.
The material, presented from this perspective, forms a unique mix of subjects from the fields of statistical data analysis, data mining, and computational intelligence, fields that otherwise follow different systems of presentation.
We describe the FCART software system, a universal integrated environment for knowledge and data engineers with a set of research tools based on Formal Concept Analysis. The system is intended for knowledge discovery in big dynamic data collections, including text collections. FCART allows the user to load structured and unstructured data (texts and various metainformation) from heterogeneous data sources, build data snapshots, compose queries, and generate and visualize concept lattices, clusters, attribute dependencies, and other useful analytical artifacts. A full preprocessing scenario is also considered.
The Formal Concept Analysis Research Toolbox (FCART) is an integrated environment for knowledge and data engineers with a set of research tools based on Formal Concept Analysis. FCART allows a user to load structured and unstructured data (including texts with various metadata) from heterogeneous data sources into local data storage, compose scaling queries for data snapshots, and then study classical and some innovative FCA artifacts in analytic sessions.
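To illustrate the core FCA operation such tools build on, here is a from-scratch sketch that enumerates all formal concepts (extent/intent pairs) of a binary context by closing object subsets; the toy context is purely illustrative and unrelated to FCART's data storage.

    from itertools import chain, combinations

    objects = {"doc1": {"fca", "lattice"},
               "doc2": {"fca", "mining"},
               "doc3": {"lattice", "mining"}}
    attributes = {"fca", "lattice", "mining"}

    def intent(objs):   # attributes shared by all objects in the set
        return set.intersection(*(objects[o] for o in objs)) if objs else set(attributes)

    def extent(attrs):  # objects possessing all of the attributes
        return {o for o, a in objects.items() if attrs <= a}

    concepts = set()
    for objs in chain.from_iterable(combinations(objects, r)
                                    for r in range(len(objects) + 1)):
        e = extent(intent(set(objs)))           # closure of the object subset
        concepts.add((frozenset(e), frozenset(intent(e))))

    for e, i in sorted(concepts, key=lambda c: len(c[0])):
        print(sorted(e), sorted(i))

The printed pairs, ordered by inclusion of their extents, form the concept lattice that tools like FCART visualize at scale.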