### Book

## Book of Abstracts of the 12th International Conference "Intelligent Data Processing"

The volume contains the abstracts of the 12th International Conference "Intelligent Data Processing: Theory and Applications". The conference is organized by the Russian Academy of Sciences, the Federal Research Center "Informatics and Control" of the Russian Academy of Sciences, and the Scientific and Coordination Center "Digital Methods of Data Mining". The conference has been held biennially since 1989. It is one of the most recognizable scientific forums on data mining, machine learning, pattern recognition, image analysis, signal processing, and discrete analysis.

The Organizing Committee of IDP-2018 is grateful to Forecsys Co. and CFRS Co. for providing assistance in the conference preparation and execution. The conference is funded by RFBR, grant 18-07-20075.

The conference website is http://mmro.ru/en/.

For equations of mathematical physics that are the Euler-Lagrange equations of the corresponding variational problems, soliton solutions form an important class of solutions. The study of soliton solutions is based on a one-to-one correspondence between soliton solutions of the initial systems and solutions of the induced functional-differential equations of pointwise type (FDEPT). The existence and uniqueness theorem for an induced FDEPT guarantees the existence and uniqueness of a soliton solution with given initial values for systems with a quasilinear potential. For such systems, one can also formulate conditions for the existence of a periodic solution. A system with a polynomial potential can be redefined so that the resulting potential is quasilinear. If a guaranteed periodic soliton solution of such a redefined system lies inside a sphere outside which the potential is redefined, then we obtain conditions for the existence of a periodic soliton solution of the initial system with a polynomial potential. An important task, the numerical realization of periodic soliton solutions for systems with a polynomial potential, has been successfully solved.

The paper gives a brief introduction to multiple classifier systems and describes a particular algorithm that improves classification accuracy by recommending a classifier for each object. The recommendation rests on the hypothesis that a classifier is likely to predict the label of an object correctly if it has correctly classified the object's neighbors. The process of assigning a classifier to each object uses the apparatus of Formal Concept Analysis. We explain the principle of the algorithm on a toy example and describe experiments with real-world datasets.
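The neighbor-based hypothesis can be illustrated with a minimal sketch. It is not the paper's algorithm: it replaces the Formal Concept Analysis step with a plain k-nearest-neighbor lookup, and the names `recommend_classifier`, `clf_a`, and `clf_b` as well as the toy data are assumptions made for illustration only.

```python
def nearest(train_X, x, k):
    """Return indices of the k training points closest to x (squared Euclidean distance)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return order[:k]


def recommend_classifier(classifiers, train_X, train_y, x, k=3):
    """Pick the classifier that labels the most of x's k neighbors correctly.

    Ties are resolved in favor of the classifier listed first.
    """
    neighbors = nearest(train_X, x, k)

    def local_hits(clf):
        return sum(clf(train_X[i]) == train_y[i] for i in neighbors)

    return max(classifiers, key=local_hits)


# Two hypothetical base classifiers, each reliable in a different region.
def clf_a(p):
    return int(p[0] > 0.5)


def clf_b(p):
    return int(p[1] > 0.5)


# Toy data: clf_b is right on the first three points, clf_a on the last three.
train_X = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.7), (0.9, 0.1), (0.8, 0.2), (0.85, 0.3)]
train_y = [1, 1, 1, 1, 1, 1]

for query in [(0.12, 0.85), (0.88, 0.15)]:
    chosen = recommend_classifier([clf_a, clf_b], train_X, train_y, query)
    print(query, chosen.__name__, chosen(query))
```

Each query is routed to the classifier with the best local track record, so both queries receive label 1 even though neither base classifier is correct on the whole training set.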

The article is devoted to the history and problems of interface design. It shows the complexity and importance of effective interfaces and notes that the problem is systemic, multilevel, and interdisciplinary. When new systems are designed, serious attention should be given to the efficiency of the human operator, since the human is still the leading element determining the efficiency of any ergatic system. The main means of control in ergatic systems that include computers is the graphic manipulator (GM), which is used to operate on-screen controls. The main styles of user interface are described. The most popular are the graphical user interface (GUI) and the web user interface (WUI) based on it. The development of computer modeling hardware and technology has led to the active introduction of virtual reality technology, which immerses people in artificial worlds. The main features of these environments, which are called immersive, are full control over all parameters of the environment and the sense of presence that arises in the people inside them. Induced-environment technology makes possible a number of new interfaces, not generally applicable at present, that use specially engineered virtual environments. Much attention is paid to the most advanced systems, contactless control systems, which consist of a camera and sophisticated software. The drawbacks of modern contactless control are discussed. Work is under way to create a contactless intelligent interface that will be controlled using data from a video camera installed on the computer, have high noise immunity, reliably identify the user, recognize the situational environment, and have an acceptable cost.

The paper deals with the problems of creating and tuning a system of automated anaphora resolution for Russian. Such a system is introduced, combining rule-based and machine learning approaches. It achieves an F-measure of 0.51 to 0.59. Freeling serves as the underlying morphological layer; an account of its quality is given, along with its influence on the anaphora resolution workflow. The anaphora resolution system itself is available to download and use, and comes with an online demo.

This proceedings publication is a compilation of selected contributions from the “Third International Conference on the Dynamics of Information Systems”, which took place at the University of Florida, Gainesville, February 16–18, 2011. The purpose of this conference was to bring together scientists and engineers from industry, government, and academia in order to exchange new discoveries and results in a broad range of topics relevant to the theory and practice of dynamics of information systems. Dynamics of Information Systems: Mathematical Foundation presents state-of-the-art research and is intended for graduate students and researchers interested in some of the most recent discoveries in information theory and dynamical systems. Scientists in other disciplines may also benefit from the applications of new developments to their own area of study.

In an effort to make reading more accessible, an automated readability formula can help students retrieve material appropriate to their language level. This study attempts to discover and analyze a set of possible features that can be used for single-sentence readability prediction in Russian. We test the influence of syntactic features on the predictability of structural complexity. The readability of sentences from the SynTagRus corpus was marked up manually and used for evaluation.

In Experimental Economics, laboratory and field experiments are conducted on subjects in order to improve theoretical knowledge about human behavior in interactions. Although paying different amounts of money restricts the preferences of the subjects in experiments, the exclusive application of analytical game theory does not suffice to explain the recorded data. This calls for the development and evaluation of more sophisticated models. In some experiments, human subjects interact with automated agents, and these agents are used to simulate human interactions. The more data is used for the evaluation, the more statistical significance can be achieved. Since huge amounts of behavioral data have to be scanned for regularities and automated agents are required to simulate and to intervene in human interactions, Machine Learning is the tool of choice for research in Experimental Economics. Moreover, modern economics extensively involves network structures, which can be modeled as graphs or more complicated relational structures. This volume contains the papers presented at the inaugural International Workshop on Experimental Economics and Machine Learning (EEML 2012) held on May 9, 2012 at the Katholieke Universiteit Leuven, Belgium. This year the committee decided to accept eight full papers for publication in the proceedings and two abstracts for presentation at the conference. Each submission was reviewed by three program committee members on average. R. Tagiew proposes a new method for mining determinism in human strategic behavior. N. Buzun et al. present a comparison of methods and measures for overlapping community detection. A. Fishkov et al. discuss a new click model for relevance prediction in Web search. A. Drutsa et al. applied novel data visualisation techniques to socio-semantic network data. Gilabert et al. made an experimental study on the relationship between trust and budgetary slack. O. Barinova et al. proposed using online random forest for interactive image segmentation. A. Bezzubtseva et al. built a new typology of collaboration platform users. V. Zaharchuk et al. proposed a new recommender system for interactive radio network services. D. Ignatov et al. designed a prototype system for collaborative platform data analysis.

This paper is an overview of the current issues and tendencies in computational linguistics. The overview is based on the materials of the conference on computational linguistics COLING’2012. Modern approaches to traditional NLP domains such as POS tagging, syntactic parsing, and machine translation are discussed. The highlights of automated information extraction, such as fact extraction and opinion mining, are also in focus. The main tendency of modern technologies in computational linguistics is to bring higher levels of linguistic analysis (discourse analysis, cognitive modeling) into the models and to combine machine learning technologies with algorithmic methods based on deep expert linguistic knowledge.

Most of today’s machine learning techniques require large amounts of manually labeled data. This problem can be solved by using synthetic images. Our main contribution is to evaluate methods of traffic sign recognition trained on synthetically generated data and to show that the results are comparable with those of classifiers trained on a real dataset. To get a representative synthetic dataset, we model different sign image variations such as intra-class variability, imprecise localization, blur, lighting, and viewpoint changes. We also present a new method for traffic sign segmentation, based on a nearest neighbor search in a large set of synthetically generated samples, which improves current traffic sign recognition algorithms.
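A minimal sketch of how such variations might be generated from a single template image follows. It is not the authors' pipeline: it covers only imprecise localization (a pixel shift), lighting (gain and bias), and blur (a 3x3 mean filter), leaves out intra-class variability and viewpoint changes, and uses a tiny grayscale grid as a stand-in for a sign. All function names and parameter ranges are assumptions.

```python
import random


def shift(img, dx, dy, fill=0.0):
    """Translate the image by (dx, dy) pixels, imitating imprecise localization."""
    h, w = len(img), len(img[0])
    return [
        [img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
         for x in range(w)]
        for y in range(h)
    ]


def relight(img, gain, bias):
    """Scale and offset intensities, imitating lighting changes; clip to [0, 1]."""
    return [[min(1.0, max(0.0, gain * v + bias)) for v in row] for row in img]


def box_blur(img):
    """Apply a 3x3 mean filter with clamped borders, imitating blur."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(
                img[min(h - 1, max(0, y + j))][min(w - 1, max(0, x + i))]
                for j in (-1, 0, 1)
                for i in (-1, 0, 1)
            ) / 9.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def synthesize(template, n, seed=0):
    """Generate n randomized variants of the template (ranges are illustrative)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        img = shift(template, rng.randint(-1, 1), rng.randint(-1, 1))
        img = relight(img, rng.uniform(0.6, 1.2), rng.uniform(-0.1, 0.1))
        if rng.random() < 0.5:
            img = box_blur(img)
        samples.append(img)
    return samples


# Hypothetical 5x5 grayscale "sign" template with intensities in [0, 1].
template = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
samples = synthesize(template, 10)
print(len(samples), "synthetic samples generated")
```

Fixing the seed keeps the generated set reproducible, which makes it easier to compare classifiers trained on the same synthetic data.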

A form for an unbiased estimate of the coefficient of determination of a linear regression model is obtained. It is calculated by using a sample from a multivariate normal distribution. This estimate is proposed as an alternative criterion for a choice of regression factors.
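The abstract does not give the form of the proposed estimate, so the sketch below does not reproduce it. For orientation only, it computes the ordinary sample coefficient of determination and the familiar adjusted R², a different and widely used correction that also penalizes the number of factors. The function names and the data are assumptions made for illustration.

```python
def r_squared(y, y_hat):
    """Sample coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n observations and p regression factors (requires n > p + 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical observations and fitted values from a one-factor model.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.9, 4.9]

r2 = r_squared(y, y_hat)
print(round(r2, 4), round(adjusted_r_squared(r2, n=len(y), p=1), 4))
```

When candidate factor sets are compared, the adjusted value can decrease as a factor is added, whereas the plain R² never decreases for a least-squares fit; this is what makes such corrected criteria usable for factor selection.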