This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a competition aimed at stimulating the development of NLP tools and methods for processing of ellipsis. In this paper, we pay special attention to the gapping resolution methods that were introduced within the shared task as well as an alternative test set that illustrates that our corpus is a diverse and representative subset of Russian language gapping sufficient for effective utilization of machine learning techniques.
This book concentrates on in-depth explanation of a few methods that address core issues, rather than on presenting a multitude of methods that are popular among scientists. An added value of this edition is that I am trying to address two features of the brave new world that materialized after the first edition was written in 2010. These features are the emergence of “Data science” and changes in student cognitive skills in the process of global digitalization. The birth of Data science gives me more opportunities in delineating the field of data analysis. An overwhelming majority of both theoreticians and practitioners are inclined to consider the notions of “data analysis” (DA) and “machine learning” (ML) as synonymous. There are, however, at least two differences between the two. First comes the difference in perspectives. ML is to equip computers with methods and rules to see through regularities of the environment and behave accordingly. DA is to enhance conceptual understanding. These goals are not inconsistent, which explains a huge overlap between DA and ML. However, there are situations in which these perspectives are not consistent. Regarding the current students’ cognitive habits, I came to the conclusion that they prefer to immediately get into the “thick of it”. Therefore, I streamlined the presentation of multidimensional methods. These methods are now organized in four Chapters, one of which presents correlation learning (Chapter 3). The three other Chapters present summarization methods, both quantitative (Chapter 2) and categorical (Chapters 4 and 5). Chapter 4 relates to finding and characterizing partitions by using K-means clustering and its extensions. Chapter 5 relates to hierarchical and separative cluster structures.
Using the encoder-decoder data recovery approach brings forth a number of mathematically proven interrelations between methods that are used for addressing such practical issues as the analysis of mixed scale data, data standardization, the number of clusters, cluster interpretation, etc. An obvious bias towards summarization against correlation can be explained, first, by the fact that most texts in the field are biased in the opposite direction, and, second, by my personal preferences. Categorical summarization, that is, clustering, is considered not just a method of DA but rather a model of classification as a concept in knowledge engineering. Also, in this edition, I somewhat relaxed the “presentation/formulation/computation” narrative structure, which was omnipresent in the first edition, to be able to do things in one go. Chapter 1 presents the author’s view on the DA mainstream, or core, as well as on a few Data science issues in general. Specifically, I bring forward novel material on the role of DA, including its successes and pitfalls (Section 1.4), and classification as a special form of knowledge (Section 1.5). Overall, my goal is to show the reader that Data science is not a well-formed part of knowledge yet but rather a piece of science-in-the-making.
The materials of the International Scientific-Practical Conference are presented below. The Conference reflects the modern state of innovation in education, science, industry and the social-economic sphere, from the standpoint of introducing new information technologies. It is of interest to a wide range of researchers, teachers, graduate students and professionals in the field of innovation and information technologies.
This volume collects the refereed papers based on plenary, invited, and oral talks, as well as on the posters presented at the Third International Conference on Computer Simulations in Physics and beyond (CSP2018), which took place September 24-27, 2018 in Moscow. The Conference continues the tradition started by an inaugural conference in 2015. It took place on the campus of A.N. Tikhonov Moscow Institute of Electronics and Mathematics in Strogino and was jointly organized by the National Research University Higher School of Economics, the Landau Institute for Theoretical Physics and Science Center in Chernogolovka.
The Conference is a multidisciplinary meeting, with a focus on computational physics and related subjects. Indeed, methods of computational physics prove useful in a broad spectrum of research in multiple branches of natural sciences, and this volume provides a sample.
We hope that this volume will interest readers, and we are already looking forward to the next conference in the series.
CSP2018 Conference Chair and Volume Editor
2019 International Siberian Conference on Control and Communications (SIBCON). Proceedings
This book constitutes the refereed proceedings of the 9th International Conference on Optimization and Applications, OPTIMA 2018, held in Petrovac, Montenegro, in October 2018. The 35 revised full papers and the one short paper presented were carefully reviewed and selected from 103 submissions. The papers are organized in topical sections on mathematical programming; combinatorial and discrete optimization; optimal control; optimization in economy, finance and social sciences; applications.
This book covers the classical theory of Markov chains on general state-spaces as well as many recent developments. The theoretical results are illustrated by simple examples, many of which are taken from Markov Chain Monte Carlo methods. The book is self-contained, while all the results are carefully and concisely proven. Bibliographical notes are added at the end of each chapter to provide an overview of the literature.
Proceedings of Third Workshop "Computational linguistics and language science"
Sustaining a competitive edge in today’s business world requires innovative approaches to product, service, and management systems design and performance. Advances in computing technologies have presented managers with additional challenges as well as further opportunities to enhance their business models.
Software Engineering for Enterprise System Agility: Emerging Research and Opportunities is a collection of innovative research that identifies the critical technological and management factors in ensuring the agility of business systems and investigates process improvement and optimization through software development. Featuring coverage on a broad range of topics such as business architecture, cloud computing, and agility patterns, this publication is ideally designed for business managers, business professionals, software developers, academicians, researchers, and upper-level students interested in current research on strategies for improving the flexibility and agility of businesses and their systems.
Computer simulations are nowadays a firmly established third pillar of modern natural sciences, complementing experimentation and paper-and-pencil theoretical studies. Simulations, experiments in silico, prove indispensable in diverse areas of research in physics and other natural sciences. This volume collects papers based on presentations delivered at the Second International Conference on Computer Simulations in Physics and beyond (CSP2017), which took place October 9-12, 2017 in Moscow. The Conference, which continues a biannual tradition started by an inaugural conference in 2015, took place on the campus of A.N. Tikhonov Moscow Institute of Electronics and Mathematics and was jointly organized by the National Research University Higher School of Economics, the Landau Institute for Theoretical Physics and Science Center in Chernogolovka. As the name implies, the Conference is a multidisciplinary meeting, with a focus on computational physics and related subjects. Indeed, methods of computational physics prove useful in a broad spectrum of research in multiple branches of natural sciences, and this volume provides a sample. We hope that this volume will interest a wide range of readers, and we are already looking forward to the next conference in this biannual series.
The 29th DAAAM International Symposium on Intelligent Manufacturing and Automation took place in Zadar, Croatia, between the 24th and 27th October 2018, during the DAAAM International Week. The Symposium was organized by DAAAM International Vienna in cooperation with ÖIAV 1848, Vienna University of Technology, International Academy of Engineering and University of Applied Sciences – Technikum Wien, under the auspices of the Danube Rectors’ Conference and the Rectors’ and Presidents’ Honor Committee of DAAAM International for 2018. This year’s symposium aimed at continuing the success of the previous years, focusing on the five traditional objectives of the symposium: the presentation of the most recent high-quality results, support of the development of young scientists and researchers, organization of an international (summer) doctoral school, inauguration of new members of the Central European Branch of the International Academy of Engineering, and the provision of the necessary setting for stimulating discussions, brainstorming and networking among European and international researchers coming from academia, government agencies and industry.
The IEEE Russia North West Section, Saint Petersburg Electrotechnical University “LETI”, and the European Centre for Quality (Moscow) are pleased to present the Proceedings of the 2018 IEEE International Conference "Quality Management, Transport and Information Security, Information Technologies" (IT&QM&IS). The Conference was held in St. Petersburg, Russia on September 24–29, 2018, and it was proudly hosted by Saint Petersburg Electrotechnical University “LETI”. The Organizing Committee believes and trusts that we have been true to the spirit of collegiality that members of IEEE value whilst also maintaining a high standard as we reviewed papers, provided feedback and now present a strong body of published work in this collection of proceedings. The themes for this year's conference were chosen as a means of bringing together academics and industrialists, engineering and management research, manufacturing and teaching, and providing a basis for discussion of issues arising across the engineering and business community in relation to Quality Management, Information Technologies, Transport and Information Security aimed at developing engineers and managers for the future. The goal of these proceedings has been to present high quality work in an accessible medium, for use in a wide community of academics, engineers, managers, and industrialists, the community united by the key words Science, Education, Quality, Innovations in engineering. To achieve this aim, all abstracts were blind reviewed, and full papers submitted for publication in this journal of proceedings were subjected to a rigorous reviewing process.
The Workshop on Program Semantics, Specification and Verification: Theory and Applications is the leading event in Russia in the field of applying formal methods to software analysis. The proceedings of the ninth workshop are dedicated to formalisms for program semantics, formal models and verification, programming and specification languages, and algebraic and logical aspects of programming.
This book constitutes the refereed proceedings of the 14th International Workshop on Enterprise and Organizational Modeling and Simulation, EOMAS 2018, held in Tallinn, Estonia, in June 2018. The main focus of EOMAS is on the role, importance, and application of modeling and simulation within the extended organizational and enterprise context. The 11 full papers presented in this volume were carefully reviewed and selected from 22 submissions. They were organized in topical sections on conceptual modeling, enterprise engineering, and formal methods.
This state-of-the-art survey is dedicated to the memory of Emmanuil Markovich Braverman (1931-1977), a pioneer in developing machine learning theory. The 12 revised full papers and 4 short papers included in this volume were presented at the conference "Braverman Readings in Machine Learning: Key Ideas from Inception to Current State" held in Boston, MA, USA, in April 2017, commemorating the 40th anniversary of Emmanuil Braverman's death. The papers present an overview of some of Braverman's ideas and approaches. The collection is divided into three parts. The first part bridges the past and the present. Its main contents relate to the concept of kernel function and its application to signal and image analysis as well as clustering. The second part presents a set of extensions of Braverman's work to issues of current interest both in theory and applications of machine learning. The third part includes short essays by a friend, a student, and a colleague.
This book constitutes the proceedings of the 7th International Conference on Analysis of Images, Social Networks and Texts, AIST 2018, held in Moscow, Russia, in July 2018.
The 29 full papers were carefully reviewed and selected from 107 submissions (of which 26 papers were rejected without being reviewed). The papers are organized in topical sections on natural language processing; analysis of images and video; general topics of data analysis; analysis of dynamic behavior through event data; optimization problems on graphs and network structures; and innovative systems.
The number of space objects will grow several times in a few years due to the planned launches of constellations of thousands of microsatellites. This leads to a significant increase in the threat of satellite collisions. Spacecraft must undertake collision avoidance maneuvers to mitigate the risk. According to publicly available information, conjunction events are now handled manually by operators on Earth. Manual maneuver planning requires qualified personnel and will be impractical for constellations of thousands of satellites. In this paper we propose a new modular autonomous collision avoidance system called "Space Navigator". It is based on a novel maneuver optimization approach that combines domain knowledge with Reinforcement Learning methods.
In this paper we propose a novel variance reduction approach for additive functionals of Markov chains based on minimization of an estimate for the asymptotic variance of these functionals over suitable classes of control variates. A distinctive feature of the proposed approach is its ability to significantly reduce the overall finite sample variance. This feature is demonstrated theoretically, by means of a deep non-asymptotic analysis of a variance-reduced functional, and empirically, by a thorough simulation study. In particular, we apply our method to various MCMC Bayesian estimation problems, where it compares favourably to the existing variance reduction approaches.
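The idea of choosing a control variate by minimizing an estimated asymptotic variance can be illustrated with a toy sketch. Everything below is an assumption made for illustration and not the method of the paper: the chain is a Gaussian AR(1) process, the class of control variates is the one-parameter family theta * g with g(x) = x (which has zero mean under the stationary law), and the asymptotic variance is estimated by batch means, whose minimizer over theta has a closed form.

```python
import random


def simulate_ar1(n, rho=0.9, seed=0):
    """Toy Markov chain (an assumption): X_{t+1} = rho * X_t + N(0, 1) noise."""
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def batch_means(values, n_batches):
    """Means of consecutive, equally sized batches of the trajectory."""
    size = len(values) // n_batches
    return [sum(values[b * size:(b + 1) * size]) / size for b in range(n_batches)]


def asymptotic_variance(values, n_batches=50):
    """Batch-means estimate of lim n * Var(sample mean)."""
    size = len(values) // n_batches
    means = batch_means(values, n_batches)
    centre = sum(means) / n_batches
    return size * sum((m - centre) ** 2 for m in means) / (n_batches - 1)


def best_coefficient(f_values, g_values, n_batches=50):
    """theta minimizing the batch-means variance estimate of f - theta * g."""
    fm = batch_means(f_values, n_batches)
    gm = batch_means(g_values, n_batches)
    f_bar = sum(fm) / n_batches
    g_bar = sum(gm) / n_batches
    cov = sum((a - f_bar) * (b - g_bar) for a, b in zip(fm, gm))
    var = sum((b - g_bar) ** 2 for b in gm)
    return cov / var


chain = simulate_ar1(20000)
f_values = [x * x + x for x in chain]  # functional of interest, f(x) = x^2 + x
g_values = chain                       # control variate, g(x) = x
theta = best_coefficient(f_values, g_values)
reduced = [f - theta * g for f, g in zip(f_values, g_values)]

var_plain = asymptotic_variance(f_values)
var_reduced = asymptotic_variance(reduced)
print(f"theta = {theta:.3f}, plain = {var_plain:.1f}, reduced = {var_reduced:.1f}")
```

Because theta = 0 belongs to the class searched over, the estimated variance of the corrected functional cannot exceed that of the original one, up to rounding. A richer class of control variates or a different variance estimator would change only `best_coefficient` and `asymptotic_variance`.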
Two approaches for the synthesis of D-A chromophores containing arylhydrazonocyclopentadiene acceptor moieties were developed. The first approach includes the decarboxylative azo coupling reaction between penta(methoxycarbonyl)cyclopentadienyl potassium or sodium and aryldiazonium salts to give products containing four ester groups at the acceptor moiety. The second one includes the reaction of 1,3-dimethoxycarbonyl-4,5-diphenylcyclopentadienone with arylhydrazine hydrochlorides to give arylhydrazonocyclopentadienes with two ester and two phenyl groups. Both series of compounds were investigated by means of absorption spectroscopy, and the solvatochromic behavior of two representatives of each series was investigated in various dielectric environments. Both compounds demonstrated relative independence of the environment, although the product with the stronger acceptor part was less stable and exhibited a slight hypsochromic shift in polar media. However, the optical properties of this product were strongly affected by the basicity of the medium due to the deprotonation of the NH-group. Quantum chemical modeling of the absorption spectra of the synthesized products using different density functionals has shown that PBE0-D3/def2TZVP is an optimal method (out of three tested) for all compounds both in non-vibrationally-resolved and vibrationally-resolved TD-DFT calculations. Accounting for vibronic coupling in TD-DFT calculations is necessary to achieve good agreement with the experiment for compounds synthesized herein.
Simulation is one of the key components in high energy physics. Historically it relies on Monte Carlo methods, which require a tremendous amount of computational resources. These methods may have difficulties with the expected High Luminosity Large Hadron Collider (HL-LHC) needs, so the experiments are in urgent need of new fast simulation techniques. We introduce a new Deep Learning framework based on Generative Adversarial Networks which can be faster than traditional simulation methods by 5 orders of magnitude with reasonable simulation accuracy. This approach will allow physicists to produce a sufficient amount of simulated data needed by the next HL-LHC experiments using limited computing resources.
In HEP experiments, the CPU resources required by MC simulations are constantly growing and have become a very large fraction of the total computing power (greater than 75%). At the same time, the pace of performance improvements from technology is slowing down, so the only solution is a more efficient use of resources. Efforts are ongoing in the LHC experiments to provide multiple options for simulating events in a faster way when higher statistics are needed. A key to the success of this strategy is the possibility of enabling fast simulation options in a common framework with minimal action by the final user. In this talk we will describe the solution adopted in Gauss, the LHCb simulation software framework, to selectively exclude particles from being simulated by the Geant4 toolkit and to insert the corresponding hits generated in a faster way. The approach, integrated within the Geant4 toolkit, has been applied to the LHCb calorimeter, but it could also be used for other subdetectors. The hit generation can be carried out by any external tool, e.g. by a static library of showers or by more complex machine-learning techniques. In LHCb, generative models, which are nowadays widely used for computer vision and image processing, are being investigated in order to accelerate the generation of showers in the calorimeter. These models are based on maximizing the likelihood between reference samples and those produced by a generator. The two main approaches are Generative Adversarial Networks (GANs), which take into account an explicit description of the reference, and Variational Autoencoders (VAEs), which use latent variables to describe it. We will present how both approaches can be applied to the LHCb calorimeter simulation, their advantages as well as their drawbacks.
We propose a novel algorithm portfolio model that incorporates time series forecasting techniques to predict online the performance of its constituent algorithms. The predictions are used to allocate computational resources to the algorithms, accordingly. The proposed model is demonstrated on parallel algorithm portfolios consisting of three popular metaheuristics, namely tabu search, variable neighbourhood search, and multistart local search. Moving average and exponential smoothing techniques are employed for forecasting purposes. A challenging combinatorial problem, namely the detection of circulant weighing matrices, is selected as the testbed for the analysis of the proposed approach. Experimental evidence and statistical analysis provide insight on the performance of the proposed algorithms and reveal the benefits of using forecasting techniques for resource allocation in algorithm portfolios.
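The forecasting-and-allocation loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: performance is taken to be a score where higher is better, the budget is an integer number of processing units split proportionally to the forecasts, and all function names, parameters and history values are hypothetical.

```python
def moving_average(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def exponential_smoothing(history, alpha=0.5):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    forecast = history[0]
    for value in history[1:]:
        forecast = alpha * value + (1 - alpha) * forecast
    return forecast


def allocate(forecasts, total_budget):
    """Split an integer budget proportionally to non-negative forecasts.

    Fractional shares are rounded with the largest-remainder rule so that
    the shares always sum to `total_budget`.
    """
    names = sorted(forecasts)
    weights = [max(forecasts[name], 0.0) for name in names]
    total = sum(weights)
    if total == 0:
        # No algorithm is predicted to make progress: split evenly.
        weights = [1.0] * len(names)
        total = float(len(names))
    exact = [total_budget * w / total for w in weights]
    shares = [int(e) for e in exact]
    leftover = total_budget - sum(shares)
    by_remainder = sorted(range(len(names)),
                          key=lambda i: exact[i] - shares[i], reverse=True)
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return dict(zip(names, shares))


# Hypothetical per-round improvement scores of the three metaheuristics.
history = {
    "tabu_search": [4.0, 3.0, 5.0, 6.0],
    "variable_neighbourhood_search": [2.0, 2.5, 2.0, 1.5],
    "multistart_local_search": [1.0, 0.5, 0.5, 0.0],
}
forecasts = {name: exponential_smoothing(scores) for name, scores in history.items()}
print(allocate(forecasts, total_budget=16))
```

Swapping `exponential_smoothing` for `moving_average` in the last step changes only how quickly the allocation reacts to recent performance; the allocation rule itself stays the same.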
The development of wireless communication technologies attracts increased interest in scenarios that impose severe restrictions on data transmission reliability and latency. Such scenarios include real-time applications, such as industrial automation, remote control, video streaming, and virtual reality. It is very difficult to satisfy the requirements imposed on the quality of service with the currently widespread communication technologies. Specifically, it is currently impossible to guarantee a low delay in Wi-Fi networks due to some peculiarities of the applied channel access methods. In this work, we study an approach that provides low latency and high reliability of communications in Wi-Fi networks on the basis of an additional radio air interface. This approach is studied using a mathematical model of a heterogeneous network, which consists of devices that generate prioritized and non-prioritized data packets. The results show that this approach makes it possible to satisfy the requirements of real-time applications when certain restrictions on the intensity of prioritized traffic are met. In this case, the decrease in the throughput for non-prioritized traffic is insignificant.
LoRaWAN infrastructure has become widely deployed to provide wireless communications for various sensor applications. These applications generate different traffic volumes and require different quality of service (QoS). The paper presents an accurate mathematical model of low-power data transmission in a LoRaWAN sensor network, which allows accurate validation of key QoS indices, such as network capacity and packet loss ratio. Since LoRaWAN networks operate in the unlicensed spectrum, the model takes into account transmission attempt failures caused by random noise in the channel. Given QoS requirements, we can use the model to study how the performance of a LoRaWAN network depends on the traffic load and other scenario parameters. Since in LoRaWAN networks the transmissions at different modulation and coding schemes (MCSs) typically do not collide, we use the model to assign MCSs to the devices to satisfy their QoS requirements.
Being of high importance, real-time applications, such as online gaming, real-time video streaming, virtual reality, and remote control of drones and robots, introduce many challenges to the developers of wireless networks. Such applications pose strict requirements on the delay and packet loss ratio, and it is hardly possible to satisfy them in Wi-Fi networks that use random channel access. The article presents a novel approach to enable real-time communications by exploiting an additional radio. This approach was recently proposed by us in the IEEE 802.11 Working Group and attracted much attention. To evaluate its gain and to study how real-time traffic coexists with the usual one, a mathematical model is designed. The numerical results show that the proposed approach decreases the losses and delays for the real-time traffic by orders of magnitude, while the throughput for the usual traffic is reduced insignificantly in comparison to existing networks.