Proceedings of the MACSPro Workshop 2019
Companies from various domains record their operational behavior in the form of event logs. These event logs can be analyzed, and relevant process models representing the companies' real behavior can be discovered. One of the main advantages of process discovery methods is that they commonly produce models in the form of graphs, which can be easily visualized, giving an intuitive view of the executed processes. Moreover, the graph-based representation opens new challenging perspectives for the application of graph comparison methods to find and explicitly visualize differences between discovered process models (representing real behavior) and reference process models (representing expected behavior). Another important area where graph comparison algorithms can be used is the recognition of process modeling patterns. Unfortunately, exact graph comparison algorithms are computationally expensive. In this paper, we adapt an inexact tabu search algorithm to find differences between BPMN (Business Process Model and Notation) models. The tabu search and greedy algorithms were implemented within the BPMNDiffViz tool and tested on BPMN models discovered from synthetic and real-life event logs. It was experimentally shown that the inexact tabu search algorithm finds a solution close to the optimal one in most cases. At the same time, its computational complexity is significantly lower than that of the exact A* search algorithm investigated earlier.
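As a hedged illustration of the inexact-matching idea above (not the BPMNDiffViz implementation itself), the following sketch runs a tabu search over node permutations of two small labelled graphs, minimizing a simple edit cost: node-label mismatches plus edges present in one graph but not the other. The cost function, tabu tenure, and move set are illustrative assumptions.

```python
def edit_cost(perm, labels_a, labels_b, edges_a, edges_b):
    """Cost of mapping node i of graph A to node perm[i] of graph B:
    label mismatches plus the symmetric difference of mapped edges."""
    cost = sum(labels_a[i] != labels_b[perm[i]] for i in range(len(perm)))
    mapped = {(perm[u], perm[v]) for (u, v) in edges_a}
    cost += len(mapped ^ edges_b)   # edges missing on either side
    return cost

def tabu_search(labels_a, labels_b, edges_a, edges_b, iters=200, tenure=7):
    """Tabu search over node permutations using pairwise swap moves.
    Recently used swaps are forbidden unless they improve the best cost
    found so far (aspiration criterion)."""
    n = len(labels_a)
    perm = list(range(n))
    best, best_cost = list(perm), edit_cost(perm, labels_a, labels_b,
                                            edges_a, edges_b)
    tabu = {}                       # swap (i, j) -> step until which it is tabu
    for step in range(iters):
        candidates = []
        for i in range(n):
            for j in range(i + 1, n):
                perm[i], perm[j] = perm[j], perm[i]
                c = edit_cost(perm, labels_a, labels_b, edges_a, edges_b)
                perm[i], perm[j] = perm[j], perm[i]
                if tabu.get((i, j), -1) < step or c < best_cost:
                    candidates.append((c, i, j))
        if not candidates:
            break
        c, i, j = min(candidates)   # best admissible neighbor, even if worse
        perm[i], perm[j] = perm[j], perm[i]
        tabu[(i, j)] = step + tenure
        if c < best_cost:
            best_cost, best = c, list(perm)
    return best, best_cost
```

Because non-improving moves are accepted and recent moves are temporarily forbidden, the search can escape the local optima where a pure greedy swap strategy would stop.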
Electronic trading systems provide the computational support for stock exchanges. Liquid markets use order-driven systems, i.e., systems where client requests for trading financial instruments are served through individual orders. This paper presents Petri net models assembling some crucial processes executed within order-driven systems, such as order submission, the application of precedence rules, and the order matching mechanism. These processes were modelled as types of agents running in a multi-agent system (MAS) using nested Petri nets (NP-nets), a convenient formalism for modelling MAS. With NP-nets, we focus on the control-flow perspective (causal dependence between activities executed by agents) and on the synchronization between agents. In addition, we have used coloured Petri nets to extend the model, including orders as objects with attributes. Thus, this work with Petri nets represents an initial, experimental research phase towards validating trading systems using related methods such as process mining, simulation, and model checking.
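The precedence rules and matching mechanism mentioned above can be sketched in a few lines; the following is a minimal, illustrative limit order book with price-time priority (not the paper's Petri net model, and far simpler than a real matching engine).

```python
import heapq
from itertools import count

class OrderBook:
    """Minimal limit order book with price-time priority matching."""

    def __init__(self):
        self.bids = []        # max-heap: entries (-price, seq, price, qty)
        self.asks = []        # min-heap: entries (price, seq, price, qty)
        self.seq = count()    # arrival order breaks price ties (time priority)

    def submit(self, side, price, qty):
        """Match an incoming limit order against the opposite book side,
        then rest any unfilled remainder. Returns a list of (price, qty) fills."""
        trades = []
        book = self.asks if side == "buy" else self.bids
        while qty and book:
            key, ts, best_price, best_qty = book[0]
            crosses = price >= best_price if side == "buy" else price <= best_price
            if not crosses:
                break
            fill = min(qty, best_qty)
            trades.append((best_price, fill))
            qty -= fill
            if fill == best_qty:
                heapq.heappop(book)
            else:
                # same key, so the heap invariant is preserved
                book[0] = (key, ts, best_price, best_qty - fill)
        if qty:
            rest = self.bids if side == "buy" else self.asks
            key = -price if side == "buy" else price
            heapq.heappush(rest, (key, next(self.seq), price, qty))
        return trades
```

For example, after resting asks of 10 @ 101 and 5 @ 100, a buy of 12 @ 101 fills 5 at the better price 100 first, then 7 at 101, leaving 3 @ 101 in the book.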
The purpose of this study is to identify the position of non-performing inflow zones (sources) in a wellbore by means of machine learning techniques. The training data are obtained using transient multiphase simulators and represented as the following time series: bottom-hole pressure, well-head pressure, and flow rates of gas, oil, and water, along with a target vector of size N, where each element is a binary variable indicating the productivity of the respective inflow zone. The goal is to predict the target vector of active and non-active inflow sources given the surface parameters for an unseen well. A variety of machine learning techniques has been applied to solve this task, including feature extraction and generation, dimensionality reduction, ensembles and cascades of learning algorithms, and deep learning. The results of the study can be used to provide more efficient and accurate monitoring of gas and oil production and informed decision making.
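A typical first step of the feature extraction mentioned above can be sketched as simple window statistics over each surface time series; this is an illustrative assumption about the pipeline, not the study's actual feature set, and the function names are hypothetical.

```python
import statistics

def window_features(series, win=5):
    """Slide a non-overlapping window over one time series (e.g., bottom-hole
    pressure samples) and emit mean, population stdev, and min-max range per
    window as features for a downstream classifier."""
    feats = []
    for start in range(0, len(series) - win + 1, win):
        w = series[start:start + win]
        feats.extend([statistics.mean(w),
                      statistics.pstdev(w),
                      max(w) - min(w)])
    return feats
```

Concatenating such vectors across the pressure and flow-rate series yields one fixed-length feature row per well, which ensemble methods like gradient boosting can then map to the binary target vector.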
Process models discovered from event logs of multi-agent systems may be complicated and unreadable. To overcome this problem, we suggest using a compositional approach. A system model is composed of agent models with respect to an interface. Morphisms guarantee that the composition of correct models is itself correct. This study contributes to the practical implementation of the morphism-based compositional approach. We use interaction patterns to model typical interfaces. Experimental evaluation justifies the practical value of the compositional approach.
Mobile (cellular) networks represent a fast-evolving research field that takes advantage of recent technological advances, such as Big Data and distributed computing, to provide extensive network monitoring for network operation planning and management purposes. Challenges related to making use of the large volume of streaming data generated by mobile networks include extracting relevant elements from massive amounts of signals possibly spread across different sources (databases), reducing dimensionality, summarizing dynamic information in a comprehensible way, and displaying it for interpretation purposes. Adequate network modeling provides both statistical data on network performance and important QoS-related networking insights. Comprehensible mobile network information is needed that uncovers the role of each attribute and variable. To harness the complexity of mobile network data and to extract relevant information, dedicated distributed computing platforms and Big Data frameworks are needed, able to discover and deal with the inherent properties and complexities of these datasets. A smart network monitoring service plays a central role here.
DNA secondary structures are important functional elements that may influence cellular processes. One of their possible functions is the regulation of nucleosome positioning. Here, MNase-seq and ssDNA-seq data were used to define patterns of the positional relationship between DNA structures such as Z-DNA, H-DNA, and G-quadruplexes and nucleosomes. Three types of patterns were found: a structure surrounded by nucleosomes from both sides, from one side, or lying in a nucleosome-free region. Machine-learning models based on the Random Forest algorithm and XGBoost were trained to recognize a 500 bp DNA region containing a pattern of nucleosome positioning for the three types of DNA structures (Z-DNA, H-DNA, and G-quadruplexes) based on DNA sequence compositional properties. The best performance (more than 86% for ROC-AUC, accuracy, recall, and precision scores) was reached for G-quadruplexes. The 500 bp regions containing G-quadruplexes have distinct compositional properties and point to preferential locations of the defined patterns, whose regulatory functions require further investigation. For the other DNA structures, region composition is a less powerful predictive factor, and one should take into account other physical and structural DNA properties to improve nucleosome-DNA-structure pattern recognition.
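Sequence compositional properties of the kind used above are commonly encoded as normalized k-mer frequencies; the sketch below is one plausible encoding for a 500 bp window (an assumption on my part, not the study's exact feature definition).

```python
from itertools import product

def kmer_composition(seq, k=3):
    """Normalized k-mer frequency vector for a DNA region (e.g., a 500 bp
    window around a candidate G-quadruplex). Windows containing characters
    outside ACGT (such as N) are skipped. Returns a 4**k-dimensional vector
    in fixed lexicographic k-mer order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values()) or 1   # avoid division by zero
    return [counts[km] / total for km in kmers]
```

Each 500 bp region then becomes a fixed-length numeric vector, which tree ensembles such as Random Forest or XGBoost can consume directly.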
We construct a structural model of occupational choice under unemployment, giving a natural definition of necessity entrepreneurs as individuals who fail to find a salaried job but, by running a business, are able to earn more than the unemployment benefit. The existence of necessity entrepreneurs reduces unemployment in the economy and positively affects welfare.
In 1987, Bak, Tang, and Wiesenfeld introduced a mechanism (hereafter, the BTW mechanism) that underlies self-organized critical systems. Extreme events generated by the BTW mechanism are believed to exhibit an unpredictable occurrence. In spite of this general opinion, the largest events in the original BTW model are efficiently predictable by algorithms that exploit information that is hidden in applications. Intending to relate the predictability of self-organized critical systems to the level of their asymmetry, we examine the inter-event distribution of extreme avalanches generated by the BTW mechanism on symmetrical and asymmetrical self-similar lattices. Initially, we claim that the main part of the size-frequency relationship is power-law, independent of the asymmetry, but the asymmetry reduces the range of scale-free avalanches in the domain of small avalanches. Further, we turn to extremes and claim that they are located on the downward bend of the distribution of the avalanches over their sizes. Finally, we compare the probability distribution of waiting times between two successive extremes with the exponential distribution. The latter gives the reference point of complete unpredictability, naturally measured in terms of the sum of two rates related to type I and type II statistical errors: the rate of unpredicted avalanches and the alarm time rate. We posit that the deviations of the observed probability distribution from the exponential one do not affect the unpredictability of extremes drawn from the waiting times between them.
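The original BTW mechanism is simple to reproduce; the following minimal sketch simulates the classic two-dimensional abelian sandpile (threshold 4, grains lost at open boundaries) on a square lattice and records avalanche sizes. The square lattice stands in for the self-similar lattices studied above, which would require a different neighbor structure.

```python
import random

def topple(grid, n):
    """Relax an n-by-n abelian sandpile (threshold 4, open boundaries);
    return the avalanche size measured as the number of topplings."""
    size = 0
    unstable = [(i, j) for i in range(n) for j in range(n) if grid[i][j] >= 4]
    while unstable:
        i, j = unstable.pop()
        if grid[i][j] < 4:          # may have been relaxed already
            continue
        grid[i][j] -= 4
        size += 1
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < n and 0 <= nj < n:   # grains at the edge fall off
                grid[ni][nj] += 1
                if grid[ni][nj] >= 4:
                    unstable.append((ni, nj))
        if grid[i][j] >= 4:
            unstable.append((i, j))
    return size

def drive(n=10, drops=2000, seed=1):
    """Drop grains at random sites, relaxing after each drop; return the
    final (stable) grid and the sequence of avalanche sizes."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    sizes = []
    for _ in range(drops):
        grid[rng.randrange(n)][rng.randrange(n)] += 1
        sizes.append(topple(grid, n))
    return grid, sizes
```

After a transient, the recorded avalanche sizes follow the heavy-tailed, approximately power-law size-frequency relationship discussed above, with the largest events in the downward bend of the tail.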
The article examines the influence of human temperament on academic performance and the prediction of "risky" students. The analysis was carried out using statistical and data mining methods. The baseline data for the study is information about students collected via the online support system for the educational process at HSE, the LMS (Learning Management System). The study found a relationship between temperament and academic success, making it possible to predict "risky" students. The result of the study is a set of recommendations for the education office: to pay attention to "risky" students and to carry out preventive measures such as organizing electives, assigning a curator, and checking students' readiness for classes.