### Book

## Proceedings of the 7th Spring/Summer Young Researchers’ Colloquium on Software Engineering, SYRCoSE 2013

The issue contains the papers presented at the 7th Spring/Summer Young Researchers' Colloquium on Software Engineering (SYRCoSE 2013), held in Kazan, Russia on 30th and 31st of May, 2013. Papers were selected through a competitive peer review process conducted by the program committee. Both regular and research-in-progress papers were considered acceptable for the colloquium.

The topics of the colloquium include modeling of computer systems, software testing and verification, parallel and distributed systems, information search and data mining, image and speech processing and others.

ISBN 978-5-91474-020-4

Software development involves many specialists at once: database designers, business analysts, user interface designers, programmers, testers, etc. As a result, system design produces a variety of models created from different points of view, at different levels of detail, and described in different modeling languages. Hence there is a need to transform models both between levels of the hierarchy and, within one level, between modeling languages, in order to build a unified model of the system and to export models to external systems. The MetaLanguage system is intended for the creation of visual domain-specific languages. This paper considers approaches to the development of a model transformation component for the MetaLanguage system. The component performs vertical and horizontal model transformations of the “model-to-text” and “model-to-model” types. These transformations are based on graph grammars described by production rules, each rule consisting of a left-hand side and a right-hand side. The algorithm for finding the left-hand side in the source model and the algorithms for executing the right-hand side of a rule are described. Transformation definitions for models in ERD notation are presented as an example.
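A minimal sketch of a “model-to-text” transformation in the spirit of the ERD example above. The model representation here (a dict of entities and their attributes) and the generated DDL are illustrative assumptions, not the actual MetaLanguage structures or rule syntax:

```python
# Toy "model-to-text" transformation: each ERD entity is rewritten into a
# CREATE TABLE statement. Attribute types are deliberately simplified.

erd = {
    "Customer": ["id", "name"],
    "Order": ["id", "customer_id", "total"],
}

def erd_to_sql(model):
    """Emit one CREATE TABLE statement per entity in the toy ERD model."""
    statements = []
    for entity, attributes in model.items():
        cols = ", ".join(f"{a} TEXT" for a in attributes)
        statements.append(f"CREATE TABLE {entity} ({cols});")
    return "\n".join(statements)

print(erd_to_sql(erd))
```

A real transformation component would match the left-hand side of a production rule against the model graph first; here the “rule” degenerates to iterating over all entities.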

Nested Petri nets are an extension of the Petri net formalism with net tokens for modelling multi-agent distributed systems with complex structure. While having a number of interesting properties, NP-nets have been lacking tool support. In this paper we present the NPNtool toolset for NP-nets, which can be used to edit NP-net models and check liveness in a compositional way. An algorithm to check the m-bisimilarity needed for compositional liveness checking has been developed. Experimental results of using the toolset to model and check liveness of the classical dining philosophers problem are provided.

This work presents an approach to modeling and simulating Wireless Sensor Networks (WSN) with the nested Petri net formalism. A tool for modeling and simulating WSN must take into account resources, time, and sensor cost. Even though classical Petri nets are well suited for modeling dynamic concurrent systems, they lack the expressive power to model systems of distributed agent-sensors. The proposed tool enables the user to model a WSN visually, simulate it, and find WSN defects at early stages of WSN development.

*Abstract –* Model-based approaches are now widely used in the development of information systems. Developers may change models during the system development process, and models can also be transformed automatically: a visual model can be translated into program code, or transformed from one modeling language into another. The most appropriate formal representation of a visual model is the metagraph. The best way to describe changes of visual models is an approach based on graph grammars (graph rewriting), as it presents transformations in the most demonstrative way. However, applying a graph grammar to the model graph requires finding a subgraph isomorphic to the left-hand side of a grammar rule, which is an NP-complete problem. Several algorithms have been developed for solving this problem, designed for ordinary graphs and hypergraphs. In this article we consider some of them in the context of metagraphs representing models.
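To make the matching step concrete, here is a naive backtracking search for a subgraph embedding on plain graphs (adjacency sets). It is a sketch only: practical algorithms such as Ullmann's or VF2 add heavy pruning, and the metagraph case discussed in the article involves richer structures:

```python
# Naive backtracking search for an (injective, non-induced) subgraph
# embedding: map each pattern node to a distinct host node so that every
# pattern edge is preserved. Worst-case exponential, which is exactly why
# left-hand-side matching is the expensive step of graph rewriting.

def find_embedding(pattern, host):
    """Return a dict pattern-node -> host-node preserving edges, or None."""
    p_nodes = list(pattern)

    def extend(mapping):
        if len(mapping) == len(p_nodes):
            return dict(mapping)
        p = p_nodes[len(mapping)]          # next pattern node to place
        for h in host:
            if h in mapping.values():      # keep the mapping injective
                continue
            # every already-mapped pattern neighbor of p must land on
            # a host neighbor of h
            if all(mapping[q] in host[h] for q in pattern[p] if q in mapping):
                mapping[p] = h
                result = extend(mapping)
                if result:
                    return result
                del mapping[p]             # backtrack
        return None

    return extend({})
```

For example, an edge a–b embeds into a host containing the edge 1–2, while a triangle does not embed into a three-node path.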

This paper describes our approach to document search based on ontological resources and graph models. The approach is applicable in local networks and on local computers. It can be useful for ontology engineering specialists or search specialists.

Today many problems confined to a particular problem domain can be solved using a DSL. To use a DSL, it must either be created or selected from existing ones. Creating a completely new DSL in most cases requires substantial financial and time costs. Selecting an appropriate existing DSL is a labor-intensive task, because steps such as walking through every DSL and deciding whether it can handle the problem are performed manually. The problem arises because there is no DSL repository and no tools for matching a suitable DSL to a specific task. This paper presents an approach to automated detection of DSL requirements (as an ontology-based structure) and automated DSL matching for a specific task.

The volume of data that information systems operate on has been rapidly increasing. Data logs have long been known as a useful tool for solving a range of tasks. The amount of information written to a log over a given period of time leads to the so-called “big data” problem. Process-aware information systems (PAIS) allow developing models of process interaction and monitoring the accuracy of their performance and the correctness of their interaction with each other. Studying PAIS logs in order to extract knowledge about the processes and construct their models belongs to the process mining discipline. Developed tools for process mining are available, both commercial and free. We propose the concept of a new tool, DPMine, for building a multistage process mining model from individual processing units connected to each other in a processing graph. The resulting model is executed (simulated) as an incremental process from beginning to end.

This article describes the implementation of an aggregator service for real estate market offers. Advertisement analysis is performed with the aid of ontologies. The set of ontologies describing specific websites can be extended, so the aggregator can be used for many diverse resources.

Creation of test programs and analysis of their execution is the main approach to system-level verification of microprocessors. Many techniques have been proposed to automate test program generation, ranging from completely random to well-directed ones; however, no “silver bullet” has been found. In good industrial practice, various methods are combined to complement each other. Unfortunately, there is no solution that could integrate all (or at least most) of the techniques in a single framework. Engineers are forced to use a number of tools, which leads to the following problems: (1) duplicated data must be maintained, since each tool uses its own representation of the target design; (2) to be used together, tools need to be integrated, so engineers have to deal with different formats and interfaces. This paper proposes the concept of an extendable framework (MicroTESK) that follows a unified methodology for defining test program generation techniques. The framework supports random and combinatorial generation and, more importantly, can be easily extended with new techniques implemented as framework plugins.

Process mining techniques relate observed behavior to modeled behavior, e.g., the automatic discovery of a process model based on an event log. Process mining is not limited to process discovery and also includes conformance checking and model enhancement. Conformance checking techniques are used to diagnose the deviations of the observed behavior, as recorded in the event log, from some process model. Model enhancement allows extending process models with additional perspectives, such as conformance and performance information. In recent years, BPMN (Business Process Model and Notation) 2.0 has become a de facto standard for modeling business processes in industry. This paper presents the current BPMN support in ProM, the best-known and most widely used open-source process mining framework. ProM’s functionality for discovering, analyzing and enhancing BPMN models is discussed. Support of the BPMN 2.0 standard will help ProM users to bridge the gap between formal models (such as Petri nets, causal nets and others) and process models used by practitioners.
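As a toy illustration of the conformance-checking idea, an observed trace can be compared with the language of a model. The trace and activity names below are invented, and real conformance checking (e.g. alignment-based techniques) localizes deviations inside a trace rather than rejecting it wholesale:

```python
# Toy conformance check: flag event-log traces that fall outside the model's
# language. The "model" here is simply an enumerated set of allowed traces,
# a drastic simplification of a Petri net or BPMN model.

model_language = {
    ("register", "check", "pay"),
    ("register", "pay"),
}

event_log = [
    ("register", "check", "pay"),
    ("register", "refund"),  # not allowed by the model
]

deviating = [trace for trace in event_log if trace not in model_language]
```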

Process mining is a new direction in the field of modeling and analysis of processes, in which information from event logs, describing the history of the system behavior, plays an important role. Methods and approaches used in process mining are often based on various heuristics, and experiments with large event logs are crucial for the study and comparison of the developed methods and algorithms. Such experiments are very time-consuming, so automating them is an important task in process mining. This paper presents the language DPMine, developed specifically to describe and carry out experiments on the discovery and analysis of process models. The basic concepts of the DPMine language, as well as the principles and mechanisms of its extension, are described. Ways of integrating the DPMine language as dynamically loaded components into the VTMine modeling tool are considered. An illustrative example of an experiment to build a fuzzy model of a process discovered from log data stored in a normalized database is given.

The 13th IEEE International Conference on Data Mining (IEEE ICDM 2013) solicited workshops on topics related to new research directions and novel applications of data mining. The goal of the ICDM workshops program (IEEE ICDMW) is to identify grand challenges in data mining, to explore possible paths to addressing these urgent problems, and to solicit broad participation from the data mining community and other relevant research communities. IEEE ICDMW 2013 was held on December 7 in Dallas, Texas, USA, and was immediately followed by IEEE ICDM 2013. This year we received 41 workshop proposals, a 141% increase over the previous year. Of those, 26 workshop proposals were accepted after a thorough review by the ICDMW workshop organization committee, and 18 workshops eventually prepared their programs after a rigorous paper review process. The final program consisted of 13 full-day workshops and 5 half-day workshops. Overall, the ICDMW program received 364 submissions, a 19% increase over the previous year, of which 183 papers were accepted. The workshop proposal acceptance rate is about 44%, and the workshop paper acceptance rate is about 50%. These highly competitive acceptance rates have resulted in high-quality and exciting ICDMW proceedings. IEEE ICDMW 2013 covered many new research and application areas as well as fundamental data mining topics. The traditional and fundamental disciplines included spatial and spatiotemporal data mining, optimization, concept drift, domain-driven data mining, opinion mining, and sentiment analysis. Emerging disciplines included high-dimensional data mining, causal discovery, cloud and distributed computing, data mining in service applications, and, of course, big data.
IEEE ICDMW 2013 provided discussion forums for exciting applications including biological data mining in healthcare, data mining in networks, data privacy, and data mining case studies. The ICDMW program also explored the new areas of data markets in sciences and businesses, data mining in experimental economics, and data mining in astronomical problems. Many people worked together to organize IEEE ICDMW 2013. We would like to thank all workshop organizers for the high-quality workshop proposals and for their tremendous effort in putting together 18 exciting workshops in the final program; they are the key to the success of the ICDMW program.

Process mining is a relatively new field of computer science dealing with process discovery and analysis based on event logs. In this work we consider the problem of discovering workflow nets with cancellation regions from event logs. Cancellations occur in the majority of real-life event logs, yet despite the huge number of process mining techniques, little has been done on discovering cancellation regions. We show that the state-based region algorithm produces labeled Petri nets with an overcomplicated control flow structure for logs with cancellations. We propose a novel method to discover cancellation regions from the transition systems built on event logs, and show how to construct an equivalent workflow net with reset arcs to simplify the control flow structure.
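The effect of a reset arc, which the approach above uses to model cancellation compactly, can be sketched in a few lines. The marking representation and place names below are illustrative assumptions, not the paper's construction:

```python
# Minimal reset-arc semantics for a place/transition net: firing a
# transition consumes tokens over ordinary arcs, but a reset arc empties
# the whole place regardless of how many tokens it holds -- exactly the
# "cancel everything in this region" behavior.

def fire(marking, consume, produce, reset=()):
    """Return the new marking after firing, or None if not enabled."""
    if any(marking.get(p, 0) < n for p, n in consume.items()):
        return None                      # not enough tokens: disabled
    m = dict(marking)
    for p, n in consume.items():
        m[p] -= n
    for p in reset:                      # reset arcs: drop ALL tokens
        m[p] = 0
    for p, n in produce.items():
        m[p] = m.get(p, 0) + n
    return m

# a "cancel" transition: consumes a control token, resets the work place
m = fire({"ctrl": 1, "work": 3}, {"ctrl": 1}, {"done": 1}, reset=("work",))
```

Modeling the same cancellation with ordinary arcs alone would require one transition variant per possible token count in the region, which is the control-flow blow-up the paper's method avoids.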

Process-aware information systems (PAIS) enable developing models of process interaction, monitoring the accuracy of their execution, and checking whether the processes interact with each other properly. PAIS can generate large data logs that contain information about the interaction of processes over time. Studying PAIS logs for the purposes of data mining and modeling lies within the scope of process mining. There are a number of tools developed for process mining, including the most ubiquitous one, ProM, whose functionality is extended by plugins. To perform an experiment, one has to run multiple plugins sequentially; this becomes extremely time-consuming in the case of large-scale experiments involving many plugins. The paper proposes the concept of the DPMine/P language for process modeling and analysis, to be implemented in ProM. The language under development aims at joining the separate stages of an experiment into a single sequence, that is, an experiment model. The basic semantics of the language is implemented through the concepts of blocks, ports, connectors and schemes. These items are discussed in detail in the paper, and examples of their use for specific tasks are presented as well.
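The block/scheme idea described above can be sketched as a pipeline of processing units executed in sequence. The class names, the linear (single-connector) topology, and the example blocks are assumptions for illustration, not DPMine/P syntax:

```python
# Sketch of chaining processing blocks into an experiment scheme: the
# output of each block feeds the input port of the next, replacing the
# manual "run plugin, save, run next plugin" workflow.

class Block:
    def __init__(self, name, func):
        self.name, self.func = name, func

    def run(self, data):
        return self.func(data)

class Scheme:
    """A linear scheme: blocks connected output-to-input, run end to end."""
    def __init__(self, blocks):
        self.blocks = blocks

    def execute(self, data):
        for block in self.blocks:
            data = block.run(data)
        return data

# e.g. keep only completed traces of an event log, then count events
scheme = Scheme([
    Block("filter_complete", lambda log: [t for t in log if t[-1] == "end"]),
    Block("count_events", lambda log: sum(len(t) for t in log)),
])
```

A real scheme would allow branching connectors and multiple ports per block; a single list suffices to show the incremental, beginning-to-end execution model.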

We consider certain spaces of functions on the circle, which naturally appear in harmonic analysis, and superposition operators on these spaces. We study the following question: which functions have the property that each of their superpositions with a homeomorphism of the circle belongs to a given space? We also study the multidimensional case.

We consider the spaces of functions on the m-dimensional torus whose Fourier transform is p-summable. We obtain estimates for the norms of exponential functions deformed by a C^1-smooth phase. The results generalize to the multidimensional case the one-dimensional results obtained by the author earlier in “Quantitative estimates in the Beurling–Helson theorem”, Sbornik: Mathematics, 201:12 (2010), 1811–1836.

We consider the spaces of functions on the circle whose Fourier transform is p-summable. We obtain estimates for the norms of exponential functions deformed by a C^1-smooth phase.