The theme of code reuse in software development is still important. Sometimes it is hard to find out what exactly we need to reuse in isolation of context. However, there is an opportunity to narrow the context problem, if applications in one given domain are considered. Same features in different applications in one domain have the same context respectively so the common part must be reused. Hence, the problem of domain analysis arises. On the other hand, there is metaCASE-techonology that allows to generate code of an application using diagrams, which describe this application. The main objective of this article is to present the technology for application family creation which connects the metaCASE-techonology and results of domain analysis activity. We propose to use some ideas of FODA (feature-oriented domain analysis) approach for domain analysis and use feature diagrams for describing of variability in a domain. Then we suggest to generate metamodel of the domain-specific visual language, based on feature diagram. After that based on generated metamodel domain-specific visual language editor is generated with the aid of metaCASE-tool. With this language user can connect and configure existing feature implementations thus producing an application. This technology supposed to be especially useful for software product lines.
he paper presents a framework for fast text analytics developed during the Texterra project. Texterra is a technology for multilingual text mining based on novel text processing methods that exploit knowledge extracted from user-generated content. It delivers a fast scalable solution for text mining without the expensive customization. Depending on use-cases Texterra could be utilized as a library, extendable framework or scalable cloudbased service. This paper describes details of the project, use-cases and results of evaluation for all developed tools. Texterra utilizes Wikipedia as a primary knowledge source to facilitate text mining in arbitrary documents (news, blogs, etc). We mine the graph of Wikipedia’s links to compute semantic relatedness between all concepts described in Wikipedia. As a result, we build a semantic graph with more than 5 million concepts. This graph is exploited to interpret meanings and relationships of terms in text documents. In spite of large size, Wikipedia doesn’t contain information about many domain-specific concepts. In order to increase applicability of the technology we developed several automatic knowledge extraction tools. These tools include systems for knowledge extraction from MediaWiki resources and Linked Data resources, as well as system for knowledge base extension with concepts described in arbitrary text documents using original information extraction techniques. In addition, utilization of information from Wikipedia allows easily extend Texterra for support of new Natural languages. The paper presents evaluation of Texterra applied for different text processing tasks (part-of-speech tagging, word sense disambiguation, keyword extraction and sentiment analysis) for English and Russian.
State identification is a long standing problem in the area of Finite State Machine (FSM) based modeling and testing of discrete event systems. For the identification of the current state of the system, so-called homing and synchronizing experiments with FSMs are used whereas for the initial state identification one can perform a distinguishing experiment. The homing, synchronizing, and distinguishing experiments are known as “gedanken” experiments, and the sequences for these experiments can be derived for deterministic and nondeterministic, partial and complete specification FSMs that are used to formally represent the required behavior of systems under investigation. The problems of checking the existence and derivation of homing, synchronizing, and distinguishing sequences are known to become harder as a specification FSM turns to be nondeterministic and partial. It is also known that in some cases the complexity can be reduced through a ‘switch’ from preset to adaptive experiment derivation. In this paper, we study how the partiality and adaptivity affect the complexity of checking the existence of homing/synchronizing/distinguishing sequences for deterministic and nondeterministic FSMs and visualize the complexity issues via appropriate figures. We also mention that the existing solutions to state identification problems are widely used for verification and testing of finite state transition systems.
The concept of Z-numbers introduced by Zade in 2011 is discussed topically nowadays due to it aptitude to deal with nonlinearities and uncertainties whose are common in real life. It was a large step of representing fuzzy logic, however that numbers created much larger problems of how to calculate them or aggregate multiple numbers of that type. Z-numbers have a significant potential in the describing of the uncertainty of the human knowledge because both the expert assessment and the Z-number consists of restraint and reliability of the measured value. In this paper, a method of converting an expert opinion to Z-number is proposed according to set of specific questions. In addition, the approach to Z-numbers aggregation is introduced. Finally, submitted methods are demonstrated on a real example. The topicality of the research is developing a new algorithm and software (based on that development) which could help people make decision in a messy uncertainty with many parameters and factors that are also defined imprecisely. In this work, we present the research that is aimed to find the most efficient methods to operate them (aggregate, add, divide). Firstly, it is important to identify all existing methods of aggregating Z-numbers. Secondly, it is needed to invent new methods of how work with them. The most interesting techniques should be detailed and summarized. There is also a program that is developed to model the real-word task of decision-making.
The Metric Travelling Salesman Problem is a subcase of the Travelling Salesman Problem (TSP), where the triangle inequality holds. It is a key problem in combinatorial optimization. Solutions of the Metric TSP are generally used for costs minimization tasks in logistics, manufacturing, genetics and other fields. Since this problem is NP-hard, heuristic algorithms providing near optimal solutions in polynomial time will be considered instead of the exact ones. The aim of this article is to experimentally find Pareto optimal heuristics for Metric TSP under criteria of error rate and run time efficiency. Two real-life kinds of inputs are intercompared - VLSI Data Sets based on very large scale integration schemes and National TSPs that use geographic coordinates of cities. This paper provides an overview and prior estimates of seventeen heuristic algorithms implemented in C++ and tested on both data sets. The details of the research methodology are provided, the computational scenario is presented. In the course of computational experiments, the comparative figures are obtained and on their basis multi-objective optimization is provided. Overall, the group of Pareto-optimal algorithms for different consists of some of the MC, SC, NN, DENN, CI, GRD, CI + 2-Opt, GRD + 2-Opt, CHR and LKH heuristics.
The routing problems are important for logistic and transport sphere. Basically, the routing problems related to determining the optimal set of routes in the multigraph. The Chinese postman problem (CPP) is a special case of the routing problem, which has many potential applications. We propose to solve the MCPP (special NP-hard case of CPP, which defined on mixed multigraph) using the reduction of the original problem into General Travelling Salesman Problem (GTSP). The variants of CPP are pointed out. The mathematical formulations of some problems are presented. The algorithm for reduction the MCPP in multigraph into GTSP is shown. The experimental results of solving MCPP in multigraph through the reduction into GTSP are presented.
The set of outline and guidelines as well as tools for automation to ensure continuity of business-continuity in the lifecycle of mission critical systems are considered. This complex is called the Life Cycle Supporting System (LCSS). The aim of the system is to reduce the risk level of the realization of critical errors in the system and application software throughout the life cycle of mission critical system, reducing operational risks and total cost of ownership of mission critical system. LCSS in the ISO / IEC / IEEE 15288 terms is enabling system. LCSS is created for life cycle of mission critical system support in the organization-owner.
With the popularization of social media, a vast amount of textual content with additional geo-located and time-stamped information is directly generated by human every day. Both tweet meaning and extended message information can be analyzed in a purpose of exploration of public mood variations within a certain time periods. This paper aims at describing the development of the program for public mood monitoring based on sentiment analysis of Twitter content in Russian. Machine learning (naive Bayes classifier) and natural language processing techniques were used for the program implementation. As a result, the client-server program was implemented, where the server-side application collects tweets via Twitter API and analyses tweets using naive Bayes classifier, and the client-side web application visualizes the public mood using Google Charts libraries. The mood visualization consists of the Russian mood geo chart, the mood changes plot through the day, and the mood changes plot through the week. Cloud computing services were used in this program in two cases. Firstly, the program was deployed on Google App Engine, which allows completely abstracts away infrastructure, so the server administration is not required. Secondly, the data is stored in Google Cloud Datastore, that is, the highly-scalable NoSQL document database, which is fully integrated with Google App Engine.
In this article, the routing problems are described. It is shown, that almost all routing problem can be transformed into each other. An example of the Mixed Chinese Postman problem is discussed. The article gives an overview of various variants of Chinese Postman Problem. For all problems the mathematical formulation is given. Moreover, the useful real-life application is presented, too. Then, the article provides a table of possible Chinese Postman problems and identifies parameters that can be varied for obtaining new problems. Five parameters have been identified, such as: presence of set of edges; presence of set of arcs; presence of edges with cost, depending on traversing; the presence of set of required edges; the presence of set of required arcs. It was shown that by varying these parameters one can obtain tasks that were not described earlier but can be used in real life. Four new tasks were identified. Then it is shown that the Chinese Postman problem can be solved as another routing tasks through graph transformations. The method for transforming Chinese Postman problem into the Generalized Travelling Salesman problem is given. Then the results of solving the above problem are presented by simple algorithms, and their effectiveness is shown. The research is not over yet. The testing of other algorithms is planned.
Well-structured transition systems (WSTS) became a well-known tool in the study of concurrency systems for proving decidability of properties based on coverability and boundedness. Each year brings new formalisms proven to be WSTS systems. Despite the large body of theoretical work on the WSTS theory, there has been a notable gap of empirical research of well-structured transition systems. In this paper, the tool for behavioural analysis of such systems is presented. We suggest the extension of SETL language to describe WSTS systems (WSTSL). It makes the description of new formalisms very close to the formal definition. Therefore, it is easy to introduce and modify new formalisms as well as conduct analysis of the behavioural properties without much programming efforts. It is highly convenient when a new formalism is still under active development. Two most studied algorithms for analysis of well-structured transition systems behavior (backward reachability and the Finite Reachability Tree analyses) have been implemented; and, their performance was measured through the runs on such models as Petri Nets and Lossy Channel Systems. The developed tool can be useful for incorporating and testing analysis methods to formalisms that occur to be well-structuredness transition systems.
In the last few years there has been a growing interest in route building oriented mobile applications with the following features of navigation and sending timely notifications about arrival. Despite the large body of existing knowledge on navigational services, there has been an important issue relative to positioning accuracy. The paper discusses a possible solution to comparison problem, which is linked to the determination of the closeness to destination metro station through finding a difference between user’s current coordinates and fixed-point coordinates. With this end in view, fuzzy logic approach is used to develop Routes Recommender System (RRS) that utilizes linguistic variables to express the vague and uncertain term ‘closeness to…’. The paper provides detailed explanation of each variable considered in the fuzzy inference system (FIS), set of fuzzy rules in line with graphical representation of system’s output. Based on Mamdani model, we propose a set of test cases to check maintainability of the model and provide a description about received results. At a later time, an Android-based mobile application aimed at public transport route building will be developed whose notification system will be based on our model`s implementation presented. It should be emphasized that the paper examines potentials of the modeling approach based on interval type-2 fuzzy sets (IT2FS) that attract much attention these days in various research studies and conventional Mamdani fuzzy inference system (MFIS) as applied to real and rather topical problem. The significance of developing such models may be of a high demand for appropriate representation of factors that are inherently vague and uncertain. Hence, this study may also contribute to future research on similar topics.
Process mining offers various tools for studying process-aware information systems. They mainly involve several participants (or agents) managing and executing operations on the basis of process models. To reveal the actual behavior of agents, we can use process discovery. However, for large-scale processes, it does not yield models, which help understand how agents interact since they are independent and their concurrent implementation can lead to a very sophisticated behavior. To overcome this problem, we propose interface patterns, which allow getting models of multi-agent processes with a clearly identified agent behavior and interaction scheme as well. The correctness of patterns is provided via morphisms. We also conduct a preliminary experiment, results of which are highly competitive compared to the process discovery without interface patterns.
Modern embedded OS are designed to be used in control solutions in various hardware contexts. Control computers may differ in the architecture of the CPU, the structure of communication channels, supported communication protocols, etc. Embedded OS are often statically configured to create an OS image, which intended to be executed on some specific control computer. System integrator usually performs this configuration. Embedded OS are often developed by many companies. Joint development and integration is very complex if OS doesn’t support modularity. Support of modularity and component assembly reduces the need of communication among companies during development and integration. This allows customers to create minimal solutions that are optimally adapted to the particular task and hardware platform. Furthermore, customers may be interested in adding their own low level components without OS modification. In this article, we present an approach to building modular embedded solutions from heterogeneous components based on the RTOS JetOS. The mechanism of components binding developed by us allows uniting heterogeneous components from different manufacturers within the same section of the address space. This mechanism allows component developer to independently develop their components. And system integrator can independently from developers configure what component he likes to see in OS image and how components should interact.
This paper regards problems of analysis and verification of complex modern operating systems, which should take into account variability and configurability of those systems. The main problems of current interest are related with conditional compilation as variability mechanism widely used in system software domain. It makes impossible fruitful analysis of separate pieces of code combined into system variants, because most of these pieces of code has no interface and behavior. From the other side, analysis of all separate variants is also impossible due to their enormous number. The paper provides an overview of analysis methods that are able to cope with the stated problems, distinguishing two classes of such approaches: analysis of variants sampling based on some variants coverage criteria and variation-aware analysis processing many variants simultaneously and using similarities between them to minimize resources required. For future development we choose the most scalable technics, sampling analysis based on code coverage and on coverage of feature combinations and variation-aware analysis using counterexample guided abstraction refinement approach.
The paper is devoted to the problem of early error detection and analysis in hyperconverged systems. One approach to organizing hyperconverged systems is to install on each physical server a separate instance of an operating system (OS) that carries virtualization tools and tools for administering and using a distributed data warehouse. Errors can occur both at the level of a single OS instance and at the level of the entire cluster. For example, incorrect control element commands from one infrastructure node can cause software failure on another node. In addition, errors from the subsystems of the cluster can provoke abnormal situations inside virtual machines. The complexity of the architecture of hyperconverged systems makes it difficult to analyze the errors that occur in them. To simplify such an analysis and increase its effectiveness, it is necessary to automate the process of detecting problems and collecting data necessary for their study and correction. Existing approaches for automation of error detection are described and various improvements are suggested to adopt them for systems where distributed storage and virtualization technologies are actively used. Improvements include log collection from the whole cluster just after the error occurred, additional analysis of guest operating system behaviour inside virtual machines, usage of a knowledge base for automated crash recovery and duplicate detection. Finally, a real-life scenario of error handling process in Virtuozzo company products is described starting from error detection and ending with fix deployment.
This article is dedicated to automation of cluster creation and management for Apache Spark MapReduce implementation in Openstack environments. As a result of this project open-source (Apache 2.0 license) implementation of toolchain for virtual cluster on-demand creation in Openstack environments was presented. The article contains an overview of existing solutions for clustering automation in cloud environments by the start of 2014 year. The article provides a shallow overview of issues and problems in Openstack Heat project that provides a compatibility layer for Amazon EC2 API. The final implementation provided in the article is almost strainforward port of existing toolchain for cluster creation automation for Apache Spark in Amazon EC2 environment with some improvements. Also prepared base system virtual machine image for Openstack is provided. Plans for further work are connected with Ansible project. Using Ansible for observed problem will make possible to implement versatile environment-agnostic solution that is able to work using any cloud computing services provider, set of Docker containers or bare-metal clusters without any dependencies for prepared operating system image. Current article doesn't use Ansible due to the lack of key features at the moment of the project start. The solution provided in this article has been already tested in production environment for graph theory research arcticle.
One of the most important ways of increasing the speed of the modern databases is to cache frequently used data in RAM. Classical replacement policies are intended to minimize the number of buffer pool faults. This optimization method implicitly relies on the fact that the speeds of reading and writing to the hard disc are equal. Gradual technology improvement and cost reduction of flash memory have led to the creation of solid-state data storages (SSD) that are now increasingly used in personal computers and storage systems. Flash drives have advantages over traditional hard drives, high read and write speeds and significantly small time of random data access are the most important of them. However, the most popular flash-memory types read data at a higher speed than write it. Due to this feature the use of classical replacement algorithms of disk data caching is ineffective. This paper reviews recently developed algorithms of database buffer pool management designed to work with flash memory drives: CFDC (Clean First – Dirty Clustered), CASA (Cost-Aware Self- Adaptive), SAWC (Self Adaptive with Write Clustering), and FD-Buffer. Some of these algorithms demonstrate significant advantages over the classical algorithm LRU.