This paper considers approaches to code execution through program vulnerabilities. In particular, Section 2 examines code execution via stack and heap buffer overflows, use-after-free vulnerabilities, and format string vulnerabilities. Section 3 describes methods for automatically generating input data that leads to code execution. These methods are based on dynamic symbolic execution, which yields input data that drives the program along the path that triggers the vulnerability. A security predicate is an additional set of symbolic formulas describing a program state in which code execution is possible. To obtain input data that leads to code execution, the path predicate and the security predicate are conjoined and the resulting system is solved. The paper presents security predicates for pointer overwrite, function pointer overwrite, and a format string vulnerability that leads to a stack buffer overflow. The presented security predicates were used in a method for software defect severity estimation. The method was applied to several binaries from the DARPA Cyber Grand Challenge. The security predicate for a format string vulnerability that leads to a buffer overflow was tested on a vulnerable version of OllyDbg; the test produced input data that leads to code execution.
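The idea of combining a path predicate with a security predicate can be illustrated with a toy sketch. It is not the method of the paper: an exhaustive search over a two-byte input stands in for a real constraint solver, and the concrete predicates, the `solve` helper, and the target value 0xC0 are hypothetical, chosen only for illustration.

```python
from itertools import product

def solve(predicates, n_bytes):
    """Return the first input (tuple of byte values) that satisfies
    every predicate, or None if no such input exists."""
    for candidate in product(range(256), repeat=n_bytes):
        if all(p(candidate) for p in predicates):
            return candidate
    return None

# Hypothetical path predicate: branch conditions that must hold for the
# program to reach the vulnerable write.
path_predicate = [
    lambda x: x[0] == ord("A"),   # a command byte is checked first
    lambda x: x[1] >= 0x80,       # a length check that the input passes
]

# Hypothetical security predicate: the byte that overwrites the pointer
# must equal the value chosen by the attacker.
security_predicate = [lambda x: x[1] == 0xC0]

# Conjoin both predicates and solve the combined system.
exploit_input = solve(path_predicate + security_predicate, n_bytes=2)
print(exploit_input)
```

In a realistic setting the predicates would be symbolic formulas over input bytes collected during dynamic symbolic execution, and the conjunction would be handed to an SMT solver rather than enumerated.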
Generating uniformly distributed random numbers is necessary for computer simulation by Monte Carlo methods and molecular dynamics. Pseudorandom number generators (PRNGs) are used for this purpose. A PRNG computes numbers with a deterministic algorithm, yet the resulting sequence has the properties of a random sequence. In a number of Monte Carlo problems, random number generation takes up a significant share of the computation time, so increasing generation throughput is an important task. This paper describes the use of SIMD (Single Instruction, Multiple Data) instructions to parallelize pseudorandom number generation. We review SIMD instruction set extensions such as MMX, SSE, AVX2, and AVX512. An example AVX512 implementation is given for the LFSR113 pseudorandom number generator. The performance of different implementations of the algorithm is compared.
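As a rough illustration of why LFSR113 suits SIMD execution, the sketch below gives a scalar step of the generator and a lockstep update of several independent states, one per "lane". It is an assumption-laden sketch, not the AVX512 code of the paper: a Python list stands in for a vector register, the function names are hypothetical, and the shift and mask constants follow L'Ecuyer's published LFSR113 as recalled here and should be checked against the reference implementation.

```python
MASK32 = 0xFFFFFFFF

# Per-component parameters of LFSR113:
# (feedback left shift, feedback right shift, state mask, state left shift)
PARAMS = (
    (6, 13, 0xFFFFFFFE, 18),
    (2, 27, 0xFFFFFFF8, 2),
    (13, 21, 0xFFFFFFF0, 7),
    (3, 12, 0xFFFFFF80, 13),
)

def lfsr113_step(state):
    """Advance one four-word state; return (new_state, 32-bit output)."""
    new_state = []
    out = 0
    for z, (q, s, mask, k) in zip(state, PARAMS):
        b = (((z << q) & MASK32) ^ z) >> s      # feedback bits
        z = (((z & mask) << k) & MASK32) ^ b    # shifted state plus feedback
        new_state.append(z)
        out ^= z                                # combine the four components
    return tuple(new_state), out

def lfsr113_lanes(states):
    """Advance several independent states in lockstep. Every lane executes
    the same shifts and XORs, which is what a SIMD unit does in one go."""
    stepped = [lfsr113_step(s) for s in states]
    return [s for s, _ in stepped], [o for _, o in stepped]

# Seeds must satisfy z1 > 1, z2 > 7, z3 > 15, z4 > 127.
lanes = [(12345, 12345, 12345, 12345), (2, 8, 16, 128)]
lanes, outputs = lfsr113_lanes(lanes)
print(outputs)
```

Each lane uses only shifts, masks, and XORs with no data-dependent branches, so the per-lane loop maps directly onto vector shift and XOR instructions applied to all lanes at once.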
Equivalence checking algorithms have found wide application in system programming; they are used in software refactoring, security checking, malware detection, program integration, regression verification, and compiler verification and validation. In this paper we show that equivalence checking procedures can also be used to develop global optimizing transformations of programs. We consider the minimization problem for two formal models of programs: deterministic finite state transducers over finitely generated decidable groups, which model sequential reactive programs, and deterministic program schemata, which model sequential imperative programs. For both models the minimization problem can be solved by the same approach that is used to minimize finite state automata, by means of two basic optimizing transformations: removing useless states and joining equivalent states. The main results of the paper are Theorems 1 and 2. Theorem 1. If G is a finitely generated group and the word problem in G is decidable in polynomial time, then the minimization problem for deterministic finite state transducers over G is decidable in polynomial time as well. Theorem 2. If S is a decidable left-contracted ordered semigroup of basic program statements and the word problem in S is decidable in polynomial time, then the minimization problem for program schemata operating on the interpretation over S is decidable in polynomial time as well.
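The two transformations, removing useless states and joining equivalent states, can be sketched for the classical case of a complete deterministic finite automaton. This is only the textbook analogue (Moore-style partition refinement), not the algorithm for transducers over groups or for program schemata; the function name and the example automaton are hypothetical.

```python
def minimize_dfa(alphabet, delta, start, accepting):
    """Minimize a complete DFA given as delta[(state, symbol)] -> state.
    Returns (states, delta, start, accepting) of the minimized automaton."""
    # Step 1: remove useless states (here: states unreachable from start).
    reachable = {start}
    frontier = [start]
    while frontier:
        s = frontier.pop()
        for a in alphabet:
            t = delta[(s, a)]
            if t not in reachable:
                reachable.add(t)
                frontier.append(t)

    # Step 2: join equivalent states by refining a partition until stable.
    block = {s: int(s in accepting) for s in reachable}
    while True:
        ids = {}
        refined = {}
        for s in sorted(reachable):
            # States stay together only if they are in the same block and
            # all their successors are in the same blocks.
            sig = (block[s], tuple(block[delta[(s, a)]] for a in alphabet))
            refined[s] = ids.setdefault(sig, len(ids))
        stable = len(ids) == len(set(block.values()))
        block = refined
        if stable:
            break

    min_delta = {(block[s], a): block[delta[(s, a)]]
                 for s in reachable for a in alphabet}
    min_accepting = {block[s] for s in reachable if s in accepting}
    return set(block.values()), min_delta, block[start], min_accepting

# Example: strings over {0, 1} ending in '1'. States 2 and 3 duplicate
# states 0 and 1, and state 4 is unreachable.
alphabet = "01"
delta = {
    (0, "0"): 2, (0, "1"): 1,
    (1, "0"): 0, (1, "1"): 3,
    (2, "0"): 0, (2, "1"): 3,
    (3, "0"): 2, (3, "1"): 1,
    (4, "0"): 4, (4, "1"): 0,
}
states, min_delta, min_start, min_accepting = minimize_dfa(
    alphabet, delta, start=0, accepting={1, 3})
print(len(states))
```

For transducers and program schemata, the equivalence test between states is no longer a simple comparison of block labels; according to the theorems above, it relies on deciding the word problem in the underlying group or semigroup.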
This article gives an overview of a scalable infrastructure for storing and processing genome data in genetics problems. It describes the technologies used and the organization of unified access to the genome processing APIs of different underlying services. The article also covers methods for supporting scalable and cloud computing technologies. The first service of the virtual genome processing laboratory is presented; it solves the problem of predicting transcription factor binding sites. The main principles of service construction and the basic requirements for the underlying computation software in virtual laboratory environments are given. The overview describes the implemented web service (https://api.ispras.ru/demo/gen) for transcription factor binding site prediction. The solution is based on the ISPRAS API project as an API gateway and load balancer; middleware task-manager software that supports a pool of workers and communicates with the OpenStack infrastructure; and OpenZFS as intermediate storage with transparent compression. The described solution is easy to extend with new services that meet the basic requirements.
Many modern applications (such as large-scale Web sites, social networks, research projects, business analytics, etc.) have to deal with very large data volumes (also referred to as “big data”) and high read/write loads. These applications require the underlying data management systems to scale well in order to accommodate data growth and increasing workloads. High throughput, low latency, and data availability are also very important, as are data consistency guarantees. Traditional SQL-oriented DBMSs, despite their popularity, ACID transactions, and rich features, do not scale well and thus are not suitable in certain cases. A number of new data management systems and approaches intended to resolve scalability issues have emerged over the last decade. This paper reviews several classes of such systems and the key problems they are able to solve. The large variety of systems and approaches is due to the general trend toward specialization in the field of data management systems: every system has been adapted to solve a certain class of problems. Thus, the choice of a specific solution is determined by the specific problem to be solved: the expected load, the read/write ratio, the form of data storage and the query types, the desired level of consistency, reliability requirements, the availability of client libraries for the selected language, etc.
Protecting programs from reverse engineering is a pressing task. The best way to implement resistant obfuscation is to create an obfuscating compiler based on one of the existing compiler infrastructures. On the one hand, the obfuscated program is produced with full information about it available at every stage of compilation; on the other hand, this approach makes it possible to focus on developing the protection rather than on creating the required infrastructure. In addition, it provides support for multiple architectures and makes it possible to embed watermarks into the binary images of the program for each user, depending on a unique key. The paper describes methods for obfuscating C/C++ programs in order to prevent static analyzers from being applied to them, and surveys existing obfuscating compilers. The proposed transformations are based on well-known obfuscation algorithms (including constant string protection, fake cycle insertion, control flow graph flattening, function merging, function call encapsulation, control flow graph structure obfuscation, opaque predicate insertion, and others), specifically improved to better resist deobfuscation techniques based on static analysis. The methods are implemented within the LLVM (Low Level Virtual Machine) compiler infrastructure. Experimental results on the slowdown and memory growth of the resulting programs are given.
This paper discusses the problem of creating virtual clusters in clouds for big data analysis with Apache Hadoop and Apache Spark. Existing methods for creating Apache Spark clusters are described. The paper also describes the implemented solution for building Apache Spark clusters and executing Apache Spark jobs in an OpenStack environment. The solution is a modification of the OpenStack Sahara project and was included in the OpenStack Liberty release.
Finite State Machine (FSM) based approaches are widely used for deriving tests with guaranteed fault coverage for discrete event systems. Since the behavior of many present-day information and control systems depends on time, classical FSMs are extended with clock variables. Moreover, optionality in the specifications of real systems motivates the study of test derivation against models with nondeterministic behavior. In this paper, we adapt classical FSM-based test derivation methods to nondeterministic FSMs with timed guards and timeouts (TFSMs). We show that, unlike in the classical FSM case, checking the conformance relation cannot be reduced to checking the correspondence between TFSM transitions, which violates the main principle of FSM-based test derivation methods. Accordingly, the proposed approach and the corresponding fault model are based on the FSM abstraction of the given TFSM specification, which adequately describes the behavior of the TFSM. The fault domain contains TFSMs with a known upper bound on the number of states of the FSM abstraction, which makes it possible to avoid explicit enumeration of the implementations under test. We study the properties of the FSM abstraction of a nondeterministic TFSM and show that using the FSM abstraction makes it possible to adapt classical FSM-based test derivation methods to deriving tests with guaranteed fault coverage for TFSMs. A method is proposed for deriving a complete test suite for a complete, possibly nondeterministic TFSM when the implementation under test is a complete deterministic TFSM.
Apache Spark is a framework that provides fast computations on Big Data using the MapReduce model. Cloud environments make Big Data processing more flexible, since they allow virtual clusters to be created on demand. One of the most powerful open-source cloud environments is OpenStack. The main goal of this project is to make it possible to create virtual clusters with Apache Spark and other Big Data tools in OpenStack. There are three approaches to doing so. The first is to use the OpenStack REST APIs to create instances and then deploy the environment. This approach is used by the Apache Spark core team to create clusters in the proprietary Amazon EC2 cloud, and almost the same method has been implemented for OpenStack environments. However, because the OpenStack API changes frequently, this solution has been deprecated since the Kilo release. The second approach is to integrate virtual cluster creation into OpenStack as a built-in service. ISP RAS has provided several patches implementing a universal Spark job engine for OpenStack Sahara and integrating OpenStack Swift with Apache Spark as a drop-in replacement for Apache Hadoop. This approach makes it possible to use Spark clusters as a service in the PaaS service model. Since OpenStack releases are less frequent than Apache Spark releases, this approach may be inconvenient for developers who use the latest releases. The third solution uses Ansible for orchestration. It is implemented in a loosely coupled way, which makes it possible to add any auxiliary tool or even to use another cloud environment. It also makes it possible to choose which versions of Apache Spark and Apache Hadoop to deploy in virtual clusters. All the listed approaches are available under the Apache 2.0 license.
Ensuring the correctness of microprocessors and other microelectronic equipment is a fundamental problem. To deal with it, various tools for functional verification are used. Unlike bugs in software, which are relatively easy to fix (although this does not apply to their consequences), defects in integrated circuits, both design and manufacturing ones, cannot be removed. Despite the continuous development of computer-aided design (CAD) systems, test generation tools, and approaches to circuit analysis, verification remains the bottleneck of the microprocessor design cycle, accounting for approximately 70 percent of total design resources. The article gives a brief overview of microprocessor verification tools, describes issues that commonly occur in industrial practice, and analyzes possible ways to solve them. The main part of the article is dedicated to research in unit-level and system-level hardware verification conducted at ISPRAS. It describes such approaches as contract specification of pipelines, event-driven hardware specification, parallel/distributed testing, combinatorial test program generation, and template-based test program generation. The article also summarizes the outcomes of completed projects, describes ongoing work, and formulates directions for further research.
The high complexity of present-day programs makes it nearly impossible to write a program without defects, so tools for defect detection are increasingly necessary. This article presents Svace, a static program analysis tool developed at ISP RAS. The tool automatically finds defects and potential vulnerabilities in programs written in C and C++. Its main features are ease of use, deep interprocedural analysis, a wide variety of supported warning types, scalability to programs of millions of lines of code, and acceptable analysis quality (30-80% of true positive warnings). At the core of Svace lies an engine for interprocedural data-flow analysis based on function annotations. Each function is analyzed once and independently of the other functions, which yields almost linear scalability (the Linux kernel can be analyzed within 10 minutes on a relatively powerful machine, and analysis of the whole Android source code takes less than 3 hours). Intraprocedural analysis is performed on an internal representation of the source code derived from LLVM bitcode. It operates on value identifiers that are shared between memory locations holding the same values (similarly to generations in the SSA representation). Special attributes of these value identifiers are computed over the control-flow graph of the function. When a specific combination of attributes is observed, a defect warning is issued. The Svace analysis engine is accompanied by a lightweight analysis tool based on the Clang compiler that quickly checks a number of syntactic, semantic, and situational language-dependent rules. Analysis results can be presented to the user through an Eclipse IDE plugin. They can also be imported into an analysis results database to trace the history of program defects over time.