Matrix multiplication and universal scalability of the time on the Intel Scalable processors
Matrix multiplication is one of the core operations in many areas of scientific computing. We present the results of experiments on multiplying matrices whose size is comparable to the size of the onboard memory, which is 1.5 terabytes in our case. We run the experiments on a two-socket computing board with two Intel Xeon Platinum 8164 processors, each with 26 cores and multi-threading. The most interesting result of our study is the observation of a perfect scalability law for matrix multiplication, and of the universality of this law.
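To give a feeling for the problem sizes involved, the following minimal sketch estimates the largest square matrices that fit into a given amount of memory. It rests on assumptions that are not stated in the abstract: double-precision elements, three dense matrices (two operands and the result) held in memory at once, and no extra workspace. The names `max_square_dim` and `matmul_flops` are hypothetical helpers introduced only for this illustration.

```python
import math

BYTES_PER_DOUBLE = 8  # assumption: double-precision matrix elements


def max_square_dim(memory_bytes, n_matrices=3):
    """Largest n such that n_matrices dense n x n double matrices fit in memory."""
    elements_per_matrix = int(memory_bytes) // (n_matrices * BYTES_PER_DOUBLE)
    return math.isqrt(elements_per_matrix)


def matmul_flops(n):
    """Floating-point operations of the classical n x n matrix product (2 n^3)."""
    return 2 * n ** 3


# 1.5 terabytes read as 1.5 * 2**40 bytes and as 1.5 * 10**12 bytes.
n_binary = max_square_dim(1.5 * 2**40)
n_decimal = max_square_dim(1.5e12)
print(n_binary, n_decimal, matmul_flops(n_binary))
```

Under these assumptions the matrix dimension is of the order of a few hundred thousand, and the classical product costs a few times 10^16 floating-point operations.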
In this paper, we present an approach to scalable co-scheduling in distributed computing for complex sets of interrelated tasks (jobs). Scalability means that schedules are formed for job models with various levels of task granularity and data replication policies, and that processor resources and memory can be upgraded. The need for guaranteed job execution at the required quality of service makes it necessary to take into account the dynamics of the distributed environment, namely, changes in the number of jobs to be serviced, volumes of computations, possible failures of processor nodes, etc. As a consequence, in the general case, a set of scheduling versions, or a strategy, is required instead of a single version. We propose a scalable model of scheduling based on multicriteria strategies. The choice of a specific schedule depends on the load level of the resource dynamics and is formed as a resource query which is sent to a local batch-job management system.
We simulate a model for the evolution of the local virtual time profile in the conservative parallel discrete event simulation (PDES) algorithm with long-range communication links. The main findings of the simulation are that i) the growth exponent depends logarithmically on the concentration p of long-range links; ii) the utilisation of processing element time decreases slowly with p. This means that the conservative PDES with long-range communication links is fully scalable.
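The kind of model described here can be illustrated with a minimal sketch. It is not the authors' code: it assumes a ring of processing elements with roughly p*n randomly added long-range links, a synchronous update in which an element advances only if its local virtual time does not exceed that of any linked element, and exponentially distributed time increments. The function names and default parameters are hypothetical.

```python
import random


def build_links(n, p, rng):
    """Ring of n processing elements plus about p*n random long-range links."""
    neighbours = [{(i - 1) % n, (i + 1) % n} for i in range(n)]
    for _ in range(int(p * n)):
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def simulate(n=200, p=0.1, steps=500, seed=1):
    """Return (utilisation, squared width) of the local virtual time profile."""
    rng = random.Random(seed)
    neighbours = build_links(n, p, rng)
    lvt = [0.0] * n  # local virtual times, one per processing element
    active_total = 0
    for _ in range(steps):
        # Conservative rule: advance only if no linked element lags behind.
        active = [i for i in range(n)
                  if all(lvt[i] <= lvt[j] for j in neighbours[i])]
        for i in active:
            lvt[i] += rng.expovariate(1.0)  # assumed exponential increments
        active_total += len(active)
    mean = sum(lvt) / n
    width2 = sum((t - mean) ** 2 for t in lvt) / n
    return active_total / (n * steps), width2


utilisation, width2 = simulate()
print(utilisation, width2)
```

The utilisation is the average fraction of elements that advance per step, and the squared width measures the roughness of the time profile. Running `simulate` for several values of `p` and `steps` gives a rough picture of how both quantities behave in this toy setting.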
Population annealing is a novel Monte Carlo algorithm designed for simulations of systems of statistical mechanics with rugged free-energy landscapes. We discuss a realization of the algorithm for use on a hybrid computing architecture combining CPUs and GPGPUs. The particular advantage of this approach is that it is fully scalable up to many thousands of threads. We report on applications of the developed realization to several interesting problems, in particular the Ising and Potts models, and review applications of population annealing to further systems.
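For readers unfamiliar with the method, the following is a minimal serial sketch of population annealing for a small two-dimensional Ising model with periodic boundaries. It is an illustration under stated assumptions, not the hybrid CPU/GPGPU realization discussed in the abstract: the population size is kept fixed by multinomial resampling (one of several possible resampling schemes), single-spin Metropolis sweeps are used for equilibration, and all function names and parameter values are hypothetical.

```python
import math
import random


def energy(spins, L):
    """Ising energy with periodic boundaries; each bond is counted once."""
    e = 0
    for x in range(L):
        for y in range(L):
            e -= spins[x][y] * (spins[(x + 1) % L][y] + spins[x][(y + 1) % L])
    return e


def metropolis_sweep(spins, L, beta, rng):
    """One sweep of L*L single-spin Metropolis updates at inverse temperature beta."""
    for _ in range(L * L):
        x, y = rng.randrange(L), rng.randrange(L)
        nb = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
              + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
        d_e = 2 * spins[x][y] * nb
        if d_e <= 0 or rng.random() < math.exp(-beta * d_e):
            spins[x][y] = -spins[x][y]


def population_annealing(L=4, R=40, betas=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
                         sweeps=2, seed=0):
    """Return the mean energy per spin of the population after each cooling step."""
    rng = random.Random(seed)
    pop = [[[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
           for _ in range(R)]
    mean_energies = []
    beta_prev = betas[0]
    for beta in betas[1:]:
        energies = [energy(s, L) for s in pop]
        e_min = min(energies)  # shift energies to keep the exponentials bounded
        weights = [math.exp(-(beta - beta_prev) * (e - e_min)) for e in energies]
        # Resampling step: replicas are copied in proportion to their weights.
        chosen = rng.choices(range(R), weights=weights, k=R)
        pop = [[row[:] for row in pop[i]] for i in chosen]
        # Equilibration step: each replica is updated independently.
        for s in pop:
            for _ in range(sweeps):
                metropolis_sweep(s, L, beta, rng)
        mean_energies.append(sum(energy(s, L) for s in pop) / (R * L * L))
        beta_prev = beta
    return mean_energies


print(population_annealing())
```

The equilibration loop treats every replica independently, which is the property that makes the algorithm a natural fit for massively parallel hardware.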
Many modern applications (such as large-scale websites, social networks, research projects, business analytics, etc.) have to deal with very large data volumes (also referred to as “big data”) and high read/write loads. These applications require the underlying data management systems to scale well in order to accommodate data growth and increasing workloads. High throughput, low latency, and data availability are also very important, as are data consistency guarantees. Traditional SQL-oriented DBMSs, despite their popularity, ACID transactions, and rich features, do not scale well and thus are not suitable in certain cases. A number of new data management systems and approaches intended to resolve scalability issues have emerged over the last decade. This paper reviews several classes of such systems and the key problems they are able to solve. The large variety of systems and approaches is due to the general trend toward specialization in the field of data management systems: every data management system has been adapted to solve a certain class of problems. Thus, the selection of a specific solution is determined by the specific problem to be solved: the expected load, the ratio of read to write intensity, the form of data storage and the query types, the desired level of consistency, reliability requirements, the availability of client libraries for the selected language, etc.
The purpose of this manual is to consolidate theoretical knowledge in the areas of "computer microarchitecture", "CPU", and "CISC-to-RISC pipelines". The manual describes how contemporary superscalar processors of the Intel architecture operate when running native commands. At the end of the laboratory work the student must understand the cycle and phases of operation of the processor core during the execution of a command, the micro-operations, and their number for different phases of the command. The student should be able to explain the graphs and tables of modeling results for their version of the task and answer the questions placed at the end of this study guide.
A method based on the spectral analysis of thermowave oscillations formed under the effect of radiation of lasers operated in a periodic pulsed mode is developed for investigating the state of the interface of multilayered systems. The method is based on high sensitivity of the shape of the oscillating component of the pyrometric signal to adhesion characteristics of the phase interface. The shape of the signal is quantitatively estimated using the correlation coefficient (for a film–interface system) and the transfer function (for multilayered specimens).
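As a small illustration of how a signal shape can be compared quantitatively, the sketch below computes a correlation coefficient between a measured oscillating component and a reference shape. It assumes that the Pearson coefficient is meant and that both signals are sampled at the same time points; the abstract does not specify either, and the function name and sample values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sample sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant signal has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sampled shapes: a reference and a slightly distorted measurement.
reference = [math.sin(2 * math.pi * k / 20) for k in range(20)]
measured = [0.8 * v + 0.05 * (k % 3) for k, v in enumerate(reference)]
print(pearson(reference, measured))
```

A value close to 1 indicates that the measured shape follows the reference closely, while lower values indicate a distorted shape.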