An algorithm for optimizing the mapping of MPI processes is adapted for supercomputers with the Angara interconnect. The mapping algorithm is based on partitioning the communication pattern of a parallel program, so that the processes with the most intensive exchanges are bound to the nodes/processors connected by the highest bandwidth. The algorithm finds a near-optimal distribution of the program's processes over processor cores that minimizes the total execution time of exchanges between MPI processes. Results of optimized process placement with the proposed method on small supercomputers are analyzed. The dependence of MPI program execution time on supercomputer parameters and task parameters is also analyzed. A theoretical model is proposed for estimating the effect of mapping optimization on execution time for several types of supercomputer topologies. The prospects of using the implemented optimization library on large-scale supercomputers with the Angara interconnect are discussed.
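The idea of tying heavily communicating processes to well-connected cores can be illustrated with a minimal sketch. It is not the partitioning algorithm of the work itself: it swaps in a simple greedy heuristic, assumes a two-level machine (exchanges inside a node are free, exchanges between nodes cost their full volume), and does not model the Angara topology. The names `greedy_mapping` and `exchange_cost`, and the traffic matrix, are hypothetical.

```python
def greedy_mapping(traffic, n_procs, n_nodes, cores_per_node):
    """Greedy sketch: co-locate the most heavily communicating pairs.

    traffic maps a process pair (p, q) to its exchange volume.
    Assumes n_procs <= n_nodes * cores_per_node.
    Returns a dict: process rank -> node index.
    """
    free = [cores_per_node] * n_nodes  # free cores left on each node
    node_of = {}

    def place(proc, node):
        node_of[proc] = node
        free[node] -= 1

    # Visit pairs from the heaviest exchange to the lightest.
    for (p, q), _ in sorted(traffic.items(), key=lambda kv: -kv[1]):
        if p in node_of and q in node_of:
            continue
        if p in node_of or q in node_of:
            placed, other = (p, q) if p in node_of else (q, p)
            # Join the partner's node if it still has a free core.
            if free[node_of[placed]] > 0:
                place(other, node_of[placed])
            continue
        # Neither is placed: use the emptiest node if it can hold both.
        node = max(range(n_nodes), key=lambda n: free[n])
        if free[node] >= 2:
            place(p, node)
            place(q, node)

    # Processes that could not be co-located fill the remaining cores.
    for proc in range(n_procs):
        if proc not in node_of:
            place(proc, max(range(n_nodes), key=lambda n: free[n]))
    return node_of


def exchange_cost(traffic, node_of):
    """Total volume exchanged between different nodes (assumed cost model)."""
    return sum(w for (p, q), w in traffic.items() if node_of[p] != node_of[q])


# Hypothetical pattern: ranks 0-2 and 1-3 exchange heavily.
traffic = {(0, 2): 100, (1, 3): 100, (0, 1): 1, (2, 3): 1}
block = {0: 0, 1: 0, 2: 1, 3: 1}            # default block placement
tuned = greedy_mapping(traffic, 4, 2, 2)
print(exchange_cost(traffic, block), exchange_cost(traffic, tuned))
```

On this toy pattern the default block placement sends both heavy exchanges across nodes, while the greedy placement keeps them inside a node. A real mapper would replace the binary cost with a distance or bandwidth matrix of the target interconnect.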
The manual addresses the problems of identifying latent parallelism in algorithms by explicit methods (construction of the tiered-parallel form of the algorithm graph) and implicit methods (data-flow computation), the development of parallel programs in the MPI programming paradigm, and the quantitative study of how the speedup gained from parallelization depends on the parameters of the multiprocessor system and the quality of the parallel programs. The manual is practical and can be used by students preparing for laboratory and practical work, course projects, and diploma projects. The resulting network applications run in a multiprocessor environment of the MPP (Massively Parallel Processing) architecture, in particular on the Linux computing cluster of the IT-4 department of MGUPI. Before starting work, the reader should study the complete lecture notes for the 'Parallel Computing' course.
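As a minimal sketch of the explicit approach, the tiered-parallel (layered, or "stacked-parallel") form of an algorithm graph can be built by assigning each operation to the tier one above its deepest dependency; operations within one tier are mutually independent and can run in parallel. The function name `tiered_parallel_form` and the example graph are hypothetical, and the graph is assumed to be acyclic.

```python
from collections import defaultdict


def tiered_parallel_form(deps):
    """Group the operations of an acyclic algorithm graph into tiers.

    deps maps each operation to the operations whose results it needs.
    Returns a list of tiers; tier 1 holds operations with no dependencies.
    """
    tier = {}

    def tier_of(op):
        # Tier = 1 + the deepest tier among the operation's dependencies.
        if op not in tier:
            tier[op] = 1 + max((tier_of(d) for d in deps.get(op, ())), default=0)
        return tier[op]

    # Include operations that appear only as dependencies.
    ops = set(deps) | {d for ds in deps.values() for d in ds}
    tiers = defaultdict(list)
    for op in ops:
        tiers[tier_of(op)].append(op)
    return [sorted(tiers[k]) for k in sorted(tiers)]


# Hypothetical graph for the expression (a + b) * (c + d).
graph = {"add1": [], "add2": [], "mul": ["add1", "add2"]}
print(tiered_parallel_form(graph))
```

The number of tiers bounds the parallel execution time from below, and the widest tier shows how many processors the algorithm can usefully occupy at once.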
This work considers the problem of efficiently using the resources of multiprocessor computer systems (MCS) of the MPP (Massively Parallel Processing) architecture when computing specific parallel algorithms. An integrated indicator of the efficiency of parallel program execution on a specific MCS is proposed; it takes into account both the hardware performance of the system and the quality of the program's parallelization.