Performance of MD-Algorithms on hybrid systems-on-chip Nvidia Tegra K1 & X1
P. 199-211.
In this paper we consider the efficiency of hybrid systems-on-a-chip for high-performance calculations. Firstly, we build Roofline performance models for the systems considered using the Empirical Roofline Toolkit and compare the results with the theoretical estimates. Secondly, we use LAMMPS as an example of a molecular dynamics package to demonstrate its performance and efficiency in various configurations running on Nvidia Tegra K1 & X1. Following the Roofline approach, we attempt to distinguish compute-bound and memory-bound conditions for the MD algorithm using the Lennard-Jones liquid model. The results are discussed in the context of the LAMMPS performance on Intel Xeon CPUs and the Nvidia Tesla K80 GPU.
In book
Vol. 687. Springer, 2016
Pavlov D., Galigerov V., Kolotinskii D. et al., International Journal of High Performance Computing Applications 2023
Fluid dynamics is a ubiquitous problem that arises in different branches of science and industry. It is usually tackled by numerically solving differential equations on a finite grid. Molecular dynamics was not a feasible tool to approach fluid dynamics until very recently due to its disproportional computational complexity. In this paper we propose a new ...
Added: July 18, 2023
Gostev I. M., Sibirtseva E. A., RUDN Journal of Mathematics, Information Sciences and Physics 2014 No. 4 P. 68-84
Low-cost gaze tracking systems are in great demand due to their wide range of application. Commonly, extra devices are needed (for instance, head mounted cameras); however, in this investigation gaze tracking is performed in real-time based on the video stream from an infrared video camera. A comparative analysis of the existing analogues was executed and ...
Added: December 7, 2014
Kondratyuk N., Nikolskiy V., Pavlov D. et al., International Journal of High Performance Computing Applications 2021 Vol. 35 No. 4 P. 312-324
Classical molecular dynamics (MD) calculations represent a significant part of the utilization time of high-performance computing systems. As usual, the efficiency of such calculations is based on an interplay of software and hardware that are nowadays moving to hybrid GPU-based technologies. Several well-developed open-source MD codes focused on GPUs differ both in their data management ...
Added: June 25, 2021
Vecher V., Nikolskiy V., Stegailov V., in: Supercomputing. RuSCDays 2016. Communications in Computer and Information Science. Revised Selected Papers. Vol. 687. Springer, 2016. P. 78-90.
Energy consumption of hybrid systems is a pressing problem of modern high-performance computing. The trade-off between power consumption and performance becomes more and more prominent. In this paper, we discuss the energy and power efficiency of two modern hybrid minicomputers Jetson TK1 and TX1. We use the Empirical Roofline Tool to obtain peak performance data ...
Added: May 31, 2017
Shershakov S., in: Proceedings of the International Conference on Electrical and Computer Systems ICECS'12. Ottawa: International ASET Inc, 2012. P. 207-1-207-8.
The SLAM-based Static Driver Verifier Research Platform (SDVRP), as a tool that systematically analyzes source code and allows writing custom SLIC rules for various platforms, provided a potent verification mechanism for an embedded software system based on ARM Cortex-M0 microprocessor. The correctness of this software is of particular importance in the sense that there are ...
Added: March 14, 2013
Paramonov A. A., Философский журнал 2018 Vol. 11 No. 4 P. 59-74
This article examines a historical case brought to general attention by English theoretical physicist and historian of science Julian Barbour. The events took place at the beginning of the 20th century when Albert Einstein, on his first steps towards the theory of General relativity, formulated what later was to become the famous Mach’s Principle, in ...
Added: September 1, 2019
Ermak T., Shehovtsov A., Yakovlev P., in: Mathematical Modeling and High-Performance Computing in Bioinformatics, Biomedicine and Biotechnology (MM-HPC-BBB-2018). The 3rd International Symposium. Novosibirsk: Институт вычислительной математики и математической геофизики Сибирского отделения РАН, 2018. P. 1-1.
Added: September 24, 2021
Pugachev L., Umarov I., Popov V. et al., in: Supercomputing: 8th Russian Supercomputing Days, RuSCDays 2022, Moscow, Russia, September 26–27, 2022, Revised Selected Papers. Vol. 13708. Springer, 2022. P. 290-302.
Particle-in-Cell models are among the most demanding computational problems that require appropriate supercomputing hardware. In this paper we consider the solution of generic PIC problems on the Desmos supercomputer equipped with novel AMD MI50 GPUs and Angara interconnect. The open-source PIConGPU code is used. The acceleration limits and bottlenecks for this type of calculations are ...
Added: May 16, 2023
Rezvykh P. V., Ziche P., Stuttgart: Frommann-Holzboog, 2013
Johannes Kepler (1571–1630), as a brilliant discoverer of natural laws, played a central role in the early natural philosophy of Schelling and Hegel; Romanticism celebrated him as the very prototype of genius. Around 1840, in a changed context, Schelling advocated the first complete edition of Kepler's works: natural philosophy was now being sharply criticized by empiricism and inductivism. ...
Added: September 28, 2013
Kolotinskii D., Alexei Timofeev, in: Supercomputing: 8th Russian Supercomputing Days, RuSCDays 2022, Moscow, Russia, September 26–27, 2022, Revised Selected Papers. Vol. 13708. Springer, 2022. P. 276-289.
According to the TOP-500 supercomputer ranking [27], since 2017, the share of supercomputers which have NVIDIA V100 and A100 graphics accelerators has been continuously growing, reaching 80% by November 2021 from the total number of supercomputers with accelerators and co-processors. This paper presents the results of an assessment of energy and economic efficiency, as well as ...
Added: May 16, 2023
Borovský M., Weigel M., Barash L.Yu. et al., EPJ Web of Conferences 2016 Vol. 108 P. 02016-p.1-02016-p.6
The population annealing algorithm is a novel approach to study systems with rough free-energy landscapes, such as spin glasses. It combines the power of simulated annealing, Boltzmann weighted differential reproduction and sequential Monte Carlo process to bring the population of replicas to the equilibrium even in the low-temperature region. Moreover, it provides a very good ...
Added: January 31, 2018
Podkopaev A., Lahav O., Vafeiadis V., in: 31st European Conference on Object-Oriented Programming, ECOOP 2017. Vol. 74. Dagstuhl: Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2017. Ch. 22. P. 1-28.
We prove the correctness of compilation of relaxed memory accesses and release-acquire fences from the "promising" semantics of [Kang et al. POPL'17] to the ARMv8 POP machine of [Flur et al. POPL'16]. The proof is highly non-trivial because both the ARMv8 POP and the promising semantics provide some extremely weak consistency guarantees for normal memory ...
Added: December 24, 2018
Fomin D., Математические вопросы криптографии 2015 Vol. 6 No. 2 P. 99-108
In this article we consider NVIDIA GPU implementation aspects of an XSL block cipher over the finite field with MDS-matrix linear transformation. We compare obtained results with some other block ciphers. ...
Added: May 4, 2019
Cadena L., Cadena F., Легалов А. И. et al., in: Proceedings of The World Congress on Engineering and Computer Science 2018, 23-25 October, 2018. San Francisco, USA. Vol. 1. Newswood Limited, 2018. P. 453-458.
Nowadays, remotely sensed images are used for various purposes in different applications. One of them is the cadastral application using high-resolution satellite imagery. Edge detection has an important role in image processing, especially in the detection and extraction of physical features that are useful in the analysis of cadasters. ...
Added: October 31, 2020
Stegailov V., Timofeev A., in: Supercomputing. RuSCDays 2018. Communications in Computer and Information Science, vol. 965. Cham: Springer, 2019. P. 543-553.
Modern Elbrus-4S and Elbrus-8S processors show floating point performance comparable to the popular Intel processors in the field of high-performance computing. Tasks oriented to take advantage of the VLIW architecture show even greater efficiency on Elbrus processors. In this paper the efficiency of the most popular materials science codes in the field of classical molecular ...
Added: March 10, 2019
Nikolskiy V., Stegailov V., in: Parallel Computing: Technology Trends. IOS Press, 2020. P. 565-573.
In this work, a new algorithm was developed for calculating the four-point water model TIP4P on graphics accelerators. It was designed as a part of the flexible molecular dynamics modeling package LAMMPS in the library module “GPU”. In this paper we describe two approaches to implement the TIP4P model for GPU: 1) to divide the ...
Added: March 27, 2020
Аникин А. С., Большакова О. А., Гасников А. В. et al., Журнал вычислительной математики и математической физики 2019 Vol. 59 No. 12 P. 2060-2076
Most problems in structural computational biology require minimizing an energy function (force field) defined on the geometry of a molecule. This makes it possible to determine the properties of molecules, predict the correct arrangement of protein chains, find the best fit between molecules when predicting complex formation (docking), test hypotheses about protein design, and solve many other problems that arise in modern drug development. In the case of low-molecular-weight compounds ...
Added: September 24, 2021
Gorchakov A., Amirkhanova G., Duysenbaeva A., DEStech Publications, Inc., 2018
In this paper we consider the problem of finding the energy minimum of the aggregate of atoms of a fragment of a planar crystal lattice. To calculate the energy, the Brenner or REBO (reactive empirical bond order) method is used. The REBO potential is calculated using the LAMMPS package (Large-scale Atomic/Molecular Massively Parallel Simulator). As ...
Added: October 31, 2018
Fomin D., Математические вопросы криптографии 2016 Vol. 7 No. 2 P. 121-130
A timing attack against an AES-type block cipher CUDA implementation is presented. Our experiments show that it is possible to extract a secret AES 128-bit key with complexity of 2^32 chosen plaintext encryptions. This approach may be applied to AES with other key sizes and, moreover, to any block cipher with a linear transform that is ...
Added: May 4, 2019