### Article

## A timing attack on CUDA implementations of an AES-type block cipher

A timing attack against an AES-type block cipher CUDA implementa- tion is presented. Our experiments show that it is possible to extract a secret AES 128-bit key with complexity of 2^32 chosen plaintext encryptions. This approach may be applied to AES with other key sizes and, moreover, to any block cipher with a linear transform that is a composition of two types of linear transformations on a substate.

This volume comprises the proceedings of the 13th International Conference on Parallel Processing and Applied Mathematics (PPAM 2019), which was held inBiałystok, Poland, September 8–11, 2019. It was organized by the Department of Computer and Information Science of the Częstochowa University of Technology together with Białystok University of Technology, under the patronage of the Committee of Informatics of the Polish Academy of Sciences, in technical cooperation with the IEEE Computer Society and IEEE Computational Intelligence Society. The main organizer was Roman Wyrzykowski.PPAM is a biennial conference. 12 previous events have been held in different places in Poland since 1994, when the first PPAM took place in Częstochowa. Thus, the event in Białystok was an opportunity to celebrate the 25th anniversary of PPAM. The proceedings of the last nine conferences have been published by Springer in theLecture Notes in Computer Science series (Nałęczów, 2001, vol. 2328; Częstochowa,2003, vol. 3019; Poznań, 2005, vol. 3911; Gdańsk, 2007, vol. 4967; Wrocław, 2009, vols. 6067 and 6068; Toruń, 2011, vols. 7203 and 7204; Warsaw, 2013, vols. 8384 and8385; Kraków, 2015, vols. 9573 and 9574; and Lublin, 2017, vols. 10777 and 10778). The PPAM conferences have become an international forum for the exchange of the ideas between researchers involved in parallel and distributed computing, including theory and applications, as well as applied and computational mathematics. The focus of PPAM 2019 was on models, algorithms, and software tools that facilitate the efficient and convenient utilization of modern parallel and distributed computing architectures, as well as on large-scale applications, including artificial intelligence and machine learning problems. This meeting gathered more than 170 participants from 26 countries. A strict refereeing process resulted in the acceptance of 91 contributed papers for publication in these conference proceedings. For regular tracks of the conference, 41 papers were selected from 89 submissions, thus resulting in an acceptance rate of about 46%. The regular tracks covered important fields of parallel/distributed/cloud computing and applied mathematics.

We present optimization guidelines and implementations of cryptographic hash functions GOST R 34.11-94 and GOST R 34.11-2012. Results for x86_64 CPUs and NVIDIA CUDA-capable GPUs are provided for our and several other well-known implementations. It is shown that the new standard may be twice as fast as the old one on modern CPUs, but it may be slower on embedded devices and GPUs. The results given for our implementation are the fastest among all the tested implementations on both platforms.

An approach is described to implementation of the Method of Four Russians for reducing the dense matrices over GF(2) to row echelon form using the NVIDIA CUDA platform. Estimates of the algorithm running time and recommendations on choosing the algorithm parameters are given. It is shown that the developed implementation is most effective in comparison with the existing solutions for matrices of a size 2^17 x 2^17.

In this work, we describe the problem of automated pollen recognition using images from lighting microscope. Automated pollen recognition related to such important tasks as honey quality control, air quality control for helping to asthma and allergy patients, paleopalynology, forensic palynology. We describe the problem solution based on machine learning and CUDA. Extracted features and preprocessing steps are described. Results are compared on dataset of 5 specie. The best model is convolutional neural network with 89% of accuracy. Its performance was particularly up twice using CUDA.

A model for organizing cargo transportation between two node stations connected by a railway line which contains a certain number of intermediate stations is considered. The movement of cargo is in one direction. Such a situation may occur, for example, if one of the node stations is located in a region which produce raw material for manufacturing industry located in another region, and there is another node station. The organization of freight traﬃc is performed by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, as well as the rule of distribution of cargo to the ﬁnal node stations. The process of cargo transportation is followed by the set rule of control. For such a model, one must determine possible modes of cargo transportation and describe their properties. This model is described by a ﬁnite-dimensional system of diﬀerential equations with nonlocal linear restrictions. The class of the solution satisfying nonlocal linear restrictions is extremely narrow. It results in the need for the “correct” extension of solutions of a system of diﬀerential equations to a class of quasi-solutions having the distinctive feature of gaps in a countable number of points. It was possible numerically using the Runge–Kutta method of the fourth order to build these quasi-solutions and determine their rate of growth. Let us note that in the technical plan the main complexity consisted in obtaining quasi-solutions satisfying the nonlocal linear restrictions. Furthermore, we investigated the dependence of quasi-solutions and, in particular, sizes of gaps (jumps) of solutions on a number of parameters of the model characterizing a rule of control, technologies for transportation of cargo and intensity of giving of cargo on a node station.

Let k be a field of characteristic zero, let G be a connected reductive algebraic group over k and let g be its Lie algebra. Let k(G), respectively, k(g), be the field of k- rational functions on G, respectively, g. The conjugation action of G on itself induces the adjoint action of G on g. We investigate the question whether or not the field extensions k(G)/k(G)^G and k(g)/k(g)^G are purely transcendental. We show that the answer is the same for k(G)/k(G)^G and k(g)/k(g)^G, and reduce the problem to the case where G is simple. For simple groups we show that the answer is positive if G is split of type A_n or C_n, and negative for groups of other types, except possibly G_2. A key ingredient in the proof of the negative result is a recent formula for the unramified Brauer group of a homogeneous space with connected stabilizers. As a byproduct of our investigation we give an affirmative answer to a question of Grothendieck about the existence of a rational section of the categorical quotient morphism for the conjugating action of G on itself.

Let G be a connected semisimple algebraic group over an algebraically closed field k. In 1965 Steinberg proved that if G is simply connected, then in G there exists a closed irreducible cross-section of the set of closures of regular conjugacy classes. We prove that in arbitrary G such a cross-section exists if and only if the universal covering isogeny Ĝ → G is bijective; this answers Grothendieck's question cited in the epigraph. In particular, for char k = 0, the converse to Steinberg's theorem holds. The existence of a cross-section in G implies, at least for char k = 0, that the algebra k[G]G of class functions on G is generated by rk G elements. We describe, for arbitrary G, a minimal generating set of k[G]G and that of the representation ring of G and answer two Grothendieck's questions on constructing generating sets of k[G]G. We prove the existence of a rational (i.e., local) section of the quotient morphism for arbitrary G and the existence of a rational cross-section in G (for char k = 0, this has been proved earlier); this answers the other question cited in the epigraph. We also prove that the existence of a rational section is equivalent to the existence of a rational W-equivariant map T- - - >G/T where T is a maximal torus of G and W the Weyl group.