Simulating the time projection chamber responses at the MPD detector using generative adversarial networks
High energy physics experiments rely heavily on detailed detector simulation models for many tasks. Running these detailed models typically requires a notable amount of the computing time available to the experiments. In this work, we demonstrate a new approach to speed up the simulation of the Time Projection Chamber tracker of the MPD experiment at the NICA accelerator complex. Our method is based on a Generative Adversarial Network, a deep learning technique that allows implicit estimation of the population distribution for a given set of objects. This approach lets us learn and then sample from the distribution of raw detector responses, conditioned on the parameters of the charged particle tracks. To evaluate the quality of the proposed model, we integrate a prototype into the MPD software stack and demonstrate that it produces high-quality events similar to those of the detailed simulator, with a speed-up of at least an order of magnitude. The prototype is trained on the responses from the inner part of the detector and, once expanded to the full detector, should be ready for use in physics tasks.
During LHC Run 1, the LHCb experiment recorded around 10^11 collision events. This paper describes Event Index, an event search system. Its primary function is to quickly select subsets of events from a combination of conditions, such as the estimated decay channel or the number of hits in a subdetector. Event Index is essentially Apache Lucene optimized for read-only indexes distributed over independent shards on independent nodes.
One of the most challenging data analysis tasks of modern High Energy Physics experiments is the identification of particles. In these proceedings we review the new approaches used for particle identification at the LHCb experiment. Machine-learning-based techniques are used to identify the species of charged and neutral particles using several observables obtained by the LHCb sub-detectors. We show the performance of various solutions based on Neural Network and Boosted Decision Tree models.
21st International Conference, Guimaraes, Portugal, November 4–6, 2020, Proceedings, Part II. Editors: Cesar Analide, Paulo Novais, David Camacho, Hujun Yin
Conference proceedings IDEAL 2020
This paper studies the application of machine learning methods to allocate oil resources in the strata. To solve the problem, several machine learning algorithms were analyzed and trained on borehole logging measurements: K-nearest neighbors, Random forest, and Gradient boosting.
We propose a novel multi-texture synthesis model based on generative adversarial networks (GANs) with a user-controllable mechanism. This user control makes it possible to explicitly specify which texture the model should generate. This property follows from using an encoder part which learns a latent representation for each texture from the dataset. To ensure dataset coverage, we use an adversarial loss function that penalizes incorrect reproductions of a given texture. In experiments, we show that our model can learn descriptive texture manifolds for large datasets and from raw data such as a collection of high-resolution photos. We show that our unsupervised learning pipeline may help segmentation models. Moreover, we apply our method to produce 3D textures and show that it outperforms existing baselines.
Prior theoretical analysis suggested that adversarially trained generative models are naturally inclined to learn distributions with low support. In particular, this effect is caused by the limited capacity of the discriminator network. To verify this claim, a statistical test based on the birthday paradox was proposed, and it partially confirmed the analysis. In this paper, we continue this line of work and develop a parameter-free and straightforward method to estimate the support size of an arbitrary decoder-based generative model. Our approach considers the decoder network from a geometric viewpoint and evaluates the support size as the volume of the manifold containing the generative model samples. Additionally, we propose a method to measure the non-uniformity of a generative model that can provide additional insight into the model’s behavior. We then apply these tools to perform a quantitative comparison of common generative models.
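For readers unfamiliar with the birthday-paradox argument mentioned above, the following is a minimal sketch of the underlying arithmetic, not the method of this paper or of the earlier test. It assumes samples are drawn uniformly from a finite support, and the function names `collision_probability` and `support_from_batch` are hypothetical.

```python
import math


def collision_probability(support_size: int, batch_size: int) -> float:
    """Exact probability that a batch drawn uniformly (with replacement)
    from `support_size` distinct items contains at least one duplicate."""
    if batch_size > support_size:
        return 1.0  # pigeonhole principle: a duplicate is unavoidable
    p_no_collision = 1.0
    for i in range(batch_size):
        p_no_collision *= 1.0 - i / support_size
    return 1.0 - p_no_collision


def support_from_batch(batch_size: int, p_collision: float = 0.5) -> float:
    """Rough support-size estimate (assumes a uniform distribution).

    Inverts the approximation p ~ 1 - exp(-s (s - 1) / (2 N)) for N, where
    s is the batch size at which duplicates appear with probability p.
    """
    return batch_size * (batch_size - 1) / (-2.0 * math.log(1.0 - p_collision))
```

Under these assumptions, if batches of size s contain near-duplicates about half the time, the support is on the order of s squared. A real generative model is rarely uniform over its support, which is one reason such an estimate is only indicative.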
Recently, deep learning methods have been increasingly applied to spoken language technologies, including signal processing, language understanding and generation, dialogue management, as well as joint optimisations of these (end-to-end learning). However, such methods still have limitations, and it is not yet clear that deep learning and joint optimisation are the key to the future.
Encompassing the current deep learning trends and traditional knowledge-based methods, SLT’s 2018 main theme will be around “Spoken Language Technology in the Era of Deep Learning: Challenges and Opportunities”.
The last two decades saw a dramatic increase in the number of papers published on the subject of stylometry, which is often narrowly understood as the task of identification of the author of a particular text fragment based on its stylistic properties. We present a new lightweight algorithm for stylometric identification of authors of Latin prose texts based on Burrows’s Delta, computed over relative frequencies of 244 manually selected genre and topic neutral words, and the Dirichlet distribution, whose parameters we estimate using an iterative maximum-likelihood algorithm. In order to demonstrate the effectiveness of the method, we present a case study of 3000-word fragments of texts by 36 classical and medieval authors and show that our method performs on par with Random Forest, a powerful general-purpose classification algorithm. We provide summary statistics of our algorithm’s performance together with confusion matrices demonstrating pairwise discriminability of texts by different authors. The advantages of our method are that it is very simple to implement, very quick to train and do inference with, and that it is very interpretable since it is a model-based algorithm: precision of the fitted Dirichlet distributions directly corresponds to the stylistic homogeneity of the texts by different authors. This makes it possible to use the algorithm as a general research tool in Latin stylistics.
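As an illustration of the distance measure named above, here is a minimal sketch of the classic Burrows's Delta: the mean absolute difference of z-scored relative word frequencies. It does not reproduce the Dirichlet-based algorithm of the paper; the function names and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word in a tokenized text."""
    total = len(tokens)
    return [tokens.count(word) / total for word in vocabulary]


def burrows_delta(freqs_a, freqs_b, corpus_freqs):
    """Classic Burrows's Delta between two frequency vectors.

    `corpus_freqs` is a list of frequency vectors (one per reference text)
    used to estimate the mean and standard deviation of every word.
    Words with zero variance in the reference corpus are skipped.
    """
    differences = []
    for j in range(len(freqs_a)):
        column = [freqs[j] for freqs in corpus_freqs]
        sigma = pstdev(column)  # assumption: population standard deviation
        if sigma == 0:
            continue
        mu = mean(column)
        z_a = (freqs_a[j] - mu) / sigma
        z_b = (freqs_b[j] - mu) / sigma
        differences.append(abs(z_a - z_b))
    return mean(differences) if differences else 0.0
```

A smaller Delta means the two texts use the selected words in more similar proportions. In an attribution setting, a disputed fragment would typically be assigned to the candidate author whose texts give the smallest Delta.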
We consider a model for organizing cargo transportation between two node stations connected by a railway line with a certain number of intermediate stations. Cargo moves in one direction. Such a situation may occur, for example, if one node station is located in a region that produces raw material for a manufacturing industry located in another region, where the other node station is located. Freight traffic is organized by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, and the rule for distributing cargo to the final node stations. The process of cargo transportation follows a set control rule. For such a model, one must determine the possible modes of cargo transportation and describe their properties. The model is described by a finite-dimensional system of differential equations with nonlocal linear restrictions. The class of solutions satisfying the nonlocal linear restrictions is extremely narrow. This makes it necessary to extend the solutions of the system of differential equations “correctly” to a class of quasi-solutions whose distinctive feature is gaps at a countable number of points. Using the fourth-order Runge–Kutta method, we were able to construct these quasi-solutions numerically and determine their rate of growth. The main technical difficulty was obtaining quasi-solutions that satisfy the nonlocal linear restrictions. Furthermore, we investigated how the quasi-solutions and, in particular, the sizes of their gaps (jumps) depend on a number of model parameters characterizing the control rule, the cargo transportation technologies, and the intensity of cargo supply at a node station.
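The fourth-order Runge–Kutta method mentioned above can be sketched as follows for a generic finite-dimensional system y' = f(t, y). This is only the textbook integrator; it does not handle the nonlocal linear restrictions or the jumps of the quasi-solutions, and the names `rk4_step` and `integrate` are hypothetical.

```python
def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y).

    `y` is a list of floats and `f` returns a list of the same length.
    """
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate(f, t0, y0, h, n_steps):
    """Advance the state from t0 by n_steps steps of size h."""
    t, y = t0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y
```

For a solution with gaps, one plausible approach would be to integrate each smooth segment with such a routine and apply the jump condition between segments, but the abstract does not specify how the authors did this.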
Event logs collected by modern information and technical systems usually contain enough data for automated process model discovery. A variety of algorithms have been developed for process model discovery, conformance checking, log-to-model alignment, comparison of process models, etc.; nevertheless, quick analysis of ad-hoc selected parts of a journal still lacks a full-fledged implementation. This paper describes a ROLAP-based method of multidimensional event log storage for process mining. The result of the journal analysis is visualized as a directed graph representing the union of all possible event sequences, ranked by their occurrence probability. Our implementation allows the analyst to discover process models for sublogs defined by an ad-hoc selection of criteria and a value of occurrence probability.
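To make the idea of a probability-ranked directed graph concrete, here is a minimal in-memory sketch. It is not the ROLAP-based implementation described above. It assumes that an edge's occurrence probability is the share of transitions leaving the source event that go to the target event, which may differ from the paper's definition, and the name `directly_follows_graph` is hypothetical.

```python
from collections import Counter


def directly_follows_graph(traces, min_probability=0.0):
    """Build a directed graph of event transitions from a list of traces.

    Each trace is a list of event names in order of occurrence. Returns a
    dict mapping (source, target) to the estimated transition probability,
    keeping only edges with probability >= min_probability.
    """
    edge_counts = Counter()
    outgoing_counts = Counter()
    for trace in traces:
        for source, target in zip(trace, trace[1:]):
            edge_counts[(source, target)] += 1
            outgoing_counts[source] += 1
    graph = {}
    for (source, target), count in edge_counts.items():
        probability = count / outgoing_counts[source]
        if probability >= min_probability:
            graph[(source, target)] = probability
    return graph
```

Restricting `traces` to those that match ad-hoc criteria (for example a time window or a department) and raising `min_probability` would correspond to exploring a sublog at a chosen level of detail.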
The dynamics of a two-component Davydov-Scott (DS) soliton with a small mismatch of the initial location or velocity of the high-frequency (HF) component was investigated within the framework of the Zakharov-type system of two coupled equations for the HF and low-frequency (LF) fields. In this system, the HF field is described by the linear Schrödinger equation with the potential generated by the LF component varying in time and space. The LF component in this system is described by the Korteweg-de Vries equation with a term of quadratic influence of the HF field on the LF field. The oscillation frequency of the DS soliton's components was found analytically using the balance equation. The perturbed DS soliton was shown to be stable. The analytical results were confirmed by numerical simulations.