mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing
Soshnikov D. V., Valieva Y.
In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using a functional approach. The library defines operations on lazy data streams of named dictionaries represented as generators (so-called multi-field datastreams), and allows those data streams to be enriched with more 'fields' during data preparation and feature extraction. Thus, most data preparation tasks can be expressed as a neat linear 'pipeline', similar in syntax to UNIX pipes or the |> functional composition operator in F#. We define basic operations on multi-field datastreams, which resemble classical monadic operations, and show the similarity of the proposed approach to monads in functional programming. We also show how the library was used in complex deep learning tasks of event detection in video, and discuss different evaluation strategies that allow for different trade-offs between memory and performance.
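The idea of a multi-field datastream can be illustrated with a minimal sketch in plain Python. This is not mPyPl's actual API: the names `Stage`, `apply`, `where`, and `records` are hypothetical helpers introduced here only to show how a lazy stream of dictionaries might be enriched with new fields through a pipe-like syntax.

```python
class Stage:
    """Hypothetical wrapper: turns an iterator -> iterator function into a pipe stage."""

    def __init__(self, fn):
        self.fn = fn

    def __ror__(self, stream):
        # Called for `stream | stage`, because generators do not define `|` themselves.
        return self.fn(stream)


def apply(src, dst, func):
    """Stage that adds field `dst`, computed from field `src`, to every record."""
    def run(stream):
        for rec in stream:
            yield {**rec, dst: func(rec[src])}
    return Stage(run)


def where(pred):
    """Stage that keeps only the records for which `pred` returns True."""
    def run(stream):
        for rec in stream:
            if pred(rec):
                yield rec
    return Stage(run)


def records(names):
    """Source: a lazy stream of dictionaries with a single 'filename' field."""
    for name in names:
        yield {"filename": name}


# A linear pipeline; nothing is computed until the stream is consumed.
pipeline = (
    records(["a.txt", "bb.jpg", "ccc.txt"])
    | apply("filename", "ext", lambda f: f.rsplit(".", 1)[-1])
    | where(lambda r: r["ext"] == "txt")
    | apply("filename", "name_len", len)
)

result = list(pipeline)  # consuming the generator triggers evaluation
print(result)
```

Each stage receives a generator and returns a generator, so records flow through one at a time and memory use stays bounded unless the stream is explicitly materialized, as `list` does here.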
Research target: Computer Science
Priority areas: IT and mathematics
, , Microsoft Journal of Applied Research, USA 2019 Vol. 12 P. 140-150
Added: November 20, 2020
, , et al., Journal of Physics: Conference Series 2017 Vol. 898 No. 3 P. 1-6
High-energy physics experiments rely on reconstruction of the trajectories of particles produced at the interaction point. This is a challenging task, especially in the high track multiplicity environment generated by p-p collisions at the LHC energies. A typical event includes hundreds of signal examples (interesting decays) and a significant amount of noise (uninteresting examples). This ...
Added: February 25, 2018
, , et al., Proceedings of Machine Learning Research 2020 P. 1-9
We propose a novel method for gradient-based optimization of black-box simulators using differentiable local surrogate models. In fields such as physics and engineering, many processes are modeled with non-differentiable simulators with intractable likelihoods. Optimization of these forward models is particularly challenging, especially when the simulator is stochastic. To address such cases, we introduce the use ...
Added: October 31, 2019
Workshop on Compact Deep Neural Network Representation with Industrial Applications, Thirty-second Conference on Neural Information Processing Systems
Montréal : [s.n.], 2018
This workshop aims to bring together researchers, educators, and practitioners who are interested in techniques for, and applications of, compact and efficient neural network representations. One main theme of the workshop discussion is to build consensus in this rapidly developing field, and in particular, to establish a close connection between researchers in Machine Learning ...
Added: December 5, 2018
, , et al., Learning velocity model for complex media with deep convolutional neural networks / . 2021.
The paper considers the problem of velocity model acquisition for complex media based on boundary measurements. The acoustic model is used to describe the media. We used an open-source dataset of velocity distributions to compare the presented results directly with previous works. Forward modeling is performed using the grid-characteristic numerical method. The inverse ...
Added: May 24, 2022
, , et al., Informatsionno-upravliaiushchie sistemy [Information and Control Systems] 2020 No. 1 P. 2-14
Introduction: Sentiment analysis is a complex problem whose solution essentially depends on the context, field of study and amount of text data. Analysis of publications shows that the authors often do not use the full range of possible data transformations and their combinations. Only a part of the transformations is used, limiting the ways to ...
Added: February 20, 2020
, , Moscow : INFRA-M, 2020
This textbook covers selected methods and algorithms for data processing, and the sequence of steps for solving data processing and analysis problems in order to build a model of an object's behavior that takes into account all components of its mathematical model. It describes the kinds of technological methods for using software and hardware tools to solve problems in this area. It considers algorithms for distributions and time series regressions, and their transformation in order to obtain mathematical ...
Added: July 8, 2019
, , , Bayesian Group Sparsification of Long Short-Term Memory Networks / . 2018.
We propose a new Bayesian sparsification technique for gated recurrent architectures that accounts for their recurrent specifics and gated mechanism. Our method eliminates neurons from the model and makes gates constant, not only compressing the network, but also significantly accelerating a forward pass. On the discriminative tasks our method compresses LSTM extremely, so that only ...
Added: October 16, 2018
Deep convolutional neural networks capabilities for binary classification of polar mesocyclones in satellite mosaics
, , et al., Atmosphere 2018 Vol. 9 No. 426 P. 1-23
Polar mesocyclones (MCs) are small marine atmospheric vortices. The class of intense MCs, called polar lows, is accompanied by extremely strong surface winds and heat fluxes and thus largely influences deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data, while challenging, is a time-consuming task, when performed ...
Added: November 26, 2020
, , , Bulletin of the Polish Academy of Sciences: Technical Sciences 2018 Vol. 66 No. 6 P. 811-820
We present a probabilistic model with discrete latent variables that control the computation time in deep learning models such as ResNets and LSTMs. A prior on the latent variables expresses the preference for faster computation. The amount of computation for an input is determined via amortized maximum a posteriori (MAP) inference. MAP inference is performed ...
Added: February 27, 2019
, , et al., Journal of Machine Learning Research 2016 Vol. 51 P. 130-138
The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take word ambiguity into account and maintains only a single representation per word. Although a number of Skip-gram modifications were proposed ...
Added: October 1, 2016
, , et al., Information Processing and Management 2021 Vol. 58 No. 6 Article 102674
Ethnicity-targeted hate speech has been widely shown to influence on-the-ground inter-ethnic conflict and violence, especially in such multi-ethnic societies as Russia. Therefore, ethnicity-targeted hate speech detection in user texts is becoming an important task. However, it faces a number of unresolved problems: difficulties of reliable mark-up, informal and indirect ways of expressing negativity in user ...
Added: September 2, 2021
, , et al., Current Medicinal Chemistry 2021 Vol. 28 No. 32 P. 6619-6653
Purpose: Systems biology and network modeling represent, nowadays, the hallmark approaches for the development of predictive and targeted-treatment based precision medicine. The study of health and disease as properties of the human body system allows the understanding of the genotype-phenotype relationship through the definition of molecular interactions and dependencies. In this scenario, metabolism plays a ...
Added: August 25, 2021
, , et al., Spatially Adaptive Computation Time for Residual Networks / . 2016.
This paper proposes a deep learning architecture based on Residual Network that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and ...
Added: December 12, 2016
Intelligent Data Processing 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers
Switzerland : Springer, 2019
This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...
Added: February 8, 2020
, , et al., Frontiers in Pharmacology 2020 Vol. 11 P. 1-10
Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is ...
Added: April 21, 2021
, , , Journal of Physics: Conference Series 2021 Vol. 1740 Article 012031
Demographic and population structure inference is one of the most important problems in genomics. Population parameters such as effective population sizes, population split times and migration rates are of high interest both in themselves and for many applications, e.g. for genome-wide association studies. Hidden Markov Model (HMM) based methods, such as PSMC, MSMC, coalHMM etc., proved ...
Added: May 17, 2021
, Real-time Streaming Wave-U-Net with Temporal Convolutions for Multichannel Speech Enhancement / Cornell University. Series Computer Science "arxiv.org". 2021.
In this paper we describe our work for Task 1 of the ConferencingSpeech2021 challenge. The goal of this task was to develop a solution for real-time multi-channel speech enhancement. We propose a novel system for streaming speech enhancement. We employ the Wave-U-Net architecture with temporal convolutions in the encoder and decoder. ...
Added: September 26, 2021
Pronunciation training system based on convolutional neural networks and the information theory of speech perception
, Informatsionnye tekhnologii [Information Technologies] 2019 Vol. 25 No. 5 P. 313-318
We consider a problem of computer assisted language and pronunciation learning based on the deep learning methods and the information theory of speech perception. In order to improve the efficiency of testing of pronunciation quality, we propose to train a convolutional neural network using the best reference utterances from the user. The experimental results proved ...
Added: May 29, 2019
Cham : Springer, 2019
Intelligent Systems Conference (IntelliSys) 2018 is the fourth research conference in the series. This conference is a part of the SAI conferences, which have been held since 2013. The conference series has featured keynote talks, special sessions, poster presentations, tutorials, workshops, and contributed papers each year. The conference focuses on areas of intelligent systems and artificial intelligence (AI) and ...
Added: August 29, 2018
Deep neural networks and maximum likelihood search for approximate nearest neighbor in video-based image recognition
, Optical Memory and Neural Networks (Information Optics) 2017 Vol. 26 No. 2 P. 129-136
We analyzed the way to increase computational efficiency of video-based image recognition methods with matching of high dimensional feature vectors extracted by deep convolutional neural networks. We proposed an algorithm for approximate nearest neighbor search. At the first step, for a given video frame the algorithm verifies a reference image obtained when recognizing the previous ...
Added: June 30, 2017
, , , Scientific Reports 2020 Vol. 10 P. 19134
Computational methods to predict Z-DNA regions are in high demand to understand the functional role of Z-DNA. The previous state-of-the-art method Z-Hunt is based on statistical mechanical and energy considerations about the B- to Z-DNA transition using sequence information. Z-DNA ChIP-seq experiment results showed little overlap with Z-Hunt predictions, implying that sequence information only is not ...
Added: December 11, 2020
, , , Optical Memory and Neural Networks (Information Optics) 2018 Vol. 27 No. 1 P. 23-31
We discuss the video classification problem with the matching of feature vectors extracted using deep convolutional neural networks from each frame. We propose a novel recognition method based on the representation of each frame as a sequence of fuzzy sets of reference classes whose degrees of membership are defined based on the asymptotic distribution of the Kullback–Leibler ...
Added: February 9, 2018
Foundations of Intelligent Systems. 25th International Symposium on Methodologies for Intelligent Systems: ISMIS 2020
This book constitutes the proceedings of the 25th International Symposium on Foundations of Intelligent Systems, ISMIS 2020, held in Graz, Austria, in October 2020. The conference was held virtually due to the COVID-19 pandemic. The 35 full and 8 short papers presented in this volume were carefully reviewed and selected from 79 submissions. Included is also ...
Added: October 4, 2020