Deep learning based methods for estimating the distribution of coalescence rates from genome-wide data
Demographic and population structure inference is one of the most important problems in genomics. Population parameters such as effective population sizes, population split times and migration rates are of high interest both in themselves and for many applications, e.g. genome-wide association studies. Hidden Markov model (HMM) based methods, such as PSMC, MSMC and coalHMM, have proved powerful and useful for estimating these parameters in many population genetics studies. At the same time, machine and deep learning have begun to be used widely in the natural sciences. In particular, deep learning based approaches have already replaced hidden Markov models in many areas, such as speech recognition and user input prediction. We develop a deep learning (DL) approach for local coalescent time estimation from a single whole diploid genome. Our DL models are trained on simulated datasets. Importantly, demographic and population parameters can be inferred from the distribution of coalescent times. We expect our approach to be useful under complex population scenarios that cannot be studied with existing HMM based methods. Our work is also a crucial step towards a deep learning framework for building population genomics methods for different genomic data representations.
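The link between coalescent times and population parameters can be illustrated with a minimal sketch, not taken from the paper: under a constant-size panmictic model, the pairwise coalescent time is exponentially distributed with mean 2N_e generations, so simulated times (the kind of labels a DL model would be trained on) determine the effective size. All function names below are illustrative.

```python
# Minimal sketch (illustrative, not the authors' method): under a
# constant-size panmictic model, the pairwise coalescent time T is
# exponential with mean 2*N_e generations. Draws of T are the kind of
# labels a DL model could be trained on; inverting E[T] = 2*N_e
# recovers the effective population size from estimated times.
import random

def simulate_pairwise_coalescent_times(n_e, n_samples, seed=0):
    """Draw pairwise coalescent times under a constant effective size n_e."""
    rng = random.Random(seed)
    # Exponential with rate 1/(2*n_e)  =>  mean 2*n_e generations.
    return [rng.expovariate(1.0 / (2.0 * n_e)) for _ in range(n_samples)]

def estimate_ne_from_times(times):
    """Invert E[T] = 2*N_e to estimate the effective population size."""
    return sum(times) / len(times) / 2.0

times = simulate_pairwise_coalescent_times(n_e=10_000, n_samples=50_000)
print(round(estimate_ne_from_times(times)))
```

With 50,000 draws the estimate lands within a few hundred of the true N_e = 10,000; realistic pipelines would of course simulate full genealogies under the demographic scenarios of interest rather than i.i.d. pairwise times.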
This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016.
The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life and social sciences; morphological and technological approaches to image analysis.
Objective: Brain-computer interfaces (BCIs) decode information from neural activity and send it to external devices. The use of deep learning approaches for decoding allows for automatic feature engineering within the specific decoding task. A physiologically plausible interpretation of the network parameters ensures the robustness of the learned decision rules and opens the exciting opportunity for automatic knowledge discovery. Approach: We describe a compact convolutional network-based architecture for adaptive decoding of electrocorticographic (ECoG) data into finger kinematics. We also propose a novel, theoretically justified approach to interpreting the spatial and temporal weights in architectures that combine adaptation in both space and time. The obtained spatial and frequency patterns, characterizing the neuronal populations pivotal to the specific decoding task, can then be interpreted by fitting appropriate spatial and dynamical models. Main results: We first tested our solution using realistic Monte Carlo simulations. Then, when applied to the ECoG data from the Berlin BCI Competition IV dataset, our architecture performed comparably to the competition winners without requiring explicit feature engineering. Using the proposed approach to network weight interpretation, we could unravel the spatial and spectral patterns of the neuronal processes underlying the successful decoding of finger kinematics from an ECoG dataset. Finally, we applied the entire pipeline to the analysis of a 32-channel EEG motor-imagery dataset and observed physiologically plausible patterns specific to the task. Significance: We described a compact and interpretable CNN architecture derived from first principles and encompassing knowledge from the field of neural electrophysiology. For the first time in the context of such multibranch architectures with factorized spatial and temporal processing, we presented theoretically justified weight-interpretation rules.
We verified our recipes using simulations and real data and demonstrated that the proposed solution offers both a good decoder and a tool for investigating the neural mechanisms of motor control.
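The two key ingredients named above can be sketched in a few lines of numpy. This is a toy illustration, not the authors' architecture: one factorized branch applies a spatial filter across channels followed by a temporal (1-D) filter, and the learned spatial weights are mapped to an interpretable spatial pattern via the channel covariance (the classic weights-to-patterns transformation); all variable names are assumptions.

```python
# Toy sketch (not the authors' code): one branch of a factorized
# decoder = spatial filtering across channels, then temporal filtering
# along time. The learned spatial weights w are converted into an
# interpretable spatial *pattern* a ~ Cov(x) @ w.
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_times = 8, 1000
x = rng.standard_normal((n_channels, n_times))   # toy multichannel signal

w_spatial = rng.standard_normal(n_channels)      # stand-in for learned spatial weights
w_temporal = np.ones(15) / 15.0                  # stand-in temporal filter (moving average)

# Factorized processing: space first, then time.
s = w_spatial @ x                                # (n_times,) virtual channel
y = np.convolve(s, w_temporal, mode="same")      # temporally filtered feature

# Interpretation step: spatial pattern from weights and channel covariance.
cov = np.cov(x)                                  # (n_channels, n_channels)
a_pattern = cov @ w_spatial                      # (n_channels,) spatial pattern
print(y.shape, a_pattern.shape)
```

The pattern `a_pattern`, unlike the raw weights, can be overlaid on the sensor layout and compared with physiological source models.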
Determining the tonality of a text is a difficult task whose solution depends essentially on the context, the domain of study and the amount of text data. Our analysis shows that existing works do not jointly use the full range of possible transformations of the data and their combinations. The article explores a generalized approach that consists in sequentially passing through the stages of exploratory analysis, obtaining a baseline solution, vectorization, preprocessing, hyperparameter tuning and modeling. Experiments carried out by iterative application of these stages yield a consistent quality improvement for classical machine learning algorithms and a significant improvement for deep learning.
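The staged approach described above can be sketched as a chain of small, replaceable steps. The sketch below is purely illustrative (tiny toy corpus, nearest-centroid baseline instead of the article's models); each stage (preprocessing, vectorization, modeling) is a function that can be swapped or re-tuned iteratively.

```python
# Illustrative sketch of a staged tonality pipeline: preprocessing ->
# vectorization -> baseline model. The corpus and model are toys.
from collections import Counter

TRAIN = [
    ("great excellent wonderful film", "pos"),
    ("awful terrible boring film", "neg"),
    ("excellent wonderful acting", "pos"),
    ("terrible awful plot", "neg"),
]

def preprocess(text):
    return text.lower().split()

def build_vocab(texts):
    return sorted({tok for t in texts for tok in preprocess(t)})

def vectorize(text, vocab):
    counts = Counter(preprocess(text))
    return [counts.get(w, 0) for w in vocab]

def train(pairs, vocab):
    """Baseline model: per-class mean bag-of-words vector (centroid)."""
    centroids = {}
    for label in {y for _, y in pairs}:
        vecs = [vectorize(t, vocab) for t, y in pairs if y == label]
        centroids[label] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return centroids

def predict(text, centroids, vocab):
    v = vectorize(text, vocab)
    dot = lambda a, b: sum(p * q for p, q in zip(a, b))
    return max(centroids, key=lambda lbl: dot(v, centroids[lbl]))

vocab = build_vocab([t for t, _ in TRAIN])
model = train(TRAIN, vocab)
print(predict("a wonderful excellent movie", model, vocab))  # "pos"
```

In the article's setting, each stage would be iterated on (different vectorizers, preprocessing combinations, tuned hyperparameters) while the baseline above serves as the reference point.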
Brain-computer interfaces are a growing research field producing many implementations that find various uses in research, medical practice and everyday life. Despite the popularity of implementations using non-invasive neuroimaging methods, a radical improvement in channel bandwidth and, thus, decoding accuracy is only possible with invasive techniques. Electrocorticography (ECoG) is a minimally invasive neuroimaging modality that provides highly informative brain activity signals and calls for machine learning methods to efficiently decipher the complex spatio-temporal cortical representation of motor and cognitive function. Deep learning is the family of machine learning methods that learn representations of data with multiple levels of abstraction. We hypothesized that deep learning would make it possible to reach higher accuracy in decoding the movement time course than traditional signal processing approaches allow.
The article deals with the problem of isolated word recognition based on deep convolutional neural networks. The use of existing recognition systems in practice is limited by their insufficient reliability under intense acoustic noise, such as street noise, sounds from passing vehicles, etc. Nowadays, the most accurate recognition methods build acoustic models with deep learning technologies and, in particular, convolutional neural networks. For image processing problems, the possibility of adapting such networks to a new domain by additional fine-tuning on rather small training samples is well studied. In this paper we propose additional training of networks to adapt acoustic models to a speaker's voice using a small number of utterances. To reduce the error rate, we consider an ensemble of several different speaker-dependent neural network architectures trained in this way. The final decision is made by a weighted voting rule, in which the weight of each acoustic model is proportional to its accuracy estimated on the training set. Experimental results for the recognition of English commands show that such an ensemble of pre-trained acoustic models can significantly improve accuracy compared to traditional pre-trained models, especially when white Gaussian noise is added to the input signal.
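The weighted voting rule described above is straightforward to state in code. The sketch below is a minimal illustration with made-up model names and accuracies: each model votes for a word label, and the vote weight is proportional to that model's training-set accuracy.

```python
# Minimal sketch of accuracy-weighted voting (model names and accuracy
# values below are illustrative, not experimental results).
def weighted_vote(predictions, train_accuracies):
    """predictions: {model: label}; train_accuracies: {model: float}."""
    total = sum(train_accuracies.values())
    scores = {}
    for model, label in predictions.items():
        # Each vote counts in proportion to the model's training accuracy.
        scores[label] = scores.get(label, 0.0) + train_accuracies[model] / total
    return max(scores, key=scores.get)

preds = {"cnn_a": "stop", "cnn_b": "go", "cnn_c": "stop"}
accs = {"cnn_a": 0.91, "cnn_b": 0.91, "cnn_c": 0.85}
print(weighted_vote(preds, accs))  # "stop": weight 0.91 + 0.85 beats 0.91
```

With equal weights this would be plain majority voting; weighting lets a strong model outvote two weaker ones when their combined weight is smaller.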
Recently, deep learning methods have been increasingly applied to spoken language technologies, including signal processing, language understanding and generation, dialogue management, as well as joint optimisation of these components (end-to-end learning). However, such methods still have limitations, and it is not yet clear that deep learning and joint optimisation are the key to the future.
Encompassing the current deep learning trends and traditional knowledge-based methods, SLT 2018’s main theme will be “Spoken Language Technology in the Era of Deep Learning: Challenges and Opportunities”.
The book presents a remarkable collection of chapters covering a wide range of topics in the areas of intelligent systems and artificial intelligence, and their real-world applications. It gathers the proceedings of the Intelligent Systems Conference 2019, which attracted a total of 546 submissions from pioneering researchers, scientists, industrial engineers, and students from all around the world. These submissions underwent a double-blind peer-review process, after which 190 were selected for inclusion in these proceedings.
As intelligent systems continue to replace and sometimes outperform human intelligence in decision-making processes, they have made it possible to tackle a host of problems more effectively. This branching out of computational intelligence in several directions and use of intelligent systems in everyday applications have created the need for an international conference as a venue for reporting on the latest innovations and trends.
This book collects both theory- and application-based chapters on virtually all aspects of artificial intelligence. Presenting state-of-the-art intelligent methods and techniques for solving real-world problems, along with a vision for future research, it represents a unique and valuable asset.
Polar mesocyclones (MCs) are small marine atmospheric vortices. The intense class of MCs, called polar lows, are accompanied by extremely strong surface winds and heat fluxes and thus largely influence deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data is challenging and, when performed manually, time-consuming. Existing algorithms for the automatic detection of polar mesocyclones are based on conventional analysis of cloudiness patterns and involve different empirically defined thresholds of geophysical variables. As a result, different detection methods typically yield very different results when applied to the same dataset. We develop a conceptually novel approach for the detection of MCs based on deep convolutional neural networks (DCNNs). As a first step, we demonstrate that a DCNN model is capable of binary classification of 500 × 500 km patches of satellite images with respect to the presence of MC patterns. The training dataset is based on a reference database of MCs manually tracked in the Southern Hemisphere from satellite mosaics. We use a subset of this database with MC diameters in the range of 200–400 km. This dataset is further used for testing several different DCNN setups: a DCNN built from scratch, a DCNN based on pre-trained VGG16 weights using transfer learning, and a DCNN based on VGG16 with fine-tuning. Each of these networks is applied to both infrared (IR) imagery and a combination of infrared and water vapor (IR + WV) imagery. The best skill (97% binary classification accuracy) is achieved by a model that averages the estimates of an ensemble of different DCNNs.
The algorithm can be further extended to an automatic identification and tracking scheme and applied to other atmospheric phenomena that have a distinct signature in satellite imagery.
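The ensemble step that achieves the best skill can be illustrated in a few lines. This is a hedged sketch with made-up probabilities, not the paper's pipeline: each DCNN outputs a probability that a patch contains an MC signature, the probabilities are averaged, and the mean is thresholded.

```python
# Illustrative sketch of the ensemble-averaging step: average the
# per-model probabilities that a patch contains an MC signature, then
# threshold the mean. The probability values below are made up.
def ensemble_classify(probabilities, threshold=0.5):
    mean_p = sum(probabilities) / len(probabilities)
    return mean_p, mean_p >= threshold

# e.g. outputs of a scratch DCNN, VGG16 transfer learning, VGG16 fine-tuned
p, is_mc = ensemble_classify([0.81, 0.62, 0.49])
print(round(p, 2), is_mc)
```

Averaging probabilities (soft voting) lets a confident model compensate for an uncertain one, which is typically why the ensemble outperforms each individual DCNN setup.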
A model is considered for organizing cargo transportation between two node stations connected by a railway line that contains a certain number of intermediate stations. The movement of cargo is in one direction. Such a situation may occur, for example, if one of the node stations is located in a region that produces raw materials for a manufacturing industry located in the region of the other node station. The organization of freight traffic is performed by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, and the rule for distributing cargo to the final node stations. The process of cargo transportation is governed by a given control rule. For such a model, one must determine the possible modes of cargo transportation and describe their properties. The model is described by a finite-dimensional system of differential equations with nonlocal linear restrictions. The class of solutions satisfying the nonlocal linear restrictions is extremely narrow, which results in the need for a “correct” extension of solutions of the system of differential equations to a class of quasi-solutions whose distinctive feature is gaps at a countable number of points. Using the fourth-order Runge–Kutta method, we were able to construct these quasi-solutions numerically and determine their rate of growth. Note that the main technical difficulty consisted in obtaining quasi-solutions satisfying the nonlocal linear restrictions. Furthermore, we investigated the dependence of the quasi-solutions and, in particular, of the sizes of the gaps (jumps) of the solutions on a number of model parameters characterizing the control rule, the cargo transportation technologies and the intensity of cargo supply at a node station.
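The integrator named above is the classical fourth-order Runge–Kutta scheme, sketched below on a toy right-hand side. The paper's actual system with nonlocal linear restrictions is not reproduced here; `f` is a stand-in.

```python
# Classical fourth-order Runge-Kutta step. The right-hand side f used
# in the example (y' = y) is a toy stand-in, not the paper's cargo
# transportation system.
def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Toy check: integrate y' = y, y(0) = 1 to t = 1; exact answer is e.
y, t, h = 1.0, 0.0, 0.01
while t < 1.0 - 1e-12:
    y = rk4_step(lambda t, y: y, t, y, h)
    t += h
print(round(y, 6))  # ~2.718282
```

For the quasi-solutions described in the abstract, such steps would be interleaved with enforcing the nonlocal linear restrictions, which introduces the jumps at a countable set of points.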
Event logs collected by modern information and technical systems usually contain enough data for automated discovery of process models. A variety of algorithms has been developed for process model discovery, conformance checking, log-to-model alignment, comparison of process models, etc.; nevertheless, quick analysis of ad-hoc selected parts of a log still lacks a full-fledged implementation. This paper describes an ROLAP-based method of multidimensional event log storage for process mining. The result of the log analysis is visualized as a directed graph representing the union of all possible event sequences, ranked by their occurrence probability. Our implementation allows the analyst to discover process models for sublogs defined by an ad-hoc selection of criteria and a threshold on occurrence probability.
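The graph described above can be illustrated with a minimal directly-follows construction. The sketch below is not the paper's ROLAP implementation: it builds edges from consecutive events in each trace, computes each edge's occurrence probability given its source activity, and keeps only edges above a cutoff; the toy log is made up.

```python
# Illustrative sketch (not the paper's implementation): build a
# directed graph of directly-follows relations from an event log and
# keep only edges whose occurrence probability exceeds a cutoff.
from collections import Counter

log = [  # toy event log: one list of activities per trace
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
]

edges = Counter()
for trace in log:
    for a, b in zip(trace, trace[1:]):
        edges[(a, b)] += 1

# Total outgoing edge count per source activity.
total_per_source = Counter()
for (a, _), n in edges.items():
    total_per_source[a] += n

# Probability of each directly-follows edge, given its source activity.
probs = {edge: n / total_per_source[edge[0]] for edge, n in edges.items()}

cutoff = 0.5
kept = sorted(edge for edge, p in probs.items() if p >= cutoff)
print(kept)  # [('check', 'approve'), ('register', 'check')]
```

In the paper's setting, the sublog fed into such a construction would itself be selected ad hoc via ROLAP slicing of the multidimensional log storage.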
Existing approaches suggest that IT strategy should be a reflection of business strategy. In practice, however, organisations often do not follow business strategy even when it is formally declared. Under these conditions, IT strategy can be viewed not as a plan but as an organisational shared view on the role of information systems. This approach reflects only a top-down perspective of IT strategy, so it can be supplemented by a strategic behaviour pattern (i.e., a more or less standard response to changes, formed as a result of previous experience) to implement a bottom-up approach. Two components that can help establish an effective reaction to new IT initiatives are proposed here: a model of IT-related decision making, and an efficiency measurement metric to estimate the maturity of business processes and the corresponding IT. The usage of the proposed tools is demonstrated in practical cases.