Deep learning based methods for estimating distribution of coalescence rates from genome-wide data

E. Khomutov; K. Arzymatov; V. Shchur

doi:10.1088/1742-6596/1740/1/012031

Journal of Physics: Conference Series. 2021. Vol. 1740. Article 012031.

Demographic and population structure inference is one of the most important problems in genomics. Population parameters such as effective population sizes, population split times, and migration rates are of great interest both in themselves and for many applications, e.g. genome-wide association studies. Hidden Markov model (HMM) based methods, such as PSMC, MSMC, and coalHMM, have proved powerful and useful for estimating these parameters in many population genetics studies. At the same time, machine learning and deep learning have begun to be widely used in the natural sciences. In particular, deep learning based approaches have already replaced hidden Markov models in many areas, such as speech recognition and user input prediction. We develop a deep learning (DL) approach for local coalescent time estimation from a single whole diploid genome. Our DL models are trained on simulated datasets. Importantly, demographic and population parameters can be inferred from the distribution of coalescent times. We expect that our approach will be useful under complex population scenarios that cannot be studied with existing HMM based methods. Our work is also a crucial step toward a deep learning framework that would allow population genomics methods to be created for different genomic data representations.
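
As a rough illustration of why the distribution of coalescent times carries demographic information, the sketch below uses the simplest possible setting, which is an assumption made here and not the model from the paper: a diploid population of constant effective size N_e, in which the coalescence time of two lineages is exponentially distributed with mean 2·N_e generations. The function names are hypothetical, and the simulated times stand in for the per-locus estimates that a trained model would produce.

```python
import random


def simulate_pairwise_coalescence_times(n_e, n_loci, seed=0):
    """Draw pairwise coalescence times (in generations) at independent loci.

    Assumption: constant effective size n_e in a diploid population, so two
    lineages coalesce at rate 1 / (2 * n_e) per generation and the waiting
    time is exponential with mean 2 * n_e.
    """
    rng = random.Random(seed)
    rate = 1.0 / (2.0 * n_e)
    return [rng.expovariate(rate) for _ in range(n_loci)]


def estimate_effective_size(times):
    """Method-of-moments estimate of N_e: mean coalescence time divided by 2."""
    return sum(times) / len(times) / 2.0


# Simulate times under a known N_e, then recover it from their distribution.
times = simulate_pairwise_coalescence_times(n_e=10_000, n_loci=50_000)
n_e_hat = estimate_effective_size(times)
print(f"estimated N_e: {n_e_hat:.0f}")
```

With a time-varying population size the distribution is no longer a single exponential, which is why methods of this kind work with the full distribution of coalescent times and not only its mean.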

Research target: Computer Science

Priority areas: IT and mathematics

Keywords: deep learning; population genomics; computational genomics

Publication based on the results of:

Genomic population structure analysis and applications (2021)

Fuzzy Analysis and Deep Convolution Neural Networks in Still-to-video Recognition

Savchenko A., Belova N. S., Savchenko Lyudmila V., Optical Memory and Neural Networks (Information Optics) 2018 Vol. 27 No. 1 P. 23–31

We discuss the video classification problem with the matching of feature vectors extracted using deep convolutional neural networks from each frame. We propose a novel recognition method based on the representation of each frame as a sequence of fuzzy sets of reference classes whose degrees of membership are defined based on the asymptotic distribution of the Kullback–Leibler ...

Added: February 9, 2018

Simulating the time projection chamber responses at the MPD detector using generative adversarial networks

A. Maevskiy, F. Ratnikov, Zinchenko A. et al., The European Physical Journal C - Particles and Fields 2021 Vol. 81 Article 599

High energy physics experiments rely heavily on the detailed detector simulation models in many tasks. Running these detailed models typically requires a notable amount of the computing time available to the experiments. In this work, we demonstrate a new approach to speed up the simulation of the Time Projection Chamber tracker of the MPD experiment at ...

Added: July 12, 2021

Deep convolutional neural networks capabilities for binary classification of polar mesocyclones in satellite mosaics

Криницкий М. А., Verezemskaya P., Гращенков К. В. et al., Atmosphere 2018 Vol. 9 No. 426 P. 1–23

Polar mesocyclones (MCs) are small marine atmospheric vortices. The class of intense MCs, called polar lows, is accompanied by extremely strong surface winds and heat fluxes and thus strongly influences deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data, while challenging, is a time-consuming task when performed ...

Added: November 26, 2020

Learning velocity model for complex media with deep convolutional neural networks

Gremyachikh L., Ustyuzhanin A., Станкевич А. et al., / Series 2110.08626 "Machine Learning". 2021.

The paper considers the problem of velocity model acquisition for complex media based on boundary measurements. The acoustic model is used to describe the media. We used an open-source dataset of velocity distributions to compare the presented results directly with previous work. Forward modeling is performed using the grid-characteristic numerical method. The inverse ...

Added: May 24, 2022

Pronunciation training system based on convolutional neural networks and the information theory of speech perception

Savchenko L., Информационные технологии 2019 Vol. 25 No. 5 P. 313–318

We consider a problem of computer assisted language and pronunciation learning based on the deep learning methods and the information theory of speech perception. In order to improve the efficiency of testing of pronunciation quality, we propose to train a convolutional neural network using the best reference utterances from the user. The experimental results proved ...

Added: May 29, 2019

Approach to Designing CV Systems for Medical Applications: Data, Architecture and AI

Ryabtsev D., Vasilyev Boris, Shershakov S., / Series 2501.14689 "Computer Science ::Computer Vision and Pattern Recognition". 2025.

This paper introduces an innovative software system for fundus image analysis that deliberately diverges from the conventional screening approach, opting not to predict specific diagnoses. Instead, our method mimics the diagnostic process by thoroughly analyzing both normal and pathological features of fundus structures, leaving the ultimate decision-making authority in the hands of healthcare professionals. Our initiative ...

Added: February 14, 2025

Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Association for Computational Linguistics, 2019.

The 4th Workshop on Representation Learning for NLP (RepL4NLP) will be hosted by ACL 2019 and held on 2 August 2019. The workshop is being organised by Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Alexis Conneau, Johannes Welbl, Xian Ren and Marek Rei; and advised by Kyunghyun Cho, Edward Grefenstette, Karl Moritz ...

Added: November 1, 2019

Bioinformatics Institute. Book of abstracts 2020/21

St. Petersburg: Peter the Great St. Petersburg Polytechnic University, 2021.

Bioinformatics Institute 2020/21. Project abstracts. Bioinformatics Summer School 2021. Abstracts. ...

Added: August 9, 2021

Domain adaptation with gradient reversal for MC/real data calibration

Ryzhikov A., Ustyuzhanin A., Journal of Physics: Conference Series 2018 Vol. 1085 P. 1–6

In this research, a new approach for finding rare events in high-energy physics was tested. As an example physics channel, we take the decay τ → 3μ, which was published on Kaggle as part of an LHCb-supported challenge. The training sample consists of simulated signal and real background, so the challenge is to train ...

Added: December 11, 2017

Fast and scalable genome-wide inference of local tree topologies from large number of haplotypes based on tree consistent PBWT data structure

Shchur V., Ziganurova L., Durbin R., / Series New Results "BioRxiv". 2018.

Estimation of the relationship between DNA sequences is one of the most important problems in genomics. Understanding these relationships is central to demographic inference, correction of population structure in GWAS, identifying signals of selection etc. The data structure containing the full information about sample genealogy is called the ancestral recombination graph (ARG). However, ARG ...

Added: February 7, 2019

Classification of foliar diseases of apple crops using computer vision methods

Tereshchenko S., Perov Artem A., Osipov A., Siberian Journal of Life Sciences and Agriculture 2021 Vol. 13 No. 3 P. 103–118

Purpose. To develop a convolutional neural network model for identifying foliar diseases of apple trees from photographs of leaves taken with a mobile phone. Materials and methods. The study used labeled images of various foliar diseases of apple trees, publicly available on the Kaggle platform. Research methods: the theory of information systems design and development, programming, augmentation and dataset expansion methods for computer vision tasks, hyperparameter tuning methods ...

Added: November 17, 2021

The NeurIPS '18 Competition: From Machine Learning to Intelligent Conversations

Springer, 2020.

This volume presents the results of the Neural Information Processing Systems Competition track at the 2018 NeurIPS conference. The competition follows the same format as the 2017 competition track for NIPS. Out of 21 submitted proposals, eight competition proposals were selected, spanning the area of Robotics, Health, Computer Vision, Natural Language Processing, Systems and Physics. Competitions have become ...

Added: December 2, 2019

Black-Box Optimization with Local Generative Surrogates

Belavin V., Ustyuzhanin A., Широбоков С. К. et al., Proceedings of Machine Learning Research 2020 P. 1–9

We propose a novel method for gradient-based optimization of black-box simulators using differentiable local surrogate models. In fields such as physics and engineering, many processes are modeled with non-differentiable simulators with intractable likelihoods. Optimization of these forward models is particularly challenging, especially when the simulator is stochastic. To address such cases, we introduce the use ...

Added: October 31, 2019

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Polykovskiy D., Zhebrak A., Sanchez-Lengeling B. et al., Frontiers in Pharmacology 2020 Vol. 11 P. 1–10

Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is ...

Added: April 21, 2021

Data-Driven Short-Term Daily Operational Sea Ice Regional Forecasting

Grigoryev T., Verezemskaya P., Krinitskiy M. et al., Remote Sensing 2022 Vol. 14 No. 22 Article 5837

Global warming has made the Arctic increasingly available for marine operations and created a demand for reliable operational sea ice forecasts to increase safety. Because ocean-ice numerical models are highly computationally intensive, relatively lightweight ML-based methods may be more efficient for sea ice forecasting. Many studies have exploited different deep learning models alongside classical approaches ...

Added: June 19, 2023

Deep neural networks and maximum likelihood search for approximate nearest neighbor in video-based image recognition

Savchenko A., Optical Memory and Neural Networks (Information Optics) 2017 Vol. 26 No. 2 P. 129–136

We analyzed the way to increase computational efficiency of video-based image recognition methods with matching of high dimensional feature vectors extracted by deep convolutional neural networks. We proposed an algorithm for approximate nearest neighbor search. At the first step, for a given video frame the algorithm verifies a reference image obtained when recognizing the previous ...

Added: June 30, 2017

Breaking Sticks and Ambiguities with Adaptive Skip-gram

Bartunov S., Vetrov D., Kondrashkin D. et al., Journal of Machine Learning Research 2016 Vol. 51 P. 130–138

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take word ambiguity into account and maintains only a single representation per word. Although a number of Skip-gram modifications were proposed ...

Added: October 1, 2016

Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020)

Piscataway: IEEE, 2020.

The 2020 International Joint Conference on Neural Networks (IJCNN) was held virtually, as part of the IEEE World Congress on Computational Intelligence (IEEE WCCI) 2020. IJCNN 2020 is jointly organized by the IEEE Computational Intelligence Society (CIS) and the International Neural Network Society (INNS). For IJCNN 2020 (and when WCCI is organized in even-numbered years) IEEE CIS ...

Added: October 15, 2020

Bayesian Group Sparsification of Long Short-Term Memory Networks

Lobacheva E., Chirkova N., Vetrov D., /. 2018.

We propose a new Bayesian sparsification technique for gated recurrent architectures that accounts for their recurrent specifics and gated mechanism. Our method eliminates neurons from the model and makes gates constant, not only compressing the network, but also significantly accelerating a forward pass. On the discriminative tasks our method compresses LSTM extremely, so that only ...

Added: October 16, 2018

Refining the ONCE Benchmark With Hyperparameter Tuning

Maksim Golyadkin, Alexander Gambashidze, Nurgaliev I. et al., IEEE Access 2024 Vol. 12 P. 3805–3814

In response to the growing demand for 3D object detection in applications such as autonomous driving, robotics, and augmented reality, this work focuses on the evaluation of semi-supervised learning approaches for point cloud data. The point cloud representation provides reliable and consistent observations regardless of lighting conditions, thanks to advances in LiDAR sensors. Data annotation ...

Added: March 13, 2024

Unet-boosted classifier: a multitask architecture for small samples, illustrated by the classification of brain MRI scans

Sobyanin K., Kulikova S., Информатика и автоматизация (Труды СПИИРАН) 2024 Vol. 23 No. 4 P. 1022–1046

The problem of training deep neural networks on small samples is especially relevant for medical problems. The paper examines the impact of pixel-wise marking of significant objects in the image, over the true class label, on the quality of the classification. To achieve better classification results on small samples, we propose a multitasking architecture -- ...

Added: June 29, 2024

Using convolutional neural networks for person re-identification in urban environments

Сучков Е. П., Алексеенко Г. О., Налчаджи К. В., Интеллектуальные системы. Теория и приложения 2022 Vol. 26 No. 1 P. 250–254

Currently, video surveillance systems are becoming more widespread. One of the main goals of such systems is to control and track a person’s movement. The solution of this problem allows us to solve such applied problems as tracking the occupancy of various premises (whether shopping facilities or educational and cultural institutions), creating a motion heatmap or organizing control of access to ...

Added: January 31, 2023

Real-time Streaming Wave-U-Net with Temporal Convolutions for Multichannel Speech Enhancement

Sokolov A., / Series Computer Science "arxiv.org". 2021.

In this paper we describe our work for Task 1 of the ConferencingSpeech2021 challenge. The goal of this task was to develop a solution for real-time multi-channel speech enhancement. We propose a novel system for streaming speech enhancement. We employ the Wave-U-Net architecture with temporal convolutions in the encoder and decoder. ...

Added: September 26, 2021

Intelligent Systems and Applications

Cham: Springer, 2019.

Intelligent Systems Conference (IntelliSys) 2018 is the fourth research conference in the series. This conference is a part of the SAI conferences, which have been held since 2013. The conference series has featured keynote talks, special sessions, poster presentations, tutorials, workshops, and contributed papers each year. The conference focuses on areas of intelligent systems and artificial intelligence (AI) and ...

Added: August 29, 2018