Breaking Sticks and Ambiguities with Adaptive Skip-gram

Journal of Machine Learning Research. 2016. Vol. 51. P. 130–138.

Bartunov S., Vetrov D., Kondrashkin D., Osokin A.

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take word ambiguity into account and maintains only a single representation per word. Although a number of Skip-gram modifications have been proposed to overcome this limitation and learn multi-prototype word representations, they either require a known number of word meanings or learn them using greedy heuristic approaches. In this paper we propose the Adaptive Skip-gram model, a nonparametric Bayesian extension of Skip-gram capable of automatically learning the required number of representations for all words at a desired semantic resolution. We derive an efficient online variational learning algorithm for the model and empirically demonstrate its efficiency on the word-sense induction task.
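
The "breaking sticks" of the title refers to the stick-breaking construction commonly used for Dirichlet process priors, which is how a nonparametric Bayesian model can leave the number of senses per word open. The following is a minimal sketch of that prior only, not of the authors' model or learning algorithm. The truncation level `T`, the concentration parameter `alpha`, and the helper `effective_senses` with its threshold are illustrative assumptions that do not come from the abstract. Under these assumptions, a larger `alpha` tends to spread prior mass over more senses, which is one way to read "semantic resolution".

```python
import random

def stick_breaking(alpha, T, rng):
    """Draw T truncated stick-breaking weights using Beta(1, alpha) breaks."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any sense
    for _ in range(T - 1):
        beta = rng.betavariate(1.0, alpha)  # fraction broken off the remainder
        weights.append(remaining * beta)
        remaining *= 1.0 - beta
    weights.append(remaining)  # truncation: the last sense takes what is left
    return weights

def effective_senses(weights, threshold=0.05):
    """Count senses whose prior probability exceeds a hypothetical threshold."""
    return sum(w > threshold for w in weights)

rng = random.Random(0)
for alpha in (0.1, 1.0, 5.0):  # illustrative values only
    w = stick_breaking(alpha, T=10, rng=rng)
    print(alpha, [round(x, 3) for x in w], effective_senses(w))
```

Each draw is a probability vector over at most `T` senses; in a full model these prior weights would be combined with the context likelihood to decide how many senses a word actually uses.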

Research target: Computer Science

Priority areas: IT and mathematics

Language: English

Deep learning based methods for estimating distribution of coalescence rates from genome-wide data

Khomutov E., Arzymatov K., Shchur V., Journal of Physics: Conference Series 2021 Vol. 1740 Article 012031

Demographic and population structure inference is one of the most important problems in genomics. Population parameters such as effective population sizes, population split times and migration rates are of high interest both in themselves and for many applications, e.g. for genome-wide association studies. Hidden Markov Model (HMM) based methods, such as PSMC, MSMC, coalHMM etc., proved ...

Added: May 17, 2021

Detecting ethnicity-targeted hate speech in Russian social media texts

Pronoza E., Panicheva P., Koltsova O. et al., Information Processing and Management 2021 Vol. 58 No. 6 Article 102674

Ethnicity-targeted hate speech has been widely shown to influence on-the-ground inter-ethnic conflict and violence, especially in such multi-ethnic societies as Russia. Therefore, ethnicity-targeted hate speech detection in user texts is becoming an important task. However, it faces a number of unresolved problems: difficulties of reliable mark-up, informal and indirect ways of expressing negativity in user ...

Added: September 2, 2021

Approach to Designing CV Systems for Medical Applications: Data, Architecture and AI

Ryabtsev D., Vasilyev Boris, Shershakov S., / Series 2501.14689 "Computer Science ::Computer Vision and Pattern Recognition". 2025.

This paper introduces an innovative software system for fundus image analysis that deliberately diverges from the conventional screening approach, opting not to predict specific diagnoses. Instead, our method mimics the diagnostic process by thoroughly analyzing both normal and pathological features of fundus structures, leaving the ultimate decision-making authority in the hands of healthcare professionals. Our initiative ...

Added: February 14, 2025

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Polykovskiy D., Zhebrak A., Sanchez-Lengeling B. et al., Frontiers in Pharmacology 2020 Vol. 11 P. 1–10

Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is ...

Added: April 21, 2021

Workshop on Compact Deep Neural Network Representation with Industrial Applications, Thirty-second Conference on Neural Information Processing Systems

Montréal: [s.n.], 2018.

This workshop aims to bring together researchers, educators, and practitioners who are interested in techniques for, as well as applications of, compact and efficient neural network representations. One main theme of the workshop discussion is to build up consensus in this rapidly developing field, and in particular, to establish close connection between researchers in Machine Learning ...

Added: December 5, 2018

mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing

Soshnikov D. V., Valieva Y., / Series Computer Science "arxiv.org". 2021.

In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using functional approach. This library defines operations on lazy data streams of named dictionaries represented as generators (so-called multi-field datastreams), and allows enriching those data streams with more 'fields' in the process of data preparation ...

Added: October 7, 2021

A pronunciation training system based on convolutional neural networks and the information theory of speech perception

Savchenko L., Информационные технологии 2019 Vol. 25 No. 5 P. 313–318

We consider the problem of computer-assisted language and pronunciation learning based on deep learning methods and the information theory of speech perception. In order to improve the efficiency of testing of pronunciation quality, we propose to train a convolutional neural network using the best reference utterances from the user. The experimental results proved ...

Added: May 29, 2019

Deep neural networks and maximum likelihood search for approximate nearest neighbor in video-based image recognition

Savchenko A., Optical Memory and Neural Networks (Information Optics) 2017 Vol. 26 No. 2 P. 129–136

We analyzed the way to increase computational efficiency of video-based image recognition methods with matching of high dimensional feature vectors extracted by deep convolutional neural networks. We proposed an algorithm for approximate nearest neighbor search. At the first step, for a given video frame the algorithm verifies a reference image obtained when recognizing the previous ...

Added: June 30, 2017

The NeurIPS '18 Competition: From Machine Learning to Intelligent Conversations

Springer, 2020.

This volume presents the results of the Neural Information Processing Systems Competition track at the 2018 NeurIPS conference. The competition follows the same format as the 2017 competition track for NIPS. Out of 21 submitted proposals, eight competition proposals were selected, spanning the area of Robotics, Health, Computer Vision, Natural Language Processing, Systems and Physics. Competitions have become ...

Added: December 2, 2019

Black-Box Optimization with Local Generative Surrogates

Belavin V., Ustyuzhanin A., Широбоков С. К. et al., Proceedings of Machine Learning Research 2020 P. 1–9

We propose a novel method for gradient-based optimization of black-box simulators using differentiable local surrogate models. In fields such as physics and engineering, many processes are modeled with non-differentiable simulators with intractable likelihoods. Optimization of these forward models is particularly challenging, especially when the simulator is stochastic. To address such cases, we introduce the use ...

Added: October 31, 2019

Deep learning approach for predicting functional Z-DNA regions using omics data

Beknazarov N., Jin S., Poptsova M., Scientific Reports 2020 Vol. 10 P. 19134

Computational methods to predict Z-DNA regions are in high demand to understand the functional role of Z-DNA. The previous state-of-the-art method Z-Hunt is based on statistical mechanical and energy considerations about B- to Z-DNA transition using sequence information. Z-DNA ChIP-seq experiment results showed little overlap with Z-Hunt predictions implying that sequence information only is not ...

Added: December 11, 2020

Probabilistic adaptive computation time

Figurnov M., Sobolev A., Vetrov D., Bulletin of the Polish Academy of Sciences: Technical Sciences 2018 Vol. 66 No. 6 P. 811–820

We present a probabilistic model with discrete latent variables that control the computation time in deep learning models such as ResNets and LSTMs. A prior on the latent variables expresses the preference for faster computation. The amount of computation for an input is determined via amortized maximum a posteriori (MAP) inference. MAP inference is performed ...

Added: February 27, 2019

Intelligent Data Processing 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers

Switzerland: Springer, 2019.

This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...

Added: February 8, 2020

Fuzzy Analysis and Deep Convolution Neural Networks in Still-to-video Recognition

Savchenko A., Belova N. S., Savchenko Lyudmila V., Optical Memory and Neural Networks (Information Optics) 2018 Vol. 27 No. 1 P. 23–31

We discuss the video classification problem with the matching of feature vectors extracted using deep convolutional neural networks from each frame. We propose a novel recognition method based on representation of each frame as a sequence of fuzzy sets of reference classes whose degrees of membership are defined based on asymptotic distribution of the Kullback–Leibler ...

Added: February 9, 2018

Real-time Streaming Wave-U-Net with Temporal Convolutions for Multichannel Speech Enhancement

Sokolov A., / Series Computer Science "arxiv.org". 2021.

In this paper we describe our work on Task1 of the ConferencingSpeech2021 challenge. The goal of this task was to develop a solution for real-time multi-channel speech enhancement. We propose a novel system for streaming speech enhancement. We employ the Wave-U-Net architecture with temporal convolutions in the encoder and decoder. ...

Added: September 26, 2021

Bayesian Group Sparsification of Long Short-Term Memory Networks

Lobacheva E., Chirkova N., Vetrov D., 2018.

We propose a new Bayesian sparsification technique for gated recurrent architectures that accounts for their recurrent specifics and gated mechanism. Our method eliminates neurons from the model and makes gates constant, not only compressing the network, but also significantly accelerating a forward pass. On the discriminative tasks our method compresses LSTM extremely, so that only ...

Added: October 16, 2018

Intelligent Systems and Applications

Cham: Springer, 2019.

Intelligent Systems Conference (IntelliSys) 2018 is the fourth research conference in the series. This conference is a part of SAI conferences being held since 2013. The conference series has featured keynote talks, special sessions, poster presentations, tutorials, workshops, and contributed papers each year. The conference focuses on areas of intelligent systems and artificial intelligence (AI) and ...

Added: August 29, 2018

Deep convolutional neural networks capabilities for binary classification of polar mesocyclones in satellite mosaics

Криницкий М. А., Verezemskaya P., Гращенков К. В. et al., Atmosphere 2018 Vol. 9 No. 426 P. 1–23

Polar mesocyclones (MCs) are small marine atmospheric vortices. The class of intense MCs, called polar lows, is accompanied by extremely strong surface winds and heat fluxes and thus largely influences deep ocean water formation in the polar regions. Accurate detection of polar mesocyclones in high-resolution satellite data, while challenging, is a time-consuming task, when performed ...

Added: November 26, 2020

Simulating the time projection chamber responses at the MPD detector using generative adversarial networks

A. Maevskiy, F. Ratnikov, Zinchenko A. et al., The European Physical Journal C - Particles and Fields 2021 Vol. 81 Article 599

High energy physics experiments rely heavily on the detailed detector simulation models in many tasks. Running these detailed models typically requires a notable amount of the computing time available to the experiments. In this work, we demonstrate a new approach to speed up the simulation of the Time Projection Chamber tracker of the MPD experiment at ...

Added: July 12, 2021

Statistical Analysis of Protein Side-chain Conformations

Ignatov A., Journal of Physics: Conference Series 2021 Vol. 1740 No. 1 P. 012013

In the paper, three algorithms for predicting protein side-chain conformations are suggested and discussed. All proposed approaches analyze the local neighborhood of the target residue to avoid 'steric clashes'. Strong and weak points of the algorithms are described, and ways of improving their outcomes are suggested. The approach based on predicting conformations for all residues in ...

Added: April 16, 2021

Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Association for Computational Linguistics, 2019.

The 4th Workshop on Representation Learning for NLP (RepL4NLP) will be hosted by ACL 2019 and held on 2 August 2019. The workshop is being organised by Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Alexis Conneau, Johannes Welbl, Xian Ren and Marek Rei; and advised by Kyunghyun Cho, Edward Grefenstette, Karl Moritz ...

Added: November 1, 2019

Variational Inference for Sequential Distance Dependent Chinese Restaurant Process.

Bartunov S. O., Vetrov D., Journal of Machine Learning Research 2014 Vol. 32 No. 1 P. 1404–1412

The recently proposed distance dependent Chinese Restaurant Process (ddCRP) generalizes the extensively used Chinese Restaurant Process (CRP) by accounting for dependencies between data points. Its posterior is intractable and so far only MCMC methods have been used for inference. Because of the very different nature of ddCRP, no prior developments in variational methods for Bayesian nonparametrics are applicable. In ...

Added: July 9, 2014

mPyPl: Python Monadic Pipeline Library for Complex Functional Data Processing

Soshnikov D. V., Valieva Y., Microsoft Journal of Applied Research, USA 2019 Vol. 12 P. 140–150

In this paper, we present a new Python library called mPyPl, which is intended to simplify complex data processing tasks using a functional approach. This library defines operations on lazy data streams of named dictionaries represented as generators (so-called multi-field datastreams), and allows enriching those data streams with more ’fields’ in the process of data ...

Added: November 20, 2020

Learning velocity model for complex media with deep convolutional neural networks

Gremyachikh L., Ustyuzhanin A., Станкевич А. et al., / Series 2110.08626 "Machine Learning". 2021.

The paper considers the problem of velocity model acquisition for complex media based on boundary measurements. The acoustic model is used to describe the media. We used an open-source dataset of velocity distributions to compare the presented results with the previous works directly. Forward modeling is performed using the grid-characteristic numerical method. The inverse ...

Added: May 24, 2022