Breaking Sticks and Ambiguities with Adaptive Skip-gram
arXiv.org, 2015.
The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take word ambiguity into account and maintains only a single representation per word. Although a number of Skip-gram modifications have been proposed to overcome this limitation and learn multi-prototype word representations, they either require a known number of word meanings or learn them using greedy heuristic approaches. In this paper we propose the Adaptive Skip-gram model, a nonparametric Bayesian extension of Skip-gram capable of automatically learning the required number of representations for all words at a desired semantic resolution. We derive an efficient online variational learning algorithm for the model and empirically demonstrate its efficiency on the word-sense induction task.
Research target: Computer Science
Language: English
Berlin : Springer, 2014
This book constitutes the proceedings of the Third International Conference on Analysis of Images, Social Networks and Texts, AIST 2014, held in Yekaterinburg, Russia, in April 2014. The 11 full and 10 short papers were carefully reviewed and selected from 74 submissions. They are presented together with 3 short industrial papers, 4 invited papers and ...
Added: November 13, 2014
Денис Турдаков, Астраханцев Н. А., Недумов Я. Р. et al., Труды Института системного программирования РАН 2014 Т. 26 С. 421-438
The paper presents a framework for fast text analytics developed during the Texterra project. Texterra is a technology for multilingual text mining based on novel text processing methods that exploit knowledge extracted from user-generated content. It delivers a fast, scalable solution for text mining without expensive customization. Depending on use-cases Texterra could be utilized ...
Added: November 6, 2017
Aachen : CEUR Workshop Proceedings, 2019
The workshop concentrates on an interdisciplinary approach to modelling human behavior, incorporating data mining and expert knowledge from behavioral sciences. Data analysis results extracted from clean data of laboratory experiments will be compared with noisy industrial datasets, e.g. from the web. Insights from behavioral sciences will help data scientists. Behavioral scientists will see new inspirations to ...
Added: November 19, 2019
University Rennes 1, 2017
This volume is the supplementary volume of the 14th International Conference on Formal Concept Analysis (ICFCA 2017), held from June 13th to 16th 2017, at IRISA, Rennes. The ICFCA conference series is one of the major venues for researchers from the field of Formal Concept Analysis and related areas to present and discuss their recent ...
Added: June 19, 2017
Switzerland : Springer, 2015
This book constitutes the proceedings of the Fourth International Conference on Analysis of Images, Social Networks and Texts, AIST 2015, held in Yekaterinburg, Russia, in April 2015. The 24 full and 8 short papers were carefully reviewed and selected from 140 submissions. The papers are organized in topical sections on analysis of images and videos; ...
Added: October 12, 2015
CEUR Workshop Proceedings, 2019
Added: October 31, 2019
Switzerland : Springer, 2019
This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...
Added: February 8, 2020
Kuznetsov V. O., Логистика и управление цепями поставок 2018 № 4 (87) С. 27-33
One option for a more flexible approach to analyzing the reliability of supply chains is principal component analysis (PCA). With a large number of variables describing a supply chain, it is difficult to analyze the structure of the variables in two-dimensional space. Within the analysis of the variables dependencies PCA allows to ...
Added: November 29, 2018
Zhuk R., Ignatov D. I., Konstantinova N., Procedia Computer Science 2014 Vol. 31 P. 928-938
We propose extensions of the classical JSM-method and the Naïve Bayesian classifier for the case of triadic relational data. We performed a series of experiments on various types of data (both real and synthetic) to estimate the quality of classification techniques and compare them with other classification algorithms that generate hypotheses, e.g. ID3 and Random ...
Added: June 9, 2014
М. : Торус Пресс, 2018
The volume contains the abstracts of the 12th International Conference "Intelligent Data Processing: Theory and Applications". The conference is organized by the Russian Academy of Sciences, the Federal Research Center "Informatics and Control" of the Russian Academy of Sciences and the Scientific and Coordination Center "Digital Methods of Data Mining". The conference has been held biennially since 1989. It is one ...
Added: October 9, 2018
V'yugin V., М. : МЦНМО, 2013
The book is intended as a first introduction to the mathematical foundations of modern machine learning theory and the theory of prediction games. The first part presents the foundations of statistical learning theory and covers classification and regression problems with support vectors, generalization theory, and algorithms for constructing separating hyperplanes. The second and third parts cover adaptive forecasting problems in nonstochastic game-theoretic ...
Added: July 9, 2014
С.Д. Кузнецов, Гомзин А. Г., Труды Института системного программирования РАН 2015 Т. 27 № 4 С. 129-144
The paper is devoted to methods for constructing the socio-demographic profile of Internet users. Gender, age, political and religious views, region, and relationship status are examples of demographic attributes. This work is a survey of methods that detect demographic attributes from a user's profile and messages. Most of the surveyed works are devoted to gender detection. Age, ...
Added: January 23, 2018
Braslavski P., Karpov Nikolay, Worring M. et al., ACM SIGIR Forum 2014 Vol. 48 No. 2 P. 105-110
The 8th Russian Summer School in Information Retrieval (RuSSIR 2014) was held on August 18-22, 2014 in Nizhniy Novgorod, Russia. The school was co-organized by the National Research University Higher School of Economics and the Russian Information Retrieval Evaluation Seminar (ROMIP) ...
Added: August 22, 2015
Switzerland : Springer, 2017
This book constitutes the proceedings of the 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016, held in Yekaterinburg, Russia, in April 2016. The 23 full papers, 7 short papers, and 3 industrial papers were carefully reviewed and selected from 142 submissions. The papers are organized in topical sections on machine ...
Added: October 19, 2016
М. : Торус Пресс, 2016
These proceedings contain the abstracts of papers accepted to IDP-11 ...
Added: November 12, 2016
Ekaterinburg : CEUR Workshop Proceedings, 2014
AIST'2014 is an international data science conference on Analysis of Images, Social Networks, and Texts. Traditionally, the conference is held annually in Yekaterinburg, Russia. The conference is intended for computer scientists and practitioners whose research interests involve Internet mathematics and other related fields of data science.
LIST OF TOPICS (NON EXHAUSTIVE)
Applications of Data Mining and Machine ...
Added: August 28, 2014
Berlin : Association for Computational Linguistics, 2016
The 2016 Conference on Computational Natural Language Learning is the twentieth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL 2016 will be held on August 11-12, 2016, and is co-located with the 54th annual meeting of the Association for Computational Linguistics (ACL) in Berlin, ...
Added: November 12, 2016
Krylova D., Maksimenko A., Государственное управление. Электронный вестник 2021 № 84 С. 241-255
Using several foreign publications as examples, the authors analyze trends in the use of artificial intelligence and machine learning to detect corruption. Based on this international review, they conclude that the mechanisms for detecting corruption, based on the use of artificial intelligence, described in foreign sources, ...
Added: February 25, 2021
Emmanuel I. C., Mitrofanova E., / Cornell Tech. Series 4064475 "ArXiv Preprint". 2022.
The paper is devoted to the study of the model fairness and process fairness of the Russian demographic dataset by making predictions of divorce of the 1st marriage, religiosity, 1st employment and completion of education. Our goal was to make classifiers more equitable by reducing their reliance on sensitive features while increasing or at least ...
Added: May 31, 2022
Smetanin S., IEEE Access 2020 Vol. 8 P. 110693-110719
Sentiment analysis has become a powerful tool for processing and analysing expressed opinions on a large scale. While the application of sentiment analysis to English-language content has been widely examined, its application to the Russian language remains less well studied. In this survey, we comprehensively reviewed the applications of sentiment analysis of Russian-language content and ...
Added: June 24, 2020
Рысаков С. В., Системный администратор 2015 № 10(155) С. 92-95
The article provides a review of modern methods of morphological ambiguity resolution. We considered such methods as statistical disambiguation, Brill’s automatically generated rules, decision trees and their modifications. For the comparison, the article provides numerical results obtained on two open corpora: OpenCorpora and SynTagRus. ...
Added: November 25, 2015
V.Belavin, A.Filatov, A.Ustyuzhanin et al., Journal of Physics: Conference Series 2018 Vol. 1085 No. 4 P. 042025-1-042025-6
Traces of electromagnetic showers in neutrino experiments may be considered signals of dark-matter particles. For example, the SHiP experiment is going to use emulsion film detectors similar to the ones designed for the OPERA experiment for the dark matter search. The goal of this research is to develop an algorithm that can identify traces of electromagnetic ...
Added: December 8, 2017
S.D. Kuznetsov, D.Yu. Turdakov, Астраханцев Н. А. et al., Programming and Computer Software 2014 Vol. 40 No. 5 P. 288-295
A framework for fast text analysis, which is developed as a part of the Texterra project, is described. Texterra provides a scalable solution for the fast text processing on the basis of novel methods that exploit knowledge extracted from the Web and text documents. For the developed tools, details of the project, use cases, and ...
Added: November 26, 2017