Breaking Sticks and Ambiguities with Adaptive Skip-gram
The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram, like most prior work on learning word representations, does not take word ambiguity into account and maintains only a single representation per word. Although a number of Skip-gram modifications have been proposed to overcome this limitation and learn multi-prototype word representations, they either require a known number of word meanings or learn them using greedy heuristic approaches. In this paper we propose the Adaptive Skip-gram model, a nonparametric Bayesian extension of Skip-gram capable of automatically learning the required number of representations for all words at a desired semantic resolution. We derive an efficient online variational learning algorithm for the model and empirically demonstrate its efficiency on the word-sense induction task.
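The abstract does not spell out the prior over word senses, but the title points to a stick-breaking construction, the usual way a nonparametric Bayesian model lets the number of components grow with the data. The sketch below is only an illustration under that assumption and is not the authors' implementation: the names `stick_breaking_weights` and `effective_senses`, the concentration parameter `alpha`, the truncation level `max_senses`, and the 0.05 threshold are all hypothetical.

```python
import random


def stick_breaking_weights(alpha, max_senses, rng):
    """Draw truncated stick-breaking weights with beta_k ~ Beta(1, alpha).

    alpha and max_senses are illustrative assumptions, not values from the paper.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to any sense
    for _ in range(max_senses - 1):
        beta = rng.betavariate(1.0, alpha)  # fraction broken off the remainder
        weights.append(beta * remaining)
        remaining *= 1.0 - beta
    weights.append(remaining)  # truncation: the last sense takes the leftover
    return weights


def effective_senses(weights, threshold=0.05):
    """Count senses whose prior probability exceeds an (assumed) threshold."""
    return sum(1 for w in weights if w > threshold)


rng = random.Random(0)
for alpha in (0.1, 1.0, 5.0):
    w = stick_breaking_weights(alpha, max_senses=10, rng=rng)
    print(alpha, effective_senses(w), [round(x, 3) for x in w])
```

In this kind of prior, a larger `alpha` tends to spread probability mass over more senses, which is one plausible reading of the "desired semantic resolution" mentioned in the abstract.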
Analysis of Images, Social Networks and Texts. Third International Conference, AIST 2014, Yekaterinburg, Russia, April 10-12, 2014, Revised Selected Papers
Berlin: Springer, 2014
This book constitutes the proceedings of the Third International Conference on Analysis of Images, Social Networks and Texts, AIST 2014, held in Yekaterinburg, Russia, in April 2014. The 11 full and 10 short papers were carefully reviewed and selected from 74 submissions. They are presented together with 3 short industrial papers, 4 invited papers and ...
Added: November 13, 2014
Труды Института системного программирования РАН 2014 Vol. 26 P. 421-438
The paper presents a framework for fast text analytics developed during the Texterra project. Texterra is a technology for multilingual text mining based on novel text processing methods that exploit knowledge extracted from user-generated content. It delivers a fast, scalable solution for text mining without expensive customization. Depending on use-cases Texterra could be utilized ...
Added: November 6, 2017
Proceedings of the Fifth Workshop on Experimental Economics and Machine Learning at the National Research University Higher School of Economics co-located with the Seventh International Conference on Applied Research in Economics (iCare7)
Aachen: CEUR Workshop Proceedings, 2019
The workshop concentrates on an interdisciplinary approach to modelling human behavior that incorporates data mining and expert knowledge from the behavioral sciences. Data analysis results extracted from clean data of laboratory experiments will be compared with noisy industrial datasets, e.g. from the web. Insights from behavioral sciences will help data scientists. Behavior scientists will see new inspirations to ...
Added: November 19, 2019
University Rennes 1, 2017
This volume is the supplementary volume of the 14th International Conference on Formal Concept Analysis (ICFCA 2017), held from June 13th to 16th 2017, at IRISA, Rennes. The ICFCA conference series is one of the major venues for researchers from the field of Formal Concept Analysis and related areas to present and discuss their recent ...
Added: June 19, 2017
Moscow: МЦНМО, 2013
The book is intended as a first introduction to the mathematical foundations of modern machine learning theory and the theory of prediction games. The first part presents the foundations of statistical learning theory and covers classification and regression problems with support vectors, generalization theory, and algorithms for constructing separating hyperplanes. The second and third parts consider adaptive forecasting problems in non-stochastic game-theoretic ...
Added: July 9, 2014
CEUR Workshop Proceedings, 2019
Added: October 31, 2019
Intelligent Data Processing. 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers
Switzerland: Springer, 2019
This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...
Added: February 8, 2020
Логистика и управление цепями поставок 2018 No. 4 (87) P. 27-33
One option for a more flexible approach to analyzing the reliability of supply chains is principal component analysis (PCA). With a large number of variables describing a supply chain, it is difficult to analyze the structure of the variables in two-dimensional space. Within the analysis of the variables dependencies PCA allows to ...
Added: November 29, 2018
Procedia Computer Science 2014 Vol. 31 P. 928-938
We propose extensions of the classical JSM-method and the Naïve Bayesian classifier for the case of triadic relational data. We performed a series of experiments on various types of data (both real and synthetic) to estimate quality of classification techniques and compare them with other classification algorithms that generate hypotheses, e.g. ID3 and Random ...
Added: June 9, 2014
Moscow: Торус Пресс, 2018
The volume contains the abstracts of the 12th International Conference "Intelligent Data Processing: Theory and Applications". The conference is organized by the Russian Academy of Sciences, the Federal Research Center "Informatics and Control" of the Russian Academy of Sciences and the Scientific and Coordination Center "Digital Methods of Data Mining". The conference has been held biennially since 1989. It is one ...
Added: October 9, 2018
Analysis of Images, Social Networks and Texts. 4th International Conference, AIST 2015, Yekaterinburg, Russia, April 9–11, 2015, Revised Selected Papers
Switzerland: Springer, 2015
This book constitutes the proceedings of the Fourth International Conference on Analysis of Images, Social Networks and Texts, AIST 2015, held in Yekaterinburg, Russia, in April 2015. The 24 full and 8 short papers were carefully reviewed and selected from 140 submissions. The papers are organized in topical sections on analysis of images and videos; ...
Added: October 12, 2015
Труды Института системного программирования РАН 2015 Vol. 27 No. 4 P. 129-144
The paper is devoted to methods for constructing the socio-demographic profile of Internet users. Gender, age, political and religious views, region, and relationship status are examples of demographic attributes. This work is a survey of methods that detect demographic attributes from a user's profile and messages. Most of the surveyed works are devoted to gender detection. Age, ...
Added: January 23, 2018
Added: November 20, 2017
ACM SIGIR Forum 2014 Vol. 48 No. 2 P. 105-110
The 8th Russian Summer School in Information Retrieval (RuSSIR 2014) was held on August 18-22, 2014 in Nizhniy Novgorod, Russia. The school was co-organized by the National Research University Higher School of Economics and the Russian Information Retrieval Evaluation Seminar (ROMIP) ...
Added: August 22, 2015
Analysis of Images, Social Networks and Texts. 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7-9, 2016, Revised Selected Papers. Communications in Computer and Information Science
Switzerland: Springer, 2017
This book constitutes the proceedings of the 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016, held in Yekaterinburg, Russia, in April 2016. The 23 full papers, 7 short papers, and 3 industrial papers were carefully reviewed and selected from 142 submissions. The papers are organized in topical sections on machine ...
Added: October 19, 2016
Moscow: Торус Пресс, 2016
These proceedings contain the abstracts of papers accepted to IDP-11 ...
Added: November 12, 2016
Supplementary Proceedings of the 3rd International Conference on Analysis of Images, Social Networks and Texts (AIST 2014)
Ekaterinburg: CEUR Workshop Proceedings, 2014
AIST'2014 is an international data science conference on Analysis of Images, Social Networks, and Texts. Traditionally, the conference is held annually in Yekaterinburg, Russia. The conference is intended for computer scientists and practitioners whose research interests involve Internet mathematics and other related fields of data science. LIST OF TOPICS (NON EXHAUSTIVE) Applications of Data Mining and Machine ...
Added: August 28, 2014
Berlin: Association for Computational Linguistics, 2016
The 2016 Conference on Computational Natural Language Learning is the twentieth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL 2016 will be held on August 11-12, 2016, and is co-located with the 54th annual meeting of the Association for Computational Linguistics (ACL) in Berlin, ...
Added: November 12, 2016
Вопросы языкознания 2014 No. 1 P. 120-145
This paper is an overview of the current issues and tendencies in computational linguistics. The overview is based on the materials of the conference on computational linguistics COLING’2012. Modern approaches to traditional NLP domains such as POS tagging, syntactic parsing, and machine translation are discussed. The highlights of automated information extraction, such as fact extraction, ...
Added: October 15, 2013
The Applications of Sentiment Analysis for Russian Language Texts: Current Challenges and Future Perspectives
IEEE Access 2020 Vol. 8 P. 110693-110719
Sentiment analysis has become a powerful tool for processing and analysing expressed opinions on a large scale. While the application of sentiment analysis to English-language content has been widely examined, its application to the Russian language remains less well studied. In this survey, we comprehensively reviewed the applications of sentiment analysis of Russian-language content and ...
Added: June 24, 2020
International Journal of General Systems 2014 Vol. 43 No. 2 P. 105-134
Formal Concept Analysis (FCA) is a mathematical technique that has been extensively applied to Boolean data in knowledge discovery, information retrieval, web mining, and other applications. In recent years, research on extending FCA theory to cope with imprecise and incomplete information has made significant progress. In this paper, we give a systematic overview of the ...
Added: June 9, 2014
Programming and Computer Software 2014 Vol. 40 No. 5 P. 288-295
A framework for fast text analysis, which is developed as a part of the Texterra project, is described. Texterra provides a scalable solution for the fast text processing on the basis of novel methods that exploit knowledge extracted from the Web and text documents. For the developed tools, details of the project, use cases, and ...
Added: November 26, 2017
Journal of Physics: Conference Series 2018 Vol. 1085 No. 4 P. 042025-1-042025-6
Traces of electro-magnetic showers in neutrino experiments may be considered as signals of dark-matter particles. For example, the SHiP experiment is going to use emulsion film detectors similar to the ones designed for the OPERA experiment for dark matter search. The goal of this research is to develop an algorithm that can identify traces of electro-magnetic ...
Added: December 8, 2017
Системный администратор 2015 No. 10(155) P. 92-95
The article provides a review of modern methods of morphological ambiguity resolution. We considered such methods as statistical disambiguation, Brill’s automatically generated rules, decision trees and their modifications. For the comparison, the article provides numerical results obtained on two open corpora: OpenCorpora and SynTagRus. ...
Added: November 25, 2015