Analysis of Images, Social Networks and Texts. 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7-9, 2016, Revised Selected Papers. Communications in Computer and Information Science
This book constitutes the proceedings of the 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016, held in Yekaterinburg, Russia, in April 2016. The 23 full papers, 7 short papers, and 3 industrial papers were carefully reviewed and selected from 142 submissions. The papers are organized in topical sections on machine learning and data analysis; social networks; natural language processing; analysis of images and video.
In this paper we present the first systematic review and compare the performance of the most frequently used machine learning algorithms for predicting the match winner from the teams’ drafts in the DotA 2 computer game. Although previous research attempted this task with simple models, we have made several improvements in our approach that aim to take into account interactions among heroes in the draft. For that purpose we tested the following machine learning algorithms: Naive Bayes classifier, Logistic Regression and Gradient Boosted Decision Trees. We also introduced Factorization Machines for this task and obtained our best results from them. In addition, we found that a model’s prediction accuracy depends on the skill level of the players. We have prepared a publicly available dataset which addresses the shortcomings of the data used in previous research and can be used further for algorithm development, testing and benchmarking.
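For readers unfamiliar with Factorization Machines, the following is a minimal sketch of the standard second-order model applied to a draft. The draft encoding (one indicator per hero per side), the function names, and the parameter shapes are assumptions made for illustration; they are not taken from the paper, and no training procedure is shown.

```python
import math


def encode_draft(radiant, dire, n_heroes):
    """Hypothetical encoding: one 0/1 indicator per hero for each side."""
    x = [0.0] * (2 * n_heroes)
    for h in radiant:
        x[h] = 1.0
    for h in dire:
        x[n_heroes + h] = 1.0
    return x


def fm_score(x, w0, w, V):
    """Second-order factorization machine score.

    w0 is the bias, w the linear weights, and V[i] the latent vector of
    feature i. The pairwise term sum_{i<j} <V[i], V[j]> x_i x_j is computed
    with the usual identity
    0.5 * sum_f ((sum_i V[i][f] x_i)^2 - sum_i (V[i][f] x_i)^2).
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_score_naive(x, w0, w, V):
    """Same score with the pairwise term written out explicitly (for checking)."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            score += dot * x[i] * x[j]
    return score


def win_probability(x, w0, w, V):
    """Map the score to a probability with a logistic link (an assumption)."""
    return 1.0 / (1.0 + math.exp(-fm_score(x, w0, w, V)))
```

The pairwise term is what lets such a model represent interactions among heroes: each pair of picked heroes contributes the inner product of their latent vectors, so the number of parameters grows linearly with the number of heroes rather than with the number of hero pairs.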
In this paper we present a comparison of three morphological taggers for Russian with regard to the quality of morphological disambiguation performed by these taggers. We test the quality of the analysis in three different ways: lemmatization, POS-tagging and assigning full morphological tags. We analyze the mistakes made by the taggers, outline their strengths and weaknesses, and present a possible way to improve the quality of morphological analysis for Russian.
Many algorithms for mobile robot mapping in indoor environments have been developed. In this work we use a Kinect 2.0 camera, a Beward B2720 visible-range camera and a Flir Tau 2 infrared camera to build dense 3D maps of indoor environments. We present RGB-D Mapping and a new fusion algorithm that combines visual features and depth information for image matching, 3D point cloud alignment, “loop-closure” detection, and pose graph optimization in order to build globally consistent 3D maps. Such 3D maps of environments have various applications in robot navigation, real-time tracking, non-cooperative remote surveillance, face recognition, and semantic mapping. The performance and computational complexity of the proposed RGB-D Mapping algorithm in real indoor environments are presented and discussed.
Modern corpora provide suitable access to the stored data. However, they are more convenient for researchers than for students who are learning a foreign language and are not familiar with corpus linguistics. Therefore, we set the task of creating a corpus for the Russian language that contains information on word co-occurrence, the syntactic relations between words, and their government.
Homophily is considered by network scientists to be one of the major mechanisms of social network formation. However, the role of dynamic homophily in the network growth process has not yet been investigated in detail. In this paper, we estimate the role of homophily by various attributes at different stages of the online network formation process. We consider the process of online friendship formation on the Vkontakte social networking site among first-year students at a Russian university. We reveal that at the beginning of network formation, similarity in gender and in entrance exam scores plays the key role, while by the end of the network establishment period, affiliation with the same group becomes more important. We explain the results by the tendency of students to follow different strategies to control the information flow in their social environment.
A balanced social structure within an organization is often considered one of the major factors of company success. Thus the analysis of organizational networks is an important direction in network and organizational studies. In this paper we explore the mechanisms of collaboration using information about scientific paper coauthorships. We reveal the collaboration mechanisms within the research departments of top Russian oil companies: Gazpromneft, Bashneft, Lukoil, and Tatneft. We also examine the role of management in professional community formation.
Organizational citizenship behavior (OCB) is an important management construct. Despite previous investigations in relation to social capital, the role of networks in its emergence has received only limited attention. In this paper we investigate the relationship between OCB (with data collected from supervisors evaluating their subordinates), several types of organizational networks (professional, friendship, support, supervisor-subordinate), and several other constructs (collected from the employees themselves) that have been shown to affect OCB in the past. All data were collected at a large insurance company in Russia. The outcomes of this study have several important implications. First, the impact of networks on the manifestation of OCB depends not only on the strength of network ties, but also on the type of network. Second, interorganizational relationships are complex and consist of several levels of mediated relationships. The results of this study can inform the theoretical understanding of OCB and have practical implications for supervisor-subordinate relationships in the workplace.
In this paper, vector-space semantic models based on the Word2Vec word embedding algorithm and on a count-based association-oriented algorithm are evaluated and compared by measuring the association strength between Russian nouns and adjectives. A dataset of nouns and associated adjectives is used as the test set for a pseudodisambiguation task. The models are trained on corpora of Russian fiction. A measure of lexical association anomaly is applied to evaluate the similarity between the initial noun and the resulting attributive phrase. Association strength results are reported for models characterized by different parameter values, and the best parameter value combinations are proposed. The test exemplars that contribute to the error rate are manually annotated, and the model errors are categorized in terms of their linguistic nature and compositionality features.
The Cropland Capture game (CCG) aims to map cultivated lands using around 170,000 satellite images. The contribution of the paper is threefold: (a) we improve the quality of the CCG’s dataset, (b) we benchmark state-of-the-art algorithms designed for the aggregation of votes in a crowdsourcing-like setting and compare the results with machine learning algorithms, (c) we propose an explanation for the surprisingly similar accuracy of all examined algorithms. To accomplish (a), we detect image duplicates using the perceptual hash function pHash. In addition, using a blur detection algorithm, we filter out unidentifiable images. In part (c), we suggest that if all workers are accurate and the task assignment in the dataset is highly irregular, then state-of-the-art algorithms perform on a par with Majority Voting. We increase the estimated consistency with expert opinions from 77% to 91%, and up to 96% if we restrict our attention to images with more than 9 votes.
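The Majority Voting baseline mentioned above, together with a consistency measure restricted to images with enough votes, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the vote format (image id, worker id, binary label), the function names, the treatment of ties as undecided, and the definition of consistency as the share of expert-labelled images on which the aggregate agrees are all hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(votes):
    """Aggregate (image_id, worker_id, label) votes per image.

    Returns {image_id: (label, n_votes)}; label is None when the two most
    frequent labels are tied (an assumption about tie handling).
    """
    per_image = defaultdict(Counter)
    for image_id, _worker_id, label in votes:
        per_image[image_id][label] += 1

    aggregated = {}
    for image_id, counts in per_image.items():
        ranked = counts.most_common()
        n_votes = sum(counts.values())
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            aggregated[image_id] = (None, n_votes)
        else:
            aggregated[image_id] = (ranked[0][0], n_votes)
    return aggregated


def consistency(aggregated, expert_labels, min_votes=0):
    """Share of expert-labelled images where the aggregate matches the expert.

    Only images with at least min_votes votes are counted. Returns None if
    no image qualifies.
    """
    matched = 0
    total = 0
    for image_id, expert_label in expert_labels.items():
        if image_id not in aggregated:
            continue
        label, n_votes = aggregated[image_id]
        if n_votes < min_votes:
            continue
        total += 1
        if label == expert_label:
            matched += 1
    return matched / total if total else None
```

With this sketch, restricting attention to images with more than 9 votes corresponds to calling `consistency(aggregated, expert_labels, min_votes=10)`.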