Book
Proceedings of the Artificial Intelligence and Natural Language AINL FRUCT 2016 Conference, Saint-Petersburg, Russia, 10-12 November 2016
Proceeding of the AINL FRUCT: Artificial Intelligence and Natural Language Conference 2016
The availability of large urban social media data creates new opportunities for studying cities. In our paper we propose a new direction for this research: a joint analysis of geolocations of shared images and their content as determined by computer vision. To test our ideas, we use a dataset of 47,410 Instagram images shared in the city of St. Petersburg over one year. We show how a combination of semantic clustering, image recognition and geospatial analysis can detect important patterns related to both how people use a city and how they represent it in social media.
Although it is disputable whether analysis of social relations on Twitter can be extended to real life, Twitter discussions still attract the attention of scholars studying the structures and meanings of news- and issue-based ad-hoc public discourse. One of the socially relevant aspects of Twitter studies is that of influencers: accounts that produce impact, either inside or outside Twitter. But there is still no agreement in the research community on how to define and measure who is an influencer, either by 'absolute figures' or by network analysis metrics; this issue is even rarely discussed. Politically, today's mediatized public sphere, where traditional media play the role of information hubs, is highly uneven in terms of access to opinion expression; it privileges institutional players, including political elites, corporations, and media themselves. Hopes that Twitter would provide a more equal space for public deliberation are still not proven well enough. Using web crawling and manual assessment of the Twitter ad-hoc discussion on the Biryulyovo bashings of 2013, we show that users who post most, or even get commented on most, do not make it to the positions of the most 'central' users by network metrics. We also demonstrate that users who rank high by betweenness and PageRank centrality form circles of reciprocal commenting that reveal a social cleavage wider than the discussion itself.
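The contrast between activity counts and network centrality described in this abstract can be illustrated with a minimal sketch. It is not the authors' method or data: the edge list, the user names, and the `pagerank` helper (a plain power iteration) are all hypothetical, and betweenness centrality is left out for brevity.

```python
from collections import Counter


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over directed (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n
                    for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical toy data: each pair means "first user commented on second user".
edges = [
    ("spammer", "a"), ("spammer", "b"), ("spammer", "c"), ("spammer", "d"),
    ("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "hub"),
    ("hub", "a"),
]

comments_made = Counter(src for src, _ in edges)
ranks = pagerank(edges)

most_active = comments_made.most_common(1)[0][0]
most_central = max(ranks, key=ranks.get)
print("most active:", most_active)
print("most central by PageRank:", most_central)
```

In this toy graph the account that writes the most comments receives none, so its PageRank stays at the minimum, while the account that others point to ranks highest. Real discussion graphs are far larger and noisier, so this only shows why the two measures can disagree.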
With the process of globalization the number of borrowings from English has rapidly increased in languages all over the world. In automatic speech recognition, spell-checking, tagging and other natural language processing tasks, loan words frequently cause problems and should be treated separately. In this paper we present a corpora-based approach for the automatic detection of anglicisms in Russian social network texts. The proposed method is based on the idea of simultaneous similarity in script, phonetics and semantics between the original Latin word and its Cyrillic analogue. We used a set of transliteration, phonetic transcription and morphological analysis methods to find possible hypotheses, and distributional semantic models to filter them. The resulting list of borrowings, gathered from approximately 20 million LiveJournal texts, shows good overlap with a manually collected dictionary. The proposed method is fully automated and can be applied to any domain-specific area.
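The hypothesis-generation step mentioned in this abstract can be sketched roughly as follows. This is an illustration under strong assumptions, not the authors' pipeline: the letter tables are hypothetical and deliberately incomplete, string similarity from the standard library stands in for phonetic transcription and morphological analysis, and the distributional semantic filter is omitted entirely.

```python
from difflib import SequenceMatcher

# Hypothetical, deliberately incomplete Latin-to-Cyrillic tables.
DIGRAPHS = {"sh": "ш", "ch": "ч", "th": "т", "ck": "к", "oo": "у", "ee": "и"}
LETTERS = {
    "a": "а", "b": "б", "c": "к", "d": "д", "e": "е", "f": "ф", "g": "г",
    "h": "х", "i": "и", "j": "дж", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "q": "к", "r": "р", "s": "с", "t": "т", "u": "у",
    "v": "в", "w": "в", "x": "кс", "y": "и", "z": "з",
}


def transliterate(word):
    """Greedy left-to-right transliteration, trying digraphs first."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            # Characters missing from the table are passed through unchanged.
            out.append(LETTERS.get(word[i], word[i]))
            i += 1
    return "".join(out)


def candidate_pairs(english_words, russian_tokens, threshold=0.8):
    """Pair each English word with Cyrillic tokens close to its transliteration."""
    pairs = []
    for en in english_words:
        guess = transliterate(en)
        for ru in russian_tokens:
            score = SequenceMatcher(None, guess, ru).ratio()
            if score >= threshold:
                pairs.append((en, ru, round(score, 2)))
    return pairs


# Hypothetical inputs: an English word list and tokens from a Russian corpus.
print(candidate_pairs(["blog", "shop"], ["блог", "шопинг", "молоко"]))
```

A surface match like this over-generates and misses inflected forms, which is why the abstract describes a further filtering stage based on semantic similarity.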
The presented project is intended to make use of the growing amounts of textual data in Russian-language social networks in order to find linguistic correlates of the Dark Triad personality traits, comprising non-clinical Narcissism, Machiavellianism and Psychopathy. The background for the investigation includes, on the one hand, psychological research on these phenomena and their measurement instruments, and on the other hand, recent advances in computational stylometry and text-based author profiling. The measures for these psychological phenomena are provided by recognized self-report psychological surveys adapted to Russian. Morphological and semantic analysis are applied to investigate the relationship between the Dark traits and their linguistic manifestation in social network texts. Significant morphological and semantic correlates of Narcissism, Machiavellianism and Psychopathy are identified and compared to respective advances in English author profiling. In order to deepen our understanding of the relation between these psychological characteristics and natural language use, the identified linguistic features are interpreted in terms of the fine-grained factor structure of the Dark traits. Identifying correlated features is a step towards automatic Dark trait prediction and early detection of potentially harmful mental states.
Media frames have traditionally been extracted via manual content and discourse analysis. Such an approach has a limited ability to deal with large text collections and is prone to subjectivity in terms of both text selection and interpretation. We illustrate the possibilities and limitations of topic modeling for frame detection by applying this method to a collection of 50,000 news items related to the Ukrainian crisis, retrieved from the websites of a Russian and a Ukrainian TV channel. We conclude that although topic modeling results allow one to make assumptions about how a topic is framed, the method is still not as precise as human reading of texts.

This article is an expanded version of the report submitted by the author at the V scientific and practical conference dedicated to the memory of the first Dean of the Faculty of Sociology HSE, Alexander O. Kryshtanovskiy, "Sociological research methods in modern practice". The article is based on a study of the quantitative data obtained in the course of one of the stages of the study "New social movements of youth" by the Center of Youth Studies HSE - Saint Petersburg. At this stage, youth community mapping was conducted and the data were analysed using SNA tools. The work addresses the specific application of network theory and network analysis methods to discovering relations between various informal organisations, using youth communities as an example.
Proceeding of the 15th International Conference on Artificial Intelligence: Methodology, Systems, Applications, AIMSA 2012, Varna, Bulgaria, September 12-15, 2012.
This paper is an overview of current issues and tendencies in computational linguistics. The overview is based on the materials of the conference on computational linguistics COLING’2012. Modern approaches to traditional NLP domains such as POS tagging, syntactic parsing and machine translation are discussed. The highlights of automated information extraction, such as fact extraction and opinion mining, are also in focus. The main tendency of modern technologies in computational linguistics is to incorporate higher levels of linguistic analysis (discourse analysis, cognitive modeling) into the models and to combine machine learning technologies with algorithmic methods based on deep expert linguistic knowledge.
Compared with spatial relations, force interactions have not received much attention from ontologists working on natural language processing. This article gives an example of text meaning representation based on the ontology and the lexicon of force interactions.
In 2015-2016 the Department of Communication, Media and Design of the National Research University “Higher School of Economics”, in collaboration with the non-profit organization ROCIT, conducted research aimed at constructing the Index of Digital Literacy in Russian Regions. This research was a priority and remains unmatched for the moment.
This book constitutes the refereed proceedings of the 12th Industrial Conference on Data Mining, ICDM 2012, held in Berlin, Germany in July 2012. The 22 revised full papers presented were carefully reviewed and selected from 97 submissions. The papers are organized in topical sections on data mining in medicine and biology; data mining for energy industry; data mining in traffic and logistic; data mining in telecommunication; data mining in engineering; theory in data mining; theory in data mining: clustering; theory in data mining: association rule mining and decision rule mining.
This article discusses state management and cultural policy, their nature and content, in terms of a new tendency: the development of postindustrial society. It notes that cultural policy is at present the basis of regional political activity and that regions can gain a strong competitive advantage if they are able to implement cultural policy successfully. All these trends can produce elements of new economic development.