Recommender system for crowdsourcing platform Witology
This paper discusses recommender models and methods for crowdsourcing platforms. The models are based on modern methods for analyzing object-attribute data, such as Formal Concept Analysis and biclustering. In particular, the paper focuses on two tasks, idea recommendation and antagonist recommendation, using the crowdsourcing platform Witology as an example.
Concept discovery is a Knowledge Discovery in Databases (KDD) research field that uses human-centered techniques such as Formal Concept Analysis (FCA), Biclustering, Triclustering, Conceptual Graphs, etc. to gain insight into the underlying conceptual structure of the data. Traditional machine learning techniques mainly focus on structured data, whereas most available data resides in unstructured, often textual, form. Compared to traditional data mining techniques, human-centered instruments actively engage the domain expert in the discovery process. This volume contains the contributions to CDUD 2011, the International Workshop on Concept Discovery in Unstructured Data, held in Moscow. The main goal of this workshop was to provide a forum for researchers and developers of data mining instruments who work on the analysis of unstructured data. We are proud to welcome 13 valuable contributions to this volume. The majority of the accepted papers describe innovative research on data discovery in unstructured texts. Authors worked on issues such as transforming unstructured information into structured information by, among other things, extracting keywords and opinion words from texts with Natural Language Processing methods. Multiple authors who participated in the workshop used methods from the conceptual structures field, including Formal Concept Analysis and Conceptual Graphs. Applications include, but are not limited to, text mining of police reports, sociological definitions, and movie reviews.
In this paper we propose two novel methods for analyzing data collected from online social networks. In particular, we analyze data from Vkontakte, a Russian online social network. Using biclustering, we extract groups of users with similar interests and find communities of users that belong to similar groups. With triclustering, we reveal users’ interests as tags and use them to describe Vkontakte groups. After this social tagging process, we can recommend relevant groups for a particular user to join, or new friends with similar tastes from groups of interest. We present some preliminary results and explain how we plan to apply these methods to massive data repositories.
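The abstract above does not say which biclustering algorithm is used, so the following is only a minimal sketch under stated assumptions. It treats user-to-group membership as a binary relation and, for every (user, group) pair, forms the bicluster "all users of that group" x "all groups of that user", keeping it if the rectangle is dense enough. The function name `oa_biclusters`, the `min_density` threshold, and the toy membership data are all hypothetical.

```python
def oa_biclusters(memberships, min_density=0.8):
    """Return {(users, groups): density} for sufficiently dense biclusters.

    memberships maps each user to the set of groups the user belongs to.
    Assumption: a bicluster is generated from each (user, group) pair as
    (all members of the group, all groups of the user).
    """
    # Invert the relation: group -> set of its members.
    users_of = {}
    for user, groups in memberships.items():
        for group in groups:
            users_of.setdefault(group, set()).add(user)

    found = {}
    for user, groups in memberships.items():
        for group in groups:
            key = (frozenset(users_of[group]), frozenset(groups))
            if key in found:
                continue
            users, grps = key
            # Density = share of (user, group) cells in the rectangle
            # that are actual memberships.
            filled = sum(len(memberships[u] & grps) for u in users)
            density = filled / (len(users) * len(grps))
            if density >= min_density:
                found[key] = density
    return found


# Hypothetical toy data: four users and three interest groups.
MEMBERSHIPS = {
    "u1": {"jazz", "rock"},
    "u2": {"jazz", "rock"},
    "u3": {"rock", "chess"},
    "u4": {"chess"},
}

BICLUSTERS = oa_biclusters(MEMBERSHIPS)
for (users, groups), density in BICLUSTERS.items():
    print(sorted(users), sorted(groups), round(density, 2))
```

On this toy data, users u1 and u2 form a fully dense bicluster over the groups jazz and rock. A user who appears in a bicluster without belonging to all of its groups (here u3, who is not in jazz) is a natural candidate for a group recommendation.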
The notions of crowdsourcing and reputation are compared. It is shown that crowdsourcing may be a significant factor in the reputation formation of various social players; from a strategic perspective, it makes it possible to build a new model of social interaction.
We create a collaborative environment for the joint creation, improvement, and promotion of bills within public and legislative projects. Enacting a new law means that a community devises new rules that help it become more efficient. Legislative collaboration is based on the following principles. Public construction of a document aimed at complex cloud issues has high educational value: the practice helps not only to produce a quality document and build a community of people interested in its implementation, but also to promote the innovative document and maintain a new level of its understanding and perception by society. Collaborative document creation and voting take priority over document deliberation. Our technology allows collaboration participants to create their own text versions, which other participants can vote for. The value of deliberation is less than the value of collaboration: contemporary collaboration does not always need discussion, and discussion can take so much time and effort that participants have no resources left to collaborate. The selection of text segments is based on the participants' voting. All votes should be counted, but the weight of each vote depends on the participant's impact and on the community's estimate of that impact: the greater the impact and its estimate, the greater the weight of the participant's vote.
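The weighted voting principle described above can be illustrated with a short sketch. The abstract states only that a vote's weight grows with the participant's impact and with the community's estimate of that impact; the multiplicative formula in `vote_weight`, the function names, and the sample participants are assumptions made for illustration.

```python
from collections import defaultdict


def vote_weight(impact, community_estimate):
    # Assumption: weight is the product of the two factors. The source
    # only says that weight increases with both of them.
    return impact * community_estimate


def select_version(votes, participants):
    """Pick the text version with the largest total weighted vote.

    votes is a list of (participant, version) pairs; participants maps
    each participant to an (impact, community_estimate) pair.
    """
    totals = defaultdict(float)
    for voter, version in votes:
        impact, estimate = participants[voter]
        totals[version] += vote_weight(impact, estimate)
    winner = max(totals, key=totals.get)
    return winner, dict(totals)


# Hypothetical participants: (impact, community estimate of that impact).
PARTICIPANTS = {
    "ann": (4.0, 0.9),
    "bob": (1.0, 0.5),
    "eve": (1.0, 0.6),
    "dan": (1.0, 0.5),
}
VOTES = [("ann", "v1"), ("bob", "v2"), ("eve", "v2"), ("dan", "v2")]

WINNER, TOTALS = select_version(VOTES, PARTICIPANTS)
print(WINNER, TOTALS)
```

Every vote is counted, yet version v1 wins with a single vote because its supporter carries more weight than the three supporters of v2 combined.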
The paper gives a brief introduction to multiple classifier systems and describes a particular algorithm that improves classification accuracy by recommending an algorithm for each object. The recommendation rests on the hypothesis that a classifier is likely to predict the label of an object correctly if it has correctly classified the object's neighbors. The process of assigning a classifier to each object uses the apparatus of Formal Concept Analysis. We explain the principle of the algorithm on a toy example and describe experiments with real-world datasets.
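The neighbor hypothesis can be sketched in a few lines. Note that this sketch replaces the paper's Formal Concept Analysis machinery with a plain k-nearest-neighbor search, which is a simplification and not the authors' algorithm. The two threshold classifiers, the training points, and the names `recommend_classifier` and `predict` are hypothetical.

```python
from math import dist


def recommend_classifier(x, train_X, train_y, classifiers, k=3):
    """Return the index of the classifier that labels the most of x's
    k nearest training neighbors correctly (ties go to the lowest index)."""
    order = sorted(range(len(train_X)), key=lambda i: dist(x, train_X[i]))
    neighbors = order[:k]
    scores = [
        sum(clf(train_X[i]) == train_y[i] for i in neighbors)
        for clf in classifiers
    ]
    return max(range(len(classifiers)), key=scores.__getitem__)


def predict(x, train_X, train_y, classifiers, k=3):
    """Label x with the classifier recommended for it."""
    best = recommend_classifier(x, train_X, train_y, classifiers, k)
    return classifiers[best](x)


# Two hypothetical base classifiers, each looking at one coordinate only.
def clf_a(p):
    return int(p[0] > 0.5)


def clf_b(p):
    return int(p[1] > 0.5)


CLASSIFIERS = [clf_a, clf_b]

# Toy training set: clf_b is right in the upper-left region,
# clf_a is right in the lower-right region.
TRAIN_X = [(0.1, 0.9), (0.2, 0.8), (0.0, 1.0),
           (0.9, 0.1), (0.8, 0.2), (1.0, 0.0),
           (0.1, 0.1), (0.2, 0.1)]
TRAIN_Y = [1, 1, 1, 1, 1, 1, 0, 0]

print(recommend_classifier((0.15, 0.85), TRAIN_X, TRAIN_Y, CLASSIFIERS))
print(recommend_classifier((0.85, 0.15), TRAIN_X, TRAIN_Y, CLASSIFIERS))
```

Neither base classifier is correct on the whole toy set, but choosing one per object according to its performance on the object's neighbors labels both query points correctly.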
An important feature of recommender systems for web pages is the abundance of textual information in and about the items being recommended (web pages). To improve recommendations and enhance user experience, we propose to use automatic tag (keyword) extraction for web pages entering the recommender system. We present a novel tag extraction algorithm that employs semi-supervised classification based on a dataset consisting of pre-tagged documents and (for the most part) partially tagged documents whose tags are automatically mined from the content. We also compare several classification algorithms for tag extraction in this context.