Anaphoric annotation and corpus-based anaphora resolution: An experiment
The paper describes noun phrase and anaphora annotation in OpenCorpora and compares it to that in other corpora. We discuss the choice of representative texts for anaphoric annotation and the basic principles of syntactic annotation. For noun phrase annotation we followed the scheme introduced earlier for morphological annotation, which is carried out in two stages: first, all noun phrases and some other syntactic units were annotated by a heterogeneous group of people; then a linguist compared all markup results and either chose the best one or corrected mistakes. We present some annotation results and cases of annotator disagreement, and proceed to introduce our data-driven anaphora resolution system based on decision trees. We then list the features used to fit the classifier and discuss their relevance and some changes that improved the classifier's performance. We also present our rule-based approach to automated noun phrase extraction using the Tomita parser. A baseline for anaphora resolution is introduced, and we compare it with our results.
In this paper we present a review of the existing typologies of Internet service users. We zoom in on social networking services, including blogs and crowdsourcing websites. Based on the results of the analysis of the considered typologies obtained by means of FCA, we developed a new user typology of a certain class of Internet services, namely a collaboration innovation platform. Cluster analysis of data extracted from the collaboration platform Witology was used to divide more than 500 participants into 6 groups based on 3 activity indicators: idea generation, commenting, and evaluation (assigning marks). The obtained groups and their percentages appear to follow the “90 – 9 – 1” rule.
This book constitutes the refereed proceedings of the 10th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2014, held in St. Petersburg, Russia in July 2014. The 40 full papers presented were carefully reviewed and selected from 128 submissions. The topics range from theoretical topics for classification, clustering, association rule and pattern mining to specific data mining methods for the different multimedia data types such as image mining, text mining, video mining and Web mining.
We create a collaborative environment for jointly drafting, improving, and promoting bills within public and legislative projects. Enacting a new law means that a community devises new rules that help it become more efficient. Legislative collaboration is based on the following principles. Public construction of a document aiming at complex cloud issues has high educational value. The practice helps not only to produce a quality document and build a community of people interested in its implementation, but also to promote the innovative document and maintain a new level of its understanding and perception by society. Collaborative document creation and voting have priority over document deliberation. Our technology allows collaboration participants to create their own text versions, which other participants can vote for. The value of deliberation is less than the value of collaboration. Contemporary collaboration does not always need discussion. Discussion can take so much time and effort that participants have no resources left to collaborate. The process of selecting text segments is based on the participants' voting. All votes should be counted, but the weight of each vote depends on the participant's impact and the community's estimation of this impact. The greater the participant's impact and its estimation, the greater the weight of the participant's vote.
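The vote-weighting principle in the abstract above can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so the product of impact and community estimate used here is purely an assumption, and the names `vote_weight` and `select_version` are hypothetical.

```python
from collections import defaultdict


def vote_weight(impact, community_estimate):
    # Assumed formula (not from the source): the weight grows with both
    # the participant's impact and the community's estimate of it.
    return impact * community_estimate


def select_version(votes, participants):
    """Pick the text version with the largest total weighted vote.

    votes: list of (participant_id, version_id) pairs.
    participants: dict mapping participant_id -> (impact, community_estimate).
    Returns (winning_version_id, totals_by_version).
    """
    totals = defaultdict(float)
    for participant_id, version_id in votes:
        impact, estimate = participants[participant_id]
        # Every vote is counted, but with a participant-specific weight.
        totals[version_id] += vote_weight(impact, estimate)
    winner = max(totals, key=totals.get)
    return winner, dict(totals)
```

Under this assumed formula, a single vote from a participant with high impact and a high community estimate can outweigh several votes from less active participants, which is the behaviour the principle describes.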
The article is devoted to the effective organization of internal crowdsourcing for the purpose of examining a company's normative documents. Crowdsourcing is a relatively new phenomenon in the practice of domestic companies; its potential is huge, but the toolkit for organizing crowdsourcing activities has not yet been developed. The article attempts to consolidate crowdsourcing experience as applied to revising a company's internal normative documents. The principal feature of this type of crowdsourcing is that it allows participants to examine the draft document (the original version), make the necessary changes, and create a new edition.
The notions of crowdsourcing and reputation are compared. It is shown that crowdsourcing may be a significant factor influencing the reputation formation of various social players; from a strategic perspective, it makes it possible to build a new model of social interaction.
Manually annotated corpora are very important and very expensive resources: the annotation process requires a lot of time and skill. In the OpenCorpora project we are trying to involve native speakers with no special linguistic knowledge in the annotation work. In this paper we describe how we organize our processes in order to maintain high annotation quality, and report on our preliminary results.
In this paper we consider choice problems under the assumption that the preferences of the decision maker are expressed in the form of a parametric partial weak order without assuming the existence of any value function. We investigate both the sensitivity (stability) of each non-dominated solution with respect to the changes of parameters of this order, and the sensitivity of the set of non-dominated solutions as a whole to similar changes. We show that this type of sensitivity analysis can be performed by employing techniques of linear programming.
I give the explicit formula for the (set-theoretical) system of Resultants of m+1 homogeneous polynomials in n+1 variables.