A method of ontology-aided expertise matching for facilitating knowledge exchange
The paper proposes a new method for facilitating knowledge exchange by seeking relevant university experts for commenting actual information events arising in the open environment of a modern economical cluster. This method is based on a new mathematical model of ontology concepts matching. We propose to use in the formal core of our method a new modification of Latent Dirichlet allocation. The method and the mathematical model of ontology matching were validated in the form of a software-based solution: the newly designed decision support system titled EXPERTIZE. The system regularly monitors different text sources in the Internet, performs document analysis and provides university employees with critical information about relevant events according a developed matching algorithm. In the proposed solution we made several contributions to the advances of knowledge processing, including: new modifications of topic modeling method suitable for application in expert finding tasks, integration of new algorithms and existing ontology services to show feasibility of the solution.
Proceeding of the 15th International Conference on Artificial Intelligence: Methodology, Systems, Applications , AIMSA 2012, Varna, Bulgaria, September 12-15, 2012.
This paper is an overview of the current issues and tendencies in Computational linguistics. The overview is based on the materials of the conference on computational linguistics COLING’2012. The modern approaches to the traditional NLP domains such as pos-tagging, syntactic parsing, machine translation are discussed. The highlights of automated information extraction, such as fact extraction, opinion mining are also in focus. The main tendency of modern technologies in Computational linguistics is to accumulate the higher level of linguistic analysis (discourse analysis, cognitive modeling) in the models and to combine machine learning technologies with the algorithmic methods on the basis of deep expert linguistic knowledge.
Compared with the area of spatial relations force interactions haven’t been in the limelight of attention of ontologists working on natural language processing. This article gives an example of text meaning representation based on the ontology and the lexicon of force interactions.
In this paper, we consider opinion word extraction, one of the key problems in sentiment analysis. Sentiment analysis (or opinion mining) is an important research area within computational linguistics. Opinion words, which form an opinion lexicon, describe the attitude of the author towards certain opinion targets, i.e., entities and their attributes on which opinions have been expressed. Hence, the availability of a representative opinion lexicon can facilitate the extraction of opinions from texts. For this reason, opinion word mining is one of the key issues in sentiment analysis. We designed and implemented several methods for extracting opinion words. We evaluated these approaches by testing how well the resulting opinion lexicons help improve the accuracy of methods for determining the polarity of the reviews if the extracted opinion words are used as features. We used several machine learning methods: SVM, Logistic Regression, Naive Bayes, and KNN. By using the extracted opinion words as features we were able to improve over the baselines in some cases. Our experiments showed that, although opinion words are useful for polarity detection, they are not su fficient on their own and should be used only in combination with other features.
An important text mining problem is to find, in a large collection of texts, documents related to specic topics and then discern further structure among the found texts. This problem is especially important for social sciences, where the purpose is to nd the most representative documents for subsequent qualitative interpretation. To solve this problem, we propose an interval semi-supervised LDA approach, in which certain predened sets of keywords (that dene the topics researchers are interested in) are restricted to specic intervals of topic assignments. We present a case study on a Russian LiveJournal dataset aimed at ethnicity discourse analysis.
An important text mining problem is to find, in a large collection of texts, documents related to specific topics and then discern further structure among the found texts. This problem is especially important for social sciences, where the purpose is to find the most representative documents for subsequent qualitative interpretation. To solve this problem, we propose an interval semi-supervised LDA approach, in which certain predefined sets of keywords (that define the topics researchers are interested in) are restricted to specific intervals of topic assignments.
Concept discovery is a Knowledge Discovery in Databases (KDD) research field that uses human-centered techniques such as Formal Concept Analysis (FCA), Biclustering, Triclustering, Conceptual Graphs etc. for gaining insight into the underlying conceptual structure of the data. Traditional machine learning techniques are mainly focusing on structured data whereas most data available resides in unstructured, often textual, form. Compared to traditional data mining techniques, human-centered instruments actively engage the domain expert in the discovery process. This volume contains the contributions to CDUD 2011, the International Workshop on Concept Discovery in Unstructured Data (CDUD) held in Moscow. The main goal of this workshop was to provide a forum for researchers and developers of data mining instruments working on issues with analyzing unstructured data. We are proud that we could welcome 13 valuable contributions to this volume. The majority of the accepted papers described innovative research on data discovery in unstructured texts. Authors worked on issues such as transforming unstructured into structured information by amongst others extracting keywords and opinion words from texts with Natural Language Processing methods. Multiple authors who participated in the workshop used methods from the conceptual structures field including Formal Concept Analysis and Conceptual Graphs. Applications include but are not limited to text mining police reports, sociological definitions, movie reviews, etc.
The paper concerns discourse-new referent detection. The task of coreference resolution is essential in many text-mining applications. The focus in this task is to detect noun phrases (NPs) that refer to the same entity. In languages without articles, there are no overt grammatical clues in an NP for whether it introduces a new referent into discourse or it refers to one of before-mentioned entities. However, there are some theoretical researches which claim that referent first-mentioning NPs have some specific features. In our research, we examine features that serve as discourse-new detectors for NPs corresponding to discourse salient referents and provide an experiment on different features contribution to this detection. The first-mention detection could help the quality of coreference resolution systems.