Chapter
Concept-based Recommendations for Internet Advertisement
This paper considers the problem of identifying terms that may interest an advertiser. If a company has already bought advertising terms that describe certain services, it is reasonable to find out which terms competing companies have bought. Some of these can be recommended to the company as future advertising terms. The goal of this work is to propose more interpretable recommendations based on Formal Concept Analysis (FCA) and association rules.
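The association-rule side of this idea can be illustrated with a minimal sketch. It is not the authors' method: the abstract gives no algorithmic details, so the data layout (a mapping from company to the set of terms it bought), the function name `recommend_terms`, the single-antecedent rule form `owned term -> candidate term`, and both thresholds are assumptions made for illustration only.

```python
def recommend_terms(purchases, company, min_support=2, min_confidence=0.5):
    """Recommend terms via single-antecedent rules: owned term -> candidate term.

    purchases: dict mapping a company name to the set of terms it bought
    (hypothetical layout). Support is the number of competitors that bought
    both terms; confidence is that count divided by the number of
    competitors that bought the owned term.
    """
    owned = purchases[company]
    competitors = [terms for name, terms in purchases.items() if name != company]
    scores = {}
    for term in owned:
        # Competitors that also bought this owned term.
        buyers = [terms for terms in competitors if term in terms]
        if len(buyers) < min_support:
            continue
        # Candidate terms: bought by those competitors, not yet by the company.
        candidates = set().union(*buyers) - owned
        for cand in candidates:
            both = sum(1 for terms in buyers if cand in terms)
            confidence = both / len(buyers)
            if both >= min_support and confidence >= min_confidence:
                # Keep the best rule found for each candidate term.
                scores[cand] = max(scores.get(cand, 0.0), confidence)
    # Highest confidence first, ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data, invented for the example.
purchases = {
    "A": {"flowers", "bouquet"},
    "B": {"flowers", "bouquet", "delivery"},
    "C": {"flowers", "delivery", "gifts"},
    "D": {"bouquet", "delivery"},
    "E": {"laptops"},
}

print(recommend_terms(purchases, "A"))
```

Each recommendation here is backed by an explicit rule, which is one simple way to keep the output interpretable. An FCA-based variant would instead derive recommendations from formal concepts of the company-term context.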
Concept discovery is a Knowledge Discovery in Databases (KDD) research field that uses human-centered techniques such as Formal Concept Analysis (FCA), Biclustering, Triclustering, and Conceptual Graphs to gain insight into the underlying conceptual structure of the data. Traditional machine learning techniques focus mainly on structured data, whereas most available data resides in unstructured, often textual, form. Compared to traditional data mining techniques, human-centered instruments actively engage the domain expert in the discovery process. This volume contains the contributions to CDUD 2011, the International Workshop on Concept Discovery in Unstructured Data (CDUD), held in Moscow. The main goal of this workshop was to provide a forum for researchers and developers of data mining instruments who work on the analysis of unstructured data. We are proud to welcome 13 valuable contributions to this volume. The majority of the accepted papers describe innovative research on data discovery in unstructured texts. Authors worked on issues such as transforming unstructured information into structured information by, among other things, extracting keywords and opinion words from texts with Natural Language Processing methods. Multiple authors who participated in the workshop used methods from the conceptual structures field, including Formal Concept Analysis and Conceptual Graphs. Applications include, but are not limited to, text mining of police reports, sociological definitions, and movie reviews.
A method is proposed for implementing an automated knowledge assessment system based on intelligent tools such as the ontological approach, fuzzy logic, and knowledge discovery from data.
This paper describes a data analysis system for the collaborative platform of the company Witology. The project is still under development, so the paper mainly covers methodological aspects and the results of the first experiments. The system is built on a number of models and methods from the modern analysis of object-attribute and unstructured (textual) data, such as Formal Concept Analysis, multimodal clustering, association rule mining, and the extraction of key phrases and keywords from texts.
This paper examines several methods and approaches to the problem of assigning a performer to a task. It also analyzes the strengths and weaknesses of the algorithms considered and proposes directions for further research aimed at developing a methodology for assigning performers in project management.
An important characteristic feature of recommender systems for web pages is the abundance of textual information in and about the items being recommended (web pages). To improve recommendations and enhance user experience, we propose to use automatic tag (keyword) extraction for web pages entering the recommender system. We present a novel tag extraction algorithm that employs semi-supervised classification based on a dataset consisting of pre-tagged documents and (for the most part) partially tagged documents whose tags are automatically mined from the content. We also compare several classification algorithms for tag extraction in this context.