Notes on relation between symbolic classifiers
Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification, introduced and detailed in the book by Bernhard Ganter and Rudolf Wille, "Formal Concept Analysis", Springer 1999. The area came into being in the early 1980s and has since spawned over 10,000 scientific publications and a variety of practically deployed tools. From a data table with objects in rows and attributes in columns, FCA allows one to build a taxonomic data structure called a concept lattice, which can be used for many purposes, especially for Knowledge Discovery and Information Retrieval. The "Formal Concept Analysis Meets Information Retrieval" (FCAIR) workshop, collocated with the 35th European Conference on Information Retrieval (ECIR 2013), was intended, on the one hand, to attract researchers from the FCA community to a broad discussion of FCA-based research on information retrieval and, on the other hand, to promote the ideas, models, and methods of FCA in the Information Retrieval community. This volume contains 11 contributions to the FCAIR workshop (including 3 abstracts for invited talks and a tutorial), held in Moscow on March 24, 2013. All submissions were assessed by at least two reviewers from the program committee of the workshop, to whom we express our gratitude. We would also like to thank the co-organizers and sponsors of the FCAIR workshop: Russian Foundation for Basic Research, National Research University Higher School of Economics, and Yandex.
Higher School of Economics (HSE) and supported by the Information Retrieval Specialist Group at the British Computer Society (BCS–IRSG). The conference was held during March 24–27, 2013, in Moscow, Russia, the easternmost location in the history of the ECIR series. ECIR 2013 received a total of 287 submissions in three categories: 191 full papers, 78 posters, and 18 demonstrations. The geographical distribution of the submissions is as follows: 70% were from Europe (including 9% from Russia), 17% from Asia, 12% from North and South America, and 3% from the rest of the world. All submissions were reviewed by at least three members of an international two-tier Program Committee. Of the papers submitted to the main research track, 30 were selected for oral presentation and 25 for poster/short presentation (16% and 13%, respectively, hence a 29% acceptance rate). In addition, 38 posters (49%) and 10 demonstrations (56%) were accepted. The accepted contributions represent the state of the art in information retrieval, cover a diverse range of topics, propose novel applications, and indicate promising directions for future research. Of the accepted contributions, 66% have a student as the primary author. We gratefully thank all Program Committee members for their time and effort in ensuring the high quality of the ECIR 2013 program. Additionally, ECIR 2013 hosted four tutorials and two workshops covering various IR-related topics. We express our gratitude to the Workshop Chair, Evgeniy Gabrilovich, the Tutorial Chair, Djoerd Hiemstra, and the members of their committees.
– Searching the Web of Data
– Practical Online Retrieval Evaluation
– Cross-Lingual Probabilistic Topic Modeling and Its Applications in Information
– Distributed Information Retrieval and Applications
– From Republicans to Teenagers: Group Membership and Search (GRUMPS)
– Integrating IR Technologies for Professional Search
The conference included a Mentoring Program and Doctoral Consortium.
We thank Mikhail Ageev and Hideo Joho and Dmitriy Ignatov, respectively, for coordinating these activities.
We would like to thank our invited speakers: Mor Naaman (Rutgers University, Social Media Information Lab) and the winner of the Karen Sparck Jones award. The Industry Day took place on the final day of the conference and featured a bright assortment of talks given by prominent researchers and practitioners: Paul Ogilvie (LinkedIn), Hilary Mason (bitly), Antonio Gulli (Bing), Andrey Kalinin (Mail.Ru), Jimmy Lin (Twitter/University of Maryland), Marc Najork (Microsoft Research), and Andrey Styskin (Yandex), to whom we express our gratitude. We appreciate generous financial support from Yandex and HSE, as well as from our sponsors Mail.Ru and Russian Foundation for Basic Research (platinum level), Google and ABBYY
The paper gives a brief introduction to classifier ensembles in machine learning and describes an algorithm that improves classification quality by recommending classifiers to objects. The hypothesis underlying the algorithm is that a classifier is more likely to classify an object correctly if it has correctly predicted the labels of that object's neighbours in the training set. The author illustrates the principle of the algorithm with a simple example and describes its testing on real data.
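The neighbour-based hypothesis can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's algorithm or example: the abstract does not say how neighbours are chosen or how ties are resolved, so the Euclidean distance, the parameter `k`, the lowest-index tie-break, the function name `recommend_and_classify`, and the two toy classifiers are all hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def recommend_and_classify(x, train_X, train_y, classifiers, k=3):
    """Recommend the classifier that was most accurate on the k nearest
    training neighbours of x and return (classifier index, prediction).

    Assumptions (not from the source): Euclidean distance, a fixed k,
    and ties resolved in favour of the lowest classifier index.
    """
    # Indices of the k training objects closest to x.
    neighbours = sorted(range(len(train_X)),
                        key=lambda i: dist(x, train_X[i]))[:k]
    # For each classifier, count the neighbours whose label it gets right.
    scores = [sum(clf(train_X[i]) == train_y[i] for i in neighbours)
              for clf in classifiers]
    best = max(range(len(classifiers)), key=lambda j: scores[j])
    return best, classifiers[best](x)


# Toy data: the labelling rule is inverted between the left and right regions,
# so each hypothetical classifier is reliable in only one of them.
train_X = [(0, 1), (1, -1), (2, 1), (8, 1), (9, -1), (10, 1)]
train_y = [1, 0, 1, 0, 1, 0]


def clf_a(p):
    """Correct on the left region of the toy data."""
    return 1 if p[1] > 0 else 0


def clf_b(p):
    """Correct on the right region of the toy data."""
    return 0 if p[1] > 0 else 1


classifiers = [clf_a, clf_b]

print(recommend_and_classify((1.5, 1.0), train_X, train_y, classifiers))  # (0, 1)
print(recommend_and_classify((9.0, 1.0), train_X, train_y, classifiers))  # (1, 0)
```

In this toy setting neither classifier is accurate everywhere, yet each test object is routed to the one that was right about its training neighbours. A real ensemble would use trained base classifiers in place of the hand-written rules.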
This paper addresses the problem of automatically annotating images with a set of keywords, which makes it possible to search large image collections by text query. It considers a general annotation scheme that uses global low-level image features represented as statistical classes. A procedure for classifying statistical classes, based on a proposed inclusion measure, is used to construct secondary informative image features, which are then used to classify the images by keyword.
This paper considers a data analysis system for collaborative platforms, developed by a joint research team of the National Research University Higher School of Economics and the Witology company. Our focus is on describing the methodology and the results of the first experiments. The system is based on several modern models and methods for analysing object-attribute and unstructured (text) data, such as Formal Concept Analysis, multimodal clustering, association rule mining, and keyword and collocation extraction from texts.