Using Annotated Suffix Trees for Fuzzy Full Text Search

Dmitry Frolov
In press

A method for fuzzy full text search is proposed. The method follows a popular two-stage scheme with a novel second stage: a preliminary search using an n-gram inverted index, followed by relevance checking between the query and documents using frequency-annotated suffix trees (ASTs). The ASTs are built off-line for all documents in the collection. The method is compared with two popular fuzzy text retrieval techniques: one using n-gram inverted indexing with Levenshtein distance checking and signature hashing, and the other being Lemur, a popular toolkit for language modelling and information retrieval. The computational experiments use the "Reuters 21578" text collection and a collection of USPTO patents. Our AST-based method generally yields accuracy scores similar to those of the most accurate competitor, the Levenshtein distance-based method, while significantly outperforming it in speed. Therefore, when both criteria, accuracy and speed, are considered together, the AST-based method shows significant advantages.
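
The two-stage scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the function and class names (`build_ngram_index`, `candidates`, `AST`, `fuzzy_search`), the trigram size, the `min_shared` threshold, and the scoring rule (mean conditional frequency along the longest matching path, averaged over query suffixes) are all assumptions made for illustration, and the tree is an uncompressed suffix trie over words rather than an optimised structure.

```python
from collections import defaultdict


def ngrams(text, n=3):
    """Return the set of character n-grams of a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_ngram_index(docs, n=3):
    """Stage 1 index: map each n-gram to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in ngrams(text, n):
            index[gram].add(doc_id)
    return index


def candidates(index, query, n=3, min_shared=1):
    """Ids of documents sharing at least min_shared n-grams with the query."""
    counts = defaultdict(int)
    for gram in ngrams(query, n):
        for doc_id in index.get(gram, ()):
            counts[doc_id] += 1
    return {d for d, c in counts.items() if c >= min_shared}


class AST:
    """Frequency-annotated suffix trie over the words of one document.

    Hypothetical, simplified structure: every node stores how many
    inserted word suffixes pass through it.
    """

    def __init__(self, text):
        self.root = {"freq": 0, "children": {}}
        for word in text.split():
            for start in range(len(word)):
                self._insert(word[start:])

    def _insert(self, suffix):
        node = self.root
        node["freq"] += 1
        for ch in suffix:
            node = node["children"].setdefault(ch, {"freq": 0, "children": {}})
            node["freq"] += 1

    def score(self, query):
        """Assumed relevance score in [0, 1]; higher means a closer match."""
        total, count = 0.0, 0
        for word in query.split():
            for start in range(len(word)):
                node, probs = self.root, []
                for ch in word[start:]:
                    child = node["children"].get(ch)
                    if child is None:
                        break  # longest match for this suffix ends here
                    probs.append(child["freq"] / node["freq"])
                    node = child
                total += sum(probs) / len(probs) if probs else 0.0
                count += 1
        return total / count if count else 0.0


def fuzzy_search(docs, query, n=3):
    """Stage 1: n-gram candidates. Stage 2: rank candidates by AST score."""
    index = build_ngram_index(docs, n)
    # In the scheme described above the trees are built off-line;
    # they are built inline here only to keep the sketch self-contained.
    trees = {doc_id: AST(text) for doc_id, text in docs.items()}
    hits = candidates(index, query, n)
    return sorted(hits, key=lambda d: trees[d].score(query), reverse=True)
```

For example, `fuzzy_search({1: "suffix trees for text search", 2: "economics of public administration"}, "sufix tree")` keeps only document 1 after the n-gram stage, since the misspelled query still shares several trigrams with it and none with document 2.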

Language: English
Keywords: information retrieval

In book

Communications in Computer and Information Science. Information Retrieval. 10th Russian Summer School, RuSSIR 2016, Saratov, Russia, August 22-26, 2016, Revised Selected Papers
Springer, 2016.