Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors

Babenko A., Baranchuk D., Malkov Y.

P. 1–15.

This work addresses the problem of billion-scale nearest neighbor search. State-of-the-art retrieval systems for billion-scale databases are currently based on the inverted multi-index, a recently proposed generalization of the inverted index structure. The multi-index provides a very fine-grained partition of the feature space, which allows concise and accurate short-lists of candidates to be extracted for search queries. In this paper, we argue that the potential of the simple inverted index was not fully exploited in previous works, and we advocate its use for both the highly entangled deep descriptors and the relatively disentangled SIFT descriptors. We introduce a new retrieval system that is based on the inverted index and outperforms the multi-index by a large margin for the same memory consumption and construction complexity. For example, on a dataset of one billion deep descriptors, our system achieves state-of-the-art recall rates up to six times faster than the efficient implementation of the inverted multi-index from the FAISS library.
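For readers unfamiliar with the structure the abstract refers to, the following is a minimal sketch of a plain inverted index for nearest neighbor search, written in pure Python. It is not the system introduced in the paper, and it includes none of that system's components; it only illustrates the baseline idea of partitioning the feature space into cells and ranking a short-list of candidates taken from the cells closest to the query. All names (`kmeans`, `build_index`, `search`, `nprobe`) and the toy data sizes are assumptions chosen for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, v, n=1):
    """Indices of the n centroids closest to v, nearest first."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(centroids[c], v))
    return order[:n]

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd iterations; hypothetical helper used to learn the cells."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for p in points:
            cells[nearest(centroids, p)[0]].append(p)
        for c, members in enumerate(cells):
            if members:  # an empty cell keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_index(points, centroids):
    """Inverted lists: for each cell, the ids of the points assigned to it."""
    lists = [[] for _ in centroids]
    for pid, p in enumerate(points):
        lists[nearest(centroids, p)[0]].append(pid)
    return lists

def search(query, points, centroids, lists, nprobe=2, topk=1):
    """Collect a short-list from the nprobe closest cells, then rank it exactly."""
    candidates = [pid for c in nearest(centroids, query, nprobe) for pid in lists[c]]
    candidates.sort(key=lambda pid: sq_dist(points[pid], query))
    return candidates[:topk]

# Toy data (assumed): 200 random 4-dimensional vectors, 8 cells.
rng = random.Random(1)
points = [[rng.random() for _ in range(4)] for _ in range(200)]
centroids = kmeans(points, k=8)
lists = build_index(points, centroids)
query = [0.5, 0.5, 0.5, 0.5]
approx = search(query, points, centroids, lists, nprobe=2, topk=5)
```

In this sketch, raising `nprobe` enlarges the short-list and so trades speed for recall; probing every cell reduces the search to an exhaustive scan. Billion-scale systems additionally compress the stored vectors, which this example omits.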

Language: English

Keywords: Inverted Indices

In book

15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings

Springer, 2018.