The inverted multi-index

Babenko A.; Lempitsky V.

P. 3069–3076.

A new data structure for efficient similarity search in very large datasets of high-dimensional vectors is introduced. This structure, called the inverted multi-index, generalizes the inverted index idea by replacing the standard quantization within inverted indices with product quantization. For very similar retrieval complexity and preprocessing time, inverted multi-indices achieve a much denser subdivision of the search space compared to inverted indices, while retaining their memory efficiency. Our experiments with large datasets of SIFT and GIST vectors demonstrate that, because of the denser subdivision, inverted multi-indices are able to return much shorter candidate lists with higher recall. Augmented with a suitable reranking procedure, multi-indices were able to improve the speed of approximate nearest neighbor search on the dataset of 1 billion SIFT vectors by an order of magnitude compared to the best previously published systems, while achieving better recall and incurring only a few percent of memory overhead.
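The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: vectors are split into exactly two halves, each half has its own small codebook (here picked by random sampling rather than trained), and cells are visited best-first by the sum of the two half-distances. The function names `build_multi_index` and `query_candidates` are hypothetical.

```python
import heapq
import numpy as np


def build_multi_index(data, codebook_a, codebook_b):
    """Assign every vector to a cell (i, j): the nearest codeword per half."""
    half = data.shape[1] // 2
    # Squared distances from each half-vector to every codeword of its codebook.
    da = ((data[:, None, :half] - codebook_a[None]) ** 2).sum(-1)
    db = ((data[:, None, half:] - codebook_b[None]) ** 2).sum(-1)
    cells = {}
    for idx, (i, j) in enumerate(zip(da.argmin(1), db.argmin(1))):
        cells.setdefault((int(i), int(j)), []).append(idx)
    return cells


def query_candidates(q, codebook_a, codebook_b, cells, min_candidates):
    """Collect candidates by visiting cells in increasing order of the
    summed half-distances between the query and the cell's codewords."""
    half = q.shape[0] // 2
    da = ((codebook_a - q[:half]) ** 2).sum(-1)
    db = ((codebook_b - q[half:]) ** 2).sum(-1)
    order_a = np.argsort(da, kind="stable")
    order_b = np.argsort(db, kind="stable")
    # Heap entries: (distance sum, rank in order_a, rank in order_b).
    heap = [(float(da[order_a[0]] + db[order_b[0]]), 0, 0)]
    seen = {(0, 0)}
    out = []
    while heap and len(out) < min_candidates:
        _, r, s = heapq.heappop(heap)
        out.extend(cells.get((int(order_a[r]), int(order_b[s])), []))
        # Push the two grid neighbours that have not been queued yet.
        for r2, s2 in ((r + 1, s), (r, s + 1)):
            if r2 < len(order_a) and s2 < len(order_b) and (r2, s2) not in seen:
                seen.add((r2, s2))
                total = float(da[order_a[r2]] + db[order_b[s2]])
                heapq.heappush(heap, (total, r2, s2))
    return out


# Toy data: 200 vectors of dimension 8, 4 codewords per half (assumed sizes).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))
picks = rng.choice(len(data), size=4, replace=False)
codebook_a, codebook_b = data[picks, :4], data[picks, 4:]
cells = build_multi_index(data, codebook_a, codebook_b)
candidates = query_candidates(data[0], codebook_a, codebook_b, cells, 10)
```

With 4 codewords per half, the index has 4 × 4 = 16 cells while storing only 8 codewords. This is the sense in which the subdivision becomes denser for a similar codebook cost.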

Language: English

Keywords: data structures, image retrieval, query formulation

In book

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2012)

Providence: IEEE, 2012.

Automata Equipped with Auxiliary Data Structures and Regular Realizability Problems

Rubtsov A. A., Vyalyi M., in: Descriptional Complexity of Formal Systems: 23rd IFIP WG 1.02 International Conference, DCFS 2021, Virtual Event, September 5, 2021, Proceedings. Springer, 2021. P. 150–162.

Added: February 2, 2022

Algorithms and Data Structures. WADS 2019. Lecture Notes in Computer Science

Springer, 2019.

16th International Symposium, WADS 2019, Edmonton, AB, Canada, August 5–7, 2019, Proceedings ...

Added: October 26, 2021

Unsupervised neural quantization for compressed-domain similarity search

Morozov S., Babenko A., in: Proceedings of the IEEE International Conference on Computer Vision (ICCV 2019). IEEE, 2019. P. 3036–3045.

We tackle the problem of unsupervised visual descriptors compression, which is a key ingredient of large-scale image retrieval systems. While the deep learning machinery has benefited literally all computer vision pipelines, the existing state-of-the-art compression methods employ shallow architectures, and we aim to close this gap by our paper. In more detail, we introduce a ...

Added: July 6, 2020

Approaches to organizing the search decision tree in the branch-and-bound method for the asymmetric traveling salesman problem

Fomichev M., Ulyanov M., Информационные технологии 2018 Vol. 24 No. 11 P. 698–704

The time efficiency of software implementations of the branch-and-bound method for the asymmetric traveling salesman problem can be improved both by choosing the most suitable data structure, one that provides time-efficient operations on the leaves of the search decision tree, and by using additional memory to store truncated matrices in the leaves of the search decision tree. In addition, one may also propose ...

Added: January 26, 2020

Cascade Heap: Towards Time-Optimal Extractions

Babenko M. A., Kolesnichenko I., Smirnov I., Theory of Computing Systems 2019 Vol. 63 No. 4 P. 637–646

Heaps are well-studied fundamental data structures, having myriads of applications, both theoretical and practical. We consider the problem of designing a heap with an “optimal” extract-min operation. Assuming an arbitrary linear ordering of keys, a heap with n elements typically takes O(log n) time to extract the minimum. Extracting all elements faster is impossible as ...

Added: December 6, 2019

Fundamentals of Computation Theory, 22nd International Symposium, FCT 2019, Copenhagen, Denmark, August 12-14, 2019, Proceedings

Springer, 2019.

Added: August 4, 2019

Performance analysis of thread synchronization strategies in data structures based on flat-combining

Галимуллин М. Ф., Kalishenko E., Рапоткин Н. А., Известия Санкт-Петербургского государственного электротехнического университета ЛЭТИ 2016 No. 7 P. 13–23

The paper deals with the development of thread synchronization strategies based on concurrent «flat-combining» data structures and with the study of their performance. It considers the «flat-combining» approach and its implementation in the libcds library, the development of a thread synchronization strategy, and its possible implementations. The efficiency of using synchronization strategies is studied on ...

Added: November 1, 2018

Hybrid neural network and bi-criteria tabu-machine: comparison of new approaches to maximum clique problem

Babkina T. S., Demidovskij A., Babkin E., International Journal of Big Data Intelligence 2018 Vol. 5 No. 3 P. 143–155

This paper presents two new approaches to solving the classical NP-hard maximum clique problem (MCP), which frequently arises in the domain of information management, including design of database structures and big data processing. In our research, we are focusing on solving that problem using the paradigm of artificial neural networks. The first approach ...

Added: October 3, 2018

Lecture Notes in Computer Science

Berlin, Heidelberg: Springer, 2017.

The 12th issue of LNCS Transactions on Petri Nets and Other Models of Concurrency (ToPNoC) contains revised and extended versions of a selection of the best papers from the workshops held at the 37th International Conference on Application and Theory of Petri Nets and Concurrency (Petri Nets 2016, Toruń, Poland, 19–24 June 2016), and the ...

Added: September 27, 2017

Resource characteristics of ways to organize a decision tree in the branch-and-bound method for the traveling salesmen problem

Ulyanov M.V., Fomichev M.I., Business Informatics 2015 No. 4 (34) P. 38–46

The resource efficiency of different implementations of the branch-and-bound method for the classical traveling salesman problem depends, inter alia, on ways to organize a search decision tree generated by this method. The classic «time-memory» dilemma is realized herein either by an option of storing reduced matrices at the points of the decision tree, which leads ...

Added: November 5, 2016

Automatic image annotation with low-level features and conditional random fields

Bronevich A.G., Melnichenko A. S., in: IC3K 2013; KDIR 2013 - 5th International Conference on Knowledge Discovery and Information Retrieval and KMIS 2013 - 5th International Conference on Knowledge Management and Information Sharing, Proc. Vilamoura, Algarve, Portugal, 19–22 September. Setúbal: SciTePress, 2013. P. 197–201.

This work is devoted to the problem of automatic image annotation by analyzing image low-level characteristics. This problem consists in assigning words of a natural language to an arbitrary image by analyzing image low-level characteristics without any other additional information. Automatic image annotation could be useful for extraction of high-level semantic information from images, organizing ...

Added: October 17, 2016

Computing minimal and maximal suffixes of a substring

Maxim Babenko, Gawrychowski P., Kociumaka T. et al., Theoretical Computer Science 2016 Vol. 638 P. 112–121

We consider the problems of computing the maximal and the minimal non-empty suffixes of substrings of a longer text of length n. For the minimal suffix problem we show that for every τ, 1 ≤ τ ≤ log n, there exists a linear-space data structure with O(τ) query time and O(n log n / τ) preprocessing time. As a ...

Added: October 8, 2015

Information Systems Development Technologies: proceedings of the international scientific and practical conference

Taganrog: Southern Federal University Publishing House, 2015.

The collection is compiled from the materials of the VI International Scientific and Practical Conference "Information Systems Development Technologies", held on September 6-12, 2015 in Gelendzhik. The authors of the published materials are responsible for the authenticity and accuracy of citations, names, titles, and other information. The materials are published as submitted by the authors. The event was held with the financial support of the Russian Foundation for Basic Research (grant No. 15-07-20559-г). ...

Added: September 13, 2015

The inverted multi-index

Babenko A., Lempitsky V., IEEE Transactions on Pattern Analysis and Machine Intelligence 2015 Vol. 37 No. 6 P. 1247–1260

A new data structure for efficient similarity search in very large datasets of high-dimensional vectors is introduced. This structure, called the inverted multi-index, generalizes the inverted index idea by replacing the standard quantization within inverted indices with product quantization. For very similar retrieval complexity and pre-processing time, inverted multi-indices achieve a much denser subdivision of ...

Added: September 3, 2015

The Inverted Multi-Index

Babenko A., IEEE Transactions on Pattern Analysis and Machine Intelligence 2014 Vol. PP No. 99 P. 1

Added: December 19, 2014

Wavelet Trees Meet Suffix Trees

Babenko M. A., Gawrychowski P., Kociumaka T. et al., in: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms. San Diego: SIAM, 2015. P. 572–591.

We present an improved wavelet tree construction algorithm and discuss its applications to a number of rank/select problems for integer keys and strings. Given a string of length n over an alphabet of size ω ≤ n, our method builds the wavelet tree in O(n log ω √log n) time, improving upon the state-of-the-art algorithm ...

Added: October 4, 2014

A suffix tree or not a suffix tree?

Vildhoj H. W., Starikovskaya T., in: Combinatorial Algorithms. 25th International Workshop, IWOCA 2014, Duluth, MN, USA, October 15-17, 2014, Revised Selected Papers. Vol. 8986. Springer, 2014. P. 338–350.

In this paper we study the structure of suffix trees. Given an unlabelled tree $\tau$ on $n$ nodes and suffix links of its internal nodes, we ask the question "Is $\tau$ a suffix tree?", i.e., is there a string $S$ whose suffix tree has the same topological structure as $\tau$? We place no restrictions on ...

Added: October 4, 2014

Proceedings of the 21st International Symposium on String Processing and Information Retrieval

Springer, 2014.

This book constitutes the proceedings of the 21st International Symposium on String Processing and Information Retrieval, SPIRE 2014, held in Ouro Preto, Brazil, in October 2014. The 20 full and 6 short papers included in this volume were carefully reviewed and selected from 45 submissions. The papers focus not only on fundamental algorithms in string ...

Added: October 4, 2014