Scene Recognition in User Preference Prediction Based on Classification of Deep Embeddings and Object Detection

A. Savchenko; A. Rassadin

doi:10.1007/978-3-030-22808-8_41

Ch. 41. P. 422–430.

Savchenko A., Rassadin A.

In this paper we consider the general scene recognition problem for analyzing user preferences based on the photos on his or her mobile phone. Special attention is paid to out-of-class detections and efficient processing using MobileNet-based architectures. We propose a three-stage procedure. First, a pre-trained convolutional neural network (CNN) is used to extract embeddings of the input image at one of the last layers, and these embeddings are used to train a classifier, e.g., a support vector machine or a random forest. Second, we fine-tune the pre-trained network on the given training set and compute the predictions (scores) at the output of the resulting CNN. Finally, we perform object detection in the input image and classify the resulting sparse vector of detected objects. The decision is based on a weighted sum of the class posterior probabilities estimated by all three classifiers. Experimental results on a subset of the ImageNet dataset demonstrate that the proposed approach is up to 5% more accurate than conventional fine-tuned models.
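The final decision rule in the abstract, a weighted sum of the class posteriors from the three classifiers, can be illustrated with a minimal sketch. The function name `fuse_posteriors`, the weights, and the posterior values below are hypothetical and chosen only for illustration; the abstract does not state which weights the authors use or how they are selected.

```python
import numpy as np


def fuse_posteriors(posteriors, weights):
    """Combine class posteriors from several classifiers by a weighted sum.

    posteriors: array-like of shape (n_classifiers, n_classes)
    weights:    array-like of shape (n_classifiers,), non-negative
    Returns the index of the winning class and the fused posterior vector.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the fused vector sums to 1
    fused = weights @ posteriors       # weighted sum over classifiers
    return int(np.argmax(fused)), fused


# Hypothetical posteriors over three scene classes (illustrative values only).
p_embeddings = [0.5, 0.3, 0.2]  # classifier trained on pre-trained CNN embeddings
p_finetuned = [0.2, 0.6, 0.2]   # scores at the output of the fine-tuned CNN
p_objects = [0.1, 0.7, 0.2]     # classifier on the sparse detected-object vector

label, fused = fuse_posteriors(
    [p_embeddings, p_finetuned, p_objects],
    weights=[0.4, 0.4, 0.2],    # assumed weights, not taken from the paper
)
print(label, fused)
```

With these assumed inputs, the second class wins even though the embedding-based classifier alone prefers the first, which is the kind of correction a fused decision is meant to provide.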

Keywords: object detection, image processing and recognition, convolutional neural network, scene recognition, image processing

Publication based on the results of:

Efficient methods of multimedia data recognition for analyzing the preferences of mobile device users (2019)

In book

Advances in Neural Networks – ISNN 2019 16th International Symposium on Neural Networks, ISNN 2019, Moscow, Russia, July 10–12, 2019, Proceedings, Part II

Cham: Springer, 2019.

On the Use of Metaheuristics of Different Classes and an Island Model for Image Steganography

Melman A., Mikhail Alexandrov, Oleg Evsutin, in: 2025 XIX International Symposium on Problems of Redundancy in Information and Control Systems (Redundancy), 5-7 Nov. 2025. IEEE, 2025. P. 1–6.

Digital steganography protects the privacy of data by hiding them in some digital containers, such as images. Protection of confidential messages by hiding them in digital images faces the problem of balancing the main performance indicators, i.e. embedding imperceptibility and capacity. Metaheuristic optimization can be used to flexibly customize embedding options, including parameters and locations ...

Added: January 28, 2026

A method for improving the detection of presentation attacks on a biometric face recognition system using a convolutional network with an attention mechanism

Pikul A. S., in: Almanac of Scientific Works of Young Scientists of ITMO University. Proceedings of the Fifty-Third (LIII) Scientific and Educational-Methodological Conference, Vol. 1. St. Petersburg: ITMO University, 2024. P. 338–342.

A new approach is proposed for improving the recognition of presentation attacks on a biometric face recognition system using a convolutional network with an attention mechanism. The central hypothesis, that an attention mechanism can improve the results of the original convolutional neural network, was tested. The experiments confirmed the hypothesis. The largest gain in quality was achieved on the dataset ...

Added: December 13, 2025

A deep neural network with graph attention for detecting fake face images

Pikul A. S., Lependin A. A., Proceedings of Young Scientists of Altai State University 2023 No. 20 P. 190–193

A new approach to detecting presentation attacks on face recognition systems is presented. It is based on a graph attention mechanism applied to intermediate feature maps of face images computed by the ResNet18 convolutional network. The proposed approach is shown to achieve high-quality recognition of fake images in facial biometric verification, comparable to currently available alternative solutions. ...

Added: December 12, 2025

An ensemble of modern computer vision models for the deepfake detection task

Pikul A. S., Information Technology Security 2024 Vol. 31 No. 4 P. 116–127

This article explores the potential use of modern computer vision architectures for the task of deepfake detection. The following architectures are considered: EfficientNet, Vision Transformer (ViT), VisionLSTM (ViL), Vision KAN, and Mamba Vision. The novelty of the approach lies in the application and comparison of these architectures, as well as their combination into paired ensembles ...

Added: December 12, 2025

Artificial Neural Networks and Machine Learning. ICANN 2025 International Workshops and Special Sessions: 34th International Conference on Artificial Neural Networks, Kaunas, Lithuania, September 9–12, 2025, Proceedings, Part V

Cham: Springer, 2025.

This book constitutes the refereed proceedings of 34th International Workshops which were held in conjunction with the 34th International Conference on Artificial Neural Networks and Machine Learning, ICANN 2025, held in Kaunas, Lithuania, September 9–12, 2025. The 20 full papers and 8 abstracts included in this workshop volume were carefully reviewed and selected from 42 submissions. ...

Added: September 29, 2025

2025 IEEE International Conference on Image Processing (ICIP)

IEEE, 2025.

Added: September 24, 2025

Multiple watermark embedding in the spatial-frequency domain of images based on a genetic algorithm

Melman A., Evsyutin O., Senyukova O., Computer Optics 2025 Vol. 49 No. 2 P. 273–281

The widespread use of digital content makes the task of protecting author’s and owner’s rights increasingly important, in particular with regard to digital images. Digital watermarking technology is an effective tool that solves many problems associated with proving authorship of images, verifying authenticity, and tracking illegal copying. An effective watermarking algorithm requires achieving high levels ...

Added: March 8, 2025

Automatic Morpheme Segmentation for Russian: Can an Algorithm Replace Experts?

Morozov D., Garipov T., Lyashevskaya O. et al., Journal of Language and Education 2024 Vol. 10 No. 4 P. 71–84

Introduction: Numerous algorithms have been proposed for the task of automatic morpheme segmentation of Russian words. Due to the differences in task formulation and datasets utilized, comparing the quality of these algorithms is challenging. It is unclear whether the errors in the models are due to the ineffectiveness of algorithms themselves or to errors and inconsistencies ...

Added: January 7, 2025

Evolving Safety Protocols: Deep Learning-Enabled Detection of Personal Protective Equipment

In: Lecture Notes in Electrical Engineering, Vol. 489: Applied Physics, System Science and Computers II. Springer, 2019. P. 87–100.

To advance safety protocols, we have employed advanced deep learning algorithms and frameworks to construct an innovative AI model. The designed model detects the usage of personal protective equipment (PPE) by workers in high-risk industries such as construction and manufacturing. We have used Google’s TensorFlow object detection API to modify and train a model for ...

Added: December 30, 2024

Development of a Detector for Stamps on Images

Kseniia Prokudina, Mikhail Skriplyonok, Alexander Vostrikov, in: 2024 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM), 20-24 May 2024. IEEE, 2024. P. 865–869.

Added: November 26, 2024

Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part X. LNCS, volume 14950

Cham: Springer, 2024.

This multi-volume set, LNAI 14941 to LNAI 14950, constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2024, held in Vilnius, Lithuania, in September 2024. ...

Added: November 22, 2024

Image watermarking based on a ratio of DCT coefficient sums using a gradient-based optimizer

Anna Melman, Oleg Evsutin, Computers and Electrical Engineering 2024 Vol. 117 Article 109271

Social networks and various websites provide a lot of tools for publishing and sharing digital images. However, publishing digital content online makes it vulnerable to illegal copying and distribution. An effective method for protecting images is watermarking technology, which embeds a logo or some kind of identifier into an image in an invisible way. If ...

Added: April 28, 2024

An Image Watermarking Algorithm in DCT Domain Based on Optimal Patterns

Anna Melman, Oleg Evsutin, Danil Smirnov, in: 2023 XVIII International Symposium Problems of Redundancy in Information and Control Systems (REDUNDANCY). IEEE, 2023. P. 1–5.

Sharing images via social media and specialized sites creates a copyright issue. Image watermarking methods provide copyright protection for authors and owners of digital content. The security level of a watermarking algorithm depends on a watermark’s resistance to various distorting effects, such as brightness changing, contrast changing, applying a Gaussian filter, and others. At the ...

Added: January 5, 2024

FPGA implementation of robust and low complexity template-based watermarking for digital images

Dzhanashia K., Oleg Evsutin, Multimedia Tools and Applications 2024 Vol. 83 No. 20 P. 58855–58874

Watermarking is a widespread technique for information protection and an invisible alternative to quick response codes. The literature mainly considers software implementations of watermarking methods, even though there are applications for which hardware watermarking solutions become preferable or the only possible option due to increased speed, power, or information safety requirements. A convenient, flexible, and ...

Added: December 14, 2023

Three-way classification for sequences of observations

A. V. Savchenko, L. V. Savchenko, Information Sciences 2023 Vol. 648 Article 119540

This article introduces a novel technique to reduce the computation time for classifying a sequence of observations (frames), such as a video stream, where each observation is described by high-dimensional embeddings extracted by a deep neural network. By using the methodology of granular computing, an observed sequence is represented at various scales using different frame ...

Added: August 27, 2023

Segmentation of Prostate Cancer on TRUS Images Using ML

Zaev R., Romanov A., Solovyev R., in: 2023 International Russian Smart Industry Conference (SmartIndustryCon), 27-31 March 2023. Sochi: IEEE, 2023. P. 460–465.

Medical research has made tremendous progress in detecting various pathologies in the human body. There is still the problem of the speed of the process, and the lack of a sufficient number of trained professionals in this field. Detection of prostate cancer, in particular, without surgery is a very labor-intensive process. A neural network-based ...

Added: July 30, 2023