User Modeling on Mobile Device Based on Facial Clustering and Object Detection in Photos and Videos
The article describes an approach to extracting user preferences from the analysis of the gallery of photos and videos on a mobile device. We propose to first use fast SSD-based methods to detect objects of interest in offline mode, directly on the mobile device. Next, we perform facial analysis of all visual data: we extract feature vectors from the detected facial regions, cluster them, and select public photos and videos that do not contain faces from the large clusters corresponding to the owner of the mobile device and his or her friends and relatives. At the second stage, these public images are processed on a remote server using very accurate but rather slow object detectors. An experimental study of several contemporary detectors is presented on a specially designed subset of the MS COCO, ImageNet and Open Images datasets.
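The privacy-filtering stage described above can be sketched in a few lines. The sketch below is a hypothetical illustration only: the greedy distance-threshold clustering, the `threshold` and `min_cluster_size` parameters, and the helper names are all stand-ins, since the abstract does not specify the clustering method actually used.

```python
import numpy as np

def cluster_faces(embeddings, threshold=0.5):
    """Greedy distance-threshold clustering of face embeddings.

    Each embedding joins the first cluster whose centroid lies within
    `threshold` (Euclidean distance); otherwise it starts a new cluster.
    A stand-in for the (unspecified) clustering used in the paper.
    """
    clusters = []   # list of lists of face indices
    centroids = []
    for i, e in enumerate(embeddings):
        placed = False
        for c, (idxs, mu) in enumerate(zip(clusters, centroids)):
            if np.linalg.norm(e - mu) < threshold:
                idxs.append(i)
                centroids[c] = np.mean([embeddings[j] for j in idxs], axis=0)
                placed = True
                break
        if not placed:
            clusters.append([i])
            centroids.append(e.copy())
    return clusters

def select_public_photos(photo_faces, clusters, min_cluster_size=2):
    """Return indices of photos containing no face from a 'large' cluster
    (i.e., the device owner, friends, relatives); only these photos would
    be sent to the remote server for accurate object detection.

    photo_faces: list mapping photo index -> list of face indices in it.
    """
    private_faces = set()
    for idxs in clusters:
        if len(idxs) >= min_cluster_size:
            private_faces.update(idxs)
    return [p for p, faces in enumerate(photo_faces)
            if not any(f in private_faces for f in faces)]
```

For example, if three faces cluster together (a frequently appearing person) and a fourth is a singleton, only photos without the clustered faces are marked public.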
The Tenth International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies (UBICOMM 2016), held between October 9 and 13, 2016 in Venice, Italy, continued a series of events addressing the fundamentals of ubiquitous systems and the new applications related to them. The rapid advances in ubiquitous technologies build on more than 35 years of research in distributed computing systems and more than two decades of mobile computing. The ubiquity vision is becoming a reality. Hardware and software components have evolved to deliver functionality under failure-prone environments with limited resources. The advent of web services and the progress on wearable devices, ambient components, user-generated content, mobile communications, and new business models have generated new applications and services. The conference bridged software issues and hardware challenges through mobile communications. Advances in web services technologies, along with their integration into mobility, online and new business models, provide a technical infrastructure that enables the progress of mobile services and applications. These include dynamic and on-demand services, context-aware services, and mobile web services. While driving new business models and new online services, particular techniques must be developed for web service composition, web service-driven system design methodology, creation of web services, and on-demand web services. As mobile and ubiquitous computing becomes a reality, more formal and informal learning will take place outside the confines of the traditional classroom. Two trends converge to make this possible: increasingly powerful cell phones and PDAs, and improved access to wireless broadband. At the same time, due to the increasing complexity, modern learners will need tools that operate in an intuitive manner and are flexibly integrated into the surrounding learning environment.
Educational services will become more customized and personalized, and more frequently subject to change. Learning and teaching are now becoming less tied to physical locations, co-located members of a group, and co-presence in time. Learning and teaching increasingly take place in fluid combinations of virtual and "real" contexts, and fluid combinations of presence in time, space and participation in community. For the learner, full access to and an abundance of communicative opportunities and information retrieval represent new challenges and affordances. Consequently, the educational challenges are numerous at the intersection of technology development, curriculum development, content development and educational infrastructure. The event was very competitive in its selection process and very well received by the international scientific and industrial communities. As such, it attracted excellent contributions and active participation from all over the world. We were very pleased to receive a large number of top quality contributions. The conference had the following tracks: Ubiquitous Software and Security; Mobility; Context-awareness in Intelligent Systems and Smart Spaces; Ubiquitous Mobile Services Trends and Challenges; Users, Applications, and Business models; Ubiquitous Devices and Operative Systems; Collaborative Ubiquitous Systems; Smart Spaces and Internet of Things; Toward Emerging Technology for Harbor Systems and Services. We take here the opportunity to warmly thank all the members of the UBICOMM 2016 technical program committee, as well as the numerous reviewers. The creation of such a high quality conference program would not have been possible without their involvement. We also kindly thank all the authors who dedicated much of their time and effort to contribute to UBICOMM 2016. We truly believe that, thanks to all these efforts, the final conference program consisted of top quality contributions.
Also, this event could not have become a reality without the support of many individuals, organizations and sponsors. We gratefully thank the members of the UBICOMM 2016 organizing committee for their help in handling the logistics and for their work that made this professional meeting a success. We hope UBICOMM 2016 was a successful international forum for exchanging ideas and results between academia and industry and for promoting further progress in the field of ubiquitous systems and the new applications related to them. We also hope that Venice, Italy, provided a pleasant environment during the conference and that everyone saved some time to enjoy the unique charm of the city.
The sixteen-volume set comprising the LNCS volumes 11205-11220 constitutes the refereed proceedings of the 15th European Conference on Computer Vision, ECCV 2018, held in Munich, Germany, in September 2018.
The 776 revised papers presented were carefully reviewed and selected from 2439 submissions. The papers are organized in topical sections on learning for vision; computational photography; human analysis; human sensing; stereo and reconstruction; optimization; matching and recognition; video attention; and poster sessions.
We present a system for large-scale automatic traffic sign recognition and mapping, and we experimentally justify the design choices made for the different components of the system. Our system works with more than 140 different classes of traffic signs and, owing to training on synthetically generated images, does not require labor-intensive labelling of a large amount of training data. We evaluated our system on a large dataset of Russian traffic signs and made this dataset publicly available to encourage future comparison.
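Training on synthetically generated images, as described above, typically means compositing sign templates onto background scenes with random geometric and photometric jitter. The sketch below is a minimal, hypothetical illustration of that idea (random scale, position, and brightness only); the actual generation pipeline of the system is not detailed in the abstract and is surely richer.

```python
import numpy as np

def nn_resize(img, new_h, new_w):
    """Nearest-neighbour resize of an HxWxC image via index arrays
    (kept numpy-only for the sake of a self-contained sketch)."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]

def synthesize(template, background, rng):
    """Paste a randomly scaled, brightness-jittered sign template onto a
    background image; return the image and the ground-truth bounding box.
    Scale and brightness ranges are illustrative assumptions."""
    bh, bw = background.shape[:2]
    size = rng.integers(16, min(bh, bw) // 2)       # random scale (pixels)
    sign = nn_resize(template, size, size)
    gain = rng.uniform(0.6, 1.4)                    # brightness jitter
    sign = np.clip(sign.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    y = rng.integers(0, bh - size)                  # random placement
    x = rng.integers(0, bw - size)
    out = background.copy()
    out[y:y + size, x:x + size] = sign
    return out, (x, y, size, size)                  # (x, y, w, h) label
```

Each call yields an image together with an exact bounding-box label for free, which is precisely what removes the need for manual annotation.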
A new public dataset of traffic sign images is presented. The dataset is intended for training and testing traffic sign recognition algorithms. We describe the dataset structure and guidelines for working with the dataset, and compare it with previously published traffic sign datasets. An evaluation of modern detection and classification algorithms conducted on the proposed dataset has shown that existing methods for recognizing a wide class of traffic signs do not achieve the accuracy and completeness required for a number of applications.
The EPiC Series in Language and Linguistics publishes high quality collections of papers in language, linguistics and related areas.
It has been shown that the activations invoked by an image within the top layers of a large convolutional neural network provide a high-level descriptor of the visual content of the image. In this paper, we investigate the use of such descriptors (neural codes) within the image retrieval application. In experiments with several standard retrieval benchmarks, we establish that neural codes perform competitively even when the convolutional neural network has been trained for an unrelated classification task (e.g. Image-Net). We also evaluate the improvement in the retrieval performance of neural codes when the network is retrained on a dataset of images that are similar to images encountered at test time. We further evaluate the performance of the compressed neural codes and show that a simple PCA compression provides very good short codes that give state-of-the-art accuracy on a number of datasets. In general, neural codes turn out to be much more resilient to such compression in comparison to other state-of-the-art descriptors. Finally, we show that discriminative dimensionality reduction trained on a dataset of pairs of matched photographs improves the performance of PCA-compressed neural codes even further. Overall, our quantitative experiments demonstrate the promise of neural codes as visual descriptors for image retrieval.
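The PCA-compression-and-retrieval step described above is straightforward to sketch. The snippet below is a minimal numpy-only illustration, assuming the neural codes are already available as row vectors (random vectors stand in for real CNN activations here); the target dimensionality and the cosine ranking are common choices but are assumptions, not the paper's exact setup.

```python
import numpy as np

def fit_pca(codes, dim):
    """Learn a PCA projection (mean + top `dim` principal axes) from
    a matrix of neural codes, shape (n_images, n_features)."""
    mean = codes.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal axes
    _, _, vt = np.linalg.svd(codes - mean, full_matrices=False)
    return mean, vt[:dim]

def compress(codes, mean, axes):
    """Project codes to short descriptors and L2-normalize them,
    so that a plain dot product equals cosine similarity."""
    z = (codes - mean) @ axes.T
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def retrieve(query, database):
    """Return database indices ranked by cosine similarity to the query."""
    return np.argsort(-(database @ query))
```

Because the short codes are L2-normalized, retrieval reduces to a single matrix-vector product followed by a sort, which is what makes compressed neural codes attractive at scale.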
We investigate a specific machine vision problem, namely, video-based detection of a moving forklift truck. It is shown that the detection quality of state-of-the-art local descriptors (SURF, SIFT, etc.) is not satisfactory if the resolution is low and the illumination changes dramatically. In this paper, we propose a simple mathematical morphology algorithm to detect the presence of cargo on the forklift truck. First, the movement direction is estimated by updating the motion history image, and the front part of the moving object is obtained. Next, contours are detected, and morphological operations in front of the moving object are used to estimate simple geometric features of an empty forklift. The experimental study shows that the proposed method has a 40% lower FAR and a 27% lower FRR in comparison with conventional matching of local descriptors. Moreover, our algorithm is 7 times faster.
The problem of automatic detection of a moving forklift truck in video data is explored. In computer vision terms, this task is formulated as moving object detection in a noisy environment. It is shown that state-of-the-art local descriptors (SURF, SIFT, FAST, ORB) do not provide satisfactory detection quality if the camera resolution is low, the lighting changes dramatically and shadows are present. In this paper we propose a simple mathematical morphology algorithm to detect the presence of cargo on the forklift truck. Its first step is the estimation of the movement direction and the front part of the truck by updating the motion history image. The second step is the application of Canny contour detection and binary morphological operations in front of the moving object to estimate simple geometric features of an empty forklift. The algorithm is implemented with the OpenCV library. Our experimental study shows that the best results are achieved when the difference in the width of the bounding rectangles is used as the feature. Namely, the detection accuracy is 78.7% (compared with 40% achieved by the best local descriptor), while the average frame processing time is only 5 ms (compared with 35 ms for the fastest descriptor).
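The best-performing feature reported above, the difference in bounding-rectangle widths, is simple enough to sketch. The snippet below is a numpy-only illustration of that decision rule, assuming the motion-history and Canny/morphology steps (done with OpenCV in the paper) have already produced a binary mask of the front region; the `tol` threshold is a hypothetical calibration parameter.

```python
import numpy as np

def bounding_width(mask):
    """Width of the bounding rectangle of the foreground pixels
    in a binary mask (0 if the mask is empty)."""
    cols = np.flatnonzero(mask.any(axis=0))
    return 0 if cols.size == 0 else int(cols[-1] - cols[0] + 1)

def carries_cargo(front_mask, empty_width, tol=5):
    """Decide whether the forklift carries cargo by comparing the width of
    the front-region bounding rectangle with the calibrated width of an
    empty forklift (`empty_width`, in pixels). `tol` is an assumed
    tolerance; the paper selects this feature experimentally."""
    return bounding_width(front_mask) - empty_width > tol
```

The intuition is that cargo on the forks widens the silhouette of the front region well beyond that of an empty forklift, so a single width comparison per frame suffices, which is consistent with the reported 5 ms per-frame cost.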
We analyze a way to increase the computational efficiency of video-based image recognition methods that match high-dimensional feature vectors extracted by deep convolutional neural networks. We propose an algorithm for approximate nearest neighbor search. At the first step, for a given video frame, the algorithm verifies the reference image obtained when recognizing the previous frame. After that, the frame is compared with a small number of reference images. Each next reference image to examine is chosen so as to maximize the conditional probability density of the distances to the reference instances tested at previous steps. To decrease the required memory, we precompute only the distances from all the images to a small number of instances (pivots). In experiments with face photos from the Labeled Faces in the Wild and PubFig83 datasets and with video data from YouTube Faces, we show that our algorithm accelerates the recognition procedure by 1.4–4 times compared with known approximate nearest neighbor methods.
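The pivot idea above can be sketched as follows. This is a simplified illustration: instead of the paper's conditional-probability-density criterion for choosing the next reference image, the sketch ranks candidates by a triangle-inequality lower bound computed from the precomputed pivot distances, and the `budget` parameter is a stand-in for the stopping rule.

```python
import numpy as np

def precompute_pivot_table(references, pivots):
    """Distances from every reference image to every pivot,
    shape (n_refs, n_pivots); stored instead of all pairwise distances,
    which is what saves memory."""
    diff = references[:, None, :] - pivots[None, :, :]
    return np.linalg.norm(diff, axis=2)

def approx_nn(query, references, pivots, pivot_table, budget):
    """Approximate nearest neighbour: rank references by a pivot-based
    lower bound on their distance to the query, then compute exact
    distances only for the `budget` best-ranked candidates."""
    q_piv = np.linalg.norm(pivots - query, axis=1)   # query-to-pivot dists
    # max_j |d(ref, p_j) - d(query, p_j)| is a valid lower bound on the
    # true distance d(ref, query) by the triangle inequality
    score = np.abs(pivot_table - q_piv).max(axis=1)
    candidates = np.argsort(score)[:budget]
    exact = np.linalg.norm(references[candidates] - query, axis=1)
    return candidates[int(np.argmin(exact))]
```

Only `n_refs × n_pivots` distances are stored, and each query costs a handful of pivot distances plus `budget` exact comparisons instead of a full linear scan, which is the source of the speedup the abstract reports.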