Depth Map Interpolation using Perceptual Loss
P. 93–94.
In this paper, we discuss a semi-dense depth map interpolation method based on a convolutional neural network. We propose a compact neural network architecture whose loss function is defined as the Euclidean distance in the feature space of VGG-16, a neural network used for deep visual recognition. The suggested solution shows state-of-the-art performance on synthetic and real datasets. Together with LSD-SLAM, the method could be used to provide a dense depth map for interaction purposes, such as creating a first-person game in AR/MR or a perception module for an autonomous vehicle.
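The loss described above compares the predicted and reference depth maps in a feature space rather than pixel by pixel. The following is a minimal sketch of that idea, not the paper's implementation: `toy_features` is a hypothetical stand-in for VGG-16 activations (it uses simple finite differences so the example runs without any deep learning library), and the names `perceptual_loss` and `DepthMap` are illustrative assumptions.

```python
import math
from typing import Callable, List

DepthMap = List[List[float]]


def toy_features(depth: DepthMap) -> List[float]:
    """Hypothetical stand-in for a VGG-16 feature extractor.

    Returns horizontal and vertical finite differences, which respond to
    edges in the depth map. A real setup would instead take activations
    from a layer of a pretrained VGG-16.
    """
    rows, cols = len(depth), len(depth[0])
    feats: List[float] = []
    for r in range(rows):
        for c in range(cols - 1):
            feats.append(depth[r][c + 1] - depth[r][c])  # horizontal difference
    for r in range(rows - 1):
        for c in range(cols):
            feats.append(depth[r + 1][c] - depth[r][c])  # vertical difference
    return feats


def perceptual_loss(
    pred: DepthMap,
    target: DepthMap,
    features: Callable[[DepthMap], List[float]] = toy_features,
) -> float:
    """Euclidean distance between the feature vectors of two depth maps."""
    f_pred = features(pred)
    f_target = features(target)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_pred, f_target)))


if __name__ == "__main__":
    flat = [[1.0, 1.0], [1.0, 1.0]]
    edge = [[1.0, 2.0], [1.0, 2.0]]
    print(perceptual_loss(flat, flat))  # identical maps give zero loss
    print(perceptual_loss(flat, edge))  # differing edge structure gives a positive loss
```

With this toy extractor the loss depends only on edge structure, so two maps that differ by a constant offset score zero. Swapping in real VGG-16 features would change which differences are penalized, but the distance computation stays the same.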
Shadrina E. V., Мохова В. О., Загоскин В. А. et al., Нижегородский психологический альманах 2024 No. 2
The article considers the problem of learning to recognize emotions from pictures. Domestic and foreign research on emotional intelligence is reviewed and analyzed. Its formation, influence on human activity and existing variants of its structure were considered, and common features in the understanding of emotional ...
Added: April 9, 2026
Медведев Д. С., Ignatov A., Научно-технический вестник информационных технологий, механики и оптики 2022 Vol. 22 No. 2 P. 410–414
A method of arm aiming direction estimation for low-performance Internet of Things devices is proposed. It uses Human Pose Estimation (HPE) algorithms to retrieve human skeleton key points. From these key points, an arm aiming direction model is calculated. Two well-known HPE methods (PoseNet and OpenPose) are examined. These algorithms have been tested and compared ...
Added: June 1, 2022
Makarov I., Bakhanova M., Nikolenko S. et al., PeerJ Computer Science 2022 Vol. 8 Article e865
Depth estimation has been an essential task for many computer vision applications, especially in autonomous driving, where safety is paramount. Depth can be estimated not only with traditional supervised learning but also via a self-supervised approach that relies on camera motion and does not require ground truth depth maps. Recently, major improvements have been introduced ...
Added: February 1, 2022
Krinitskiy M., Alexandrova M., Verezemskaya P. et al., Remote Sensing 2021 Vol. 13 No. 2 Article 326
Total Cloud Cover (TCC) retrieval from ground-based optical imagery is a problem that has been tackled by several generations of researchers. The number of human-designed algorithms for the estimation of TCC grows every year. However, there has been no considerable progress in terms of quality, mostly due to the lack of a systematic approach to the ...
Added: September 24, 2021
Dmitrii Maslov, Makarov I., in: Advances in Computational Intelligence: 16th International Work-Conference on Artificial Neural Networks, IWANN 2021, Virtual Event, June 16–18, 2021, Proceedings, Part I. Vol. 12861. Springer, 2021. Ch. 38 P. 456–467.
In this paper, we study depth reconstruction via RGB-based, Sparse-Depth, and RGBd approaches. We showed that combining the RGB and Sparse-Depth approaches in the RGBd scenario provides the best results. We also proved that model performance can be further tuned via proper selection of architecture blocks and of the number of depth points guiding RGB-to-depth reconstruction. ...
Added: September 1, 2021
Kudriavtseva P., Kashkinov M., Kertész-Farkas A., Journal of Proteome Research 2021 Vol. 20 No. 10 P. 4708–4717
Spectrum annotation is a challenging task due to the presence of unexpected peptide fragmentation ions as well as the inaccuracy of the detectors of the spectrometers. We present a deep convolutional neural network, called Slider, which learns an optimal feature extraction in its kernels for scoring mass spectrometry (MS)/MS spectra to increase the number of ...
Added: August 30, 2021
Makarov I., Nikolay Veldyaykin, Maxim Chertkov et al., in: Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments (PETRA '19). NY: ACM, 2019. P. 204–210.
Sign languages are the main way for people from the deaf community to communicate with other people. In this paper, we have compared several real-time sign language dactyl recognition systems using deep convolutional neural networks. Our system is able to recognize words from natural language gestured using signs for each letter. We evaluate our approach on ...
Added: July 10, 2021
Savchenko A., in: Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). Piscataway: IEEE, 2020. P. 1–8.
In this paper, a new formulation of the event recognition task is examined: it is required to predict event categories given a gallery of images for which albums (groups of photos corresponding to a single event) are unknown. A novel two-stage approach is proposed. First, features are extracted from each photo using the pre-trained convolutional ...
Added: October 15, 2020
Savchenko A., in: Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). Piscataway: IEEE, 2020. P. 1–8.
In this paper, the problem of the high computational complexity of deep convolutional nets in image recognition is considered. An existing framework of adaptive neural networks is extended by appending a separate classifier to intermediate layers. The hierarchical representations of the input image are analyzed sequentially. If the first classifier returns a sufficiently high confidence score, the ...
Added: October 15, 2020
Kuznetsov A., Savchenko A., in: Proceedings of the International Conference on Computer Vision and Graphics (ICCVG 2020). Vol. 12334. Cham: Springer, 2020. Ch. 8 P. 87–97.
In this research, we introduce a new labelled SportLogo dataset that contains images of two kinds of sports: hockey (NHL) and basketball (NBA). This dataset presents several challenges typical for logo detection tasks. The large number of occlusions and logo view changes during games makes a straightforward detection approach ambiguous. ...
Added: October 1, 2020
Miasnikov E., Savchenko A., in: Proceedings of International Conference on Image Analysis and Recognition (ICIAR 2020). Vol. 12131. Cham: Springer, 2020. Ch. 9 P. 83–94.
Food analysis is one of the most important parts of user preference prediction engines for recommendation systems in the travel domain. In this paper, we describe and study a neural network method that recognizes food in a gallery of photos taken with mobile devices. The described method consists of three main stages, ...
Added: October 1, 2020
Kharchevnikova A., Savchenko A., Компьютерная оптика 2020 Vol. 44 No. 4 P. 618–626
The paper considers the problem of extracting a user's preferences from their photo album. A new approach is proposed, based on automatically generating textual descriptions of photos and then classifying those descriptions. Known methods for generating image annotations based on convolutional and recurrent (Long short-term memory) neural networks are analyzed. Using the Google’s Conceptual Captions dataset, new models are trained in which ...
Added: September 16, 2020
Savchenko A., Miasnikov E., in: Advances in Intelligent Data Analysis XVIII (IDA 2020). Vol. 12080. Cham: Springer, 2020. Ch. 33 P. 418–430.
In this paper, we consider the problem of event recognition on single images. In contrast to conventional fine-tuning of convolutional neural networks (CNN), we propose using image captioning, i.e., a generative model that converts images to textual descriptions. The motivation here is the possibility to combine conventional CNNs with a completely different approach in ...
Added: May 17, 2020
Makarov I., Veldyaykin N., Maxim Chertkov et al., in: Analysis of Images, Social Networks and Texts. 8th International Conference AIST 2019. Springer, 2019. P. 309–320.
Sign language is the main way to communicate for people from the deaf community. However, most people do not know sign language. In this paper, we review several real-time sign language dactyl recognition systems using deep convolutional neural networks. These systems are able to recognize dactylized words gestured by signs for each letter. We evaluate ...
Added: February 4, 2020
Demochkin K., Savchenko A., in: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832. Cham: Springer, 2019. Ch. 26 P. 291–297.
In this paper, we focus on the problem of multi-label image recognition for visually-aware recommender systems. We propose a two-stage approach in which a deep convolutional neural network is first fine-tuned on a part of the training set. Second, an attention-based aggregation network is trained to compute the weighted average of visual features in ...
Added: December 22, 2019
Makarov I., Dmitrii Maslov, Gerasimova O. et al., in: Proceedings of 27th ACM International Conference on Multimedia. NY: ACM, 2019. P. 1080–1084.
In our recent papers, we proposed a new family of residual convolutional neural networks trained for semi-dense and sparse depth reconstruction without using the RGB channel. The proposed models can be used in low-resolution depth sensors or SLAM methods estimating partial depth with certain distributions. We proposed using perceptual loss for training depth reconstruction in ...
Added: September 16, 2019
Dmitry Akimov, Makarov I., in: Proceedings of 11th International Conference on Advances in Multimedia (MMEDIA'19). Lansing: ThinkMind, 2019. P. 59–64.
In this work, we study the effect of combining existing improvements for Deep Q-Networks (DQN) in Markov Decision Processes (MDP) and Partially Observable MDP (POMDP) settings. Combinations of several heuristics for MDP, such as Distributional Learning and Dueling architecture improvements, are well studied. We propose a new combination method of simple DQN extensions and develop a ...
Added: July 29, 2019
Alisa Korinevskaya, Makarov I., in: Proceedings of IEEE International Symposium on Mixed and Augmented Reality (ISMAR'18). NY: IEEE, 2019. P. 117–122.
Depth map super-resolution is a challenging computer vision problem. In this paper, we present two deep convolutional neural networks solving the problem of single depth map super-resolution. Both networks learn a residual decomposition and are trained with a specific perceptual loss that improves the sharpness and perceptual quality of the upsampled depth map. Several experiments on various depth super-resolution benchmark ...
Added: July 29, 2019