Depth Map Interpolation using Perceptual Loss

I. Makarov; Vladimir Aliev; Gerasimova Olga; Pavel Polyakov

doi:10.1109/ISMAR-Adjunct.2017.39


P. 93–94.

Makarov I., Vladimir Aliev, Gerasimova Olga, Pavel Polyakov

In this paper, we discuss a semi-dense depth map interpolation method based on a convolutional neural network. We propose a compact neural network architecture with a loss function defined as the Euclidean distance in the feature space of the VGG-16 neural network used for deep visual recognition. The suggested solution shows state-of-the-art performance on synthetic and real datasets. Together with LSD-SLAM, the method could be used to provide a dense depth map for interaction purposes, such as creating a first-person game in AR/MR or a perception module for an autonomous vehicle.
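
The loss described in the abstract can be written out as a rough sketch. Here $\phi$ stands for the activations of a fixed, pretrained VGG-16 at some chosen layer, $f_\theta$ for the interpolation network with trainable weights $\theta$, $S$ for the semi-dense input depth map, and $D$ for the ground-truth dense depth map. These symbols, the choice of layer, and the absence of any normalization or squaring are assumptions made for illustration; the abstract does not specify them.

```latex
\mathcal{L}_{\mathrm{perc}}(\theta)
  = \left\lVert \phi\bigl(f_\theta(S)\bigr) - \phi(D) \right\rVert_2
```

Under this reading, only $\theta$ is updated during training while $\phi$ stays frozen, so the predicted and reference depth maps are compared through VGG-16 features rather than pixel by pixel.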

Keywords: first-person shooter; depth map; mixed reality; semi-dense depth map interpolation; deep convolutional neural networks

In book

Adjunct Proceedings of 2017 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct)

NY: IEEE, 2017.

Teaching Emotion Recognition with the Mobile Application «ТРОПЭМО»

Shadrina E. V., Мохова В. О., Загоскин В. А. et al., Нижегородский психологический альманах 2024 No. 2

The article considers the problem of learning to recognize emotions from pictures. A review and analysis of domestic and foreign work on emotional intelligence was carried out. Its formation, its influence on human activity, and existing variants of its structure were considered, and common features in the understanding of emotional ...

Added: April 9, 2026

A Method for Detecting the Spatial Position of Hands from Depth Camera Data for Low-Performance Computing Devices

Медведев Д. С., Ignatov A., Научно-технический вестник информационных технологий, механики и оптики 2022 Vol. 22 No. 2 P. 410–414

A method for estimating arm aiming direction on low-performance Internet of Things devices is proposed. It uses Human Pose Estimation (HPE) algorithms to retrieve human skeleton key points. From these key points, a model of arm aiming directions is calculated. Two well-known HPE methods (PoseNet and OpenPose) are examined. These algorithms have been tested and compared ...

Added: June 1, 2022

Self-supervised recurrent depth estimation with attention mechanisms

Makarov I., Bakhanova M., Nikolenko S. et al., PeerJ Computer Science 2022 Vol. 8 Article e865

Depth estimation has been an essential task for many computer vision applications, especially in autonomous driving, where safety is paramount. Depth can be estimated not only with traditional supervised learning but also via a self-supervised approach that relies on camera motion and does not require ground truth depth maps. Recently, major improvements have been introduced ...

Added: February 1, 2022

On the generalization ability of data-driven models in the problem of total cloud cover retrieval

Krinitskiy M., Alexandrova M., Verezemskaya P. et al., Remote Sensing 2021 Vol. 13 No. 2 Article 326

Total Cloud Cover (TCC) retrieval from ground-based optical imagery is a problem that has been tackled by several generations of researchers. The number of human-designed algorithms for the estimation of TCC grows every year. However, there has been no considerable progress in terms of quality, mostly due to the lack of a systematic approach to the ...

Added: September 24, 2021

Fast Depth Reconstruction Using Deep Convolutional Neural Networks

Dmitrii Maslov, Makarov I., in: Advances in Computational Intelligence: 16th International Work-Conference on Artificial Neural Networks, IWANN 2021, Virtual Event, June 16–18, 2021, Proceedings, Part I. Vol. 12861. Springer, 2021. Ch. 38 P. 456–467.

In this paper, we study depth reconstruction via RGB-based, Sparse-Depth, and RGBd approaches. We showed that the combination of the RGB and Sparse Depth approaches in the RGBd scenario provides the best results. We also proved that the models' performance can be further tuned via proper selection of architecture blocks and the number of depth points guiding RGB-to-depth reconstruction. ...

Added: September 1, 2021

Deep Convolutional Neural Networks Help Scoring Tandem Mass Spectrometry Data in Database-Searching Approaches

Kudriavtseva P., Kashkinov M., Kertész-Farkas A., Journal of Proteome Research 2021 Vol. 20 No. 10 P. 4708–4717

Spectrum annotation is a challenging task due to the presence of unexpected peptide fragmentation ions as well as the inaccuracy of the detectors of the spectrometers. We present a deep convolutional neural network, called Slider, which learns an optimal feature extraction in its kernels for scoring mass spectrometry (MS)/MS spectra to increase the number of ...

Added: August 30, 2021

American and Russian Sign Language Dactyl Recognition

Makarov I., Nikolay Veldyaykin, Maxim Chertkov et al., in: Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments (PETRA '19). NY: ACM, 2019. P. 204–210.

Sign languages are the main way for people from the deaf community to communicate with other people. In this paper, we compare several real-time sign language dactyl recognition systems using deep convolutional neural networks. Our system is able to recognize words from natural language gestured using signs for each letter. We evaluate our approach on ...

Added: July 10, 2021

Event Recognition with Automatic Album Detection based on Sequential Grouping of Confidence Scores and Neural Attention

Savchenko A., in: Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). Piscataway: IEEE, 2020. P. 1–8.

In this paper, a new formulation of the event recognition task is examined: event categories must be predicted for a gallery of images in which the albums (groups of photos corresponding to a single event) are unknown. A novel two-stage approach is proposed. First, features are extracted from each photo using the pre-trained convolutional ...

Added: October 15, 2020

Sequential Analysis with Specified Confidence Level and Adaptive Convolutional Neural Networks in Image Recognition

Savchenko A., in: Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). Piscataway: IEEE, 2020. P. 1–8.

In this paper, the problem of the high computational complexity of deep convolutional nets in image recognition is considered. An existing framework of adaptive neural networks is extended by appending a separate classifier to intermediate layers. The hierarchical representations of the input image are analyzed sequentially. If the first classifier returns a sufficiently high confidence score, the ...

Added: October 15, 2020

A New Sport Teams Logo Dataset for Detection Tasks

Kuznetsov A., Savchenko A., in: Proceedings of the International Conference on Computer Vision and Graphics (ICCVG 2020). Vol. 12334. Cham: Springer, 2020. Ch. 8 P. 87–97.

In this research, we introduce a new labelled SportLogo dataset that contains images from two sports: hockey (NHL) and basketball (NBA). This dataset presents several challenges typical of logo detection tasks. The large number of occlusions and logo view changes during games leads to ambiguity when a straightforward detection approach is used. ...

Added: October 1, 2020

Detection and Recognition of Food in Photo Galleries for Analysis of User Preferences

Miasnikov E., Savchenko A., in: Proceedings of International Conference on Image Analysis and Recognition (ICIAR 2020). Vol. 12131. Cham: Springer, 2020. Ch. 9 P. 83–94.

Food analysis is one of the most important parts of user preference prediction engines for recommendation systems in the travel domain. In this paper, we describe and study a neural network method for recognizing food in a gallery of photos taken with mobile devices. The described method consists of three main stages, ...

Added: October 1, 2020

Extraction of User Preferences Based on Methods for Automatic Generation of Textual Descriptions of Photo Album Images

Kharchevnikova A., Savchenko A., Компьютерная оптика 2020 Vol. 44 No. 4 P. 618–626

The paper considers the task of extracting a user's preferences from their photo album. A new approach is proposed, based on automatic generation of textual descriptions of photographs followed by classification of those descriptions. Known methods for creating image annotations based on convolutional and recurrent (Long short-term memory) neural networks are analyzed. Using the Google's Conceptual Captions dataset, new models are trained in which ...

Added: September 16, 2020

Event Recognition Based on Classification of Generated Image Captions

Savchenko A., Miasnikov E., in: Advances in Intelligent Data Analysis XVIII (IDA 2020). Vol. 12080. Cham: Springer, 2020. Ch. 33 P. 418–430.

In this paper, we consider the problem of event recognition on single images. In contrast to conventional fine-tuning of convolutional neural networks (CNN), we propose to use image captioning, i.e., a generative model that converts images to textual descriptions. The motivation here is the possibility to combine conventional CNNs with a completely different approach in ...

Added: May 17, 2020

American and Russian Sign Language Dactyl Recognition and Text2Sign Translation

Makarov I., Veldyaykin N., Maxim Chertkov et al., in: Analysis of Images, Social Networks and Texts. 8th International Conference AIST 2019. Springer, 2019. P. 309–320.

Sign language is the main means of communication for people from the deaf community. However, most other people do not know sign language. In this paper, we review several real-time sign language dactyl recognition systems using deep convolutional neural networks. These systems are able to recognize dactylized words gestured by signs for each letter. We evaluate ...

Added: February 4, 2020

Multi-label Image Set Recognition in Visually-Aware Recommender Systems

Demochkin K., Savchenko A., in: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832. Cham: Springer, 2019. Ch. 26 P. 291–297.

In this paper, we focus on the problem of multi-label image recognition for visually-aware recommender systems. We propose a two-stage approach in which a deep convolutional neural network is first fine-tuned on a part of the training set. Second, an attention-based aggregation network is trained to compute the weighted average of visual features in ...

Added: December 22, 2019

On Reproducing Semi-dense Depth Map Reconstruction using Deep Convolutional Neural Networks with Perceptual Loss

Makarov I., Dmitrii Maslov, Gerasimova O. et al., in: Proceedings of 27th ACM International Conference on Multimedia. NY: ACM, 2019. P. 1080–1084.

In our recent papers, we proposed a new family of residual convolutional neural networks trained for semi-dense and sparse depth reconstruction without using the RGB channel. The proposed models can be used with low-resolution depth sensors or SLAM methods estimating partial depth with certain distributions. We proposed using perceptual loss for training depth reconstruction in ...

Added: September 16, 2019

Deep Reinforcement Learning in VizDoom First-Person Shooter for Health Gathering Scenario

Dmitry Akimov, Makarov I., in: Proceedings of 11th International Conference on Advances in Multimedia (MMEDIA'19). Lansing: ThinkMind, 2019. P. 59–64.

In this work, we study the effect of combining existing improvements for Deep Q-Networks (DQN) in Markov Decision Processes (MDP) and Partially Observable MDP (POMDP) settings. Combinations of several heuristics, such as Distributional Learning and Dueling architecture improvements, are well studied for MDP. We propose a new combination method of simple DQN extensions and develop a ...

Added: July 29, 2019

Fast Depth Map Super-Resolution Using Deep Neural Network

Alisa Korinevskaya, Makarov I., in: Proceedings of IEEE International Symposium on Mixed and Augmented Reality (ISMAR'18). NY: IEEE, 2019. P. 117–122.

Depth map super-resolution is a challenging computer vision problem. In this paper, we present two deep convolutional neural networks solving the problem of single depth map super-resolution. Both networks learn a residual decomposition and are trained with a specific perceptual loss that improves the sharpness and perceptual quality of the upsampled depth map. Several experiments on various depth super-resolution benchmark ...

Added: July 29, 2019