Depth Map Interpolation using Perceptual Loss

I. Makarov; Vladimir Aliev; Gerasimova Olga; Pavel Polyakov

doi:10.1109/ISMAR-Adjunct.2017.39

P. 93–94.

Makarov I., Vladimir Aliev, Gerasimova Olga, Pavel Polyakov

In this paper, we discuss a semi-dense depth map interpolation method based on a convolutional neural network. We propose a compact neural network architecture with a loss function defined as the Euclidean distance in the feature space of the VGG-16 neural network used for deep visual recognition. The suggested solution shows state-of-the-art performance on synthetic and real datasets. Together with LSD-SLAM, the method could be used to provide a dense depth map for interaction purposes, such as creating a first-person game in AR/MR or a perception module for an autonomous vehicle.
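
The loss described in the abstract can be illustrated with a minimal sketch. The paper uses VGG-16 features; the sketch below instead assumes a hypothetical `toy_features` function (ReLU of finite differences) as a stand-in, so that it runs without any deep learning library. The names `toy_features`, `perceptual_loss`, `sharp`, and `blurred` are illustrative assumptions and are not taken from the paper, and whether the authors use the plain or squared distance, or which VGG-16 layers they use, is not stated here.

```python
import math
from typing import Callable, List

Image = List[List[float]]  # a depth map as rows of float values


def toy_features(img: Image) -> List[float]:
    """Hypothetical stand-in for a VGG-16 feature extractor.

    Computes horizontal and vertical finite differences and passes
    them through ReLU, which loosely mimics an edge-sensitive
    convolutional layer. A real setup would use VGG-16 activations.
    """
    height, width = len(img), len(img[0])
    feats: List[float] = []
    for y in range(height - 1):
        for x in range(width - 1):
            dx = img[y][x + 1] - img[y][x]
            dy = img[y + 1][x] - img[y][x]
            feats.append(max(dx, 0.0))
            feats.append(max(dy, 0.0))
    return feats


def perceptual_loss(
    pred: Image,
    target: Image,
    features: Callable[[Image], List[float]] = toy_features,
) -> float:
    """Euclidean distance between feature vectors of two depth maps."""
    f_pred, f_target = features(pred), features(target)
    if len(f_pred) != len(f_target):
        raise ValueError("depth maps must have the same shape")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_pred, f_target)))


# A sharp depth edge versus a smoothed version of the same edge.
sharp = [[0.0, 1.0], [0.0, 1.0]]
blurred = [[0.0, 0.5], [0.0, 0.5]]

print(perceptual_loss(sharp, sharp))    # 0.0
print(perceptual_loss(blurred, sharp))  # 0.5
```

Because the distance is measured between features rather than raw pixels, a prediction that smooths an edge is penalized through the edge-sensitive features, which is one common motivation for perceptual losses.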

Keywords: first-person shooter; depth map; mixed reality; semi-dense depth map interpolation; deep convolutional neural networks

In book

Adjunct Proceedings of 2017 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct)

NY: IEEE, 2017.

Fast Depth Map Super-Resolution Using Deep Neural Network

Alisa Korinevskaya, Makarov I., in: Proceedings of IEEE International Symposium on Mixed and Augmented Reality (ISMAR'18). NY: IEEE, 2019. P. 117–122.

Depth map super-resolution is a challenging computer vision problem. In this paper, we present two deep convolutional neural networks solving the problem of single depth map super-resolution. Both networks learn a residual decomposition and are trained with a specific perceptual loss that improves the sharpness and perceptual quality of the upsampled depth map. Several experiments on various depth super-resolution benchmark ...

Added: July 29, 2019

Adjunct Proceedings of 2017 IEEE International Symposium on Mixed and Augmented Reality (ISMAR-Adjunct)

NY: IEEE, 2017.

Accepted Poster Papers will be published in the adjunct proceedings of IEEE ISMAR 2017 and will be included in the IEEE Xplore digital library. ...

Added: August 5, 2017

Semi-Dense Depth Interpolation using Deep Convolutional Neural Networks

Makarov I., Vladimir Aliev, Olga Gerasimova, in: Proceedings of the 25th ACM international conference on Multimedia (ACM MM'17), Mountain View, CA USA, 23-27 October 2017. NY: Association for Computing Machinery (ACM), 2017. P. 1407–1415.

With recent advances in technology, augmented reality systems and autonomous vehicles have gained a lot of interest from academia and industry. Both of these areas rely on scene geometry understanding, which usually requires depth map estimation. However, in the case of systems with limited computational resources, such as smartphones or autonomous robots, high resolution dense depth map estimation ...

Added: June 25, 2017

On Reproducing Semi-dense Depth Map Reconstruction using Deep Convolutional Neural Networks with Perceptual Loss

Makarov I., Dmitrii Maslov, Gerasimova O. et al., in: Proceedings of 27th ACM International Conference on Multimedia. NY: ACM, 2019. P. 1080–1084.

In our recent papers, we proposed a new family of residual convolutional neural networks trained for semi-dense and sparse depth reconstruction without the use of an RGB channel. The proposed models can be used in low-resolution depth sensors or SLAM methods estimating partial depth with certain distributions. We proposed using perceptual loss for training depth reconstruction in ...

Added: September 16, 2019

Deep probabilistic human pose estimation

Petrov I., Shakhuro V., Konushin A., IET Computer Vision 2018 Vol. 12 No. 5 P. 578–585

The authors consider the problem of human pose estimation using probabilistic convolutional neural networks. They explore ways to improve human pose estimation accuracy on the standard pose estimation benchmark datasets MPII Human Pose and Leeds Sports Pose (LSP) using frameworks for probabilistic deep learning. Such frameworks transform a deterministic neural network into a probabilistic one and allow ...

Added: March 14, 2018

Adapting First-Person Shooter Video Game for Playing with Virtual Reality Headsets

Makarov I., Konoplya O., Pavel Polyakov et al., in: Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2017, Marco Island, Florida, USA, May 22-24, 2017. ISBN 978-1-57735-787-2. Palo Alto: AAAI Press, 2017. P. 412–415.

This article considers a combination of two modern aspects of game development: (i) the impact of high-quality graphics and virtual reality (VR) adaptation on the user's belief in the realness of in-game events seen with their own eyes; (ii) modeling the behavior of an enemy under automatic computer control, called a BOT, that reacts similarly to human players. ...

Added: June 24, 2017

American and Russian Sign Language Dactyl Recognition

Makarov I., Nikolay Veldyaykin, Maxim Chertkov et al., in: Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments (PETRA '19). NY: ACM, 2019. P. 204–210.

Sign languages are the main way for people from the deaf community to communicate with other people. In this paper, we have compared several real-time sign language dactyl recognition systems using deep convolutional neural networks. Our system is able to recognize words from natural language gestured using signs for each letter. We evaluate our approach on ...

Added: July 10, 2021

Organizing Multimedia Data in Video Surveillance Systems Based on Face Verification with Convolutional Neural Networks

Anastasiia D. Sokolova, Angelina S. Kharchevnikova, Savchenko A., in: Analysis of Images, Social Networks and Texts. 6th International Conference, 2017, Revised Selected Papers. Vol. 10716. Cham: Springer, 2018. P. 223–230.

In this paper we propose a two-stage approach to organizing information in video surveillance systems. First, the faces are detected in each frame and a video stream is split into sequences of frames containing the face region of one person. Second, these sequences (tracks) that contain identical faces are grouped using face verification algorithms and ...

Added: May 2, 2018

Organizing Multimedia Data in Video Surveillance Systems Based on Face Verification with Convolutional Neural Networks

Sokolova Anastasiia, Kharchevnikova Angelina, Savchenko A., Lecture Notes in Computer Science 2018 Vol. 10716 P. 223–230

Added: October 24, 2017

Sequential Analysis with Specified Confidence Level and Adaptive Convolutional Neural Networks in Image Recognition

Savchenko A., in: Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). Piscataway: IEEE, 2020. P. 1–8.

In this paper the problem of the high computational complexity of deep convolutional nets in image recognition is considered. An existing framework of adaptive neural networks is extended by appending a separate classifier to intermediate layers. The hierarchical representations of the input image are sequentially analyzed. If the first classifier returns a rather high confidence score, the ...

Added: October 15, 2020

On the generalization ability of data-driven models in the problem of total cloud cover retrieval

Krinitskiy M., Alexandrova M., Verezemskaya P. et al., Remote Sensing 2021 Vol. 13 No. 2 Article 326

Total Cloud Cover (TCC) retrieval from ground-based optical imagery is a problem that has been tackled by several generations of researchers. The number of human-designed algorithms for the estimation of TCC grows every year. However, there has been no considerable progress in terms of quality, mostly due to the lack of a systematic approach to the ...

Added: September 24, 2021

A real-time algorithm for mobile robot mapping based on rotation-invariant descriptors and iterative close point algorithm

Vokhmintcev A., Yakovlev K., in: Analysis of Images, Social Networks and Texts. 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7-9, 2016, Revised Selected Papers. Communications in Computer and Information Science. Vol. 661. Switzerland: Springer, 2017. P. 353–369.

Many algorithms for mobile robot mapping in indoor environments have been created. In this work we use a Kinect 2.0 camera, a Beward B2720 visible-range camera, and a Flir Tau 2 infrared camera to build dense 3D maps of indoor environments. We present the RGB-D Mapping and a new fusion algorithm combining visual ...

Added: April 22, 2017

Clustering of video sequences in video surveillance systems based on convolutional neural networks

Sokolova A. D., Savchenko A., in: Proceedings of the XXIII International Scientific and Technical Conference «Information Systems and Technologies-2017». [s.n.], 2017. P. 870–875.

The paper addresses the problem of structuring information in video surveillance software by grouping video data that contain identical faces. The emphasis is on efficient clustering of video sequences using convolutional neural networks for feature extraction. A new algorithm for clustering video fragments is developed, based on deep learning technologies and a statistical approach. Preliminary results of an experimental study of the accuracy and speed of the proposed ...

Added: October 24, 2017

Deep Convolutional Neural Networks Help Scoring Tandem Mass Spectrometry Data in Database-Searching Approaches

Kudriavtseva P., Kashkinov M., Kertész-Farkas A., Journal of Proteome Research 2021 Vol. 20 No. 10 P. 4708–4717

Spectrum annotation is a challenging task due to the presence of unexpected peptide fragmentation ions as well as the inaccuracy of the detectors of the spectrometers. We present a deep convolutional neural network, called Slider, which learns an optimal feature extraction in its kernels for scoring mass spectrometry (MS)/MS spectra to increase the number of ...

Added: August 30, 2021

Event Recognition Based on Classification of Generated Image Captions

Savchenko A., Miasnikov E., in: Advances in Intelligent Data Analysis XVIII (IDA 2020). Vol. 12080. Cham: Springer, 2020. Ch. 33 P. 418–430.

In this paper, we consider the problem of event recognition on single images. In contrast to conventional fine-tuning of convolutional neural networks (CNN), we propose to use image captioning, i.e., a generative model that converts images to textual descriptions. The motivation here is the possibility to combine conventional CNNs with a completely different approach in ...

Added: May 17, 2020

Detection and Recognition of Food in Photo Galleries for Analysis of User Preferences

Miasnikov E., Savchenko A., in: Proceedings of International Conference on Image Analysis and Recognition (ICIAR 2020). Vol. 12131. Cham: Springer, 2020. Ch. 9 P. 83–94.

Food analysis is one of the most important parts of user preference prediction engines for recommendation systems in the travel domain. In this paper, we describe and study a neural network method for recognizing food in a gallery of photos taken with mobile devices. The described method consists of three main stages, ...

Added: October 1, 2020

A method for detecting the spatial position of hands from depth camera data for low-performance computing devices

Medvedev D. S., Ignatov A., Scientific and Technical Journal of Information Technologies, Mechanics and Optics 2022 Vol. 22 No. 2 P. 410–414

A method of arm aiming direction estimation for low-performance Internet of Things devices is proposed. It uses Human Pose Estimation (HPE) algorithms to retrieve human skeleton key points. From these key points, a model of arm aiming directions is calculated. Two well-known HPE methods (PoseNet and OpenPose) are examined. These algorithms have been tested and compared ...

Added: June 1, 2022

First-Person Shooter Game for Virtual Reality Headset with Advanced Multi-Agent Intelligent System

Makarov I., Mikhail Tokmakov, Pavel Polyakov et al., in: Proceedings of the 24th ACM international conference on Multimedia (ACM MM'16), Amsterdam, Netherlands, 15-19 October 2016. NY: Association for Computing Machinery (ACM), 2016. P. 735–736.

We present a multiplayer first-person shooter (FPS) game with advanced intelligent non-playable characters (NPC) under computer control. The game is specially adapted for playing in a VR headset so that simulator sickness symptoms are significantly reduced. The demo allows users to play with other human and NPC players in a shooter game made in Unreal Engine ...

Added: August 28, 2016

Multi-label Image Set Recognition in Visually-Aware Recommender Systems

Demochkin K., Savchenko A., in: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832. Cham: Springer, 2019. Ch. 26 P. 291–297.

In this paper we focus on the problem of multi-label image recognition for visually-aware recommender systems. We propose a two-stage approach in which a deep convolutional neural network is first fine-tuned on a part of the training set. Second, an attention-based aggregation network is trained to compute the weighted average of visual features in ...

Added: December 22, 2019

Proceedings of IEEE International Symposium on Mixed and Augmented Reality (ISMAR'18)

NY: IEEE, 2019.

The IEEE ISMAR is the leading international academic conference in the fields of Augmented Reality and Mixed Reality. The symposium is organized and supported by the IEEE Computer Society, IEEE VGTC and ACM SIGCHI. The congress organized by TU Munich and ETH Zurich will be held at MOC Events Center in Munich (Germany), on October 16-20, 2018. ...

Added: July 29, 2019

Gender and age recognition from facial video based on convolutional neural networks

Kharchevnikova A., Savchenko A., in: Proceedings of the XXIII International Scientific and Technical Conference «Information Systems and Technologies-2017». [s.n.], 2017. P. 864–869.

The paper addresses the problem of building intelligent contextual advertising systems that automatically adjust to a user's potential preferences. An analytical review is given of recent publications on gender and age recognition from facial video, including those based on deep convolutional neural networks. A comparative analysis is carried out of methods for aggregating the decisions obtained from recognizing each video frame. Results of an experimental study of their accuracy and speed are presented. ...

Added: October 24, 2017

Event Recognition with Automatic Album Detection based on Sequential Grouping of Confidence Scores and Neural Attention

Savchenko A., in: Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). Piscataway: IEEE, 2020. P. 1–8.

In this paper a new formulation of the event recognition task is examined: it is required to predict event categories given a gallery of images, for which albums (groups of photos corresponding to a single event) are unknown. A novel two-stage approach is proposed. First, features are extracted in each photo using the pre-trained convolutional ...

Added: October 15, 2020

Extraction of user preferences based on methods for automatic generation of textual descriptions of photo album images

Kharchevnikova A., Savchenko A., Computer Optics 2020 Vol. 44 No. 4 P. 618–626

The paper addresses the problem of extracting a user's preferences from their photo album. A new approach is proposed, based on automatically generating textual descriptions of photographs and then classifying those descriptions. Known methods for generating image annotations based on convolutional and recurrent (long short-term memory) neural networks are analyzed. Using the Google’s Conceptual Captions dataset, new models were trained in which ...

Added: September 16, 2020

Russian Sign Language Dactyl Recognition

Makarov I., Veldyaykin N., Maxim Chertkov et al., in: 2019 42nd International Conference on Telecommunications and Signal Processing (TSP). NY: IEEE, 2019. P. 726–729.

In this paper, we compare several real-time sign language dactyl recognition systems and present a new model based on deep convolutional neural networks. These systems are able to recognize Russian alphabet letters presented as static signs in Russian Sign Language used by people from the deaf community. In such an approach, we recognize words from Russian ...

Added: July 29, 2019