Depth Map Interpolation using Perceptual Loss
P. 93-94.
In this paper, we discuss a semi-dense depth map interpolation method based on a convolutional neural network. We propose a compact neural network architecture with a loss function defined as the Euclidean distance in the feature space of the VGG-16 neural network used for deep visual recognition. The suggested solution shows state-of-the-art performance on synthetic and real datasets. Together with LSD-SLAM, the method could be used to provide a dense depth map for interaction purposes, such as creating a first-person game in AR/MR or a perception module for an autonomous vehicle.
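The loss described in the abstract above, a Euclidean distance measured in a feature space rather than on raw depth values, can be illustrated with a minimal sketch. The feature extractor here (`toy_features`) is a hypothetical stand-in built from finite differences; it is not VGG-16 and not the authors' implementation, and all names are assumptions made for illustration.

```python
import math
from typing import Callable, List

DepthMap = List[List[float]]  # 2-D depth map stored as nested lists


def toy_features(depth: DepthMap) -> List[float]:
    """Hypothetical stand-in for a pretrained feature extractor (NOT VGG-16).

    Collects horizontal and vertical finite differences, so the features
    respond to depth edges rather than to absolute depth values.
    """
    feats: List[float] = []
    for row in depth:
        feats.extend(b - a for a, b in zip(row, row[1:]))
    for upper, lower in zip(depth, depth[1:]):
        feats.extend(b - a for a, b in zip(upper, lower))
    return feats


def perceptual_loss(
    pred: DepthMap,
    target: DepthMap,
    features: Callable[[DepthMap], List[float]] = toy_features,
) -> float:
    """Euclidean distance between the feature vectors of two depth maps."""
    f_pred = features(pred)
    f_target = features(target)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(f_pred, f_target)))


target = [[0.0, 0.0], [1.0, 1.0]]   # one horizontal depth edge
flat = [[0.0, 0.0], [0.0, 0.0]]     # prediction that misses the edge
shifted = [[5.0, 5.0], [6.0, 6.0]]  # same edge, constant depth offset

loss_flat = perceptual_loss(flat, target)
loss_shifted = perceptual_loss(shifted, target)
```

With this toy extractor the shifted prediction incurs no loss because its edge structure matches the target, while the flat prediction is penalized. A pixel-wise loss would rank the two the other way round, which is the kind of difference a feature-space loss is meant to capture.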
Alisa Korinevskaya, Makarov I., , in : Proceedings of IEEE International Symposium on Mixed and Augmented Reality (ISMAR'18). : NY : IEEE, 2019. P. 117-122.
Depth map super-resolution is a challenging computer vision problem. In this paper, we present two deep convolutional neural networks solving the problem of single depth map super-resolution. Both networks learn a residual decomposition and are trained with a specific perceptual loss that improves the sharpness and perceptual quality of the upsampled depth map. Several experiments on various depth super-resolution benchmark ...
Added: July 29, 2019
NY : IEEE, 2017
Accepted Poster Papers will be published in the adjunct proceedings of IEEE ISMAR 2017 and will be included in the IEEE Xplore digital library. ...
Added: August 5, 2017
Makarov I., Vladimir Aliev, Olga Gerasimova, , in : Proceedings of the 25th ACM international conference on Multimedia (ACM MM'17), Mountain View, CA USA, 23-27 October 2017. : NY : Association for Computing Machinery (ACM), 2017. P. 1407-1415.
With recent advances in technology, augmented reality systems and autonomous vehicles have gained a lot of interest from academia and industry. Both of these areas rely on scene geometry understanding, which usually requires depth map estimation. However, in the case of systems with limited computational resources, such as smartphones or autonomous robots, high resolution dense depth map estimation ...
Added: June 25, 2017
Makarov I., Dmitrii Maslov, Gerasimova O. et al., , in : Proceedings of 27th ACM International Conference on Multimedia. : NY : ACM, 2019. P. 1080-1084.
In our recent papers, we proposed a new family of residual convolutional neural networks trained for semi-dense and sparse depth reconstruction without use of the RGB channel. The proposed models can be used in low-resolution depth sensors or SLAM methods estimating partial depth with certain distributions. We proposed using perceptual loss for training depth reconstruction in ...
Added: September 16, 2019
Petrov I., Shakhuro V., Konushin A., IET Computer Vision 2018 Vol. 12 No. 5 P. 578-585
The authors consider the problem of human pose estimation using probabilistic convolutional neural networks. They explore ways to improve human pose estimation accuracy on the standard pose estimation benchmarks, the MPII Human Pose and Leeds Sports Pose (LSP) datasets, using frameworks for probabilistic deep learning. Such frameworks transform a deterministic neural network into a probabilistic one and allow ...
Added: March 14, 2018
Makarov I., Konoplya O., Pavel Polyakov et al., , in : Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2017, Marco Island, Florida, USA, May 22-24, 2017. AAAI Press 2017, ISBN 978-1-57735-787-2. : Palo Alto : AAAI Press, 2017. P. 412-415.
In this article, a combination of two modern aspects of game development is considered: (i) the impact of high-quality graphics and virtual reality (VR) user adaptation on the user's belief in the realness of in-game events seen with their own eyes; (ii) modeling an enemy's behavior under automatic computer control, called a BOT, which reacts similarly to human players. ...
Added: June 24, 2017
Makarov I., Nikolay Veldyaykin, Maxim Chertkov et al., , in : Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments (PETRA '19). : NY : ACM, 2019. P. 204-210.
Sign languages are the main way for people from the deaf community to communicate with other people. In this paper, we have compared several real-time sign language dactyl recognition systems using deep convolutional neural networks. Our system is able to recognize words from natural language gestured using signs for each letter. We evaluate our approach on ...
Added: July 10, 2021
Anastasiia D. Sokolova, Angelina S. Kharchevnikova, Savchenko A., , in : Analysis of Images, Social Networks and Texts. 6th International Conference, 2017, Revised Selected Papers. Vol. 10716.: Cham : Springer, 2018. P. 223-230.
In this paper we propose a two-stage approach to organizing information in video surveillance systems. First, faces are detected in each frame and the video stream is split into sequences of frames containing the face region of one person. Second, those sequences (tracks) that contain identical faces are grouped using face verification algorithms and ...
Added: May 2, 2018
Sokolova Anastasiia, Kharchevnikova Angelina, Savchenko A., Lecture Notes in Computer Science 2018 Vol. 10716 P. 223-230
In this paper we propose a two-stage approach to organizing information in video surveillance systems. First, faces are detected in each frame and the video stream is split into sequences of frames containing the face region of one person. Second, those sequences (tracks) that contain identical faces are grouped using face verification algorithms and ...
Added: October 24, 2017
Savchenko A., , in : Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). : Piscataway : IEEE, 2020. P. 1-8.
In this paper, the problem of the high computational complexity of deep convolutional nets in image recognition is considered. An existing framework of adaptive neural networks is extended by appending a separate classifier to intermediate layers. The hierarchical representations of the input image are analyzed sequentially. If the first classifier returns a rather high confidence score, the ...
Added: October 15, 2020
Krinitskiy M., Alexandrova M., Verezemskaya P. et al., Remote Sensing 2021 Vol. 13 No. 2 Article 326
Total Cloud Cover (TCC) retrieval from ground-based optical imagery is a problem that has been tackled by several generations of researchers. The number of human-designed algorithms for the estimation of TCC grows every year. However, there has been no considerable progress in terms of quality, mostly due to the lack of a systematic approach to the ...
Added: September 24, 2021
Vokhmintcev A., Yakovlev K., , in : Analysis of Images, Social Networks and Texts. 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7-9, 2016, Revised Selected Papers. Communications in Computer and Information Science. Vol. 661.: Switzerland : Springer, 2017. P. 353-369.
Many algorithms for mobile robot mapping in indoor environments have been created. In this work we use a Kinect 2.0 camera, a Beward B2720 visible-range camera and a Flir Tau 2 infrared camera to build dense 3D maps of indoor environments. We present the RGB-D Mapping and a new fusion algorithm combining visual ...
Added: April 22, 2017
Sokolova A. D., Savchenko A., in : Proceedings of the XXIII International Scientific and Technical Conference «Информационные системы и технологии-2017» (Information Systems and Technologies 2017). : [s.n.], 2017. P. 870-875.
We consider the problem of structuring information in video surveillance software systems by grouping video data that contain identical faces. The emphasis is on efficient clustering of video sequences, using convolutional neural networks to extract characteristic features. A new algorithm for clustering video fragments is developed, based on deep learning technologies and a statistical approach. Preliminary results are given of an experimental study of the accuracy and speed of the proposed ...
Added: October 24, 2017
Kudriavtseva P., Kashkinov M., Kertész-Farkas A., Journal of Proteome Research 2021 Vol. 20 No. 10 P. 4708-4717
Spectrum annotation is a challenging task due to the presence of unexpected peptide fragmentation ions as well as the inaccuracy of the detectors of the spectrometers. We present a deep convolutional neural network, called Slider, which learns an optimal feature extraction in its kernels for scoring mass spectrometry (MS)/MS spectra to increase the number of ...
Added: August 30, 2021
Savchenko A., Miasnikov E., , in : Advances in Intelligent Data Analysis XVIII (IDA 2020). Vol. 12080.: Cham : Springer, 2020. Ch. 33. P. 418-430.
In this paper, we consider the problem of event recognition on single images. In contrast to conventional fine-tuning of convolutional neural networks (CNN), we propose to use image captioning, i.e., a generative model that converts images to textual descriptions. The motivation here is the possibility to combine conventional CNNs with a completely different approach in ...
Added: May 17, 2020
Miasnikov E., Savchenko A., , in : Proceedings of International Conference on Image Analysis and Recognition (ICIAR 2020). Vol. 12131.: Cham : Springer, 2020. Ch. 9. P. 83-94.
Food analysis is one of the most important parts of user preference prediction engines for recommendation systems in the travel domain. In this paper, we describe and study a neural network method for recognizing food in a gallery of photos taken with mobile devices. The described method consists of three main stages, ...
Added: October 1, 2020
Medvedev D. S., Ignatov A., Научно-технический вестник информационных технологий, механики и оптики (Scientific and Technical Journal of Information Technologies, Mechanics and Optics) 2022 Vol. 22 No. 2 P. 410-414
A method of arm aiming direction estimation for low-performance Internet of Things devices is proposed. It uses Human Pose Estimation (HPE) algorithms to retrieve human skeleton key points. From these key points, an arm aiming direction model is calculated. Two well-known HPE methods (PoseNet and OpenPose) are examined. These algorithms have been tested and compared ...
Added: June 1, 2022
Makarov I., Mikhail Tokmakov, Pavel Polyakov et al., , in : Proceedings of the 24th ACM international conference on Multimedia (ACM MM'16), Amsterdam, Netherlands, 15-19 October 2016. : NY : Association for Computing Machinery (ACM), 2016. P. 735-736.
We present a multiplayer first-person shooter (FPS) game with advanced intelligent non-playable characters (NPC) under computer control. The game is specially adapted for playing in a VR headset so that simulator sickness symptoms are significantly reduced.
The demo allows users to play with other human and NPC players in a shooter game made in Unreal Engine ...
Added: August 28, 2016
Demochkin K., Savchenko A., , in : Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832.: Cham : Springer, 2019. Ch. 26. P. 291-297.
In this paper we focus on the problem of multi-label image recognition for visually-aware recommender systems. We propose a two-stage approach in which a deep convolutional neural network is first fine-tuned on a part of the training set. Second, an attention-based aggregation network is trained to compute the weighted average of visual features in ...
Added: December 22, 2019
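The weighted average of visual features mentioned in the abstract above can be illustrated with a minimal sketch of attention pooling. The scoring rule (a dot product with a single `query` vector) and all names are assumptions made for illustration; the truncated abstract does not specify how the weights are produced, so this is one common form of attention and not necessarily the authors' network.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    features: Sequence[Sequence[float]], query: Sequence[float]
) -> List[float]:
    """Weighted average of feature vectors.

    `query` is a hypothetical attention parameter that would be learned in a
    real model; each feature vector is scored by its dot product with it.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


# Two image feature vectors (toy values).
image_features = [[1.0, 0.0], [3.0, 4.0]]

# A zero query scores every image equally, so pooling reduces to the mean.
uniform = attention_pool(image_features, [0.0, 0.0])

# A query aligned with the second image shifts the average toward it.
focused = attention_pool(image_features, [1.0, 1.0])
```

A zero query reproduces plain averaging, so a trained query can only move the result toward the images it scores higher.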
NY : IEEE, 2019
The IEEE ISMAR is the leading international academic conference in the fields of Augmented Reality and Mixed Reality. The symposium is organized and supported by the IEEE Computer Society, IEEE VGTC and ACM SIGCHI.
The congress organized by TU Munich and ETH Zurich will be held at MOC Events Center in Munich (Germany), on October 16-20, 2018. ...
Added: July 29, 2019
Kharchevnikova A., Savchenko A., in : Proceedings of the XXIII International Scientific and Technical Conference «Информационные системы и технологии-2017» (Information Systems and Technologies 2017). : [s.n.], 2017. P. 864-869.
We consider the problem of building intelligent contextual advertising systems that automatically adapt to the potential preferences of the user. An analytical review is given of recent publications on recognizing gender and age from facial video, including approaches based on deep convolutional neural networks. A comparative analysis is carried out of methods for aggregating the decisions obtained when recognizing each video frame. Results of an experimental study of their accuracy and speed are presented. ...
Added: October 24, 2017
Savchenko A., , in : Proceedings of International Joint Conference on Neural Networks 2020 (IJCNN 2020). : Piscataway : IEEE, 2020. P. 1-8.
In this paper, a new formulation of the event recognition task is examined: it is required to predict event categories given a gallery of images for which albums (groups of photos corresponding to a single event) are unknown. A novel two-stage approach is proposed. First, features are extracted in each photo using the pre-trained convolutional ...
Added: October 15, 2020
Kharchevnikova A., Savchenko A., Компьютерная оптика (Computer Optics) 2020 Vol. 44 No. 4 P. 618-626
This work addresses the problem of extracting a user's preferences from their photo album. A new approach is proposed, based on automatically generating textual descriptions of photographs and then classifying those descriptions. Known methods for generating image annotations based on convolutional and recurrent (long short-term memory) neural networks are analyzed. Using the Google’s Conceptual Captions dataset, new models are trained in which ...
Added: September 16, 2020
Makarov I., 501502591, Maxim Chertkov et al., , in : 2019 42nd International Conference on Telecommunications and Signal Processing (TSP). : NY : IEEE, 2019. P. 726-729.
In this paper, we compare several real-time sign language dactyl recognition systems and present a new model based on deep convolutional neural networks. These systems are able to recognize Russian alphabet letters presented as static signs in Russian Sign Language, which is used by people from the deaf community. In such an approach, we recognize words from Russian ...
Added: July 29, 2019