Pose Networks Unveiled: Bridging the Gap for Monocular Depth Perception
P. 584–587.
Depth estimation is essential in Augmented Reality applications, enabling realistic object placement, scene understanding, spatial mapping, interaction, and environment awareness. This paper proposes a method to enhance depth model performance without increasing inference costs by improving the pose network in a self-supervised learning setup. In particular, we enrich spatial information in the pose network by incorporating features from different scales and normalized coordinates. Experiments on the KITTI dataset show that our approach achieves a 2-7% improvement in the absolute relative error (abs rel) metric compared to baseline techniques.
Keywords: 3D vision, self-supervised learning, monocular depth estimation, pose network, ego-motion estimation
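For readers unfamiliar with the abs rel metric named in the abstract, the following is a minimal sketch of how it is commonly computed: the mean of |predicted depth - ground-truth depth| / ground-truth depth over pixels with valid ground truth. The function name `abs_rel`, the validity rule (ground truth greater than zero), and the toy depth values are assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error over pixels with valid ground truth.

    pred, gt: flat sequences of depth values in the same units.
    Pixels with gt <= 0 are treated as missing and skipped (an assumption).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


# Toy values, purely illustrative; the third pixel has no ground truth.
gt_depth = [10.0, 20.0, 0.0, 40.0]
pred_depth = [11.0, 18.0, 5.0, 44.0]

score = abs_rel(pred_depth, gt_depth)
print(f"abs rel = {score:.3f}")
```

Lower values are better, so an improvement in abs rel means a reduction of this error.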
Publication based on the results of:
Saleh H., Goncharov D., Shadi S. et al., , in: Proceedings 2026 IEEE 11th International Conference on Smart Cloud SmartCloud 2026 8-10 May 2026.: Los Alamitos: IEEE Computer Society, 2026. P. 78–85.
Estimating depth is a necessary task to understand and navigate the environment surrounding us. Over the years, many active sensors have been developed to measure depth, but they are expensive and require additional space for mounting. A cheaper alternative is to estimate depth from a single RGB image taken by an ordinary monocular camera, which can ...
Added: May 12, 2026
Chebotareva E., Mukhamedshin A., Imamov N. et al., , in: 2025 11th International Conference on Automation, Robotics, and Applications (ICARA), 12-14 Feb. 2025.: IEEE, 2025. Ch. 2025 P. 252–256.
Added: March 17, 2026
Semenkov I., Karpov A., Savchenko A. et al., IEEE Access 2024 Vol. 12 P. 5163–5176
Visual place recognition is one of the core modern computer vision tasks concerned with identifying location based on the image taken there. Modern state-of-the-art approaches heavily rely on RGB images which are largely affected by changes in the same scene such as varying daytime, illumination, seasonal changes, and presence of dynamic objects (people, vehicles). This ...
Added: March 15, 2024
Saleh S., Saleh H., Dmitry Goncharov et al., , in: 2023 International Symposium ELMAR, 11-13 September 2023, Zadar, Croatia.: IEEE, 2023. P. 23–27.
Estimating depth is necessary to understand and navigate the environment surrounding us. Over the years, many active sensors have been developed to measure depth, but they are expensive and require additional space for mounting. A cheaper alternative is estimating depth from a single RGB image taken by an ordinary monocular camera, which can be placed ...
Added: January 26, 2024
Maksim Golyadkin, Vitaliy Pozdnyakov, Leonid Zhukov et al., Artificial Intelligence 2023 Vol. 324 Article 104012
Modern industrial facilities generate large volumes of raw sensor data during the production process. This data is used to monitor and control the processes and can be analyzed to detect and predict process abnormalities. Typically, the data has to be annotated by experts in order to be used in predictive modeling. However, manual annotation of ...
Added: September 20, 2023
Li X., Makarov I., Kiselev D., IEEE Access 2023 Vol. 11 P. 91842–91849
Predicting molecular properties with Graph Neural Networks (GNNs) has recently drawn a lot of attention, with compound toxicity prediction being one of the biggest challenges. In cases where there is insufficient labeled molecule data, an effective approach is to pre-train GNNs on large-scale unlabeled molecular data and then fine-tune them for downstream tasks. Among pre-training ...
Added: August 30, 2023
Morozov N., Rakitin D., Oleg Desheulin et al., , in: Neural Fields across Fields: Methods and Applications of Implicit Neural Representations. ICLR 2023 Workshop.: [s.n.], 2023. Ch. 8.
In view synthesis, a neural radiance field approximates underlying density and radiance fields based on a sparse set of scene pictures. To generate a pixel of a novel view, it marches a ray through the pixel and computes a weighted sum of radiance emitted from a dense set of ray points. This rendering algorithm is ...
Added: July 18, 2023
[s.n.], 2023.
Addressing problems in different science and engineering disciplines often requires solving optimization problems, including via machine learning from large training data. One class of methods has recently gained significant attention for problems in computer vision and visual computing: coordinate-based neural networks parameterizing a field, such as a neural network that maps a 3D spatial coordinate ...
Added: July 18, 2023
Kiselev D., Makarov I., IEEE Access 2022 Vol. 10 P. 123614–123621
Temporal graph networks are powerful tools for solving the cold-start problem in sequential recommender systems. However, graph models are susceptible to feedback loops and data distribution shifts. The paper proposes a simple yet efficient graph-based exploration method for the mitigation of the issues above. It adopts the counter-based state exploration from reinforcement learning to the ...
Added: September 5, 2022
Makarov I., Bakhanova M., Nikolenko S. et al., PeerJ Computer Science 2022 Vol. 8 Article e865
Depth estimation has been an essential task for many computer vision applications, especially in autonomous driving, where safety is paramount. Depth can be estimated not only with traditional supervised learning but also via a self-supervised approach that relies on camera motion and does not require ground truth depth maps. Recently, major improvements have been introduced ...
Added: February 1, 2022
Sadrtdinov I., Chirkova N., Lobacheva E., , in: ICML 2021 Workshop, Overparameterization: Pitfalls & Opportunities.: [s.n.], 2021.
Memorization studies of deep neural networks (DNNs) help to understand what patterns DNNs learn and how they learn them, and motivate improvements to DNN training approaches. In this work, we investigate the memorization properties of SimCLR, a widely used contrastive self-supervised learning approach, and compare them to the memorization of supervised learning and random labels training. We ...
Added: January 25, 2022
Makarov I., Korovina K., Kiselev D., IEEE Access 2021 Vol. 9 P. 144646–144659
Recently, graph embedding models significantly improved the quality of graph machine learning tasks, such as node classification and link prediction. In this work, we propose a model called JONNEE (JOint Network Nodes and Edges Embedding), which learns node and edge embeddings under self-supervision via joint constraints in a given graph and its edge-to-vertex dual representation ...
Added: October 30, 2021
Sawada T., Symmetry 2020 Vol. 12 No. 11: 1863 P. 1–12
An object is 3D centro-symmetrical if the object can be segmented into two halves and the relationship between them can be represented by a combination of reflection about a plane and a rotation through 180° about an axis that is normal to the plane. A 2D orthographic image of the 3D centro-symmetrical object is always ...
Added: November 12, 2020
Trubochkina N. K., Journal of Physics: Conference Series 2018 Vol. 955 P. 1–6
A three-dimensional artistic fractal tomography method that implements a glasses-free 3D visualization of fractal worlds in layered media is proposed. It is designed for the glasses-free 3D vision of digital art objects and films containing fractal content. Prospects for the development of this method in art galleries and the film industry are considered. ...
Added: January 29, 2018
Trubochkina N. K., Кондратьев Н. В., Мир техники кино 2017 Vol. - No. 2 (11) P. 26–35
A method of three-dimensional artistic fractal tomography is proposed that implements a glasses-free 3D visualization of fractal worlds in layered media. It is designed for the glasses-free 3D vision of digital art objects and films containing fractal content. Prospects for the development of this method in art galleries and the film industry are considered. ...
Added: December 11, 2017