Monocular Depth Estimation Based on Active Learning
Depth estimation is essential for understanding and navigating the surrounding environment. Over the years, many active sensors have been developed to measure depth, but they are expensive and require additional mounting space. A cheaper alternative is to estimate depth from a single RGB image taken by an ordinary monocular camera, which is small enough to fit inside a smartphone. However, neural networks are well known to require a huge amount of labeled data to train effectively, which is a barrier to the further development of monocular depth estimation. In this paper, we address this problem. We propose a novel active deep learning training framework that reduces the amount of labeled data required by adaptively selecting the most informative samples for labeling. The selection focuses on the human vision features most relevant to monocular depth estimation, which help us identify the image pixels that matter most for depth. Our results indicate that it is possible to reduce the amount of labeled training data by 81% while preserving comparable accuracy on the KITTI Odometry dataset.
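As a rough illustration of the selection step described above, the following minimal sketch ranks unlabeled images by an informativeness score and keeps only as many as the labeling budget allows. It is not the framework proposed in this paper: the function names (`image_score`, `select_for_labeling`, `labeling_budget`), the use of a per-pixel score map averaged per image, and the top-k selection rule are all assumptions made for illustration.

```python
from typing import List, Sequence


def image_score(pixel_scores: Sequence[Sequence[float]]) -> float:
    """Average a per-pixel informativeness map into one score per image.

    The per-pixel map is a hypothetical stand-in for whatever relevance
    measure the selection criterion produces.
    """
    values = [v for row in pixel_scores for v in row]
    return sum(values) / len(values)


def labeling_budget(pool_size: int, reduction: float) -> int:
    """Number of images to label if the labeled set shrinks by `reduction`."""
    return round(pool_size * (1.0 - reduction))


def select_for_labeling(scores: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` highest-scoring unlabeled images."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


if __name__ == "__main__":
    # Toy pool of four images with made-up scores.
    pool_scores = [0.1, 0.9, 0.5, 0.7]
    chosen = select_for_labeling(pool_scores, budget=2)
    print("images sent for labeling:", chosen)
    # An 81% reduction on a pool of 1000 images leaves 19% to be labeled.
    print("budget for 1000 images:", labeling_budget(1000, 0.81))
```

In an active learning loop, a step like this would typically alternate with retraining the depth network on the newly labeled images, so that the scores reflect what the current model still finds informative.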