A system for large-scale automatic traffic sign recognition and mapping
We present a system for large-scale automatic traffic sign recognition and mapping and experimentally justify the design choices made for its components. Our system works with more than 140 classes of traffic signs and, because it is trained on synthetically generated images, does not require labor-intensive labelling of a large amount of training data. We evaluated our system on a large dataset of Russian traffic signs and made this dataset publicly available to encourage future comparison.
We investigate a specific machine vision problem: video-based detection of a moving forklift truck. We show that the detection quality of state-of-the-art local descriptors (SURF, SIFT, etc.) is unsatisfactory when the resolution is low and the illumination changes dramatically. In this paper, we propose a simple mathematical morphology algorithm to detect the presence of cargo on the forklift truck. First, the movement direction is estimated by updating a motion history image, and the front part of the moving object is obtained. Next, contours are detected and morphological operations are applied in front of the moving object to estimate simple geometric features of an empty forklift. Our experimental study shows that the proposed method has a 40% lower false acceptance rate (FAR) and a 27% lower false rejection rate (FRR) than conventional matching of local descriptors. Moreover, our algorithm is 7 times faster.
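The two error rates quoted above can be computed from per-frame decisions as in the following minimal sketch. It assumes that a "positive" frame is one in which the target event is truly present and that a detector output of True means "accept"; the function name `far_frr` and this labelling convention are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence, Tuple


def far_frr(labels: Sequence[bool], predictions: Sequence[bool]) -> Tuple[float, float]:
    """Return (FAR, FRR) for binary ground-truth labels and detector decisions.

    FAR = false accepts / number of true negatives in the ground truth.
    FRR = false rejects / number of true positives in the ground truth.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    negatives = sum(1 for y in labels if not y)
    positives = len(labels) - negatives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present to compute FAR and FRR")
    # A false accept is a negative frame that the detector accepted.
    false_accepts = sum(1 for y, p in zip(labels, predictions) if not y and p)
    # A false reject is a positive frame that the detector rejected.
    false_rejects = sum(1 for y, p in zip(labels, predictions) if y and not p)
    return false_accepts / negatives, false_rejects / positives
```

Comparing two detectors then amounts to evaluating `far_frr` for each on the same labelled frames.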
The sixteen-volume set comprising the LNCS volumes 11205-11220 constitutes the refereed proceedings of the 15th European Conference on Computer Vision, ECCV 2018, held in Munich, Germany, in September 2018.
The 776 revised papers presented were carefully reviewed and selected from 2439 submissions. The papers are organized in topical sections on learning for vision; computational photography; human analysis; human sensing; stereo and reconstruction; optimization; matching and recognition; video attention; and poster sessions.
Person detection is a key problem for many computer vision tasks. While face detection has reached maturity, detecting people under the full variation of camera viewpoints, human poses, lighting conditions and occlusions is still a difficult challenge. In this work we focus on detecting human heads in natural scenes. Starting from the recent R-CNN object detector, we extend it in two ways. First, we leverage person-scene relations and propose a global CNN model trained to predict positions and scales of heads directly from the full image. Second, we explicitly model pairwise relations among the objects via an energy-based model in which the potentials are computed with a CNN framework. Our full combined model complements R-CNN with contextual cues derived from the scene. To train and test our model, we introduce a large dataset with 369,846 human heads annotated in 224,740 movie frames. We evaluate our method and demonstrate improvements in person head detection compared to several recent baselines on three datasets. We also show improvements in detection speed provided by our model.
In this paper we consider the general scene recognition problem for analyzing user preferences based on the photos on his or her mobile phone. Special attention is paid to out-of-class detections and efficient processing using MobileNet-based architectures. We propose a three-stage procedure. First, a pre-trained convolutional neural network (CNN) is used to extract input image embeddings at one of the last layers, which are used to train a classifier, e.g., a support vector machine or random forest. Second, we fine-tune the pre-trained network on the given training set and compute the predictions (scores) at the output of the resulting CNN. Finally, we perform object detection in the input image, and the resulting sparse vector of detected objects is classified. The decision is based on a weighted sum of the class posterior probabilities estimated by all three classifiers. Experimental results on a subset of the ImageNet dataset demonstrate that the proposed approach is up to 5% more accurate than conventional fine-tuned models.
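The final decision rule, a weighted sum of the posteriors from the three classifiers, can be sketched as follows. This is a minimal illustration only: the weights, the function names `fuse_posteriors` and `decide`, and the use of a confidence threshold to flag out-of-class images are assumptions made for the example, since the abstract does not state how the weights are chosen or how out-of-class inputs are rejected.

```python
from typing import List, Optional, Sequence


def fuse_posteriors(
    posteriors: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted sum of per-classifier class posteriors, renormalised to sum to 1."""
    if not posteriors or len(posteriors) != len(weights):
        raise ValueError("one weight per classifier is required")
    num_classes = len(posteriors[0])
    fused = [0.0] * num_classes
    for probs, weight in zip(posteriors, weights):
        if len(probs) != num_classes:
            raise ValueError("all classifiers must score the same classes")
        for c, p in enumerate(probs):
            fused[c] += weight * p
    total = sum(fused)
    if total <= 0:
        raise ValueError("fused scores must have a positive sum")
    return [v / total for v in fused]


def decide(fused: Sequence[float], reject_threshold: float = 0.5) -> Optional[int]:
    """Return the index of the best class, or None if confidence is too low.

    The threshold-based rejection is a hypothetical stand-in for out-of-class
    detection; it is not described in the abstract.
    """
    best = max(range(len(fused)), key=lambda c: fused[c])
    return best if fused[best] >= reject_threshold else None
```

For instance, the three rows passed to `fuse_posteriors` could be the outputs of the embedding-based classifier, the fine-tuned CNN and the classifier of detected objects, in that order.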
The problem of automatic detection of a moving forklift truck in video data is explored. The task is formulated, in computer vision terms, as moving object detection in a noisy environment. We show that state-of-the-art local descriptors (SURF, SIFT, FAST, ORB) do not achieve satisfactory detection quality when the camera resolution is low, the lighting changes dramatically and shadows are present. In this paper we propose a simple mathematical morphology algorithm to detect the presence of cargo on the forklift truck. Its first step estimates the movement direction and the front part of the truck by updating a motion history image. The second step applies Canny contour detection and binary morphological operations in front of the moving object to estimate simple geometric features of an empty forklift. The algorithm is implemented with the OpenCV library. Our experimental study shows that the best results are achieved when the difference of the widths of the bounding rectangles is used as a feature. Namely, the detection accuracy is 78.7% (compared with 40% achieved by the best local descriptor), while the average frame processing time is only 5 ms (compared with 35 ms for the fastest descriptor).
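The first step, updating a motion history image (MHI) and reading a movement direction from it, can be illustrated with the plain-Python sketch below. It uses the standard MHI update rule (moving pixels receive the current timestamp, pixels older than a fixed duration are cleared) and does not call OpenCV. The direction estimate is a deliberate simplification introduced for this example: it only compares mean column positions, whereas the paper's exact procedure is not given in the abstract. The names `update_mhi` and `horizontal_direction` are hypothetical, and timestamps are assumed to start at 1 so that 0 can mean "no motion recorded".

```python
from typing import List, Optional

Grid = List[List[float]]


def update_mhi(mhi: Grid, silhouette: List[List[int]], timestamp: float, duration: float) -> None:
    """Update the motion history image in place.

    Pixels that are moving now get the current timestamp; pixels whose last
    motion is older than `duration` are reset to 0; the rest are unchanged.
    """
    for y, row in enumerate(silhouette):
        for x, moving in enumerate(row):
            if moving:
                mhi[y][x] = timestamp
            elif mhi[y][x] < timestamp - duration:
                mhi[y][x] = 0.0


def horizontal_direction(mhi: Grid, timestamp: float) -> Optional[int]:
    """Return +1 (moving right), -1 (moving left) or None if undetermined.

    Simplified estimate: compare the mean column of the newest pixels with the
    mean column of all pixels still present in the history.
    """
    newest = [x for row in mhi for x, v in enumerate(row) if v == timestamp]
    history = [x for row in mhi for x, v in enumerate(row) if v > 0]
    if not newest or not history:
        return None
    shift = sum(newest) / len(newest) - sum(history) / len(history)
    if shift == 0:
        return None
    return 1 if shift > 0 else -1
```

Once the direction is known, the side of the moving region that faces it gives the front part of the object, which is where the contour and morphology step would be applied.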