User Modeling on Mobile Device Based on Facial Clustering and Object Detection in Photos and Videos
The article describes an approach to extracting user preferences from the analysis of the gallery of photos and videos on a mobile device. We propose to first use fast SSD-based methods to detect objects of interest in offline mode, directly on the mobile device. Next, we perform facial analysis of all visual data: we extract feature vectors from the detected facial regions, cluster them, and select public photos and videos that do not contain faces from the large clusters corresponding to the owner of the mobile device and his or her friends and relatives. At the second stage, these public images are processed on a remote server using very accurate but rather slow object detectors. An experimental study of several contemporary detectors is presented on a specially designed subset of the MS COCO, ImageNet and Open Images datasets.
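The privacy-filtering stage described above can be sketched in a few lines. The sketch below is a hypothetical illustration only: the greedy distance-threshold clustering, the `threshold` and `min_cluster_size` parameters, and the helper names are all stand-ins, since the abstract does not specify the clustering method actually used.

```python
import numpy as np

def cluster_faces(embeddings, threshold=0.5):
    """Greedy distance-threshold clustering of face embeddings.

    Each embedding joins the first cluster whose centroid lies within
    `threshold` (Euclidean distance); otherwise it starts a new cluster.
    A stand-in for the (unspecified) clustering used in the paper.
    """
    clusters = []   # list of lists of face indices
    centroids = []
    for i, e in enumerate(embeddings):
        placed = False
        for c, (idxs, mu) in enumerate(zip(clusters, centroids)):
            if np.linalg.norm(e - mu) < threshold:
                idxs.append(i)
                centroids[c] = np.mean([embeddings[j] for j in idxs], axis=0)
                placed = True
                break
        if not placed:
            clusters.append([i])
            centroids.append(e.copy())
    return clusters

def select_public_photos(photo_faces, clusters, min_cluster_size=2):
    """Return indices of photos containing no face from a 'large' cluster
    (i.e., the device owner, friends, relatives); only these photos would
    be sent to the remote server for accurate object detection.

    photo_faces: list mapping photo index -> list of face indices in it.
    """
    private_faces = set()
    for idxs in clusters:
        if len(idxs) >= min_cluster_size:
            private_faces.update(idxs)
    return [p for p, faces in enumerate(photo_faces)
            if not any(f in private_faces for f in faces)]
```

For example, if three faces cluster together (a frequently appearing person) and a fourth is a singleton, only photos without the clustered faces are marked public.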
The Tenth International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies (UBICOMM 2016), held between October 9 and 13, 2016 in Venice, Italy, continued a series of events addressing the fundamentals of ubiquitous systems and the new applications related to them. The rapid advances in ubiquitous technologies build on more than 35 years of research in distributed computing systems and more than two decades of mobile computing. The ubiquity vision is becoming a reality. Hardware and software components have evolved to deliver functionality under failure-prone environments with limited resources. The advent of web services and the progress on wearable devices, ambient components, user-generated content, mobile communications, and new business models have generated new applications and services. The conference bridged software issues and hardware challenges through mobile communications. Advances in web services technologies, along with their integration into mobility, online and new business models, provide a technical infrastructure that enables the progress of mobile services and applications. These include dynamic and on-demand services, context-aware services, and mobile web services. While driving new business models and new online services, particular techniques must be developed for web service composition, web service-driven system design methodology, creation of web services, and on-demand web services. As mobile and ubiquitous computing becomes a reality, more formal and informal learning will take place outside the confines of the traditional classroom. Two trends converge to make this possible: increasingly powerful cell phones and PDAs, and improved access to wireless broadband. At the same time, due to the increasing complexity, modern learners will need tools that operate in an intuitive manner and are flexibly integrated into the surrounding learning environment.
Educational services will become more customized and personalized, and more frequently subject to change. Learning and teaching are now becoming less tied to physical locations, co-located members of a group, and co-presence in time. Learning and teaching increasingly take place in fluid combinations of virtual and "real" contexts, and fluid combinations of presence in time, space and participation in community. For the learner, full access to and an abundance of communicative opportunities and information retrieval represent new challenges and affordances. Consequently, the educational challenges are numerous at the intersection of technology development, curriculum development, content development and educational infrastructure. The event was very competitive in its selection process and very well received by the international scientific and industrial communities. As such, it attracted excellent contributions and active participation from all over the world. We were very pleased to receive a large number of top quality contributions. The conference had the following tracks: Ubiquitous Software and Security; Mobility; Context-awareness in Intelligent Systems and Smart Spaces; Ubiquitous Mobile Services Trends and Challenges; Users, Applications, and Business models; Ubiquitous Devices and Operative Systems; Collaborative Ubiquitous Systems; Smart Spaces and Internet of Things; Toward Emerging Technology for Harbor Systems and Services. We take here the opportunity to warmly thank all the members of the UBICOMM 2016 technical program committee, as well as the numerous reviewers. The creation of such a high quality conference program would not have been possible without their involvement. We also kindly thank all the authors who dedicated much of their time and effort to contribute to UBICOMM 2016. We truly believe that, thanks to all these efforts, the final conference program consisted of top quality contributions.
Also, this event could not have become a reality without the support of many individuals, organizations and sponsors. We gratefully thank the members of the UBICOMM 2016 organizing committee for their help in handling the logistics and for their work that made this professional meeting a success. We hope UBICOMM 2016 was a successful international forum for exchanging ideas and results between academia and industry and for promoting further progress in the field of ubiquitous systems and the new applications related to them. We also hope that Venice, Italy, provided a pleasant environment during the conference and that everyone saved some time to enjoy the unique charm of the city.
The sixteen-volume set comprising the LNCS volumes 11205-11220 constitutes the refereed proceedings of the 15th European Conference on Computer Vision, ECCV 2018, held in Munich, Germany, in September 2018.
The 776 revised papers presented were carefully reviewed and selected from 2439 submissions. The papers are organized in topical sections on learning for vision; computational photography; human analysis; human sensing; stereo and reconstruction; optimization; matching and recognition; video attention; and poster sessions.
We present a system for large-scale automatic traffic sign recognition and mapping, and we experimentally justify the design choices made for the different components of the system. Our system works with more than 140 different classes of traffic signs and, owing to training on synthetically generated images, does not require labor-intensive labelling of a large amount of training data. We evaluated our system on a large dataset of Russian traffic signs and made this dataset publicly available to encourage future comparison.
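Training on synthetically generated images, as described above, typically means compositing sign templates onto background scenes with random geometric and photometric jitter. The sketch below is a minimal, hypothetical illustration of that idea (random scale, position, and brightness only); the actual generation pipeline of the system is not detailed in the abstract and is surely richer.

```python
import numpy as np

def nn_resize(img, new_h, new_w):
    """Nearest-neighbour resize of an HxWxC image via index arrays
    (kept numpy-only for the sake of a self-contained sketch)."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows][:, cols]

def synthesize(template, background, rng):
    """Paste a randomly scaled, brightness-jittered sign template onto a
    background image; return the image and the ground-truth bounding box.
    Scale and brightness ranges are illustrative assumptions."""
    bh, bw = background.shape[:2]
    size = rng.integers(16, min(bh, bw) // 2)       # random scale (pixels)
    sign = nn_resize(template, size, size)
    gain = rng.uniform(0.6, 1.4)                    # brightness jitter
    sign = np.clip(sign.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    y = rng.integers(0, bh - size)                  # random placement
    x = rng.integers(0, bw - size)
    out = background.copy()
    out[y:y + size, x:x + size] = sign
    return out, (x, y, size, size)                  # (x, y, w, h) label
```

Each call yields an image together with an exact bounding-box label for free, which is precisely what removes the need for manual annotation.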
A new public dataset of traffic sign images is presented. The dataset is intended for training and testing traffic sign recognition algorithms. We describe the dataset structure and guidelines for working with the dataset, and compare it with previously published traffic sign datasets. An evaluation of modern detection and classification algorithms conducted on the proposed dataset has shown that existing methods for recognizing a wide class of traffic signs do not achieve the accuracy and completeness required for a number of applications.
The EPiC Series in Language and Linguistics publishes high quality collections of papers in language, linguistics and related areas.
It has been shown that the activations invoked by an image within the top layers of a large convolutional neural network provide a high-level descriptor of the visual content of the image. In this paper, we investigate the use of such descriptors (neural codes) within the image retrieval application. In experiments with several standard retrieval benchmarks, we establish that neural codes perform competitively even when the convolutional neural network has been trained for an unrelated classification task (e.g. Image-Net). We also evaluate the improvement in the retrieval performance of neural codes when the network is retrained on a dataset of images that are similar to images encountered at test time. We further evaluate the performance of the compressed neural codes and show that a simple PCA compression provides very good short codes that give state-of-the-art accuracy on a number of datasets. In general, neural codes turn out to be much more resilient to such compression in comparison to other state-of-the-art descriptors. Finally, we show that discriminative dimensionality reduction trained on a dataset of pairs of matched photographs improves the performance of PCA-compressed neural codes even further. Overall, our quantitative experiments demonstrate the promise of neural codes as visual descriptors for image retrieval.
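The PCA-compression-and-retrieval step described above is straightforward to sketch. The snippet below is a minimal numpy-only illustration, assuming the neural codes are already available as row vectors (random vectors stand in for real CNN activations here); the target dimensionality and the cosine ranking are common choices but are assumptions, not the paper's exact setup.

```python
import numpy as np

def fit_pca(codes, dim):
    """Learn a PCA projection (mean + top `dim` principal axes) from
    a matrix of neural codes, shape (n_images, n_features)."""
    mean = codes.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal axes
    _, _, vt = np.linalg.svd(codes - mean, full_matrices=False)
    return mean, vt[:dim]

def compress(codes, mean, axes):
    """Project codes to short descriptors and L2-normalize them,
    so that a plain dot product equals cosine similarity."""
    z = (codes - mean) @ axes.T
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def retrieve(query, database):
    """Return database indices ranked by cosine similarity to the query."""
    return np.argsort(-(database @ query))
```

Because the short codes are L2-normalized, retrieval reduces to a single matrix-vector product followed by a sort, which is what makes compressed neural codes attractive at scale.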
We investigate a specific machine vision problem, namely, video-based detection of a moving forklift truck. It is shown that the detection quality of state-of-the-art local descriptors (SURF, SIFT, etc.) is not satisfactory if the resolution is low and the illumination changes dramatically. In this paper, we propose a simple mathematical morphology algorithm to detect the presence of cargo on the forklift truck. First, the movement direction is estimated by updating the motion history image, and the front part of the moving object is obtained. Next, contours are detected, and morphological operations in front of the moving object are used to estimate simple geometric features of an empty forklift. The experimental study shows that the proposed method has a 40% lower FAR and a 27% lower FRR in comparison with conventional matching of local descriptors. Moreover, our algorithm is 7 times faster.
The problem of automatic detection of a moving forklift truck in video data is explored. In computer vision terms, this task is formulated as moving object detection in a noisy environment. It is shown that state-of-the-art local descriptors (SURF, SIFT, FAST, ORB) do not provide satisfactory detection quality if the camera resolution is low, the lighting changes dramatically and shadows are present. In this paper we propose a simple mathematical morphology algorithm to detect the presence of cargo on the forklift truck. Its first step is the estimation of the movement direction and the front part of the truck by updating the motion history image. The second step is the application of Canny contour detection and binary morphological operations in front of the moving object to estimate simple geometric features of an empty forklift. The algorithm is implemented with the OpenCV library. Our experimental study shows that the best results are achieved when the difference in the width of the bounding rectangles is used as the feature. Namely, the detection accuracy is 78.7% (compared with 40% achieved by the best local descriptor), while the average frame processing time is only 5 ms (compared with 35 ms for the fastest descriptor).
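The best-performing feature reported above, the difference in bounding-rectangle widths, is simple enough to sketch. The snippet below is a numpy-only illustration of that decision rule, assuming the motion-history and Canny/morphology steps (done with OpenCV in the paper) have already produced a binary mask of the front region; the `tol` threshold is a hypothetical calibration parameter.

```python
import numpy as np

def bounding_width(mask):
    """Width of the bounding rectangle of the foreground pixels
    in a binary mask (0 if the mask is empty)."""
    cols = np.flatnonzero(mask.any(axis=0))
    return 0 if cols.size == 0 else int(cols[-1] - cols[0] + 1)

def carries_cargo(front_mask, empty_width, tol=5):
    """Decide whether the forklift carries cargo by comparing the width of
    the front-region bounding rectangle with the calibrated width of an
    empty forklift (`empty_width`, in pixels). `tol` is an assumed
    tolerance; the paper selects this feature experimentally."""
    return bounding_width(front_mask) - empty_width > tol
```

The intuition is that cargo on the forks widens the silhouette of the front region well beyond that of an empty forklift, so a single width comparison per frame suffices, which is consistent with the reported 5 ms per-frame cost.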
We analyze a way to increase the computational efficiency of video-based image recognition methods that match high-dimensional feature vectors extracted by deep convolutional neural networks. We propose an algorithm for approximate nearest neighbor search. At the first step, for a given video frame, the algorithm verifies the reference image obtained when recognizing the previous frame. After that, the frame is compared with a small number of reference images. Each next reference image to examine is chosen so as to maximize the conditional probability density of the distances to the reference instances tested at previous steps. To decrease the required memory, we precompute only the distances from all the images to a small number of instances (pivots). In experiments with face photos from the Labeled Faces in the Wild and PubFig83 datasets and with video data from YouTube Faces, we show that our algorithm accelerates the recognition procedure by 1.4–4 times compared with known approximate nearest neighbor methods.
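The pivot idea above can be sketched as follows. This is a simplified illustration: instead of the paper's conditional-probability-density criterion for choosing the next reference image, the sketch ranks candidates by a triangle-inequality lower bound computed from the precomputed pivot distances, and the `budget` parameter is a stand-in for the stopping rule.

```python
import numpy as np

def precompute_pivot_table(references, pivots):
    """Distances from every reference image to every pivot,
    shape (n_refs, n_pivots); stored instead of all pairwise distances,
    which is what saves memory."""
    diff = references[:, None, :] - pivots[None, :, :]
    return np.linalg.norm(diff, axis=2)

def approx_nn(query, references, pivots, pivot_table, budget):
    """Approximate nearest neighbour: rank references by a pivot-based
    lower bound on their distance to the query, then compute exact
    distances only for the `budget` best-ranked candidates."""
    q_piv = np.linalg.norm(pivots - query, axis=1)   # query-to-pivot dists
    # max_j |d(ref, p_j) - d(query, p_j)| is a valid lower bound on the
    # true distance d(ref, query) by the triangle inequality
    score = np.abs(pivot_table - q_piv).max(axis=1)
    candidates = np.argsort(score)[:budget]
    exact = np.linalg.norm(references[candidates] - query, axis=1)
    return candidates[int(np.argmin(exact))]
```

Only `n_refs × n_pivots` distances are stored, and each query costs a handful of pivot distances plus `budget` exact comparisons instead of a full linear scan, which is the source of the speedup the abstract reports.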