People Tracking Algorithm for Human Height Mounted Cameras
We present a new people tracking method for a camera mounted at human height, e.g. one attached near an information or advertising stand. We use a state-of-the-art particle filter approach and improve it by explicitly modeling object visibility, which enables the method to cope with difficult object overlaps. We employ our own method based on online-boosting classifiers to resolve occlusions and show that it is well suited for tracking multiple objects. In addition to training an online classifier that is updated every frame, we propose to store the object appearance and update it with a certain lag. This helps to correctly handle situations in which one person enters the scene while another leaves it at the same time. We demonstrate the performance of our algorithm and the advantages of our contributions on our own video dataset.
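The particle filter idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' tracker: the 1-D state, the Gaussian motion and observation models, and the `visibility` blending term are all assumptions made for illustration, and the function name `particle_filter_step` is hypothetical.

```python
import math
import random


def particle_filter_step(particles, observation, visibility, rng,
                         motion_std=1.0, obs_std=2.0):
    """One predict/update/resample step for a 1-D position tracker.

    `visibility` in [0, 1] is a hypothetical occlusion term: at 0 the
    observation is ignored and every particle keeps the same weight.
    """
    # Predict: diffuse each particle with Gaussian motion noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Update: Gaussian likelihood, blended with a flat term by visibility.
    weights = []
    for p in predicted:
        likelihood = math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
        weights.append(visibility * likelihood + (1.0 - visibility))

    total = sum(weights)
    n = len(predicted)
    if total <= 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        weights = [1.0 / n] * n
    else:
        weights = [w / total for w in weights]

    # Resample: draw particles with replacement in proportion to weight.
    resampled = rng.choices(predicted, weights=weights, k=n)
    return resampled, weights
```

With `visibility=1.0` the particles concentrate around the observation after a few steps; with `visibility=0.0` the step reduces to pure motion diffusion, which is one simple way to avoid trusting measurements of an occluded object.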
We present a new combined approach for monocular model-based 3D tracking. A preliminary object pose is estimated by using a keypoint-based technique. The pose is then refined by optimizing the contour energy function. The energy determines the degree of correspondence between the contour of the model projection and the image edges. It is calculated based on both the intensity and orientation of the raw image gradient. For optimization, we propose a technique and search area constraints that allow overcoming the local optima and taking into account information obtained through keypoint-based pose estimation. Owing to its combined nature, our method eliminates numerous issues of keypoint-based and edge-based approaches. We demonstrate the efficiency of our method by comparing it with state-of-the-art methods on a public benchmark dataset that includes videos with various lighting conditions, movement patterns, and speed.
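A contour energy of the kind described in this abstract can be sketched as follows. The abstract does not give the formula, so this is only an assumed form: gradient magnitude weighted by how well the gradient direction aligns with the contour normal, averaged over sampled contour points and negated so that lower is better. The function name `contour_energy` and its inputs are hypothetical.

```python
import math


def contour_energy(gradients, normals):
    """Return an energy that is lower when strong image gradients are
    aligned with the normals of the projected model contour.

    gradients: list of (gx, gy) image gradients sampled at contour points.
    normals:   list of (nx, ny) contour normals at the same points.
    """
    if not gradients:
        return 0.0
    total = 0.0
    for (gx, gy), (nx, ny) in zip(gradients, normals):
        magnitude = math.hypot(gx, gy)
        normal_len = math.hypot(nx, ny)
        if magnitude == 0.0 or normal_len == 0.0:
            continue  # a flat region or degenerate normal adds nothing
        # |cos| of the angle between gradient and normal (orientation term).
        alignment = abs(gx * nx + gy * ny) / (magnitude * normal_len)
        # Intensity term (magnitude) weighted by the orientation term.
        total += magnitude * alignment
    return -total / len(gradients)
```

A pose optimizer would evaluate this energy for the contour projected under each candidate pose and keep the pose with the lowest value.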
The article is devoted to the history and problems of interface design. It shows the complexity and importance of effective interfaces and notes that the problem is systemic, multilevel, and interdisciplinary. When new systems are created, serious attention should be paid to the efficiency of the human operator: the human is still the leading element determining the efficiency of any ergatic system. The main means of control in ergatic systems that include computers is the graphic manipulator (GM), which is used to operate on-screen controls. The main styles of user interface are described. The most popular are the graphical user interface (GUI) and, based on it, the web user interface (WUI). The development of hardware and computer modeling technology has led to the active introduction of virtual reality technology, which immerses people in artificial worlds. The main features of such environments, which are called immersive, are full control over all of their parameters and the sense of presence that arises in the people within them. Induced-environment technology makes possible a number of new kinds of interface, not generally available at present, that use specially engineered virtual environments. Much attention is paid to the most advanced systems, contactless control systems, which consist of a camera and sophisticated software. The drawbacks of modern contactless control are described. Work is under way on a contactless intelligent interface that will be controlled using data from a video camera installed on the computer, have high noise immunity, reliably identify the user, recognize the situational environment, and have an acceptable cost.
The sixteen-volume set comprising the LNCS volumes 11205-11220 constitutes the refereed proceedings of the 15th European Conference on Computer Vision, ECCV 2018, held in Munich, Germany, in September 2018. The 776 revised papers presented were carefully reviewed and selected from 2439 submissions. The papers are organized in topical sections on learning for vision; computational photography; human analysis; human sensing; stereo and reconstruction; optimization; matching and recognition; video attention; and poster sessions.
Welcome to the 23rd International ACM Conference on 3D Web Technology - Web3D 2018, organized in cooperation with the Web3D Consortium at the Poznań University of Economics and Business in Poznań, Poland on June 20-22, 2018.
This year's theme "3D Everywhere" emphasizes the global scope and impact of current and future 3D technology. Web3D fosters and supports the increasing development, use, and utility of 3D technologies for researchers, entrepreneurs, developers, domain experts as well as users. The goal of the conference is to share innovative and creative ideas that enable development of 3D applications for a wide range of 3D environments, including the web, mobile systems as well as virtual and augmented reality (VR & AR) setups.
The volume contains the abstracts of the 12th International Conference "Intelligent Data Processing: Theory and Applications". The conference is organized by the Russian Academy of Sciences, the Federal Research Center "Informatics and Control" of the Russian Academy of Sciences and the Scientific and Coordination Center "Digital Methods of Data Mining". The conference has been held biennially since 1989. It is one of the most recognizable scientific forums on data mining, machine learning, pattern recognition, image analysis, signal processing, and discrete analysis. The Organizing Committee of IDP-2018 is grateful to Forecsys Co. and CFRS Co. for providing assistance in the conference preparation and execution. The conference is funded by RFBR, grant 18-07-20075. The conference website is http://mmro.ru/en/.
In this paper, we take up the long-standing problem of how to recover 3-D shapes represented by a 2-D image, such as the image on the retina of the eye, or in a video camera. Our approach is biologically grounded in a theory of how the human visual system solves this problem, focusing on shapes that are mirror symmetrical in 3-D. A 3-D mirror-symmetrical shape can be recovered from a single 2-D orthographic or perspective image by applying several a priori constraints: 3-D mirror symmetry, 3-D compactness, and planarity of contours. From the computational point of view, the application of a 3-D symmetry constraint is challenging because it requires establishing 3-D symmetry correspondence among features of a 2-D image, which itself is asymmetrical for almost all viewing directions relative to the 3-D symmetrical shape. We describe new invariants of a 3-D to 2-D projection for the case of a pair of mirror-symmetrical planar contours, and we formally state and prove the necessary and sufficient conditions for detection of this type of symmetry in a single orthographic and perspective image.
This paper proposes a deep learning architecture based on Residual Network that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and image segmentation. We present experimental results showing that this model improves the computational efficiency of Residual Networks on the challenging ImageNet classification and COCO object detection datasets. Additionally, we evaluate the computation time maps on the visual saliency dataset cat2000 and find that they correlate surprisingly well with human eye fixation positions.
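One way to picture per-region adjustment of the number of executed layers is a halting rule in the style of adaptive computation time. The abstract does not describe the mechanism, so the rule below is an assumption: each layer emits a halting score per spatial position, and a position stops once its cumulative score reaches `1 - eps`. The function name `layers_executed` and the parameter `eps` are hypothetical.

```python
def layers_executed(halting_scores, eps=0.01):
    """Return, for each spatial position, how many layers would run.

    halting_scores[l][y][x] is an assumed per-layer, per-position halting
    score in [0, 1]. A position halts at the first layer where its
    cumulative score reaches 1 - eps; otherwise it runs every layer.
    """
    num_layers = len(halting_scores)
    height = len(halting_scores[0])
    width = len(halting_scores[0][0])
    counts = [[num_layers] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            cumulative = 0.0
            for layer in range(num_layers):
                cumulative += halting_scores[layer][y][x]
                if cumulative >= 1.0 - eps:
                    counts[y][x] = layer + 1
                    break
    return counts
```

The resulting map of layer counts plays the role of a computation time map: positions that halt early receive less computation than positions that run through every layer.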
The Shape Boltzmann Machine (SBM) and its multilabel version, MSBM, have recently been introduced as deep generative models that capture the variations of an object's shape. While more flexible, MSBM requires datasets with labeled object parts for training. In this paper, we present an algorithm for training MSBM using binary masks of objects and seeds that approximately correspond to the locations of object parts. The latter can be obtained from part-based detectors in an unsupervised manner. We derive a latent variable model and an EM-like training procedure for adjusting the weights of MSBM using a deep learning framework. We show that the model trained by our method outperforms SBM in tasks related to binary shapes and is very close to the original MSBM in terms of the quality of multilabel shapes.