Markov chain Monte Carlo based video tracking algorithm
The paper considers the problem of multiple-person tracking. We present an algorithm for automatic people tracking in surveillance videos recorded by static cameras. The proposed algorithm extends an approach based on tracking-by-detection of people's heads and data association using Markov chain Monte Carlo (MCMC). Short track fragments (tracklets) are built by local tracking of people's heads. Tracklet postprocessing and accurate interpolation of results are shown to reduce the number of false positives. We use position deviations of tracklets and a revised entry/exit point factor to separate pedestrians from false positives. The paper presents a new method for estimating body position that increases the precision of the tracker. Finally, we replace the HOG-based detector with a cascade one. Our evaluation shows that the proposed modifications significantly increase tracking quality.
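The abstract mentions interpolating tracking results between detections but does not say how. The following is a minimal sketch that assumes plain linear interpolation of head positions across missing frames; the function name `interpolate_track` and the frame-to-position dictionary layout are hypothetical and not taken from the paper.

```python
def interpolate_track(detections):
    """Fill missing frames of one track by linear interpolation.

    detections: dict mapping frame index -> (x, y) head position.
    Returns a dict covering every frame from the first to the last detection.
    Linear interpolation is an assumption; the paper may use another scheme.
    """
    frames = sorted(detections)
    if not frames:
        return {}
    filled = {}
    # Walk over consecutive pairs of detected frames and fill the gap between them.
    for start, end in zip(frames, frames[1:]):
        x0, y0 = detections[start]
        x1, y1 = detections[end]
        for frame in range(start, end):
            t = (frame - start) / (end - start)
            filled[frame] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    # The last detected frame is not covered by the loop above.
    filled[frames[-1]] = detections[frames[-1]]
    return filled


track = interpolate_track({0: (0.0, 0.0), 4: (8.0, 4.0)})
```

Here `track` holds positions for frames 0 to 4, with the three missing frames placed on the straight line between the two detections.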
This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016.
The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life and social sciences; morphological and technological approaches to image analysis.
The Shape Boltzmann Machine (SBM) and its multilabel version MSBM have recently been introduced as deep generative models that capture the variations of an object's shape. While MSBM is more flexible, it requires datasets with labeled object parts for training. In this paper we present an algorithm for training MSBM using binary masks of objects and seeds that approximately correspond to the locations of object parts. The latter can be obtained from part-based detectors in an unsupervised manner. We derive a latent variable model and an EM-like training procedure for adjusting the weights of MSBM using a deep learning framework. We show that the model trained by our method outperforms SBM in tasks related to binary shapes and is very close to the original MSBM in terms of the quality of multilabel shapes.
We present a new combined approach for monocular model-based 3D tracking. A preliminary object pose is estimated by using a keypoint-based technique. The pose is then refined by optimizing the contour energy function. The energy determines the degree of correspondence between the contour of the model projection and the image edges. It is calculated based on both the intensity and orientation of the raw image gradient. For optimization, we propose a technique and search area constraints that allow overcoming the local optima and taking into account information obtained through keypoint-based pose estimation. Owing to its combined nature, our method eliminates numerous issues of keypoint-based and edge-based approaches. We demonstrate the efficiency of our method by comparing it with state-of-the-art methods on a public benchmark dataset that includes videos with various lighting conditions, movement patterns, and speed.
In our recent papers, we proposed a new family of residual convolutional neural networks trained for semi-dense and sparse depth reconstruction without the use of the RGB channel. The proposed models can be used with low-resolution depth sensors or SLAM methods that estimate partial depth with certain distributions. We proposed using a perceptual loss for training depth reconstruction in order to better preserve edge structure and reduce the over-smoothing of models trained on MSE loss alone.
This paper is a reproducibility companion guide to training, running and evaluating the proposed methods. It also provides links to further studies prompted by reviewers' comments and to related problems of depth reconstruction.
Modern biometric systems based on face recognition demonstrate high recognition quality, but they are vulnerable to face presentation attacks, such as photo or replay attacks. Existing face anti-spoofing methods are mostly based on texture analysis and, due to a lack of training data, use either hand-crafted features or fine-tuned pretrained deep models. In this paper we present a novel CNN-based approach to face anti-spoofing, based on joint analysis of the presence of a spoofing medium and eye blinking. To train our classifiers, we propose a procedure for synthetic data generation that allows us to train powerful deep models from scratch. Experimental analysis on challenging datasets (CASIA-FASD, NUAA Imposter) shows that our method can obtain state-of-the-art results.
ICCV is the premier international computer vision event comprising the main conference and several co-located workshops and tutorials. With its high quality and low cost, it provides an exceptional value for students, academics and industry researchers.
The authors consider the problem of human pose estimation using probabilistic convolutional neural networks. They explore ways to improve human pose estimation accuracy on two standard benchmarks, the MPII Human Pose and Leeds Sports Pose (LSP) datasets, using frameworks for probabilistic deep learning. Such frameworks transform a deterministic neural network into a probabilistic one and allow sampling of independent and equiprobable hypotheses (different outputs) for a given input. Overlapping body parts and body joints hidden under clothes or other obstacles make the problem of human pose estimation ambiguous. In this context, to estimate joint positions accurately, they use the uncertainty in the network's predictions, which is represented by the variance of the hypotheses provided by a probabilistic convolutional neural network, while confidence is characterised by their mean. Their work is based on current CNN cascades for pose estimation. They propose and evaluate three probabilistic convolutional neural networks built on top of deterministic ones with two probabilistic deep learning frameworks: DISCO networks and Bayesian SegNet. The authors evaluate their models on standard pose estimation benchmarks and show that the proposed probabilistic models outperform the base deterministic ones.
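The mean and variance of sampled hypotheses described above can be illustrated with a small sketch. It assumes each hypothesis is a single (x, y) joint position, which is a simplification; the paper's networks may output heat maps or full poses. The function name `summarize_hypotheses` is hypothetical.

```python
from statistics import fmean, pvariance


def summarize_hypotheses(hypotheses):
    """Summarize sampled joint-position hypotheses for one input image.

    hypotheses: list of (x, y) positions, one per sample drawn from a
    probabilistic network (the sampling itself is not shown here).
    Returns the mean position and the total variance (x variance plus
    y variance), used here as a simple scalar uncertainty score.
    """
    xs = [h[0] for h in hypotheses]
    ys = [h[1] for h in hypotheses]
    mean = (fmean(xs), fmean(ys))
    variance = pvariance(xs) + pvariance(ys)
    return mean, variance


mean, variance = summarize_hypotheses([(10.0, 20.0), (12.0, 20.0), (14.0, 20.0)])
```

A joint whose hypotheses agree closely gets a small variance, while an occluded joint whose hypotheses scatter gets a large one.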
Human gait or walking manner is a biometric feature that allows identification of a person when other biometric features such as the face or iris are not visible. In this study, the authors present a new pose-based convolutional neural network model for gait recognition. Unlike many methods that consider the full-height silhouette of a moving person, they consider the motion of points in the areas around human joints. To extract motion information, they estimate the optical flow between consecutive frames. They propose a deep convolutional model that computes pose-based gait descriptors. They compare different network architectures and aggregation methods and experimentally assess various body parts to determine which are the most important for gait recognition. In addition, they investigate the generalisation ability of the developed algorithms by transferring them between datasets. The results of these experiments show that their approach outperforms state-of-the-art methods.
We consider a model for organizing cargo transportation between two node stations connected by a railway line that contains a certain number of intermediate stations. Cargo moves in one direction only. Such a situation may occur, for example, if one node station is located in a region that produces raw material for a manufacturing industry located in another region, where the other node station is situated. Freight traffic is organized by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, and the rule for distributing cargo to the final node stations. The process of cargo transportation is governed by a given control rule. For such a model, one must determine the possible modes of cargo transportation and describe their properties. The model is described by a finite-dimensional system of differential equations with nonlocal linear restrictions. The class of solutions satisfying the nonlocal linear restrictions is extremely narrow. This makes it necessary to “correctly” extend solutions of the system of differential equations to a class of quasi-solutions whose distinctive feature is gaps at a countable number of points. Using the fourth-order Runge–Kutta method, we were able to construct these quasi-solutions numerically and determine their rate of growth. The main technical difficulty was obtaining quasi-solutions that satisfy the nonlocal linear restrictions. Furthermore, we investigated how the quasi-solutions and, in particular, the sizes of the gaps (jumps) depend on a number of model parameters that characterize the control rule, the cargo transportation technologies, and the intensity of cargo supply at a node station.
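For reference, the classical fourth-order Runge–Kutta scheme named above can be sketched as follows. This is only the generic integrator for a system y' = f(t, y); it does not handle the nonlocal linear restrictions or the jumps of quasi-solutions, which are the hard part of the paper. The names `rk4_step` and `integrate` and the test equation are illustrative assumptions.

```python
import math


def rk4_step(f, t, y, h):
    """Advance the system y' = f(t, y) by one classical RK4 step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate(f, t0, y0, t1, steps):
    """Integrate from t0 to t1 with a fixed number of equal RK4 steps."""
    h = (t1 - t0) / steps
    y = list(y0)
    for i in range(steps):
        y = rk4_step(f, t0 + i * h, y, h)
    return y


# Illustrative test problem (not from the paper): y' = y, y(0) = 1, so y(1) = e.
approx = integrate(lambda t, y: [y[0]], 0.0, [1.0], 1.0, 100)
error = abs(approx[0] - math.e)
```

With 100 steps the error on this test problem is far below 1e-8, reflecting the fourth-order accuracy of the scheme.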
Event logs collected by modern information and technical systems usually contain enough data for automated discovery of process models. A variety of algorithms have been developed for process model discovery, conformance checking, log-to-model alignment, comparison of process models, etc.; nevertheless, quick analysis of ad hoc selected parts of a log still lacks a full-fledged implementation. This paper describes a ROLAP-based method of multidimensional event log storage for process mining. The result of the log analysis is visualized as a directed graph representing the union of all possible event sequences, ranked by their occurrence probability. Our implementation allows the analyst to discover process models for sublogs defined by an ad hoc selection of criteria and a value of occurrence probability.
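The directed graph of event sequences described above might be built along the following lines. This is a sketch under assumptions: the log is an in-memory list of traces rather than a ROLAP store, and the probability of an edge is taken to be its share of all transitions leaving the source event, which may differ from the paper's definition. The function name `follows_graph` and its `min_probability` parameter are hypothetical.

```python
from collections import Counter


def follows_graph(traces, min_probability=0.0):
    """Build a directly-follows graph with edge probabilities.

    traces: list of event sequences, e.g. [["a", "b"], ["a", "c"]].
    Returns a dict {(source, target): probability}, ordered from the most
    to the least probable edge, keeping edges at or above min_probability.
    """
    counts = Counter()    # how often `target` directly follows `source`
    outgoing = Counter()  # how many transitions leave `source` in total
    for trace in traces:
        for source, target in zip(trace, trace[1:]):
            counts[(source, target)] += 1
            outgoing[source] += 1
    edges = {edge: n / outgoing[edge[0]] for edge, n in counts.items()}
    ranked = sorted(edges.items(), key=lambda item: -item[1])
    return {edge: p for edge, p in ranked if p >= min_probability}


log = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
graph = follows_graph(log, min_probability=0.5)
```

Raising `min_probability` prunes rare transitions, which roughly corresponds to selecting a sublog by occurrence probability.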
The geographic information system (GIS) is based on the first and only Russian Imperial Census of 1897 and the First All-Union Census of the Soviet Union of 1926. The GIS features vector data (shapefiles) of all provinces of the two states. For the 1897 census, there is information about linguistic, religious, and social estate groups. The part based on the 1926 census features nationality. Both shapefiles include information on gender and on rural and urban population. The GIS allows for producing any necessary maps for individual studies of the period that require the administrative boundaries and demographic information.
Existing approaches suggest that IT strategy should be a reflection of business strategy. In practice, however, organisations often do not follow their business strategy even if it is formally declared. In these conditions, IT strategy can be viewed not as a plan, but as an organisational shared view on the role of information systems. This approach generally reflects only a top-down perspective of IT strategy. It can therefore be supplemented by a strategic behaviour pattern (i.e., a more or less standard response to changes that is formed as a result of previous experience) to implement a bottom-up approach. Two components that can help to establish an effective reaction to new IT initiatives are proposed here: a model of IT-related decision making, and an efficiency metric for estimating the maturity of business processes and the corresponding IT. The use of the proposed tools is demonstrated in practical cases.