Deep neural networks and maximum likelihood search for approximate nearest neighbor in video-based image recognition
We analyzed how to increase the computational efficiency of video-based image recognition methods that match high-dimensional feature vectors extracted by deep convolutional neural networks. We proposed an algorithm for approximate nearest neighbor search. At the first step, for a given video frame, the algorithm verifies the reference image obtained when recognizing the previous frame. After that, the frame is compared with a small number of reference images. Each next reference image to be examined is chosen so as to maximize the conditional probability density of the distances to the reference instances tested at previous steps. To decrease the required memory, we precompute only the distances from all the images to a small number of instances (pivots). In experiments with face photos from the Labeled Faces in the Wild and PubFig83 datasets and with video data from YouTube Faces, we showed that our algorithm accelerates the recognition procedure by a factor of 1.4–4 compared with known approximate nearest neighbor methods.
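The two-stage search described above can be illustrated with a rough sketch. It is not the authors' algorithm: the maximum likelihood choice of the next reference image is replaced here by a simpler least-squares mismatch between pivot distances, and all names (`build_pivot_table`, `recognize_frame`, `threshold`, `max_checks`) and the toy data are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_pivot_table(references, pivot_ids):
    """Precompute distances from every reference to each pivot (done once)."""
    return [[euclidean(ref, references[p]) for p in pivot_ids]
            for ref in references]

def recognize_frame(query, references, pivot_ids, pivot_table,
                    previous_id=None, threshold=0.5, max_checks=3):
    # Step 1: verify the reference image matched on the previous frame.
    if previous_id is not None:
        d = euclidean(query, references[previous_id])
        if d < threshold:
            return previous_id, d
    # Step 2 (assumed simplification): rank references by how closely their
    # stored pivot distances match the query's pivot distances, then compute
    # the true distance only for the few best-ranked candidates.
    query_to_pivots = [euclidean(query, references[p]) for p in pivot_ids]

    def mismatch(i):
        return sum((q - r) ** 2
                   for q, r in zip(query_to_pivots, pivot_table[i]))

    candidates = sorted(range(len(references)), key=mismatch)[:max_checks]
    best = min(candidates, key=lambda i: euclidean(query, references[i]))
    return best, euclidean(query, references[best])

# Toy usage: six 2-D "feature vectors", two of them used as pivots.
references = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]]
pivot_ids = [0, 3]
pivot_table = build_pivot_table(references, pivot_ids)
first = recognize_frame([5.1, 5.0], references, pivot_ids, pivot_table)
second = recognize_frame([5.1, 5.0], references, pivot_ids, pivot_table,
                         previous_id=first[0])
```

Only the pivot table is stored, so memory grows with the number of pivots rather than with the square of the database size.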
The article is devoted to the pattern recognition task in which the database contains a small number of samples per class. By mapping local continuous feature vectors to a discrete range, this problem is reduced to statistical classification of a set of discrete finite patterns. It is demonstrated that the Bayesian decision implemented in the probabilistic neural network (PNN), under the assumption that probability distributions can be estimated using the Parzen kernel with a Gaussian window of fixed variance for all classes, is not optimal for classifying a set of patterns. We present a novel modification of the PNN with homogeneity testing, which gives an optimal solution of the latter task under the same assumption about probability densities. By exploiting the discrete nature of the patterns, our modification avoids the well-known drawbacks of the memory-based approach implemented in both the PNN and the PNN with homogeneity testing, namely low classification speed and high memory requirements. Our modification requires only the storage and processing of the histograms of input and training samples. We present the results of an experimental study in two practically important tasks: 1) authorship attribution of Russian texts with character n-gram features; and 2) face recognition with well-known datasets (AT&T, FERET and JAFFE), comparing color and gradient-orientation histograms. Our results support the statement that the proposed network provides better accuracy (by 1-7%) and is much more robust to changes in the smoothing parameter of the Gaussian kernel function than the original PNN.
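For orientation, the baseline PNN decision rule mentioned above (not the proposed modification with homogeneity testing) can be sketched as follows. The sketch assumes equal class priors and a single smoothing parameter `sigma`; the function name `pnn_classify` and the toy data are hypothetical.

```python
import math

def pnn_classify(x, training, sigma=1.0):
    """Return the class whose Parzen density estimate at x is largest.

    training maps each class label to a list of feature vectors.
    """
    best_label, best_score = None, -1.0
    for label, samples in training.items():
        total = 0.0
        for s in samples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
            # Gaussian window with the same variance for every class.
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        score = total / len(samples)  # average kernel response of the class
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy usage with two classes and two training samples per class.
training = {"a": [[0.0, 0.0], [0.0, 1.0]], "b": [[5.0, 5.0], [6.0, 5.0]]}
label_near_a = pnn_classify([0.2, 0.3], training)
label_near_b = pnn_classify([5.5, 5.0], training)
```

Every training sample is visited for every query, which is the memory-based cost the abstract refers to.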
On the informatics and software side, questions of practical security are linked to algorithms for processing unstructured information in arrays of video frames obtained by cross-platform registration systems. Compression solutions become crucially important when the data rate of the video stream exceeds the capacity of the communication network. The basic image processing approach we used is to maintain the highest resolution for the main part of the surveyed object (for example, a person's face or figure) while minimizing the traffic generated by the image background by artificially replacing it with a homogeneous color fill. This method allowed us to obtain a significant compression ratio (up to 7000).
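The effect of background substitution on compressibility can be illustrated with a toy sketch. It assumes a grayscale frame stored as a list of rows, a known bounding box for the object, and the general-purpose lossless `zlib` compressor instead of a video codec, so it does not reproduce the figures reported above. The names `fill_background` and `compressed_size` are hypothetical.

```python
import random
import zlib

def fill_background(image, box, fill=128):
    """Keep pixels inside box (top, left, bottom, right); flatten the rest."""
    top, left, bottom, right = box
    return [
        [pix if top <= r < bottom and left <= c < right else fill
         for c, pix in enumerate(row)]
        for r, row in enumerate(image)
    ]

def compressed_size(image):
    """Size in bytes of the zlib-compressed pixel data."""
    raw = bytes(pix for row in image for pix in row)
    return len(zlib.compress(raw, 9))

# Toy usage: a noisy 64x64 frame with the "object" in the central 32x32 box.
random.seed(0)
frame = [[random.randrange(256) for _ in range(64)] for _ in range(64)]
object_box = (16, 16, 48, 48)
masked = fill_background(frame, object_box)
size_before = compressed_size(frame)
size_after = compressed_size(masked)
```

The object region is left untouched, so its resolution is preserved while the uniform background compresses to almost nothing.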
The article is devoted to the problem of image recognition in real-time applications with a large database containing hundreds of classes. The directed enumeration method is examined as an alternative to exhaustive search. This method has two advantages. First, it can be applied with similarity measures that do not satisfy metric properties (chi-square distance, Kullback-Leibler information discrimination, etc.). Second, the directed enumeration method increases recognition speed even in the most difficult cases, in which many neighbors are located at very similar distances; such cases are very important in practice. In this paper we present the results of an experimental study of the directed enumeration method, comparing color and gradient-orientation histograms in face recognition with well-known datasets (Essex, FERET). It is shown that the proposed method increases the computational efficiency of automatic image recognition by a factor of 3-12 in comparison with a conventional nearest neighbor classifier.
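The two similarity measures named above, together with the conventional exhaustive nearest neighbor search used as the baseline, can be sketched as follows. The directed enumeration method itself is not reproduced. The histograms are assumed to be normalized, and the small constant `eps` and all function names are assumptions of this sketch.

```python
import math

def chi_square(p, q, eps=1e-12):
    """Chi-square distance between two histograms."""
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence of p from q (not symmetric)."""
    return sum(a * math.log((a + eps) / (b + eps))
               for a, b in zip(p, q) if a > 0)

def nearest_neighbor(query, references, distance):
    """Exhaustive search: return the label of the closest reference."""
    return min(references, key=lambda label: distance(query, references[label]))

# Toy usage with three-bin histograms.
references = {"x": [0.7, 0.2, 0.1], "y": [0.1, 0.2, 0.7]}
match = nearest_neighbor([0.6, 0.3, 0.1], references, chi_square)

# The divergence depends on argument order, so it is not a metric.
p, q = [0.7, 0.2, 0.1], [0.5, 0.25, 0.25]
forward, backward = kl_divergence(p, q), kl_divergence(q, p)
```

Because `kl_divergence(p, q)` and `kl_divergence(q, p)` differ, search structures that rely on symmetry or the triangle inequality cannot be applied directly.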
An ensemble of classifiers has been built to solve the problem of video image recognition. The paper offers a way to estimate the a posteriori probability that an image belongs to a particular class when the nearest neighbor method is used with an arbitrary distance. The estimate is shown to be equivalent to the optimal naive Bayesian estimate when the Kullback-Leibler divergence is used as the proximity measure. The block diagram of a video image recognition system is presented. The system features automatic adaptation of the list of images of identical objects that is fed to the committee machine input. The system is tested on a face recognition task using popular databases (FERET, AT&T, Yale), and the results are discussed.
The problem of automatic detection of a moving forklift truck in video data is explored. In computer vision terms, this task is formulated as moving object detection in a noisy environment. It is shown that state-of-the-art local descriptors (SURF, SIFT, FAST, ORB) do not provide satisfactory detection quality if the camera resolution is low, the lighting changes dramatically and shadows are present. In this paper we propose to use a simple mathematical morphology algorithm to detect the presence of cargo on the forklift truck. Its first step is the estimation of the movement direction and the front part of the truck by using an updating motion history image. The second step is the application of Canny contour detection and binary morphological operations in front of the moving object to estimate simple geometric features of an empty forklift. The algorithm is implemented with the OpenCV library. Our experimental study shows that the best results are achieved if the difference in the widths of the bounding rectangles is used as a feature. Namely, the detection accuracy is 78.7% (compared with 40% achieved by the best local descriptor), while the average frame processing time is only 5 ms (compared with 35 ms for the fastest descriptor).
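The motion history image used in the first step can be illustrated with a plain-Python sketch of a commonly used update rule: moving pixels receive the current timestamp, and entries older than a fixed duration are cleared. Whether the paper uses exactly this rule is an assumption, and the function name `update_motion_history` and the toy data are hypothetical. OpenCV is not used here.

```python
def update_motion_history(mhi, silhouette, timestamp, duration):
    """Update the motion history image in place and return it.

    mhi        : list of rows of floats (last time motion was seen, 0 = none)
    silhouette : list of rows, non-zero where motion occurs in this frame
    """
    for r, row in enumerate(silhouette):
        for c, moving in enumerate(row):
            if moving:
                mhi[r][c] = timestamp      # motion seen now
            elif mhi[r][c] < timestamp - duration:
                mhi[r][c] = 0.0            # motion too old, forget it
    return mhi

# Toy usage: an object moving left to right across a 1x3 strip.
mhi = [[0.0, 0.0, 0.0]]
update_motion_history(mhi, [[1, 0, 0]], timestamp=1.0, duration=1.5)
update_motion_history(mhi, [[0, 1, 0]], timestamp=2.0, duration=1.5)
after_two = [row[:] for row in mhi]
update_motion_history(mhi, [[0, 0, 1]], timestamp=3.0, duration=1.5)
after_three = [row[:] for row in mhi]
```

Timestamps in the image increase along the direction of travel, which is what makes it possible to estimate the movement direction and the front part of the object.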
The control problem for a nonlinear plant subject to uncontrollable disturbances is considered as a differential game. Optimal controls are synthesized by transforming the nonlinear equation of the original plant into a differential equation with state-dependent parameters. A quadratic performance functional allows the synthesis conditions to be formulated as the need to find solutions of a Riccati equation. The solution of the Riccati equation with state-dependent parameters is found in symbolic form using algebraic methods, which makes it possible to generalize a number of previously published theoretical results and to obtain fairly constructive solutions for a number of control problem statements.
The article is based on the fact that the growing demand for master data management systems has not yet produced a commonly accepted methodology for their design and development. The article offers two mathematical models that allow a designer of master data management systems to formally describe the system before development and to verify its quality using measurements unique to master data management systems.