Proceedings of the IV International Conference and Youth School "Information Technologies and Nanotechnologies" (ITNT 2018)
The task of organizing information in video surveillance systems is addressed by grouping video tracks that contain identical faces. We examine methods for aggregating the features of individual frames extracted with deep convolutional neural networks. Tracks with identical faces are grouped using known face verification algorithms and clustering methods. An experimental study on the YouTubeFaces dataset compares ways of combining frame features to obtain a descriptor of a video track. It is shown that the most accurate method is L2-normalization of the average of the unnormalized features of the individual frames of each video track.
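The aggregation rule reported as most accurate can be illustrated with a minimal NumPy sketch. The function names `track_descriptor` and `track_distance` are hypothetical, the frame features are assumed to be given as a (frames x dimensions) array of raw network outputs, and the Euclidean distance between descriptors is an assumption made only for illustration, not necessarily the verification measure used in the paper.

```python
import numpy as np

def track_descriptor(frame_features):
    """Average the raw (unnormalized) frame features, then L2-normalize the mean."""
    feats = np.asarray(frame_features, dtype=np.float64)  # shape: (frames, dims)
    mean = feats.mean(axis=0)
    norm = np.linalg.norm(mean)
    # Guard against an all-zero mean vector to avoid division by zero.
    return mean / norm if norm > 0 else mean

def track_distance(track_a, track_b):
    """Euclidean distance between two track descriptors (illustrative assumption)."""
    return float(np.linalg.norm(track_descriptor(track_a) - track_descriptor(track_b)))
```

Two tracks could then be assigned to the same group when `track_distance` falls below a threshold chosen on validation data.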
In this paper we examine the problem of video-based age and gender recognition using deep convolutional neural networks. We present a comparative analysis of classifier fusion algorithms that aggregate the decisions made for individual frames. To improve the accuracy of age and gender identification, we implement a video-based recognition system with several aggregation methods. We provide an experimental comparison on the IJB-A, Indian Movies and Kinect datasets. It is demonstrated that the most accurate decisions are obtained using the geometric mean and the mathematical expectation of the outputs of the softmax layers of the convolutional neural networks for gender recognition and age prediction, respectively.
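The two fusion rules named above can be sketched as follows. This is a minimal illustration under stated assumptions: `geometric_mean_fusion` and `expected_age` are hypothetical names, the per-frame softmax outputs are assumed to be given as a (frames x classes) array, and the age posteriors are averaged over frames before the expectation is taken, a detail the abstract does not specify.

```python
import numpy as np

def geometric_mean_fusion(frame_probs, eps=1e-12):
    """Fuse per-frame softmax outputs by their geometric mean (computed in log space)."""
    p = np.asarray(frame_probs, dtype=np.float64)  # shape: (frames, classes)
    log_mean = np.log(np.clip(p, eps, 1.0)).mean(axis=0)
    fused = np.exp(log_mean)
    return fused / fused.sum()  # renormalize so the scores sum to one

def expected_age(frame_probs, ages):
    """Expected age under the frame-averaged posterior (averaging is an assumption)."""
    p = np.asarray(frame_probs, dtype=np.float64).mean(axis=0)
    p = p / p.sum()
    return float(np.dot(p, np.asarray(ages, dtype=np.float64)))
```

The gender decision for a track would then be the `argmax` of the fused vector, while the age estimate is a real number rather than a class index.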
To improve the performance of multiclass classification of remote sensing images, we propose two greedy feature selection algorithms. The discriminant analysis criterion and regression coefficients are used as the measure of feature subset effectiveness in the first and second methods, respectively. The main benefit of the proposed algorithms is that they estimate not an individual criterion for each feature, but the overall effectiveness of the feature subset. Because the number of real remote sensing images available for analysis is severely limited, we apply the Markov random model to enlarge the image dataset. A random image belonging to one of 7 classes from the UC Merced Land-Use dataset has been used as the pattern for image modelling. Features have been extracted with the help of the MaZda software. As a result, the largest fraction of correctly classified images is 95%. The dimension of the initial feature space, consisting of 218 features, has been reduced to 15 features using the greedy strategy of removing a feature based on the linear regression model.
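A greedy removal strategy of this kind can be sketched in a few lines. The sketch below is an illustration only and not the authors' exact criterion: it assumes that a feature subset is scored as a whole by the residual sum of squares of a least-squares linear fit, with `Y` holding the targets (for example, one-hot class indicators), and the names `subset_error` and `greedy_backward_selection` are hypothetical.

```python
import numpy as np

def subset_error(X, Y, cols):
    """Residual sum of squares of a linear least-squares fit on the chosen columns."""
    A = np.column_stack([X[:, cols], np.ones(len(X))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return float(np.sum((A @ coef - Y) ** 2))

def greedy_backward_selection(X, Y, n_keep):
    """Repeatedly drop the feature whose removal hurts the subset fit the least."""
    selected = list(range(X.shape[1]))
    while len(selected) > n_keep:
        errors = [
            subset_error(X, Y, [c for c in selected if c != drop])
            for drop in selected
        ]
        selected.pop(int(np.argmin(errors)))
    return selected
```

Because every candidate removal is judged by the fit of the remaining subset, the procedure evaluates features jointly rather than one at a time, at the cost of one regression per remaining feature at each step.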