Phonetic encoding method in the isolated words recognition problem
A phonetic approach to the problem of automatic recognition of isolated words is investigated. A phonetic encoding method is proposed in which each word in a vocabulary is associated with a code sequence of stable phonemes. An information-theoretic estimate of vocabulary confusability, whose calculation relies on the speaker's phonetic database and the SNR of the communication channel, is synthesized using the properties of the Kullback–Leibler divergence. In the experimental study of the proposed method, the relationship between recognition quality and the proposed confusability estimate is demonstrated on the problem of recognizing words in Russian speech. It is established that requiring isolated syllable pronunciation makes it possible to attain 90–95% recognition accuracy for vocabularies containing 2000 words.
A definition of a phoneme as a fuzzy set of minimal speech units from the model database is proposed. On the basis of this definition and the Kullback–Leibler minimum information discrimination principle, a novel phoneme recognition algorithm is developed as an enhancement of the phonetic decoding method. Experimental results for isolated vowel recognition and word recognition in Russian are presented. The proposed method is shown to increase recognition accuracy and reliability compared with the phonetic decoding method.
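The minimum information discrimination rule mentioned above can be illustrated with a minimal sketch: an observed distribution is assigned to the reference model from which its Kullback–Leibler divergence is smallest. The discrete toy "spectra", the labels, and the function names below are assumptions made for illustration only; the sketch does not reproduce the fuzzy-set phoneme definition or the speech models used in the paper.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def normalize(values):
    """Scale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]

def recognize(observed, models):
    """Return the label of the model with minimum divergence from the observation."""
    return min(models, key=lambda label: kl_divergence(observed, models[label]))

# Hypothetical reference distributions (toy data, not real phoneme spectra).
models = {
    "a": normalize([8, 4, 2, 1]),
    "i": normalize([1, 2, 4, 8]),
    "u": normalize([4, 8, 4, 1]),
}

# Hypothetical observation; it is closest to the "a" model.
observed = normalize([7, 5, 2, 1])
best = recognize(observed, models)
```

Because the divergence is not symmetric, the order of its arguments matters; here the observation is taken as the first argument, which is one common convention and an assumption of this sketch.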
The paper considers phoneme recognition from a speaker's facial expressions in voice-activated control systems. We have developed a neural network recognition algorithm using the phonetic word decoding method and the requirement for isolated syllable pronunciation of voice commands. The paper presents experimental results of viseme (the facial and lip position corresponding to a particular phoneme) classification of Russian vowels. We show how the classification accuracy depends on the classifier (multilayer feed-forward network, support vector machine, k-nearest neighbor method), the image features (histogram of oriented gradients, eigenvectors, SURF local descriptors), and the type of camera (built-in or Kinect). The best accuracy of speaker-dependent recognition is shown to be 85% for a built-in camera and 96% for Kinect depth maps when the classification is performed with the histogram of oriented gradients and the support vector machine.
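To make the feature-plus-classifier pipeline concrete, the following is a deliberately simplified sketch: a single gradient-orientation histogram over the whole image (a full histogram of oriented gradients would use cells and block normalization) combined with a 1-nearest-neighbor classifier, one of the classifier types listed above. The synthetic edge images and the labels `viseme_A` and `viseme_B` are hypothetical stand-ins, not data from the paper, and the best reported result used a support vector machine rather than this classifier.

```python
import math

def orientation_histogram(image, bins=8):
    """Simplified HOG: one magnitude-weighted histogram of unsigned
    gradient orientations over the whole image, L2-normalized."""
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = image[r][c + 1] - image[r][c - 1]   # central differences
            gy = image[r + 1][c] - image[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi     # orientation in [0, pi)
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0
    return [h / norm for h in hist]

def nearest_neighbor(features, training_set):
    """1-nearest-neighbor classification with Euclidean distance."""
    label, _ = min(((lbl, math.dist(features, ref)) for lbl, ref in training_set),
                   key=lambda pair: pair[1])
    return label

def vertical_edge(size=6):
    """Toy image: dark left half, bright right half."""
    return [[0 if c < size // 2 else 1 for c in range(size)] for _ in range(size)]

def horizontal_edge(size=6):
    """Toy image: dark top half, bright bottom half."""
    return [[0 if r < size // 2 else 1 for _ in range(size)] for r in range(size)]

# Hypothetical training set with one example per class.
training_set = [
    ("viseme_A", orientation_histogram(vertical_edge())),
    ("viseme_B", orientation_histogram(horizontal_edge())),
]

# A larger horizontal-edge image should match the "viseme_B" example.
predicted = nearest_neighbor(orientation_histogram(horizontal_edge(8)), training_set)
```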
In this paper we consider the problem of automatic emotion recognition, specifically the case of digital audio signal processing. We consider and verify an approach in which the classification of a sound fragment is reduced to an image recognition problem. The waveform and the spectrogram are used as visual representations of the sound fragment. The computational experiment was carried out on the open RAVDESS dataset, which covers 8 emotions: "neutral", "calm", "happy", "sad", "angry", "scared", "disgust", "surprised". The best accuracy, 64%, was obtained with the combination "spectrogram + VGG-11 convolutional neural network".
In this paper we consider the problem of automatic emotion recognition, specifically the case of digital audio signal processing. We consider and verify a straightforward approach in which the classification of a sound fragment is reduced to an image recognition problem. The waveform and the spectrogram are used as visual representations of the sound fragment. The computational experiment was carried out on the open RAVDESS dataset, which covers 8 emotions: “neutral”, “calm”, “happy”, “sad”, “angry”, “scared”, “disgust”, “surprised”. Our best accuracy, 71%, was obtained with the combination “mel spectrogram + VGG-16 convolutional neural network”.
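The step of turning a sound fragment into an image can be sketched as follows. This is a minimal illustration under stated assumptions: it computes a plain magnitude spectrogram with a Hann window and a direct DFT, it does not apply the mel filter bank, and the convolutional network is omitted entirely. The frame size, hop length, and the synthetic test tone are arbitrary choices for the example, not parameters from the paper.

```python
import cmath
import math

def spectrogram(signal, frame_size=64, hop=32):
    """Magnitude spectrogram: rows are time frames, columns are frequency bins."""
    # Hann window reduces spectral leakage at the frame boundaries.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    image = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        row = []
        for k in range(frame_size // 2 + 1):         # non-negative frequencies only
            coeff = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            row.append(abs(coeff))
        image.append(row)
    return image

def to_log_image(image, floor=1e-10):
    """Convert magnitudes to decibels, the usual scaling before classification."""
    return [[20 * math.log10(max(value, floor)) for value in row] for row in image]

# Synthetic tone whose frequency falls exactly on DFT bin 8 of a 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
image = spectrogram(tone)
log_image = to_log_image(image)
```

The resulting two-dimensional array (time by frequency) is what an image classifier would receive as input. In practice a fast Fourier transform from a numerical library would replace the direct DFT used here for self-containment.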
The problem of controlling a nonlinear plant subject to uncontrolled disturbances is considered as a differential game. The optimal controls are synthesized by transforming the nonlinear equation of the original plant into a differential equation with state-dependent parameters. A quadratic performance functional allows the synthesis conditions to be formulated as the need to find solutions of a Riccati equation. The solution of the Riccati equation with state-dependent parameters is found in symbolic form using algebraic methods, which makes it possible to generalize a number of previously published theoretical results and to obtain constructive solutions for a number of control problem statements.
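A scalar example may help to show what a Riccati equation with state-dependent parameters looks like in the differential game setting. The sketch below assumes a hypothetical plant x' = a(x) x + b u + d w with a(x) = 1 - x^2 and the cost integrand q x^2 + r u^2 - gamma^2 w^2; none of these choices come from the paper. In the scalar case the game Riccati equation 2 a p + q - s p^2 = 0, with s = b^2/r - d^2/gamma^2, has the closed-form positive root p = (a + sqrt(a^2 + s q)) / s, which is evaluated pointwise along the trajectory.

```python
import math

def riccati_solution(a, b, d, q, r, gamma):
    """Positive root of the scalar game Riccati equation
    2*a*p + q - s*p**2 = 0, where s = b**2/r - d**2/gamma**2."""
    s = b * b / r - d * d / (gamma * gamma)
    if s <= 0:
        raise ValueError("gamma is too small for this sketch (s must be positive)")
    return (a + math.sqrt(a * a + s * q)) / s

def a_of_x(x):
    """Hypothetical state-dependent coefficient: x' = (1 - x**2) * x + ..."""
    return 1.0 - x * x

def simulate(x0, steps=5000, dt=0.001, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0):
    """Euler simulation with the control and the worst-case disturbance both
    computed from the pointwise Riccati solution."""
    x = x0
    for _ in range(steps):
        a = a_of_x(x)
        p = riccati_solution(a, b, d, q, r, gamma)
        u = -(b / r) * p * x               # minimizing player (control)
        w = (d / gamma ** 2) * p * x       # maximizing player (disturbance)
        x += dt * (a * x + b * u + d * w)
    return x

x_final = simulate(2.0)
```

With these feedback laws the closed-loop dynamics reduce to x' = -sqrt(a(x)^2 + s q) x, so the state decays toward zero even under the worst-case disturbance. Solving the Riccati equation pointwise in the state is, in general, an approximation for nonlinear plants, and the matrix case does not admit such a simple closed form.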
The article starts from the observation that the growing demand for master data management systems has not yet produced a commonly accepted methodology for their design and development. The article offers two mathematical models that give a designer of master data management systems a way to formally describe the system before development and to verify its quality using measurements unique to master data management systems.