In this article, we focus on the isolated voice command recognition for autonomous man-machine and intelligent robotic systems. We propose to create a grammar model for a small testing command set with self-loops for each state to return blank symbols for noise and out-of-vocabulary words. In addition, we use single arc connected beginning and ending of the grammar in order to filter unknown commands. As a result, the grammar is resistant to distortions and unexpected words near or inside of command. We implemented the proposed approach using Finite State Transducers in the Kaldi framework and examined it using self-recorded noised data with various level of signal-to-noise ratio. We compared recognition accuracy and average decision-making time of our approach with the state-of-the-art continuous speech recognition engines based on language models. It was experimentally shown that our approach is characterized by up to 60% higher accuracy than conventional offline speech recognition methods based on language models. The speed of utterance recognition is 3 times higher than speed of traditional continuous speech recognition algorithms.
In this paper, we studied the phonetic approach for voice processing. A method for automatic recognition of speech signals, in which each quasistationary segment is associated with a fuzzy set of phonemes, was developed. We proposed the operation of the probabilistic triangular norm for fuzzy sets corresponding to the input frame and the nearest reference phoneme. The developed method was experimentally shown to allow a 1.5–5% reduction in the probability of erroneous recognition in comparison with known analogues.