Voice command recognition in intelligent systems using deep neural networks
In this article, we focus on the isolated voice command recognition for autonomous man-machine and intelligent robotic systems. We propose to create a grammar model for a small testing command set with self-loops for each state to return blank symbols for noise and out-of-vocabulary words. In addition, we use single arc connected beginning and ending of the grammar in order to filter unknown commands. As a result, the grammar is resistant to distortions and unexpected words near or inside of command. We implemented the proposed approach using Finite State Transducers in the Kaldi framework and examined it using self-recorded noised data with various level of signal-to-noise ratio. We compared recognition accuracy and average decision-making time of our approach with the state-of-the-art continuous speech recognition engines based on language models. It was experimentally shown that our approach is characterized by up to 60% higher accuracy than conventional offline speech recognition methods based on language models. The speed of utterance recognition is 3 times higher than speed of traditional continuous speech recognition algorithms.