Intonation Control in Speech Synthesis for Game Voiceovers

Portnova E., Butenko E., Sablin A., Monina M., Klyshinskiy E., Almasizadeh P.

In press

Voicing game lines is an important task in game development. Neural speech synthesis can remove the need to record live actors. Despite the rapid development of generative technologies, synthesized voices often lack the variety of intonation inherent in human speech. This stems from the "one-to-many" problem: a single text may have several suitable intonations, so the model averages expressiveness during training, which leads to a robotic sound.
Prosody control can address this problem by introducing variability into synthesized speech. Existing approaches often do not take all prosodic characteristics into account or require manual annotation, which makes intonation hard to control after training. The main aim of the work is to develop a method of unsupervised prosody modeling based on acoustic and linguistic characteristics, using discrete markup, to improve parametric speech synthesis for game voiceovers.

Language: English

In book

Interdisciplinary Research in Technology and Management 2024

CRC Press Taylor and Francis, 2024.

Method of a voice source acoustic analysis in real time

Savchenko V. V., Savchenko L. V., Measurement Techniques 2025 Vol. 68 P. 453–463

The problem of non-invasive analysis of the vocal function of the speech apparatus based on the speaker’s speech signal is addressed. A new method of acoustic analysis of a pulse-type voice source based on a two-stage measurement procedure has been developed. The first stage of measurements provides for filtering of the voice excitation signal of the vocal tract ...

Added: January 13, 2026

Integral Robot Technologies and Speech Behavior

Kharlamov A. A., Pantiukhin D., Borisov V. et al., Newcastle upon Tyne: Cambridge Scholars Publishing, 2024.

The monograph presents papers on the subject domain “Integral robot. Speech behavior”. These cover issues of a theoretical nature, including representation and processing of speech information in the human mind in the process of both text analysis and text generation, and specifically the need to use jointly working linguistic and extralinguistic models of the world, ...

Added: December 1, 2023

InterSpeech 2022

International Speech Communication Association, 2022.

Added: October 31, 2022