CONTENT 2014, The Sixth International Conference on Creative Content Technologies


RPKOM-GEN - A System for Testing Speech Recognition in Adverse Acoustic Conditions Using Speech Synthesis

Authors:
Marián Trnka
Milan Rusko
Sakhia Darjaa
Róbert Sabo
Juraj Pálfy
Štefan Beňuš
Marian Ritomský
Martin Dravecký

Keywords: speech recognition; adverse conditions; noise; speech synthesis

Abstract:
Training and testing of current state-of-the-art speech recognition systems require huge speech databases whose creation is time-consuming and expensive. This paper presents a novel approach for testing speech recognition in adverse acoustic conditions that uses speech synthesis, which facilitates optimizing and adjusting speech recognition to various environmental conditions. RPKOM-GEN is a complex system of multiple synthesizers that generates synthetic speech and testing signals with well-defined characteristics. It can be used to produce public announcements, sets of utterances for spoken dialogue systems, or other speech excerpts. The acoustic parameters of synthetic voices, such as speech rate, pitch, intensity, and others, can be pre-defined from a broad range of options. Using this technique, the system can also vary vocal effort, thus imitating the Lombard effect and so-called long-distance speech. It is also possible to model the characteristics of the transmission channel, since the system includes noise generators and digital effects, such as settings for environmental noise and reverberation levels. The paper presents the system architecture, describes the graphical user interface and a rich array of usage possibilities, and discusses the results of pilot experiments testing the effect of added noise on speech recognition accuracy.
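As an illustration of the "added noise" testing idea mentioned in the abstract, the following minimal Python sketch mixes a noise signal into a speech signal at a chosen signal-to-noise ratio (SNR). It is not taken from RPKOM-GEN: the function names (`rms`, `mix_at_snr`, `measured_snr_db`) are hypothetical, a sine tone stands in for synthetic speech, and white Gaussian noise stands in for environmental noise. The abstract does not say how the system sets noise levels, so RMS-based SNR scaling is an assumption here.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise ratio equals `snr_db`, then add it.

    Returns the mixed signal and the scaled noise (hypothetical helper,
    not part of RPKOM-GEN).
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise signal is silent")
    # SNR in dB is 20*log10(rms_speech / rms_noise); solve for the noise RMS.
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measured_snr_db(speech, noise):
    """SNR in dB computed from the RMS of the two signals."""
    return 20 * math.log10(rms(speech) / rms(noise))


# Stand-in signals (assumptions): one second at 16 kHz.
rng = random.Random(0)
sample_rate = 16000
speech = [0.5 * math.sin(2 * math.pi * 220 * t / sample_rate)
          for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

# Produce test signals at several noise levels, as one might for a
# recognition-accuracy-versus-SNR experiment.
for snr_db in (20, 10, 0):
    mixed, scaled_noise = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, scaled_noise), 2))
```

In a test setup of this kind, each mixed signal would be passed to the recognizer under test and the accuracy recorded per SNR level. The recognizer itself is outside the scope of this sketch.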

Pages: 17 to 21

Copyright: Copyright (c) IARIA, 2014

Publication date: May 25, 2014

Published in: conference

ISSN: 2308-4162

ISBN: 978-1-61208-342-1

Location: Venice, Italy

Dates: from May 25, 2014 to May 29, 2014