Using speech recognition and rhyme tests to evaluate Czech speech processing systems
Vích, Robert ; Nouza, Jan
Speech intelligibility is the most important parameter of speech quality. This contribution proposes a new objective method for assessing the intelligibility of speech processing algorithms, e.g., speech coding, synthesis, enhancement, and conversion. It is based on automatic speech recognition of rhyme tests. The idea is illustrated by comparison with listening evaluation of Czech rhyme tests and is applied to the intelligibility assessment of a Czech text-to-speech system and of voice conversion.
|
|
Voice conversion
Vích, Robert ; Vondra, Martin
Different techniques for voice conversion based on nonlinear spectral envelope warping are presented. They enable the transformation of an utterance of a source speaker into an utterance of a target speaker. The voice conversion algorithm is based on spectral speech analysis, frequency transformation, spectrum envelope warping, and high-quality cepstral speech synthesis. The described voice conversion approach also considers the change of speech prosody.
|
|
Can automatic speech recognition be used to evaluate speech quality?
Nouza, Jan ; Vích, Robert ; Vondra, Martin
This contribution presents several case studies in which automatic speech recognition (ASR) was tested as a means of evaluating speech quality, either human or synthetic. Usually, speech quality is measured by subjective listening tests. Our aim is to investigate whether these tests, which require a considerable amount of human time and experience, could be replaced or supplemented by techniques based on ASR.
|
|
Storytelling voice conversion for a TTS system with cepstral description
Přibil, Jiří ; Přibilová, Anna
Our recent research on improving text-to-speech (TTS) synthesis was aimed at the storytelling speaking style, in addition to multi-voice realization and the expression of emotional states. The storytelling speaking style is suitable for applications aimed at children as well as for applications aimed at blind people. This contribution describes experiments with storytelling voice conversion performed on short sentences from stories in Slovak and Czech.
|
|
Design of suitable prosodic models for dialogue systems
Horák, Petr
This paper deals with improving synthetic prosody modeling, especially intonation modeling. A mathematical model of the pitch contour can significantly reduce the complexity of creating intonation rules and increase the naturalness of the resulting synthetic speech. The linear prediction intonation model implemented in the TTS system Epos uses excitation by rules and, in conjunction with triphone time-domain inventories, provides more natural synthetic speech.
|
|
Current state of development of the Czech TTS system EPOS
Chaloupka, Zdeněk ; Horák, Petr
This contribution focuses on the current state of the Epos text-to-speech (TTS) system. Recently, an MBROLA-like synthesis interface has been developed. This interface synthesizes speech phone by phone, so the length and prosody points of each phone are specified. Several problems were encountered while performing MBROLA-like synthesis. These problems are associated with the speech inventory, which is primarily designed for time-domain Pitch-Synchronous OverLap-and-Add (PSOLA) synthesis.