|
Microprosody analysis
Přibil, Jiří ; Vích, Robert
In this contribution, statistical and spectral analyses of the microintonation component are performed for several speakers. The results are used to design an FIR digital filter that suppresses the microintonation signal before the virtual melody contour is decomposed into sentence and word melody.
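The suppression step can be illustrated with a minimal Python sketch. It is not the authors' filter: the abstract does not give the filter design, so the Hamming-shaped taps, the tap count, the edge padding, and all function names below are assumptions chosen only to show how a symmetric FIR low-pass could separate a slow melody contour from a fast microintonation residue.

```python
import math


def hamming_lowpass_taps(num_taps):
    """Symmetric FIR taps: a Hamming window scaled to unit DC gain (assumed design)."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("num_taps must be odd and at least 3")
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
              for n in range(num_taps)]
    total = sum(window)
    return [w / total for w in window]


def fir_smooth(contour, taps):
    """Apply the symmetric FIR filter centred on each sample (zero phase).

    The first and last samples are repeated so the output keeps the input length.
    """
    half = len(taps) // 2
    padded = [contour[0]] * half + list(contour) + [contour[-1]] * half
    return [sum(t * padded[i + k] for k, t in enumerate(taps))
            for i in range(len(contour))]


def split_microintonation(f0_contour, num_taps=11):
    """Return (smoothed contour, microintonation residue) for an F0 contour."""
    taps = hamming_lowpass_taps(num_taps)
    smooth = fir_smooth(f0_contour, taps)
    micro = [x - s for x, s in zip(f0_contour, smooth)]
    return smooth, micro
```

Calling `split_microintonation(f0_values)` on a list of F0 values returns the smoothed contour, which would then go on to the sentence and word melody decomposition, together with the removed residue.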
|
|
Czech triphone synthesis of female voice
Horák, Petr ; Hesounová, Alžběta
A new triphone inventory of a female voice for a TTS system has been completed. It was motivated by the lack of an available female voice synthesis for Czech, although one is needed in various applications. The text corpus used for labelling the new inventory consisted of 550 sentences. The texts were read by a professional female speaker who was instructed to pronounce the sentences with prosody as monotonous as possible and at a constant speech rate.
|
|
NNLab - platform base for text-to-speech synthesis
Santarius, J. ; Tučková, Jana
Prosody modelling in synthetic speech is one possible application of artificial neural networks (ANN). The training and testing files for ANN-based prosody modelling must be large, so it is useful to automate the process. The modular program system NNLab is a powerful, easy-to-operate tool for ANN-based prosody modelling of synthetic speech. At the center of the NNLab environment is a database containing the data that characterize the ANN. The system is implemented in MATLAB V5.2 with NN Toolbox V2.0.
|
|
FIR vocal tract model
Vích, Robert
This paper presents a new parametric speech modelling approach based on homomorphic signal processing using spectral analysis. The exponential function in the cepstral vocal tract model is approximated by a finite Maclaurin expansion, which is implemented by an FIR digital filter. The cepstral coefficients serve as the coefficients of another FIR digital filter, which replaces the delay blocks of the first filter.
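The nested structure described here can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: it assumes the model transfer function is exp(C(z)) with C(z) = c[1] z^-1 + ... + c[N] z^-N, leaves out the gain term c[0], and uses hypothetical function names. Each pass through the inner cepstral FIR filter plays the role of one delay block of the outer Maclaurin filter.

```python
import math


def cepstral_fir(signal, cepstrum):
    """Inner FIR filter C(z): y[n] = sum_k cepstrum[k] * x[n - 1 - k].

    cepstrum[0] is taken to be c[1], so the filter has no direct path.
    """
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(cepstrum):
            idx = n - 1 - k
            if idx >= 0:
                acc += c * signal[idx]
        out[n] = acc
    return out


def maclaurin_vocal_tract(excitation, cepstrum, order=6):
    """Approximate exp(C(z)) by the finite sum of C(z)**k / k! for k = 0..order."""
    stage = [float(x) for x in excitation]   # output of C(z)**0
    output = list(stage)                     # k = 0 term of the expansion
    for k in range(1, order + 1):
        stage = cepstral_fir(stage, cepstrum)    # one more pass through C(z)
        weight = 1.0 / math.factorial(k)         # Maclaurin coefficient 1/k!
        output = [y + weight * s for y, s in zip(output, stage)]
    return output
```

Feeding a unit impulse through `maclaurin_vocal_tract` gives the truncated impulse response of the model. A higher `order` tracks the exponential more closely, at the cost of one additional pass through the cepstral filter per term.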
|
|
New design of combined inventory for Czech text-to-speech synthesis
Hesounová, Alžběta
A new inventory for Czech text-to-speech synthesis is currently being developed. Its core consists of triphone segments, with about 1850 segments in total. Apart from triphones, the inventory will also contain separate segments for vowel bodies and for sentence-initial and sentence-final consonants. Special attention is given to consonants in clusters, which are treated with respect to the neighbouring speech sounds. The new system will operate at a sampling frequency of 16 kHz.
|
|
Cepstral speech model, Padé approximation, excitation and gain matching in cepstral speech synthesis
Vích, Robert
This contribution briefly describes the principle of cepstral speech modeling based on the continued fraction expansion of the exponential function, the Padé approximation. An improvement of cepstral speech synthesis is then presented. The RMS value of the exact residual signal corresponding to the cepstral vocal tract model is estimated in the frequency domain using Parseval's theorem and is applied to excitation and gain matching in the cepstral speech production model.
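The frequency-domain RMS estimate can be illustrated with a short Python sketch. It only demonstrates Parseval's theorem for the DFT and is not the author's procedure: the bin-wise division by a model spectrum (which corresponds to circular inverse filtering of one frame), the naive DFT, and all function names are assumptions.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform, adequate for short illustrative frames."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * m / n)
                for m, x in enumerate(frame))
            for k in range(n)]


def rms_time(frame):
    """RMS value computed directly from the time-domain samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def rms_from_spectrum(spectrum):
    """RMS from DFT bins via Parseval: sum |x[n]|^2 = (1/N) * sum |X[k]|^2."""
    n = len(spectrum)
    return math.sqrt(sum(abs(v) ** 2 for v in spectrum)) / n


def residual_rms(speech_spectrum, model_spectrum):
    """RMS of the inverse-filtered residual without returning to the time domain.

    Assumes both spectra are sampled on the same DFT grid and that the model
    spectrum has no zero bins.
    """
    residual = [s / h for s, h in zip(speech_spectrum, model_spectrum)]
    return rms_from_spectrum(residual)
```

For any frame, `rms_from_spectrum(dft(frame))` agrees with `rms_time(frame)` up to rounding, which is the property that lets the residual level be estimated from spectra alone.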
|