|
Estimation of formant frequencies using machine learning
Káčerová, Erika ; Galáž, Zoltán (referee) ; Mekyska, Jiří (advisor)
This Master's thesis deals with formant extraction. A set of MATLAB scripts is created to generate values of the first three formant frequencies from speech recordings using Praat and Snack (WaveSurfer). Mel-frequency cepstral coefficients and linear predictive coefficients are extracted from the audio files and added to a database, which is then used to train a neural network. Finally, the designed neural network is tested.
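As a rough illustration of the linear predictive coefficients mentioned in the abstract, the following sketch computes them from an autocorrelation sequence with the Levinson-Durbin recursion. It is a minimal sketch and not the thesis's MATLAB code: the function name `levinson_durbin`, the sign convention A(z) = 1 + a1 z^-1 + ... and the toy autocorrelation values are assumptions made for this example.

```python
def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for LPC coefficients.

    r     : autocorrelation values r[0], r[1], ..., r[order]
    order : prediction order
    Returns (a, err) where a = [1, a1, ..., a_order] describes
    A(z) = 1 + a1*z^-1 + ... and err is the final prediction error power.
    Hypothetical helper written for illustration only.
    """
    a = [1.0] + [0.0] * order
    err = float(r[0])
    for i in range(1, order + 1):
        # Correlation of the current predictor with lag i
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)  # error power shrinks with each order
    return a, err


# Toy autocorrelation of a first-order process with coefficient 0.5
# (values chosen for illustration, not taken from any recording).
coeffs, error_power = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, error_power)
```

In a formant-estimation setting, the roots of the resulting polynomial A(z) are usually what gets mapped to candidate formant frequencies; that step is left out here to keep the sketch short.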
|
|
Assessing movement of articulatory organs based on acoustic analysis of speech
Novotný, Kryštof ; Galáž, Zoltán (referee) ; Mekyska, Jiří (advisor)
Hypokinetic dysarthria is a motor speech disorder that often accompanies Parkinson’s disease. It affects the speech system, including articulatory abilities. Several speech parameters describe this domain, so this work compares them with one another. It aims to design and describe an algorithm for calculating articulation parameters, adapted for the Czech language, and then to compare their discriminative power. The acoustic analysis of speech is done in Praat, and basic machine learning algorithms such as Expectation-Maximization, K-means and linear regression are used for the subsequent data processing. The Mann-Whitney U test and representatives of linear, nonlinear and ensemble machine learning models, assessed with cross-validation and balanced accuracy, are used for evaluation. The results are scripts for automatic assessment of vowel space area, for calculating articulation parameters and for their evaluation. The analysis of two different databases (PARCZ and CoBeN) shows that differences in articulation can indeed be observed between normal and dysarthric speech. Based on a comparison of the results, the work proposes which parameters and machine learning models are appropriate for further study of this issue.
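The vowel space area named in the abstract is commonly computed as the area of a polygon whose corners are vowels in the F1-F2 plane. The sketch below uses the shoelace formula for that purpose. It is an assumption-laden illustration, not the thesis's scripts: the function name `vowel_space_area` and the formant values for /a/, /i/ and /u/ are hypothetical.

```python
def vowel_space_area(corners):
    """Area of the polygon spanned by corner vowels in the F1-F2 plane.

    corners : list of (F1, F2) pairs in Hz, given in order around the polygon
    Returns the area in Hz^2 using the shoelace formula.
    Hypothetical helper written for illustration only.
    """
    n = len(corners)
    twice_area = 0.0
    for k in range(n):
        f1_a, f2_a = corners[k]
        f1_b, f2_b = corners[(k + 1) % n]  # wrap around to close the polygon
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


# Made-up formant values (Hz) for the corner vowels /a/, /i/, /u/.
triangle = [(800.0, 1300.0), (300.0, 2300.0), (350.0, 800.0)]
print(vowel_space_area(triangle))
```

A smaller area would typically be read as reduced articulatory range, which is the kind of difference between normal and dysarthric speech the abstract refers to.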
|
|
Assessing Movement of Articulatory Organs in Patients with Parkinson’s Disease
Novotný, K. ; Mekyska, J.
Hypokinetic dysarthria is a motor speech disorder that often accompanies Parkinson’s disease. It affects the speech system, including articulatory abilities. Several speech parameters describe this domain, so this work compares them with one another. It aims to design and describe an algorithm for calculating articulation parameters, adapted for the Czech language, and then to compare their discriminative power. The acoustic analysis of speech is done in Praat, and basic machine learning algorithms such as Expectation-Maximization, K-means and linear regression are used for the subsequent data processing. The Mann-Whitney U test, descriptive statistics and a Random Forest machine learning model, assessed with cross-validation and balanced accuracy, are used for evaluation. The results are scripts for automatic assessment of vowel space area, for calculating articulation parameters and for their evaluation. The analysis of a speech recording database shows that differences in articulation can indeed be observed between normal and dysarthric speech. Based on a comparison of the results, the work proposes which parameters are appropriate for further study of this issue.
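The two evaluation tools named in the abstract can be sketched in a few lines. The example below computes the Mann-Whitney U statistics for two groups and the balanced accuracy of a classifier's predictions. It is only a sketch under assumptions: the function names and the sample values are hypothetical, no p-value is derived, and the paper's own evaluation code is not reproduced here.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two samples; ties count as half a win.

    U_a counts the pairs in which a value from sample_a exceeds a value
    from sample_b. Hypothetical helper; no p-value is computed.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, robust to unequal class sizes."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Made-up parameter values for two groups of speakers.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
# Made-up labels: 1 = dysarthric, 0 = healthy control.
print(balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]))
```

Balanced accuracy is a natural choice when the patient and control groups differ in size, since plain accuracy would then favour the larger group.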
|
|
Numerical investigation of acoustic characteristics of 3D human vocal tract model with nasal cavities
Vampola, T. ; Štorkán, J. ; Horáček, Jaromír ; Radolf, Vojtěch
Acoustic resonance characteristics of a 3D human vocal tract model without and with nasal and paranasal cavities were computed. Nasal cavities (NC) form side branches of the human vocal tract and exhibit antiresonance and resonance properties that influence the produced voice quality. The developed FE models of the acoustic spaces of the nasal and vocal tract for the vowel /a:/ are used to study the influence of the NC on phonation. Acoustic frequency-modal characteristics are studied by modal analysis, and numerical simulation of acoustic signals in the time domain is performed by transient analysis of the FE models.
|
|
Computational modelling of effect of tonsillectomy on production of Czech vowels
Švancara, P. ; Horáček, Jaromír
Effects of tonsillectomy on the production of Czech vowels are examined numerically. Finite element (FE) models of the acoustic spaces corresponding to the vocal tracts and of the acoustic space around the human head are used in numerical simulations of phonation. Models for the vowels /a/, /e/, /i/, /o/ and /u/ are analyzed. The acoustic resonant characteristics of the FE models are studied using modal and transient analyses (excitation by a short pulse). The production of vowels is simulated in the time domain using transient analysis of the FE models excited by the Liljencrants-Fant (LF) glottal signal. The calculated results show that tonsillectomy shifts some formant frequencies, mostly down to lower frequencies. The largest shifts were obtained for the 2nd and 3rd formants of vowel /o/ (-300 Hz) and for the 2nd (-450 Hz) and 3rd (-150 Hz) formants of vowel /u/. The frequency shifts of the formants depend significantly on the position and size of the tonsils.
|