keywords:"mfcc" - Search Results - Digital Repository


	Logopedic defect analysis and recognition in speech utterances
	Diviš, Jan ; Atassi, Hicham (referee) ; Smékal, Zdeněk (advisor)
	This bachelor's thesis deals with the speech disorder dyslalia and its characteristics. It describes how speech is produced and represented, and presents the basics of speech signal processing and analysis (LPC, cepstral analysis, MFCC). It then presents the characteristics of speech and the calculation of LPC, cepstral and Mel-frequency cepstral coefficients in MATLAB. The thesis also covers the incorrect pronunciation of the sounds "r" and "ř".

	Speech recognition using Sphinx-4
	Kryške, Lukáš ; Uher, Václav (referee) ; Burget, Radim (advisor)
	This diploma thesis aims to find an effective method for continuous speech recognition. More precisely, it uses speech-to-text recognition for keyword spotting. The solution can be applied to the analysis of phone calls or to similar applications. Most of the thesis describes and implements the speech recognition framework Sphinx-4, which uses hidden Markov models (HMM) to define the acoustic models of a language. It explains how these models can be trained for a new language or a new dialect. Finally, it describes in detail how to implement keyword spotting in Java.

	Computer analysis of sport matches
	Židlík, Pavel ; Balík, Miroslav (referee) ; Atassi, Hicham (advisor)
	This work deals with fast analysis of a football match from the audio track of a recording, with the possibility of applying some of the methods to other sports as well. The first goal was to detect the blast of the referee's whistle, which has a specific frequency in its spectrum that lies outside the common range of speech. After the harmonic frequency was detected, attention turned to determining the meaning of the whistle signal. A referee helped with this by explaining the number of whistle signal types and providing reference samples for their classification. A neural network with back-propagation was used to determine the meaning of the whistle signal. Another way of detecting important moments of the match was to focus on the fundamental tone of the commentator's voice: when the commentator is excited by the match, the fundamental tone of his speech automatically increases with every important action of the game. This work also analyses this increase in the commentator's fundamental tone. The national anthems of the two teams are another significant moment of the match, so anthem detection became a further subject of analysis. MFCC were used to extract features from the audio signal, yielding 20 coefficients, which served as the input to a classifier based on a neural network with back-propagation. To make these methods easy to use, a graphical user interface was created that gives a clear overview of the results and allows a selected section to be replayed.

	Decoder for key word detection system
	Krotký, Jan ; Míča, Ivan (referee) ; Pfeifer, Václav (advisor)
	This work presents the basic characteristics of human speech recognition, describes keyword detection systems, and then covers the design of the individual decoder blocks in three chapters. The first chapter describes the operations performed before the signal is divided into frames, and the segmentation itself. The second chapter describes the calculation of short-term energy, the zero-crossing count, and autocorrelation, prediction and Mel-frequency cepstral coefficients. The third chapter, on the design of the decoder block, describes the dynamic time warping method and the method based on hidden Markov models. The final part describes decoders working with speech and proposes a simple decoder for isolated words, which was designed and tested on the basis of the preceding chapters.

	Simple text-independent voice lock - speaker verification software system
	Kotulek, Milan ; Dolenský, Jan (referee) ; Staněk, Miroslav (advisor)
	This thesis gives a brief introduction to biometrics, leading to the description and design of a verification system based on speech analysis. The designed system first performs basic signal processing and then recognizes vowels in fluent Czech speech. For each vowel found, the observed speech features are calculated. The GUI application was tested on a speaker database created for the thesis; its efficiency is approximately 54 % for short test utterances and approximately 88 % for long test utterances.

	Analysis of Parkinson's disease using segmental speech parameters
	Mračko, Peter ; Mekyska, Jiří (referee) ; Smékal, Zdeněk (advisor)
	This project describes the design of a system for diagnosing Parkinson's disease from speech. Parkinson's disease is a neurodegenerative disorder of the central nervous system. One of its symptoms is an impairment of the motor aspects of speech called hypokinetic dysarthria. The system is based on the best-known segmental features, such as the LPC, PLP, MFCC and LPCC coefficients, as well as less well-known ones such as CMS, ACW and MSC. These coefficients are calculated from speech recordings of patients with Parkinson's disease and of healthy controls; feature selection and classification then follow. The best result obtained in this project reached a classification accuracy of 77.19%, a sensitivity of 74.69% and a specificity of 78.95%.

	Fixed-Point Implementation Speech Recognizer
	Král, Tomáš ; Černocký, Jan (referee) ; Burget, Lukáš (advisor)
	This master's thesis addresses automatic speech recognition on systems with restricted hardware resources, that is, embedded systems. The aim was to design and implement a speech recognition system for embedded systems that do not contain a floating-point unit. The first objective was to choose a suitable hardware architecture. The recognition system was then designed on the basis of the available hardware resources. During development, the individual components were optimized so that they would fit on the chosen hardware. The result of the project is successful recognition of Czech numerals on an embedded system.

	Application of statistical analysis of speech in patients with Parkinson's disease
	Bijota, Jan ; Mžourek, Zdeněk (referee) ; Galáž, Zoltán (advisor)
	This thesis deals with speech analysis of people who suffer from Parkinson's disease. Its purpose is to obtain a statistical sample of speech parameters that helps to determine whether an examined person suffers from Parkinson's disease. The statistical sample is based on the detection of hypokinetic dysarthria. The speech signal is pre-processed by DC-offset removal and pre-emphasis, and is then divided into frames. Phonation parameters and MFCC and PLP coefficients are used to characterize the framed speech signal. After parametrization, the speech signal can be analyzed by statistical methods. The statistical analysis in this thesis uses Spearman's and Pearson's correlation coefficients, mutual information, the Mann-Whitney U test and Student's t-test. The results of the thesis are groups of speech parameters, for individual long Czech vowels, that best indicate the difference between a healthy person and a patient suffering from Parkinson's disease. These results can be helpful in the medical diagnosis of a patient.

	Estimation of formant frequencies using machine learning
	Káčerová, Erika ; Galáž, Zoltán (referee) ; Mekyska, Jiří (advisor)
	This master's thesis deals with formant extraction. A set of Matlab scripts is created to generate the values of the first three formant frequencies from speech recordings using Praat and Snack (WaveSurfer). Mel-frequency cepstral coefficients and linear predictive coefficients are extracted from the audio files and added to the database. This database is then used to train a neural network. Finally, the designed neural network is tested.

	Multiplatform Application for Speaker Verification
	Görig, Jan ; Matějka, Pavel (referee) ; Glembek, Ondřej (advisor)
	This bachelor's thesis considers speaker recognition without knowledge of the spoken message. It describes current feature extraction methods and their evaluation using a Gaussian mixture model. The practical output of the work is an application for visualizing the recognition process. The application is cross-platform and uses the Qt and BSAPI libraries.

