National Repository of Grey Literature: 241 records found (records 21 - 30)
Automatic recognition of emotional states based on utterance analysis
Pfeifer, Leon ; Atassi, Hicham (referee) ; Smékal, Zdeněk (advisor)
This diploma thesis deals with the analysis of human emotional states. It consists of three parts. The first part characterizes the process of speech production from a phonetic and a psychological point of view. The second part covers the methods used and the related processing steps (signal preprocessing, voice activity detection). The fundamental frequency was calculated using the central clipping method; the other methods used are formant frequency analysis and a method that determines the number of thorns and planes. The third part presents the results of measurements performed with the particular methods. Five emotional states were scored: neutral, anger, happiness, sadness and surprise. The part closes with a discussion of the results for each method.
Comparison of analysis of speech in dependence on age and gender
Báňa, Josef ; Smékal, Zdeněk (referee) ; Atassi, Hicham (advisor)
This thesis deals with the analysis of the speech signal in dependence on the gender and age of the speaker. We examined a range of features to find the best set for automatic classification of speakers. The thesis also contains a brief discussion of the speech signal and its characteristics. We used the program Praat for speech analysis; the program is also described in this work. We focused mainly on the suprasegmental features of speech. Our first step was to build our own speech corpus containing recordings of speakers of various ages and both genders. We analysed the corpus with Praat and report the results in this thesis. For automatic classification, twelve features were selected based on their quality criteria and used with a neural network to assign speakers to classes by age and gender. We used the “Neural Network Toolbox” in Matlab to create and train the networks.
Simple Dictation System
Hromádko, Michal ; Schwarz, Petr (referee) ; Szőke, Igor (advisor)
This master's thesis deals with the design and development of a simple dictation system. It explains the methods used for speech recognition and describes existing systems. The design of the system focuses primarily on creating a graphical user interface with strong emphasis on user friendliness.
Keyword Detection in Speech Data
Pfeifer, Václav ; Makáň, Florian (referee) ; Dostál, Otto (referee) ; Balík, Miroslav (advisor)
Speech processing systems have been developed for many years, but their integration into devices started only with the deployment of modern, powerful computational systems. This dissertation thesis deals with the development of a keyword detection system for speech data. The proposed detection system is based on Large Margin and Kernel methods, and its key part is a phoneme classifier. Two hierarchical frame-based classifiers are proposed, one linear and one non-linear, and an efficient training algorithm is introduced for each of them. In addition, a classifier based on Gaussian Mixture Models with a hierarchical structure is proposed. Feature extraction is an important part of the detection system, and therefore all algorithms were evaluated on the currently most common feature extraction techniques. The technical part of the thesis comprised the implementation of the keyword detection system in MATLAB and the design of a hierarchical phoneme structure for the Czech language. All proposed algorithms were evaluated for Czech and English on the DBRS and TIMIT speech corpora.
On-line annotation editor with audio visualization
Dorotovič, Viktor ; Herout, Adam (referee) ; Szőke, Igor (advisor)
The aim of this thesis is to create a web-based annotation editor, which displays the audio waveform alongside the transcribed text. A waveform viewer library was developed, which uses HTML5 canvas elements for rendering. The library allows scrolling and zooming of the waveform. Annotations are directly marked in the audio and the position of the transcribed text is synchronised with their location. The end goal is to replace an existing editor with the one being created. Therefore, a usability test was conducted to compare the two. The time needed to learn to use the application and to transcribe a short recording was reduced by 20%. The waveform viewer library was released under an open-source license.
Low bit rate voice encoders
Leitner, Jakub ; Mačák, Jaromír (referee) ; Pust, Radim (advisor)
The final thesis deals with coders and voice coders used in speech signal processing. The aim is to create an integral overview of coders and voice coders, including a description of their properties. In the second part of the thesis, algorithms and methods of speech processing are simulated in Matlab Simulink. The basic methods of speech processing and a parametric LPC voice coder were simulated in the time domain. The LPC voice coder model implements the algorithms for obtaining speech segment parameters: classification of voiced and unvoiced speech segments, LPC analysis and pitch detection. The output is a parametric signal that enables a receiver to synthesize a speech signal. Appendix 1 lists the names or standard numbers of coders together with their properties; Appendix 2 gives an overview of speech processing methods.
Parkinson disease diagnosis using speech signal analysis
Karásek, Michal ; Smékal, Zdeněk (referee) ; Mekyska, Jiří (advisor)
The thesis deals with the recognition of Parkinson's disease from the speech signal. The first part covers the principles of speech signals and the speech of patients suffering from Parkinson's disease. It then describes speech signal processing, the basic features used for the diagnosis of Parkinson's disease (e.g. VAI, VSA, FCR, VOT) and the reduction of these features. The next part focuses on a block diagram of the program for the diagnosis of Parkinson's disease. The main objective of the thesis is a comparison of two feature selection methods (mRMR and SFFS). Two different methods were used for classification: the first is kNN and the second is the Gaussian mixture model (GMM).
Playing of Video Depending on Speed of Speech
Hromádko, Michal ; Fapšo, Michal (referee) ; Szőke, Igor (advisor)
This bachelor's thesis discusses adding the PSOLA method to the VLC Media Player. The PSOLA method is used to modify the playback rate without changing the fundamental tone or the intelligibility of the speech.
Estimation of accuracy of speech technologies based on signal quality and audio content richness
Nezval, Jiří ; Smital, Lukáš (referee) ; Schwarz, Petr (advisor)
This thesis presents a theoretical analysis of the origin of speech, introduces applications of speech technologies and explains the contemporary approach to phonetic transcription of speech recordings. Furthermore, it describes metrics for assessing the quality of audio recordings, which are split into two classes: the first groups signal quality metrics, while the second groups content richness metrics. The first goal of the practical section is to create a statistical model that predicts the accuracy of machine transcription of speech recordings based on a measurement of their quality. The second goal is to evaluate which individual metrics are most essential for predicting the accuracy of machine transcription.
Analysis of prosodic and spectral properties of voice communication in air traffic control
Simonides, Jakub ; Kopřiva, Tomáš (referee) ; Smékal, Zdeněk (advisor)
This thesis analyses the prosodic and spectral features of bi-directional air traffic control communication and describes how the communication was split into segments according to the source, via transcription. After the splitting, the segments are analysed in depth for their spectral and prosodic features. The analysis focuses on intensity, fundamental frequency F0, spectral slope and spectral centroid. Additionally, tempo and voice activity detection data were measured to support the spectral analysis. Because of the differences between the air traffic controllers' and the pilots' spectral features, the direction of the communication can be determined automatically with a relatively high success rate.
