National Repository of Grey Literature: 240 records found (records 21-30)
Comparison of analysis of speech in dependence on age and gender
Báňa, Josef ; Smékal, Zdeněk (referee) ; Atassi, Hicham (advisor)
This thesis deals with the analysis of the speech signal in dependence on the gender and age of the speaker. We investigated a range of features to find the best set for automatic classification of speakers. The thesis also contains a brief discussion of the speech signal and its characteristics. We used the program Praat for speech analysis; the program is also described in this work. We focused mainly on the suprasegmental features of speech. Our first step was to build our own speech corpus containing recordings of speakers of various ages and genders. We performed the analysis in Praat and report the results in this thesis. For automatic classification, twelve features were selected on the basis of quality criteria and used with a neural network to assign speakers to classes by age and gender. We used the Neural Network Toolbox in Matlab to create and train the networks.
Simple Dictation System
Hromádko, Michal ; Schwarz, Petr (referee) ; Szőke, Igor (advisor)
This master's thesis deals with the design and development of a simple dictation system. It explains methods used for speech recognition and describes existing systems. The design of the system focuses primarily on creating a graphical user interface, with strong emphasis on user friendliness.
Keyword Detection in Speech Data
Pfeifer, Václav ; Makáň, Florian (referee) ; Dostál, Otto (referee) ; Balík, Miroslav (advisor)
Speech processing systems have been developed for many years, but their integration into devices started with the deployment of modern, powerful computing systems. This dissertation deals with the development of a keyword detection system for speech data. The proposed detection system is based on Large Margin and Kernel methods, and its key part is a phoneme classifier. Two hierarchical frame-based classifiers have been proposed: linear and non-linear. An efficient training algorithm has been introduced for each of the proposed classifiers. In addition, a classifier based on Gaussian Mixture Models with a hierarchical structure has been proposed. Feature extraction is an important part of the detection system, and therefore all algorithms were evaluated with the most common current feature extraction techniques. Part of the technical solution was the implementation of the keyword detection system in MATLAB and the design of a hierarchical phoneme structure for the Czech language. All of the proposed algorithms were evaluated for Czech and English on the DBRS and TIMIT speech corpora.
On-line annotation editor with audio visualization
Dorotovič, Viktor ; Herout, Adam (referee) ; Szőke, Igor (advisor)
The aim of this thesis is to create a web-based annotation editor, which displays the audio waveform alongside the transcribed text. A waveform viewer library was developed, which uses HTML5 canvas elements for rendering. The library allows scrolling and zooming of the waveform. Annotations are directly marked in the audio and the position of the transcribed text is synchronised with their location. The end goal is to replace an existing editor with the one being created. Therefore, a usability test was conducted to compare the two. The time needed to learn to use the application and to transcribe a short recording was reduced by 20%. The waveform viewer library was released under an open-source license.
Low bit rate voice encoders
Leitner, Jakub ; Mačák, Jaromír (referee) ; Pust, Radim (advisor)
This thesis deals with coders and voice coders used in speech signal processing. The aim is to give a comprehensive overview of coders and voice coders, including a description of their properties. In the second part of the thesis, speech processing algorithms and methods are simulated in Matlab Simulink. The basic methods of speech processing and a parametric LPC voice coder were simulated in the time domain. The LPC voice coder model implements the algorithms for obtaining speech segment parameters: classification of voiced and unvoiced speech segments, LPC analysis, and pitch detection. The output is a parametric signal that enables a receiver to synthesize the speech signal. Appendix 1 contains a list of coder names or standard numbers and their properties; Appendix 2 gives an overview of speech processing methods.
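The LPC analysis step named in this abstract can be illustrated with a short Python sketch. It assumes the common autocorrelation method with the Levinson-Durbin recursion; the thesis itself works in Matlab Simulink, and the function names `autocorrelation` and `levinson_durbin` are hypothetical, chosen only for this illustration.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame (a list of samples)."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (coeffs, error), where coeffs[k-1] is a_k in the predictor
    x[n] ~ sum_k a_k * x[n-k], and error is the residual prediction energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused; a[1..order] are the coefficients
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # silent frame: nothing left to predict
        # Reflection coefficient for model order i
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

In a parametric coder of this kind, the coefficients would be sent to the receiver together with the voiced/unvoiced decision and the pitch period, rather than the waveform itself.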
Parkinson disease diagnosis using speech signal analysis
Karásek, Michal ; Smékal, Zdeněk (referee) ; Mekyska, Jiří (advisor)
The thesis deals with the recognition of Parkinson's disease from the speech signal. The first part covers the principles of speech signals and the speech of patients suffering from Parkinson's disease. It then describes speech signal processing, the basic features used for diagnosis of Parkinson's disease (e.g. VAI, VSA, FCR, VOT) and the reduction of these features. The next part presents a block diagram of the program for the diagnosis of Parkinson's disease. The main objective of this thesis is a comparison of two feature selection methods (mRMR and SFFS). Two different methods were used for classification: k-nearest neighbours (kNN) and the Gaussian mixture model (GMM).
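As an illustration of the kNN classification named in this abstract, a minimal Python sketch follows. The function name `knn_predict`, the Euclidean distance, the plain majority vote, and the toy feature vectors with labels "A" and "B" are assumptions made for illustration; they are not details taken from the thesis.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    # Pair every training label with its Euclidean distance to the query
    neighbours = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters of 2-D feature vectors (hypothetical)
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["A", "A", "A", "B", "B", "B"]
```

A feature selection method such as mRMR or SFFS would run before this step to decide which feature columns enter the vectors.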
Playing of Video Depending on Speed of Speech
Hromádko, Michal ; Fapšo, Michal (referee) ; Szőke, Igor (advisor)
This bachelor's thesis discusses adding the PSOLA method to the VLC Media Player. The PSOLA method is used to modify the playback rate without changing the pitch or the intelligibility of the speech.
Estimation of accuracy of speech technologies based on signal quality and audio content richness
Nezval, Jiří ; Smital, Lukáš (referee) ; Schwarz, Petr (advisor)
This thesis presents a theoretical analysis of the origin of speech, introduces applications of speech technologies and explains the contemporary approach to phonetic transcription of speech recordings. Furthermore, it describes metrics for assessing the quality of audio recordings, which are split into two distinct classes. The first class groups signal quality metrics, while the second groups content richness metrics. The first goal of the practical section is to create a statistical model that predicts the accuracy of machine transcription of speech recordings from a measurement of their quality. The second goal is to evaluate which individual metrics are the most essential for predicting the accuracy of machine transcription.
Analysis of prosodic and spectral properties of voice communication in air traffic control
Simonides, Jakub ; Kopřiva, Tomáš (referee) ; Smékal, Zdeněk (advisor)
This thesis analyses the prosodic and spectral features of bi-directional air traffic control communication and describes how the communication was split into segments by source, using transcription. After the splitting, the segments are analysed in detail for their spectral and prosodic features. The analysis focuses on the spectral aspects of intensity, fundamental frequency F0, slope and centroid. Additionally, tempo and voice activity detection data were measured to support the spectral aspects. Because of the differences between the ATC controller’s and pilots’ spectral aspects, the direction of the communication can be determined automatically with a relatively high success rate.
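One of the spectral features listed in this abstract, the centroid, can be sketched in Python as follows. The sketch assumes the usual definition (the magnitude-weighted mean frequency of a frame) and uses a naive DFT to stay dependency-free; the exact definition and tooling used in the thesis are not stated, and the function names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a real-valued frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame in Hz (0.0 for silence)."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    # Bin k corresponds to frequency k * sample_rate / n
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total
```

For a pure tone that falls exactly on a DFT bin, the centroid equals the tone frequency; for speech it shifts with how the energy is spread across the spectrum, which is the kind of difference a speaker-direction classifier could exploit.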
Identification of pauses in noisy speech signal
Kepák, Petr ; Míča, Ivan (referee) ; Smékal, Zdeněk (advisor)
A basic problem in speech processing is the complete separation of the natural noise that arises from the correct articulation of voiced and unvoiced consonants from environmental noise and disturbance. The objective of this master’s thesis is to find an effective method for identifying pauses without speech activity, from which the properties of the noise and disturbance can be determined. Once the noise is correctly identified, various methods can be used to remove it. The thesis describes two methods of pause identification. These methods are programmed in Matlab and tested on nine speech recordings. The results of the methods were analysed using ROC (Receiver Operating Characteristic) curves. The thesis concludes with a summary of the results for the methods developed.
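The abstract does not say which two pause identification methods were implemented. Purely as an illustration of the general idea, the Python sketch below uses a simple short-time energy threshold and computes one ROC point (false positive rate, true positive rate) against reference labels. The threshold approach, the function names, and the frame length are all assumptions, not the author's methods.

```python
def frame_energies(signal, frame_len):
    """Mean energy of consecutive, non-overlapping frames of the signal."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def detect_pauses(energies, threshold):
    """Mark a frame as a pause (True) when its energy is below the threshold."""
    return [e < threshold for e in energies]


def roc_point(predicted, actual):
    """Return (false positive rate, true positive rate) for pause decisions."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fpr, tpr
```

Sweeping the threshold over a range of values and plotting the resulting points traces out a ROC curve, which allows detectors to be compared independently of any single threshold setting.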
