National Repository of Grey Literature: 53 records found (records 23 - 32)
Robust detection of keywords in speech signal
Vrba, Václav ; Sysel, Petr (referee) ; Atassi, Hicham (advisor)
The master's thesis is divided into two parts: theoretical and practical. The theoretical part focuses on methods for the analysis and detection of speech signals. In the practical part, a system for isolated word recognition was created in Matlab. The system is speaker independent, separately for men and women. Two speech databases were also created for further use in the aircraft cockpit. Tests and evaluations were also performed with added noise.
Acoustic Scene Classification from Speech
Dobrotka, Matúš ; Glembek, Ondřej (referee) ; Matějka, Pavel (advisor)
The topic of this thesis is the classification of audio recordings into 15 acoustic scene classes that represent common scenes and places where people are regularly found. The thesis describes two approaches, based on GMM and i-vectors, and a fusion of both approaches. The best GMM system scored 60.4% on the evaluation dataset of the DCASE Challenge. The best i-vector system scored 68.4%. The fusion of the GMM system and the best i-vector system achieves a score of 69.3%, which would rank 20th among all systems in the DCASE 2017 Challenge (98 submitted systems from all over the world).
Automatic vocal-oriented recognition of human emotions
Houdek, Miroslav ; Přinosil, Jiří (referee) ; Atassi, Hicham (advisor)
This master's thesis is concerned with the recognition of emotional states and gender on the basis of speech signal analysis. We used various prosodic and cepstral features to describe the speech signal. In the text we describe non-invasive methods for glottal pulse estimation. The described speech features were implemented in MATLAB. For their classification we used the GMM classifier, which models the feature space with Gaussian probability distributions. Furthermore, we constructed a system for recognizing the emotional state of the speaker and a system for gender recognition from speech. We tested the success rate of the created systems with several features on speech signal segments of various lengths and compared the results. In the last part we tested the influence of speaker and gender on the success rate of emotional state recognition.
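As a rough illustration of the GMM classification mentioned in this abstract, the following Python sketch scores a short sequence of feature values against one Gaussian mixture per class and picks the class with the highest total log-likelihood. It is not the thesis implementation (which was written in MATLAB): the scalar pitch-like feature, the two class models in `MODELS`, and all parameter values are hypothetical, and model training (for example with EM) is left out.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture."""
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        density = math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)
        total += w * density
    # Note: for values extremely far from every component, total can
    # underflow to 0.0; a production system would work in the log domain.
    return math.log(total)

def classify(frames, models):
    """Return the label whose mixture gives the highest summed log-likelihood."""
    scores = {
        label: sum(gmm_log_likelihood(x, *params) for x in frames)
        for label, params in models.items()
    }
    return max(scores, key=scores.get)

# Hypothetical hand-set models: (weights, means, variances) per class.
MODELS = {
    "male": ([0.5, 0.5], [110.0, 130.0], [100.0, 100.0]),
    "female": ([0.5, 0.5], [200.0, 230.0], [400.0, 400.0]),
}

label = classify([205.0, 215.0, 225.0], MODELS)
```

Summing per-frame log-likelihoods treats the frames as independent, which is the usual simplifying assumption when a GMM is applied to a whole speech segment.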
Robot with autonomous audio-video control
Dvořáček, Štěpán ; Mašek, Jan (referee) ; Přinosil, Jiří (advisor)
This thesis describes the design and realization of a mobile robot with autonomous audio-visual control. The robot is capable of movement based on sensors consisting of a camera and a microphone. The mechanical part consists of components made with 3D printing technology and omnidirectional Mecanum wheels. The software uses the OpenCV library for image processing and computes MFCC and DTW for voice command detection.
Tool for Automatic Subtitle Alignment
Chudý, Daniel ; Chlubna, Tomáš (referee) ; Milet, Tomáš (advisor)
The main motivation of this work is to simplify the retiming of subtitles, where the inputs are the original and edited video files and the subtitles corresponding to the original video. For example, a video editor needs to cut a scene from a video. Subtitles that correspond to the clipped part of the video must be manually removed, and the subtitles that follow the clipped part must be manually re-timed. The tool makes this work easier by automating it. From an arbitrarily edited version of the video file (cuts), the original file and the original subtitles, a version of the subtitles is created that fits the edited version of the video file. Simply put, the goal is to align the original subtitles with the edited video file. The solution converts the video files to audio files (.wav, wavfile), extracts MFCC (Mel-frequency cepstral coefficients) and compares them with each other using the Dynamic Time Warping (DTW) algorithm. From the alignment path (DTW output), signal differences (cuts in the video) are detected and the subtitles are adjusted based on them. A dataset consisting of public domain films and the author's own recordings was created to test the application. The created application achieves 69 to 90 % subtitle alignment success on a dataset that contains videos of length 1 to 60 minutes.
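The idea of detecting cuts from a DTW alignment path can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's actual code: each frame is a single number standing in for an MFCC vector, the local distance is an absolute difference, and the helper names `dtw_path` and `find_cuts` and the `min_run` threshold are hypothetical. A cut shows up as a run of original frames that are all mapped onto the same frame of the edited recording.

```python
def dtw_path(a, b):
    """Align sequences a and b; return (total_cost, path of (i, j) index pairs)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path

def find_cuts(path, min_run=3):
    """Return (first, last) ranges of original frames mapped onto one edited frame."""
    cuts = []
    run_start = 0
    for k in range(1, len(path) + 1):
        # A run ends when the edited-frame index changes or the path ends.
        if k == len(path) or path[k][1] != path[run_start][1]:
            if k - run_start >= min_run:
                cuts.append((path[run_start][0], path[k - 1][0]))
            run_start = k
    return cuts

# Toy data: three frames (value 260) were removed from the original.
original = [0, 100, 200, 260, 260, 260, 300, 400, 500]
edited = [0, 100, 200, 300, 400, 500]
total_cost, path = dtw_path(original, edited)
cuts = find_cuts(path)  # [(3, 6)]
```

In this toy case the detected run covers original frames 3 to 6, one frame more than the removed segment, because the frame that genuinely matches the edited frame belongs to the same run. Boundaries found this way are therefore approximate and would need refinement before subtitle timestamps are shifted.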
Real-time voice command recognition system
Šíbl, Evžen ; Kiac, Martin (referee) ; Přinosil, Jiří (advisor)
The bachelor's thesis deals with the development of a system for voice command recognition. The classifier of this system was created using a neural network. The thesis covers the history and problems of speech recognition. A system has been created that detects the section of a recording containing a speech signal and then uses the classifier to decide which word from the word table it is. Three models with the same architecture but different training data were created. These models were then compared with each other. A simple user interface was created for the resulting system.
The relation of emotions and intonation curves
Gavlasová, Radka ; Smékal, Zdeněk (referee) ; Tučková, Jana (advisor)
This thesis deals with intonation curves and their relation to human emotions. Besides the theoretical part, which covers speech production, signal processing and the psychological classification of emotions, there is also a unique database recorded with the help of two professional actors. The main goal of this thesis is to classify the created data into four classes using artificial neural networks. Those classes are anger, joy, boredom and sadness. The practical part was implemented in Matlab using the Classification Learner app. The features used for this method were variations of fundamental frequency and MFCC. The results were compared with a listening survey to determine whether the results provided by the neural network correspond to human judgment. The success rate of the trained models reached 82 %, and testing on new data reached 75 %. The listening survey confirmed that the results correspond to the assumptions about human perception. A better success rate could be achieved with a bigger set of higher quality data.
Signal Processing For Vocal Recognition Of Sturnus Vulgaris
Lázničková, Jana
This paper describes the issue of detecting Sturnus vulgaris in vineyards in order to scare these animals away more effectively. The analysis and classification of bird singing is difficult because many problems can appear. One of the problems is background noise, e.g. the sounds of cars and trees, and also the singing of various bird species at once. Another problem is the different types of bird songs, for example an alarm melody, a search for food, and communication between the birds during a flight. This article presents a solution to one of these problems for the case when only audio recordings are available.
Deep learning based sound records analysis
Kramář, Denis ; Říha, Kamil (referee) ; Přinosil, Jiří (advisor)
This master's thesis deals with the problem of audio classification of chainsaw logging sounds in a natural environment, using mainly convolutional neural networks. First, the theory of graphical representation of audio signals is discussed. The following part is devoted to machine learning. The third chapter presents some existing works dealing with this problem. The practical part presents the dataset used and the neural networks tested. The final results are compared by achieved accuracy and by ROC curves. The robustness of the presented solutions was tested with a proposed detection program and evaluated using objective criteria.
Fixed-Point Implementation Speech Recognizer
Král, Tomáš ; Černocký, Jan (referee) ; Burget, Lukáš (advisor)
This master's thesis addresses automatic speech recognition on systems with restricted hardware resources, namely embedded systems. The objective of this work was to design and implement a speech recognition system on embedded systems that do not contain floating-point processing units. The first objective was to choose a proper hardware architecture. The recognition system was then designed based on knowledge of the available HW resources. During system development, the constituent elements were optimized so that they could be deployed on the chosen HW. The result of the project is successful recognition of Czech numerals on an embedded system.
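To illustrate what arithmetic without a floating-point unit involves, here is a small Python sketch of fixed-point multiplication in the common Q15 format, where a 16-bit integer v represents the value v / 32768. The choice of Q15, the rounding and saturation behaviour, and the helper names are assumptions made for illustration; the abstract does not state which number format the thesis used.

```python
Q = 15                      # number of fractional bits (assumed Q15 format)
Q15_MAX = (1 << Q) - 1      # 32767, largest representable value
Q15_MIN = -(1 << Q)         # -32768, smallest representable value

def saturate(v):
    """Clamp an integer to the signed 16-bit range instead of wrapping around."""
    return max(Q15_MIN, min(Q15_MAX, v))

def to_q15(x):
    """Convert a real number in roughly [-1, 1) to its Q15 integer code."""
    return saturate(int(round(x * (1 << Q))))

def from_q15(v):
    """Convert a Q15 integer code back to a Python float (for inspection only)."""
    return v / (1 << Q)

def q15_mul(a, b):
    """Multiply two Q15 codes using integer operations only, with rounding."""
    product = a * b                            # wide intermediate (Q30)
    rounded = (product + (1 << (Q - 1))) >> Q  # round and rescale to Q15
    return saturate(rounded)

half = to_q15(0.5)              # 16384
quarter = q15_mul(half, half)   # 8192, i.e. 0.25
```

Saturation matters at the edge of the range: 1.0 itself is not representable in Q15, so `to_q15(1.0)` clamps to 32767, and the product of the two most negative codes clamps as well rather than overflowing.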
