National Repository of Grey Literature: 95 records found (records 86 - 95)
Very limited Vocabulary Speech Recognizer
Vystavěl, Kamil ; Míča, Ivan (referee) ; Sysel, Petr (advisor)
This bachelor thesis deals with the implementation of a voice recognition method with a limited number of recognized words in the Matlab environment. The recognizer is designed for isolated words and is based on dynamic programming, realized by the dynamic time warping (DTW) algorithm. Features of the speech signal are calculated by short-term analysis in the time and frequency domains and by methods based on cepstral analysis and linear predictive analysis. The representation of a word, generated from its features, is suitable for quantifying its similarity to the representation of another word. To achieve the highest degree of similarity, the DTW algorithm eliminates the influence of speech-rate fluctuation by non-linearly normalizing the time axis of one of the compared words. The degree of similarity of the two compared words is expressed as the distance between them. The representations of known words are stored in a word-book. The unknown word is compared with every word in the word-book, and the recognizer calculates the distance between each known word and the unknown word. The unknown word is identified as the known word with the shortest distance to it. The success rate depends mainly on the choice of features.
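The template-matching scheme described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's Matlab implementation: the function names `dtw_distance` and `recognize`, the Euclidean local distance, and the unconstrained step pattern are all assumptions made for illustration.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of feature vectors.

    Assumption: Euclidean local distance and the basic step pattern
    (match, insertion, deletion) with no slope constraints.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # frame of a repeated
                                     cost[i][j - 1],      # frame of b repeated
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognize(unknown, word_book):
    """Return the label of the stored word with the shortest DTW distance."""
    return min(word_book, key=lambda label: dtw_distance(unknown, word_book[label]))

# Hypothetical one-dimensional "features"; real features would be
# e.g. cepstral or LPC coefficient vectors per frame.
word_book = {
    "yes": [[0.0], [1.0], [2.0]],
    "no": [[5.0], [5.0], [5.0]],
}
slow_yes = [[0.0], [0.0], [1.0], [1.0], [2.0]]
print(recognize(slow_yes, word_book))
```

The slower utterance has more frames than its template, yet the non-linear alignment lets repeated frames map onto a single template frame, which is how DTW removes the effect of speech-rate variation.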
Online detection of simple voice commands in audiosignal
Zezula, Miroslav ; Březina, Lukáš (referee) ; Krejsa, Jiří (advisor)
This thesis describes the development of a voice module that can recognize simple speech commands by comparing the input sound with recorded templates. The first part contains a description of the algorithm used and a verification of its functionality. The algorithm is based on Mel-frequency cepstral coefficients and dynamic time warping. Next, the hardware of the voice module is designed around the 56F805 signal controller from Freescale. The microphone signal is conditioned by operational amplifiers and a digital filter. The third part deals with the development of software for the controller and describes the fixed-point implementation of the algorithm, which respects the controller's limited capabilities. A final test demonstrates the usability of the voice module in a low-noise environment.
Voice Sample database design for speech recognition purposes
Grobelný, Petr ; Malý, Jan (referee) ; Pfeifer, Václav (advisor)
The thesis deals with speech recognition and the creation of a speech database that will serve as training and test data for a speech recognition system. The corpus is designed as a database of read speech. In the theoretical part, the reader is introduced to the concept of speech recognition and given a deeper introduction to the subject. The practical part consists of a detailed procedure for creating the read-speech database. The database itself is presented on the attached medium. The necessary documentation for the entire database is appended at the end of the thesis.
Speech recognition using Sphinx-4
Kryške, Lukáš ; Uher, Václav (referee) ; Burget, Radim (advisor)
This diploma thesis aims to find an effective method for continuous speech recognition. More precisely, it uses speech-to-text recognition for keyword spotting. The solution is applicable to the analysis of phone calls and to similar applications. Most of the thesis describes and applies the Sphinx-4 speech recognition framework, which uses hidden Markov models (HMM) to define the acoustic models of a language. It explains how these models can be trained for a new language or a new dialect. Finally, it describes in detail how to implement keyword spotting in Java.
Decoder for key word detection system
Krotký, Jan ; Míča, Ivan (referee) ; Pfeifer, Václav (advisor)
The essay presents the basic characteristics of human speech recognition, describes systems for keyword detection, and then deals with the design of the individual decoder blocks, divided into three chapters. The first chapter describes the operations performed before the signal is divided into frames, and the segmentation itself. The second chapter describes the calculation of short-term energy, the number of zero crossings, and autocorrelation, prediction and Mel-frequency cepstral coefficients. The third chapter, on the design of the decoder block, describes the dynamic time warping method and the method based on hidden Markov models. The final part of the essay describes decoders working with speech and proposes a simple decoder working with isolated words, which was built and tested on the basis of the preceding chapters.
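Two of the frame-level features named in this abstract, short-term energy and the number of zero crossings, are simple enough to sketch. The following Python fragment is an illustration only, not the essay's implementation; the helper names, the frame length and the hop size are assumptions.

```python
def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_term_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def zero_crossings(frame):
    """Number of sign changes between consecutive samples (zero counts as non-negative)."""
    return sum(1 for x, y in zip(frame, frame[1:]) if (x >= 0) != (y >= 0))

# Hypothetical toy signal: a quiet stretch followed by a rapidly alternating one.
signal = [0.1, 0.1, 0.1, 0.1, 1.0, -1.0, 1.0, -1.0]
for frame in frames(signal, frame_len=4, hop=4):
    print(short_term_energy(frame), zero_crossings(frame))
```

Low energy with few zero crossings is typical of silence, while frames with many zero crossings tend to be unvoiced or noisy, which is why these two features are often used for segmentation before more expensive features are computed.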
Voice recognition of standard PILOT-CONTROLLER control commands
Kufa, Tomáš ; Polách, Petr (referee) ; Honzík, Petr (advisor)
The subject of this diploma thesis is the application of speech recognition to ATC commands. The selection of methods and approaches to automatic recognition of ATC commands stems from a detailed study of air traffic. Because there is no definitive solution in a field as extensive as speech recognition, the thesis focuses only on a speech recognizer based on comparison with templates (DTW). This recognizer is implemented and compared with the freely available HTK system from Cambridge University, which is based on statistical methods using hidden Markov models. The suitability of both methods is verified by practical testing and evaluation of the results.
Detection of speech disorders
Struhař, Michal ; Rajmic, Pavel (referee) ; Sysel, Petr (advisor)
This thesis deals with the detection of speech disorders. One aim is to choose a suitable parameterization: short-time energy, zero-crossing rate, linear predictive analysis, perceptual linear predictive analysis, the RASTA method, cepstral analysis and mel-frequency cepstral coefficients can be chosen for detection. The next aim is the construction of a speech disorder detector based on DTW (dynamic time warping) and an artificial neural network. Each detection is based on tokens collected from the chosen analysis and on a phonetic transcription of the speech. The analyses, the detector and the phonetic transcription of Czech are implemented in the MATLAB simulation environment.
Speech Recognition (digit)
Kantar, Martin ; Minář, Petr (referee) ; Matoušek, Radomil (advisor)
The aim of this diploma thesis is to explain what speech is and what its constituents are. I describe commonly used methods for preparing signals for recognition. Schematic examples show the principles of current speech recognizers, with their advantages and disadvantages. I wrote a speech recognition program for the digits 0-9 in Matlab, using neural network learning.
Forms of text data input and processing in business information systems - trends and current practices
Válková, Jana ; Stanovská, Iva (advisor) ; Hais, Petr (referee)
This thesis introduces readers to the basic forms of entering text and information into a computer and processing them. It also covers the historical context, current trends and future prospects of computer data input technologies and their use in practice. The first part summarizes particular forms of entering and processing text data and information. The following part presents technological trends on the market, concentrating on automatic speech recognition systems and the possibilities of applying them in business. The rest of the thesis consists of a survey of Czech IT companies and, based on its results, a recommendation of which technologies should be used as part of information systems.
Use of RLPC Inventories of the Festival System in Epos
Chaloupka, Zdeněk ; Horák, Petr
The aim of this paper is to describe a way of implementing new voices in the Epos text-to-speech (TTS) system. We implemented voices from the Festival TTS system. This system synthesizes speech from speech units, which are stored in an inventory file as Residual Linear Prediction Coding (RLPC) coefficients. The inventory file provides all the information needed for synthesis. The text is synthesized in the MROLA format, so phoneme length (and prosody) can be determined directly.
