National Repository of Grey Literature: 134 records found (records 41 - 50)
Activity of Neural Network in Hidden Layers - Visualisation and Analysis
Fábry, Marko ; Grézl, František (referee) ; Karafiát, Martin (advisor)
The goal of this work was to create a system for visualising the activation function values produced by neurons in the hidden layers of neural networks used for speech recognition. The work also describes experiments comparing visualisation methods, visualisations of neural networks with different architectures, and neural networks trained on different types of input data. The visualisation system is based on previous work by Khe Chai Sim and is extended with new methods of data normalization. The Kaldi toolkit was used to prepare the training data, and the CNTK framework was used to train the neural networks. The core of this work, the visualisation system, was implemented in Python.
Fixed-Point Implementation Speech Recognizer
Král, Tomáš ; Černocký, Jan (referee) ; Burget, Lukáš (advisor)
This master's thesis addresses automatic speech recognition on systems with restricted hardware resources, namely embedded systems. The objective was to design and implement a speech recognition system for embedded systems that do not contain floating-point processing units. The first step was to choose a suitable hardware architecture. The recognition system was then designed around the available hardware resources. During development, the individual components were optimized so that they could be deployed on the chosen hardware. The result of the project is successful recognition of Czech numerals on an embedded system.
Dynamic Decoder for Speech Recognition
Veselý, Michal ; Glembek, Ondřej (referee) ; Schwarz, Petr (advisor)
The result of this work is a fully working and significantly optimized implementation of a dynamic decoder. The decoder is based on dynamic generation of the recognition network and decoding by a modified version of the Token Passing algorithm. The implemented solution provides results very similar to those of the original static decoder from BSCORE (the API of the company Phonexia). Compared to BSCORE, this implementation significantly reduces memory usage, which makes it possible to use more complex language models. It also facilitates integrating speech recognition into mobile devices and dynamically adding new words to the system.
Online detection of simple voice commands in audiosignal
Zezula, Miroslav ; Březina, Lukáš (referee) ; Krejsa, Jiří (advisor)
This thesis describes the development of a voice module that recognizes simple speech commands by comparing the input sound with recorded templates. The first part of the thesis describes the algorithm used and verifies its functionality. The algorithm is based on Mel-frequency cepstral coefficients and dynamic time warping. The second part covers the hardware design of the voice module, which is built around the 56F805 signal controller from Freescale. The microphone signal is conditioned by operational amplifiers and a digital filter. The third part deals with the development of software for the controller and describes the fixed-point implementation of the algorithm, which respects the limited capabilities of the controller. A final test demonstrates the usability of the voice module in a low-noise environment.
Implementation of Simple Speech Recognizer in Android
Flajšingr, Petr ; Herout, Adam (referee) ; Szőke, Igor (advisor)
The subject of this thesis is the implementation and optimization of a speech recognizer for the Android operating system. The work covers the recording of an audio signal and the subsequent feature extraction using Mel filter banks and a neural network. It also describes the implementation of a dynamic decoder. The work focuses on implementation in low-level tools such as the Android NDK and Renderscript, and it evaluates the success rate of the recognizer as well as its memory and time requirements.
Application of Mean Normalized Stochastic Gradient Descent for Speech Recognition
Klusáček, Jan ; Hradiš, Michal (referee) ; Pešán, Jan (advisor)
Artificial neural networks have been on the rise in recent years. One possible optimization technique is mean-normalized stochastic gradient descent, proposed by Wiesler et al. [1]. This thesis further explains and examines this method on the problem of phoneme classification. Not all of the conclusions of Wiesler et al. were confirmed. Mean-normalized SGD is suitable only if the network is sufficiently large, not too deep, and uses the sigmoid as its nonlinearity. In other cases, mean-normalized SGD slightly degrades the performance of the neural network. It therefore cannot be recommended as a general optimization technique. [1] Simon Wiesler, Alexander Richard, Ralf Schlüter, and Hermann Ney. Mean-normalized stochastic gradient for large-scale deep learning. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pages 180-184. IEEE, 2014.
Implementation of Simple Speech Recognizer in Android
Čuba, Eduard ; Glembek, Ondřej (referee) ; Szőke, Igor (advisor)
The goal of this project is to implement speech recognition software for the Android platform. This paper outlines the fundamental components of a speech recognizer and reviews the techniques used to optimize speech recognition on Android devices. First, it examines the implementation of acoustic feature extraction and phoneme estimation. Then, it describes the design and implementation of a decoder that turns phoneme estimates into a transcription using only the limited resources of a mobile device. The project is divided into several modules that form an Android library, which should be easy to extend and can be supplied with custom models tailored to the desired use. Finally, the paper discusses various approaches to modeling abstract data structures for representing the recognition network, as well as directions for further development and applications of the project.
Voice control of cooperative robots
Bubla, Lukáš ; Němec, Zdeněk (referee) ; Lacko, Branislav (advisor)
The aim of this diploma thesis was to create a program that makes it possible to control a collaborative robot by voice. The first chapters survey the current state of collaborative robotics in terms of safety, work efficiency, robot programming, and communication with the robot. Machine processing of the human voice is also discussed. The practical part proposes an experiment that uses an off-line simulation of the UR3 robot in the PolyScope 3.15.0 software. The simulation was linked to a Python program that uses the SpeechRecognition and urx libraries. Simple voice instructions were designed to move the robot to a defined position.
Hybrid Recognizer of Isolated Words
Veselý, Karel ; Černocký, Jan (referee) ; Grézl, František (advisor)
A speaker-independent isolated-word recognizer has various practical applications. For example, it can be used to control home gadgets from a PC. Even more interesting is the possibility of building it into the user interface of an application, or even into the operating system, to perform command-based control such as launching applications or executing other specific actions. The most remarkable application of isolated-word recognition is in electronic dictionaries. Voice-controlled word lookup could be a new feature of next-generation dictionaries. Especially useful is the ability to output an ordered list of the most likely words, which helps the user learn and distinguish similar words.
In the Traces of Leoš Janáček - Conversion of Speech to Music
Marciniak, Petr ; Glembek, Ondřej (referee) ; Černocký, Jan (advisor)
The aim of this bachelor's thesis is to develop an application that automatically converts a speech recording in WAV format into speech-melody-based music in MIDI format. First, the problem is analyzed and the theoretical background is described. The basics of music generation from speech are introduced. Initial experiments, such as creating an elementary melody, averaging tones, and detecting syllables, are discussed in order to establish which of these techniques have a positive impact on the resulting music and should therefore be implemented in the application. Basic criteria of beauty in music generation had to be defined, and different compositional techniques such as inversion of notes or tempo changes were investigated. The implementation is then described and user testing is evaluated. Conclusions are drawn and future directions of development are discussed. The user manual for the application, as well as a "cook book" listing the tools used in its development, can be found in the Appendix.
