National Repository of Grey Literature: 145 records found, showing records 11 - 20
Learning the Face Behind a Voice
Krušina, Josef ; Matějka, Pavel (referee) ; Plchot, Oldřich (advisor)
This work addresses the problem of mapping fixed-size representations (embeddings) of a speech signal to face embeddings and then generating a face from the mapped embedding using a generative adversarial network (GAN) trained for face generation. GANs are a type of neural network that can generate data similar to the data they were trained on. The proposed system consists of four components: a face embedding extractor, a voice embedding extractor, an algorithm on top of a GAN that can generate a face from a face embedding, and a mapping network, proposed in this work, that maps a voice embedding to a face embedding. The pre-trained neural networks FaceNet and SpeechBrain are adopted as embedding extractors. A model that uses a pre-trained StyleGAN2 is adopted for generating the face back from the embedding. The contribution of this work is that it allows a face to be extrapolated from an audio signal alone.
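The voice-to-face mapping step can be pictured with a very small sketch. It assumes, purely for illustration, that the mapping is a single linear layer fitted with a mean squared error loss by gradient descent; the abstract does not describe the actual mapping network or its loss, and the names `train_linear_map` and `map_embedding` are hypothetical. The embedding extractors and the GAN are not modelled here.

```python
def train_linear_map(voice, face, lr=0.1, epochs=500):
    """Fit a matrix W (d_face x d_voice) so that W @ v approximates f.

    Assumption: a plain linear map trained with MSE, standing in for the
    thesis's mapping network, whose real architecture is not given here.
    """
    d_v, d_f = len(voice[0]), len(face[0])
    n = len(voice)
    W = [[0.0] * d_v for _ in range(d_f)]
    for _ in range(epochs):
        grad = [[0.0] * d_v for _ in range(d_f)]
        for v, f in zip(voice, face):
            pred = [sum(W[i][j] * v[j] for j in range(d_v)) for i in range(d_f)]
            for i in range(d_f):
                err = pred[i] - f[i]
                for j in range(d_v):
                    # Gradient of the mean squared error w.r.t. W[i][j].
                    grad[i][j] += 2.0 * err * v[j] / n
        for i in range(d_f):
            for j in range(d_v):
                W[i][j] -= lr * grad[i][j]
    return W


def map_embedding(W, v):
    """Map one voice embedding v to a predicted face embedding."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]
```

In a full system the predicted face embedding would then be handed to the face generator; that stage depends on the pre-trained models and is left out of this sketch.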
Exploring New Paths in Neural-Network-Based Speaker Recognition
Sova, Damián ; Matějka, Pavel (referee) ; Glembek, Ondřej (advisor)
Because the assignment for this work is very broad, it was necessary to focus on one particular area. The work therefore applies the Stochastic Weight Averaging optimization method to the training of a deep neural network. The first part presents the necessary theoretical background, and the second part describes the course of the experiments. The theoretical part focuses on the complete lifecycle of training and evaluation, including a description of each component. The practical part gives a detailed look at each experiment, intended to demonstrate the improvement in the performance of the overall speaker recognition system. This improvement is achieved by gradually applying various training configurations that take the experience from previous experiments into account. The key ingredient for successful Stochastic Weight Averaging in the experiments was a sufficiently high learning rate, either with a gradual transition applied or with a cyclic learning-rate schedule.
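The mechanics of Stochastic Weight Averaging with a cyclic learning rate can be sketched as follows. This is a minimal illustration of the general technique, not the thesis's training setup: the linear-decay cycle shape, the choice to average one snapshot at the end of each cycle, and the names `cyclic_lr` and `sgd_with_swa` are all assumptions made for the example.

```python
def cyclic_lr(step, cycle_len, lr_max, lr_min):
    """Learning rate that decays linearly from lr_max to lr_min, then restarts."""
    pos = step % cycle_len
    frac = pos / (cycle_len - 1) if cycle_len > 1 else 1.0
    return lr_max + (lr_min - lr_max) * frac


def sgd_with_swa(grad_fn, w0, steps, cycle_len, lr_max, lr_min):
    """Run gradient descent and keep a running average of the weights.

    Assumption: one snapshot is averaged at the end of every cycle,
    where the learning rate is at its minimum.
    Returns (final weights, averaged weights, number of snapshots).
    """
    w = list(w0)
    swa = [0.0] * len(w)
    n_avg = 0
    for step in range(steps):
        lr = cyclic_lr(step, cycle_len, lr_max, lr_min)
        g = grad_fn(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
        if (step + 1) % cycle_len == 0:
            n_avg += 1
            # Incremental mean: swa <- swa + (w - swa) / n_avg
            swa = [s + (wi - s) / n_avg for s, wi in zip(swa, w)]
    return w, swa, n_avg
```

The function takes any gradient callable, so it can be tried on a toy objective such as a quadratic. A toy problem only shows the bookkeeping; it says nothing about whether averaging helps on a real speaker recognition network.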
Robust Speaker Verification with Deep Neural Networks
Profant, Ján ; Rohdin, Johan Andréas (referee) ; Matějka, Pavel (advisor)
The objective of this work is to study state-of-the-art speaker verification systems based on deep neural networks, called x-vectors, under various conditions, such as wideband and narrowband data, and to develop a system that is robust to unseen languages, specific noise, or speech codecs. The system takes a variable-length audio recording and maps it to a fixed-length embedding, which is then used to represent the speaker. We compared our systems to BUT's submission to the Speakers in the Wild Speaker Recognition Challenge (SITW) from 2016, which used the previously popular statistical models, i-vectors. Comparing the single best systems, we observed that the recently published x-vectors gave a more than 4.38 times lower Equal Error Rate on the SITW core-core condition than BUT's SITW submission. Moreover, we find that diarization substantially reduces the error rate when there are multiple speakers in the SITW core-multi condition, but we did not see the same trend on NIST SRE 2018 VAST data.
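The Equal Error Rate quoted above is the operating point at which the miss rate and the false alarm rate are equal. A minimal sketch of how it can be approximated from verification scores is given below; the threshold sweep over observed scores and the function name `equal_error_rate` are illustrative choices, not the evaluation tooling used in the thesis.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where miss rate and false alarm rate are
    closest, as the mean of the two rates at that point.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # Miss: a target (same-speaker) trial that is rejected.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial that is accepted.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            best_gap = gap
            eer = (miss + fa) / 2.0
    return eer
```

With only a handful of trials the two rates may never be exactly equal, which is why the sketch averages them at the closest point; evaluation toolkits typically interpolate instead.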
Description of automatic tool change at milling machines
Podloucký, Milan ; Matějka, Petr (referee) ; Pavlík, Jan (advisor)
The aim of this bachelor thesis is to provide a literature review and a comprehensive classification of the equipment currently used for automatic tool change in milling centers.
Industrial robot application in area forming technology
Coufal, Jiří ; Matějka, Petr (referee) ; Knoflíček, Radek (advisor)
The purpose of this bachelor thesis is to familiarize the reader with the application of industrial robots in forming technology. It presents selected metal forming processes and typical forming machines, and in particular the types of industrial robots and manipulators (division, classification, robot manufacturers). The thesis describes the industrial robots and manipulators currently used in industrial applications and in metal forming technology. Representative figures with captions illustrate the topics discussed.
Voice Activity Detection
Břenek, Roman ; Grézl, František (referee) ; Matějka, Pavel (advisor)
This thesis describes techniques for voice activity detection (VAD) in audio recordings. It is necessary to correctly classify all non-speech segments and to recognize speech against a noisy background. The whole VAD process is described, i.e. digitizing the audio signal, feature extraction, training of the system, post-processing, and final evaluation. Three different systems are compared in the thesis. The first is based on phoneme recognition using a neural network; the other two are variations of Gaussian Mixture Models (GMM). Each system was tested on three data sets: Tactical Speaker Identification Speech Corpus (TSID), Ham Radio (HR), and Rich Transcription Evaluation (RT05-RT07). The best results of each system are compared with third-party results.
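The chain of per-frame features, frame decisions, and post-processing can be illustrated with a deliberately simple stand-in. The sketch below is an energy-threshold detector with majority-vote smoothing; it is not one of the three systems compared in the thesis (neither the phoneme recognizer nor the GMM variants), and the function names, frame length, and threshold are assumptions chosen for the example.

```python
import math


def frame_log_energy(samples, frame_len):
    """Cut the signal into non-overlapping frames and return log-energy per frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        energies.append(math.log(energy + 1e-12))  # small floor avoids log(0)
    return energies


def smooth(decisions, width=3):
    """Post-processing: majority vote over a sliding window of frame decisions."""
    half = width // 2
    out = []
    for i in range(len(decisions)):
        window = decisions[max(0, i - half): i + half + 1]
        out.append(sum(window) * 2 > len(window))
    return out


def energy_vad(samples, frame_len, threshold):
    """Label each frame as speech (True) or non-speech (False)."""
    raw = [e > threshold for e in frame_log_energy(samples, frame_len)]
    return smooth(raw)
```

The smoothing step removes isolated single-frame detections, which is one common reason a post-processing stage follows the frame classifier.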
Construction of Azimuth Fork Mount
Dostál, Jan ; Matějka, Petr (referee) ; Pavlík, Jan (advisor)
The purpose of this master's thesis is the design of an azimuth fork mount with a load capacity of up to 20 kg, including the drives for both axes. The solution contains design options for the azimuth fork mount, calculations, the design proposal of the mount, and drawings of the shaft, the fork, and the assembly.
Speaker Recognition Based on Long Temporal Context
Fér, Radek ; Matějka, Pavel (referee) ; Černocký, Jan (advisor)
This thesis deals with the extraction of features suitable for speaker recognition from longer temporal segments. After introducing current techniques for extracting such features, we propose and describe a new method that operates on the time scale of phonemes and uses the well-known i-vector technique. Considerable effort was devoted to finding a suitable representation of temporal features that could make speaker recognition systems more robust, in particular by modeling prosody. Our approach does not explicitly model any specific temporal parameters of speech; instead, it uses the co-occurrence of speech frames as a source of temporal features. We test and analyze this technique on the NIST SRE 2008 speech database. Unfortunately, the results show that this technique does not bring the expected improvement for speaker recognition. We discuss and analyze this finding at the end of the thesis.
Construction of hydraulic wood-splitting machine
Šimčík, Jaroslav ; Opl, Miroslav (referee) ; Matějka, Petr (advisor)
Bc. Jaroslav Šimčík: Construction of hydraulic wood-splitting machine. DP, Institute of Production Machines, Systems and Robotics, 2010, 60 pages, 30 figures, 7 appendices. This master's thesis is concerned with wood-processing technology, with a focus on the design of a hydraulic wood-splitting machine with a splitting force of 120 kN.
Learning the Face Behind a Voice
Kyjonka, Mojmír ; Matějka, Pavel (referee) ; Plchot, Oldřich (advisor)
This thesis deals with face reconstruction based on voice. The state of the art for this problem is investigated, and a model for it is trained. The model used in this thesis is based on the work "Reconstructing faces from voices", whose architecture is based on a Generative Adversarial Network (GAN). In this work, we used the VGGFace and VoxCeleb datasets and, in addition, created a small audiovisual dataset of Czech speakers. The work was implemented in the Python scripting language using the PyTorch library.
