National Repository of Grey Literature
Data augmentation integration into Pytorch
Vašina, Ladislav ; Polok, Alexander (referee) ; Szőke, Igor (advisor)
This thesis presents a tool that provides a unified, simple, and user-friendly interface on top of audio augmentation libraries that can be used together with the PyTorch library. The tool gives access to a wide range of augmentations from different libraries and makes it easy to apply them to datasets. Supporting such a wide range of augmentations was possible only by combining the interfaces of the individual libraries. The user passes the tool a list of augmentations with their parameters, and the tool decides which of the integrated libraries to use for each augmentation. The tool was tested on the task of fine-tuning the Whisper automatic speech recognition system. The main contribution of this work is a solution to the problem posed by the large number of audio augmentation libraries, each of which offers a different number and type of augmentations as well as different features and interfaces.
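The dispatch idea in the abstract (the user supplies a list of augmentations with parameters, and the tool picks the library that implements each one) can be illustrated with a minimal sketch. This is not the thesis's implementation: the registry layout, the function names, and the two "backends" are hypothetical stand-ins written in plain Python, and a list of floats stands in for an audio tensor so that the example runs without any dependencies.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Waveform = List[float]

# Hypothetical stand-ins for functions that would live in two separate
# augmentation libraries; real libraries expose different interfaces.
def _backend_a_gain(wave: Waveform, factor: float = 1.0) -> Waveform:
    """Scale every sample by a constant factor."""
    return [s * factor for s in wave]

def _backend_b_clip(wave: Waveform, limit: float = 1.0) -> Waveform:
    """Clamp every sample to the range [-limit, limit]."""
    return [max(-limit, min(limit, s)) for s in wave]

def _backend_b_reverse(wave: Waveform) -> Waveform:
    """Reverse the signal in time."""
    return list(reversed(wave))

# Registry mapping an augmentation name to (backend label, callable).
# The unified interface consults this table to choose the backend.
REGISTRY: Dict[str, Tuple[str, Callable[..., Waveform]]] = {
    "gain": ("backend_a", _backend_a_gain),
    "clip": ("backend_b", _backend_b_clip),
    "reverse": ("backend_b", _backend_b_reverse),
}

AugmentationSpec = Tuple[str, dict]

def backends_used(specs: Sequence[AugmentationSpec]) -> List[str]:
    """Report which backend would handle each requested augmentation."""
    return [REGISTRY[name][0] for name, _ in specs]

def apply_augmentations(wave: Waveform,
                        specs: Sequence[AugmentationSpec]) -> Waveform:
    """Apply the requested augmentations in order via their backends."""
    out = list(wave)
    for name, params in specs:
        if name not in REGISTRY:
            raise ValueError(f"unknown augmentation: {name!r}")
        _backend, fn = REGISTRY[name]
        out = fn(out, **params)
    return out

if __name__ == "__main__":
    specs = [("gain", {"factor": 2.0}), ("clip", {"limit": 1.0}), ("reverse", {})]
    print(backends_used(specs))
    print(apply_augmentations([0.25, -0.5, 1.0], specs))
```

The caller only names augmentations and their parameters. Which library handles each one is a lookup detail hidden behind the single `apply_augmentations` entry point, so adding another library would mean adding registry entries rather than changing calling code.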
