Národní úložiště šedé literatury Nalezeno 3 záznamů.  Hledání trvalo 0.00 vteřin. 
Search in speech recordings based on semantic vectors
Boboš, Dominik ; Karafiát, Martin (oponent) ; Schwarz, Petr (vedoucí práce)
In the current era of information overload, efficient methods for information retrieval are crucial. This thesis summarises methods for obtaining vector representations for text and audio, also known as semantic vectors. We took a deeper look at joint-representation models such as SpeechT5 and SeamlessM4T, which transform these various forms of input into one shared vector space. Based on these models, we built a system which allows us to search in data regardless of the modality. In order to evaluate the proposed solution on semantic search tasks, apart from standard keyword spotting tasks, we labelled a dataset to capture similar semantic meanings of the keywords or phrases. Finally, we conducted several experiments, where we explored the possibilities of the models used by limiting the context seen during finetuning or involving text-to-speech (TTS) systems to improve overall performance.
Detection of Pre-Recorded Messages in Speech
Boboš, Dominik ; Matějka, Pavel (oponent) ; Černocký, Jan (vedoucí práce)
Recognition of pre-recorded messages in speech is useful for any follow-up speech data mining. This thesis summarises the theory of searching similar utterances in speech and efficient approaches to compare two sequences. To investigate identification of redundant information in audio, it is necessary to have a large amount of data with the exact phrases repeated multiple times. We generated a dataset by mixing pre-recorded messages into phone calls with variations in speed, volume and repetitions. Our system tackles known messages and unknown messages'' scenarios by using approaches like clustering or detection in chunks. Dynamic time warping, approximate string matching and recurrent quantification analysis are compared, and finally, all mentioned techniques are combined to obtain a precise and efficient system.
Detection of Pre-Recorded Messages in Speech
Boboš, Dominik ; Matějka, Pavel (oponent) ; Černocký, Jan (vedoucí práce)
Recognition of pre-recorded messages in speech is useful for any follow-up speech data mining. This thesis summarises the theory of searching similar utterances in speech and efficient approaches to compare two sequences. To investigate identification of redundant information in audio, it is necessary to have a large amount of data with the exact phrases repeated multiple times. We generated a dataset by mixing pre-recorded messages into phone calls with variations in speed, volume and repetitions. Our system tackles known messages and unknown messages'' scenarios by using approaches like clustering or detection in chunks. Dynamic time warping, approximate string matching and recurrent quantification analysis are compared, and finally, all mentioned techniques are combined to obtain a precise and efficient system.

Chcete být upozorněni, pokud se objeví nové záznamy odpovídající tomuto dotazu?
Přihlásit se k odběru RSS.