Large Language Models in Speech Recognition
Tomašovič, Martin ; Polok, Alexander (reviewer) ; Beneš, Karel (supervisor)
This thesis explores the conditions under which a Large Language Model (LLM) improves Automatic Speech Recognition (ASR) transcription. Specifically, it focuses on n-best rescoring with masked and autoregressive language models. The n-best hypotheses are scored with the LLM, and this score is then interpolated with the scores from the ASR system. The approach is tested across different ASR settings and datasets. Results demonstrate that rescoring hypotheses from the Wav2Vec 2.0 and Jasper ASR systems reduces the error rate. Fine-tuning the LLM proves highly beneficial: smaller fine-tuned models can surpass larger models that were not fine-tuned. The findings broaden the knowledge of the conditions under which LLMs (autoregressive or masked) are useful for ASR rescoring. The thesis examines how fine-tuning, normalization, and separating the scores from a CTC decoder influence the reduction in word error rate.
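
As a rough illustration of the rescoring described above, the sketch below scores each n-best hypothesis with an autoregressive language model and interpolates that score with the ASR score. It is only a minimal sketch under assumed details: the model name ("gpt2"), the interpolation weight, and the example n-best list with its ASR scores are all illustrative, not taken from the thesis.

```python
# Minimal n-best rescoring sketch: interpolate ASR scores with LM log-probabilities.
# Model choice, lm_weight, and the example hypotheses/scores are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def lm_log_prob(text: str) -> float:
    """Total log-probability of `text` under the autoregressive LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # The returned loss is the mean cross-entropy over predicted tokens;
        # multiplying by their count gives the negative total log-likelihood.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

def rescore(nbest, lm_weight=0.5):
    """Return (hypothesis, combined_score) pairs sorted best-first.

    `nbest` is a list of (hypothesis_text, asr_log_score) pairs.
    """
    scored = [(hyp, asr + lm_weight * lm_log_prob(hyp)) for hyp, asr in nbest]
    return sorted(scored, key=lambda x: x[1], reverse=True)

# Illustrative n-best list with made-up ASR log-scores.
nbest = [
    ("i scream for ice cream", -4.2),
    ("eye scream for ice cream", -3.9),
]
print(rescore(nbest)[0][0])
```

In this formulation the combined score is simply asr_score + lm_weight * lm_log_prob(hypothesis); tuning lm_weight on held-out data is the usual way to balance the two sources, and a masked LM could be substituted by replacing lm_log_prob with a pseudo-log-likelihood estimate.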
