National Repository of Grey Literature
Large Language Models for Generating Code Focusing on Embedded Systems
Vadovič, Matej ; Nosko, Svetozár (referee) ; Smrž, Pavel (supervisor)
The goal of this work was to adapt a pre-trained language model for generating code for embedded systems. The thesis introduces a new dataset for fine-tuning code-generation models, consisting of 50,000 pairs of source code and comments focused on embedded systems programming, collected from the GitHub platform. Two new code-generation language models, based on pre-trained transformer models, were fine-tuned on the new corpus. The MicroCoder model is based on CodeLLaMA-Instruct 7B and was fine-tuned with the QLoRA technique to minimize computational requirements. The second model, MicroCoderFIM, is based on StarCoderBase 1B and supports code infilling (fill-in-the-middle). The models were compared using the BLEU, CodeBLEU, ChrF++, and ROUGE-L metrics. MicroCoderFIM adapts best to the new task, with over 120% improvement on all measured metrics. The model weights and the new dataset are freely available in a public repository.
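The abstract does not give implementation details, but as a rough illustration of the QLoRA setup it describes, the sketch below shows how a 4-bit quantized CodeLLaMA-Instruct 7B model could be wrapped with trainable low-rank adapters using the Hugging Face transformers, bitsandbytes, and peft libraries. The hyperparameter values (rank, alpha, dropout, target modules) are illustrative assumptions, not the configuration actually used in the thesis.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    # 4-bit NF4 quantization (the core of QLoRA): base weights are stored
    # in 4 bits, while computation runs in bfloat16.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForCausalLM.from_pretrained(
        "codellama/CodeLlama-7b-Instruct-hf",
        quantization_config=bnb_config,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")

    # Freeze the quantized base model and attach LoRA adapters; only the
    # small low-rank matrices are updated during fine-tuning.
    model = prepare_model_for_kbit_training(model)
    lora_config = LoraConfig(
        r=16,                 # adapter rank (assumed value)
        lora_alpha=32,        # scaling factor (assumed value)
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # prints the small trainable fraction

The code infilling supported by MicroCoderFIM follows the fill-in-the-middle prompt format of the StarCoder family, where the model generates the missing middle span conditioned on the surrounding prefix and suffix. A minimal prompt might look like this (the snippet content is a made-up example):

    # StarCoder-family FIM uses dedicated control tokens.
    prefix = "def read_adc(channel):\n    "
    suffix = "\n    return value"
    prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"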
