National Repository of Grey Literature : 55 records found
Generating Code from Textual Description of Functionality
Šamánek, Jan ; Fajčík, Martin (referee) ; Smrž, Pavel (advisor)
With the continuing rise of machine learning and ever-larger neural network models, the need for GPU-accelerated resources and algorithms to support these models grows as well. Since large language models are already used today as programming assistants for modern programming languages, they could help with this problem. If these models can also learn less common paradigms such as CUDA, they could assist with the development and maintenance of these systems. This thesis examines the ability of modern language models to learn CUDA as a programming paradigm, and also creates a new training set intended for this purpose.
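A training set of the kind described above pairs natural-language descriptions with CUDA code. A minimal sketch of how such (description, code) pairs could be mined from CUDA sources, taking the block comment immediately preceding a `__global__` kernel as its description; the pattern and function name are illustrative assumptions, not the thesis's actual pipeline:

```python
import re

# Pair a preceding /* ... */ block comment with the CUDA kernel signature
# that follows it. Illustrative sketch only.
PAIR = re.compile(
    r"/\*(?P<desc>.*?)\*/\s*"                                # block comment
    r"(?P<code>__global__\s+void\s+\w+\s*\([^)]*\)\s*\{)",   # kernel header
    re.DOTALL,
)

def mine_pairs(source: str):
    """Return (description, code) pairs found in a CUDA source string."""
    return [(m.group("desc").strip(), m.group("code")) for m in PAIR.finditer(source)]
```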
Text Analysis in Specialized Translation: Accuracy and Error Rate
Parobková, Alžbeta ; Marcoň, Petr (referee) ; Dohnal, Přemysl (advisor)
The thesis focuses on surveying and applying methods of text analysis and machine translation to evaluate the quality of technical texts translated by automatic machine translation. The practical part uses these methods to implement an algorithm for identifying and classifying errors. Another part of the practical work is the application and training of a neural model for correcting these errors. The comparison of error rate and translation accuracy across different translation engines is then demonstrated not only qualitatively but also quantitatively, using standard metrics.
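One standard quantitative metric of the kind mentioned above is a word-level error rate: the edit distance (insertions, deletions, substitutions) between a machine translation and a reference, normalised by reference length. A minimal sketch, with an illustrative function name:

```python
def word_error_rate(hypothesis: str, reference: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    hyp, ref = hypothesis.split(), reference.split()
    # Dynamic-programming table for edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

A perfect translation scores 0.0; one substituted word in a three-word reference scores 1/3.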
Semantic Analysis of Parish Record
Kaňkovský, Adam ; Zbořil, František (referee) ; Rozman, Jaroslav (advisor)
The aim of this work is to design and implement an application for the semantic analysis of parish records, which takes as its input text obtained from a scan of a parish register. The extracted information is then entered into the appropriate fields of a table.
Automatic detection and attribution of quotes
Ustinova, Evgeniya ; Hana, Jiří (advisor) ; Vidová Hladká, Barbora (referee)
Quotation extraction and attribution are important practical tasks for the media, but most existing solutions are monolingual. In this work, I present a complex machine-learning-based system for the extraction and attribution of direct and indirect quotations, which is trained on English and tested on Czech and Russian data. The Czech and Russian test datasets were manually annotated as part of this study. The system is compared against a rule-based baseline model. The baseline model demonstrates better precision in extracting quotation elements, but low recall. The machine-learning-based model is better overall at extracting both separate elements of quotations and full quotations.
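A rule-based baseline like the one described above typically matches a quoted span next to a reporting-verb cue and a capitalised speaker. A minimal sketch; the verb list and patterns are simplified assumptions, not the thesis's actual rules:

```python
import re

# Hypothetical cue verbs for attribution; real systems use larger lexicons.
REPORTING_VERBS = r"(?:said|says|stated|added|claimed)"

# Direct quote followed by "Speaker Name <verb>".
PATTERN = re.compile(
    r'"(?P<quote>[^"]+)"\s*,?\s*'                        # quoted content
    r"(?P<speaker>[A-Z][\w.]*(?:\s[A-Z][\w.]*)*)\s+"     # capitalised speaker
    rf"{REPORTING_VERBS}"                                # reporting-verb cue
)

def extract_quotes(text: str):
    """Return (quote, speaker) pairs for direct quotations in `text`."""
    return [(m.group("quote"), m.group("speaker")) for m in PATTERN.finditer(text)]
```

Such patterns tend to be precise but miss indirect and unconventional quotations, matching the high-precision, low-recall behaviour reported for the baseline.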
Security log anonymization tool focusing on artificial intelligence techniques
Šťastná, Ariela ; Jurek, Michael (referee) ; Safonov, Yehor (advisor)
SIEM systems play a crucial role in security monitoring. They aggregate, normalise, and filter the collected log records, which forms the basis for applying data-mining techniques. In this way, SIEMs offer an excellent source of large volumes of normalised data. These data hold the potential to advance security research, data mining, and artificial intelligence, where they can improve existing exploration methods, make network scanning more transparent, and reveal more sophisticated attack vectors. However, one of the main obstacles to using these data is that the data in log records are in many cases sensitive and can pose a security risk. For this reason, a tool was created for anonymising sensitive data in log records while preserving correlations between the data. The main goal of this bachelor's thesis is to address the technical and legal aspects of log processing and anonymisation for artificial intelligence. As part of the research, an analysis of the most frequently occurring data in logs was carried out, together with an assessment of their risk, resulting in categories of data according to their sensitivity. The thesis further presents an analysis of current SIEM systems together with the meta keys they use.
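Correlation-preserving anonymisation, as described above, can be sketched with keyed pseudonyms: every occurrence of the same sensitive value maps to the same token, so cross-record correlations survive while the real value does not. The salt, token format, and restriction to IPv4 addresses are illustrative assumptions:

```python
import hashlib
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SALT = b"per-deployment-secret"  # hypothetical secret, rotated per dataset

def pseudonymise_ip(ip: str) -> str:
    """Map an IPv4 address to a stable, non-reversible pseudonym."""
    digest = hashlib.sha256(SALT + ip.encode()).hexdigest()[:8]
    return f"ip-{digest}"

def anonymise_line(line: str) -> str:
    """Replace every IPv4 address in a log line with its pseudonym."""
    return IPV4.sub(lambda m: pseudonymise_ip(m.group()), line)
```

Because the mapping is deterministic for a given salt, an analyst can still count events per (pseudonymised) host or join records across log sources.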
Assessing the impact of manual corrections in the Groningen Meaning Bank
Weck, Benno ; Lopatková, Markéta (advisor) ; Vidová Hladká, Barbora (referee)
The Groningen Meaning Bank (GMB) project develops a corpus with rich syntactic and semantic annotations. Annotations in the GMB are generated semi-automatically and stem from two sources: (i) initial annotations from a set of standard NLP tools, and (ii) corrections/refinements by human annotators. For example, on the part-of-speech level of annotation there are currently 18,000 of these corrections, so-called Bits of Wisdom (BOWs). To apply this information to boost NLP processing, we experimented with using the BOWs to retrain the part-of-speech tagger and found that it can be improved to correct up to 70% of identified errors on held-out data. Moreover, an improved tagger helps to raise the performance of the parser. Preferring sentences with a high rate of verified tags in retraining proved to be the most reliable strategy. With a simulated active-learning experiment using Query-by-Uncertainty (QBU) and Query-by-Committee (QBC), we showed that selectively sampling sentences for retraining yields better results with less data than random selection. In an additional pilot study, we found that a standard maximum-entropy part-of-speech tagger can be augmented so that it uses already-known tags to enhance its tagging decisions on an entire sequence without first retraining a new model.
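The Query-by-Uncertainty sampling mentioned above selects for annotation the sentences the current tagger is least confident about. A minimal sketch, assuming per-token tag probability distributions from an existing tagger and using the margin between the top two tags as the confidence score; all names are illustrative:

```python
def sentence_uncertainty(tag_distributions):
    """tag_distributions: one list of tag probabilities per token.

    A small average margin between the two most probable tags means the
    tagger is uncertain, so we negate the mean margin to rank by it.
    """
    margins = []
    for probs in tag_distributions:
        top = sorted(probs, reverse=True)
        margins.append(top[0] - (top[1] if len(top) > 1 else 0.0))
    return -sum(margins) / len(margins)

def select_for_retraining(pool, k):
    """pool: list of (sentence, tag_distributions); return the k most uncertain."""
    ranked = sorted(pool, key=lambda item: sentence_uncertainty(item[1]), reverse=True)
    return [sentence for sentence, _ in ranked[:k]]
```

Query-by-Committee works analogously but ranks sentences by disagreement among several taggers instead of one tagger's margin.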
Generating Code from Textual Description of Functionality
Kačur, Ján ; Ondřej, Karel (referee) ; Smrž, Pavel (advisor)
The aim of this thesis was to design and implement a system for code generation from a textual description of functionality. In total, two systems were implemented. One of them served as a control prototype; the second was the main product of this thesis. I focused on using smaller, non-pre-trained models. Both systems used a Transformer-type model as their core. The second system, unlike the first, used syntactic decomposition of both the code and the textual descriptions. The data used in both systems originated from the CodeSearchNet project. The target programming language to generate was Python. The second system achieved better quantitative results than the first, with an accuracy of 85% versus 60%. The system managed to auto-complete correct code to finish a function definition, albeit with a larger time delay. This thesis is almost exclusively dedicated to the second system.
Phishing Detection Using Deep Learning Attention Techniques
Safonov, Yehor
In the modern world, electronic communication is the most widely used technology for exchanging messages between users. The growing popularity of email brings considerable security risks and transforms it into a universal tool for spreading phishing content. Even though traditional techniques achieve high accuracy in spam filtering, they often do not keep up with the rapid growth and evolution of spam techniques. These approaches are affected by overfitting issues, may converge to a poor local minimum, are inefficient in high-dimensional data processing, and have long-term maintainability problems. The main contribution of this paper is to develop and train advanced deep networks that use attention mechanisms for efficient phishing filtering and text understanding. Key aspects of the study lie in a detailed comparison of attention-based machine learning methods, their specifics, and their accuracy when applied to the phishing problem. From a practical point of view, the paper focuses on preprocessing the email data corpus. Deep learning attention-based models, for instance BERT and XLNet, have been successfully implemented and compared using statistical metrics. The obtained results show indisputable advantages of deep attention techniques compared to common approaches.
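Email corpus preprocessing of the kind emphasised above typically strips message headers and HTML remnants, normalises tokens, and truncates to the model's input budget. A minimal sketch; the 512-token limit matches BERT-style models, while the cleaning rules and names are illustrative assumptions:

```python
import re

TAG = re.compile(r"<[^>]+>")       # inline HTML tags left in email bodies
URL = re.compile(r"https?://\S+")  # links, replaced by a placeholder token

def preprocess_email(raw: str, max_tokens: int = 512) -> str:
    """Clean a raw email into a whitespace-tokenised string for a classifier."""
    body = raw.split("\n\n", 1)[-1]  # drop headers before the first blank line
    body = TAG.sub(" ", body)        # remove HTML markup
    body = URL.sub("[URL]", body)    # keep the fact a link existed, not its target
    tokens = body.split()
    return " ".join(tokens[:max_tokens])
```

Replacing URLs with a placeholder rather than deleting them preserves a signal that is often discriminative for phishing.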
