National Repository of Grey Literature: 19 records found, showing 1-10
Automatic Humor Evaluation
Katrňák, Josef ; Ondřej, Karel (referee) ; Dočekal, Martin (advisor)
The aim of this thesis is to create a system for automatic humor evaluation. The system predicts humor and its category for English input. The core of the work is to create a classifier and train the model on the created datasets to get the best possible results. The classifier architecture is based on neural networks. The system also includes a web user interface for communication with the user. The result is a web application linked to a classifier that evaluates user input and lets users provide feedback.
Email spam filtering using artificial intelligence
Safonov, Yehor ; Uher, Václav (referee) ; Kolařík, Martin (advisor)
In the modern world, email is the most widely used technology for exchanging messages between users. Its popularity and rapid growth rest on three pillars: free availability, efficiency and intuitive exchange of information. All of them constitute a significant advantage in the provision of communication services. On the other hand, the growing popularity of email technologies poses considerable security risks and turns them into a universal tool for spreading unsolicited content. Potential attacks may be aimed at either specific endpoints or whole computer infrastructures. Despite achieving high accuracy in spam filtering, traditional techniques often fail to keep up with the rapid growth and evolution of spam techniques. These approaches suffer from overfitting, convergence to poor local minima, inefficiency in processing high-dimensional data and long-term maintainability issues. One of the main goals of this master's thesis is to develop and train deep neural networks using the latest machine learning techniques to solve the text-based spam classification problem, which belongs to the Natural Language Processing (NLP) domain. From a theoretical point of view, the master's thesis focuses on the e-mail communication area with an emphasis on spam filtering. The following parts of the thesis turn to machine learning and artificial neural networks and discuss their operating principles and basic properties. The theoretical part also covers possible ways of applying the described techniques to text analysis and NLP tasks. One of the key aspects of the study is a detailed comparison of current machine learning methods, their specifics and their accuracy when applied to spam filtering. At the beginning of the practical part, the focus is on processing the e-mail dataset.
This phase was divided into five stages with the aim of preserving key features of the raw data and increasing the final quality of the dataset. The created dataset was used for training, testing and validation of the chosen types of deep neural networks. The selected models ULMFiT, BERT and XLNet were successfully implemented. The master's thesis includes a description of the final data adaptation and of the neural networks' learning process, testing and validation. At the end of the work, the implemented models are compared using a confusion matrix, and possible improvements and a concise conclusion are outlined.
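As an illustration of the comparison step mentioned above, the following is a minimal sketch of how a binary confusion matrix and the metrics derived from it could be computed for a spam classifier. The label names ("spam", "ham"), the function names and the toy data are assumptions made for this example; the abstract does not describe the thesis's actual evaluation code.

```python
def confusion_counts(y_true, y_pred, positive="spam"):
    """Count true/false positives and negatives for a binary task.

    The label name "spam" is an assumption for this sketch.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summarize(counts):
    """Derive accuracy, precision and recall from confusion counts."""
    total = sum(counts.values())
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / total if total else 0.0,
        "precision": counts["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": counts["tp"] / actual_pos if actual_pos else 0.0,
    }


# Toy labels, purely hypothetical
y_true = ["spam", "ham", "spam", "ham", "spam"]
y_pred = ["spam", "spam", "ham", "ham", "spam"]
counts = confusion_counts(y_true, y_pred)
metrics = summarize(counts)
```

Computing the same summary for each model on an identical test split would give a like-for-like basis for comparing them.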
Integration of advanced artificial intelligence methods with log management security systems
Sedláček, Jiří ; Mikulec, Marek (referee) ; Safonov, Yehor (advisor)
Cyber security is a very important aspect of everyone's daily life. With the ever-expanding cyberspace and its growing influence on the real world, the issue of cyber security is all the more important. The theoretical part of the thesis describes the basic aspects of security monitoring. The process of collecting event logs and their management is also briefly described. An important means of security monitoring is security information and event management. Its advantages, disadvantages and possible improvements with artificial intelligence are discussed. Security orchestration, automation and response functions are also covered in the theoretical part, as are machine learning techniques such as neural networks and deep learning. This section also focuses on cyber operations centres in terms of improving the efficiency of human "manual" labour. A survey of possible machine learning techniques for this use case was conducted, as the lack of human resources is a critical issue within security operations centres. The practical part of the thesis sets out a goal (text sequence classification) that could considerably ease the work of manually categorizing event logs according to their source. For this task, security monitoring data was collected from different log sources. The practical part also describes the methods for processing this data in detail. Subsequently, a suitable neural network model was selected and described technically. Finally, the final data processing and the process of training, validating and testing the model are described. Three scenarios were developed for this process, which are then described in detail in the measurement results.
Visual Question Answering
Kocurek, Pavel ; Ondřej, Karel (referee) ; Fajčík, Martin (advisor)
Visual Question Answering (VQA) is a system whose input is an image with a question and whose output is an answer. Despite much progress in research, VQA, unlike computer-generated image captions, is rarely used in practice. The aim of this thesis is to narrow the gap between research and practice. To this end, the visually impaired community was contacted and offered a demonstration VQA application, and a mobile application was subsequently developed. A study was conducted with 20 participants from the community. The participants first tried the demonstration application for two weeks and were then asked to complete a questionnaire. 80% of respondents rated the accuracy of the VQA application as sufficient or better, and most of them would appreciate it if their caption-generation application also supported VQA. Following this finding, the thesis compares the knowledge gained from VQA with the knowledge gained from captions in various scenarios. A dataset of 111 images of diverse scenes with manually annotated captions was created. An experiment comparing the knowledge gained showed a success rate of 69.9% for VQA and 46.2% for image captions. In another experiment, participants selected the correct caption with the help of VQA in 70.9% of cases. The results suggest that VQA can reveal more knowledge about image details than generated captions can.
Classification of Relations between Named Entities in Text
Ondřej, Karel ; Doležal, Jan (referee) ; Smrž, Pavel (advisor)
This master's thesis deals with the extraction of relationships between named entities in text. The theoretical part of the thesis discusses the representation of natural language for machine processing. Subsequently, two subtasks of relationship extraction are defined, namely named entity recognition and classification of the relationships between entities, including a summary of state-of-the-art solutions. In the practical part of the thesis, a system for automatic extraction of relationships between named entities from downloaded pages is designed. The classification of relationships between entities is based on pre-trained transformers. Four pre-trained transformers are compared in this thesis, namely BERT, XLNet, RoBERTa and ALBERT.
Plot Analysis from Book Summaries and User Reviews
Rúček, Peter ; Dočekal, Martin (referee) ; Smrž, Pavel (advisor)
The aim of this work is to create a system for the analysis and classification of plot keywords from summarized storylines and user reviews in English. The chosen problem is solved using a transformer-based machine learning technique. The created solution also implements data downloading, and a dataset of user reviews and book information was created, containing more than 23 million reviews and information on more than 900 thousand books. The system can predict which plot keywords the data contains.
Machine Learning for Natural Language Question Answering
Sasín, Jonáš ; Fajčík, Martin (referee) ; Smrž, Pavel (advisor)
This thesis deals with natural language question answering using Czech Wikipedia. Question answering systems are experiencing growing popularity, but most of them are developed for English. The main purpose of this work is to explore the available possibilities and datasets and to create such a system for Czech. In the thesis I focused on two approaches. One of them uses the English ALBERT model and machine translation of passages. The other one utilizes multilingual BERT. Several variants of the system are compared in this work. Possibilities for relevant passage retrieval are also discussed. A standard evaluation is provided for every tested variant of the system. The best system version was evaluated on the SQAD v3.0 dataset, reaching 0.44 EM and 0.55 F1 score, which is an excellent result compared to other existing systems. The main contribution of this work is the analysis of existing possibilities and the setting of a benchmark for further development of better systems for Czech.
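For readers unfamiliar with the EM and F1 figures quoted above, the following is a minimal sketch of exact match and token-level F1 as they are commonly defined for extractive question answering (SQuAD-style). The normalization used here (lowercasing, stripping ASCII punctuation, whitespace tokenization) and the function names are assumptions for illustration; the abstract does not specify the evaluation script used in the thesis.

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, split on whitespace.

    This normalization is an assumption, not the thesis's own.
    """
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are typically the mean of these per-question values, which is why F1 is at least as high as EM: a partially overlapping answer earns partial F1 credit but no EM credit.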
Predicting stock price movements from financial news using deep neural networks
Kramoliš, Richard ; Baruník, Jozef (advisor) ; Vácha, Lukáš (referee)
Financial media are an important source of information and many articles about companies and stocks are released every day. This thesis assesses the information value of the articles and utilizes these articles for the stock price movement prediction task. For this purpose, models with transformer architecture are used, specifically Bidirectional Encoder Representations from Transformers. These models are able to process the text data and create the contextual representation of the text sequence. After adding the classification layer, the models are applied for the stock price movement predictions. The thesis evaluates multiple models including different techniques and parameters to find the best performing model. It focuses on two data filters that are expected to decrease the noise in the data. Moreover, it introduces a new method to recognize the company of interest. As a result of the hyperparameter optimization, the final model is constructed. JEL Classification: C45, C51, C52, C53, G11, G14, G17. Keywords: BERT, Transformer, Financial Articles, Stock Trading.
Document Information Extraction
Janík, Roman ; Špaňhel, Jakub (referee) ; Hradiš, Michal (advisor)
With the advance of digitization comes the need to analyze historical documents. Named entity recognition is an important task for information extraction and data mining. The aim of this thesis is to develop a system for information extraction from Czech historical documents, such as newspapers, chronicles and registry books. An information extraction system was designed whose input is scanned historical documents processed by an OCR algorithm. The system is based on a modified RoBERTa model. Information extraction from Czech historical documents brings challenges in the form of the need for a suitable corpus for historical Czech. The Czech Named Entity Corpus (CNEC) and the Czech Historical Named Entity Corpus (CHNEC), together with a corpus I created myself, were used to train the system. The system achieves an F1 score of 88.85 on CNEC and 87.19 on CHNEC. This is an improvement of 1.36 F1 on CNEC and 5.19 F1 on CHNEC, and thus the best known results.
Call Sign Detection and Recognition in VHF Communication
Dedič, Juraj ; Kocour, Martin (referee) ; Szőke, Igor (advisor)
This work explores the processing of data from air traffic communication in order to detect and recognize the call signs it contains. In particular, it involves recognizing these call signs in human-made and automated text transcripts of the communication between pilots and air traffic controllers. The thesis compares various ways of solving this task and describes their problems. It implements a system for identifying these call signs using a suitable technology based on large language models. One of the outputs of this work is a service that is able to distinguish the call signs, which enables efficient indexing and sorting of this data.
