keywords:"Zpracování přirozeného jazyka" (Natural Language Processing) - Search Results - Digital Repository


	Authorship Identification
	Fabiánek, Ondřej ; Škoda, Petr (referee) ; Smrž, Pavel (advisor)
	This bachelor's thesis deals with authorship identification based on knowledge of an author's previous texts. The aim is to analyze existing methods of authorship attribution and to create a system capable of highly successful authorship identification. The system is based on multivariate analysis and specializes in English books. The solution also includes a graphical user interface.
	Automatic Humor Evaluation
	Katrňák, Josef ; Ondřej, Karel (referee) ; Dočekal, Martin (advisor)
	The aim of this thesis is to create a system for automatic humor evaluation. The system predicts humor and its category for English input. The core of the work is to create a classifier and train the model on the created datasets to get the best possible results. The classifier architecture is based on neural networks. The system also includes a web user interface for communication with the user. The result is a web application linked to a classifier that evaluates user input and lets users provide feedback.
	Named Entity Recognition
	Rylko, Vojtěch ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
	This master's thesis describes the history and theoretical background of named-entity recognition and the implementation of a C++ system for named-entity recognition and disambiguation. The system uses a local disambiguation method and statistics generated from the Wikilinks web dataset. Various experiments and tests are performed with the implemented system and with alternative implementations. These experiments show that the system is sufficiently successful and fast. The system participates in the Entity Recognition and Disambiguation Challenge 2014.
	Automatic Link Detection in Parts of Audiovisual Documents
	Sychra, Marek ; Černocký, Jan (referee) ; Szőke, Igor (advisor)
	This paper deals with topic detection, specifically with two tasks: link detection, which finds similarities among a group of short documents according to their topic, and story segmentation, which finds the borders between topically different parts of a large document. The main motivation for the research was practical application to presentation materials from lectures at FIT (linking parts of different lectures and courses). Link detection is achieved by text and word analysis, which includes learning the meaning and importance of each word. Story segmentation uses this analysis while searching for the boundaries. Both link detection and story segmentation gave very good results when tested on a standard dataset (world news reports). In the evaluation on lecture materials the success rate was lower, but still good.
	Extraction of Relations among Named Entities Mentioned in Text
	Voháňka, Ondřej ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
	This bachelor's thesis deals with relation extraction. It explains the basic knowledge necessary for building an extraction system, then describes the design, implementation, and comparison of three systems that work differently. The following methods were used: regular expressions, NER, and a parser.
	Email spam filtering using artificial intelligence
	Safonov, Yehor ; Uher, Václav (referee) ; Kolařík, Martin (advisor)
	In the modern world, email has established itself as the most widely used technology for exchanging messages between users. It rests on three pillars that contribute to its popularity and stimulate its rapid growth: free availability, efficiency, and intuitiveness of information exchange. Together they constitute a significant advantage in the provision of communication services. On the other hand, the growing popularity of email technologies poses considerable security risks and turns them into a universal tool for spreading unsolicited content. Potential attacks may be aimed at either specific endpoints or whole computer infrastructures. Despite achieving high accuracy in spam filtering, traditional techniques often do not keep up with the rapid growth and evolution of spam techniques. These approaches suffer from overfitting, convergence to poor local minima, inefficiency in processing high-dimensional data, and long-term maintainability issues. One of the main goals of this master's thesis is to develop and train deep neural networks using the latest machine learning techniques to solve the text-based spam classification problem, which belongs to the Natural Language Processing (NLP) domain. From a theoretical point of view, the thesis focuses on the email communication area with an emphasis on spam filtering. The following parts turn to machine learning and artificial neural networks and discuss their principles of operation and basic properties. The theoretical part also covers possible ways of applying the described techniques to text analysis and NLP tasks. One of the key aspects of the study is a detailed comparison of current machine learning methods, their specifics, and their accuracy when applied to spam filtering.
	At the beginning of the practical part, the focus is on processing the email dataset. This phase was divided into five stages in order to preserve the key features of the raw data and increase the final quality of the dataset. The created dataset was used for training, testing, and validation of the chosen types of deep neural networks. The selected models ULMFiT, BERT, and XLNet have been successfully implemented. The thesis includes a description of the final data adaptation and of the neural networks' learning process, testing, and validation. At the end of the work, the implemented models are compared using a confusion matrix, and possible improvements and a concise conclusion are outlined.
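The abstract above mentions comparing models with a confusion matrix. As a minimal sketch of that kind of comparison, assuming a binary ham/spam labeling (the label names, the helper functions, and the toy predictions are illustrative and not taken from the thesis), one could compute the matrix and derive precision and recall for the spam class:

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels=("ham", "spam")):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def precision_recall(matrix):
    """Precision and recall for the second label (assumed here to be spam)."""
    _tn, fp = matrix[0]
    fn, tp = matrix[1]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy labels, made up for illustration only.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
matrix = confusion_matrix(y_true, y_pred)
precision, recall = precision_recall(matrix)
```

Running the same two functions on each model's predictions for a shared test set gives directly comparable numbers.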
	Computer as an Intelligent Partner in the Word-Association Game Codenames
	Jareš, Petr ; Fajčík, Martin (referee) ; Smrž, Pavel (advisor)
	This thesis addresses the determination of semantic similarity between words. The task is solved with a combination of the predictive model fastText and the count-based method Pointwise Mutual Information. The thesis describes a system that uses semantic models to substitute for a player in the word-association game Codenames. The system implements a game strategy that uses context information from the game's progression to benefit its own team. The system can substitute for a player in both team roles.
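Pointwise Mutual Information, named in the abstract above, scores a word pair by how much more often the two words co-occur than chance would predict: PMI(a, b) = log2(P(a, b) / (P(a) P(b))). The sketch below is only an illustration of that formula; counting co-occurrence at the sentence level, the function names, and the toy corpus are assumptions, not details from the thesis.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(sentences):
    """Compute PMI for every word pair that co-occurs in at least one sentence."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence)  # count each word once per sentence
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    n = len(sentences)
    table = {}
    for pair, joint in pair_counts.items():
        a, b = tuple(pair)
        p_joint = joint / n
        p_a, p_b = word_counts[a] / n, word_counts[b] / n
        table[pair] = math.log2(p_joint / (p_a * p_b))
    return table

def pmi(table, a, b):
    """Look up PMI for an unordered pair; never-co-occurring pairs get -inf."""
    return table.get(frozenset((a, b)), float("-inf"))

# Toy corpus, made up for illustration only.
corpus = [["cat", "dog"], ["cat", "dog"], ["cat", "fish"], ["bird", "fish"]]
table = pmi_table(corpus)
```

A count-based score like this is typically blended with an embedding similarity (for example, cosine similarity of fastText vectors); how the thesis combines the two is not stated in the abstract.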
	Word2vec Models with Added Context Information
	Šůstek, Martin ; Rozman, Jaroslav (referee) ; Zbořil, František (advisor)
	This thesis is concerned with the explanation of the word2vec models. Even though word2vec was introduced recently (2013), many researchers have already tried to extend, understand, or at least use the model because it provides surprisingly rich semantic information. This information is encoded in an N-dimensional vector representation and can be recalled by performing algebraic operations on the vectors. In addition, I suggest model modifications in order to obtain a different word representation. To achieve that, I use public picture datasets. The thesis also includes parts dedicated to a word2vec extension based on a convolutional neural network.
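The idea in the abstract above, that semantic information can be recovered through operations on the word vectors, is commonly illustrated with analogy arithmetic such as vec(king) - vec(man) + vec(woman). The following is a minimal sketch with hand-made 2-dimensional vectors; the vectors, the `analogy` helper, and the vocabulary are invented for illustration and do not come from a trained word2vec model or from the thesis.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def analogy(vectors, a, b, c):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding the inputs."""
    target = [y - x + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Hand-made toy vectors: first axis loosely "royalty", second loosely "gender".
toy_vectors = {
    "man": [1.0, 0.0],
    "woman": [1.0, 1.0],
    "king": [3.0, 0.0],
    "queen": [3.0, 1.0],
    "apple": [0.0, 3.0],
}
result = analogy(toy_vectors, "man", "king", "woman")
```

With real embeddings the same lookup is run over the full vocabulary, and the nearest neighbor is only approximately the expected word.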
	Information Extraction from Biomedical Texts
	Knoth, Petr ; Burget, Radek (referee) ; Smrž, Pavel (advisor)
	Recently, there has been much effort in making biomedical knowledge, typically stored in scientific articles, more accessible and interoperable. However, the unstructured nature of such texts makes it difficult to apply knowledge discovery and inference techniques. Annotating information units in these texts with semantic information is the first step toward making the knowledge machine-analyzable. In this work, we first study methods for automatic information extraction from natural language text. Then we discuss the main benefits and disadvantages of state-of-the-art information extraction systems and, as a result, adopt a machine learning approach to automatically learn extraction patterns in our experiments. Unfortunately, machine learning techniques often require a huge amount of training data, which can sometimes be laborious to gather. To address this problem, we investigate weakly supervised, or bootstrapping, techniques. Finally, we show in our experiments that our machine learning methods performed reasonably well and significantly better than the baseline. Moreover, in the weakly supervised learning task we were able to substantially reduce the amount of labeled data needed for training the extraction system.
	Brno Communication Agent
	Křištof, Jiří ; Fajčík, Martin (referee) ; Smrž, Pavel (advisor)
	The aim of this thesis is the implementation of a communication agent that provides information about Brno. The communication agent uses a three-tier architecture. Machine learning and neural network techniques are used for question answering. User tests determined a success rate of 84%, and 58% of the primary users were satisfied with the system. The main benefit of the work is easier retrieval of information about Brno for its residents and visitors.
