Náplava, Jakub - Search Results - Digital Repository


	Natural Language Correction With Focus on Czech Náplava, Jakub ; Straka, Milan (advisor) ; Grundkiewicz, Roman (referee) ; Dušek, Ondřej (referee) Natural language correction, a subfield of natural language processing (NLP), is the task of automatically correcting user errors in written texts. It includes, but is not limited to, grammatical error correction, spelling error correction and diacritics restoration. During the course of the work on this thesis, we witnessed a great advance in this field, with the emergence of new approaches to correcting user errors, new datasets and new evaluation metrics. This thesis presents, in the form of a dissertation by publication, our contributions to this field. As Czech is the primary language of the thesis author, special focus was devoted to improving natural language correction in Czech. The main contributions are (1) the creation of the Grammar Error Correction Corpus for Czech, which comprises multiple sources of noisy texts such as essays and online discussion posts, the evaluation of strong neural models on this dataset, and a meta-evaluation of existing metrics, (2) the development of grammar error correction systems suited to scenarios in which only a small amount of annotated data is available, and (3) the development of two state-of-the-art models and the creation of a new multilingual dataset comprising 12 languages for diacritics restoration.
	Natural Language Correction Náplava, Jakub ; Straka, Milan (advisor) ; Straňák, Pavel (referee) The goal of this thesis is to explore the area of natural language correction and to design and implement neural network models for tasks ranging from general grammar correction to the specific task of diacritization. The thesis opens with a description of existing approaches to natural language correction. Existing datasets are reviewed and two new datasets are introduced: a manually annotated dataset for grammatical error correction based on CzeSL (Czech as a Second Language) and an automatically created spelling correction dataset. The main part of the thesis then presents the design and implementation of three models and evaluates them on several natural language correction datasets. In contrast to existing statistical systems, the proposed models learn all knowledge from training data; therefore, they do not require an error model or a candidate generation mechanism to be set manually, nor do they need any additional linguistic information such as part-of-speech tags. Our models significantly outperform existing systems on the diacritization task. On the spelling and basic grammar correction tasks for Czech, our models achieve the best results on two of the three datasets. Finally, on general grammatical correction for English, our models achieve results which are...
	PerfJavaDoc: extending API documentation with performance information Náplava, Jakub ; Horký, Vojtěch (advisor) ; Hnětynka, Petr (referee) Javadoc is a documentation tool used for generating API documentation from Java source code. For some methods, the generated documentation can contain a description of the algorithm and its asymptotic complexity. However, such information is of little use when the exact execution time of the method is needed with respect to certain critical characteristics. In this work, we enhance the Javadoc tool with a performance extension that makes it possible to measure the performance of a method against predefined characteristics. These characteristics are specified in a workload generator, which is a method used to prepare the actual arguments and the instance for the measured method. Separating the measured code from the preparatory code allows developers to implement both parts clearly and easily, and also to share a generator among multiple methods. The measurement itself is performed by a measuring server, which may run on a dedicated reference machine and was programmed to provide accurate results with respect to the Java platform.
