National Repository of Grey Literature
A school analysis as a possible source of treebanks (?)
Konárová, Marie ; Vidová Hladká, Barbora (advisor) ; Zeman, Daniel (referee)
This thesis explores whether data from school sentence analyses can be used for tagging words in language corpora. To test this hypothesis, a set of sentences was selected from a common Czech language textbook, and students at selected primary and secondary schools were asked to analyse these sentences syntactically. The data were collected with Capek, a prototype sentence analysis editor. The editor is still under development, informed in part by feedback from the students and teachers who used it during data collection. Several transformation rules were developed for converting the school sentence analyses into the data structures used in the Prague Dependency Treebank. The accuracy of the conversion under the proposed rules was evaluated together with the accuracy of the students' analyses.
Detection and Correction of Inconsistencies in the Multilingual Treebank HamleDT
Mašek, Jan ; Žabokrtský, Zdeněk (advisor) ; Mareček, David (referee)
We studied the treebanks included in HamleDT and partially unified their label sets. We then used a method based on variation n-grams to automatically detect errors in morphological and dependency annotation, and used the output of a part-of-speech tagger and a dependency parser trained on each treebank to correct the detected errors. The performance of both detection and correction on both annotation levels was evaluated manually on randomly selected samples of suspected errors from several treebanks.
