National Repository of Grey Literature: 19 records found
Design and Implementation of Sound Recognizer of Particular Grasshopper Species
Schwarz, Jan ; Peterek, Nino (advisor) ; Hlaváčová, Jaroslava (referee)
Biologists asked us to create a system that recognizes particular grasshopper species from stridulation recordings. The system currently recognizes five grasshopper species found in the Czech Republic, using HTK, a freely available speech recognition toolkit. In addition to the acoustic model itself, we created web pages that analyse a stridulation recording and save the result for later use. The current model is based on only a limited number of training recordings, but its results are satisfactory. The web pages also serve as a data-gathering system, so the model can be further extended and improved.
Automatic Acquisition of Paradigms with Minimal Supervision
Klíč, Radoslav ; Hana, Jiří (advisor) ; Hlaváčová, Jaroslava (referee)
The thesis presents a semi-supervised morphology learner developed by extending Paramor (Monson, 2009), an unsupervised system, to accept easy-to-obtain, manually provided data in the form of inflections with a marked morpheme boundary. In addition, a hierarchical clustering framework that allows multiple sources of information to be combined was developed as part of the thesis. The approach was tested on Czech, Slovene, German and Catalan and showed an increased F-measure in comparison with the Paramor baseline.
Searching Czech Structured Data using Stemming
Tattermusch, Jan ; Hlaváčová, Jaroslava (advisor) ; Kuboň, Vladislav (referee)
This work describes and implements a component for full-text searching with Czech diacritics restoration and stemming support. Diacritics restoration is based on statistical principles and is context dependent. This work presents five stemmers ready for immediate use (two algorithmic stemmers and three hybrid stemmers) and discusses their properties. The component is implemented using the Apache Lucene library and provides a simple interface for querying and for insertions, deletions and updates of indexed documents. Stored documents consist of named fields with predefined data types. Besides regular full-text queries, the component also supports non-trivial queries with additional constraints and provides a way to customize how the query result score is computed. The component's performance is sufficient for medium-load applications: approximately 50 queries per second with a repository that contains 2.7 million documents. The contribution of stemming and diacritics restoration to the quality of full-text searching was measured using MAP and is significant.
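The abstract reports search quality as MAP (mean average precision) without spelling out the measure. A minimal sketch of the standard definition follows; the function names and the toy document identifiers are illustrative assumptions and are not taken from the thesis.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query.

    Mean of precision@k over every rank k that holds a relevant document,
    divided by the total number of relevant documents.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """MAP over a list of (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data (hypothetical document ids): two queries and their ranked results.
runs = [
    (["a", "b", "c"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
print(mean_average_precision(runs))
```

Comparing this value for the same query set with and without stemming or diacritics restoration is one way such a contribution could be quantified.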
Recognition of numerals in Czech texts
Bureš, Jan ; Hlaváčová, Jaroslava (advisor) ; Štěpánek, Jan (referee)
The purpose of this work is to create a tool capable of recognizing cardinal numerals in Czech text, whether written in digits or in words. Emphasis is placed on recognizing numerals written in words and combining them correctly. Besides grammatically correct expressions, other expressions and combinations that are frequent in everyday language use were also taken into account. The output is the numeric value of the recognized numeral and a set of possible morphological tags for each numeral. The program performs its own lexical and grammatical analysis based on a given set of numeral forms and rules.
Czech morphological guesser
Suchánek, Michal ; Hlaváčová, Jaroslava (advisor) ; Mírovský, Jiří (referee)
The first step of text analysis is tagging word forms with morphological tags. These tags describe the part of speech, person (where applicable), number, etc. This information is used for further analysis of the text. Tags are automatically assigned by looking up the word form in the morphological dictionary. This gives good results for the Czech language because the word forms express the morphological categories to some extent. Unlike English words, Czech words often change their ending when their morphological category changes. Words that are not present in the dictionary can be tagged with a guesser. The guesser described here uses the similarity of unrecognized words with words already present in the dictionary.
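The abstract does not say how similarity to dictionary words is measured. As one possible reading, the sketch below assumes similarity means the longest shared word ending, which fits the observation that Czech endings carry morphological information. The function names, the `min_suffix` threshold, the tag labels and the toy lexicon are all hypothetical.

```python
def common_suffix_len(a, b):
    """Length of the longest common suffix of two strings."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def guess_tags(word, lexicon, min_suffix=2):
    """Return the tags of the dictionary words sharing the longest suffix with `word`.

    `lexicon` maps known word forms to sets of tags. An empty set is returned
    when no known form shares at least `min_suffix` final characters.
    """
    best_len = 0
    best_tags = set()
    for known, tags in lexicon.items():
        n = common_suffix_len(word, known)
        if n < min_suffix:
            continue
        if n > best_len:
            best_len, best_tags = n, set(tags)
        elif n == best_len:
            best_tags |= set(tags)
    return best_tags


# Toy lexicon with made-up tag labels (not a real Czech tagset).
lexicon = {
    "hradem": {"NOUN-ins-sg"},
    "stromem": {"NOUN-ins-sg"},
    "volat": {"VERB-inf"},
}
print(guess_tags("googlem", lexicon))
print(guess_tags("googlovat", lexicon))
```

A linear scan over the dictionary is used only for brevity; a reversed-word trie would be the usual choice for a lexicon of realistic size.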
System of Czech numerals and their automatic recognition in texts
Bureš, Jan ; Hlaváčová, Jaroslava (advisor) ; Mírovský, Jiří (referee)
This thesis has two main goals. The first goal is a systematic classification of Czech numerals and other quantitative phrases (including multi-word ones), with special regard to their possible use in the automatic recognition of Czech text. The main sources of data for this classification are current Czech grammar and the author's research in Czech language corpora. The second goal is the development of a tool for automatic recognition of numerals in Czech text, based on the system developed during the first phase of this thesis. This includes determining the basic morphological attributes of numerals and their numeric value, where possible and applicable. The tool is also prepared to deal with the fact that the grammatical rules for numerals are often disregarded.
Czech prefixes
Hrušecký, Michal ; Hlaváčová, Jaroslava (advisor) ; Mírovský, Jiří (referee)
This work studies the automatic recognition of new prefixes in the Czech language. Several methods of automatic prefix recognition are described, and one of them is analyzed in greater depth. The analyzed method is also implemented in the software that is part of this work. The software, including source code and example datasets, can be found on the attached CD. The CD also includes the results of all tests mentioned in this work.
Splitting word compounds
Oberländer, Jonathan ; Pecina, Pavel (advisor) ; Hlaváčová, Jaroslava (referee)
Unlike the English language, languages such as German, Dutch, the Scandinavian languages or Greek form compounds not as multi-word expressions, but by combining the parts of the compound into a new word without any orthographical separation. This poses problems for a variety of tasks, such as Statistical Machine Translation or Information Retrieval. Most previous work on the subject of splitting compounds into their parts, or "decompounding", has focused on German. In this work, we create a new, simple, unsupervised system for automatic decompounding for three representative compounding languages: German, Swedish, and Hungarian. A multi-lingual evaluation corpus in the medical domain is created from the EMEA corpus and annotated with regard to compounding. Finally, several variants of our system are evaluated and compared to previous work.
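The abstract does not describe how the system splits words. To illustrate the decompounding task itself, the sketch below uses a generic lexicon-based approach, not the thesis's method: it looks for the split of a word into the fewest known parts. The function name, the `min_part` threshold and the toy lexicon are assumptions; linking elements (such as the German "s" between parts) are deliberately ignored.

```python
def split_compound(word, lexicon, min_part=3):
    """Split `word` into the fewest parts that all occur in `lexicon`.

    Returns [word] (lowercased) when no complete split into known parts exists.
    Parts shorter than `min_part` characters are never proposed.
    """
    word = word.lower()
    best = {0: []}  # best[i] = fewest-part split found for word[:i]
    for end in range(1, len(word) + 1):
        for start in range(0, end - min_part + 1):
            part = word[start:end]
            if start in best and part in lexicon:
                candidate = best[start] + [part]
                if end not in best or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best.get(len(word), [word])


# Toy lexicon; in an unsupervised setting it could be a corpus word list.
lexicon = {"kranken", "haus", "arzt"}
print(split_compound("Krankenhausarzt", lexicon))
print(split_compound("Tisch", lexicon))
```

Preferring the fewest parts is a simple tie-breaker; ranking candidate splits by corpus frequency of the parts is a common alternative.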
Extending the Lexical Network DeriNet
Vidra, Jonáš ; Žabokrtský, Zdeněk (advisor) ; Hlaváčová, Jaroslava (referee)
DeriNet is a database of Czech lexical derivatives. It is a wordnet in which nodes represent lemmas sampled from the Czech National Corpus and edges represent derivational relations between them (such as work → workable → unworkable). Sourcing the lemmas from a corpus brings two problems: errors, and missing lemmas that could link together currently unconnected clusters. Therefore, a more reliable and more complete source of lemmas is needed. The goal of this thesis is to extend the lexicon of DeriNet using lemmas sourced from MorfFlex CZ, a Czech morphological dictionary, and to correct the derivational rules that produce errors with the new lexicon. The error rate is measured by comparing the relations in the database with manually annotated data created as part of the thesis.