National Repository of Grey Literature
Automatic Creation of Parallel Corpus from Movie Subtitles
Straňák, Marek ; Černocký, Jan (referee) ; Smrž, Pavel (advisor)
This work is about the creation of a parallel corpus with movie subtitles as the main source. In particular, it covers the alignment of Czech and English sentences using dictionaries and morphological analyzers, and the alignment of subtitle utterances in other languages using their timing. The work also gives basic information about parallel corpora.
XML Dictionary Tagging
Rojček, Martin ; Burget, Radek (referee) ; Smrž, Pavel (advisor)
This Bachelor's thesis describes storing dictionary data in a suitable XML structure. It deals with structure description tools (DTD, XML Schema, Relax NG and others) and the transformation (XSLT) of XML documents. It describes the best-known formats for storing dictionaries in XML (OLIF, ISLE/MILE and others) and the practical advantages of choosing one method: storage based on the simplest one, ac.dtd. The implementation shows that dictionaries stored in this form are relatively easy to work with. The last part considers obtaining statistical data using the Python scripting language.