National Repository of Grey Literature: 142 records found (records 123 - 132)
Library for Support of ReReSearch System Development
Heller, Stanislav ; Otrusina, Lubomír (referee) ; Šperka, Svatopluk (advisor)
At present, development of the ReReSearch system is significantly slowed by mutual incompatibility of system modules, by developers repeating already known mistakes, and by poor communication between developers in general. To address this, a component was needed that would implement and unify frequently performed tasks in ReReSearch development and thereby save developers' time. The result of this effort is "rrslib", a Python library intended to help everyone who works on parts of the ReReSearch project: the database, data extractors, web-based agents, crawlers, XML processing, etc. The library should make development of the ReReSearch system more consistent, faster and more reliable.
An Intelligent System for Question Answering
Mičulka, Jakub ; Kouřil, Jan (referee) ; Otrusina, Lubomír (advisor)
This work deals with the processing of natural language queries submitted to search engines. It explains the basic operating principles of search engines, with the main focus on database search engines. The essential part of the work covers the design and implementation of the questionAnswering system, which is used to search for information in the database of the ReReSearch project. The reader is introduced to the design and implementation procedure of this system and to the fundamental problems that arose during the work. Finally, the system is evaluated using standard metrics.
Extraction of Relations among Named Entities Mentioned in Text
Voháňka, Ondřej ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis deals with relation extraction. It explains the basic knowledge necessary for creating an extraction system. It then describes the design, implementation and comparison of three systems that work in different ways. The following methods were used: regular expressions, NER, and a parser.
The Most Frequent Word n-Grams
Holec, Matúš ; Szőke, Igor (referee) ; Smrž, Pavel (advisor)
This thesis deals with the design and implementation of an efficient system for extracting word n-grams from texts. The system is based on batch processing and is therefore able to process large text corpora. The first part presents the principles of existing methods for n-gram extraction. The next part describes the implemented system as well as an approach to accelerating it by parallelizing the batch processing. The last part compares the efficiency of available implementations with the designed system, and the time complexity of the sequential approach with the parallelized one.
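As a rough illustration of the batch-based idea in the abstract above (not the thesis's actual implementation), the following minimal Python sketch counts word n-grams batch by batch and merges the partial counts. The function names, the whitespace tokenization and the `batch_size` parameter are assumptions made for this example; because each batch is counted independently, the per-batch step is the part that could in principle be handed to parallel workers.

```python
from collections import Counter
from itertools import islice


def word_ngrams(tokens, n):
    """Return an iterator over successive word n-grams (tuples) of a token list."""
    return zip(*(islice(tokens, i, None) for i in range(n)))


def count_batch(batch, n):
    """Count n-grams in one batch of lines (hypothetical helper).

    Tokenization is plain lowercase whitespace splitting, which is an
    assumption of this sketch. N-grams never span two lines.
    """
    counts = Counter()
    for line in batch:
        counts.update(word_ngrams(line.lower().split(), n))
    return counts


def count_ngrams_in_batches(lines, n, batch_size=1000):
    """Count n-grams over an iterable of lines, one batch at a time."""
    totals = Counter()
    batch = []
    for line in lines:
        batch.append(line)
        if len(batch) >= batch_size:
            totals.update(count_batch(batch, n))  # merge partial counts
            batch = []
    if batch:  # process the final, possibly smaller batch
        totals.update(count_batch(batch, n))
    return totals


corpus = ["the cat sat on the mat", "the cat ate", "on the mat"]
bigram_counts = count_ngrams_in_batches(corpus, n=2, batch_size=2)
print(bigram_counts[("the", "cat")])
```

Only one batch of lines is held in memory at a time, which is what lets this style of processing scale to large inputs; the merged `Counter` itself still grows with the number of distinct n-grams.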
Processing Czech in Python
Novotný, Zdeněk ; Schmidt, Marek (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis presents several approaches to processing the Czech language. The first part contains a general description of the NLTK system; some of the functions described later were inspired by NLTK functions. It then describes functions that handle the inflection of words of various classes in Czech. The next part focuses on processing Czech text, in which individual sentences and other parts are found and marked. The last part describes the possibility of applying transformation rules to each part of the text. The results of applying the rules can be represented graphically.
Czech-English Translation
Petrželka, Jiří ; Schmidt, Marek (referee) ; Smrž, Pavel (advisor)
This master's thesis describes the principles of statistical machine translation and demonstrates how to build the Moses statistical machine translation system. In the preparatory phase, freely available bilingual Czech-English corpora are surveyed. An empirical analysis of the time requirements of multithreaded word alignment tools shows that MGIZA++ can achieve up to a fivefold speedup and PGIZA++ up to an eightfold speedup (compared with GIZA++). Three methods of morphological pre-processing of the Czech training data are tested using simple non-factored models. While simple lemmatization can lower BLEU, more sophisticated approaches mostly raise it. The positive effects of morphological pre-processing fade as the corpus grows. The relationship between other corpus characteristics (size, genre, additional data) and the resulting BLEU is measured empirically. The final system is trained on the CzEng 0.9 corpus and evaluated on the test sample from the WMT 2010 workshop.
Similarity Search in Document Collections
Jordanov, Dimitar Dimitrov ; Plchot, Oldřich (referee) ; Smrž, Pavel (advisor)
The main goal of this thesis is to assess the performance of the freely distributed Semantic Vectors package and the MoreLikeThis class from the Apache Lucene package. The thesis compares these two approaches and introduces methods that may improve search quality.
Data Mining in Social Networks
Raška, Jiří ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor)
This thesis deals with knowledge discovery from social media, focusing on feature-based opinion mining from user reviews. The theoretical part describes methods of opinion mining and natural language processing. The main part of the thesis is the design and implementation of a library for opinion mining based on the Stanford Parser and the WordNet lexicon. Dependency grammar was used for feature identification, implicit features were mined with the CoAR method, and opinions were classified with a supervised algorithm. Finally, experiments with the implemented library and examples of its usage are presented.
Document Classification
Marek, Tomáš ; Škoda, Petr (referee) ; Otrusina, Lubomír (advisor)
This thesis deals with document classification, especially with text classification methods. Its main goal is to analyze and describe two document classification algorithms of the author's choice and to implement them. The chosen algorithms are the Bayes classifier and a classifier based on support vector machines (SVM), both of which are analyzed and implemented in the practical part of the thesis. Another main goal is to create and select optimal text features, that is, those that best describe the input text and thus lead to the best classification results. The thesis concludes with a set of tests comparing the performance of the chosen classifiers under various conditions.
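To make the Bayes classifier mentioned in the abstract above concrete, here is a minimal multinomial naive Bayes sketch in Python with add-one (Laplace) smoothing. It is an illustration only and is not taken from the thesis: the function names, the bag-of-words whitespace features and the toy training data are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect per-label document and word counts from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in text.lower().split():  # assumed feature: lowercase words
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log-posterior for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # add-one smoothing avoids zero probabilities
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this example.
training_docs = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes", "ham"),
]
model = train(training_docs)
print(classify(model, "cheap pills"))
```

Working in log space keeps the product of many small probabilities from underflowing; the choice of features (here, raw lowercase words) is exactly the kind of decision the abstract describes as affecting classification quality.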
Named Entity Recognition
Rylko, Vojtěch ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This master's thesis describes the history and theoretical background of named-entity recognition and the implementation of a C++ system for named entity recognition and disambiguation. The system uses a local disambiguation method and statistics generated from the Wikilinks web dataset. Various experiments and tests are performed with the implemented system and with alternative implementations. These experiments show that the system is sufficiently accurate and fast. The system participated in the Entity Recognition and Disambiguation Challenge 2014.
