National Repository of Grey Literature 59 records found  beginprevious39 - 48nextend  jump to record: Search took 0.02 seconds. 
Semantic annotations
Dědek, Jan ; Vojtáš, Peter (advisor) ; Maynard, Diana (referee) ; Železný, Filip (referee)
Four relatively separate topics are presented in the thesis. Each topic represents one particular aspect of the Information Extraction discipline. The first two topics are focused on our information extraction methods based on deep language parsing. The first topic relates to how deep language parsing was used in our extraction method in combination with manually designed extraction rules. The second topic deals with a method for automated induction of extraction rules using Inductive Logic Programming. The third topic of the thesis combines information extraction with rule based reasoning. The core of our extraction method was experimentally reimplemented using semantic web technologies, which allows saving the extraction rules in so called shareable extraction ontologies that are not dependent on the original extraction tool. The last topic of the thesis deals with document classification and fuzzy logic. We are investigating the possibility of using information obtained by information extraction techniques to document classification. Our implementation of so called Fuzzy ILP Classifier was experimentally used for the purpose of document classification.
Consistency Checking of Relations Extracted from Text
Stejskal, Jakub ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This bachelor thesis is dedicated to mechanical techniques that are used in the natural language processing and information extraction from particular text. It is approaching the general methods that starting to process the raw text and it continues to the relations extraction from processed language constructs, moreover it provides options for the use of obtained relational data which can be seen for example in the project DBpedia. Another milestone of the described bachelor thesis is the design and implementation of an automated system for extracting information about entities, which do not have their own article on the English version of Wikipedia. Thesis also presents algorithms developed for the extraction of entities with their own name, the verification of the articles ‘existence of the extracted entities and finally the actual extraction of information about individual entities, which can be used during the information consistency checking. In the end, it can be seen the results and suggestions for further development of the created system.
Automatically Updated Web Portal
Staněk, Petr ; Škoda, Petr (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis is dedicated to the design and implementation of an automatically updated web portal that tries to resolve the shortcomings of the portals filled with other people's content. Furthermore, it presents a comparison of the existing scientific portals, it discusses the problems of extraction, saving and searching for information. General mechanisms are demonstrated on the European research projects portal, which removes the shortcomings of CORDIS, the official information portal for European research and development. The thesis takes the existing product as a prototype and its aim is to improve the quality of the extraction and extend the system to detect any potential problems and notified an administrator of them. This was achieved by increasing the robustness and speed of the extractor, by registering all the important events associated with the extraction and, on the other side, the implementation of the separate administrator section of the web portal, which informs the administrator about problems and offers the problem-solving devices.
Extracting text data from the webpages
Troják, David ; Morský, Ondřej (referee) ; Červenec, Radek (advisor)
This work deals with text mining from web pages, an overview of available programs and its methods of text extraction. Part of this work is the program created in Java language, which allows text to obtain data from specific web pages and save them into XML file.
Metadata Extraction from Scientific Papers
Lokaj, Tomáš ; Dytrych, Jaroslav (referee) ; Otrusina, Lubomír (advisor)
This work deals with the Metadata Extraction from Scienti c Papers. There is generally described issue of information extraction, focusing on the processing of text documents. There is also presented programme clanky2meta.py designed to search for relevant  information in scienti c publication, created by the author. At the end of this work is a comparsion of systems dealing with the same issue, especially with the CiteSeerX system.
Support of Information Extraction from Structured Text
Kliment, Radek ; Petřík, Patrik (referee) ; Křivka, Zbyněk (advisor)
This Bachelor thesis deals with the way of information extraction from a structured text. The application converts the text from supported formats into the XML representation that is used for queries and then, corresponding output is created. In this thesis, particular formats of input files are described including the way of their conversion into the XML. The essential part explains the application functionality and implementation including short user manual.
Extraction of Relations among Named Entities Mentioned in Text
Voháňka, Ondřej ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis deals with relation extraction. Explains basic knowledge, that is necessary for creating an extraction system. Then describes design, implementation and comparison of three systems, which works differently. Following methods were used: regular expressions, NER, parser. 
Information Extraction from Wikipedia
Krištof, Tomáš ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This bachelor's thesis describes the issue of information extraction from unstructured text. The first part contains summary of basic techniques used for information extracting. Thereafter, concept and realization of the system for information extraction from Wikipedia is described. In the last part of thesis, results, coming from experiments, are analysed.
Encyclopedia Expert
Krč, Martin ; Schmidt, Marek (referee) ; Smrž, Pavel (advisor)
This project focuses on a system that answers questions formulated in natural language. Firstly, the report discusses problems associated with question answering systems and some commonly employed approaches. Emphasis is laid on shallow methods, which do not require many linguistic resources. The second part describes our work on a system that answers factoid questions, utilizing Czech Wikipedia as a source of information. Answer extraction is partly based on specific features of Wikipedia and partly on pre-defined patterns. Results show that for answering simple questions, the system provides significant improvements in comparison with a standard search engine.
Framework for Information Exctration from WWW
Brychta, Filip ; Bartík, Vladimír (referee) ; Burget, Radek (advisor)
Web environment has developed into the largest source of electronic documents, so it would be very useful, to process this information automatically. This is however not a trivial problem. Most documents are written in HTML (Hypertext Markup Language), which does not support semantic description of the content. The goal of this work is to create modular system for information extraction and further processing of this information from HTML documents. Further processing of information means to store this information in XML document or relational database. System modularity makes it possible to use various information extraction and storing methods, thus the system can be used for various tasks.

National Repository of Grey Literature : 59 records found   beginprevious39 - 48nextend  jump to record:
Interested in being notified about new results for this query?
Subscribe to the RSS feed.