keywords:"Information Extraction" - Search Results - Digital Repository

	Automatically Updated Web Portal Staněk, Petr ; Škoda, Petr (referee) ; Smrž, Pavel (advisor) This bachelor's thesis covers the design and implementation of an automatically updated web portal that tries to resolve the shortcomings of portals filled with other people's content. It also compares existing scientific portals and discusses the problems of extracting, storing and searching for information. The general mechanisms are demonstrated on a portal of European research projects, which addresses the shortcomings of CORDIS, the official information portal for European research and development. The thesis takes the existing product as a prototype and aims to improve the quality of the extraction and to extend the system so that it detects potential problems and notifies an administrator of them. This was achieved by increasing the robustness and speed of the extractor, by logging all important events associated with the extraction, and by implementing a separate administrator section of the web portal, which informs the administrator about problems and offers tools for solving them.
	Information Extraction from Loosely Structured Text Minárik, Matej ; Bartík, Vladimír (referee) ; Burget, Radek (advisor) Nowadays we speak of Web 2.0, which means a web of documents rather than a web of data. Documents are mostly unstructured or only partially structured, but search engines need data in structured form in order to provide better search results. The main goal of this work is the extraction of structured data from partially structured documents. We analyze information extraction methods, namely classification methods, which need annotated training data to build their internal model. We also analyze methods that do not need training; these are initialized with a few examples of the data we are interested in extracting. We propose a method for extracting therapeutic indications and active substances from medical information sheets.
	Extracting text data from the webpages Troják, David ; Morský, Ondřej (referee) ; Červenec, Radek (advisor) This work deals with text mining from web pages and gives an overview of available programs and their methods of text extraction. Part of the work is a program written in Java that obtains text data from specific web pages and saves it to an XML file.
	Metadata Extraction from Scientific Papers Lokaj, Tomáš ; Dytrych, Jaroslav (referee) ; Otrusina, Lubomír (advisor) This work deals with metadata extraction from scientific papers. It gives a general description of the problem of information extraction, focusing on the processing of text documents. It also presents the program clanky2meta.py, created by the author and designed to search for relevant information in scientific publications. The work concludes with a comparison with systems dealing with the same problem, especially the CiteSeerX system.
	Support of Information Extraction from Structured Text Kliment, Radek ; Petřík, Patrik (referee) ; Křivka, Zbyněk (advisor) This bachelor's thesis deals with information extraction from structured text. The application converts text from supported formats into an XML representation that is used for queries, and the corresponding output is then created. The thesis describes the individual input file formats, including how they are converted into XML. The main part explains the application's functionality and implementation and includes a short user manual.
	Extraction of Relations among Named Entities Mentioned in Text Voháňka, Ondřej ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor) This bachelor's thesis deals with relation extraction. It explains the basic knowledge necessary for creating an extraction system. It then describes the design, implementation and comparison of three systems that work differently. The following methods were used: regular expressions, NER, and a parser.
	Information Extraction from Wikipedia Krištof, Tomáš ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor) This bachelor's thesis describes the problem of information extraction from unstructured text. The first part summarizes the basic techniques used for information extraction. The design and implementation of a system for information extraction from Wikipedia are then described. The last part of the thesis analyses the results of the experiments.
	Encyclopedia Expert Krč, Martin ; Schmidt, Marek (referee) ; Smrž, Pavel (advisor) This project focuses on a system that answers questions formulated in natural language. First, the report discusses problems associated with question answering systems and some commonly employed approaches. Emphasis is placed on shallow methods, which do not require many linguistic resources. The second part describes our work on a system that answers factoid questions, using Czech Wikipedia as a source of information. Answer extraction is based partly on specific features of Wikipedia and partly on pre-defined patterns. Results show that for answering simple questions, the system provides significant improvements in comparison with a standard search engine.
	Framework for Information Extraction from WWW Brychta, Filip ; Bartík, Vladimír (referee) ; Burget, Radek (advisor) The Web has developed into the largest source of electronic documents, so it would be very useful to process this information automatically. This is, however, not a trivial problem. Most documents are written in HTML (Hypertext Markup Language), which does not support semantic description of the content. The goal of this work is to create a modular system for extracting information from HTML documents and processing it further. Further processing means storing the information in an XML document or a relational database. The system's modularity makes it possible to use various information extraction and storage methods, so the system can be used for various tasks.
	Methods for Information Extraction in Text Documents Sychra, Tomáš ; Burget, Radek (referee) ; Bartík, Vladimír (advisor) Knowledge discovery in text documents is part of data mining. However, text documents have different properties from regular databases. This project contains an overview of methods for knowledge discovery in text documents. The most common task in this area is document classification, and various approaches to text classification are described. Finally, I present the Winnow algorithm, which should perform better than any other classification algorithm. The project includes a description of the Winnow implementation and an overview of experimental results.
