National Repository of Grey Literature
Framework for Information Extraction from WWW
Brychta, Filip ; Bartík, Vladimír (referee) ; Burget, Radek (advisor)
The web has developed into the largest source of electronic documents, so it would be very useful to process this information automatically. This is, however, not a trivial problem. Most documents are written in HTML (Hypertext Markup Language), which does not support semantic description of the content. The goal of this work is to create a modular system for extracting information from HTML documents and processing it further. Further processing here means storing the extracted information in an XML document or a relational database. The system's modularity makes it possible to combine various information extraction and storage methods, so the system can be used for a variety of tasks.
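
The modular design described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual implementation: the names `Extractor`, `Storage`, `LinkExtractor`, `XmlStorage`, `SqliteStorage`, and `run_pipeline` are hypothetical, and link extraction is only an assumed stand-in for whatever extraction methods the framework provides. The sketch shows one extraction module feeding interchangeable storage modules (an XML document and a relational database) using only the Python standard library.

```python
import sqlite3
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from typing import Protocol


class Extractor(Protocol):
    """Hypothetical interface: turns an HTML string into flat records."""
    def extract(self, html: str) -> list[dict[str, str]]: ...


class Storage(Protocol):
    """Hypothetical interface: persists extracted records somewhere."""
    def store(self, records: list[dict[str, str]]) -> None: ...


class LinkExtractor(HTMLParser):
    """Example extraction module: collects href and label of each <a> element."""

    def __init__(self) -> None:
        super().__init__()
        self._records: list[dict[str, str]] = []
        self._href = None   # href of the <a> currently open, if any
        self._text: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._text).strip()
            self._records.append({"href": self._href, "label": label})
            self._href = None

    def extract(self, html: str) -> list[dict[str, str]]:
        self._records = []
        self.reset()
        self.feed(html)
        self.close()
        return self._records


class XmlStorage:
    """Example storage module: builds an XML document in memory."""

    def __init__(self) -> None:
        self.root = ET.Element("records")

    def store(self, records):
        for record in records:
            item = ET.SubElement(self.root, "record")
            for key, value in record.items():
                ET.SubElement(item, key).text = value

    def to_string(self) -> str:
        return ET.tostring(self.root, encoding="unicode")


class SqliteStorage:
    """Example storage module: writes records to a relational table."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS links (href TEXT, label TEXT)"
        )

    def store(self, records):
        self.conn.executemany(
            "INSERT INTO links (href, label) VALUES (:href, :label)", records
        )
        self.conn.commit()


def run_pipeline(html: str, extractor: Extractor, storages: list[Storage]):
    """Run one extractor and hand its records to every storage module."""
    records = extractor.extract(html)
    for storage in storages:
        storage.store(records)
    return records
```

Because `run_pipeline` depends only on the two small interfaces, a different extraction method or storage backend could be swapped in without touching the rest of the sketch, which is the property the abstract attributes to the system's modularity.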
