National Repository of Grey Literature
Universal Framework for HTML Triplification
Kadleček, Rastislav ; Stárka, Jakub (advisor) ; Klímek, Jakub (referee)
The aim of this bachelor thesis is to introduce the Linked Data and Resource Description Framework (RDF) technologies and to survey the current state of data extraction from HTML documents and the conversion of extracted data to RDF. The thesis introduces the software system Strigil. The system is designed to triplify data from HTML documents, but it can be extended to other file formats. Its features are demonstrated by triplifying data from selected Web sites, and statistics about the resulting RDF data are presented. The conclusion summarizes the thesis and offers some useful hints on Web site scraping.
