National Repository of Grey Literature
Web Page Segmentation Algorithms Based on Clustering
Lengál, Tomáš ; Bartík, Vladimír (referee) ; Burget, Radek (advisor)
This report deals with the segmentation of web pages, which is an important discipline of information extraction. In the first part, we describe several general approaches to implementing it. We then introduce the Box Clustering Segmentation method, which takes a slightly different approach to segmentation. In the second half, we describe the implementation of this method as part of the FITLayout framework, followed by the final testing.
Web Page Segmentation Algorithms
Laščák, Tomáš ; Burgetová, Ivana (referee) ; Burget, Radek (advisor)
Segmentation of web pages is one of the disciplines of information extraction. It makes it possible to divide a page into distinct semantic blocks. This thesis deals with segmentation as such and with the implementation of a segmentation method. We describe several example methods, such as VIPS and DOM PS. We give a theoretical description of the chosen method and of the FITLayout framework, which this method extends. The implementation of the chosen method is described in detail, with a focus on the problems we had to solve. We also describe the testing, which helped to reveal some weaknesses. The conclusion summarizes the results and suggests possible ideas for extending this work.
