National Repository of Grey Literature

Clusters of closely related documents
Diviš, Jiří ; Húsek, Dušan (referee) ; Holub, Martin (advisor)
This thesis focuses on automatically searching for clusters of topically similar texts in a large text collection. We introduce an algorithm for finding the clusters and a method for optimizing its parameters using machine learning techniques. The algorithm is implemented and experimentally evaluated. For the evaluation we use a manually annotated collection of Czech documents, which contains a set of sample clusters chosen and tagged by a human annotator, and a huge collection of newspaper articles. Experiments show that the output of our algorithm fulfils our expectations and yields clusters of topically similar texts.
