Original title: Automatická tvorba korpusů
Translated title: Automatic Creation of Corpora
Authors: Šantavý, Marek ; Černocký, Jan (referee) ; Smrž, Pavel (advisor)
Document type: Bachelor's thesis
Year: 2009
Language: cze
Publisher: Vysoké učení technické v Brně. Fakulta informačních technologií
Keywords: corpus; duplicates; near-duplicate; Rabin fingerprint; redundancy; SHA-384; text-data similarity; vertical format; web crawl

Institution: Brno University of Technology
Document availability information: Fulltext is available in the Brno University of Technology Digital Library.
Original record: http://hdl.handle.net/11012/54503

Permalink: http://www.nusl.cz/ntk/nusl-571324


 Record created 2024-04-02, last modified 2024-04-03

