National Repository of Grey Literature
Machine Learning in the Domain of Stylometry and Authorship Attribution
Drápela, Karel ; Škoda, Petr (referee) ; Smrž, Pavel (advisor)
This thesis deals with authorship attribution of English internet comments. It describes the state of the art in authorship attribution on social networks and explains how the new system created during the work on this thesis functions. The system is based on selecting the most informative features, mostly from character n-grams and part-of-speech tags. The thesis presents the results of testing on comments from the social networks Quora and Twitter.
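To illustrate the kind of feature the abstract mentions, the following is a minimal sketch of character n-gram extraction. It is not the thesis's implementation: the function names `char_ngrams` and `relative_profile`, the default `n=3`, and the use of relative frequencies are assumptions made for illustration, and the selection of the most informative features is not shown.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in `text` (hypothetical helper)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def relative_profile(text, n=3):
    """Turn raw n-gram counts into relative frequencies that sum to 1."""
    counts = char_ngrams(text, n)
    total = sum(counts.values())
    if total == 0:
        # Text shorter than n characters yields no n-grams.
        return {}
    return {gram: count / total for gram, count in counts.items()}


if __name__ == "__main__":
    # Example comment text is invented for illustration only.
    print(relative_profile("this is a short comment", n=3))
```

Relative frequencies are used here so that comments of different lengths remain comparable; a real system would likely combine such profiles with other features, such as the part-of-speech tags named in the abstract.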
Source Code Authorship Attribution
Pružina, Tomáš ; Dytrych, Jaroslav (referee) ; Smrž, Pavel (advisor)
With the rise of the internet, source code plagiarism and copyright infringement have been increasing and have become a problem in both academic and corporate environments. It is therefore important to be able to identify the author of source code using stylometric techniques in order to prevent unethical infringement of authors' intellectual property. This thesis summarizes state-of-the-art approaches to source code authorship attribution and presents a general-purpose authorship attribution tool that attempts to identify the author of source code.
Assessing the reliability of stress as a feature of authorship attribution in syllabic and accentual syllabic verse
Plecháč, Petr ; Birnbaum, D. J.
This work builds on a recent study by one of the authors, which shows that statistics about versification may be used as a feature in the process of authorship attribution. One such statistic is what we have called the stress profile of a poem, a vector consisting of frequencies of stressed syllables at particular metrical positions. Our initial hypothesis was that because syllabic versification (SV) by definition regulates the number of syllables in a line but not the distribution of stresses, it allows authors to individualize their rhythmical style much more than accentual syllabic versification (ASV), where the distribution of stresses is primarily determined by meter. For that reason, we expected the stress profile to be a more reliable indicator of authorship in Spanish SV than in Czech or German ASV. This hypothesis, however, was not supported by our analysis. For most of our samples, German ASV had lower accuracy than Spanish SV, as we had predicted, but, contrary to our expectations, the accuracies for Czech ASV and Spanish SV were more or less the same. This result led us to hypothesize further that the traditional labels SV and ASV were misleading, and we sought to measure the tonic entropy of our data. In this case, Spanish SV, as expected, was found to be the least tonically regular, while there was a significant difference between the two ASV systems: the values for Czech were even closer to Spanish than to the low-scoring German system. This explains why our initial grouping of Czech and German together into a single ASV category was insufficiently nuanced.
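The stress profile described in the abstract can be sketched as follows. This is an illustrative reading of the definition only, not the authors' code: the function name `stress_profile`, the encoding of each line as a string of '1' (stressed) and '0' (unstressed) syllables, and the handling of lines of unequal length are all assumptions.

```python
def stress_profile(lines):
    """Return, per metrical position, the share of lines stressed there.

    `lines` is a list of strings over '0'/'1', one character per syllable
    (an assumed encoding). Positions missing from shorter lines are
    ignored rather than counted as unstressed.
    """
    if not lines:
        return []
    length = max(len(line) for line in lines)
    profile = []
    for pos in range(length):
        # Only lines long enough to have this position contribute to it.
        covered = [line[pos] for line in lines if len(line) > pos]
        stressed = sum(1 for mark in covered if mark == "1")
        profile.append(stressed / len(covered))
    return profile


if __name__ == "__main__":
    # Three invented four-syllable lines, for illustration only.
    print(stress_profile(["0101", "0111", "0100"]))
```

Under this encoding, a strictly regular meter would give values near 0 or 1 at every position, while a freer distribution of stresses would give intermediate values.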
