Original title:
The best of two worlds: cooperation of statistical and rule-based taggers for Czech
Translated title:
Dva typy značkování v češtině
Authors:
Spoustová, D. ; Hajič, J. ; Votrubec, J. ; Krbec, P. ; Květoň, Pavel
Document type:
Papers
Conference/Event:
Workshop on Balto-Slavonic Natural Language Processing 2007, Praha (CZ), 2007-06-29
Year:
2007
Language:
eng
Abstract:
[eng] Description of several hybrid disambiguation methods combining the strengths of hand-written disambiguation rules and statistical taggers. Three different statistical taggers (HMM, Maximum-Entropy and Averaged Perceptron) are used in a tagging experiment on the Prague Dependency Treebank. The results of the hybrid systems are better than those of any other method tried for Czech tagging so far.
[cze] Description of hybrid disambiguation methods; use of three different statistical taggers (HMM, Maximum-Entropy, Averaged Perceptron); evaluation of the results.
Keywords:
corpus linguistics; disambiguation; linguistic corpus; tagging
Project no.:
CEZ:AV0Z90610521 (CEP), 1ET100610409 (CEP), GA407/07/0679 (CEP)
Funding provider:
GA AV ČR, GA ČR
Host item entry:
ACL 2007. Proceedings of the Workshop on Balto-Slavonic Natural Language Processing 2007
Note:
Related web page: http://www.aclweb.org/anthology/W/W07/W07-1709
Institution: Institute of the Czech Language AS ČR
Document availability information: Fulltext is available at the institute of the Academy of Sciences.
Original record: http://hdl.handle.net/11104/0157456