The Corpus of Radio Records
Štěpánová, Veronika
The paper presents a forthcoming corpus of recordings of professional radio speakers, whose creation has been supported by a grant from GA UK. This corpus of recordings and their transcriptions will serve as a source for research into pronunciation usage in Czech media. The material was obtained from the audio archives of Czech Radio, from which good-quality recordings of many radio programmes can be downloaded. The selection focuses on speakers for whom a larger number of longer monologue recordings are available, making it likely that the phenomena under observation will occur in sufficient numbers.
The usage of an InterCorp parallel corpus to obtain equivalents for a Croatian-Czech dictionary
Jirásek, Karel
In 2010, the parallel Croatian-Czech corpus, which is part of the InterCorp project, exceeded 10 million tokens in both language versions, which made it possible to put it to practical use in finding equivalents for the Croatian-Czech dictionaries being prepared. A corpus of this size has proved quite adequate for building a medium-sized dictionary of approximately 20 thousand entries. For frequent entries, because the two languages are directly confronted, a corpus of this size often captures the polysemy of words better than some existing explanatory or bilingual dictionaries do. This is a great advantage, especially when creating bilingual dictionaries of two closely related languages.