Methodology for preparing data from digital libraries for use in digital humanities
Lehečka, B. ; Novák, D. ; Kersch, Filip ; Hladík, Radim ; Bíšková, J. ; Sekyrová, K. ; Válek, F. ; Vozár, Z. ; Bodnár, N. ; Sekan, P. ; Bežová, M. ; Žabička, P. ; Lhoták, Martin ; Straňák, Pavel
This methodology aims to offer libraries and other memory institutions in the Czech Republic a recommended procedure for making large volumes of data available for research purposes. A critical mass of documents from library collections has already been digitized, and the results of digitization are presented in various digital library systems. Making them available must always comply with the current version of the copyright law, but it is already possible to prepare for its significant amendment, which implements Directive (EU) 2019/790 of the European Parliament and of the Council and concerns, among other things, text and data mining for scientific purposes. The architecture of the newly developed digital library system recommended by the methodology will ensure scalability, easy management and the development of related services. The presented methods of data processing, data enrichment and output formats are based on the requirements of specialists from across the humanities.
Full text: PDF
Publicly accessible electronic resources to the study of the historical Czech in The Department of Language Development of The Institute of the Czech Language AS CR, v. v. i
Černá, Alena M. ; Lehečka, Boris ; Nejedlý, Petr ; Šimek, Štěpán ; Vajdlová, Miloslava
The article introduces two online resources for the study of older Czech (13th to 18th centuries); both were designed and are run by the Department of Language Development of the Institute of the Czech Language of the Academy of Sciences of the Czech Republic. The first, Vokabulář webový [Web Vocabulary] (http://vokabular.ujc.cas.cz), provides texts, images and audio materials for the study of older Czech. The materials are primarily modern and historical dictionaries, the most important of which is the gradually growing Elektronický slovník staré češtiny [Electronic Dictionary of Old Czech], which covers the Old Czech lexicon from the beginnings of the Czech language to the end of the 15th century. Vokabulář also includes electronic editions of works from the 13th century to the beginning of the 19th century, presented both as continuous texts and in a corpus version; digitized copies of older Czech grammar books; basic scholarly literature; audiobooks of older Czech texts; and software tools for working with historical texts. The second resource is Lexikální databáze humanistické a barokní češtiny [Lexical Database of Humanistic and Baroque Czech] (http://madla.ujc.cas.cz). It records the Czech vocabulary of the 16th to 18th centuries, excerpted from authentic texts of the period (both old prints and manuscripts), and illustrates it with direct quotations and source citations. The Lexical Database thus partly makes up for the missing dictionary of Czech for this period.
Electronic processing and publication of Old Czech texts
Černá, Alena M. ; Lehečka, Boris
Electronic editions prepared in the Department of Language Development of the Institute of the Czech Language of the Academy of Sciences of the Czech Republic, v. v. i., are published on the Manuscriptorium and Vokabulář webový websites (in the Edition module and the Old Czech text bank) and as e-books in the e-shop of the Academia publishing house. All electronic editions are prepared in Microsoft Word 2003 and exported automatically to these outputs. There are two main output formats: XML conforming to the TEI P5 standard and a tagged text format for the text bank; the export uses XSLT transformations and special software developed for this purpose.