Topic modelling of the publication activities of the Czech Academy of Sciences and Arts in the years 1890-1910
Kersch, Filip ; Marek, Jindřich (advisor) ; Jarolímková, Adéla (referee)
The aim of the thesis is to present an overview of the topics researched by the Czech Academy from 1890 to 1910, based on computational analysis of its publications. The work complements and extends the existing state of knowledge in this area, which has so far been limited to analyses of specific publications, fields of study or scientific classes of the Academy and has not allowed a view of the Academy's areas of interest as a whole. The introductory part of the thesis presents the context of the Academy's founding, summarizes the current understanding of the topics that Academy scholars addressed, and describes how the Academy's digitized publications can be analysed with computational methods within the digital humanities. In the research part, digitized issues of the scholarly journal Rozpravy, which was the core publication platform of the Czech Academy, were obtained from the Digital Library of the Czech Academy of Sciences. The data were processed using freely available tools and used as input for topic modelling with the LDA (Latent Dirichlet Allocation) method. The result of the thesis is a comprehensive overview of 35 specific topics that the Czech Academy addressed in the first twenty years of its existence. It...
Methodology for preparing data from digital libraries for use in digital humanities
Lehečka, B. ; Novák, D. ; Kersch, Filip ; Hladík, Radim ; Bíšková, J. ; Sekyrová, K. ; Válek, F. ; Vozár, Z. ; Bodnár, N. ; Sekan, P. ; Bežová, M. ; Žabička, P. ; Lhoták, Martin ; Straňák, Pavel
This methodology offers libraries and other memory institutions in the Czech Republic a recommended procedure for making large volumes of data available for research purposes. A critical mass of documents from library collections has already been digitized, and the results of digitization are presented in various digital library systems. Making them available must always comply with the current version of the copyright law, but it is already possible to prepare for its significant amendment, which implements Directive (EU) 2019/790 of the European Parliament and of the Council and concerns, among other things, text and data mining for scientific purposes. The architecture of the newly developed digital library system recommended by the methodology will ensure scalability, easy management and the development of related services. The presented methods of data processing and enrichment, and the output formats, are based on the requirements of specialists from across the humanities.
Standards used for encoding of pre-modern texts and ways of their presentation in digital libraries
Kersch, Filip ; Marek, Jindřich (advisor) ; Dvořák, Jan (referee)
The bachelor thesis provides an overview of standards and methods used for encoding pre-modern texts and of ways of presenting these texts in digital libraries. The literature review describes the principles for encoding electronic texts created by the TEI Consortium and the process of preparing an encoded text. It presents the structure of the TEI Guidelines, their modularity, and the possibilities for describing textual and non-textual elements contained in documents. It also examines different ways of presenting texts in digital libraries, considering current trends and especially web usability principles. In the practical part of the thesis, selected digital libraries are analysed according to a checklist based on the theoretical part. Three main areas are examined: ways of presenting texts, including their encoding; interoperability aspects; and compliance with usability principles. The information obtained is evaluated and put in the context of the theory at the end of the thesis. Keywords: digital library, text encoding, TEI, web usability

