Extracting Information from Medical Texts
Zvára, Karel ; Svátek, Vojtěch (advisor) ; Veselý, Arnošt (referee) ; Skalská, Hana (referee)
The aim of my work was to identify the specific features of Czech medical reports with respect to the possibility of extracting specific information from them. For my work, I had a total of 268 anonymized narrative medical reports from two outpatient departments. I studied standards for preserving electronic health records and for transferring clinical information between healthcare information systems. I also participated in the process of implementing an electronic medical record in the field of dentistry. First of all, I tried to process narrative medical reports using natural language processing (NLP) tools. I came to the conclusion that narrative medical reports in the Czech language are very different from typical Czech text, especially because they mostly contain short telegraphic phrases and lack typical Czech sentence structure. They also contain many misspellings, acronyms and abbreviations. Another problem was the absence of Czech translations of the main international classification systems. Therefore I decided to continue the research by developing a method for pre-processing the input text for translation and its semantic annotation. The main objective of this part of the research was to propose a method and support software for interactive correction...
Data comparability in knowledge discovery in databases
Horáková, Linda ; Chudán, David (advisor) ; Svátek, Vojtěch (referee)
The master's thesis focuses on the analysis of data comparability and commensurability in datasets that are used for obtaining knowledge with data mining methods. Data comparability is one aspect of data quality and is crucial for obtaining correct and applicable results from data mining tasks. The aim of the theoretical part of the thesis is to briefly describe the field of knowledge discovery and to define the specifics of mining aggregated data. Moreover, the terms comparability and commensurability are discussed. The main part focuses on the process of knowledge discovery. These findings are applied in the practical part of the thesis. The main goal of this part is to define a general methodology that can be used to discover potential data comparability problems in the analyzed data. This methodology is based on the analysis of a real dataset containing daily sales of products. In conclusion, the methodology is applied to data from the field of public budgets.
Analysis of utilization
Káva, Ján ; Svátek, Vojtěch (advisor) ; Mynarz, Jindřich (referee)
The bachelor's thesis deals with the utilization of a semantic model. The theoretical part analyzes the use of the model with the data standards Microdata, RDFa and JSON-LD. It also describes search engine optimization and the possibilities the model offers for improving it. The practical part analyzes the utilization of the model in the data formats Microdata, RDFa and JSON-LD, based on data extracted from a web crawl in October 2016.
XML Formats Evolution and Integration
Klímek, Jakub ; Nečaský, Martin (advisor) ; Klettke, Meike (referee) ; Svátek, Vojtěch (referee)
In the past decade, XML has become a widespread data model for information exchange. Many XML users maintain many XML formats described by XML schemas. To ease the management of multiple XML schemas modeling similar reality, a conceptual model for XML was defined. With its definition came various challenges that needed further research. This thesis focuses on two of those challenges. The first challenge is to manage the evolution of the multi-level conceptual model as the modeled reality, XML schemas and applications evolve over time. The second challenge is to allow the majority of users, who already use XML schemas in their systems without the conceptual model, to use their schemas to create one semi-automatically. In addition, a step was taken towards integrating the conceptual modeling of XML with semantic web techniques.
Linked Open Data for Public Sector Information
Mynarz, Jindřich ; Svátek, Vojtěch (advisor) ; Urbánek, Štefan (referee)
The diploma thesis introduces the domain of proactive disclosure of public sector information via linked open data. At the start, the legal framework encompassing public sector information is expounded along with the basic approaches for its disclosure. The practices of publishing data as open data are defined as an approach for proactive disclosure that is based on the application of the principle of openness to data with the goal of enabling equal access and equal use of the data. The reviewed practices range from necessary legal actions and choices of appropriate technologies to ways in which the technologies should be used to achieve the best data quality. Linked data is presented as a knowledge technology that, for the most part, fulfils the requirements for an open technology suitable for open data. The thesis extrapolates further from the adoption of linked open data in the public sector to recognize the impact and challenges proceeding from this change. The distinctive focus on the side supplying data and the trust in the transformative effects of technological changes are identified among the key sources of these challenges. The emphasis on technologies for data disclosure at the expense of a more careful attention to the use of data is presented as a possible source of risks that may undermine the...
Methods for effective querying of RDF data
Dokulil, Jiří ; Pokorný, Jaroslav (advisor) ; Svátek, Vojtěch (referee) ; Benczúr, András (referee)
RDF is one of the basic building blocks of the Semantic Web. It is a low-level data format intended to be used by software developers to create semantic-enabled applications. The ability to pose and efficiently evaluate queries is key in this scenario. In this thesis, we approach the problem of RDF querying from three different angles. First, we present an RDF visualization tool that the developer can use to get an idea of the structure and contents of the data. Second, we have designed extensions of the XQuery language that give it RDF handling capabilities. The main contribution is the introduction of records into the language. Third, to cover query evaluation, we have designed the Bobox parallel framework, which can be used to simplify the development of parallel data processing applications. It provides both task and data parallelism.
Process Mediation Framework for Semantic Web Services
Vaculín, Roman ; Neruda, Roman (advisor) ; Nečaský, Martin (referee) ; Svátek, Vojtěch (referee)
The goal of Web services is to enable interoperability of heterogeneous software systems. Semantic Web services enhance syntactic specifications of traditional Web services with machine-processable semantic annotations to facilitate interoperability. As Web services become popular in both corporate and open environments, the ability to deal with incompatibilities between service requesters and providers becomes a critical factor for achieving interoperability. Process mediation solves the problem of interoperability by identifying and resolving all incompatibilities and by mediating between service requesters and providers. In this thesis we address the problem of process mediation of Semantic Web services. We introduce an Abstract Process Mediation Framework that identifies the key functional areas to be addressed by process mediation components. Specifically, we focus on process mediation algorithms, discovery of external services, monitoring, and fault handling and recovery. We present algorithms for solving the process mediation problem in two scenarios: (a) when the mediation process has complete visibility of the process model of the service provider and the service requester (complete visibility scenario), and (b) when the mediation process has visibility only of the process model of the service provider but...
Knowledge Systems on the Semantic Web
Pinďák, Josef ; Zamazal, Ondřej (advisor) ; Svátek, Vojtěch (referee)
The aim of this paper is to understand the issues surrounding knowledge systems on the Semantic Web, to analyze the case studies and use cases published on the W3C web portal, and to create a knowledge base for the expert system NEST, which, based on the analysis of a case study or use case, recommends its classification as a particular type of knowledge system. The aim is achieved by analyzing the case studies and use cases. The result of the analysis is a set of criteria for each case study or use case. These criteria are passed to the expert system, which then recommends a classification of the case study or use case. The development of the knowledge base also drew on data from the dissertation [1], in which the knowledge bases are already separated into individual categories. The benefit of this work is an expert system that gives the user a recommendation on how to determine the type of an unknown knowledge system and its field of potential use. The first two chapters are dedicated to the theory of knowledge systems and the Semantic Web. The third chapter explains the steps of the analysis, and the fourth chapter is dedicated to the creation of the knowledge base for the NEST system.
Ontology of Building Accessibility
Hazuza, Petr ; Svátek, Vojtěch (advisor) ; Mynarz, Jindřich (referee)
Within the project Maps without Barriers, carried out under the Charta 77 Foundation - Barriers Account, we intend to map in 2015 the accessibility of buildings and their premises from the perspective of people with limited mobility. We plan to inspect nearly 600 castles, palaces and other tourist attractions in the Czech Republic. The acquired data will be gathered and published as an online map in the form of open and machine-readable data. It will also appear as Linked Open Data. However, the project will not end with mapping premises; the main objective is to provide a solid foundation for a unified database of the accessibility of buildings and their premises. Negotiations with institutions and organizations interested in mapping are in progress, and we are offering them our project platform for the publication of their data. The required RDFS vocabulary will be designed and implemented as part of this diploma thesis. It will be tested on data from a number of forms describing existing objects. The data will be gathered by means of services designed within this thesis and provided to purchasers and users equally.
Association Rule Mining as a Support for OLAP
Chudán, David ; Svátek, Vojtěch (advisor) ; Máša, Petr (referee) ; Novotný, Ota (referee) ; Kléma, Jiří (referee)
The aim of this work is to identify the possibilities for the complementary use of two analytical methods of data analysis: OLAP analysis and data mining, represented by GUHA association rule mining. Using these two methods on one dataset in the context of the proposed scenarios is expected to produce a synergistic effect, surpassing the knowledge acquired by the two methods independently. This is the main contribution of the work. Another contribution is the original use of GUHA association rules where the mining is performed on aggregated data. In their abilities, GUHA association rules outperform the classic association rules referred to in the literature. The experiments on real data demonstrate the finding of unusual trends in data that would be very difficult to acquire using standard methods of OLAP analysis, that is, by time-consuming manual browsing of an OLAP cube. On the other hand, the use of association rules alone loses a general overview of the data. It can therefore be stated that these two methods complement each other very well. Part of the solution is also the use of the LMCL scripting language, which automates selected parts of the data mining process. The proposed recommender system would shield the user from association rules, thereby enabling ordinary analysts unfamiliar with association rules to use their capabilities. The thesis combines quantitative and qualitative research. Quantitative research is represented by experiments on a real dataset, the proposal of a recommender system and the implementation of selected parts of the association rule mining process in the LISp-Miner Control Language. Qualitative research is represented by structured interviews with selected experts from the fields of data mining and business intelligence, who confirm the meaningfulness of the proposed methods.

