National Repository of Grey Literature 8 records found
Big Data Governance
Blahová, Leontýna ; Pejčoch, David (advisor) ; Kyjonka, Vladimír (referee)
This master's thesis deals with Big Data Governance and the software used for this purpose. Because Big Data represents both a huge opportunity and a risk, I wanted to map products that can easily be used for Data Quality and Big Data Governance within one platform. The thesis is not purely theoretical; it also evaluates five products that I consider key. I defined requirements for each domain and then assigned weights and points. The main objective is to evaluate the software capabilities and compare them.
Towards Complex Data and Information Quality Management
Pejčoch, David ; Rauch, Jan (advisor) ; Máša, Petr (referee) ; Novotný, Ota (referee) ; Kordík, Pavel (referee)
This work deals with the issue of Data and Information Quality. It critically assesses the current state of knowledge regarding the various methods used for Data Quality Assessment and Data (Information) Quality improvement, and it proposes new principles where this critical assessment revealed gaps. The main idea of this work is the concept of Data and Information Quality Management across the entire universe of data. This universe represents all data sources that the respective subject comes into contact with and that are used in its existing or planned processes. For all these data sources, the approach considers establishing a consistent set of rules, policies and principles with respect to the current and potential benefits of these sources, while also taking into account the potential risks of their use. A common thread running through the text is the importance of additional knowledge within the process of Data (Information) Quality Management. The introduction of a knowledge base oriented to support Data (Information) Quality Management (QKB) is therefore one of the fundamental principles of the author's proposed set of best
The role of data profiling in data quality management
Fišer, David ; Pejčoch, David (advisor) ; Kyjonka, Vladimír (referee)
The goal of this thesis is to research the role of data profiling in data quality management and the quality of existing data profiling SW tools. The role of data profiling was derived from an analysis of the general methodology for data quality management. I compiled my own methodology for the analysis of data profiling tools. This methodology focuses on three main aspects: technical requirements, user friendliness and functionality. Based on the methodology, the best SW solution for data profiling was chosen. General shortcomings of the tested SW tools were also analysed.
Data Quality Tools Benchmark
Černý, Jan ; Pejčoch, David (advisor) ; Máša, Petr (referee)
Companies all around the world are wasting funds due to poor data quality. As the volume of processed data increases, the volume of erroneous data increases too. This diploma thesis explains what data quality is about, what causes data quality errors, what impact poor data has and how data quality can be measured. If you can measure it, you can improve it. This is where data quality tools are used. Some vendors offer commercial data quality tools, while others offer open-source ones. Comparing DataCleaner (an open-source tool) with DataFlux (a commercial tool) against defined criteria, this diploma thesis shows that the two tools can be considered equal in terms of data profiling, data enhancement and data monitoring. DataFlux is slightly better in standardization and data validation. Data deduplication is not included in the tested version of DataCleaner, although DataCleaner's vendor claimed it should be. One of the biggest obstacles that keeps companies from buying data quality tools could be their price. At this moment, DataCleaner can be considered an inexpensive solution for companies looking for a data profiling tool. If Human Inference added data deduplication to DataCleaner, it could also be considered an inexpensive solution that covers the whole data quality process.
Automation of data preprocessing using domain knowledge
Beskyba, Jan ; Šimůnek, Milan (advisor) ; Pejčoch, David (referee)
In this work we propose a solution that would help automate part of knowledge discovery in databases. Domain knowledge plays an important role in the automation process, so it needs to be included in the proposed program for data preparation. In the introduction to this work, we focus on the theoretical basis of knowledge discovery in databases with an emphasis on domain knowledge. Next, we focus on the basic principles of data pre-processing and on the scripting language LMCL, which could be part of the design of the new application for automated data preparation. Subsequently, we deal with the design of an application for data pre-processing, which is verified on the data of the House of Commons.
Possibility of using regular expressions in data quality management
Elznic, Matěj ; Pejčoch, David (advisor) ; Kyjonka, Vladimír (referee)
This bachelor thesis analyzes the problem of data quality and then presents the options through which data quality can be maintained. The thesis focuses on the usage of regular expressions and their application in the online environment. The thesis consists of three parts. The first part contains a theoretical definition of data quality and regular expressions. The second part presents the theoretical concept of specific roles, which are focused on e-shops, and introduces selected programming languages that are used in the practical part. The last part presents an analysis of user behavior in connection with web registration forms and a practical demonstration of implementing such a form.
Analysis of the operational risks of the banking system implementation project
Bertsch, Jan ; Žváčková, Lenka (advisor) ; Pejčoch, David (referee)
This bachelor thesis analyzes the operational risks of a banking software implementation project. In the first part of the thesis, I describe the theoretical definition of risk and the different classifications of risk. The processes of risk management are also described, with a focus on operational risk. In the second part, I describe the phases of an implementation project. There is a detailed description of the risks in the individual project phases that can negatively influence the goals of the project.
