National Repository of Grey Literature: 30 records found (records 1 - 10 shown)
Linked Data Integration
Michelfeit, Jan ; Knap, Tomáš (advisor) ; Klímek, Jakub (referee)
Linked Data have emerged as a successful publication format which could mean to structured data what the Web meant to documents. The strength of Linked Data lies in its fitness for integrating data from multiple sources. Linked Data integration opens the door to new opportunities but also poses new challenges. New algorithms and tools need to be developed to cover all steps of data integration. This thesis examines established data integration processes and how they can be applied to Linked Data, with a focus on data fusion and conflict resolution. Novel algorithms for Linked Data fusion are proposed, and the task of supporting trust with provenance information and quality assessment of fused data is addressed. The proposed algorithms are implemented as part of the Linked Data integration framework ODCleanStore.
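To make the fusion and conflict-resolution step concrete, the following is a minimal, hypothetical Python sketch: values of one property collected from several sources are grouped, conflicts are resolved by a source-quality policy, and the kept value carries a quality score. It is a generic illustration of the kind of step described, not the algorithms proposed in the thesis or implemented in ODCleanStore.

    from collections import defaultdict

    def fuse(quads, source_quality):
        """quads: iterable of (subject, predicate, value, source) tuples."""
        grouped = defaultdict(list)
        for s, p, value, source in quads:
            grouped[(s, p)].append((value, source_quality.get(source, 0.0)))
        fused = {}
        for key, candidates in grouped.items():
            # resolution policy: keep the value coming from the best-rated source
            value, quality = max(candidates, key=lambda c: c[1])
            fused[key] = {"value": value,
                          "quality": quality,
                          "had_conflict": len({v for v, _ in candidates}) > 1}
        return fused

    quads = [("ex:Prague", "ex:population", "1309000", "sourceA"),
             ("ex:Prague", "ex:population", "1275406", "sourceB")]
    print(fuse(quads, {"sourceA": 0.9, "sourceB": 0.6}))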
Converting HTML product data to Linked Data
Kadleček, Rastislav ; Nečaský, Martin (advisor) ; Svoboda, Martin (referee)
In order to make a step towards the idea of the Semantic Web, it is necessary to research ways to retrieve semantic information from documents published on the current Web 2.0. In response to the growing amount of data published in the form of relational tables, the Odalic system, based on the extended TableMiner+ Semantic Table Interpretation algorithm, was introduced to provide a convenient way to semantize tabular data using a knowledge-base disambiguation process. The goal of this thesis is to propose an extended algorithm for the Odalic system which would allow the system to gather semantic information for tabular data describing products from e-shops, which have a very limited presence in knowledge bases. This should be achieved by using a machine learning technique called classification. The thesis consists of several parts: obtaining and preprocessing the product data from e-shops, evaluating several classification algorithms in order to select the best-performing one, describing the design and implementation of the extended Odalic algorithm, describing its integration into the Odalic system, evaluating the improved algorithm using the obtained product data, and semantizing the product data using the new Odalic algorithm. In the end, the results are concluded and possible...
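As an illustration of the classification idea, here is a minimal sketch that maps free-text product names to product classes with scikit-learn; the class labels, training examples and choice of classifier are invented for the example and need not match the algorithms evaluated in the thesis.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # tiny, made-up training set: product name -> product class
    train_names = [
        "Samsung Galaxy S7 32GB black", "Apple iPhone SE 64GB",
        "Bosch washing machine 8kg", "Whirlpool front-load washer",
    ]
    train_classes = ["mobile phone", "mobile phone",
                     "washing machine", "washing machine"]

    # character n-gram TF-IDF copes reasonably well with noisy e-shop names
    model = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(train_names, train_classes)

    print(model.predict(["Nokia 3310 dual sim"]))  # expected: "mobile phone"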
Business Intelligence Implementation in an MVNO
Kamenchshikova, Alena ; Pour, Jan (advisor) ; Basl, Josef (referee)
The goal of this paper is to implement a Business Intelligence solution for the mobile virtual network operator Erbia Mobile. The first part is devoted to the description and analysis of the concepts and architecture associated with a BI implementation. The second part deals with the technical aspects of introducing BI to the company, based on requirements gathered from a series of interviews with management. The implementation starts with an analysis of the company's data sources and a detailed description of attributes essential to the telecommunication industry. Based on the requirements and the data source examination, a multidimensional analysis is created and described in detail. The next part describes the implementation of the individual components (data warehouse, ETL, OLAP cubes) as well as different optimization techniques. The components are built on the Microsoft platform using Integration, Analysis and Reporting Services. Final reports and dashboard visualizations are created using MS Excel and Power BI.
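The following toy sketch mirrors one ETL step in plain Python and SQLite rather than the Microsoft Integration Services stack used in the thesis: raw call records are extracted, the call duration in minutes is derived, and rows are loaded into a simple fact table that a report or OLAP cube could sit on. The table and column names are illustrative only.

    import sqlite3

    # "extract": pretend these rows came from the operator's source system
    raw_calls = [
        {"msisdn": "420601111222", "seconds": 185, "day": "2024-01-03"},
        {"msisdn": "420602333444", "seconds": 42,  "day": "2024-01-03"},
    ]

    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE fact_call
                    (msisdn TEXT, call_day TEXT, duration_min REAL)""")

    # "transform" (seconds -> minutes) and "load" into the fact table
    conn.executemany(
        "INSERT INTO fact_call VALUES (?, ?, ?)",
        [(r["msisdn"], r["day"], round(r["seconds"] / 60, 2)) for r in raw_calls],
    )

    # a trivial 'report': total call minutes per day
    for row in conn.execute(
            "SELECT call_day, SUM(duration_min) FROM fact_call GROUP BY call_day"):
        print(row)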
Data quality and its analysis in a non-bank loan company
Vránek, Pavel ; Maryška, Miloš (advisor) ; Espinoza, Felix (referee)
This bachelor thesis provides a comprehensive treatment of data quality, from a theoretical description of working with data in an information system, through the definition of data quality and a description of the causes and consequences of poor data quality, to an analysis of data quality in a non-bank loan company. For the analysis, suitable data quality dimensions are first selected and metrics are then defined for them. These metrics are measured over real data using the SQL query language and software designed for data quality analysis. The main contribution of this work is its comprehensive treatment of data quality issues and a demonstration of the real state of data quality in the non-bank loan company. The work also offers the possibility of extending the proposed procedures and rules for data quality management.
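As a small illustration of measuring a data quality metric with SQL, the sketch below computes a completeness dimension, the share of loan clients with a filled-in birth number; the schema and column names are hypothetical and are not taken from the thesis.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE loan_client (id INTEGER, birth_number TEXT)")
    conn.executemany("INSERT INTO loan_client VALUES (?, ?)",
                     [(1, "805512/1234"), (2, None), (3, "910203/5678")])

    # completeness metric: percentage of rows with a non-null birth_number
    completeness = conn.execute(
        """SELECT 100.0 * SUM(CASE WHEN birth_number IS NOT NULL THEN 1 ELSE 0 END)
                  / COUNT(*)
           FROM loan_client"""
    ).fetchone()[0]
    print(f"birth_number completeness: {completeness:.1f} %")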
Data quality in the business information database environment
Cabalka, Martin ; Chlapek, Dušan (advisor) ; Kučera, Jan (referee)
This master thesis is concerned with the choice of suitable data quality dimensions for a particular database of economic information, and it proposes and implements metrics for their assessment. The aim of this paper is to define the term data quality in the context of an economic information database and possible ways to measure it. Based on the dimensions selected for observation, a list of metrics was created and subsequently implemented in the SQL query language, or alternatively in its procedural extension Transact-SQL. These metrics were also tested on real data and the results were provided with a commentary. The main asset of this work is its comprehensive treatment of the data quality topic, from the theoretical definition of the term to the concrete implementation of individual metrics. Finally, this study offers a variety of both theoretical and practical directions in which this issue can be further researched.
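A sketch of another metric of the kind the thesis implements in SQL or Transact-SQL, here a uniqueness dimension: the share of duplicate company registration numbers in the database. The schema and column names are invented for the illustration.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE company (reg_no TEXT, name TEXT)")
    conn.executemany("INSERT INTO company VALUES (?, ?)",
                     [("12345678", "Alfa s.r.o."), ("12345678", "Alfa sro"),
                      ("87654321", "Beta a.s.")])

    # uniqueness metric: percentage of rows that are redundant duplicates by reg_no
    duplicated = conn.execute(
        """SELECT 100.0 * SUM(cnt - 1) / SUM(cnt)
           FROM (SELECT COUNT(*) AS cnt FROM company GROUP BY reg_no)"""
    ).fetchone()[0]
    print(f"duplicate rate by reg_no: {duplicated:.1f} %")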
Variants of data quality management within the regulation Solvency II
Pastrňáková, Alena ; Bína, Vladislav (advisor) ; Přibil, Jiří (referee)
The diploma thesis deals with data quality in connection with the legal requirements of the Solvency II regulation, which insurance companies must meet in order to keep their licences. The aim of this thesis is to consider the opportunities and impacts of implementing data quality for Solvency II. All data quality requirements of the regulation were specified and supplemented with possibilities for how to meet them; related data quality areas were also described. Sample variants of manual, partially automated and fully automated solutions were compared with regard to cost and time, based on knowledge and acquired information. The benefit of this thesis is the evaluation of possible positive and negative impacts of implementing data quality for Solvency II, taking into account the possibility of introducing data quality across the entire company. The general solution variants can also be used for decision-making on implementing data quality in most companies outside the insurance industry.
Master Data Quality and Data Synchronization in FMCG
Tlučhoř, Tomáš ; Chlapek, Dušan (advisor) ; Kučera, Jan (referee)
This master thesis deals with the topic of master data quality at retailers and suppliers of fast moving consumer goods (FMCG). The objective is to map the flow of product master data in the FMCG supply chain and identify the causes of the poor quality of the data. Emphasis is placed on analyzing the listing process for new items at retailers. Global data synchronization is one of the tools for increasing the efficiency of the listing process and improving master data quality, so another objective is to clarify the causes of the low adoption of global data synchronization in the Czech market. The thesis also suggests measures leading to better master data quality in FMCG and to the expansion of global data synchronization in the Czech Republic. The thesis consists of a theoretical and a practical part. The theoretical part defines several terms and explores supply chain operation and communication; it also covers the theory of data quality and its governance. The practical part is focused on the objectives of the thesis, whose accomplishment is based on the results of a survey among FMCG suppliers and retailers in the Czech Republic. The thesis contributes to the academic literature, which currently pays little attention to master data quality in FMCG and global data synchronization. Retailers and suppliers of FMCG can use the results of the thesis as an inspiration to improve the quality of their master data; a few methods of achieving better data quality are introduced. The thesis was assigned by the non-profit organization GS1 Czech Republic, which can use the results as one of the supporting materials for the development of its next global data synchronization strategy.
The role of data profiling in data quality management
Fišer, David ; Pejčoch, David (advisor) ; Kyjonka, Vladimír (referee)
The goal of this thesis is to research the role of data profiling in data quality management and the quality of existing data profiling software tools. The role of data profiling was determined based on an analysis of general methodologies for managing data quality. I compiled my own methodology for the analysis of data profiling tools, focusing on three main aspects: technical requirements, user friendliness and functionality. Based on the methodology, the best software solution for data profiling was chosen, and general shortcomings of the tested tools were also analysed.
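For context, a minimal data profiling pass can be sketched in a few lines of pandas, reporting per-column null counts, distinct counts and value-length ranges; this only illustrates the kind of functionality the compared tools provide and does not reimplement any of them. The sample data are made up.

    import pandas as pd

    df = pd.DataFrame({
        "customer_id": [1, 2, 3, 3],
        "email": ["a@example.com", None, "c@example", "c@example"],
    })

    # per-column profile: missing values, distinct values, value-length range
    profile = pd.DataFrame({
        "nulls": df.isna().sum(),
        "distinct": df.nunique(dropna=True),
        "min_len": df.apply(lambda s: s.dropna().astype(str).str.len().min()),
        "max_len": df.apply(lambda s: s.dropna().astype(str).str.len().max()),
    })
    print(profile)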
Data Quality Tools Benchmark
Černý, Jan ; Pejčoch, David (advisor) ; Máša, Petr (referee)
Companies all around the world waste funds because of poor data quality, and as the volume of processed data increases, the volume of erroneous data increases too. This diploma thesis explains what data quality is about, what the causes of data quality errors are, what the impact of poor data quality is, and how it can be measured. If you can measure it, you can improve it, and this is where data quality tools are used. Some vendors offer commercial data quality tools, while others offer open-source solutions. Comparing DataCleaner (an open-source tool) with DataFlux (a commercial tool) using defined criteria, this diploma thesis shows that the two tools can be considered equal in terms of data profiling, data enhancement and data monitoring. DataFlux is slightly better in standardization and data validation. Data deduplication is not included in the tested version of DataCleaner, although DataCleaner's vendor claimed it should be. One of the biggest obstacles to companies buying data quality tools could be their price. At this moment, it is possible to consider DataCleaner an inexpensive solution for companies looking for a data profiling tool. If Human Inference added data deduplication to DataCleaner, it could also be considered an inexpensive solution covering the whole data quality process.
