National Repository of Grey Literature: 15 records found
Selected impacts of missing data problem in economics
Uenal, Hatice
Data sources and data quality are indispensable in economic, medical, pharmaceutical and other studies and provide the basis for reliable results across numerous research questions. Depending on the purpose of use, high data quality is a prerequisite; however, as registry quality increases, costs increase accordingly. Considering these time- and cost-consuming factors, this work attempts to estimate the cost advantages of applying statistical tools to existing registry data. This includes methodological considerations and suggestions regarding the evaluation of data quality, including factors such as bias and reliability after dealing properly (or not) with missing data (MD), and the possible consequences of ignoring the incompleteness of data. Results of the quality analysis for the gastric cancer patient data example showed that millions of euros in study costs can be saved by reducing the time horizon: on average, €523,126.70 for every year by which the study duration is shortened. By additionally replacing the more than 25% of MD in some variables, data quality was improved immensely, but quality difficulties remained, which, beside MD in variables, could indicate completely missing patient entries in the registry. Capture-recapture (CARE) methods were therefore discussed to demonstrate how the total completeness of a registry can be estimated. Since it was not possible to illustrate the CARE method with the gastric cancer patient example due to the given data structure (no access to the required variables), other data sets had to be chosen: the publicly accessible amyotrophic lateral sclerosis (ALS) data and data on towed vehicles in the City of Chicago.
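The capture-recapture idea can be sketched with Chapman's estimator, which estimates the total number of cases in a population from two overlapping, independently collected sources. The counts below are hypothetical, not taken from the thesis:

```python
def chapman_estimate(n1, n2, m):
    """Chapman's nearly unbiased capture-recapture estimator.

    n1: cases found in the first source (e.g. the registry)
    n2: cases found in a second, independent source
    m:  cases found in both sources
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Hypothetical counts: 50 registry cases, 40 from a second source, 16 in both.
total = chapman_estimate(50, 40, 16)      # estimated true number of cases
completeness = 50 / total                 # share of cases the registry captured
print(round(total, 1), round(completeness, 3))  # → 122.0 0.41
```

The smaller the overlap between the two sources relative to their sizes, the larger the estimated number of cases missed by both, and hence the lower the registry's estimated completeness.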
The consequence of ignoring MD was further analyzed using bankruptcy prediction data sets of agribusiness companies; the results confirmed the assumption that MD have a negative impact on data quality, in this case also regarding misclassified bankruptcy predictions. Using the decision tree method (known as one of the most suitable methods for predicting financial distress), the percentage of bankrupted companies correctly predicted as bankrupt (one year before bankruptcy) was 87.5% with MD imputation, but only 60% when MD were omitted completely. Overall, my findings clearly showed the importance of statistical methods for improving data quality, which in turn helps avoid drawing biased conclusions from incomplete data.
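The gap between imputing MD and discarding incomplete rows can be illustrated with a minimal mean-imputation sketch. The records and the 'liquidity' variable are invented for illustration; the thesis itself used decision trees on real agribusiness data:

```python
def mean_impute(rows, col):
    """Replace missing values (None) in one column with the observed column mean."""
    observed = [r[col] for r in rows if r[col] is not None]
    mean = sum(observed) / len(observed)
    return [dict(r, **{col: r[col] if r[col] is not None else mean}) for r in rows]

# Hypothetical financial-ratio records; 'liquidity' has missing entries.
rows = [
    {"liquidity": 1.5, "bankrupt": 0},
    {"liquidity": None, "bankrupt": 1},
    {"liquidity": 0.5, "bankrupt": 1},
    {"liquidity": None, "bankrupt": 0},
]
complete_cases = [r for r in rows if r["liquidity"] is not None]  # listwise deletion
imputed = mean_impute(rows, "liquidity")
print(len(complete_cases), len(imputed))  # → 2 4: deletion halves the sample
```

Listwise deletion discards every row with any missing value, shrinking (and potentially biasing) the training sample; imputation keeps all rows at the cost of some added noise in the filled-in column.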
Measurement of (anti)immigration Attitudes from the Methodological Perspective. Quality of Measurement with the Special Focus on Measurement Equivalence
Šarapatková, Anna ; Remr, Jiří (advisor) ; Soukup, Petr (referee)
The opportunities we have in today's world are evolving sharply, and the world is changing along with them. This development is especially visible in the global movement of (not only) populations, which has changed fundamentally in economic, political and social terms. Today's highly diversified form of migration, which has lost the transparency it used to have, is a very topical and debated subject almost all over the world. Because of the high importance of the topic of migration, it is often the subject of research and numerous surveys. One of the most examined areas within migration is people's attitudes towards immigration and immigrants, often together with an investigation of the causes leading to particular attitudes. Due to the international reach of the topic, these attitudes are often the subject of cross-national research, or of national research that nevertheless uses data from international surveys. There is a clear disparity across European states in these attitudes towards immigration and, above all, towards the immigrants themselves. Given this nature of cross-national surveys measuring attitudes towards immigrants, it is important to focus on measurement quality, which is becoming increasingly complex in the perspective of international research. It is...
Algorithms for anomaly detection in data from clinical trials and health registries
Bondarenko, Maxim ; Blaha, Milan (referee) ; Schwarz, Daniel (advisor)
This master's thesis deals with the problem of anomaly detection in data from clinical trials and medical registries. The purpose of this work is to review the literature on data quality in clinical trials and to design an original algorithm for detecting anomalous records, based on machine learning methods, in real clinical data from current or completed clinical trials and medical registries. The practical part describes the implemented detection algorithm, which consists of several parts: importing data from the information system; preprocessing and transforming the imported records, whose variables have different data types, into numerical vectors; applying well-known statistical methods for outlier detection; and evaluating the quality and accuracy of the algorithm. The algorithm's output is a vector of parameters containing anomalies, which is intended to make the data manager's work easier. The algorithm is designed to extend the set of information system (CLADE-IS) functions with automatic monitoring of data quality through the detection of anomalous records.
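The pipeline described in the abstract — mixed-type records encoded as numerical vectors, then screened with standard statistical outlier tests — can be sketched as follows. The field names, diagnosis codes and z-score rule are illustrative assumptions, not the thesis's actual CLADE-IS implementation:

```python
import statistics

DIAGNOSES = ["C16", "C18", "C34"]  # hypothetical code list for one-hot encoding

def to_vector(record):
    """Turn a mixed-type record into a numeric vector (one-hot for the category)."""
    return [float(record["age"])] + \
           [1.0 if record["diagnosis"] == d else 0.0 for d in DIAGNOSES]

def flag_outliers(vectors, dim, threshold=3.0):
    """Indices of records whose value in `dim` lies > threshold SDs from the mean."""
    col = [v[dim] for v in vectors]
    mu, sd = statistics.mean(col), statistics.stdev(col)
    return [i for i, x in enumerate(col) if sd and abs(x - mu) / sd > threshold]

records = [{"age": a, "diagnosis": "C16"} for a in (61, 64, 66, 59, 63)]
records.append({"age": 640, "diagnosis": "C16"})  # typo-like anomalous entry
vectors = [to_vector(r) for r in records]
# A single extreme value inflates the SD, so a low threshold is used here.
print(flag_outliers(vectors, dim=0, threshold=2.0))  # → [5]
```

A production system would of course replace the per-dimension z-score with more robust multivariate detectors, but the encode-then-screen structure is the same.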
Computer-aided data quality monitoring and assessment in clinical research
Šiška, Branislav ; Kolářová, Jana (referee) ; Schwarz, Daniel (advisor)
The diploma thesis deals with the monitoring and evaluation of data in clinical research. The usual way to identify incorrect data is to apply one-dimensional statistical methods to each variable in the registry separately. The proposed method instead enters the database directly and finds outliers using machine learning combined with multidimensional statistical methods that transform all the column variables of a clinical registry into a single value representing one patient record in the registry. The algorithm of the proposed method is written in Matlab.
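One common way to collapse several correlated variables into a single per-record score — sketched here in Python rather than the thesis's Matlab, with invented height/weight values — is the Mahalanobis distance, which measures how far each record lies from the joint distribution of all records:

```python
import statistics

def mahalanobis2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean.

    Collapses two correlated variables into one score per record; records
    that break the correlation pattern get large scores even when neither
    variable is extreme on its own.
    """
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    scores = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inv(S) * [dx dy]^T, with the 2x2 inverse written out
        scores.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return scores

# Hypothetical (height, weight) records; the last breaks the joint pattern.
points = [(170, 70), (175, 75), (180, 80), (165, 65), (172, 40)]
scores = mahalanobis2d(points)
print(scores.index(max(scores)))  # → 4
```

Note that record 4 has an unremarkable height and a plausible weight; only the combination is anomalous, which is exactly what a per-variable (one-dimensional) check would miss.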
Data quality and consistency in Scopus and Web of Science in their indexing of Czech Journals
Mika, Pavel ; Szarzec, Jakub ; Sivertsen, Gunnar
This study addresses the discussion of “quality versus coverage” that often arises if a choice is needed between Scopus and Web of Science (WoS). We present a new methodology to detect problems in the quality of indexing procedures. Our preliminary findings indicate the same degree and types of errors in Scopus and WoS. The more serious errors seem to occur in the indexing of cited references, not in the recording of traditional metadata.
Effectivity assessment of the implementation of the reporting system
Řežábek, Martin ; Lorenc, Miroslav (advisor) ; Vladyka, Štěpán (referee)
The thesis focuses on assessing the effectiveness of the reporting system of a selected company and on comparing the former and current reporting solutions. This is achieved through a review of the relevant literature and the creation of an individual assessment model, based on an analogy with information systems assessment methodology, on the experience of the selected company's employees, and on the experience of experts in corporate financial management with a focus on reporting systems. The model is defined by a set of criteria structured into groups, with weights assigned to the criteria along with a value for each of them. The last phase steps beyond the individual assessment and defines a generally applicable model usable for a wide range of reporting systems.
Data comparability in knowledge discovery in databases
Horáková, Linda ; Chudán, David (advisor) ; Svátek, Vojtěch (referee)
The master thesis analyzes data comparability and commensurability in datasets used for obtaining knowledge with data mining methods. Data comparability is one aspect of data quality that is crucial for correct and applicable results of data mining tasks. The theoretical part briefly describes the field of knowledge discovery, defines the specifics of mining aggregated data, and discusses the terms comparability and commensurability; its main part focuses on the process of knowledge discovery. These findings are applied in the practical part, whose main goal is to define a general methodology for discovering potential problems of data comparability in analyzed data. The methodology is based on the analysis of a real dataset containing daily product sales and is finally applied to data from the field of public budgets.
Adolescent's attitudes in public opinion research, data quality and reliability
Šlégrová, Petra ; Vinopal, Jiří (advisor) ; Podaná, Zuzana (referee)
The diploma thesis focuses on the youngest age category of respondents in public opinion polls. The main goal is to examine the character and quality of information about adolescents' attitudes and opinions obtained in public opinion polls conducted by the Public Opinion Research Centre; to achieve this, nonattitudes are examined. The thesis is divided into a theoretical and a practical part. The theoretical part builds on the sociology of public opinion and on developmental psychology, introducing the issue of attitude measurement along with theories and characteristics of adolescent development. The practical part reflects the information presented in the theoretical part and tests it on data collected by the Public Opinion Research Centre in continuous research within the project Our Society. The analysis focuses on nonresponse, 'don't know' answers and neutral attitudes; results are compared across all age groups.
Deduplication methods in databases
Vávra, Petr ; Kyjonka, Vladimír (advisor) ; Skopal, Tomáš (referee)
In the present work we study the record deduplication problem as an issue of data quality. We define duplicates as records that have different syntax but the same semantics, i.e. that represent the same real-world entity. The main goal of this work is to provide an overview of existing deduplication methods according to their requirements, results and usability. We focus on comparing two groups of record deduplication methods: those with and those without domain knowledge. The second part of this work is therefore dedicated to the implementation of our own method, which does not utilize any domain knowledge, and to comparing its results with those of a commercial tool that deeply utilizes domain knowledge.
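A domain-agnostic matcher of the kind the abstract describes can be sketched with plain string similarity — no field-specific rules, just an overall edit-based score over the whole record. The records and the 0.8 threshold are illustrative, not taken from the thesis:

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Normalized edit-based similarity between two whole records (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_duplicates(records, threshold=0.8):
    """Pairs of record indices whose overall similarity reaches the threshold.

    Domain-agnostic: each record is compared as one flat string, without any
    knowledge of which part is a name, a city, or a date.
    """
    return [(i, j) for i, j in combinations(range(len(records)), 2)
            if similarity(records[i], records[j]) >= threshold]

people = [
    "Jan Novak, Prague, 1980",
    "Jan Novák, Praha, 1980",   # same entity, different spelling and language
    "Petr Svoboda, Brno, 1975",
]
print(find_duplicates(people))
```

A domain-aware method would instead parse the records into fields and apply per-field rules (e.g. city-name synonym lists), which is exactly the knowledge this sketch deliberately does without.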
