National Repository of Grey Literature
Prediction of data-profiling duration
Kaštovský, Ondřej ; Kofroň, Jan (advisor) ; Kliber, Filip (referee)
Today, data quality plays a vital role in strategic planning and corporate decision-making processes. The ability to predict the duration of tasks related to data processing and analysis is crucial for efficient use of resources and optimization of work processes. The goal of this work is to extend the functionality of Ataccama ONE, a data management platform of Ataccama, with a new microservice that allows predicting the duration of data profiling jobs. Our solution involves identifying the key data characteristics that affect the duration of these jobs and using these insights to prototype a machine learning model to predict job durations. An important part of the solution is also to detect and process newly executed jobs in the platform in real-time and prepare the microservices for future integration into the platform. Emphasis is then placed on the quality of the implementation and the extensibility of the solution to predict other types of jobs.
