Title:
Avoiding overfitting of models: an application to research data on the Internet videos
Authors:
Jiroušek, Radim ; Krejčová, I.
Document type:
Conference papers
Conference/Event:
MME 2017. International Conference Mathematical Methods in Economics /35./, Hradec Králové (CZ), 2017-09-13
Year:
2017
Language:
eng
Abstract:
The problem of overfitting is studied from the perspective of information theory. In this context, data-based model learning can be viewed as a transformation process that converts the information contained in data into the information represented by a model. Overfitting often occurs when one considers an unnecessarily complex model, which usually means that the model contains more information than the original data. Thus, using one of the basic laws of information theory, which says that no transformation can increase the amount of information, we get the basic restriction imposed on models constructed from data: a model is acceptable if it does not contain more information than the input data file.
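The acceptance criterion stated in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the naive fixed-length encoding of the data file, the fixed number of bits per model parameter, and all function names below are assumptions made only for illustration.

```python
import math


def data_bits(n_records, cardinalities):
    """Bits needed by a naive fixed-length lossless encoding of the data file.

    Assumption: each categorical variable with c values is stored in
    ceil(log2(c)) bits per record; a real encoder could be shorter.
    """
    bits_per_record = sum(math.ceil(math.log2(c)) for c in cardinalities)
    return n_records * bits_per_record


def full_table_model_bits(cardinalities, bits_per_param):
    """Bits to store a full joint probability table (free parameters only)."""
    free_params = math.prod(cardinalities) - 1
    return free_params * bits_per_param


def independence_model_bits(cardinalities, bits_per_param):
    """Bits to store a model that treats all variables as independent."""
    free_params = sum(c - 1 for c in cardinalities)
    return free_params * bits_per_param


def is_acceptable(model_bits, available_data_bits):
    """The restriction from the abstract: the model must not hold more
    information than the input data file."""
    return model_bits <= available_data_bits


# Hypothetical example: 100 records over 10 binary variables,
# with 8 bits assumed per stored probability.
cards = [2] * 10
d_bits = data_bits(100, cards)
full_ok = is_acceptable(full_table_model_bits(cards, 8), d_bits)
indep_ok = is_acceptable(independence_model_bits(cards, 8), d_bits)
```

Under these assumptions the full joint table needs far more bits than the data file and is rejected, while the independence model is accepted. A different encoding of data or parameters would shift the threshold but not the form of the test.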
Keywords:
data-based learning; information theory; lossless encoding; MDL principle; probabilistic models
Project number:
GA15-00215S (CEP)
Project provider:
GA ČR
Source document:
Proceedings of the 35th International Conference Mathematical Methods in Economics (MME 2017), ISBN 978-80-7435-678-0