Original title:
Avoiding overfitting of models: an application to research data on the Internet videos
Authors:
Jiroušek, Radim ; Krejčová, I.
Document type:
Papers
Conference/Event:
MME 2017. International Conference Mathematical Methods in Economics /35./, Hradec Králové (CZ), 20170913
Year:
2017
Language:
eng
Abstract:
The problem of overfitting is studied from the perspective of information theory. In this context, data-based model learning can be viewed as a transformation process: one that transforms the information contained in the data into the information represented by a model. Overfitting often occurs when one considers an unnecessarily complex model, which usually means that the model contains more information than the original data. One of the basic laws of information theory states that no transformation can increase the amount of information. This yields a basic restriction on models constructed from data: a model is acceptable only if it does not contain more information than the input data file.
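The acceptance rule stated in the abstract can be illustrated with a minimal Python sketch. This record does not say how the authors measure the information in the data or in the model, so everything below is an assumption made for illustration: the data are measured by the length of an ideal lossless code built from their empirical frequencies, and the model by a fixed number of bits per free parameter. The names `data_bits`, `model_bits`, `is_acceptable`, and `bits_per_parameter` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def data_bits(records):
    """Assumed measure of the information in the data: the length in bits
    of an ideal code based on empirical frequencies (n times the
    empirical entropy of the records)."""
    n = len(records)
    counts = Counter(records)
    return -sum(c * math.log2(c / n) for c in counts.values())


def model_bits(num_parameters, bits_per_parameter=32):
    """Assumed measure of the information in a model: each free parameter
    is stored with a fixed number of bits (hypothetical choice)."""
    return num_parameters * bits_per_parameter


def is_acceptable(records, num_parameters, bits_per_parameter=32):
    """A model is accepted only if it needs no more bits than the data."""
    return model_bits(num_parameters, bits_per_parameter) <= data_bits(records)


# Two binary variables; a full joint probability table has 3 free parameters.
small_sample = [(0, 0), (0, 1), (1, 0), (1, 1)] * 2      # 8 records
large_sample = [(0, 0), (0, 1), (1, 0), (1, 1)] * 250    # 1000 records

print(data_bits(small_sample))         # 16.0 bits
print(is_acceptable(small_sample, 3))  # False: 96 bits of model > 16 bits of data
print(is_acceptable(large_sample, 3))  # True: 96 bits of model <= 2000 bits of data
```

Under these assumptions the same three-parameter model is rejected for the 8-record file and accepted for the 1000-record file, which is the qualitative behaviour the abstract describes: the larger the data file, the more complex a model it can justify.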
Keywords:
data-based learning; information theory; lossless encoding; MDL principle; probabilistic models
Project no.:
GA15-00215S (CEP)
Funding provider:
GA ČR
Host item entry:
Proceedings of the 35th International Conference Mathematical Methods in Economics (MME 2017), ISBN 978-80-7435-678-0