Original title: Avoiding overfitting of models: an application to research data on the Internet videos
Authors: Jiroušek, Radim ; Krejčová, I.
Document type: Papers
Conference/Event: MME 2017. International Conference Mathematical Methods in Economics /35./, Hradec Králové (CZ), 2017-09-13
Year: 2017
Language: eng
Abstract: The problem of overfitting is studied from the perspective of information theory. In this context, data-based model learning can be viewed as a process that transforms the information contained in data into the information represented by a model. Overfitting often occurs when one considers an unnecessarily complex model, which usually means that the model contains more information than the original data. Since one of the basic laws of information theory states that no transformation can increase the amount of information, we obtain a basic restriction on models constructed from data: a model is acceptable if it does not contain more information than the input data file.
Keywords: data-based learning; information theory; lossless encoding; MDL principle; probabilistic models
Project no.: GA15-00215S (CEP)
Funding provider: GA ČR
Host item entry: Proceedings of the 35th International Conference Mathematical Methods in Economics (MME 2017), ISBN 978-80-7435-678-0
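
Illustrative note: the acceptance criterion stated in the abstract (a model should not contain more information than the input data file) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the information in the data file is approximated by the size of a zlib lossless encoding, the information in a discrete probabilistic model is approximated as the number of free parameters times a fixed precision, and all function names and the 16-bit precision are hypothetical choices made only for illustration.

```python
import math
import random
import zlib


def data_bits(rows):
    """Approximate the information in a data file by the size, in bits,
    of a lossless (zlib) encoding of its text form. Assumption: this is
    only a rough upper bound, not the paper's measure."""
    raw = "\n".join(",".join(map(str, r)) for r in rows).encode("utf-8")
    return 8 * len(zlib.compress(raw, 9))


def free_params_full(cards):
    """Free parameters of an unrestricted joint distribution over
    discrete variables with the given cardinalities."""
    return math.prod(cards) - 1


def free_params_independent(cards):
    """Free parameters of a model treating all variables as independent."""
    return sum(c - 1 for c in cards)


def model_bits(num_free_params, bits_per_param=16):
    """Approximate model information: parameters times an assumed precision."""
    return num_free_params * bits_per_param


def is_acceptable(num_free_params, rows, bits_per_param=16):
    """Accept a model only if it encodes no more bits than the data."""
    return model_bits(num_free_params, bits_per_param) <= data_bits(rows)


# Hypothetical data: 50 records of 10 binary variables.
rng = random.Random(0)
cards = [2] * 10
rows = [[rng.randint(0, 1) for _ in cards] for _ in range(50)]

print("data bits:", data_bits(rows))
print("independence model acceptable:",
      is_acceptable(free_params_independent(cards), rows))
print("full joint model acceptable:",
      is_acceptable(free_params_full(cards), rows))
```

Under these assumptions, the unrestricted joint distribution over ten binary variables needs 1023 parameters and is rejected for so small a data file, while the ten-parameter independence model passes.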

Institution: Institute of Information Theory and Automation AS ČR
Document availability information: Fulltext is available at external website.
External URL: http://library.utia.cas.cz/separaty/2017/MTR/jirousek-0481488.pdf
Original record: http://hdl.handle.net/11104/0277045

Permalink: http://www.nusl.cz/ntk/nusl-369603


The record appears in these collections:
Research > Institutes ASCR > Institute of Information Theory and Automation
Conference materials > Papers
Record created 2017-12-07, last modified 2022-09-29

