Original title: Balancing Exploitation and Exploration via Fully Probabilistic Design of Decision Policies
Authors: Kárný, Miroslav ; Hůla, František
Document type: Research reports
Year: 2018
Language: eng
Series: Research Report, volume: 2376
Abstract: Adaptive decision making learns an environment model serving a design of a decision policy. The policy-generated actions influence both the acquired reward and the future knowledge. The optimal policy properly balances exploitation with exploration. The inherent dimensionality curse of decision making under incomplete knowledge prevents the realisation of the optimal design.
Keywords: Adaptive systems; Bayesian estimation; Decision policy; Exploitation; Exploration; Fully probabilistic design; Kullback-Leibler divergence; Markov decision process
Project no.: GA16-09848S (CEP), GA18-15970S (CEP)
Funding provider: GA ČR, GA ČR

Institution: Institute of Information Theory and Automation AS ČR
Document availability information: Fulltext is available at external website.
External URL: http://library.utia.cas.cz/separaty/2018/AS/karny-0495875.pdf
Original record: http://hdl.handle.net/11104/0288947

Permalink: http://www.nusl.cz/ntk/nusl-387695

Record created 2018-11-15, last modified 2020-03-26
