Original title:
Second Order Optimality in Markov and Semi-Markov Decision Processes
Authors:
Sladký, Karel
Document type:
Papers
Conference/Event:
MME 2019: International Conference on Mathematical Methods in Economics /37./, České Budějovice (CZ), 2019-09-11
Year:
2019
Language:
eng
Abstract:
Semi-Markov decision processes can be considered an extension of discrete- and continuous-time Markov reward models. Unfortunately, traditional optimality criteria such as the long-run average reward per unit time may be quite insufficient to characterize the problem from the point of view of a decision maker. To this end it may be preferable, if not necessary, to select more sophisticated criteria that also reflect the variability-risk features of the problem. Perhaps the best known approaches stem from the classical work of Markowitz on mean-variance selection rules, i.e. we optimize a weighted sum of the average or total reward and its variance. Such an approach has already been studied for very special classes of semi-Markov decision processes, in particular for Markov decision processes in the discrete- and continuous-time settings. In this note these approaches are summarized and possible extensions to the wider class of semi-Markov decision processes are discussed. Attention is mostly restricted to uncontrolled models in which the chain is aperiodic and contains a single class of recurrent states. Considering finite time horizons, explicit formulas for the first and second moments of the total reward, as well as for the corresponding variance, are produced.
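As an illustration of the finite-horizon quantities mentioned in the abstract, the following is a minimal sketch for the discrete-time, uncontrolled case only. It uses the standard backward recursions for the first and second moments of the total reward of a Markov reward chain; it is not taken from the paper, and the names `reward_moments`, `P`, `R` and `n` are hypothetical. It assumes `P[i][j]` is the transition probability and `R[i][j]` the reward earned on the transition from state i to state j.

```python
import numpy as np

def reward_moments(P, R, n):
    """Mean, second moment and variance of the total reward over n steps,
    for each starting state of a discrete-time Markov reward chain.

    Assumed (hypothetical) inputs: P is a row-stochastic matrix,
    R[i, j] is the reward for the transition i -> j.
    """
    P = np.asarray(P, dtype=float)
    R = np.asarray(R, dtype=float)
    PR = P * R                       # elementwise p_ij * r_ij
    r1 = PR.sum(axis=1)              # expected one-step reward
    r2 = (P * R**2).sum(axis=1)      # expected squared one-step reward
    v = np.zeros(P.shape[0])         # first moment, horizon 0
    s = np.zeros(P.shape[0])         # second moment, horizon 0
    for _ in range(n):
        # s must be updated first because it needs the previous v
        s = r2 + 2.0 * (PR @ v) + P @ s
        v = r1 + P @ v
    return v, s, s - v**2

# Toy chain: every step earns reward 1 with probability 0.5, else 0.
P = [[0.5, 0.5], [0.5, 0.5]]
R = [[1.0, 0.0], [1.0, 0.0]]
mean, second, var = reward_moments(P, R, 4)
```

A mean-variance criterion in the spirit of Markowitz could then be formed as `mean - alpha * var` for a chosen weight `alpha`, which is again only an assumed form of the weighted sum.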
Keywords:
average reward and variance over time; discrete and continuous-time Markov reward chains; risk-sensitive optimality; semi-Markov processes with rewards
Project no.:
GA18-02739S (CEP)
Funding provider:
GA ČR
Host item entry:
Conference Proceedings. 37th International Conference on Mathematical Methods in Economics 2019, ISBN 978-80-7394-760-6