Abstract: This work addresses discounting, the decreased utility of future rewards, within the fully probabilistic design (FPD) of decision strategies. FPD obtains the optimal strategy for a decision task using only probability distributions, which is its main asset. Decision tasks are standardly solved with Markov decision processes (MDPs), which FPD covers as a special case. Methods for solving discounted MDPs have already been introduced. However, FPD may be advantageous for tasks in which the system model is unknown: owing to its probabilistic nature, FPD can estimate this model more precisely. Building on previous work that introduced discounting and system model estimation to FPD, the current work examines the effect of discounting on decision processes and its possible advantages when the system model is unknown.
