National Repository of Grey Literature: 47 records found (search took 0.01 seconds).
Second Order Optimality in Transient and Discounted Markov Decision Chains
Sladký, Karel
The article is devoted to second order optimality in Markov decision processes. Attention is primarily focused on the reward variance for discounted models and undiscounted transient models (i.e. where the spectral radius of the transition probability matrix is less than unity). Considering the second order optimality criteria means that in the class of policies maximizing (or minimizing) the total expected discounted reward (or the undiscounted reward for the transient model) we choose the policy minimizing the total variance. Explicit formulae for calculating the variances for transient and discounted models are reported, along with sketches of algorithmic procedures for finding second order optimal policies.
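For a fixed policy, the quantities in this abstract can be illustrated concretely. The sketch below (an illustration under our own naming, not the paper's algorithm) computes the expected total discounted reward and its variance for a Markov reward chain, assuming deterministic one-stage rewards that depend only on the current state; it solves the standard linear equations for the first and second moments.

```python
import numpy as np

def discounted_mean_and_variance(P, r, beta):
    """Mean and variance of the total discounted reward for a fixed
    policy: P is the transition matrix, r the state-dependent
    deterministic one-stage rewards (an assumption of this sketch),
    beta the discount factor."""
    n = len(r)
    I = np.eye(n)
    v = np.linalg.solve(I - beta * P, r)             # first moments: v = r + beta*P*v
    # second moments: s = r^2 + 2*beta*r*(P v) + beta^2 * P s
    s = np.linalg.solve(I - beta**2 * P, r**2 + 2 * beta * r * (P @ v))
    return v, s - v**2                               # variance = s - v^2
```

For a single absorbing-free deterministic chain the variance comes out zero, which is a quick sanity check of the second-moment equation.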
Cumulative Optimality in Risk-Sensitive and Risk-Neutral Markov Reward Chains
Sladký, Karel
This contribution is devoted to risk-sensitive and risk-neutral optimality in Markov decision chains. Since the traditional optimality criteria (e.g. discounted or average rewards) cannot reflect the variability-risk features of the problem, and the mean-variance selection rules that stem from the classical work of Markowitz present some technical difficulties, we are interested in the expectation of the stream of rewards generated by the Markov chain evaluated by an exponential utility function with a given risk sensitivity coefficient. Recall that for a risk sensitivity coefficient equal to zero we arrive at the traditional optimality criteria. In this note we present necessary and sufficient risk-sensitive and risk-neutral optimality conditions, in detail for unichain models, and indicate their generalization to multichain Markov reward chains.
Risk-Sensitive and Average Optimality in Markov Decision Processes
Sladký, Karel
This contribution is devoted to the risk-sensitive optimality criteria in finite state Markov decision processes. At first, we rederive necessary and sufficient conditions for average optimality of (classical) risk-neutral unichain models. This approach is then extended to the risk-sensitive case, i.e., when the expectation of the stream of one-stage costs (or rewards) generated by a Markov chain is evaluated by an exponential utility function. We restrict ourselves to irreducible or unichain Markov models, where risk-sensitive average optimality is independent of the starting state. As we show, this problem is closely related to the solution of (nonlinear) Poissonian equations and their connections with nonnegative matrices.
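As a concrete illustration of the exponential-utility criterion in the unichain case: when rewards depend on the current state only, the risk-sensitive average reward equals (1/γ) log ρ(Q), where Q is the transition matrix with rows scaled by exp(γr) and ρ is its spectral radius (the classical Howard–Matheson formula), and letting γ tend to zero recovers the risk-neutral average reward, as the abstracts above note. A minimal sketch under those assumptions (function names are ours):

```python
import numpy as np

def risk_sensitive_average(P, r, gamma):
    """Risk-sensitive average reward of a unichain Markov reward chain:
    (1/gamma) * log(rho(Q)) with Q_ij = exp(gamma * r_i) * P_ij.
    Rewards are assumed to depend on the current state only."""
    Q = np.exp(gamma * r)[:, None] * P
    rho = max(abs(np.linalg.eigvals(Q)))      # spectral radius
    return np.log(rho) / gamma

def risk_neutral_average(P, r):
    """Classical (risk-neutral) average reward: stationary
    distribution pi of P dotted with r."""
    n = len(r)
    # solve pi P = pi together with sum(pi) = 1
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(pi @ r)
```

For a small risk sensitivity coefficient the two criteria nearly coincide, which matches the limiting statement in the abstract.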
A New Approach to Estimating Bellman Functions
Zeman, Jan
The paper concerns approximate dynamic programming. It deals with a class of tasks where the optimal strategy on a shorter horizon is close to the globally optimal strategy. This property leads to a new, specific design of the Bellman function estimation. The paper introduces the proposed approach and provides an illustrative example performed on futures trading data.
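The short-horizon idea can be sketched generically: run finite-horizon backward induction and use the resulting value vector as an estimate of the Bellman function. This is only a generic illustration under our own notation; the paper's specific estimation design differs.

```python
import numpy as np

def short_horizon_values(P, R, H):
    """Backward induction over a horizon of H steps.  P[a] is the
    transition matrix and R[a] the reward vector for action a.  The
    resulting V approximates the Bellman function for tasks where
    short-horizon optima are close to the global optimum (the class
    of tasks considered in the paper)."""
    n = P[0].shape[0]
    V = np.zeros(n)                       # terminal values
    for _ in range(H):
        # Bellman backup: maximize over actions at every state
        V = np.max([R[a] + P[a] @ V for a in range(len(P))], axis=0)
    return V
```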
Improving a Dynamic Decision-Making Model Using the "Iteration Spread in Time" Method
Divišová, L. ; Zeman, Jan
In the present work we study the problem of finding the best decision based on our previous experience with the system. To solve this task, we use dynamic programming and its approximations. In the work we summarize the theory needed for the use of dynamic programming and we deal with its application to futures trading, trying to find the best strategy, i.e. a sequence of decisions maximizing our gain or minimizing the loss function. We introduce the notion of the "Bellman function", explain why an approximation of this function is needed, demonstrate one of the already tested approximation methods together with its results, and propose a method that would lead to the best approximation in suitable time and with the available computational aids.
On Predictor Structures Enabling Distributed Dynamic Bayesian Decision-Making
Šmídl, Václav
Decentralized adaptive control is based on the use of many local controllers in parallel, each of them estimating its own local model and pursuing its own local aims. If each controller designs its strategy using only its own model, the resulting control may be poor, since the consequences of the neighbors' actions are not taken into account. We seek a way to improve the decision strategy design of a single local controller without a significant increase in the complexity of the local model or of the design procedure. In this paper we study variants of distributed dynamic programming that can be evaluated locally. Specifically, we investigate variants of the fully probabilistic control strategy design. Distributed and centralized control strategies are compared.
Algorithmic procedures for moment optimality in Markovian decision models
Sitař, Milan
We consider a discrete time Markov reward process with finite state and action spaces and random returns. In contrast with the classical models we assume that instead of maximizing the long run average expected return we maximize the first moment and simultaneously minimize the second moment of the reward. An algorithmic procedure is suggested for finding Pareto optimal policies for the considered moment optimality criteria.
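The Pareto-filtering step of such a procedure can be sketched as follows, assuming the (first moment, second moment) pair of each stationary policy has already been computed (how those moments are computed for a Markov reward process is the subject of the paper; all names here are ours):

```python
def pareto_front(policies):
    """Keep the Pareto-optimal policies for the moment criteria:
    a policy is dominated if another has first moment >= and second
    moment <= with at least one strict inequality.  `policies` maps a
    policy label to its (first_moment, second_moment) pair."""
    front = {}
    for name, (m1, m2) in policies.items():
        dominated = any(
            (n1 >= m1 and n2 <= m2) and (n1 > m1 or n2 < m2)
            for other, (n1, n2) in policies.items() if other != name
        )
        if not dominated:
            front[name] = (m1, m2)
    return front
```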
Solving of integer problems by dynamic programming
Polonyankina, Tatiana ; Kalčevová, Jana (advisor) ; Lagová, Milada (referee)
Optimization problems with integer requirements on the variables occur in real life very often. Unfortunately, finding optimal solutions to such problems is often numerically very difficult. The work describes several possible algorithms for solving linear integer problems. The reader is also familiarized with the method of dynamic programming and the principle of optimality. This is demonstrated on a practical example of a knapsack model, where the calculation is done using tables. The goal of this work is to apply dynamic programming to a typical linear integer problem, namely the problem of material separation, and thus show the algorithm for calculating integer problems. Finding the optimal integer solution is accomplished in two ways: by the classical method of spreadsheet tables and by a simplified method using Lagrange multipliers. The conclusion summarizes the advantages and disadvantages of each solution technique.
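The table-based knapsack calculation mentioned in this abstract can be sketched with the standard 0/1 knapsack dynamic program; the thesis's spreadsheet tables correspond to the array updated below (this is a generic sketch, not the thesis's own worksheet).

```python
def knapsack(values, weights, capacity):
    """0/1 knapsack by dynamic programming.  table[c] holds the best
    total value achievable with remaining capacity c after processing
    the items seen so far."""
    table = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # iterate capacities in reverse so each item is used at most once
        for c in range(capacity, w - 1, -1):
            table[c] = max(table[c], table[c - w] + v)
    return table[capacity]
```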
The methods of dynamic programming in logistics and planning
Molnárová, Marika ; Pelikán, Jan (advisor) ; Fábry, Jan (referee)
The thesis describes the principles of dynamic programming and its application to concrete problems: the travelling salesman problem, the knapsack problem, the shortest path problem, and the set covering problem.
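As an illustration of the principle of optimality applied to one of the listed problems, here is the standard Held–Karp dynamic program for the travelling salesman problem (a generic sketch, not the thesis's own calculation):

```python
from itertools import combinations

def held_karp(dist):
    """Exact travelling-salesman tour length by the Held-Karp dynamic
    program over subsets.  dist is a full n x n distance matrix; the
    tour starts and ends at city 0."""
    n = len(dist)
    # best[(S, j)] = length of the shortest path from 0 visiting
    # exactly the cities in subset S and ending at city j in S
    best = {(frozenset([j]), j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for S in combinations(range(1, n), size):
            fs = frozenset(S)
            for j in S:
                best[(fs, j)] = min(
                    best[(fs - {j}, k)] + dist[k][j] for k in S if k != j
                )
    full = frozenset(range(1, n))
    # close the tour by returning to city 0
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))
```

The subset recursion is exactly the principle of optimality: the best path through a set of cities ending at j must contain a best path through the remaining cities ending at some predecessor k.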
