Chapter Abstract:
The literature on inference and planning is vast. This chapter presents a type of decision process in which the state dynamics are Markov. Such a process, called a Markov decision process (MDP), is a reasonable model in many situations and has in fact found applications in a wide range of practical problems. An MDP is a decision process in which the next state S[n + 1] of the environment, or the system, depends only on the current state of the system, denoted S[n], and the action (or decision) a[n] taken at the current time. The chapter explains finite-horizon MDPs and infinite-horizon MDPs. Policy iteration and value iteration can be used to compute a sequence of value functions for a finite-horizon partially observable Markov decision process (POMDP) of increasing horizon length until the change is negligible; the result approximates the infinite-horizon optimal value function.
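As a rough illustration of the value-iteration idea summarized above (repeated finite-horizon backups until the change between successive value functions is negligible), here is a minimal Python sketch for a fully observable finite MDP. The transition array P, reward array R, discount factor gamma, and tolerance tol are all hypothetical placeholders, not taken from the chapter; the POMDP case the chapter discusses additionally requires running these backups over belief states rather than raw states.

```python
# Minimal value-iteration sketch for a small, fully observable finite MDP.
# All numbers below (sizes, P, R, gamma, tol) are illustrative assumptions.
import numpy as np

n_states, n_actions = 3, 2
rng = np.random.default_rng(0)

# P[s, a, s2] = probability of moving to state s2 from state s under action a.
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)        # normalize rows into distributions

# R[s, a] = expected immediate reward for taking action a in state s.
R = rng.random((n_states, n_actions))

gamma, tol = 0.9, 1e-8                   # discount factor, stopping threshold
V = np.zeros(n_states)                   # horizon-0 value function

while True:
    # Bellman backup: Q(s,a) = R(s,a) + gamma * sum_{s'} P(s'|s,a) V(s')
    Q = R + gamma * (P @ V)              # (P @ V) has shape (n_states, n_actions)
    V_new = Q.max(axis=1)                # maximize over actions
    if np.max(np.abs(V_new - V)) < tol:  # change negligible -> stop iterating
        break
    V = V_new

policy = Q.argmax(axis=1)                # greedy policy w.r.t. the converged V
print("approximate V*:", V_new, "greedy policy:", policy)
```

Because the backup operator is a gamma-contraction, the sequence of finite-horizon value functions computed by this loop converges to the infinite-horizon optimal value function, which is why stopping once the change falls below a small tolerance yields a good approximation.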
Page(s): 207 - 268
Copyright Year: 2015
Edition: 1