Bellman equation

The Bellman equation was first applied to engineering control theory and to other topics in applied mathematics, and subsequently became an important tool in economic theory,[4] though the basic concepts of dynamic programming are prefigured in John von Neumann and Oskar Morgenstern's Theory of Games and Economic Behavior and Abraham Wald's sequential analysis.[6][7]

In discrete time, any multi-stage optimization problem can be solved by analyzing the appropriate Bellman equation, which can often be obtained by introducing new state variables (state augmentation). Alternatively, it has been shown that if the cost function of the multi-stage optimization problem satisfies a "backward separable" structure, then the appropriate Bellman equation can be found without state augmentation.[citation needed]

Dynamic programming breaks a multi-period planning problem into simpler steps at different points in time. The information about the current situation that is needed to make a correct decision is called the "state".[10][11] For example, to decide how much to consume and spend at each point in time, people would need to know (among other things) their initial wealth.[citation needed] The dynamic programming approach describes the optimal plan by finding a rule that tells what the controls should be, given any possible value of the state.[citation needed]

Bellman showed that a dynamic optimization problem in discrete time can be stated in a recursive, step-by-step form known as backward induction by writing down the relationship between the value function in one period and the value function in the next period. Let x_t denote the state at time t and a_t the control (action) chosen at time t; the decision-maker receives the period payoff F(x_t, a_t), the state evolves according to x_{t+1} = T(x_t, a_t) with a_t restricted to a feasible set \Gamma(x_t), and future payoffs are discounted by a factor 0 < \beta < 1. Under these assumptions, an infinite-horizon decision problem takes the following form:

    V(x_0) = \max_{\{a_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t F(x_t, a_t),

subject to the constraints

    a_t \in \Gamma(x_t), \quad x_{t+1} = T(x_t, a_t), \quad \forall t = 0, 1, 2, \dots

Notice that we have defined the notation V(x_0) to denote the optimal value that can be obtained by maximizing this objective function subject to the assumed constraints.

The dynamic programming method breaks this decision problem into smaller subproblems. Bellman's principle of optimality describes how: an optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision. In the context of dynamic game theory, this principle is analogous to the concept of subgame perfect equilibrium, although what constitutes an optimal policy in this case is conditioned on the decision-maker's opponents choosing similarly optimal policies from their points of view.

Separating the first-period decision from all later ones, the problem can be written as

    V(x_0) = \max_{a_0 \in \Gamma(x_0)} \left\{ F(x_0, a_0) + \beta \left[ \max_{\{a_t\}_{t=1}^{\infty}} \sum_{t=1}^{\infty} \beta^{t-1} F(x_t, a_t) \right] \right\},

subject to the same constraints for t \ge 1. But we can simplify by noticing that what is inside the square brackets on the right is the value of the time 1 decision problem, starting from state x_1 = T(x_0, a_0). Therefore, the problem can be rewritten as a recursive definition of the value function:

    V(x) = \max_{a \in \Gamma(x)} \left\{ F(x, a) + \beta V(T(x, a)) \right\}.

This is the Bellman equation.

In the deterministic setting, other techniques besides dynamic programming can be used to tackle the above optimal control problem. However, the Bellman equation is often the most convenient method of solving stochastic optimal control problems. For a specific example from economics, consider an infinitely-lived consumer with initial wealth endowment a_0 who chooses consumption c_t each period, receiving instantaneous utility u(c_t); whatever is not consumed carries over to the next period with interest rate r, so that wealth evolves as a_{t+1} = (1 + r)(a_t - c_t) with 0 \le c_t \le a_t. In the stochastic version of this problem, the interest rate varies from period to period. Because r is governed by a Markov process, dynamic programming simplifies the problem significantly. Then the Bellman equation is simply:

    V(a, r) = \max_{0 \le c \le a} \left\{ u(c) + \beta \int V\big((1+r)(a - c), r'\big) \, Q(r, dr') \right\},

where Q(r, \cdot) is the transition probability measure governing next period's interest rate r'. Under some reasonable assumptions, the resulting optimal policy function g(a, r) is measurable.

For a general stochastic sequential optimization problem with Markovian shocks, where the agent is faced with their decision ex-post, the Bellman equation takes a very similar form:

    V(x, z) = \max_{c \in \Gamma(x, z)} \left\{ F(x, c, z) + \beta \int V\big(T(x, c), z'\big) \, d\mu_z(z') \right\},

where z is the current realization of the shock and \mu_z is the conditional distribution of next period's shock z'.
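To make the recursion concrete, the following is a minimal value-iteration sketch for the deterministic savings problem above, assuming log utility u(c) = ln c; the parameter values, grid, and tolerance are choices made for this sketch rather than part of the original problem.

    import numpy as np

    # Illustrative parameters (assumptions made for this sketch)
    beta, r = 0.95, 0.04                 # discount factor, interest rate
    grid = np.linspace(1e-3, 10.0, 400)  # grid of wealth levels a
    u = np.log                           # instantaneous utility u(c) = ln c

    V = np.zeros_like(grid)              # initial guess V_0(a) = 0
    for _ in range(2000):                # apply the Bellman operator repeatedly
        V_new = np.empty_like(V)
        for i, a in enumerate(grid):
            c = np.linspace(1e-6, a, 100)        # feasible consumption 0 < c <= a
            a_next = (1 + r) * (a - c)           # next-period wealth
            # evaluate V off-grid by linear interpolation (clamped at grid edges)
            V_new[i] = np.max(u(c) + beta * np.interp(a_next, grid, V))
        if np.max(np.abs(V_new - V)) < 1e-8:     # sup-norm stopping rule
            V = V_new
            break
        V = V_new

Because the Bellman operator is a contraction with modulus \beta, repeated application converges to the unique fixed point V. With log utility this problem has the known closed-form policy c = (1 - \beta) a, which provides a check on the numerical output.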
The first known application of a Bellman equation in economics is due to Martin Beckmann and Richard Muth. A celebrated economic application of a Bellman equation is Robert C. Merton's seminal 1973 article on the intertemporal capital asset pricing model. Nancy Stokey, Robert E. Lucas, and Edward Prescott describe stochastic and nonstochastic dynamic programming in considerable detail, developing theorems for the existence of solutions to problems meeting certain conditions. They also describe many examples of modeling theoretical problems in economics using recursive methods.[21] Their book led to dynamic programming being employed to solve a wide range of theoretical problems in economics, including optimal economic growth, resource extraction, principal–agent problems, public finance, business investment, asset pricing, factor supply, and industrial organization. Lars Ljungqvist and Thomas Sargent apply dynamic programming to study a variety of theoretical questions in monetary policy, fiscal policy, taxation, economic growth, search theory, and labor economics.[22] Avinash Dixit and Robert Pindyck showed the value of the method for thinking about capital budgeting.[24]

Using dynamic programming to solve concrete problems is complicated by informational difficulties, such as choosing the unobservable discount rate. There are also computational issues, the main one being the curse of dimensionality, which arises from the vast number of possible actions and potential state variables that must be considered before an optimal strategy can be selected.[26]

In Markov decision processes, a Bellman equation is a recursion for expected rewards. For the optimal policy it takes the form of the Bellman optimality equation

    V^*(s) = \max_a \left\{ R(s, a) + \gamma \sum_{s'} P(s' \mid s, a) \, V^*(s') \right\},

where R(s, a) is the expected immediate reward, P(s' \mid s, a) is the probability of moving to state s' after taking action a in state s, and \gamma is the discount factor. This equation describes the expected reward for taking the action giving the highest expected return.
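As a minimal sketch of this recursion, value iteration applies the right-hand side of the Bellman optimality equation repeatedly until the values converge. The two-state, two-action MDP below, including its transition probabilities and rewards, is entirely made up for illustration.

    import numpy as np

    # Made-up 2-state, 2-action MDP: P[a, s, s2] = P(s2 | s, a), R[s, a] = reward
    P = np.array([[[0.9, 0.1],
                   [0.2, 0.8]],    # action 0
                  [[0.5, 0.5],
                   [0.3, 0.7]]])   # action 1
    R = np.array([[1.0, 0.0],
                  [2.0, 0.5]])
    gamma = 0.9

    V = np.zeros(2)
    for _ in range(1000):
        # Q[s, a] = R[s, a] + gamma * sum_{s2} P(s2 | s, a) * V[s2]
        Q = R + gamma * np.einsum('asn,n->sa', P, V)
        V_new = Q.max(axis=1)            # Bellman optimality backup
        if np.max(np.abs(V_new - V)) < 1e-10:
            V = V_new
            break
        V = V_new

    policy = Q.argmax(axis=1)            # greedy policy attaining the max

The fixed-policy analogue replaces the max with the action prescribed by a given policy \pi, yielding the policy-evaluation form of the equation.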
[Figure: Bellman flow chart]