Bellman equation

A Bellman equation, also called an optimality equation or a dynamic programming equation, arises in dynamic programming. It expresses the value of a decision problem in one state in terms of the values of the states that can follow it. The method was developed by Richard Bellman.

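In reinforcement learning, a Bellman equation refers to a recursion for expected rewards. For example, the expected reward for being in a particular state s and following some fixed policy π has the Bellman equation below, where R(s) is the immediate reward in state s, γ is the discount factor, and P(s' | s,a) is the probability of moving from state s to state s' under action a: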

V^\pi(s) = R(s) + \gamma \sum_{s'} P(s' | s,\pi(s)) V^\pi(s')

while the equation for the optimal policy is referred to as the Bellman optimality equation:

V^*(s) = R(s) + \max_a \gamma \sum_{s'} P(s' | s,a) V^*(s')

the difference being that rather than taking the action prescribed by some policy π, we take the action that gives the best expected return.
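The two recursions above can be turned into simple fixed-point iterations. The following is a minimal sketch, not a reference implementation: the two-state, two-action problem, the array layout P[a, s, s'], and the function names evaluate_policy and value_iteration are all assumptions made for illustration.

```python
import numpy as np

def evaluate_policy(P, R, policy, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate V(s) <- R(s) + gamma * sum_s' P(s'|s,policy(s)) V(s').

    P[a, s, s2] is the assumed layout for P(s2 | s, a); R[s] is the reward.
    """
    n_states = R.shape[0]
    # Row s of P_pi is the transition distribution under the policy's action.
    P_pi = P[policy, np.arange(n_states), :]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        V_new = R + gamma * P_pi @ V
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V

def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Iterate V(s) <- R(s) + max_a gamma * sum_s' P(s'|s,a) V(s')."""
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        # Q[a, s] is the return of taking action a in state s, then following V.
        Q = R[None, :] + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    greedy = (R[None, :] + gamma * (P @ V)).argmax(axis=0)
    return V, greedy

# Hypothetical problem data, chosen only to exercise the two functions.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # transitions under action 0
    [[0.5, 0.5], [0.6, 0.4]],  # transitions under action 1
])
R = np.array([1.0, 0.0])

V_pi = evaluate_policy(P, R, policy=np.array([0, 1]))
V_star, greedy = value_iteration(P, R)
print("V_pi  :", V_pi)
print("V_star:", V_star, "greedy actions:", greedy)
```

Because the optimality equation takes a maximum over actions at every state, V_star should be at least as large as V_pi in every state for any fixed policy.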

Example

The Bellman equation for the dynamic programming problem

\max_{ \left \{ x_{t+1} \right \}_{t=0}^{\infty} } \sum_{t=0}^{\infty} \beta^t F(x_t,x_{t+1})

such that

\begin{matrix} x_{t+1} \in \Gamma (x_t), & t = 0, 1, 2, \ldots \\ x_0 \in X, & \text{given} \end{matrix}

can be written as:

V(x) = \max_{y \in \Gamma (x) } [F(x,y) + \beta V(y)], \forall x \in X.

Here the feasible set in

y \in \Gamma (x)

depends on the state x, and the maximizing choice, viewed as a function of the state,

y(x)

is the policy function.
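As an illustration of the recursive form, the sketch below solves V(x) = \max_{y \in \Gamma (x)} [F(x,y) + \beta V(y)] by repeated substitution on a grid. Everything specific in it is an assumption made for the example rather than part of the problem above: the "cake-eating" payoff F(x,y) = \sqrt{x - y}, the feasible set \Gamma (x) = [0, x], the grid on [0, 1], the value \beta = 0.9, and the function name solve_bellman.

```python
import numpy as np

def solve_bellman(grid, F, beta=0.9, tol=1e-10, max_iter=10_000):
    """Value function iteration for V(x) = max_{y in Gamma(x)} [F(x,y) + beta V(y)].

    Assumes Gamma(x) = {y in grid : y <= x} (hypothetical feasible set).
    Returns the value function V and the policy function y(x) on the grid.
    """
    X, Y = np.meshgrid(grid, grid, indexing="ij")
    feasible = Y <= X
    # Evaluate F only at feasible pairs; infeasible choices get payoff -inf.
    Y_safe = np.where(feasible, Y, X)
    payoff = np.where(feasible, F(X, Y_safe), -np.inf)

    V = np.zeros(len(grid))
    for _ in range(max_iter):
        # Right-hand side of the Bellman equation for every state x.
        V_new = (payoff + beta * V[None, :]).max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    best = (payoff + beta * V[None, :]).argmax(axis=1)
    return V, grid[best]

grid = np.linspace(0.0, 1.0, 101)
V, y_of_x = solve_bellman(grid, lambda x, y: np.sqrt(x - y))
print("V(1) is approximately", V[-1])
print("y(1) is approximately", y_of_x[-1])
```

The array y_of_x is the discretized policy function: for each state x on the grid it holds the next-period state that attains the maximum.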

01-04-2007 01:16:19
The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.