Problem 1

Consider the discrete-time time-invariant one dimensional system with \(X=\mathbb{R}\), transitions $$ x^{+}=x+u, $$ and control value space the nonnegative reals: $$ \mathcal{U}=\mathbb{R}_{+} . $$ With \(q(t, x, u):=u^{2}, p(x):=x^{2}, \sigma=0\), and \(\tau=3\), find \(V(t, x)\) for all \(x\) and \(t=0,1,2\) using the dynamic programming technique. Next guess a general formula for arbitrary \(t, \sigma, \tau\) and establish its validity.

Short Answer

Backward dynamic programming from the terminal condition \(V(3, x) = x^2\) gives, for \(t = 0, 1, 2\): \(V(t, x) = x^2\) for \(x \ge 0\) (the optimal control is \(u = 0\)), and \(V(t, x) = \frac{x^2}{4 - t}\) for \(x < 0\). For arbitrary \(\sigma \le t \le \tau\), the general formula is \(V(t, x) = x^2\) for \(x \ge 0\), and \(V(t, x) = \frac{x^2}{\tau - t + 1}\) for \(x < 0\); it does not depend on \(\sigma\).

Step by step solution

Step 1: Write the dynamic programming equation

The dynamic programming equation, also known as the Bellman equation, for the cost \(\sum_{s=t}^{\tau-1} q(s, x(s), u(s)) + p(x(\tau))\) is: \(V(t, x) = \min_{u \in \mathcal{U}} [q(t, x, u) + V(t+1, x^+)]\), with the terminal condition \(V(\tau, x) = p(x)\). Working backward from \(t = \tau = 3\), at each step \(t = 2, 1, 0\) we find the control \(u^* \in \mathcal{U}\) that minimizes the bracketed expression.
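A single backward step of this recursion can be sketched numerically. The snippet below is only an illustration under stated assumptions: the helper name `bellman_step` and the control grid `u_grid` (a discretization of \(\mathcal{U}\) truncated to \([0, 4]\) with step 0.001) are hypothetical choices, not part of the problem statement.

```python
def terminal_cost(x):
    """Terminal condition V(tau, x) = p(x) = x^2."""
    return x * x


def bellman_step(v_next, u_grid):
    """Return the function V(t, .) built from V(t+1, .).

    Uses the transition x+ = x + u and stage cost q = u^2, and
    approximates the minimum over u >= 0 by a minimum over u_grid.
    """
    def v(x):
        return min(u * u + v_next(x + u) for u in u_grid)
    return v


# Assumed discretization of the control set: u in [0, 4], step 0.001.
u_grid = [i * 0.001 for i in range(4001)]

v3 = terminal_cost
v2 = bellman_step(v3, u_grid)

for x in (-1.0, 0.0, 1.0):
    print(f"x = {x:+.1f}  V(2, x) ~ {v2(x):.4f}")
```

Nesting `bellman_step` three times is possible but slow, because each evaluation re-minimizes the next stage; a tabulated value function avoids that cost.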

Step 2: Calculate V(3, x)

The terminal condition is given as: \(V(\tau, x) = V(3, x) = p(x) = x^2\)

Step 3: Calculate V(2, x)

Substitute \(t = 2\) into the Bellman equation: \(V(2, x) = \min_{u \in \mathcal{U}} [u^2 + V(3, x+u)]\). Since \(V(3, x+u) = (x + u)^2\), this becomes \(V(2, x) = \min_{u \in \mathcal{U}} [u^2 + (x + u)^2]\). The bracketed expression is convex in \(u\); setting its derivative to zero, \(2u + 2(x + u) = 0\), gives the unconstrained minimizer \(u = -\frac{x}{2}\). Because the control must be nonnegative, the optimal control is \(u^* = \max\left(-\frac{x}{2}, 0\right)\). For \(x \ge 0\), \(u^* = 0\) and the cost is \(x^2\). For \(x < 0\), \(u^* = -\frac{x}{2}\) and the cost is \(\frac{x^2}{4} + \frac{x^2}{4} = \frac{x^2}{2}\). Hence \(V(2, x) = x^2\) for \(x \ge 0\), and \(V(2, x) = \frac{x^2}{2}\) for \(x < 0\).
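The projected minimizer \(u^* = \max(-x/2, 0)\) can be evaluated exactly with rational arithmetic. This is a small check rather than a derivation; the function names are hypothetical. Substituting \(u^*\) into \(u^2 + (x+u)^2\) should give \(x^2\) for \(x \ge 0\) and \(x^2/2\) for \(x < 0\).

```python
from fractions import Fraction


def one_step_cost(x, u):
    """Cost u^2 + V(3, x + u) with V(3, y) = y^2."""
    return u * u + (x + u) ** 2


def optimal_control(x):
    """Unconstrained minimizer -x/2 projected onto the constraint u >= 0."""
    return max(-x / 2, Fraction(0))


def v2(x):
    """V(2, x), obtained by substituting the optimal control."""
    return one_step_cost(x, optimal_control(x))


for x in (Fraction(-2), Fraction(-1), Fraction(0), Fraction(3)):
    print(f"x = {x}, u* = {optimal_control(x)}, V(2, x) = {v2(x)}")
```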

Step 4: Calculate V(1, x)

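Following the same procedure for \(t = 1\): \(V(1, x) = \min_{u \in \mathcal{U}} [u^2 + V(2, x+u)]\), where \(V(2, y) = y^2\) for \(y \ge 0\) and \(V(2, y) = \frac{y^2}{2}\) for \(y < 0\). For \(x \ge 0\), every admissible \(u\) keeps \(x + u \ge 0\), so the cost \(u^2 + (x+u)^2\) is minimized at \(u^* = 0\). For \(x < 0\) and \(0 \le u \le -x\), the cost is \(u^2 + \frac{(x+u)^2}{2}\); setting the derivative \(2u + (x + u)\) to zero gives \(u^* = -\frac{x}{3}\) and the cost \(\frac{x^2}{9} + \frac{2x^2}{9} = \frac{x^2}{3}\). Overshooting with \(u > -x\) costs more than \(x^2\), so it is never optimal. Hence: \(V(1, x) = \left\lbrace \begin{array}{ll} x^2 & \mbox{if } x \ge 0 \\ \frac{x^2}{3} & \mbox{if } x < 0 \end{array} \right.\)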

Step 5: Calculate V(0, x)

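Finally, for \(t = 0\): \(V(0, x) = \min_{u \in \mathcal{U}} [u^2 + V(1, x+u)]\), where \(V(1, y) = y^2\) for \(y \ge 0\) and \(V(1, y) = \frac{y^2}{3}\) for \(y < 0\). For \(x \ge 0\), the minimum is again attained at \(u^* = 0\). For \(x < 0\) and \(0 \le u \le -x\), the cost is \(u^2 + \frac{(x+u)^2}{3}\); setting the derivative \(2u + \frac{2(x + u)}{3}\) to zero gives \(u^* = -\frac{x}{4}\) and the cost \(\frac{x^2}{16} + \frac{3x^2}{16} = \frac{x^2}{4}\). Hence \(V(0, x) = x^2\) for \(x \ge 0\), and \(V(0, x) = \frac{x^2}{4}\) for \(x < 0\).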
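The three backward steps can also be run on a grid as a sanity check. The sketch below assumes a hypothetical discretization: states on \([-2, 2]\) with spacing 0.01, and controls restricted to multiples of the same spacing that keep \(x + u\) on the grid. It prints the tabulated value at \(x = -1.2\) next to \(x^2/(\tau - t + 1)\), which is what the exact recursion gives for negative \(x\).

```python
H = 0.01   # grid spacing (assumed)
N = 200    # grid covers x in [-N*H, N*H] = [-2, 2]
TAU = 3    # final time


def x_at(i):
    """State value at grid index i."""
    return (i - N) * H


def backward_pass():
    """Tabulate V(t, .) for t = TAU, ..., 0 by the Bellman recursion."""
    size = 2 * N + 1
    tables = {TAU: [x_at(i) ** 2 for i in range(size)]}  # V(TAU, x) = x^2
    for t in range(TAU - 1, -1, -1):
        v_next = tables[t + 1]
        # Control u = j*H >= 0 moves grid index i to index i + j.
        tables[t] = [
            min((j * H) ** 2 + v_next[i + j] for j in range(size - i))
            for i in range(size)
        ]
    return tables


tables = backward_pass()
i = N - 120  # grid index of x = -1.2
for t in (2, 1, 0):
    closed_form = x_at(i) ** 2 / (TAU - t + 1)
    print(f"t = {t}: grid {tables[t][i]:.4f}, x^2/(tau - t + 1) = {closed_form:.4f}")
```

The grid values are slight overestimates wherever the minimizing control falls between grid points, so agreement is expected only up to a small discretization error.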

Step 6: Generalize the result for arbitrary t, σ, τ

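For general \(t\) with \(\sigma \le t \le \tau\): \(V(t, x) = x^2\) for \(x \ge 0\), and \(V(t, x) = \frac{x^2}{\tau - t + 1}\) for \(x < 0\). The formula depends only on the remaining horizon \(\tau - t\), not on \(\sigma\), because the system and the costs are time-invariant. Its validity follows by backward induction on \(t\). It holds at \(t = \tau\), where \(V(\tau, x) = x^2\). Suppose it holds at \(t + 1\), and let \(k = \tau - t\). For \(x \ge 0\), any \(u \ge 0\) gives \(x + u \ge 0\), so the cost \(u^2 + (x+u)^2\) is minimized at \(u^* = 0\) with value \(x^2\). For \(x < 0\) and \(0 \le u \le -x\), the cost is \(u^2 + \frac{(x+u)^2}{k}\), which is minimized at \(u^* = -\frac{x}{k+1}\) with value \(\frac{x^2}{(k+1)^2} + \frac{k x^2}{(k+1)^2} = \frac{x^2}{k+1}\); any \(u > -x\) costs more than \(x^2\). Hence \(V(t, x) = \frac{x^2}{\tau - t + 1}\) for \(x < 0\), which completes the induction. The corresponding optimal feedback is \(u = 0\) for \(x \ge 0\) and \(u = -\frac{x}{\tau - t + 1}\) for \(x < 0\).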
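For negative \(x\), the backward recursion reduces to a recursion on a single coefficient. If \(V(t+1, y) = c\,y^2\) for \(y < 0\), then minimizing \(u^2 + c\,(x+u)^2\) over \(u\) gives \(u = -\frac{c\,x}{1+c}\), which is nonnegative and does not overshoot zero, and the minimum value is \(\frac{c}{1+c}\,x^2\). The sketch below iterates \(c \mapsto \frac{c}{1+c}\) exactly, starting from \(c = 1\) at \(t = \tau\); the function name is hypothetical.

```python
from fractions import Fraction


def negative_branch_coefficients(tau, sigma=0):
    """Return {t: c_t} with V(t, x) = c_t * x^2 for x < 0, t = sigma..tau."""
    coeffs = {tau: Fraction(1)}  # terminal condition V(tau, x) = x^2
    for t in range(tau - 1, sigma - 1, -1):
        c_next = coeffs[t + 1]
        coeffs[t] = c_next / (1 + c_next)  # one Bellman step for x < 0
    return coeffs


coeffs = negative_branch_coefficients(tau=3)
for t in sorted(coeffs):
    print(f"t = {t}: V(t, x) = {coeffs[t]} * x^2 for x < 0")
```

Running the same function with other values of `tau` and `sigma` is a quick way to compare each coefficient with \(\frac{1}{\tau - t + 1}\).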

Most popular questions from this chapter

Chapter 8

Suppose that \(L_{f} V(x) \leq 0\) for all \(x\) and that \(\dot{x}=f(x)+G(x) u\) is globally stabilized by \(u=-(\nabla V(x) \cdot G(x))^{\prime}\), as in Proposition 5.9.1. Show that \(u=k(x)\) is an optimal feedback, and \(V\) is the value function, for some suitably chosen cost. (Hint: Let \(Q(x):=-L_{f} V(x)+\frac{1}{2} L_{G} V(x)\left(L_{G} V(x)\right)^{\prime}\), which gives (8.64) for which \(R\)? Use Exercise 8.5.5.)

Chapter 8

Find the value function \(V\) and the optimal feedback solution for the problem of minimizing \(\int_{0}^{\infty} x^{2}+u^{2} d t\) for the scalar system \(\dot{x}=x^{2}+u\). (Hint: The HJB equation is now an ordinary differential equation, in fact, a quadratic equation on the derivative of \(V\).)

Chapter 8

Prove, without using the Pole-Shifting Theorem: If \(\Sigma\) is a controllable time-invariant linear discrete-time system over \(\mathbb{K}=\mathbb{R}\), then there exists an \(m \times n\) real matrix \(F\) such that \(A+B F\) is a convergent matrix.

Chapter 8

Consider the case when \(B=0, Q=I\), and \(S\) approaches zero. Show that the formulas for least-squares observation in Section \(6.3\) can be recovered from the results in this section. (Hint: The equation for \(\widetilde{P}\) can be solved with final condition zero, and its solution at the initial time can be expressed directly in terms of the Wronskian \(W\).)

Chapter 8

Consider the problem of minimizing the same cost (8.42) over the set of all trajectories of the system $$ \dot{\xi}=A \xi+G \bar{u}+B \omega, $$ where \(\bar{u}\) is a known control on \([\sigma, \tau]\). Show that the minimum is achieved through the solution of $$ \dot{z}=A z+L[C z-\bar{y}]+G \bar{u}, \quad z(\sigma)=0, $$ where \(L\) is the same as before. (Hint: Simply convert the original problem into an estimation problem for \(\dot{\tilde{\xi}}=A \tilde{\xi}+B \omega\), where \(\tilde{\xi}:=\xi-\bar{x}\) and \(\bar{x}\) satisfies \(\dot{\bar{x}}=A \bar{x}+G \bar{u}\).)
