Computing stability and performance bounds for unconstrained NMPC schemes

We present a technique for computing stability and performance bounds for unconstrained nonlinear MPC schemes. The technique relies on controllability properties of the system under consideration and the computation can be formulated as an optimization problem whose complexity is independent of the state space dimension.


I. INTRODUCTION
The stability and suboptimality analysis of model predictive control (MPC, often also termed receding horizon control) schemes has been a topic of active research during the last decades. While the MPC literature often uses stabilizing terminal constraints or terminal costs in order to prove stability and suboptimality of the resulting closed loop (see, e.g., [7], [1], [5] or the survey paper [9]), here we consider the simplest class of MPC schemes, namely those without terminal constraints and costs. These schemes are attractive because of their numerical simplicity: they do not require the consideration of feasible sets imposed by the stabilizing constraints and are easily generalized to time varying tracking type problems and to the case where more complicated sets than equilibria are to be stabilized. Essentially, these unconstrained MPC schemes can be interpreted as a simple truncation of the infinite optimization horizon to a finite horizon N.
For unconstrained schemes without terminal cost, Jadbabaie and Hauser [6] and Grimm et al. [2] show under different types of controllability and detectability conditions for nonlinear systems that stability of the closed loop can be expected if the optimization horizon N is sufficiently large; however, no explicit bounds for N are given. The paper [3] (see also [4]) uses techniques from relaxed dynamic programming [8], [11] in order to compute explicit estimates for the degree of suboptimality, which in particular lead to bounds on the stabilizing optimization horizon N. The conditions used in [3] are satisfied under a controllability condition; however, the resulting estimates for the stabilizing horizon N are in general not optimal. Such optimal estimates have been obtained in [12], [10] using the explicit knowledge of the finite horizon optimal value functions, which could be computed numerically in the (linear) examples considered in these papers.
Unfortunately, for high (or even infinite) dimensional or nonlinear systems, in general neither an analytical expression nor a sufficiently accurate numerical approximation of the optimal value functions is available. However, it may still be possible to analyze (open loop) controllability properties. Hence, in this paper we base our analysis on such properties, more precisely on KL bounds on the chosen running cost along (not necessarily optimal) trajectories. Such bounds induce upper bounds on the optimal value functions, and the main feature we exploit is the fact that the controllability properties impose bounds on the optimal value function not only at the initial value but, via Bellman's optimality principle, also along "tails" of optimal trajectories. As in [3], the resulting condition gives a bound on the degree of suboptimality of the MPC feedback which in particular allows us to determine a bound on the minimal stabilizing horizon N. Furthermore, the condition can be expressed as an optimization problem whose complexity is independent of the dimension of the state space of the system and which is actually a linear program if the KL function involved in the controllability assumption is linear in its first argument. An important feature of our approach is that the resulting bound on the stabilizing optimization horizon N turns out to be optimal, not necessarily with respect to a single system but with respect to the whole class of systems satisfying the assumed controllability property.

(L. Grüne is with the Mathematical Institute, University of Bayreuth, 95440 Bayreuth, Germany, lars.gruene@uni-bayreuth.de)
The paper is organized as follows: in Section II we describe the setup and the relaxed dynamic programming inequality our approach is based upon. In Section III we describe the controllability condition we are going to use and its consequences for the optimal value functions and trajectories. In Section IV we use these results in order to obtain a condition for suboptimality, and in Section V we show how this condition can be formulated as an optimization problem. Section VI shows how our condition can be applied to the stability analysis. In Section VII we discuss some numerical results, and Section VIII gives some brief conclusions and an outlook. A technical lemma is formulated and proved in the Appendix.

II. SETUP AND PRELIMINARY RESULTS
We consider a nonlinear discrete time system given by

x(n + 1) = f(x(n), u(n)), x(0) = x_0,  (2.1)

with x(n) ∈ X and u(n) ∈ U for n ∈ N_0. We denote the space of control sequences u : N_0 → U by U and the solution trajectory for some u ∈ U by x_u(n). The state space X is an arbitrary metric space, i.e., it can range from a finite set to an infinite dimensional space. Our goal is to find a feedback control law minimizing the infinite horizon cost

J_∞(x_0, u) = Σ_{n=0}^{∞} l(x_u(n), u(n)),  (2.2)

with running cost l : X × U → R_0^+. We denote the optimal value function for this problem by V_∞(x_0) := inf_{u ∈ U} J_∞(x_0, u). Here we use the term feedback control in the following general sense.
Definition 2.1: For m ≥ 1, an m-step feedback law is a map µ : X × {0, . . . , m−1} → U which is applied according to the rule

x_µ(0) = x_0, x_µ(n + 1) = f(x_µ(n), µ(x_µ(⌊n/m⌋ m), n − ⌊n/m⌋ m)).  (2.3)

In other words, the feedback is evaluated at the times 0, m, 2m, . . . and generates a sequence of m control values which is applied in the m steps until the next evaluation. Note that for m = 1 we obtain the usual static state feedback concept in discrete time.
If the optimal value function V_∞ is known, it is easy to prove using Bellman's optimality principle that the optimal m-step feedback law µ is given by

µ(x_0, ·) = argmin_{u ∈ U^m} { V_∞(x_u(m)) + Σ_{n=0}^{m−1} l(x_u(n), u(n)) }.  (2.4)

Remark 2.2: We assume throughout this paper that in all relevant expressions the minimum with respect to u ∈ U^m is attained. Although it is possible to give modified statements using approximate minimizers, we decided to make this assumption in order to simplify and streamline the presentation.
Since infinite horizon optimal control problems are in general computationally infeasible, we use a receding horizon approach in order to compute an approximately optimal controller. To this end we consider the finite horizon functional

J_N(x_0, u) = Σ_{n=0}^{N−1} l(x_u(n), u(n))  (2.5)

for N ∈ N_0 (using the convention Σ_{n=0}^{−1} = 0) and the optimal value function

V_N(x_0) := inf_{u ∈ U} J_N(x_0, u).  (2.6)

Note that this is the conceptually simplest receding horizon approach in which neither terminal costs nor terminal constraints are imposed. Based on this finite horizon optimal value function, for m ≤ N we define an m-step feedback law µ_{N,m} by picking the first m elements of the optimal control sequence for this problem according to the following definition.

Definition 2.3: Let u* be a minimizing control for (2.5) and initial value x_0. Then we define the m-step MPC feedback law by µ_{N,m}(x_0, n) = u*(n), n = 0, . . . , m − 1.
Here the value N is called the optimization horizon while we refer to m as the control horizon.
Note that we do not need uniqueness of u * for this definition, however, for µ N,m (x 0 , ·) being well defined we suppose that for each x 0 we select one specific u * from the set of optimal controls.
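As an illustration of Definition 2.3, the following sketch implements the m-step receding horizon feedback for a hypothetical scalar example (f(x, u) = 2x + u with l(x, u) = x² + u²; the system, the horizons and the use of a generic optimizer are illustrative assumptions, not part of the setup above).

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical example system and running cost (not from the paper):
# x(n+1) = 2 x(n) + u(n),  l(x, u) = x^2 + u^2.
def f(x, u):
    return 2.0 * x + u

def l(x, u):
    return x ** 2 + u ** 2

def J_N(x0, u_seq):
    """Finite horizon cost (2.5): sum of l along the open loop trajectory."""
    x, cost = x0, 0.0
    for u in u_seq:
        cost += l(x, u)
        x = f(x, u)
    return cost

def mpc_control(x0, N):
    """Minimizing control sequence u* for (2.5), without terminal cost
    or terminal constraints; mu_{N,m}(x0, n) = u*(n), n = 0, ..., m-1."""
    res = minimize(lambda u: J_N(x0, u), np.zeros(N), method="BFGS")
    return res.x

def closed_loop(x0, N, m, steps):
    """m-step MPC closed loop: re-optimize every m steps and
    apply the first m elements of the optimal sequence."""
    xs, x = [x0], x0
    while len(xs) <= steps:
        u_star = mpc_control(x, N)
        for n in range(m):
            x = f(x, u_star[n])
            xs.append(x)
    return xs

traj = closed_loop(1.0, N=5, m=1, steps=15)
```

For this example one can verify by dynamic programming that N = 2 only yields the marginally stable closed loop x⁺ = x, while N ≥ 3 is contracting, illustrating that stability indeed requires a sufficiently long horizon.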
The first goal of the present paper is to give estimates about the suboptimality of the feedback µ_{N,m} for the infinite horizon problem. More precisely, for an m-step feedback law µ with corresponding solution trajectory x_µ(n) from (2.3) we define the infinite horizon value

V_∞^µ(x_0) := Σ_{n=0}^{∞} l(x_µ(n), µ(x_µ(⌊n/m⌋ m), n − ⌊n/m⌋ m))

and are interested in upper bounds for the infinite horizon value V_∞^{µ_{N,m}}, i.e., in an estimate about the "degree of suboptimality" of the controller µ_{N,m}. Based on this estimate, the second purpose of this paper is to derive results on the asymptotic stability of the resulting closed loop system using V_N as a Lyapunov function.
The approach we take in this paper relies on results on relaxed dynamic programming [8], [11] which were already used in an MPC context in [4], [3]. Next we state the basic relaxed dynamic programming inequality adapted to our setting.

Proposition 2.4: Consider an m-step feedback law µ̃ : X × {0, . . . , m−1} → U and a function Ṽ : X → R_0^+ satisfying the inequality

Ṽ(x_0) ≥ Ṽ(x_µ̃(m)) + α Σ_{n=0}^{m−1} l(x_µ̃(n), µ̃(x_0, n))  (2.7)

for some α ∈ (0, 1] and all x_0 ∈ X. Then for all x ∈ X the estimate

α V_∞(x) ≤ α V_∞^{µ̃}(x) ≤ Ṽ(x)

holds.

Proof: The proof is similar to that of [11, Proposition 3] and [3, Proposition 2.2]: Consider x_0 ∈ X and the trajectory x_µ̃(n) generated by the closed loop system using µ̃. Then from (2.7) for all n ∈ N_0 we obtain

α Σ_{k=nm}^{(n+1)m−1} l(x_µ̃(k), µ̃(x_µ̃(nm), k − nm)) ≤ Ṽ(x_µ̃(nm)) − Ṽ(x_µ̃((n+1)m)).

Summing over n = 0, . . . , K−1 yields

α Σ_{k=0}^{Km−1} l(x_µ̃(k), µ̃(x_µ̃(⌊k/m⌋ m), k − ⌊k/m⌋ m)) ≤ Ṽ(x_0) − Ṽ(x_µ̃(Km)) ≤ Ṽ(x_0).

For K → ∞ this yields that Ṽ is an upper bound for α V_∞^{µ̃} and hence α V_∞(x) ≤ α V_∞^{µ̃}(x) ≤ Ṽ(x).

Remark 2.5: The term "unconstrained" only refers to constraints which are introduced in order to ensure stability of the closed loop. Other constraints are easily included in our setup; e.g., the set U of admissible control values could be subject to (possibly state dependent) constraints, or X could be the feasible set of a state constrained problem on a larger state space.
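The relaxed dynamic programming estimate above can be checked numerically on a simple example. The sketch below uses the hypothetical scalar problem f(x, u) = 2x + u, l(x, u) = x² + u² (an assumption chosen for illustration), for which V_N(x) = c_N x² can be computed exactly by backward dynamic programming, so that the constant α in (2.7) and the closed loop infinite horizon cost are available in closed form for m = 1.

```python
# Hypothetical scalar example (illustration only): x+ = 2x + u, l = x^2 + u^2.
# Backward dynamic programming gives V_k(x) = c_k x^2 with
#   c_1 = 1,  c_{k+1} = 1 + 4 c_k / (1 + c_k),
# and the horizon-N MPC feedback for m = 1 is u = -2 c_{N-1} x / (1 + c_{N-1}).

def value_coeffs(N):
    c = [1.0]
    while len(c) < N:
        ck = c[-1]
        c.append(1.0 + 4.0 * ck / (1.0 + ck))
    return c  # c[k-1] = c_k

def alpha_and_performance(N, x0=1.0, steps=200):
    """Largest alpha satisfying (2.7) for this example (here the constant
    ratio (V_N(x) - V_N(x+)) / l(x, mu(x)), independent of x), together
    with the closed loop infinite horizon cost by direct summation."""
    c = value_coeffs(N)
    cN, cprev = c[-1], c[-2]
    gain = 2.0 * cprev / (1.0 + cprev)      # feedback u = -gain * x
    rho = 2.0 - gain                        # closed loop x+ = rho * x
    stage = 1.0 + gain ** 2                 # l(x, -gain x) = stage * x^2
    alpha = cN * (1.0 - rho ** 2) / stage   # ratio in (2.7)
    V_mu = sum(stage * rho ** (2 * n) * x0 ** 2 for n in range(steps))
    return alpha, V_mu, cN * x0 ** 2

alpha, V_mu, V_N = alpha_and_performance(N=3)
# Proposition 2.4 then guarantees alpha * V_mu <= V_N.
```

For N = 3 this example even gives equality in the bound α V_∞^µ(x) ≤ V_N(x), while for N = 2 one obtains α = 0 and the marginally stable closed loop x⁺ = x.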

III. CONTROLLABILITY AND BOUNDS ON OPTIMAL VALUES
In this section we introduce an asymptotic controllability assumption and deduce several consequences for our optimal control problem. In order to relate this controllability property directly to the cost functional, we will formulate our basic controllability assumption, below, not in terms of the trajectory itself but in terms of the running cost l along a trajectory.
To this end we say that a continuous function ρ : R_{≥0} → R_{≥0} is of class K_∞ if it satisfies ρ(0) = 0, is strictly increasing and is unbounded. We say that a continuous function β : R_{≥0} × R_{≥0} → R_{≥0} is of class KL_0 if for each r > 0 we have lim_{t→∞} β(r, t) = 0 and for each t ≥ 0 we either have β(·, t) ∈ K_∞ or β(·, t) ≡ 0. Note that in order to allow for tighter bounds on the actual controllability behavior of the system we use a larger class than the usual class KL. It is, however, easy to see that each β ∈ KL_0 can be overbounded by a β̄ ∈ KL, e.g., by setting β̄(r, t) = max_{τ≥t} β(r, τ) + e^{−t} r. Furthermore, we define l*(x) := min_{u∈U} l(x, u).
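The overbounding construction β̄(r, t) = max_{τ≥t} β(r, τ) + e^{−t} r can be checked directly; the sketch below does so for a hypothetical β ∈ KL_0 which is linear in r and vanishes for n ≥ 2 (the coefficients are purely illustrative).

```python
import math

def beta(r, n):
    """A hypothetical KL_0 function: linear in r, zero for n >= 2."""
    coeffs = [1.0, 2.0]  # illustrative coefficients
    return coeffs[n] * r if n < len(coeffs) else 0.0

def beta_bar(r, t, tail=50):
    """Overbounding KL function: the maximum over the tail makes the
    function nonincreasing in t, and the exp(-t) r term makes it
    strictly decreasing and of class K_infty in r for every t."""
    return max(beta(r, tau) for tau in range(t, t + tail)) + math.exp(-t) * r
```

Since this particular β vanishes after two steps, the finite tail maximum is exact; for a general β ∈ KL_0 the maximum over τ ≥ t would be taken over the whole tail.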
Assumption 3.1: Given β ∈ KL_0, for each x_0 ∈ X there exists a control function u_{x_0} ∈ U satisfying

l(x_{u_{x_0}}(n), u_{x_0}(n)) ≤ β(l*(x_0), n)

for all n ∈ N_0.

Under Assumption 3.1, for any r ≥ 0 and any N ≥ 1 we define the value

B_N(r) := Σ_{n=0}^{N−1} β(r, n).  (3.4)

An immediate consequence of Assumption 3.1 is the following lemma.
Lemma 3.2: For each N ≥ 1 and each x ∈ X the inequality

V_N(x) ≤ B_N(l*(x))

holds.
Proof: Using u_{x_0} from Assumption 3.1, the inequality follows immediately from

V_N(x_0) ≤ J_N(x_0, u_{x_0}) = Σ_{n=0}^{N−1} l(x_{u_{x_0}}(n), u_{x_0}(n)) ≤ Σ_{n=0}^{N−1} β(l*(x_0), n) = B_N(l*(x_0)),  (3.5)

where the same chain of inequalities holds with an arbitrary horizon K in place of N. In the special case of exponential controllability, i.e., β(r, n) = C σ^n r, this yields the explicit bound V_N(x) ≤ C (1 − σ^N)/(1 − σ) l*(x).

The following lemma gives bounds on the finite horizon functional along tails of optimal trajectories.

Lemma 3.3: Assume Assumption 3.1 and consider x_0 ∈ X and an optimal control u* for the finite horizon optimal control problem (2.6) with optimization horizon N ≥ 1. Then for each k = 0, . . . , N − 1 the inequality

J_{N−k}(x_{u*}(k), u*(· + k)) ≤ B_{N−k}(l*(x_{u*}(k)))

holds.

Proof: From the definition of J_N we obtain

J_N(x_0, u*) = Σ_{n=0}^{k−1} l(x_{u*}(n), u*(n)) + J_{N−k}(x_{u*}(k), u*(· + k)).

Hence, for the control function defined by ũ(n) = u*(n) for n = 0, . . . , k−1 and ũ(n) = u_{x_{u*}(k)}(n − k) for n ≥ k, the optimality of u* implies

J_N(x_0, u*) ≤ J_N(x_0, ũ) = Σ_{n=0}^{k−1} l(x_{u*}(n), u*(n)) + J_{N−k}(x_{u*}(k), u_{x_{u*}(k)}).

Subtracting the latter from the former yields

J_{N−k}(x_{u*}(k), u*(· + k)) ≤ J_{N−k}(x_{u*}(k), u_{x_{u*}(k)}),

which using (3.5) implies

J_{N−k}(x_{u*}(k), u*(· + k)) ≤ B_{N−k}(l*(x_{u*}(k))),

i.e., the assertion.

A similar inequality can be obtained for V_N.

Lemma 3.4: Assume Assumption 3.1 and consider x_0 ∈ X and an optimal control u* for the finite horizon optimal control problem (2.6) with optimization horizon N. Then for each m = 1, . . . , N − 1 and each j = 0, . . . , N − m − 1 the inequality

V_N(x_{u*}(m)) ≤ J_j(x_{u*}(m), u*(· + m)) + B_{N−j}(l*(x_{u*}(m + j)))

holds for B_N from (3.4).

Proof: We define the control function

ũ(n) = u*(n + m), n = 0, . . . , j−1,  ũ(n) = u_{x_{u*}(m+j)}(n − j), n ≥ j.

Then we obtain

V_N(x_{u*}(m)) ≤ J_N(x_{u*}(m), ũ) = Σ_{n=0}^{j−1} l(x_{u*}(n + m), u*(n + m)) + J_{N−j}(x_{u*}(m + j), u_{x_{u*}(m+j)}) ≤ J_j(x_{u*}(m), u*(· + m)) + B_{N−j}(l*(x_{u*}(m + j))),

where we used (3.5) in the last step. This is the desired inequality.
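To illustrate Lemma 3.2 numerically, consider the hypothetical scalar system f(x, u) = x + u with l(x, u) = x² + u² (our assumption, not an example from the text). The open loop feedback u = −(1 − σ)x shows that Assumption 3.1 holds with β(r, n) = (1 + (1 − σ)²) σ^{2n} r, while the true optimal values V_N(x) = c_N x² follow from backward dynamic programming, so the bound V_N(x) ≤ B_N(l*(x)) can be checked for every N.

```python
# Hypothetical example (illustration only): x+ = x + u, l = x^2 + u^2,
# so l*(x) = x^2.  The open loop feedback u = -(1 - sigma) x gives
# l(x_u(n), u(n)) = C * sigma^(2n) * l*(x0) with C = 1 + (1 - sigma)^2,
# i.e. Assumption 3.1 holds with beta(r, n) = C sigma^(2n) r.

def B(N, sigma):
    """B_N(r)/r from (3.4) for beta(r, n) = C sigma^(2n) r."""
    C = 1.0 + (1.0 - sigma) ** 2
    return sum(C * sigma ** (2 * n) for n in range(N))

def V_coeff(N):
    """V_N(x) = c_N x^2 by backward dynamic programming:
    c_1 = 1,  c_{k+1} = 1 + c_k / (1 + c_k)."""
    c = 1.0
    for _ in range(N - 1):
        c = 1.0 + c / (1.0 + c)
    return c

# Lemma 3.2: c_N <= B_N(r)/r must hold for every admissible sigma;
# sigma could even be optimized to tighten the bound.
```

For σ = 0.5 the bound B_N(r)/r approaches 5/3 while c_N approaches the golden ratio ≈ 1.618, so the controllability-based bound is close to, but strictly above, the true optimal values for this example.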

IV. NECESSARY CONDITIONS FOR OPTIMAL SEQUENCES
In this section we now consider arbitrary values λ_0, . . . , λ_{N−1} > 0 and ν > 0 and derive necessary conditions under which these values coincide with the values l(x_{u*}(n), u*(n)) along an optimal trajectory and with the optimal value V_N(x_{u*}(m)), respectively. Setting λ_n = l(x_{u*}(n), u*(n)), Lemma 3.3 together with l*(x_{u*}(k)) ≤ λ_k and the monotonicity of B_{N−k} yields

Σ_{n=k}^{N−1} λ_n ≤ B_{N−k}(λ_k),  k = 0, . . . , N − 2,  (4.1)

while setting ν = V_N(x_{u*}(m)), Lemma 3.4 yields

ν ≤ Σ_{n=0}^{j−1} λ_{n+m} + B_{N−j}(λ_{j+m}),  j = 0, . . . , N − m − 1.  (4.2)

Based on these necessary conditions, Theorem 4.2 shows that if α ∈ (0, 1] satisfies

Σ_{n=0}^{N−1} λ_n − ν ≥ α Σ_{n=0}^{m−1} λ_n

for all λ_0, . . . , λ_{N−1}, ν > 0 fulfilling (4.1) and (4.2), then the relaxed dynamic programming inequality (2.7) holds for µ̃ = µ_{N,m} and Ṽ = V_N, and consequently the suboptimality estimate of Proposition 2.4 applies.

V. OPTIMIZING THE WORST CASE
The assumptions of Theorem 4.2 can be verified by an optimization approach. To this end consider the following optimization problem.

Problem 5.1: Given β ∈ KL_0, N ≥ 1 and m ∈ {1, . . . , N − 1}, compute

α := inf_{λ_0, . . . , λ_{N−1}, ν} (Σ_{n=0}^{N−1} λ_n − ν) / (Σ_{n=0}^{m−1} λ_n)

subject to the constraints (4.1), (4.2) and λ_0, . . . , λ_{N−1}, ν > 0.

The following is a straightforward corollary from Theorem 4.2.

Corollary 5.2: Consider β ∈ KL_0, N ≥ 1, m ∈ {1, . . . , N − 1}, and assume that Problem 5.1 has an optimal value α ∈ (0, 1]. Then for each optimal control problem (2.1), (2.6) satisfying Assumption 3.1, the inequality (2.7) holds for µ̃ = µ_{N,m} and Ṽ = V_N, and thus the estimate α V_∞(x) ≤ α V_∞^{µ_{N,m}}(x) ≤ V_N(x) holds for all x ∈ X.
Problem 5.1 is an optimization problem of much lower complexity than the original MPC optimization problem. Still, it is in general nonlinear. However, it becomes a linear program if we assume that β(r, n), and thus B_k(r), is linear in r.
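To sketch this, assume β(r, n) = β_n r. Then B_k(r) = (Σ_{n<k} β_n) r and all constraints are homogeneous of degree one in (λ, ν), so the denominator of the objective can be normalized to Σ_{n=0}^{m−1} λ_n = 1 and Problem 5.1 becomes a linear program. The following code (using scipy's linprog) is our reconstruction of this LP and should be read as an illustrative sketch, not as the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def alpha(beta_coeffs, N, m=1):
    """Worst-case suboptimality degree from Problem 5.1 for horizon N and
    control horizon m, assuming beta(r, n) = beta_coeffs[n] * r.

    Variables: x = (lambda_0, ..., lambda_{N-1}, nu).
    Objective: minimize sum(lambda) - nu, normalized so that
    lambda_0 + ... + lambda_{m-1} = 1 (valid by homogeneity)."""
    gamma = np.cumsum(beta_coeffs)            # gamma[j-1] = B_j(r) / r
    c = np.concatenate([np.ones(N), [-1.0]])  # sum(lambda) - nu

    A_ub, b_ub = [], []
    # (4.1): sum_{n=k}^{N-1} lambda_n <= B_{N-k}(lambda_k), k = 0,...,N-2
    for k in range(N - 1):
        row = np.zeros(N + 1)
        row[k:N] = 1.0
        row[k] -= gamma[N - k - 1]
        A_ub.append(row); b_ub.append(0.0)
    # (4.2): nu <= sum_{n=0}^{j-1} lambda_{n+m} + B_{N-j}(lambda_{j+m}),
    #        j = 0,...,N-m-1
    for j in range(N - m):
        row = np.zeros(N + 1)
        row[-1] = 1.0
        row[m:m + j] -= 1.0
        row[j + m] -= gamma[N - j - 1]
        A_ub.append(row); b_ub.append(0.0)

    A_eq = np.zeros((1, N + 1)); A_eq[0, :m] = 1.0
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)] * (N + 1))
    assert res.status == 0
    return res.fun
```

Scanning N = 2, 3, . . . until α becomes positive then yields the minimal stabilizing horizon for the whole controllability class described by β, e.g. for β(r, n) = C σ^n r.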

VI. ASYMPTOTIC STABILITY
We now investigate the asymptotic stability of the zero set of l*. To this end we make the following assumption.

Assumption 6.1: There exists a compact set A ⊂ X satisfying:

(i) For each x ∈ A there exists u ∈ U with f(x, u) ∈ A and l(x, u) = 0, i.e., we can stay inside A forever at zero cost.

(ii) There exist K_∞-functions α_1, α_2 such that the inequality

α_1(||x||_A) ≤ l*(x) ≤ α_2(||x||_A)  (6.1)

holds for each x ∈ X, where ||x||_A := min_{y∈A} ||x − y||.

This assumption assures global asymptotic stability of A under the optimal feedback (2.4) for the infinite horizon problem, provided β(r, n) is summable. We remark that condition (ii) can be relaxed in various ways, e.g., it could be replaced by a detectability condition similar to the one used in [2]. However, in order to keep the presentation in this paper technically simple we will work with Assumption 6.1(ii) here. Our main stability result is formulated in the following theorem. As usual, we say that a feedback law µ asymptotically stabilizes a set A if there exists β̄ ∈ KL such that the closed loop system satisfies ||x_µ(n)||_A ≤ β̄(||x_0||_A, n).

Theorem 6.2: Consider β ∈ KL_0, N ≥ 1, m ∈ {1, . . . , N − 1}, and assume that the optimization Problem 5.1 has an optimal value α ∈ (0, 1]. Then for each optimal control problem (2.1), (2.6) satisfying the Assumptions 3.1 and 6.1, the m-step MPC feedback law µ_{N,m} asymptotically stabilizes the set A. Furthermore, V_N is a corresponding m-step Lyapunov function in the sense that

V_N(x_{µ_{N,m}}(m)) ≤ V_N(x) − α V_m(x).  (6.2)

Proof: From (6.1) and Lemma 3.2 we immediately obtain the inequality

α_1(||x||_A) ≤ V_N(x) ≤ B_N(α_2(||x||_A)).

The stated Lyapunov inequality (6.2) follows immediately from (2.7), which holds according to Corollary 5.2, together with Σ_{n=0}^{m−1} l(x_{µ_{N,m}}(n), µ_{N,m}(x, n)) ≥ V_m(x). Again using (6.1) we obtain V_m(x) ≥ α_1(||x||_A), and the asymptotic stability follows from a standard Lyapunov function argument, using the fact that for n = 1, . . . , m − 1 the inequality l*(x_{µ_{N,m}}(n)) ≤ V_N(x)/α holds, which via (6.1) bounds the intermediate states between the evaluation instants of the feedback.

Of course, Theorem 6.2 gives a conservative criterion in the sense that for a given system satisfying the Assumptions 3.1 and 6.1, asymptotic stability of the closed loop may well hold for smaller optimization horizons N. A trivial example for this is an asymptotically stable system (2.1) which does not depend on u at all, which will of course be "stabilized" regardless of N.
Hence, the best we can expect is that our condition is tight under the information we use, i.e., that given β, N, m such that the assumption of Theorem 6.2 is violated, we can always find a system satisfying Assumptions 3.1 and 6.1 which is not stabilized by the MPC feedback law. The following Theorem 6.3 shows that this is indeed the case if β satisfies (3.3). Its proof relies on the explicit construction of an optimal control problem which is not stabilized. Although this is in principle possible for all m ∈ {1, . . . , N − 1}, we restrict ourselves to the classical feedback case, i.e., m = 1, in order to keep the construction technically simple.

Theorem 6.3: Consider β ∈ KL_0 satisfying (3.3), N ≥ 1, m = 1, and assume that the optimization Problem 5.1 has an optimal value α < 0. Then there exists an optimal control problem (2.1), (2.6) satisfying Assumptions 3.1 and 6.1 whose closed loop system with the MPC feedback law µ_{N,1} is not asymptotically stable.
where we have used (4.2) for j = k in the second last inequality. Summarizing, we obtain that any optimal control u*_x for x = (1, 0) must satisfy u*_x(0) = 1, because for u(0) = 1 we can realize a value ≤ Λ while for u(0) ≠ 1 we inevitably obtain a value > Λ. Consequently, the MPC feedback law will steer the system from x = (1, 0) to x⁺ := (1, 1). Now we use that by construction f and l have the symmetry properties f((q, p), u) = f((q, −p + 1), −u) and l((q, p), u) = l((q, −p + 1), −u) for all (q, p) ∈ X, which implies J((q, p), u) = J((q, −p + 1), −u). Observe that x⁺ = (1, 1) is exactly the symmetric counterpart of x = (1, 0). Thus, any optimal control u*_{x⁺} from x⁺ must satisfy u*_{x⁺}(n) = −u*_x(n) for some optimal control u*_x for initial value x. Hence, we obtain u*_{x⁺}(0) = −1, which means that the MPC feedback steers x⁺ back to x. Thus, under the MPC feedback law we obtain the closed loop trajectory (x, x⁺, x, x⁺, . . .), which clearly does not converge to A. This shows that the closed loop system is not asymptotically stable.

VII. NUMERICAL FINDINGS AND EXAMPLES
In this section we illustrate some results obtained from our approach. Note that this is but a small selection of possible scenarios and more will be addressed in future papers.
We first investigate numerically how our estimated minimal stabilizing horizon N depends on β. A first observation is that if N is large enough in order to stabilize each system satisfying Assumption 3.1 with

β(r, 0) = γr,  β(r, n) = 0, n ≥ 1,  (7.1)

then N is also large enough to stabilize each system satisfying Assumption 3.1 with a β satisfying

Σ_{n=0}^{∞} β(r, n) ≤ γr.  (7.2)
In particular, this applies to β(r, n) = Cσ^n r with C/(1 − σ) ≤ γ. The reason for this is that the inequalities (4.1), (4.2) for (7.1) form weaker constraints than the respective inequalities for (7.2); hence the minimal value α for (7.1) must be less than or equal to the α for (7.2).
Note that even without sophisticated algorithms for finding the minimum in Problem 5.1, this computation needs just a few minutes using our MATLAB code. The resulting values N(γ) are shown in Figure 7.1.
It is interesting to observe that the resulting values almost exactly satisfy N (γ) ≈ γ log γ, which leads to the conjecture that this expression describes the analytical "stability margin".
In order to see the influence of the control horizon m we have repeated this computation for m = [N/2] + 1, which numerically appears to be the optimal choice of m. The results are shown in Figure 7.2.
Here, one numerically observes that N (γ) ≈ 1.4γ, i.e., we obtain a linear dependence between γ and N (γ). If we consider the running cost l as a design parameter which we are free to choose in order to guarantee stability with N as small as possible, then these numerical results have an immediate and very natural consequence: the running cost l should be chosen such that the accumulated overshoot ∞ n=0 β(r, n) for β from Assumption 3.1 is as small as possible.
In order to illustrate this for a concrete example, we apply our approach to the two dimensional linear example from [12], with running cost l(x, u) = max{||x||_∞, |u|} = max{|x_1|, |x_2|, |u|}.
Since this example is low dimensional and linear, V N can be computed numerically. This fact was used in [12] in order to compute the minimal optimization horizon for a stabilizing MPC feedback law with m = 1, which turns out to be N = 5 (note that the numbering in [12] differs from ours).
Solving Problem 5.1 for the resulting β we obtain a minimal stabilizing horizon N = 12, which is clearly conservative compared to the value N = 5 computed in [12]. Note, however, that instead of using the full information about the functions V_N, which are in general difficult to compute, we only use controllability information about the system. Now we demonstrate how a modified design of the running cost l can considerably improve our estimate of N. Recall that the smaller the accumulated overshoot induced by β is, the better the estimate becomes. A look at (7.3) reveals that in this example a reduction of the overshoot can be achieved by reducing the weight of u in l. For instance, if we modify l to l(x, u) = max{||x||_∞, |u|/2}, then (7.3) leads to β(r, 0) = 1.1 r, β(r, 1) = 2.11 r, β(r, n) = 0, n ≥ 2.
Solving Problem 5.1 for this β leads to a minimal stabilizing horizon N = 5, which demonstrates that a good design of l can indeed considerably reduce our estimate for N .
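As a numerical cross-check of the value N = 5 reported above, the linear programming form of Problem 5.1 for m = 1 can be solved for the finite time function β(r, 0) = 1.1 r, β(r, 1) = 2.11 r; the LP encoding of the constraints (4.1), (4.2) below is our reconstruction and should be read as an illustrative sketch.

```python
import numpy as np
from scipy.optimize import linprog

BETA = [1.1, 2.11]  # beta(r, n) = BETA[n] * r, zero for n >= 2

def alpha_m1(N):
    """Optimal value of Problem 5.1 for m = 1: minimize sum(lambda) - nu
    with the normalization lambda_0 = 1, subject to (4.1) and (4.2)."""
    coeff = [BETA[n] if n < len(BETA) else 0.0 for n in range(N)]
    gamma = np.cumsum(coeff)                  # gamma[j-1] = B_j(r) / r
    c = np.concatenate([np.ones(N), [-1.0]])  # objective: sum(lambda) - nu
    A, b = [], []
    for k in range(N - 1):                    # (4.1)
        row = np.zeros(N + 1)
        row[k:N] = 1.0
        row[k] -= gamma[N - k - 1]
        A.append(row); b.append(0.0)
    for j in range(N - 1):                    # (4.2) with m = 1
        row = np.zeros(N + 1)
        row[-1] = 1.0
        row[1:1 + j] -= 1.0
        row[1 + j] -= gamma[N - j - 1]
        A.append(row); b.append(0.0)
    A_eq = np.zeros((1, N + 1)); A_eq[0, 0] = 1.0
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
                  A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)] * (N + 1))
    assert res.status == 0
    return res.fun

# Smallest horizon with positive worst-case alpha:
minimal_N = next(N for N in range(2, 12) if alpha_m1(N) > 0)
```

Under this reconstruction the optimal value α is still negative for N = 4 and becomes positive at N = 5, consistent with the minimal stabilizing horizon stated above.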

VIII. CONCLUSIONS AND OUTLOOK
We have presented a sufficient condition which guarantees performance bounds for an unconstrained MPC feedback applied to a control system satisfying a controllability condition. The condition can be formulated as an optimization problem and the stability criterion derived from it turns out to be tight with respect to the whole class of systems satisfying the assumed controllability condition. Examples show how our method can be used in order to determine the dependence between overshoot and stabilizing horizon and how different choices of the running cost l influence the stability criterion.