A networked unconstrained nonlinear MPC scheme

In this paper we propose an MPC scheme with a compensation mechanism for packet dropouts in a network connection between controller and actuator. We provide a stability and suboptimality analysis of the scheme based on asymptotic controllability properties and show that for large classes of systems we obtain the same stability conditions as for classical MPC and in particular stability for sufficiently large optimization horizon. As a byproduct, we observe that longer control horizons may improve the performance of the MPC closed loop. We illustrate our results by the standard inverted pendulum on a cart problem.


I. INTRODUCTION
Due to lower implementation costs, greater interoperability, and a wide range of choices in developing control systems, networked control systems (NCS) are increasingly used, particularly in the automotive and aeronautical industries, which are seeing high adoption rates of drive-by-wire and fly-by-wire designs. The main drawback of NCS is the additional complexity in analysis and feedback design.
In this paper we consider the implementation of a nonlinear model predictive control (MPC) scheme over a network. More precisely, we consider an uncertain transmission channel between the controller and the actuator and focus on the idealized situation in which delays are negligible but packet dropouts may occur. In order to compensate for these dropouts, we propose an MPC variant whose main ingredient is a buffer device in the actuator. Note that we do not assume any particular protocol like round-robin (RR) or try-once-discard (TOD), as, e.g., in [12], [15], [16]. That is, we assume that either a packet arrives unperturbed and with negligible delay over the channel, or it is treated as a dropout. While this is an admittedly simplified setting, we consider our proposed MPC scheme as a building block for more sophisticated schemes which, in addition, are able to handle delays between sensor, controller and actuator and whose details are currently under investigation, see, e.g., [6].
Our proposed MPC scheme results in a nonstandard MPC closed loop in which the control horizon, i.e., the number of elements of the online computed optimal control sequence which are eventually applied at the plant, is time varying and unknown at the time of optimization. The main goal of this paper is to provide a mathematically rigorous stability and suboptimality analysis of this scheme. During the last decades, such results have been obtained for different MPC variants, see, e.g., [1], [3], [4], [7], [8], [10]. Here we consider the simplest and industrially most commonly used class of MPC schemes for nonlinear systems, namely those without terminal constraints and costs, see [2] for a survey.
For our analysis we generalize results from [4] by allowing for variable control horizons. This technique relies on a suitable asymptotic controllability assumption and leads to a necessary and sufficient condition for suboptimality and stability in terms of a small optimization problem, which was solved numerically in [4]. Besides generalizing these results to variable control horizons, in this paper we also present a closed-form analytic solution of this optimization problem for a large class of systems. This allows for a detailed qualitative study of the impact of different control horizons, which in particular reveals that for certain classes of systems longer control horizons can yield better suboptimality estimates than those obtained for the usual control horizon of length one. Since our results are based on a worst case analysis over a class of asymptotically controllable systems, we additionally perform a numerical Monte-Carlo simulation in order to provide some insight into the gap between the worst case and the average performance.
The paper is organized as follows: In Section II we describe the setup and formalize the MPC scheme we propose. In Section III we summarize and extend the optimization based MPC analysis technique from [4] and in Section IV we show how this technique can be used in order to prove asymptotic stability for our proposed MPC scheme. Thereafter, in Section V we present the analytic solution of the optimization problem and state a couple of consequences for the stability of our proposed scheme. In Section VI we illustrate our results by means of a numerical example. Finally, we perform a Monte-Carlo simulation in order to investigate the differences between the worst case and average performance and draw some conclusions.

II. SETUP AND PRELIMINARIES
We consider a nonlinear discrete time control system given by

x(n + 1) = f(x(n), u(n)), x(0) = x0, (1)

with f : X × U → X. Here the state space X is an arbitrary metric space. We denote the space of control sequences u : N0 → U by U and the solution trajectory for given u ∈ U by x_u(n).
A typical class of such discrete time systems are sampled-data systems induced by a controlled (finite or infinite dimensional) differential equation with sampling period T > 0. In this situation, the discrete time n corresponds to the continuous time t = nT.
We consider the situation of a networked control system shown in Fig. 1, where the controller at every time instant n ∈ N0 uses a network channel in order to transmit the feedback control value u(n) = µ(x(n)) to the actuator. We assume that delays over the network are negligible but that occasional packet dropouts occur, i.e., that the control value sent by the controller does not arrive at the actuator.
In order to compensate for these dropouts, we add a buffer device in the actuator and design a controller which at each time instant n sends a sequence µ(x(n), 0), µ(x(n), 1), ..., µ(x(n), m⋆ − 1) instead of a single control value u(n) = µ(x(n)) ∈ U. In the actuator, the elements of this sequence are buffered and used until the next sequence arrives.
Definition 2.1: Given a set M ⊆ {1, ..., m⋆}, we call a control horizon sequence (m_i)_{i∈N0} admissible if m_i ∈ M holds for all i ∈ N0. Furthermore, for k, n ∈ N0 we define

σ(k) := ∑_{i=0}^{k−1} m_i and ϕ(n) := max{σ(k) | k ∈ N0, σ(k) ≤ n}.

Here σ(k) denotes the kth successful transmission time while ϕ(n) denotes the largest successful transmission time ≤ n.
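To make this bookkeeping concrete, the transmission times can be computed directly from a given control horizon sequence. The following is a minimal sketch (the function names are our own, and the code assumes the definitions of σ and ϕ given above):

```python
def sigma(k, m_seq):
    # k-th successful transmission time: sum of the first k control horizons
    return sum(m_seq[:k])

def phi(n, m_seq):
    # largest successful transmission time <= n
    best = 0
    for k in range(len(m_seq) + 1):
        if sigma(k, m_seq) <= n:
            best = sigma(k, m_seq)
    return best
```

For example, for the admissible sequence (m_i) = (2, 3, 1) we get σ(1) = 2, σ(2) = 5 and ϕ(4) = 2, i.e., at time n = 4 the actuator is still working with the buffer received at time 2.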
MPC is ideally suited to implement the proposed compensation strategy, since in each MPC optimization step an optimal control sequence is computed anyway. In order to formalize MPC, we start by looking at the following problem: find a feedback control law minimizing the infinite horizon cost

J_∞(x0, u) = ∑_{n=0}^{∞} l(x_u(n), u(n))

with running cost l : X × U → R≥0. We denote the optimal value function for this problem by V_∞(x0) = inf_{u∈U} J_∞(x0, u). In order to be consistent with the scheme introduced above, we use the term feedback control in the following general sense.
Definition 2.2: For m⋆ ≥ 1 and M ⊆ {1, ..., m⋆} a multistep feedback law is a map µ : X × {0, ..., m⋆ − 1} → U which for an admissible control horizon sequence (m_i)_{i∈N0} is applied according to the rule

x_µ(0) = x0, x_µ(n + 1) = f(x_µ(n), µ(x_µ(ϕ(n)), n − ϕ(n))). (2)

Since infinite horizon optimal control problems are in general computationally infeasible, we use a receding horizon approach in order to compute an approximately optimal controller. To this end we consider the finite horizon functional

J_N(x0, u) = ∑_{n=0}^{N−1} l(x_u(n), u(n)) (3)

with optimization horizon N ∈ N, which gives us the optimal value function

V_N(x0) = inf_{u∈U} J_N(x0, u). (4)

Here, we consider the conceptually simplest MPC approach imposing neither terminal costs nor terminal constraints.
Results including an additional weight on the final term can be found in [5].
Based on this finite horizon optimal value function we define a multistep feedback law µ_{N,m⋆} by picking the first m⋆ elements of the optimal control sequence.
Definition 2.3: For m⋆ ≥ 1 and N ≥ m⋆ + 1 we define a multistep MPC feedback law by µ_{N,m⋆}(x0, n) = u⋆(n), n = 0, ..., m⋆ − 1, where u⋆ is a minimizing control for (4) with initial value x0.
Remark 2.4: For simplicity of exposition we assume that the infimum in (4) is a minimum.
Note that "classical" MPC is included in this definition and corresponds to the choice m ⋆ = 1.
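To fix ideas, the complete networked loop (controller, channel with dropouts, buffered actuator) can be sketched in a few lines. This is an illustration only: `solve_ocp` is a hypothetical placeholder for an optimization routine returning the first m⋆ elements of a minimizer of (4), and the forced refresh when the buffer runs empty encodes the admissibility assumption m_i ∈ {1, ..., m⋆}, i.e., that at least every m⋆ steps a transmission succeeds:

```python
import random

def mpc_networked(f, solve_ocp, x0, m_star, steps, p_drop=0.3, seed=0):
    # f(x, u): system dynamics; solve_ocp(x): first m_star optimal controls
    rng = random.Random(seed)
    x = x0
    buffer, idx = solve_ocp(x0), 0      # initial transmission succeeds
    traj = [x0]
    for _ in range(steps):
        x = f(x, buffer[idx])           # actuator applies buffered value
        idx += 1
        new_seq = solve_ocp(x)          # controller computes at every step
        dropped = rng.random() < p_drop
        if not dropped or idx >= m_star:
            buffer, idx = new_seq, 0    # buffer refreshed on (forced) arrival
        traj.append(x)
    return traj
```

For instance, with the scalar test system f(x, u) = 0.5x + u and the purely illustrative controller returning m⋆ copies of −0.1x, the closed loop contracts even under 50% dropouts.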
In order to measure the suboptimality degree of the multistep feedback for the infinite horizon problem we define V^µ_∞(x0) as the value of J_∞ along the closed loop trajectory (2) generated by µ. Our approach relies on results on relaxed dynamic programming [9], [13], already used in an MPC context in [7], which we adapt to our variable control horizon setting.
Proposition 2.5: Consider a multistep feedback law μ : X × {0, ..., m⋆ − 1} → U and a function V : X → R≥0 and assume that for each admissible control horizon sequence (m_i)_{i∈N0} and each x0 ∈ X the corresponding solution x_μ(n) of (2) satisfies

V(x0) ≥ V(x_μ(m0)) + α ∑_{n=0}^{m0−1} l(x_μ(n), μ(x0, n)) (5)

for some α ∈ (0, 1]. Then for all x0 ∈ X and all admissible control horizon sequences (m_i)_{i∈N0} the estimate

α V_∞(x0) ≤ α V^μ_∞(x0) ≤ V(x0)

holds.

Proof: The proof is similar to that of [4, Proposition 2.4]: Consider x0 ∈ X and the trajectory x_μ(n) generated by the closed loop system using the multistep feedback μ and the control horizons m_i. Then from (5) for all k ∈ N0 we obtain

α ∑_{n=σ(k)}^{σ(k+1)−1} l(x_μ(n), μ(x_μ(σ(k)), n − σ(k))) ≤ V(x_μ(σ(k))) − V(x_μ(σ(k+1))).

Summing over the transmission times σ(k) yields the assertion.

III. CONTROLLABILITY AND PERFORMANCE BOUNDS
In this section we introduce an asymptotic controllability assumption and deduce several consequences for our optimal control problem. In order to facilitate this relation we will formulate our basic controllability assumption, below, not in terms of the trajectory but in terms of the running cost l along a trajectory. To this end we say that a continuous function ρ : R≥0 → R≥0 is of class K∞ if it satisfies ρ(0) = 0, is strictly increasing and is unbounded. We say that a continuous function β : R≥0 × R≥0 → R≥0 is of class KL0 if for each r > 0 we have lim_{t→∞} β(r, t) = 0 and for each t ≥ 0 we either have β(·, t) ∈ K∞ or β(·, t) ≡ 0. Note that in order to allow for tighter bounds for the actual controllability behavior of the system we use a larger class than the usual class KL. It is, however, easy to see that each β ∈ KL0 can be overbounded by a function in KL, e.g., by setting β̃(r, t) = max_{τ≥t} β(r, τ) + e^{−t} r. Furthermore, we define l⋆(x) := min_{u∈U} l(x, u).

Assumption 3.1: Given a function β ∈ KL0, for each x0 ∈ X there exists a control function u_{x0} ∈ U satisfying

l⋆(x_{u_{x0}}(n)) ≤ β(l⋆(x0), n) for all n ∈ N0.

We will pay particular attention to β of the form

β(r, n) = C σ^n r (6)

for real constants C ≥ 1 and σ ∈ (0, 1), i.e., exponential controllability, and

β(r, n) = c_n r (7)

for some real sequence (c_n)_{n∈N0} with c_n ≥ 0 and c_n = 0 for all n ≥ n0, i.e., finite time controllability (with linear overshoot).
For certain results it will be useful to have the property

β(r, n + m) ≤ β(β(r, n), m) for all r ≥ 0 and all n, m ∈ N0. (8)

Property (8) ensures that any sequence of the form λ_n = β(r, n), r > 0, also fulfills λ_{n+m} ≤ β(λ_n, m). It is, for instance, always satisfied in case (6) and satisfied in case (7) if c_{n+m} ≤ c_n c_m. If needed, this property can be assumed without loss of generality, because by Sontag's KL-Lemma [14] β in Assumption 3.1 can be replaced by a β of the form β(r, t) = α1(α2(r)e^{−t}) with α1, α2 ∈ K∞, which satisfies (8).

Under Assumption 3.1, for any r ≥ 0 and any N ≥ 1 we define the value

B_N(r) := ∑_{n=0}^{N−1} β(r, n). (9)

An immediate consequence of Assumption 3.1 are the following lemmata which have been shown in [4].

Lemma 3.2: For each N ≥ 1 and each x ∈ X the inequality V_N(x) ≤ B_N(l⋆(x)) holds.

Lemma 3.3: Assume Assumption 3.1 and consider x0 ∈ X and an optimal control u⋆ for the finite horizon optimal control problem (4) with optimization horizon N ≥ 1. Then for each j = 0, ..., N − 1 the inequality

∑_{n=j}^{N−1} l(x_{u⋆}(n), u⋆(n)) ≤ B_{N−j}(l⋆(x_{u⋆}(j)))

and for each m = 1, ..., N − 1 and each j = 0, ..., N − m − 1 the inequality

V_N(x_{u⋆}(m)) ≤ ∑_{n=0}^{j−1} l(x_{u⋆}(n + m), u⋆(n + m)) + B_{N−j}(l⋆(x_{u⋆}(j + m)))

holds for B_{N−j} from (9).

Now we provide a constructive approach in order to compute α in (5) for systems satisfying Assumption 3.1. Note that (5) only depends on m0 and not on the remainder of the control horizon sequence. Hence, we can perform the computation separately for each control horizon m and obtain the desired α for variable m by minimizing over the α-values for all feasible m.
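For illustration, in the exponential case (6) the bound B_N from (9) is linear in its argument, with a factor B_N(r)/r that is exploited in Section V. A small sketch (our own illustration, not code from the paper):

```python
def B(N, beta, r):
    # B_N(r) = sum_{n=0}^{N-1} beta(r, n), cf. (9)
    return sum(beta(r, n) for n in range(N))

def beta_exp(r, n, C=3.0, sigma=0.6):
    # exponential controllability, case (6)
    return C * sigma**n * r

def gamma_k(k, C=3.0, sigma=0.6):
    # gamma_k = B_k(r)/r = C * (1 - sigma**k) / (1 - sigma), independent of r
    return C * (1 - sigma**k) / (1 - sigma)
```

The constants C = 3 and σ = 0.6 are the example values used in the Monte-Carlo simulation of Section VII; any C ≥ 1, σ ∈ (0, 1) work the same way.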
For our computational approach we consider arbitrary values λ0, ..., λ_{N−1} > 0 and ν > 0 and start by deriving necessary conditions under which these values coincide with an optimal sequence l(x_{u⋆}(n), u⋆(n)) and an optimal value V_N(x_{u⋆}(m)), respectively.

Proposition 3.4: Assume Assumption 3.1 and consider N ≥ 1, m ∈ {1, ..., N − 1}, x0 ∈ X and an optimal control u⋆ for (4). Then the values λ_n = l(x_{u⋆}(n), u⋆(n)), n = 0, ..., N − 1, and ν = V_N(x_{u⋆}(m)) satisfy

∑_{n=k}^{N−1} λ_n ≤ B_{N−k}(λ_k), k = 0, ..., N − 2, (13)

ν ≤ ∑_{n=0}^{j−1} λ_{n+m} + B_{N−j}(λ_{j+m}), j = 0, ..., N − m − 1. (14)

Using this proposition, a sufficient condition for suboptimality of the MPC feedback law µ_{N,m} is given in Theorem 3.5, which is proved in [4].
Theorem 3.5: Consider β ∈ KL0, N ≥ 1, m ∈ {1, ..., N − 1}, and assume that all values λ_n > 0, n = 0, ..., N − 1, and ν > 0 fulfilling (13), (14) satisfy the inequality

∑_{n=0}^{N−1} λ_n − ν ≥ α ∑_{n=0}^{m−1} λ_n (15)

for some α ∈ (0, 1]. Then for each optimal control problem (1), (4) satisfying Assumption 3.1 the assumptions of Proposition 2.5 are satisfied for the multistep MPC feedback law µ_{N,m} and in particular the inequality αV_∞(x) ≤ αV^{µ_{N,m}}_∞(x) ≤ V_N(x) holds for all x ∈ X.

In view of Theorem 3.5, the value α can be interpreted as a performance bound which indicates how well the receding horizon MPC strategy approximates the infinite horizon problem. In the remainder of this section we present an optimization approach for computing α. To this end consider the following optimization problem.

Problem 3.6: Given β ∈ KL0, N ≥ 1 and m ∈ {1, ..., N − 1}, compute

α[N, m] := inf_{λ0, ..., λ_{N−1}, ν} (∑_{n=0}^{N−1} λ_n − ν) / ∑_{n=0}^{m−1} λ_n

subject to the constraints (13), (14) and λ0, ..., λ_{N−1}, ν > 0.
The following is a straightforward corollary of Theorem 3.5.

Corollary 3.7: Consider β ∈ KL0, N ≥ 1 and m ∈ {1, ..., N − 1}, and let α = α[N, m] be the optimal value of Problem 3.6. If α ∈ (0, 1], then the assertion of Theorem 3.5 holds for this α.

IV. ASYMPTOTIC STABILITY
In this section we show how the performance bound α can be used in order to conclude asymptotic stability of the MPC closed loop. More precisely, we investigate the asymptotic stability of the zero set of l⋆. To this end we make the following assumption.
Assumption 4.1: There exists a closed set A ⊂ X satisfying:
(i) For each x ∈ A there exists u ∈ U with f(x, u) ∈ A and l(x, u) = 0, i.e., we can stay inside A forever at zero cost.
(ii) There exist K∞-functions α1, α2 such that the inequality

α1(‖x‖_A) ≤ l⋆(x) ≤ α2(‖x‖_A) (16)

holds for each x ∈ X, where ‖x‖_A := min_{y∈A} ‖x − y‖.

This assumption assures global asymptotic stability of A under the optimal feedback for the infinite horizon problem, provided β(r, n) is summable. We remark that condition (ii) can be relaxed in various ways, e.g., it could be replaced by a detectability condition similar to the one used in [3]. However, in order to keep the presentation in this paper technically simple we will work with Assumption 4.1(ii) here. Our first stability result is formulated in the following theorem. Here we say that a multistep feedback law µ asymptotically stabilizes a set A if there exists β ∈ KL0 such that for all admissible control horizon sequences the closed loop system satisfies ‖x_µ(n)‖_A ≤ β(‖x0‖_A, n).

Theorem 4.2: Consider β ∈ KL0, m⋆ ≥ 1, N ≥ m⋆ + 1 and a set M ⊆ {1, ..., m⋆}. Assume that α⋆ := min_{m∈M} α[N, m] > 0, where α[N, m] denotes the optimal value of optimization Problem 3.6. Then for each optimal control problem (1), (4) satisfying the Assumptions 3.1 and 4.1 the multistep MPC feedback law µ_{N,m⋆} asymptotically stabilizes the set A for all admissible control horizon sequences (m_i)_{i∈N0}. Furthermore, the function V_N is a Lyapunov function at the transmission times σ(k) in the sense that

V_N(x_µ(σ(k + 1))) ≤ V_N(x_µ(σ(k))) − α⋆ α1(‖x_µ(σ(k))‖_A) (17)

holds for all k ∈ N0 and x0 ∈ X, where µ = µ_{N,m⋆}.
Proof: From (16) and Lemma 3.2 we immediately obtain the inequality

α1(‖x‖_A) ≤ V_N(x) ≤ B_N(α2(‖x‖_A)).

Note that B_N ∘ α2 is again a K∞-function. The stated Lyapunov inequality (17) follows immediately from the definition of α⋆ and (5), which holds according to Corollary 3.7 for all m ∈ M. Again using (16) we obtain V_N(x) ≥ α1(‖x‖_A) and thus a standard construction (see, e.g., [11]) yields a KL-function ρ for which the inequality

‖x_µ(σ(k))‖_A ≤ ρ(‖x0‖_A, k)

holds for all k ∈ N0. In addition, using the definition of µ_{N,m⋆}, for p = 1, ..., m_k − 1 and k ∈ N0 the intermediate values l⋆(x_µ(σ(k) + p)) can be estimated in terms of V_N(x_µ(σ(k))), using (17) in the last step. Hence, we obtain a bound on ‖x_µ(σ(k) + p)‖_A and thus asymptotic stability with a KL-function constructed from ρ, α1, α2 and B_N.

Remark 4.3: For the "classical" MPC case m⋆ = 1 and β satisfying (8) it is shown in [4, Theorem 5.3] that the criterion from Theorem 4.2 is tight in the sense that if α⋆ < 0 holds then there exists a control system which satisfies Assumption 3.1 but which is not stabilized by the MPC scheme. We conjecture that the same is true for the general case m⋆ ≥ 2.

V. CALCULATION OF α
Problem 3.6 is an optimization problem of much lower complexity than the original MPC optimization problem. Still, it is in general nonlinear. However, it becomes a linear program if β(r, n) (and thus B_k(r) from (9)) is linear in r.
Lemma 5.1: If β(r, t) is linear in r, then Problem 3.6 yields the same optimal value α as

min_{λ0, ..., λ_{N−1}, ν} ∑_{n=0}^{N−1} λ_n − ν (19)

subject to the (now linear) constraints (13), (14) and

∑_{n=0}^{m−1} λ_n = 1, λ0, ..., λ_{N−1}, ν ≥ 0. (20)

For a proof we refer to [4]. For linear β we can define γ_k := B_k(r)/r. This allows for an explicit formula to calculate the optimal value α of Problem 3.6.

Theorem 5.2: Let β(·, ·) be linear in its first argument and satisfy (8). Then the optimal value α = α[N, m] for given optimization horizon N and control horizon m is

α[N, m] = 1 − (∏_{i=m+1}^{N} (γ_i − 1) ∏_{i=N−m+1}^{N} (γ_i − 1)) / ((∏_{i=m+1}^{N} γ_i − ∏_{i=m+1}^{N} (γ_i − 1)) (∏_{i=N−m+1}^{N} γ_i − ∏_{i=N−m+1}^{N} (γ_i − 1))). (21)

Proof: We only sketch the main ideas of the proof and refer to [5] for details. At the optimum of the linear problem stated in Lemma 5.1, inequality (14) for j = N − m − 1 is an active constraint. As a consequence, the positivity conditions concerning ν and λ0 are implicitly guaranteed. The obtained equality for (14), j = N − m − 1, in combination with equality (20) allows for rewriting the objective function as 1 − (γ_{m+1} − 1)λ_{N−1} and eliminating ν and λ0 from the optimization problem entirely. A pairwise comparison, based on (8), of (13), k = m, ..., N − 2, with (14), j = 0, ..., N − m − 2, shows that the restrictions (13), k = m, ..., N − 2, are negligible because each point which violates (13) for some k is already infeasible due to (14). Hence, the optimization problem under consideration depends only on λ1, ..., λ_{N−1} ≥ 0 and the remaining N − 1 inequalities. In addition, one can prove that the optimum is strictly positive and satisfies all remaining constraints with equality. Solving the resulting linear system of equations yields the stated formula for α.

Theorem 5.2 enables us to easily compute the performance bounds α[N, m] which are needed in Theorem 4.2, provided β is known. However, even if β is not known exactly, we can deduce valuable information. The following corollary is obtained by a careful analysis of the fraction in (21), cf. [5].
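Assuming exponential controllability (6), the γ_k are geometric sums, γ_k = C(1 − σ^k)/(1 − σ), and the closed-form product expression for α[N, m] can be evaluated directly. The following sketch implements our reading of Formula (21); treat it as illustrative:

```python
from math import prod

def gamma_seq(N, C, sigma):
    # gamma_k = B_k(r)/r for beta(r, n) = C * sigma**n * r
    return {k: C * (1 - sigma**k) / (1 - sigma) for k in range(1, N + 1)}

def alpha(N, m, C, sigma):
    # performance bound alpha[N, m], cf. Theorem 5.2 (our reading of (21))
    g = gamma_seq(N, C, sigma)
    def pm1(lo):   # product of (gamma_i - 1) for i = lo, ..., N
        return prod(g[i] - 1 for i in range(lo, N + 1))
    def den(lo):   # product of gamma_i minus product of (gamma_i - 1)
        return prod(g[i] for i in range(lo, N + 1)) - pm1(lo)
    return 1 - pm1(m + 1) * pm1(N - m + 1) / (den(m + 1) * den(N - m + 1))
```

Note that this expression directly exhibits the symmetry α[N, m] = α[N, N − m] discussed below, and that α[N, m] approaches 1 for growing N, in accordance with Corollary 5.3.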
Corollary 5.3: For each fixed m and β of type (6) or (7) we have lim_{N→∞} α[N, m] = 1. In particular, for sufficiently large N the assumptions of Theorem 4.2 hold and hence the networked closed loop system is asymptotically stable.

Another application of Formula (21) is the investigation of qualitative properties of α[N, m] depending on the control horizon m. The following symmetry property follows immediately from Formula (21).

Corollary 5.4: Let β be linear in its first argument and satisfy (8). Then α[N, m] = α[N, N − m] holds for all N ≥ 2 and all m ∈ {1, ..., N − 1}.

Example 5.5 shows that the desired monotonicity property does not hold for arbitrary KL0-functions β. However, monotonicity does hold for β of type (6) and at least for a subset of β of type (7); this is made precise in Theorem 5.6, for whose statement and proof we refer to [5].
Theorem 5.7: Let β be of type (6) or of type (7) with c_n = 0 for n ≥ 1. Then for each N ≥ 1 the stability criterion from Theorem 4.2 is satisfied for m⋆ = N − 1 if and only if it is satisfied for m⋆ = 1.
In other words, for exponentially controllable systems and for systems which are finite time controllable in one step, our proposed networked MPC scheme yields stability under exactly the same conditions as "classical" MPC, i.e., m⋆ = 1. In this context we recall once again that for m⋆ = 1 the stability condition of Theorem 4.2 is tight, cf. Remark 4.3.

VI. EXAMPLE
In this section we compare our analytical results to a numerical MPC simulation. To this end we consider a linearization of the inverted pendulum on a cart. We want to stabilize the upright position x⋆ = (0, 0, 0, 0)⊤ using linear MPC. We consider the optimization horizon N = 10, the sampling interval T = 0.5 and the running cost l(x, u) = x⊤Qx + u⊤Ru with Q = Id and R = 4 Id. Moreover, we use the constants g = 9.81 and k = 0.1 for gravity and friction, respectively.
For each m = 1, ..., 9 we have simulated MPC closed loop trajectories x^p_{µ_{N,m}} with control horizon m_i ≡ m and equidistant initial values x^p, p = 1, ..., 625, from a rectangle with diameter 0.2 around (0, 0, −4, −1)⊤. Along each trajectory we have then computed α[N, m]^p as the minimum of the values α from (5) applied with x0 = x^p_{µ_{N,m}}(n), n = 0, m, 2m, ..., ϕ(18). A selection of these values is plotted in Fig. 4, in which each dashed line represents the values α[N, 1]^p, ..., α[N, N − 1]^p for an initial value x^p. In addition, the minima over all trajectories are plotted as a solid line.
The results indicate that the closed loop is asymptotically stable for each m_i and confirm that choosing control horizons m_i > 1 may indeed improve the suboptimality bound. Moreover, it is interesting to compare Fig. 4 with Fig. 2. While Fig. 2 shows the minimal α-values for a set of exponentially controllable systems over all initial values, the curves in Fig. 4 represent the α-values for one particular system and a finite set of initial values. Despite this very different nature of the computations, the curves in Fig. 4 at least approximately resemble the shape of the curves in Fig. 2.

Fig. 4. Approximation of α[10, m] for the linear inverted pendulum.

VII. MONTE-CARLO SIMULATION
While we were able to observe the monotonicity property stated in Theorem 5.6, at least approximately, in many numerical examples (cf. also the examples in [5]), the symmetry proved in Corollary 5.4 could not be observed. As in Figure 4, in simulations the values α[N, m] for large m are typically significantly smaller (and thus "worse") than the values α[N, N − m]. A possible explanation for this fact is that the values α[N, m] from Theorem 5.2 are obtained by computing the worst case over all control systems which are controllable in the sense of Assumption 3.1. One may conjecture that a "randomly" chosen system is more likely to be close to the worst case system for large m than for small m.
In order to support this conjecture, instead of computing α by minimizing (19) over all admissible λ0, ..., λ_{N−1} and ν, we randomly generate admissible sequences λ0, ..., λ_{N−1}, then minimize (19) over ν only, and average over the resulting α in a Monte-Carlo simulation. While minimizing over the λ_i corresponds to picking the worst case system in the class of systems satisfying the controllability Assumption 3.1, picking random sequences corresponds to picking random systems from this class.
In order to generate an admissible random sequence λ0, ..., λ_{N−1}, i.e., a sequence satisfying (13), we again exploit the linearity of B_k, which yields B_k(r) = γ_k r with γ_k = B_k(r)/r, cf. the definition before Theorem 5.2. We then calculate the λ_i by backward induction: we set λ_{N−1} = 1 and inductively compute

λ_i = ∑_{n=i+1}^{N−1} λ_n / (z (γ_{N−i} − 1)), i = N − 2, ..., 0,

where z ∈ (0, 1) are random numbers generated in each iteration step. Here we used uniformly distributed random numbers; however, experiments with other distributions yielded qualitatively similar results. Normalizing this sequence such that ∑_{n=0}^{m−1} λ_n = 1 holds, we then solved (19) numerically by minimizing over ν only.
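Under the assumptions above (exponential β, the constraint structure of (13) and (14) as used in this paper, and minimization over ν performed in closed form by choosing the largest ν admissible in (14)), the sampling procedure can be sketched as follows; the variable names are our own:

```python
import random

def gamma(k, C, sigma):
    # gamma_k = B_k(r)/r for beta(r, n) = C * sigma**n * r
    return C * (1 - sigma**k) / (1 - sigma)

def random_alpha(N, m, C, sigma, rng):
    # backward induction: lam[N-1] = 1, then satisfy (13) with random slack z
    lam = [0.0] * N
    lam[N - 1] = 1.0
    for i in range(N - 2, -1, -1):
        z = 1.0 - rng.random()                    # z in (0, 1]
        lam[i] = sum(lam[i + 1:]) / (z * (gamma(N - i, C, sigma) - 1))
    s = sum(lam[:m])                              # normalization (20)
    lam = [v / s for v in lam]
    # minimizing (19) over nu only: take the largest nu allowed by (14)
    nu = min(sum(lam[m:m + j]) + gamma(N - j, C, sigma) * lam[j + m]
             for j in range(N - m))
    return sum(lam) - nu                          # sample value of alpha

def monte_carlo(N, m, C, sigma, runs=1000, seed=1):
    rng = random.Random(seed)
    return sum(random_alpha(N, m, C, sigma, rng) for _ in range(runs)) / runs
```

Taking z close to 1 produces nearly tight constraints (13), i.e., sequences close to the worst case, while small z yields sequences with large slack.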
Figure 5 shows the result of a Monte-Carlo simulation using this procedure, in which we averaged the resulting α-values over 1000 randomly generated sequences for β(r, t) = Cσ^t r with C = 3 and σ = 0.6. We observe the same qualitative behavior as for the inverted pendulum example in Figure 4. In particular, the result supports our conjecture that the average performance for large m is closer to the worst case than for small m. Still, the result also shows that for control horizons m up to about 0.8N we do not observe a significant loss of performance compared to the classical MPC case m = 1.

VIII. CONCLUSIONS
We have proposed a building block for the stability and performance analysis of MPC schemes for networked control systems with packet dropouts. Our technique is based on asymptotic controllability properties and leads to an explicitly computable performance index α which shows that for a large class of systems stability can be guaranteed under the same conditions as for a classical MPC scheme. In addition, by means of a numerical Monte-Carlo simulation we investigated the gap between the worst case and the average behavior, showing that the average performance is closer to the worst case for large control horizons (corresponding to long periods of network failure) than for small ones.

Fig. 1. Scheme of the considered networked control system.