A distributed NMPC scheme without stabilizing terminal constraints

We consider a distributed NMPC scheme in which the individual systems are coupled via state constraints. In order to avoid violation of the constraints, the subsystems communicate their individual predictions to the other subsystems once in each sampling period. For this setting, Richards and How have proposed a sequential distributed MPC formulation with stabilizing terminal constraints. In this paper we show how this scheme can be extended to MPC without stabilizing terminal constraints or costs. We show theoretically and by means of numerical simulations that under a suitable controllability condition stability and feasibility can be ensured even for rather short prediction horizons.


Introduction
In this paper we consider a distributed nonlinear model predictive control (NMPC) algorithm for systems which are coupled via state constraints. NMPC is a controller design method which relies on the online solution of optimal control problems on finite optimization horizons in each sampling period. In a distributed setting, the solution of this optimal control problem is distributed among the individual systems. This can be done in various ways, see [12, Chapter 6] or [15] for an overview. One way is to formulate the optimization objective in a centralized way and to solve this problem in a distributed way in each sampling period. The necessary splitting of the optimization problem can be obtained in various ways which under suitable assumptions guarantee that the performance of the distributed controller is similar to that of a centralized controller; examples can be found, e.g., in [4] or [12, Chapter 6]. The drawback of this method, which is usually called cooperative control, is that it requires numerous information exchanges between the individual systems during the iterative optimization procedure in each sampling interval.

Lars Grüne and Karl Worthmann, Mathematical Institute, University of Bayreuth, 95440 Bayreuth, Germany, e-mail: lars.gruene,karl.worthmann@uni-bayreuth.de
A less demanding approach from the communication point of view is non-cooperative control, in which some information from the other systems is taken into account when a system performs its optimization, but in which the optimization objectives of the individual systems are independent from each other. It is known that for this setting a solution close to the central optimum can no longer be expected; rather, the best one can hope for is a Nash equilibrium, see [12, Chapter 6]. However, under suitable conditions the resulting closed loop may still be stable and maintain the imposed coupling constraints. This is the situation we investigate in this paper. More precisely, we consider a specific non-cooperative distributed NMPC algorithm proposed by Richards and How [13, 14] in which each system sends information about its predicted future states once in each sampling period. Via a suitable sequential ordering of the individual optimizations it is then ensured that the coupling state constraints are maintained whenever the optimization problems are feasible, i.e., whenever optimal solutions exist. Clearly, requiring a strict sequential order is a drawback of this approach, which we will attempt to relax in future research. Still, the numerical effort of this scheme is already significantly lower than for a centralized solution of the optimization problem, cf. the discussion after Algorithm 3.1, below.
In a stabilization setting, the optimal control problem to be solved online in the NMPC iteration usually minimizes the distance to the desired equilibrium. Often, additional stabilizing terminal constraints and costs are imposed in order to ensure asymptotic stability of the resulting closed loop. This means that the optimization on the finite horizon in each sampling instant is performed over those trajectories which, at the end of the optimization horizon, end up in the terminal constraint set, which is typically a neighborhood of the equilibrium to be stabilized. These terminal constraints also play a vital role for ensuring both stability and feasibility in the scheme of Richards and How. In certain situations, however, imposing terminal constraints has the significant drawback that rather long optimization horizons are needed in order to ensure the existence of trajectories which end up in the terminal constraint sets. Furthermore, stabilizing terminal constraints may have negative effects on the performance of the scheme, see, e.g., [7, Section 8.4]. As we will see in the detailed description in Section 3, in the distributed setting the terminal constraint formulation has the additional drawback that possible conflicts between the individual systems, i.e., violations of the coupling state constraints, have to be resolved in an initialization step.
The contribution of this paper is to give sufficient conditions under which we can ensure stability and feasibility without stabilizing terminal constraints. In the non-distributed setting, several approaches for this purpose have been developed, e.g., in [5, 6, 8, 9]. Here we use the approach developed in [6, 8], which relies on an asymptotic controllability assumption taking into account the stage cost of the finite horizon optimal control problems. We develop an extension of this condition to the distributed setting and verify that this condition holds for a simple test example of moving agents in a plane, where the coupling constraints are formulated in order to avoid collisions between the agents. Numerical simulations for this example illustrate that with this scheme stability can be achieved with short optimization horizons and that the scheme allows conflicts between the individual systems to be resolved once they become "visible", i.e., at the runtime of the system rather than in an initialization step.
The paper is organized as follows. In Section 2 we describe the problem formulation and in Section 3 we present the algorithm of Richards and How [13, 14] and discuss its main features. In Section 4 we recall the controllability based stability analysis for NMPC schemes from [6, 8]. Section 5 contains the main result of this paper, i.e., a distributed version of this controllability condition and the corresponding stability result. In Section 6 we investigate a simple test example theoretically and numerically. Section 7 concludes the paper and presents some ideas for future extensions of our main result.

Problem setup and preliminaries
We consider P ∈ N control systems described by the discrete time dynamics

x_p(k+1) = f_p(x_p(k), u_p(k)),  p = 1, …, P,    (1)

with x_p(k) ∈ X_p, u_p(k) ∈ U_p and f_p : X_p × U_p → X_p, where the X_p are arbitrary metric spaces and the U_p are sets of admissible control values for p = 1, …, P.
The solution of (1) for initial value x_p(0) = x^0_p and control sequence u_p(k) ∈ U_p, k = 0, 1, 2, …, will be denoted by x_{u_p}(k, x^0_p), i.e., we will omit the subscript p in u_p in order to simplify the notation. The combined state space of all systems will be denoted by X = X_1 × … × X_P.
Our goal is to stabilize each system at a desired equilibrium point x*_p ∈ X_p. This means that we are looking for feedback controllers μ_p(x_p(k), I_p(k)) ∈ U_p which render the respective equilibria asymptotically stable. Here the additional argument I_p(k) of the controller μ_p denotes information from the other systems. We assume that for the purpose of exchanging such information the individual systems can communicate over a network with negligible delay. The precise definitions of I_p(k) and of the controller μ_p are given in Definition 2.3 and Formula (5), below. The closed loop solutions of (1) with controller μ_p, i.e., the solutions of

x_p(k+1) = f_p(x_p(k), μ_p(x_p(k), I_p(k))),    (2)

will be denoted by x_p(k), i.e., in order to simplify the notation we will not explicitly include the controller μ_p, the initial value x_p(0) and the additional information I_p in the notation.
Beyond ensuring stability, we want to design the controllers such that the combined state x(k) = (x_1(k), …, x_P(k)) of the closed loop systems satisfies state constraints of the form

x(k) ∈ 𝕏 for all k ∈ N_0,    (3)

i.e., the state constraints are defined via a state constraint set 𝕏 ⊂ X. Note that these constraints induce a coupling between the otherwise independent systems, which creates the need for passing information I_p(k) between the subsystems.
Example 2.1. As an example, which will be used in order to illustrate our concepts throughout this paper, we consider a very simple model of P autonomous agents, p = 1, …, P, moving in the plane R², with state x_p = (x_{p,1}, x_{p,2})^T ∈ X_p = R², control values u_p ∈ U_p = [−ū, ū]² for some ū > 0 and dynamics

x_p(k+1) = x_p(k) + u_p(k).

Thinking of x_p(k) as the position of the individual agent in the plane, the state constraints can be used in order to avoid collisions of the agents. To this end, for some desired distance δ > 0 we define

𝕏 := {x ∈ X : ‖x_p − x_q‖ ≥ δ for all p ≠ q},

where ‖·‖ denotes an arbitrary norm in R². If we use a specific norm in the subsequent computations then this will always be explicitly stated.
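The example can be sketched in a few lines of code. The dynamics x_p(k+1) = x_p(k) + u_p(k) and the pairwise distance constraint are the reconstructed forms from the text; the values ū = 0.5 and δ = 0.2 below are purely illustrative.

```python
import itertools
import numpy as np

def step(x_p, u_p, u_bar=0.5):
    """One step of the agent dynamics x_p(k+1) = x_p(k) + u_p(k),
    with the control clipped to the admissible box [-u_bar, u_bar]^2."""
    u_p = np.clip(u_p, -u_bar, u_bar)
    return x_p + u_p

def constraints_ok(positions, delta=0.2, norm_ord=np.inf):
    """Check the coupling constraint of the example: all pairwise
    distances between agents are at least delta (in a chosen norm)."""
    return all(np.linalg.norm(xp - xq, ord=norm_ord) >= delta
               for xp, xq in itertools.combinations(positions, 2))
```

The norm is a parameter since the text leaves the choice of norm open.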
Clearly, in order to be able to maintain the state constraints in closed loop, i.e., to avoid collisions in the example, the individual controllers need to have some information about the other systems, and for this purpose we will use the so far undefined information I_p(k). In order to define what kind of information I_p(k) the systems should exchange, we first need to specify the control algorithm we are going to use. In this paper we propose to use a model predictive (or receding horizon) control approach. To this end, at each time instant k each agent solves for its current state x_p(k) the optimal control problem

minimize J^N_p(x^0_p, u_p) = ∑_{j=0}^{N−1} ℓ_p(x_{u_p}(j, x^0_p), u_p(j)) with initial value x^0_p = x_p(k)    (4)

over all admissible control sequences u_p(·) ∈ U^{N,ad}_p(k, x^0_p, I_p(k)) ⊆ U^N_p on the optimization horizon N ≥ 2, where the set of admissible control sequences U^{N,ad}_p will be defined in Definition 2.3, below. Here ℓ_p is a stage cost function which penalizes the distance of the state from the equilibrium and the control effort. For instance, ℓ_p could be ℓ_p(x_p, u_p) = ‖x_p − x*_p‖ + λ‖u_p‖ or ℓ_p(x_p, u_p) = ‖x_p − x*_p‖² + λ‖u_p‖², where λ > 0 is a weight parameter.
We denote the optimal control sequence for (4) by u^{*,k}_p(0), …, u^{*,k}_p(N−1) and the corresponding predicted optimal trajectory by x_{u^{*,k}_p}(0), …, x_{u^{*,k}_p}(N−1). According to the usual receding horizon construction, the value of the MPC controller is given by the first element u^{*,k}_p(0) of the optimal control sequence.
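The receding horizon construction just described can be sketched for a single agent of Example 2.1. The coarse control grid, the brute-force enumeration (standing in for a proper optimizer) and all parameter values are illustrative assumptions, not part of the text.

```python
import itertools
import numpy as np

def stage_cost(x, u, x_star, lam=0.1):
    # One of the stage costs suggested in the text:
    # l_p(x_p, u_p) = ||x_p - x_p*||^2 + lambda ||u_p||^2
    return float(np.sum((x - x_star) ** 2) + lam * np.sum(u ** 2))

def mpc_step(x0, x_star, N=3, u_bar=0.5, lam=0.1):
    """One receding horizon step: approximately solve (4) by enumeration
    over a coarse control grid and return the first element u*(0) of the
    minimizing sequence (dynamics x(k+1) = x(k) + u(k) as in Example 2.1)."""
    grid = [np.array(u) for u in itertools.product((-u_bar, 0.0, u_bar), repeat=2)]
    best_u0, best_J = None, float("inf")
    for seq in itertools.product(grid, repeat=N):
        x, J = np.array(x0, dtype=float), 0.0
        for u in seq:
            J += stage_cost(x, u, x_star, lam)
            x = x + u  # predict one step ahead
        if J < best_J:
            best_J, best_u0 = J, seq[0]
    return best_u0
```

Iterating `mpc_step` and applying only the returned first control element reproduces the closed loop (2) for a single, uncoupled agent.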
In order to define this MPC feedback law in a rigorous way, we need to define the set of admissible control sequences in the optimization (4) for the p-th system. To this end, we make use of the following definition.

Definition 2.2. (i) For an index set 𝒫 = {p_1, …, p_m} ⊆ {1, …, P} with m ∈ N, m ≤ P, we define the set of partial states as X_𝒫 := X_{p_1} × … × X_{p_m}. Elements of X_𝒫 will be denoted by x_𝒫 = (x_{p_1}, …, x_{p_m}). The partial state constraint set 𝕏_𝒫 ⊂ X_𝒫 consists of those partial states which satisfy the constraints defining 𝕏 restricted to the components p_1, …, p_m.
(ii) Given an index set 𝒫, an element x_𝒫 ∈ X_𝒫, an element x_p ∈ X_p with p ∈ {1, …, P} and a subset Q = {q_1, …, q_l} ⊂ 𝒫, we write (x_p, (x_q)_{q∈Q}) := (x_p, x_{q_1}, …, x_{q_l}) ∈ X_{{p}∪Q}.
The admissible control sequences over which we optimize in (4) are now defined via the information available from the other agents according to the following definition.
Definition 2.3. (i) We assume that at time instant k, when optimizing (4) for x^0_p = x_p(k), the p-th agent knows prediction sequences x^{k_q}_q(·) = (x^{k_q}_q(0), …, x^{k_q}_q(N−1)) for q ∈ {1, …, P} \ {p}, computed at time instants k_q ≤ k by the other agents. We define I_p(k) as the collection of these prediction sequences. Note that I_p(k) lies in the set I_p of all such collections.
(ii) Given a time k ∈ N_0 and I_p ∈ I_p with k_q ≤ k for all k_q contained in I_p, we define the set of admissible control sequences U^{N,ad}_p(k, x^0_p, I_p) for system p at time k. The trajectories x_{u_p}(·, x^0_p) for u ∈ U^{N,ad}_p(k, x^0_p, I_p) are called admissible trajectories. In words, this definition demands that the minimization in (4) is performed over those trajectories which satisfy the state constraints together with the known predictions from the other systems for j = 0, …, N−1.
The resulting feedback law μ_p thus depends on the current state x_p(k) of the p-th closed loop system and on the other systems' predictions x^{k_q}_q(·), q ≠ p, available at time k. For I_p(k) ∈ I_p the resulting MPC controller is hence given by the map

μ_p(x_p(k), I_p(k)) := u^{*,k}_p(0),    (5)

where u^{*,k}_p(·) is the optimal control sequence minimizing (4). For later use we define the associated optimal value function as V^N_p(x^0_p, I_p) := min_{u_p ∈ U^{N,ad}_p(k, x^0_p, I_p)} J^N_p(x^0_p, u_p). In order not to overload the notation, it does not reflect the implicit k-dependence of μ_p and V^N_p. Moreover, for simplicity of exposition, throughout the paper we assume that the minimum of this expression exists whenever U^{N,ad}_p(k, x^0_p, I_p) ≠ ∅. The important questions to be analyzed for this system are the following:

• Do the resulting closed loop systems (2) maintain the state constraints (3)?
• Are the optimization problems feasible in each step, i.e., is the set of admissible control sequences U^{N,ad}_p(k, x^0_p, I_p(k)) in the minimization of (4) non-empty?
• Is the closed loop system (2) asymptotically stable; in particular, do the trajectories x_p(k) converge to the fixed points x*_p as k → ∞?
These are the questions we want to investigate in this paper. Clearly, the precise way in which the information I_p(k) is constructed is crucial for answering these questions.
To this end, in the following section we investigate an algorithm in which the construction of the sets I_p(k) implies that feasibility is sufficient for maintaining the state constraints, cf. Proposition 3.2.

The scheme of Richards and How
In this section we define how the information I_p(k) is constructed and according to which schedule the information is passed from one system to the others. To this end, we use the sequential scheme introduced by Richards and How in [13, 14]. It should be noted that the general setting in these references is different from ours: on the one hand, only linear dynamics are considered there; on the other hand, perturbations are explicitly included in the models considered in [13, 14] and the MPC scheme is designed to be robust against these perturbations.
The main idea of how the distributed optimization takes place, however, is independent of these details. Using the notation introduced in the last section, this idea is described in the following algorithm.
Algorithm 3.1.

(0) Initialization: find control sequences u_p ∈ U^N_p, p = 1, …, P, such that the corresponding trajectories jointly satisfy

(x_{u_1}(j, x_1(0)), …, x_{u_P}(j, x_P(0))) ∈ 𝕏 for j = 0, …, N−1,    (6)

set μ_p(x_p(0), I_p(0)) := u_p(0) and transmit the predictions x^0_p(·) := x_{u_p}(·, x_p(0)) to the other systems.

(1) for k = 1, 2, …:
  for p = 1, …, P: set
    I_p(k) := {x^k_q(·) : q < p} ∪ {x^{k−1}_q(·) : q > p}
  and minimize (4) for x^0_p = x_p(k) with respect to u_p ∈ U^{N,ad}_p(k, x^0_p, I_p(k)). Denote the resulting optimal control sequence by u^{*,k}_p, set μ_p(x_p(k), I_p(k)) := u^{*,k}_p(0) and transmit the prediction x^k_p(·) := x_{u^{*,k}_p}(·, x_p(k)) to the other systems.

This scheme is sequential in the sense that in step (1) the individual systems perform their optimization one after the other before the control values are eventually applied in all systems. Note that system p always uses the most recent available predictions of the other systems in order to construct the set of admissible control sequences U^{N,ad}_p, i.e., for q < p the predictions x^k_q made at time k are used and for q > p the predictions x^{k−1}_q computed at time instant k−1 are used in I_p(k). In case of a large number P of systems this sequential optimization may cause rather long waiting times which may not be available in case of fast sampling. While one goal of future research will thus be to relax the strict sequential structure, see also Section 7, below, we remark that the scheme is well applicable for small values of P and, as pointed out in [14, Section 7], even for large P the scheme considerably reduces the numerical effort compared to a centralized solution of the optimization problem in each time instant.
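One sampling period of the sequential scheme can be sketched for Example 2.1 as follows. The function name, the inf-norm collision check, the coarse control grid and all parameter values are illustrative assumptions, and the brute-force enumeration merely stands in for the optimization in (4).

```python
import itertools
import numpy as np

def seq_dmpc_round(states, preds, targets, delta=0.25, N=2, u_bar=0.5, lam=0.1):
    """One sampling period: agents p = 1, ..., P optimize one after the
    other, each against the most recent predictions of the others, then
    every agent applies the first element of its plan (requires N >= 2)."""
    P = len(states)
    grid = [np.array(u) for u in itertools.product((-u_bar, 0.0, u_bar), repeat=2)]
    new_states = list(states)
    for p in range(P):                        # strict sequential order
        best_seq, best_traj, best_J = None, None, float("inf")
        for seq in itertools.product(grid, repeat=N):
            x, J, traj, feasible = np.array(states[p], dtype=float), 0.0, [], True
            for j, u in enumerate(seq):
                # coupling constraint against the other agents' predictions:
                # keep the inf-norm distance at least delta at each step j
                if any(np.max(np.abs(x - preds[q][j])) < delta
                       for q in range(P) if q != p):
                    feasible = False
                    break
                J += np.sum((x - targets[p]) ** 2) + lam * np.sum(u ** 2)
                traj.append(x)
                x = x + u                     # dynamics of Example 2.1
            if feasible and J < best_J:
                best_seq, best_traj, best_J = seq, traj, J
        if best_seq is None:
            raise RuntimeError("optimization problem infeasible for agent %d" % p)
        preds[p] = best_traj                  # transmit the new prediction
        new_states[p] = best_traj[1]          # closed loop state x_p(k+1)
    return new_states, preds
```

Note how, within one round, agent p already sees the new predictions of agents q < p but still the old ones of agents q > p, exactly as in the construction of I_p(k).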
The main advantage of the sequential scheme is that once the initialization step (0) has been performed successfully, the validity of the state constraints for the closed loop solution follows from feasibility. This is made precise in the following proposition.

Proposition 3.2. Assume that in Algorithm 3.1 the initialization step (0) is successful in finding u_p ∈ U^N_p satisfying (6) and that in step (1) the optimal control problems are feasible, i.e., that U^{N,ad}_p(k, x_p(k), I_p(k)) ≠ ∅ holds for all p = 1, …, P and all k ≥ 1. Then the closed loop maintains the state constraints (3) for all k ≥ 0.
Proof. Condition (6) and the definition of μ_p in step (0) immediately imply (3) for k = 1. Now we proceed by induction over k. Assume that (3) holds for some k ≥ 1 and that U^{N,ad}_p(k, x_p(k), I_p(k)) ≠ ∅ holds for all p = 1, …, P. Then each μ_p defined in step (1) is well defined, and the definition of U^{N,ad}_P(k, x_P(k), I_P(k)) for the last system p = P implies that the predictions of all systems jointly satisfy the state constraints. By definition of the μ_p and (2) we obtain x_p(k+1) = x^k_p(1) for all p = 1, …, P and thus x(k+1) ∈ 𝕏. This shows (3) for k + 1.
In order to ensure U^{N,ad}_p(k, x_p(k), I_p(k)) ≠ ∅, in [14] a condition involving terminal constraint sets is used. The following assumption summarizes this condition in our notation and without the additional constructions needed for the robust design in [14].

Assumption 3.3. There exist closed neighborhoods T_p, p = 1, …, P, of the equilibria x*_p satisfying the following conditions.
(i) T_1 × … × T_P ⊂ 𝕏.
(ii) On each T_p there exists a stabilizing controller K_p for x_p such that T_p is forward invariant for the closed loop system using K_p.
(iii) The control functions u_p in the initialization step (0) and in the optimization of (4) in step (1) are such that x_{u_p}(N, x_p(k)) ∈ T_p holds. In the optimization, this amounts to adding x_{u_p}(N, x_p(k)) ∈ T_p as a further condition to the definition of the admissible control sequences U^{N,ad}_p(k, x^0_p, I_p(k)).
The benefit of this condition is that if the computation of u_1, …, u_P satisfying (6) in step (0) is successful at time k = 0, then U^{N,ad}_p(k, x^0_p, I_p(k)) ≠ ∅ is ensured for all subsequent times k ≥ 1 and all p = 1, …, P. In order to see this, consider the control sequence u^{*,k−1}_p from the previous time step k − 1 in step (1) for p = 1, …, P, and define u_p as its shifted version, extended by one value of the controller K_p from Assumption 3.3(ii). Then the construction of I_q(k−1) for q > p and I_q(k) for q < p ensures that the corresponding trajectory remains admissible. Since the predictions of all other systems q ≠ p also end up in their respective sets T_q and T_1 × … × T_P ⊂ 𝕏, we obtain u_p ∈ U^{N,ad}_p(k, x^0_p, I_p(k)).
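The shifted candidate sequence used in this feasibility argument can be sketched generically; the helper and its argument names are hypothetical, not notation from the text.

```python
def shifted_candidate(prev_seq, x_terminal, terminal_controller):
    """Build the standard feasibility candidate at time k from the optimal
    sequence of time k-1: drop its first (already applied) element and
    append one value of the terminal controller K_p evaluated at the
    predicted terminal state."""
    return list(prev_seq[1:]) + [terminal_controller(x_terminal)]
```

Forward invariance of T_p under K_p (Assumption 3.3(ii)) is what makes the appended element admissible.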
Besides ensuring feasibility, Assumption 3.3 also ensures stability. Indeed, a standard MPC stability proof (cf. [10] or [12, Section 2.4]) shows that under a compatibility condition between the stage cost ℓ_p and a suitably chosen terminal cost, which is defined on T_p and added to J^N_p in (4), the optimal value function V^N_p becomes a Lyapunov function of the system, which proves stability. For this reason, the sets T_p in Assumption 3.3 are usually called stabilizing terminal constraints.
In the context of Example 2.1, the stabilizing terminal constraints demand that already in the initialization step (0) we have to plan collision free trajectories for all systems from the initial value x_p(0) to a neighborhood T_p of x*_p. On the one hand, this implies that we may need to use rather large optimization horizons N if we consider initial conditions x_p(0) far away from the terminal sets T_p. On the other hand, and more importantly in our distributed setting, Assumption 3.3 implies that all conflicts, i.e., possible collisions, until the "safe" terminal constraint sets T_p are reached have to be resolved in the initialization step (0). Although in each iteration in step (1) the optimization algorithm is allowed to replan the trajectory, condition (6) is crucial in order to ensure feasibility for k = 1 and thus, via Proposition 3.2, to ensure that the state constraints are maintained for all k ≥ 1.
The goal of this paper is now to relax these two drawbacks. While we will keep using Algorithm 3.1, we will not use Assumption 3.3 and in particular we will not require the solutions to end up in terminal constraint sets T_p. The hope is that this will enable us to obtain an MPC scheme which is stable and maintains the state constraints with a considerably smaller optimization horizon N and which, in the context of Example 2.1, is able to resolve possible conflicts at the times k ≥ 1 when they become visible and not necessarily in the initialization step (0).
To this end, in the next section we first revisit a stability condition for NMPC schemes without stabilizing terminal constraints.

Stability of NMPC without stabilizing terminal constraints
In this section we recall the stability analysis of NMPC controllers without stabilizing terminal constraints from [6, 8]. We present the analysis for a single system of type (1). In the subsequent Section 5, we extend these results to our setting with P systems.
Since in this section we deal with a single system of type (1), we omit the index p in all expressions as well as the dependence of V^N and μ on information from the other systems. Analogous to Definition 2.3, admissibility of a control sequence u ∈ U^N for an initial value x^0 ∈ X means that u(j) ∈ U and x_u(j, x^0) ∈ 𝕏 for j = 0, …, N−1, i.e., that the state constraints are maintained. Since in this section we do not consider couplings between different systems, Definition 2.3(ii) simplifies to

U^{N,ad}(x^0) := {u(·) ∈ U^N | x_u(j, x^0) ∈ 𝕏 for all j = 0, …, N−1}.    (7)

We assume that for each x ∈ 𝕏 and each N ∈ N this set satisfies U^{N,ad}(x) ≠ ∅, which means that the state constraint set 𝕏 ⊂ X is forward invariant or viable. This assumption provides the easiest way to ensure feasibility of the resulting NMPC scheme and is used here in order to simplify the exposition. If desired, it can be relaxed in various ways, see, e.g., [7] or [11, Theorem 3].
Stability of the NMPC closed loop is established by showing that the optimal value function V^N is a Lyapunov function for the system. More precisely, we aim at giving conditions under which for all x ∈ 𝕏 we can establish the inequalities

α_1(‖x − x*‖) ≤ V^N(x) ≤ α_2(‖x − x*‖)    (8)

and

V^N(f(x, μ(x))) ≤ V^N(x) − α ℓ(x, μ(x))    (9)

for α_1, α_2 ∈ K_∞ and α ∈ (0, 1]. Then, under the additional assumption that

α_3(‖x − x*‖) ≤ ℓ*(x) ≤ α_4(‖x − x*‖)    (10)

holds for all x ∈ 𝕏, suitable α_3, α_4 ∈ K_∞ and ℓ*(x) := min_{u∈U} ℓ(x, u), we can conclude asymptotic stability as stated by the following theorem.
Theorem 4.1. Assume that the inequalities (8), (9) and (10) hold for the optimal value function V^N and the stage cost ℓ of the optimal control problem (4) for one system, i.e., for p = P = 1. Then the closed loop system (2) with the NMPC feedback (5) is asymptotically stable on 𝕏.
Proof. The proof follows by standard Lyapunov function arguments using V^N as a Lyapunov function, see [6, Theorem 5.2].
The inequalities (8) and (9) can be ensured by an asymptotic controllability condition for the equilibrium x*. Here we work with the special case of exponential controllability; more general versions can be found in [6, 8].

Assumption 4.2. Given constants C > 0, σ ∈ (0, 1), for each x ∈ 𝕏 and each N ∈ N there exists an admissible control function u_x ∈ U^{N,ad}(x) satisfying

ℓ(x_{u_x}(j, x), u_x(j)) ≤ C σ^j ℓ*(x) for all j ∈ {0, …, N−1}

with ℓ* from (10).
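Assumption 4.2 can be illustrated numerically. The check below uses a simpler feedback than the construction in Example 4.3 further down, namely u = −ρ(x − x*) for the dynamics x(k+1) = x(k) + u(k), and ignores the control bound ū; a short computation gives ℓ(x_u(j, x), u(j)) = (1 + λρ²)(1 − ρ)^{2j} ‖x − x*‖², so the assumption holds with C = 1 + λρ² and σ = (1 − ρ)². The function name and parameter values are illustrative.

```python
import numpy as np

def check_exponential_controllability(x0, x_star, rho=0.5, lam=0.1, N=10):
    """Verify l(x_u(j, x), u(j)) <= C sigma^j l*(x) along the trajectory
    of the feedback u = -rho (x - x*) for x(k+1) = x(k) + u(k), with
    C = 1 + lam rho^2 and sigma = (1 - rho)^2."""
    C, sigma = 1.0 + lam * rho ** 2, (1.0 - rho) ** 2
    lstar = float(np.sum((x0 - x_star) ** 2))  # l*(x) = min_u l(x, u), at u = 0
    x = np.array(x0, dtype=float)
    for j in range(N):
        u = -rho * (x - x_star)
        l = np.sum((x - x_star) ** 2) + lam * np.sum(u ** 2)
        assert l <= C * sigma ** j * lstar + 1e-9  # bound of Assumption 4.2
        x = x + u
    return C, sigma
```

Since σ = (1 − ρ)² < 1, the stage cost decays exponentially along the trajectory, as required.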
Observe that the controllability condition is defined here in a slightly weaker form than in [6, 8] in the sense that the control function u_x is implicitly allowed to depend on N, while in [6, 8] the existence of one u_x for all N ∈ N is assumed. However, it is straightforward to see that the weaker condition given here is sufficient for all arguments used in the proofs in these references. Note that the constant C > 0 allows for an increase of ℓ(x_{u_x}(j, x), u_x(j)) for small j before it must eventually decrease. In particular, ℓ does not need to be a control Lyapunov function for the system.

Example 4.3. Consider Example 2.1 with only one system, which in particular implies that the state constraint set does not include any coupling terms. Instead, we use the state constraint set 𝕏 = [−1, 1]². As stage cost we use ℓ(x, u) = ‖x − x*‖² + λ‖u‖² for some x* ∈ [−1, 1]² and some λ ≥ 0. Moreover, let c := max_{x∈𝕏} ‖x − x*‖ denote the maximal distance in 𝕏 from x*.
We inductively define a control u ∈ U^{N,ad}(x) by

u(k) := −κ (x_u(k, x) − x*) with κ := min{ρ, ū/c}    (11)

for some design parameter ρ ∈ (0, 1). Note that the choice of κ implies u(k) ∈ [−ū, ū]² for x_u(k, x) ∈ 𝕏. Moreover, this definition implies

x_u(k+1, x) − x* = (1 − κ)(x_u(k, x) − x*)    (12)

and, as a consequence,

x_u(k, x) − x* = (1 − κ)^k (x − x*).    (13)

Due to the convexity of 𝕏 and κ ∈ (0, 1), the identity (12) ensures feasibility of x_u(·). Using the definition of u(k) and (12) yields

ℓ(x_u(k, x), u(k)) = (1 + λκ²)(1 − κ)^{2k} ‖x − x*‖².    (14)

Lemma 4.4. Under Assumption 4.2, for each x ∈ 𝕏 the following properties hold.
(i) The inequality V^N(x) ≤ ∑_{j=0}^{N−1} C σ^j ℓ*(x) holds.
(ii) This inequality follows from (i) applied to x = x_{u*}(k, x), using the fact that, by the dynamic programming principle, tails of optimal trajectories are again optimal trajectories; see [6, Lemma 3.4] for details.
(iii) This follows from the corresponding inequality with u_x from Assumption 4.2, applied at x = x_{u*}(1+j, x), together with (i); for details see [6, Lemma 3.5].
The conditions (16) and (17) lead to the following sufficient condition for (9).
The characterization of α via the optimization problem (18) is particularly useful because it admits the following explicit analytic solution.
Remark 4.9. An inspection of the proof of [8, Theorem 5.4] shows that some inequalities provided by Lemma 4.4 are not needed in order to prove (19), since in this proof a relaxed problem [8, Problem 5.3] with fewer constraints was used. It turns out that the inequalities not needed in this relaxed problem are exactly (14) for k = 1, …, N−2 or, equivalently, (16) for k = 1, …, N−2, see [7, Remark 6.35]. While this has no consequence for the analysis in this section, since we get all inequalities in (14) "for free" from Assumption 4.2, this observation will turn out very useful in the next section.
Combining Theorems 4.1, 4.7 and 4.8 yields the following corollary.
Corollary 4.10. Consider a single system of type (1) and the NMPC feedback law (5) for some N ≥ 2. Let Assumption 4.2 and (10) hold, and assume that α > 0 holds for α from (19). Then the closed loop system (2) is asymptotically stable on 𝕏.
Using the convergence α → 1 for N → ∞, we can use this corollary in order to conclude that whenever (10) and Assumption 4.2 hold, asymptotic stability can be guaranteed for each sufficiently large optimization horizon N. Beyond this asymptotic result, however, the condition α > 0 in (19) also gives a useful stability criterion for small optimization horizons N, as the following example shows.
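Formula (19) itself is not reproduced in this excerpt. As a sketch, the explicit expression from [8] can be transcribed as follows; this transcription, with γ_i = C(1 − σ^i)/(1 − σ), is our assumption about the omitted formula and should be checked against [8].

```python
import math

def gamma(i, C, sigma):
    # gamma_i = C * sum_{j=0}^{i-1} sigma^j = C (1 - sigma^i) / (1 - sigma)
    return C * (1.0 - sigma ** i) / (1.0 - sigma)

def alpha(N, C, sigma):
    """Stability/suboptimality index alpha for horizon N >= 2 under
    exponential controllability with overshoot C and decay rate sigma
    (transcribed from the explicit formula in [8]); alpha > 0 implies
    asymptotic stability by Corollary 4.10."""
    g = [gamma(i, C, sigma) for i in range(2, N + 1)]
    prod_g = math.prod(g)
    prod_gm1 = math.prod(gi - 1.0 for gi in g)
    return 1.0 - (g[-1] - 1.0) * prod_gm1 / (prod_g - prod_gm1)
```

For instance, for C = 2 and σ = 0.5 this transcription gives α < 0 for N = 2, …, 5 but α ≈ 0.30 > 0 for N = 6, illustrating how the criterion singles out the shortest stabilizing horizon.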
More complex examples of this kind, including infinite dimensional PDE models, can be found, e.g., in [6, Sections 6 and 7] or [1, 2, 7]. Finally, we remark that α also allows us to estimate the performance of the MPC feedback law μ in terms of an infinite horizon optimization criterion; for details see, e.g., [6, Theorem 4.2].

Stability of distributed NMPC without stabilizing terminal constraints
In this section we adapt the results of the previous section to the distributed MPC setting introduced in Section 2 using Algorithm 3.1. The goal is to adapt Assumption 4.2 to the distributed setting. This way we derive a sufficient condition for distributed NMPC without stabilizing terminal constraints which ensures feasibility of the optimal control problems in Algorithm 3.1(1), and thus via Proposition 3.2 guarantees that the state constraints are maintained, as well as stability of the NMPC closed loop. Stability will be guaranteed by showing that each optimal value function V^N_p satisfies the inequalities (8) and (9), i.e., that each V^N_p is a Lyapunov function for the corresponding system.
Comparing the distributed setting of Section 2 with the non-distributed setting of Section 4, the main difference is that the set of admissible control sequences U^{N,ad}_p in Definition 2.3(ii) changes with time k, due to the fact that the information I_p(k) in Algorithm 3.1(1) also changes with time. In contrast to this, the set U^{N,ad} in (7) is constant over time. In order to include the time dependence in the controllability assumption we make use of sets of admissible control sequences according to the following definition. Recalling that in our setting the admissible control sequences are derived from the state constraint set 𝕏 and the predicted trajectories of the other systems contained in I_p via Definition 2.3(ii), a little computation reveals that for each time instant k ≥ 0 the sets W^m_p = U^{m,ad}_p(k, x^0_p, I_p), m ∈ N, are nested and that this choice of W^m_p implies W_p[u, l, m] = U^{m,ad}_p(k + l, x_{u_p}(l, x^0_p), I_p). Another issue we take into account when adapting Assumption 4.2 is that in the distributed setting it is quite demanding to assume that controllability holds for all possible initial values. Instead, we will formulate the respective condition for fixed initial conditions. The following theorem presents this variant in an abstract setting with nested admissible control sequence sets W^m_p and W̄^m_p. In the subsequent Theorem 5.3 we will then show how this condition fits into Algorithm 3.1.
Theorem 5.2. Consider some p ∈ {1, …, P}, two families of nested admissible control sequence sets W^m_p, W̄^m_p ⊆ U^m_p, m ∈ {1, …, N}, for N ≥ 2, a point x^0_p ∈ X_p and the corresponding optimal values. Then the closed loop solutions maintain the state constraints (3), there exist α_1, α_2 ∈ K_∞ such that the optimal value functions V^N_p satisfy the analogue of (8) for all k ≥ 1, and the analogue of inequality (9) holds for α from (19) and all k ≥ 1.
In particular, if α > 0 (which always holds for N sufficiently large), then the V^N_p are Lyapunov functions for the closed loop systems for k ≥ 1 and thus asymptotic stability of the equilibria x*_p follows.
Proof. We show that for each k ≥ 2 the assumptions of Theorem 5.2 are satisfied.

The central assumption in Theorem 5.3 is that condition (iii) of Theorem 5.2 holds. In words, this assumption requires two things: first, U^{N,ad}_p(k, x_p(k), I_p(k)) needs to be non-empty, which means that, given the predictions of the other systems x_{u_q}, q ≠ p, contained in I_p, there is still enough space to "squeeze in" a solution x_{u_p}. Second, the condition requires that, starting from any point on the optimal open loop trajectory from the last time instant, there are solutions which approach the equilibrium x*_p sufficiently fast in the sense of the controllability assumption. The important fact in this condition is that when the p-th system selects its control it knows the other systems' predictions. For this reason this rather technical condition can be rigorously verified at least for simple systems, as the example in the following section shows.
Note that even though step (0) remains formally identical to Algorithm 3.1, without the additional terminal condition from Assumption 3.3(iii) and with smaller N it is much easier to satisfy (6). This is illustrated in the numerical simulations at the end of the next section, in which for most of the systems the state constraints only become relevant after several steps of the algorithm.

Our final numerical experiment shows that the optimization horizon N needed in order to obtain stability of the closed loop heavily depends on the initial values, i.e., on the way the individual agents meet and on the directions in which they are heading when they meet. Due to the fact that the control constraints u_p ∈ [−ū, ū]² are box constraints, which allow the systems to move faster and more flexibly in diagonal directions, it is not surprising that avoiding conflicts is easier when the agents approach each other diagonally. This explains why resolving the conflict is easier in the situation of Figure 3, in which the optimization horizon N = 3 is sufficient in order to stabilize all P = 4 agents. A further interesting observation in this example is that the resulting trajectories are not symmetric as in Figures 1 and 2. Rather, here one sees the effect of the sequential ordering, since x_1 and x_3 approach their respective equilibria directly (with x_3 performing a shorter step at time k = 7 in order to avoid the collision with x_1) while x_2 and x_4 are forced to take small detours. Due to the short horizons N, most of the conflicts, i.e., the possible collisions, are resolved by the optimization at runtime in step (1) and not in the initialization in step (0) of Algorithm 3.1. The only exception is the fourth system starting at x_4(0) = (0, −1)^T in Figure 2, which at the time of its first optimization in step (0) already knows all other systems' predictions. Hence, the simulations nicely illustrate our scheme's ability to resolve conflicts at runtime.

Conclusion and future work
In this paper we have shown that the non-cooperative distributed model predictive control scheme from [13, 14] can be formulated without stabilizing terminal constraints. An extension of the controllability based NMPC stability analysis from [6, 8] yields a sufficient distributed controllability condition ensuring stability and feasibility. Numerical examples show that the resulting scheme is able to stabilize a test system with small optimization horizons N and illustrate the scheme's ability to resolve conflicts at runtime.
We regard the analysis in this paper as a first step which can and needs to be improved in many ways. The controllability conditions (i) and (ii) required in Theorem 5.3 are certainly difficult to check for systems more complex than Example 2.1, and even for Example 2.1 with large P a rigorous verification currently appears out of reach. In fact, the inequality (22) resulting from the controllability condition is a quite strong property by itself, in the sense that in each sampling period each optimal value function V p is decreasing. This property is most likely too demanding in many applications, in which one would rather expect that in each step only some of the V p decrease while others may even increase. ISS small-gain arguments for large systems as in [3] may be suitable for handling such situations; however, so far it is an open question what kind of controllability properties are needed in order to ensure the appropriate small-gain properties of the NMPC closed-loop systems.
Another extension will be to relax the requirement of strict sequential optimization imposed in Algorithm 3.1. A first step in this direction could be to exploit the fact that stability can also be expected if the optimization is performed only in each m-th sampling period for m ∈ {1, . . ., N − 1}, cf. [8]. This enables us to set up a cyclic scheme in which in each sampling period only a subset of systems performs an optimization. While this reduces the waiting time, it does not completely remove the sequential order of the optimization. The development of a scheme in which the optimization is performed completely in parallel remains a major challenge.

Now we continue by constructing two alternative trajectories x u r (k) and x u l (k) for k ≥ k 2 + 1, cf. Figure 4 (left).
j) for j = 0, . . ., N − 1 and send (k p , x k p p (•)) to all other systems
Apply the control value µ p (x 0 p ) = u p (0) in the first step.
end of p-loop
(1) Control loop for k ≥ 1.
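The fragment above is the skeleton of the sequential scheme: an initialization sweep in which the agents optimize one after the other, followed by a control loop in which each agent applies the first element of its prediction and re-optimizes. A minimal runnable sketch of this control flow follows, assuming simple scalar integrator dynamics x p (k+1) = x p (k) + u p (k) and a placeholder "optimizer" that merely steps toward the target; both are our assumptions and stand in for the paper's actual constrained optimal control problem.

```python
# Sketch of the sequential MPC control flow (steps (0) and (1)), not of the
# actual optimization: `predict` is a hypothetical placeholder optimizer.

def predict(x, target, N, u_max=0.5):
    """Placeholder for the finite-horizon optimal control problem:
    move toward `target` with per-step magnitude bounded by u_max."""
    traj, state = [x], x
    for _ in range(N):
        step = max(-u_max, min(u_max, target - state))
        state += step
        traj.append(state)
    return traj

def sequential_mpc(x0, targets, N=3, steps=5):
    P = len(x0)
    x = list(x0)
    predictions = {}
    # Step (0): initialization; agents optimize in sequence, so agent p
    # would already know the predictions of agents 1, ..., p-1.
    for p in range(P):
        predictions[p] = predict(x[p], targets[p], N)
    # Step (1): control loop; in each sampling period every agent applies
    # the first predicted control and re-optimizes, again in sequence.
    for _ in range(steps):
        for p in range(P):
            x[p] = predictions[p][1]  # apply first predicted move
            predictions[p] = predict(x[p], targets[p], N)
    return x

print(sequential_mpc([0.0, 2.0], [1.0, -1.0]))  # → [1.0, -0.5]
```

In the real scheme the communicated predictions enter each agent's constraints; here they are computed but uncoupled, since only the ordering and communication pattern is being illustrated.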

The assumptions of Theorem 5.2 hold with W m p = U m,ad p (k − 1, x p (k − 1), I p (k − 1)), W̃ m p = U m,ad p (k, x p (k), I p (k)), x 0 p = x p (k − 1), x̃ p = x p (k) and u * = u * ,k−1 p . To this end, first observe that in the discussion after Assumption 3.3 we have shown that in step (1) the relation u * ,k−1 p (1 + •) ∈ U N−1,ad p (k, x p (k), I p (k)) holds, which implies that condition (ii) of Theorem 5.2 is satisfied. Condition (iii) of Theorem 5.2 holds by assumption, and condition (i) of Theorem 5.2 at time k follows from condition (iii) for j = 0 at time k − 1, since x p (k) = x u * ,k−1 p (1, x p (k − 1)) and W m p at time k equals W̃ m p at time k − 1. Thus, Theorem 5.2 is applicable, which proves (22). Inequality (21) is then obtained with the same arguments as in Remark 4.5. Finally, since the assumed condition (iii) of Theorem 5.2 in particular demands U N,ad p (k, x p (k), I p (k)) ≠ ∅, Proposition 3.2 yields feasibility of the problem and implies that the closed-loop solutions satisfy the state constraints (3).

Fig. 4 Illustration of the trajectories used in the construction of x u 1 .

At least one of them is admissible and reaches T in a predetermined number of steps. The first elements of the corresponding control sequences are u r (k) = ( ū, y 2 (k + 1) − y 2 (k)) T and u l (k) = (− ū, y 2 (k + 1) − y 2 (k)) T .
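The two candidate controls differ only in the sign of the first (horizontal) component, while both track the same reference y 2 in the second component. A small sketch, assuming the planar integrator dynamics x(k+1) = x(k) + u(k) of the example (the function name and the concrete reference sequence are illustrative):

```python
# Hypothetical construction of the right- and left-passing candidate controls:
# both follow the vertical reference y2, passing with +u_bar or -u_bar
# horizontally; at least one of the two resulting trajectories avoids the
# conflict region.

def candidate_controls(y2, u_bar=1.0):
    """y2: reference sequence for the second state component.
    Returns the control sequences u_r(k) and u_l(k) as lists of 2D tuples."""
    u_r = [(u_bar, y2[k + 1] - y2[k]) for k in range(len(y2) - 1)]
    u_l = [(-u_bar, y2[k + 1] - y2[k]) for k in range(len(y2) - 1)]
    return u_r, u_l

u_r, u_l = candidate_controls([0.0, 0.5, 1.0])
print(u_r)  # [(1.0, 0.5), (1.0, 0.5)]
print(u_l)  # [(-1.0, 0.5), (-1.0, 0.5)]
```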