A prediction based control scheme for networked systems with delays and packet dropouts

We present a networked control scheme which uses a model based prediction and time-stamps in order to compensate for delays and packet dropouts in the transmission between sensor and controller and between controller and actuator, respectively. In order to analyze the properties of our scheme, we introduce the notion of prediction consistency which enables us to precisely state the network properties needed in order to ensure stability of the closed loop.


I. INTRODUCTION
Motivated by numerous emerging applications, networked control systems have received considerable attention in recent years. In this paper we consider a setting where the controller, the actuator and the sensor of a closed loop system communicate over a network in which the transmission is subject to (not necessarily small) delays and packet dropouts.
In order to compensate for delays, we add a model based predictor to our controller, as in, e.g., [1], [5], [6], [7] and the references therein. Based on the most recent measurement available at the controller, the predictor computes an estimate of the future state from which the controller determines the control signal. This signal is then sent to the actuator, and the prediction horizon is chosen large enough (based on an estimated bound of the transmission delay) such that the control signal can be expected to reach the actuator in time.
In order to compensate for packet dropouts (where delays which exceed the estimated bound for the transmission delay are treated as dropouts, too), the controller does not only compute a feedback value for the next sampling instant. Instead, it computes and transmits a whole feedback control sequence which is used by the actuator until the next sequence arrives.
The main difficulty in designing such a scheme lies in the fact that the control sequence used by the actuator needs to be known to the predictor before it is applied by the actuator. In fact, it needs to be known even before we can be sure whether it is successfully transmitted to the actuator. Thus, in order to ensure a faithful prediction, the main problem to be solved is to guarantee that the control sequences used for the prediction and at the actuator coincide, a property we call prediction consistency, cf. Definition 2.1. Introducing this property allows us to separate the robustness analysis of the controller with respect to (inevitable but usually small) prediction errors from the analysis of the basic correctness of the prediction scheme. Due to space limitations we can only sketch the application of this separation principle in this paper, cf. Theorem 2.2. More general stability proofs based on this principle will be addressed in future papers.
In order to ensure prediction consistency, our proposed scheme contains a correction mechanism which uses appropriate time stamps. These enable the actuator to identify and discard non-consistent control sequences. If this happens, the controller is notified via an error message and can adjust the prediction control sequence. This idea is not entirely new, as similar schemes have been presented in, e.g., [1], [6]. Compared to these schemes, the advantages of our scheme are twofold: on the one hand, the scheme is simpler in the sense that no special "recovery mode" is necessary. On the other hand, the special buffer structure allows us to guarantee both fast performance if the network is working without errors and fast recovery after a network error has occurred, see the discussion at the beginning of Section III.

II. SETUP AND PRELIMINARIES
We consider a discrete time nonlinear control system with x(n) ∈ X, u(n) ∈ U, whose solution with x(n_0) = x_0 is denoted by x(n; n_0, x_0, u). Typically, the model under consideration will be a discrete time model for a sampled data system. Thus, we often refer to the time instances n as sampling instances.
We assume that a controller is given, which generates a control sequence µ(x, 0), . . ., µ(x, m⋆ − 1) ∈ U for each x ∈ X, where m⋆ ≥ 1 is some fixed number. For instance, model predictive control algorithms naturally fit this setting.
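An admissible control horizon sequence is a sequence of numbers (m_i)_{i∈N_0} with m_i ∈ {1, . . ., m⋆}. Given such a sequence, with the switching times σ_0 := 0, σ_{k+1} := σ_k + m_k and the index k_n := max{k | σ_k ≤ n}, we define the solutions x(·) of the closed loop system by
x(n + 1) = f(x(n), u(n)), (1)
where f denotes the right hand side of the control system and
u(n) = µ(x(σ_{k_n}), n − σ_{k_n}). (2)
It follows that in (1), (2) the feedback values µ(x(σ_{k_n}), 0), . . ., µ(x(σ_{k_n}), m_{k_n} − 1) are used. Conditions under which a model predictive control scheme yields a stable closed loop system for all admissible control horizon sequences (m_i)_{i∈N_0} can, e.g., be found in [3], [8].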
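The role of the control horizon sequence can be illustrated by a small simulation. The following is only a sketch under stated assumptions: it assumes that the switching times satisfy σ_0 = 0 and σ_{k+1} = σ_k + m_k and that at time n the entry with index n − σ_{k_n} of the sequence computed at σ_{k_n} is applied, which is consistent with the feedback values listed above. The names `switching_times`, `closed_loop`, `f` and `mu` as well as the toy system are hypothetical and not part of the scheme itself.

```python
from itertools import accumulate


def switching_times(horizons):
    """Return [sigma_0, sigma_1, ...] with sigma_0 = 0, sigma_{k+1} = sigma_k + m_k."""
    return [0] + list(accumulate(horizons))


def closed_loop(f, mu, x0, horizons, m_star):
    """Simulate the closed loop for an admissible control horizon sequence.

    f  : hypothetical system map (x, u) -> next state
    mu : hypothetical controller, state -> list of m_star control values
    """
    if not all(1 <= m <= m_star for m in horizons):
        raise ValueError("horizon sequence is not admissible")
    x, states, controls = x0, [x0], []
    for m in horizons:
        sequence = mu(x)            # computed from the state at the switching time
        for j in range(m):          # only the entries 0, ..., m_k - 1 are applied
            x = f(x, sequence[j])
            controls.append(sequence[j])
            states.append(x)
    return states, controls


# Toy example (assumption): scalar integrator, controller halves the state per step.
f = lambda x, u: x + u
mu = lambda x: [-x / 2 ** (j + 1) for j in range(3)]
states, controls = closed_loop(f, mu, 8.0, horizons=[2, 1, 3], m_star=3)
```

In this toy example the model is exact, so the state trajectory does not depend on the particular admissible horizon sequence; with model errors the choice of the m_i would matter.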
We consider the following delays in the control loop:
• τ_sc(n): communication delay between sensor and controller
• τ_c(n): computational delay, i.e., the time the controller needs to compute µ(x_0, ·) from x_0
• τ_ca(n): communication delay between controller and actuator
Here the index n stands for the n-th transmission or computation, respectively, in or between the corresponding devices.
For ease of notation we assume that all these delays are integer values. In addition to these delays, packet dropouts can occur in each transmission.
For simplicity of exposition, let us assume that sensor, actuator and controller have synchronized clocks (this could be relaxed similarly to [8, Section III.C]). Hence, at the time the measurement arrives at the controller, the delay τ_sc(n) is known but τ_c(n) and τ_ca(n) are unknown. Since these values are unknown, we will use upper bounds τ_c^max and τ_ca^max of the delays which we intend to compensate. Note, however, that in our scheme we will not need to assume that the actual delays never exceed these bounds (3), because each violation of (3) can be treated as a packet dropout. Thus, it is enough to assume that the transmission is successful "sufficiently frequently", which will be made precise in Theorem 4.5 and Remark 4.6.
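The treatment of excessive delays as dropouts can be made concrete with a small check. This is a sketch only, assuming (as in step (A1) of the actuator algorithm in Section III) that a control packet computed at time n_c is intended for use from σ = n_c + τ_c^max + τ_ca^max on and is usable if and only if it reaches the actuator no later than σ. The function name and its parameters are hypothetical.

```python
def treated_as_dropout(n_c, tau_c, tau_ca, tau_c_max, tau_ca_max, lost=False):
    """Return True if the control packet computed at time n_c is unusable.

    The packet is meant to be used from sigma = n_c + tau_c_max + tau_ca_max
    on and reaches the actuator at n_c + tau_c + tau_ca (integer delays).
    A packet that is lost in the network is unusable in any case.
    """
    sigma = n_c + tau_c_max + tau_ca_max
    arrival = n_c + tau_c + tau_ca
    return lost or arrival > sigma
```

Under this assumption only the total delay τ_c(n) + τ_ca(n) matters: a slow transmission can be offset by a fast computation, and vice versa.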
In order to compensate for delays, we add a model based predictor to the controller. We assume that given a state x(n) at time n, a time σ > n and a control sequence u_n, . . ., u_{σ−1}, the predictor is able to generate a prediction x̃(σ; n, x(n), u) ≈ x(σ; n, x(n), u).
In the networked control scheme, the predictor will use a buffered control sequence ũ in order to generate the prediction x̃(σ; n_s, x(n_s), ũ). Here n_s denotes the most recent sensor time stamp, i.e., the prediction is based on the most recent measurement x(n_s) available to the predictor, and σ is chosen such that delays ≤ τ_c^max + τ_ca^max are compensated; details will be provided below. Abbreviating x̃(σ) = x̃(σ; n_s, x(n_s), ũ), the feedback value sequence µ(x(σ_{k_n}), ·) in (2) will then be replaced by µ(x̃(σ_{k_n}), ·). In order to ensure a faithful prediction we introduce the following definition.
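The prediction step can be pictured as iterating a model from the measurement time n_s up to the target time σ with the buffered controls. The following is a minimal sketch under the assumption that the predictor is a discrete time model of the plant; `predict`, `model` and the dictionary `u_buffered` (mapping a time index to a control value) are hypothetical names chosen for illustration.

```python
def predict(model, x_ns, n_s, sigma, u_buffered):
    """Predict the state at time sigma from the measurement x(n_s).

    model      : hypothetical prediction model (x, u) -> next state
    u_buffered : mapping time n -> buffered control value for n_s <= n < sigma
    """
    x = x_ns
    for n in range(n_s, sigma):
        x = model(x, u_buffered[n])
    return x


# Toy example (assumption): stable scalar model, three prediction steps.
model = lambda x, u: 0.5 * x + u
x_pred = predict(model, 4.0, n_s=3, sigma=6, u_buffered={3: 1.0, 4: 0.0, 5: -1.0})
```

The prediction is only meaningful if the buffered controls are the ones the actuator actually applies on {n_s, . . ., σ − 1}, which is the issue addressed by the definition that follows.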
(ii) We call a networked control scheme prediction consistent if at each time n the computation of u(n) according to (2) in the actuator is well defined and µ(x̃(σ_{k_n}), ·) is consistently predicted.
Under this condition it is easy to state various stability results. Here we only sketch a possible result which is similar to [6, Theorem 1] (see [6] for more precise assumptions and [2] or [4] for the definition and sufficient conditions for practical asymptotic stability).
Theorem 2.2: Assume that the closed loop system (1) is practically asymptotically stable if µ(x(σ_{k_n}), ·) is replaced by µ(x̃(σ_{k_n}; n_s, x(n_s), u), ·) in (2), where u is the control sequence applied by the actuator. Then, if the networked control scheme is prediction consistent, the networked closed loop system is practically asymptotically stable.
Sketch of the proof: Prediction consistency implies that u(n) from (2) is well defined and that the identity µ(x̃(σ_{k_n}; n_s, x(n_s), u), ·) = µ(x̃(σ_{k_n}; n_s, x(n_s), ũ), ·) holds. Hence, the networked closed loop system coincides with (1) where µ(x(σ_{k_n}), ·) is replaced by µ(x̃(σ_{k_n}; n_s, x(n_s), u), ·). Since this system is assumed to be practically asymptotically stable, the assertion follows.
A more detailed formulation of Theorem 2.2 as well as extensions to other stability notions like, e.g., input-to-state stability, will be addressed in future papers.
Essentially, the notion of prediction consistency leads to a separation principle which allows us to analyze the basic correctness of the prediction independently from the robustness of the closed loop system with respect to prediction errors and from the accuracy of the prediction model (which in turn depends on the delay τ_s of the sensor transmissions, cf. also [6, Assumption 2]). Hence, we can leave robustness and accuracy issues aside when analyzing the basic mechanisms of a scheme. As a consequence, the prediction consistency framework allows us to thoroughly and rigorously analyze the correctness and performance of more sophisticated networked control schemes, which is our focus in the remainder of this paper.

III. DESCRIPTION OF THE SCHEME
As already pointed out in the introduction, the necessity to know the control sequence for the prediction before it is applied by the actuator poses the crucial difficulty in designing a prediction consistent networked control scheme. Even worse, due to the possible packet dropouts the control sequence is needed in the predictor before we know whether it will arrive at the actuator. There is, however, an important detail in Definition 2.1 which we exploit: the control sequences only need to coincide for those feedback sequences µ(x(σ), ·) which are applied by the actuator. Hence, by adding suitable time stamps to the transmissions we enable the actuator to determine whether the received control sequence was computed from a consistent prediction. Then, the actuator can discard erroneous control sequences and notify the controller that the prediction control sequence needs to be corrected.
In the literature, two related approaches can be found: in [6] an acknowledgment based scheme has been presented, in which the actuator confirms the receipt of each control sequence. As a consequence, the controller has to wait for the acknowledgment before the next control sequence can be computed, i.e., the transmission delay limits the time between two instances at which the control loop is closed. Hence, we propose an error message based scheme similar to [1] in which the network is assumed to work without errors until the actuator sends an error message. The main difference of our scheme compared to [1] is that our different buffer structure allows for a faster "recovery" of the scheme if an error has occurred: while in [1] after a network failure the scheme is in "recovery mode" for m⋆ steps, our scheme will resolve a network error in at most τ_c^max + τ_ca^max time units plus the time needed for the transmission of the error message, cf. Remark 4.6. Furthermore, we do not need any internal "recovery mode" of the actuator.
In order to describe the scheme we specify the necessary buffer structure and the algorithms used in each component of the control loop. Although the clocks are assumed to be synchronized, it will be convenient to use different symbols n_s, n_c and n_a for the time in the sensor, controller and actuator, respectively.
The main idea of the scheme is to use time stamped transmissions in order to compensate for delays. The simplest device in our scheme is the sensor. Recall that the sensor delay affects the prediction accuracy but not the prediction consistency, cf. the discussion after Theorem 2.2.
Sensor: The sensor simply sends a time stamped measurement at each sampling instance n_s:
(S) at the sampling instance n_s, the measured state and time stamp (x(n_s); n_s) are sent from the sensor to the controller
While the sensor data carries only one time stamp n_s indicating the time of the measurement, the control sequences µ(x(σ_k), ·) sent to the actuator carry two time stamps: the first one, σ_k, indicates the time from which on the sequence is supposed to be used in the actuator, and the second one, σ_pre,k, contains the largest time stamp of those control sequences which have been used for the prediction of x(σ_k). This information will be stored in the actuator and used to detect inconsistent control sequences. Thus, prediction consistency is maintained by removing inconsistent data from the buffer.
If a missing or inconsistent control sequence is detected, an error message is sent. This way the controller is able to correct the control sequence used for prediction, if necessary. For the synchronization of the respective control sequences, the controller uses an internal variable σ_pre whose meaning will be described below.

Controller (including the predictor):
The controller and predictor device has two buffers:
• a measurement buffer B_m for storing the most recent time stamped measurement (x(n_s); n_s) received from the sensor
• a control buffer B_c for storing the time stamped feedback laws (µ(x(σ_k), ·), σ_k), which are needed in order to construct the input sequence for the prediction. We define S_c := {σ_0, . . ., σ_k} to be the (ordered) set of time stamps of the entries in B_c.
Note that in practice old values will, of course, be deleted from the buffer B_c. For simplicity of exposition, we will not address this issue here.
The algorithm in the controller now works as follows: At each sampling instance n_c, in Steps (C1)-(C4) the predictor estimates the future state x(n_c + τ_c^max + τ_ca^max) from the most recent measurement available in the measurement buffer B_m at time n_c. The control sequence ũ for the prediction is constructed according to (2) from the feedback control sequences stored in B_c and the largest time stamp σ_{k_n} used in (2) is stored in σ_pre. The predicted state is used by the controller to compute a feedback control sequence which is sent to the actuator and stored in the buffer.
Before the computation starts, an error check is performed in Step (C0): whenever the actuator detects a missing or inconsistently predicted control sequence, an error message is sent. The error message contains the time stamps σ_err and σ_cor of the missing or inconsistent sequence and of the last consistent sequence received, respectively. If an error message is received, it is first checked whether a control sequence with time stamp σ_err is contained in the prediction buffer B_c, i.e., if σ_err ∈ S_c. This is the case if and only if an inconsistency occurred which has not been known before. In this case, all inconsistent control sequences are removed from the buffer B_c. At the beginning, the internal variable is initialized to σ_pre = undefined.
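The controller logic just described (error check, prediction, computation of the feedback sequence, transmission and bookkeeping) can be sketched as follows. This is an illustration under several assumptions and not a definitive implementation: the class `Controller`, the callables `model` and `mu`, and the representation of B_c as a dictionary from time stamps to sequences are hypothetical; the computation delay is ignored; the buffer is assumed to contain an entry with time stamp 0; and the sequences are assumed long enough to cover the prediction interval.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Controller:
    model: Callable                  # hypothetical prediction model (x, u) -> next state
    mu: Callable                     # hypothetical controller: state -> list of controls
    tau_max: int                     # tau_c_max + tau_ca_max
    B_m: Optional[tuple] = None      # most recent measurement (x(n_s), n_s)
    B_c: dict = field(default_factory=dict)   # time stamp sigma -> feedback sequence
    sigma_pre: Optional[int] = None  # None plays the role of "undefined"

    def control_at(self, n):
        """Control value for time n, built from B_c as in the closed loop rule."""
        sigma = max(s for s in self.B_c if s <= n)
        return self.B_c[sigma][n - sigma]

    def step(self, n_c, error=None):
        # (C0) error check: react only if sigma_err is still in the buffer
        if error is not None:
            sigma_err, sigma_cor = error
            if sigma_err in self.B_c:
                self.B_c = {s: q for s, q in self.B_c.items() if s <= sigma_cor}
                self.sigma_pre = sigma_cor
        # (C1) predict the state at sigma from the most recent measurement
        x, n_s = self.B_m
        sigma = n_c + self.tau_max
        for n in range(n_s, sigma):
            x = self.model(x, self.control_at(n))
        # (C2) compute the feedback sequence from the predicted state
        sequence = self.mu(x)
        # (C3) packet to be sent to the actuator
        packet = (sequence, sigma, self.sigma_pre)
        # (C4) bookkeeping
        self.sigma_pre = sigma
        self.B_c[sigma] = sequence
        return packet
```

Note that a repeated error message referring to an already removed sequence leaves the buffer untouched, since its σ_err is no longer contained in the set of buffered time stamps.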
At each sampling instance n_c:
(C0) if an error message (σ_err, σ_cor) has been received from the actuator, check whether σ_err ∈ S_c. If yes, delete all entries (µ(x(σ_k), ·), σ_k) with σ_k > σ_cor from the control buffer B_c and set σ_pre := σ_cor
(C1) from the measurement (x(n_s), n_s) ∈ B_m, predict x(σ) for σ := n_c + τ_c^max + τ_ca^max using the prediction control sequence generated from B_c via (2)
(C2) from the predicted value, compute µ(x(σ), ·) (finished at time n_c + τ_c(n_c))
(C3) at time n_c + τ_c(n_c), send this control sequence, its time stamp and the time stamp of the preceding control sequence (µ(x(σ), ·); σ; σ_pre) to the actuator
(C4) set σ_pre := σ and add (µ(x(σ), ·), σ) to the control buffer B_c

Actuator:
The actuator has the following buffer:
• a control buffer B_a for storing the time stamped feedback laws (µ(x(σ_k), ·), σ_k, σ_pre,k) received from the controller. We define S_a := {σ_0, . . ., σ_k} to be the (ordered) set of time stamps which are contained in B_a.
This buffer is similar to the control buffer B_c in the controller but also stores the σ_pre time stamps. Like in the controller, old values will be deleted from B_a in practice. Again, for simplicity of exposition, we will not address this issue here.
We assume that the actuator is able to insert a transmitted feedback value sequence (µ(x(σ), ·), σ, σ_pre) at the correct position σ into the buffer, i.e., the set S_a remains ordered after inserting σ. This enables us to use feedback value sequences which arrive in the wrong order (with respect to the time stamp σ) due to the transmission delay.
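Keeping the buffer ordered under out-of-order arrivals is a standard sorted insertion. As a minimal sketch, assuming the buffer is held as a list of tuples (σ, σ_pre, sequence) sorted by σ, the position can be found by binary search; the function name `insert_sorted` and the tuple layout are hypothetical.

```python
import bisect


def insert_sorted(buffer, entry):
    """Insert entry = (sigma, sigma_pre, sequence) so that buffer stays sorted by sigma."""
    stamps = [e[0] for e in buffer]
    buffer.insert(bisect.bisect_left(stamps, entry[0]), entry)


# Packets arriving in the wrong order end up ordered by their time stamp sigma.
B_a = []
for packet in [(5, 4, "seq5"), (3, 2, "seq3"), (4, 3, "seq4")]:
    insert_sorted(B_a, packet)
```

Rebuilding the list of stamps on every insertion is linear in the buffer size, which is acceptable here because old entries are deleted in practice and the buffer stays short.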
In the actuator, we need to insert the arriving sequences into the buffer, detect missing and remove inconsistent sequences and apply the control value to the plant.This is done by the three steps of the following algorithm.
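At each sampling instance n_a:
(A1) insert all time stamped feedback sequences (µ(x(σ), ·), σ, σ_pre) which arrived at the actuator since the previous sampling instance n_a − 1 and satisfy σ ≥ n_a (sequence arrived in time) into the buffer B_a at the correct position.
(A2) if n_a ≠ 0, check whether there is a feedback sequence with time stamp σ_i = n_a in B_a. If there is one and σ_pre,i ≠ σ_{i−1}, delete it from B_a. If the sequence is missing or has been deleted, send an error message (σ_err, σ_cor) with σ_err = n_a and σ_cor = max{σ_k ∈ S_a | σ_k < n_a} to the controller.
(A3) apply the control value u(n_a) obtained from the sequences in B_a via (2) to the plant.
Note that formula (2) used in Step (A3) requires n_a ≤ max(S_a ∩ {0, . . ., n_a}) + m⋆ − 1 for successful computation of u(n_a). We will later derive conditions which guarantee this inequality.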
In words, step (A2) of this algorithm does the following: whenever a transition from one feedback sequence µ(x(σ_{i−1}), ·) to the next sequence µ(x(σ_i), ·) occurs in the control sequence, it is checked whether σ_pre,i = σ_{i−1} holds. If this is the case, then the actuator will use the new sequence.
If n_a = 0, no check is performed, because no previous sequence is available in the buffer.
If, however, σ_pre,i ≠ σ_{i−1} holds, then the algorithm detects an inconsistency, deletes this sequence from the buffer and thus continues to use µ(x(σ_{i−1}), ·). In addition, an error message containing the time stamps σ_err of the inconsistent sequence and σ_cor of the last correct sequence is sent to the controller.
If no feedback sequence µ(x(σ_i), ·) with n_a = σ_i is present in the buffer, then the actuator assumes that this sequence has been lost and thus an inconsistency will occur at some later time. Hence, an error message containing the time stamp of the missing sequence and of the last correct sequence is sent to the controller. In step (C0), the check σ_err ∈ S_c enables the controller to decide whether the sequence was indeed lost, in which case an inconsistency will occur at some later time and consequently the usual error handling is performed in the controller.
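The actuator behaviour described in words above can be sketched in a few lines. This is an illustration under assumptions rather than the definitive algorithm: the class `Actuator`, the dictionary representation of B_a (time stamp → (sequence, σ_pre)) and the return convention are hypothetical, and an initial sequence with time stamp 0 is assumed to be present in the buffer.

```python
class Actuator:
    def __init__(self, seq0, m_star):
        self.B_a = {0: (seq0, None)}   # sigma -> (feedback sequence, sigma_pre)
        self.m_star = m_star

    def step(self, n_a, arrived):
        """Process one sampling instance.

        arrived : list of packets (sequence, sigma, sigma_pre) received
                  since the previous sampling instance
        returns : (u, error) with error = (sigma_err, sigma_cor) or None
        """
        # (A1) insert packets that arrived in time, discard late ones
        for sequence, sigma, sigma_pre in arrived:
            if sigma >= n_a:
                self.B_a[sigma] = (sequence, sigma_pre)
        # (A2) detect a missing or inconsistently predicted sequence
        error = None
        if n_a != 0:
            sigma_cor = max(s for s in self.B_a if s < n_a)
            if n_a not in self.B_a:
                error = (n_a, sigma_cor)
            elif self.B_a[n_a][1] != sigma_cor:
                del self.B_a[n_a]       # keep using the last consistent sequence
                error = (n_a, sigma_cor)
        # (A3) apply the control value from the most recent usable sequence
        sigma = max(s for s in self.B_a if s <= n_a)
        if n_a - sigma > self.m_star - 1:
            raise RuntimeError("no control value available at time %d" % n_a)
        return self.B_a[sigma][0][n_a - sigma], error
```

For instance, if the sequence with time stamp 3 is lost and the following one was predicted with σ_pre = 3, the sketch reports the errors (3, 2) and (4, 2) in two consecutive steps and keeps applying the sequence with time stamp 2 until a packet with σ_pre = 2 arrives.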
Observe that if an error message (σ_err, σ_cor) is sent at some time n_a = σ_err, then error messages with identical σ_cor are sent at each time ñ_a ∈ {σ_cor + 1, . . ., n_cons}, where n_cons is the smallest time at which a sequence with σ_pre = σ_cor is found in the buffer. In particular, since the time stamps σ_pre of the sequences sent in (C4) are strictly increasing unless an error message arrives, the occurrence of an error triggers an error message in each sampling instance until one of the messages reaches the controller, the prediction sequence is corrected and the corrected control sequence is received and processed in the actuator.
For simplicity of exposition, we assume that the scheme starts at n_a = 0 and that 0 ∈ S_a and 0 ∈ S_c for all times n_a ≥ 0 and n_c ≥ 0, respectively. This means that the control sequence µ(x(0), ·) has been computed, successfully transmitted, and is used in the actuator at time n_a = 0. In particular, this implies that S_c and S_a are never empty. This is always physically possible if the controlled process can be stopped until the first feedback control sequence has been computed and successfully transmitted. If the process to be controlled is already running when the controller is turned on, this can be obtained by applying a default control value at the plant until the first successful transmission from the controller to the actuator and resetting all times n_s = n_c = n_a to 0 at this time instant.
The following figures illustrate the scheme graphically.

IV. ANALYSIS OF THE SCHEME
In this section we analyze the prediction consistency of our proposed scheme. We proceed in three steps: first, in Lemma 4.2 we show that only consistently predicted feedback control sequences are applied by the actuator. Second, in Lemma 4.3 we derive a bound on the time needed for the error recovery of our scheme. Third, in Lemma 4.4 we give conditions under which (4) is satisfied on a finite interval. These three steps are then put together in Theorem 4.5, providing conditions on the network ensuring prediction consistency of our scheme.
Since the time stamp sets S_c and S_a vary with time, it will be convenient to define S_c(n_c) := {σ_0, . . ., σ_k} and S_a(n_a) := {σ_0, . . ., σ_k} to be the ordered sets of time stamps which are contained in S_c and S_a after step (C4) and (A3) of the respective algorithms have been executed. Since in both algorithms no time stamp once removed from these sets can be inserted again, the inclusions follow for all ñ_c ≥ n_c ≥ 0 and all ñ_a ≥ n_a ≥ 0.
Since both the control sequence applied by the actuator and the control sequence used for prediction are generated via (2), we first clarify which feedback sequences (which are uniquely determined via the entries in the respective time stamp sets S_c and S_a) are actually needed in the respective computations of u(n) in (2).
(ii) From (2) it follows that the control value used at time n_a is uniquely determined by max{s ∈ S_a(n_a) | s ≤ n_a}. From (5) it follows that this value coincides with max{s ∈ S_a(n_2) | s ≤ n_a}, which implies the assertion.
In order to show (7), let S_t(n_c) = {σ_0, . . ., σ_k} be the set of all σ-values for which steps (C0)-(C4) have been performed up to time n_c, i.e., which have been computed and transmitted to the actuator. Clearly, both S_a(σ) and S_c(n_c) are subsets of S_t(n_c).
More precisely, the values

Fig. 1. Operation of scheme when no error occurs

Figure 2 shows a situation in which an error occurs. Here the sequence arrives too late at the actuator, i.e., at a time later than n_a = n_c + τ_c^max + τ_ca^max, and is thus not inserted into the buffer B_a. The actuator detects this missing sequence at time n_a, sends an error message with the values σ_err = n_a and σ_cor = max{σ_k ∈ S_a | σ_k < n_a}, and continues using the feedback sequence with time stamp σ_cor until the error is resolved. Finally, Figure 3 shows how this error is resolved. Upon arrival of an error message (possibly a later one than the one shown in Figure 2 but with identical σ_cor) the prediction sequence is updated at time n_c and the next feedback value sequence is computed based on the corrected prediction. If this sequence arrives at the actuator in time (which is the situation in the figure), the error is resolved at time σ = n_c + τ_c^max + τ_ca^max and the sequence µ(x(σ), ·) is used.

Lemma 4.1:
(i) The control values used for the prediction in step (C1) at time n_c ∈ {n_s, . . ., σ} are uniquely determined by the feedback sequences corresponding to time stamps in the set S_c(n_c) ∩ {σ_c, . . ., σ − 1}, where σ_c := max{s ∈ S_c(n_c) | s ≤ n_s}.
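The set in part (i) is easy to compute explicitly, which may help in reading the statement. The following sketch assumes the time stamp set is given as a plain collection of integers; the function name `prediction_stamps` is hypothetical.

```python
def prediction_stamps(S_c, n_s, sigma):
    """Time stamps of the feedback sequences that determine the prediction.

    Returns the elements of S_c in {sigma_c, ..., sigma - 1}, where sigma_c
    is the largest time stamp in S_c that does not exceed n_s.
    """
    sigma_c = max(s for s in S_c if s <= n_s)
    return sorted(s for s in S_c if sigma_c <= s <= sigma - 1)


# Example: measurement taken at n_s = 3, prediction target sigma = 6.
stamps = prediction_stamps({0, 2, 3, 4, 7}, n_s=3, sigma=6)
```

In this example the sequences with time stamps 0 and 2 are no longer relevant because they have been superseded at or before the measurement time, and the sequence with time stamp 7 starts only after the prediction target.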