Approximation of reachable sets by direct solution methods for optimal control problems

A numerical method for the approximation of reachable sets of linear control systems is discussed. The method is based on the formulation of suitable optimal control problems with varying objective function, whose discretization by Runge–Kutta methods leads to finite-dimensional convex optimization problems. It turns out that the order of approximation for the reachable set depends on the particular choice of the Runge–Kutta method in combination with the selection strategy used for control approximation. For an inappropriate combination, the expected order of convergence cannot be achieved in general. The method is illustrated by two test examples using different Runge–Kutta methods and selection strategies; the run times are analysed, and the order of convergence is estimated numerically and compared with theoretical results from related settings.


Introduction
The subject of this article is the description of an algorithm for the approximation of reachable sets of linear control problems. The problem of determining convex reachable sets can be equivalently described by infinitely many optimal control problems (OCPs), in which only the objective function varies. By choosing only finitely many directions, approximations of reachable sets can be obtained. The occurring OCPs are not solved theoretically by use of Pontryagin's maximum principle as in ref. [1], but numerically by suitable discretization methods. This makes it possible to treat also time-dependent linear problems and even special nonlinear ones with convex reachable sets. Nonpolyhedral control regions can be treated via nonlinear inequality and equality constraints. Results concerning the convergence of discretized OCPs can be found in refs. [2][3][4]. In this context, the particular choice of the selection strategy used for control approximation turns out to be crucial for the order of convergence and depends on the choice of the Runge-Kutta scheme used for the discretization of the underlying differential equations. To illustrate this dependency, several Runge-Kutta methods with different selection strategies (piecewise constant, piecewise linear and independent selection) are discussed in more detail for two illustrative examples.
This approach avoids cumbersome set operations (like Minkowski sums, unions of sets, etc.) and leads to known optimization methods, which in addition yield not only the endpoints of optimal trajectories, but also the entire trajectory including the corresponding optimal control. Furthermore, this approach is useful for linear control problems with control regions formulated with nonlinear restrictions [12], as well as for nonlinear control problems yielding convex reachable sets. The close connection between set-valued analysis and optimal control is shown in section 3; a comparison with set-valued methods as in refs. [5][6][7][8][9][10][11] is, however, beyond the scope of this article.
Methods for linear differential inclusions based on set-valued quadrature methods or set-valued Runge-Kutta methods are mentioned in ref. [7] as well as other methods, e.g. estimation methods for reachable sets [12] and ellipsoidal methods (cf. [11] for an overview). Newer developments of these methods achieve inner approximations [13,14] and outer approximations [15] of the reachable set [7].
The problem of the approximation of reachable sets appears in several disciplines: control theory, ordinary differential equations with uncertainties or with discontinuities in the state, necessary conditions for a minimum in nonsmooth analysis, differential games and viability theory, cf. [16][17][18][19][20]. The convexity of these reachable sets can be guaranteed for linear differential inclusions, but may also appear for special nonlinear problems (see e.g. [9,Example 5.2.3]).
The article is organized as follows. In section 2, basic notations and properties of reachable sets are summarized. Basic facts on the description of convex sets and arithmetic set operations are introduced and form the basis for the results of section 3. In particular, the Hausdorff and the Demyanov distances are defined, which are used to measure the speed of convergence with respect to the optimal value and the optimal trajectory, respectively. In section 3, the problem of calculating the boundary of the reachable set is reformulated as infinitely many OCPs that differ only in the objective function. These OCPs are discretized by the use of explicit Runge-Kutta methods and suitable control approximations, resulting in finite-dimensional (linear/nonlinear) optimization problems. Herein, several approximation classes for the control lead to different selection strategies in the discretization. The section ends with a formulation of the proposed method for the approximation of reachable sets and its implementation. Several combinations of Runge-Kutta methods and selection strategies are discussed in section 4 with some illustrative test examples. Tables with convergence results, run times and visualizations of reachable sets are included. The numerically observed order of convergence for these combinations is compared with known results for the order in the setting of classical ODEs, OCPs and the set-valued case. Finally, an outline for further research concludes the article.

Notation
In this section, some introductory definitions and results are collected.
The basic underlying problem is the following control problem.

Problem 2.1 For a given control function u : I → R^m on I := [t_0, t_f] with u(·) ∈ L^∞(I, R^m), we are looking for an absolutely continuous solution x(·) ∈ W^{1,1}(I, R^n) of the differential equation

ẋ(t) = A(t)x(t) + B(t)u(t) (for a.e. t ∈ I), (1a)
x(t_0) = x_0, (1b)
u(t) ∈ U (for a.e. t ∈ I). (1c)

The reachable set R(t_f, t_0, x_0) is the set of all endpoints x(t_f) of such solutions.

Some notations from convex analysis are recalled, which are necessary for the explanation of the algorithm described later.

DEFINITION 2.4 Denote by C(R^n) the set of all nonempty convex compact subsets of R^n and let C ∈ C(R^n), l ∈ R^n. Then,

δ*(l, C) := max_{c ∈ C} l⊤c and Y(l, C) := {c ∈ C | l⊤c = δ*(l, C)}

define the support function of C and the set of supporting points of C in direction l, respectively. Support functions, respectively supporting points, fully describe a convex compact set.
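As a small numerical illustration of Definition 2.4 (a sketch in Python with NumPy; the function names are ours, not from the article), the support function and one supporting point of a polytope given by its vertices reduce to a maximum of inner products:

```python
import numpy as np

def support_function(l, vertices):
    """Support function δ*(l, C) of the convex hull of the given vertices:
    the maximum of the inner product l·c over all vertices c."""
    return (vertices @ l).max()

def supporting_point(l, vertices):
    """One supporting point y(l, C): a vertex attaining the maximum of l·c.
    (If the face Y(l, C) is not a singleton, any maximizer may be returned.)"""
    values = vertices @ l
    return vertices[np.argmax(values)]

# Example: the unit square C = [0, 1]^2.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
l = np.array([1.0, 1.0])
# δ*(l, C) = 2, attained at the corner (1, 1).
```

For a general convex compact set the maximum runs over all of C; for polytopes it is enough to scan the vertices, which is what the sketch exploits.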
PROPOSITION 2.5 Let C ∈ C(R^n). Then,

∪_{l ∈ S^{n−1}} Y(l, C) ⊂ ∂C and C = co( ∪_{l ∈ S^{n−1}} Y(l, C) ) = co( {y(l, C) | l ∈ S^{n−1}} ),

where ∂C denotes the boundary of C and co(·) the convex hull of a set.
Proof See e.g. [23]. The last equation follows easily if one estimates the support function of the right-hand side in direction l by l⊤y(l, C) = δ*(l, C) from below.
Common arithmetic operations on sets are the scalar multiplication and the Minkowski sum, which are recalled here.

DEFINITION 2.6 Let C, D ∈ C(R^n), λ ∈ R and A ∈ R^{m×n}. Then,

λC := {λc | c ∈ C}, AC := {Ac | c ∈ C}, C + D := {c + d | c ∈ C, d ∈ D}

define the scalar multiplication, the image of C under the linear map x ↦ Ax and the Minkowski sum.
We need the following theoretical result which states convexity and compactness of the set operations defined earlier.
LEMMA 2.7 Let C, D ∈ C(R^n), λ ∈ R and A ∈ R^{m×n}. Then, λC and C + D are elements of C(R^n), and AC is an element of C(R^m). Furthermore, for λ ≥ 0,

δ*(l, λC) = λ δ*(l, C), δ*(l, AC) = δ*(A⊤l, C), δ*(l, C + D) = δ*(l, C) + δ*(l, D).

Proof To guarantee that the operations give results in C(R^n) and to verify the equations on the support functions, see [23].

DEFINITION 2.8 Let C, D ∈ C(R^n). Then,

d(C, D) := max_{c ∈ C} dist(c, D), d_H(C, D) := max{d(C, D), d(D, C)}, d_D(C, D) := sup_{l ∈ T_C ∩ T_D} ‖y(l, C) − y(l, D)‖

define the one-sided Hausdorff distance, the Hausdorff distance and the Demyanov distance between two sets, respectively. Hereby, T_C is defined as the set of all normed directions in R^n for which the supporting set Y(l, C) consists of only one point y(l, C) (T_D is defined analogously for the set D). T_C and T_D are subsets of the unit sphere S^{n−1} ⊂ R^n of full measure.
Well-known properties of the support function make it easy to prove the following result for the Hausdorff distance:

PROPOSITION 2.9 Let C, D ∈ C(R^n). Then, d_H(C, D) = max_{l ∈ S^{n−1}} |δ*(l, C) − δ*(l, D)|.
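The well-known identity d_H(C, D) = max_{‖l‖=1} |δ*(l, C) − δ*(l, D)| can be checked numerically. The following sketch (Python with NumPy; helper names and the direction-sampling approach are our own illustrative choices) approximates the Hausdorff distance of two planar convex polytopes by sampling the support-function difference over equidistant unit directions:

```python
import numpy as np

def support(l, vertices):
    """Support function of the convex hull of the given vertices."""
    return (vertices @ l).max()

def hausdorff_via_support(verts_C, verts_D, M=360):
    """Approximate d_H(C, D) = max over unit l of |δ*(l, C) - δ*(l, D)|
    by sampling M equidistant unit directions on S^1 (2D case)."""
    angles = 2 * np.pi * np.arange(M) / M
    dirs = np.column_stack([np.cos(angles), np.sin(angles)])
    return max(abs(support(l, verts_C) - support(l, verts_D)) for l in dirs)

# Two squares: [0,1]^2 and the shifted square [0.5, 1.5] x [0, 1].
C = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
D = C + np.array([0.5, 0.0])
# d_H(C, D) = 0.5 (shift along the x-axis).
```

Because the maximum of |0.5·l_x| over unit directions is attained at l = (±1, 0), which the grid contains, the sampled value here is exact; in general the sampling introduces an O(1/M) error.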

Connection to OCPs
As we know from Proposition 2.3 that the reachable set for Problem 2.1 is convex, it is sufficient to calculate merely the boundary of the reachable set. Proposition 2.5 motivates calculating at least one supporting point (which automatically lies on the boundary) of the reachable set in direction l ∈ R^n with ‖l‖_2 = 1. Note that even in the case that the reachable set is not strictly convex and the set of supporting points is an (n − 1)-dimensional face, for a fixed direction l, one supporting point in this direction is sufficient to reconstruct the reachable set.
Thus, to calculate a supporting point x(t_f) on the boundary of the reachable set R(t_f, t_0, x_0) in a fixed direction l, we have to find an admissible control function u(t) ∈ U that maximizes the functional y ↦ l⊤y at the endpoint (resulting in the support function δ*(l, R(t_f, t_0, x_0)) as optimal value). This constitutes the following special OCP of Mayer type:

(OCP_l) max l⊤x(t_f) subject to (1a)–(1c).
We denote the optimal solution of (OCP_l) by x*(·; l) and u*(·; l), where the argument l indicates the dependency on the direction l.
As already mentioned in Proposition 2.5, the convexity and compactness of the reachable set guaranteed by Proposition 2.3 lead to the equivalent representation by considering supporting points in all directions l ∈ S^{n−1}:

R(t_f, t_0, x_0) = co( ∪_{l ∈ S^{n−1}} {x*(t_f; l)} ), (2)
δ*(l, R(t_f, t_0, x_0)) = l⊤x*(t_f; l) for all l ∈ S^{n−1}. (3)

In section 3.3, comparable equalities to (2) and (3) are shown for the discrete reachable sets.

Approximation of reachable sets by discretized OCPs
For the moment, let l ∈ S^{n−1} be fixed in Problem (OCP_l). We introduce a grid with grid points

t_i := t_0 + ih, i = 0, 1, …, N, (4)

for N ∈ N, N ≥ 1 and step-size h := (t_f − t_0)/N. Each explicit Runge-Kutta scheme with s stages can be characterized by its Butcher array with nodes γ_1, …, γ_s, coefficients α_{jk} (1 ≤ k < j ≤ s) and weights β_1, …, β_s. Here, û = (u_0, u_1, …, u_{P−1}) ∈ U^P is a finite-dimensional vector parametrizing the selection strategy for the control appearing in the Runge-Kutta scheme. The set of all admissible control approximations u_app(·; û) on the whole interval I is denoted by U_app. For such a control approximation u^(i)_app, the corresponding state approximation x_app(t; û) is obtained via an explicit s-stage Runge-Kutta discretization scheme:

x_app(t_{i+1}; û) = x_app(t_i; û) + hΦ(x_app(t_i; û), û, h), x_app(t_0; û) = x_0, (5)

where Φ(·, ·, h) is the characterizing function of the method with

Φ(x, û, h) = Σ_{j=1}^{s} β_j η^{(j)}, η^{(j)} = f(t_i + γ_j h, x + h Σ_{k=1}^{j−1} α_{jk} η^{(k)}, u^(i)_app(t_i + γ_j h; û)).

Suitable values for the coefficients in the Butcher array can be found in ref. [26].
Let us now consider examples for selection strategies used furtheron.
(i) Piecewise constant approximation: u_app(t; û) := u_i for t ∈ [t_i, t_{i+1}), i = 0, …, N − 1, with P = N independent selection variables u_i ∈ U.
(ii) Continuous and piecewise linear approximation: u_app(t; û) := u_i + ((t − t_i)/h)(u_{i+1} − u_i) for t ∈ [t_i, t_{i+1}], with P = N + 1 selection variables.
(iii) Independent selections at the intermediate grid points t_i + γ_j h (cf. e.g. [4]): u^(i)_app(t_i + γ_j h; û) := u_{s·i+j−1}, j = 1, …, s, with P = s · N variables.
Here, each Runge-Kutta stage creates a new selection variable for each subinterval. The modified Euler method (see section 4 and figure 4 in Example 4.2) has γ_1 = 0, γ_2 = 1/2, so that two selection variables u_{2i} and u_{2i+1} are chosen from U for each subinterval [t_i, t_{i+1}]. Similarly, Heun's method (see section 4 and figure 3 in Example 4.2) has γ_1 = 0, γ_2 = 1, so that two selection variables u_{2i} and u_{2i+1} are chosen for each subinterval [t_i, t_{i+1}], although the grid points t_i + γ_2 h = t_{i+1} + γ_1 h coincide for i = 0, …, N − 1; this additional freedom of choice may result in two different values for u^(i)_app(t_{i+1}; û) and u^(i+1)_app(t_{i+1}; û). Owing to the definitions (i)–(iii) and the convexity of U,

u_app(t; û) ∈ U for all t ∈ I (7)

is guaranteed for all three presented strategies. Note that further selection strategies are possible, e.g. independent selections with additional continuity constraints at the inner grid points t_i, i = 1, …, N − 1, or additional equality constraints at coinciding intermediate grid points. Additional care has to be taken for other selection strategies to fulfil (7). Thus, by this discretization, the infinite-dimensional problem (OCP_l) is approximated by the finite-dimensional convex programming problem

(CP^1_l) max l⊤x_app(t_N; û) subject to û = (u_0, …, u_{P−1}) with u_k ∈ U, k = 0, …, P − 1, (*)

where x_app(·; û) is given by (5). Note that û implicitly defines a control approximation u^(i)_app via the chosen selection strategy. We denote the optimal solution of (CP^1_l) by û*. If the conditions (*) can be written with a finite number of affine inequalities, (CP^1_l) is a linear programming problem and called (LP^1_l); otherwise it is a nonlinear (convex) programming problem.
In the sequel, we investigate the simplest case in more detail: Euler's method with piecewise constant control approximation. In this case, it is possible to derive explicit solutions for the finite-dimensional problems (CP 1 l ). Nevertheless, every explicit Runge-Kutta method with the selection strategies (i)-(iii) will give a similar (but more complicated) representation. The explicit formula for the solution stresses the strong connection to set-valued methods, e.g. in [5,6,8], via support functions, respectively, supporting points.
In the case of Euler's method, the characterizing function in (5) reduces to

Φ(x, û, h) = f(t_i, x, u_i) = A(t_i)x + B(t_i)u_i.

The recursive evaluation in (5) for Euler's method yields

x_app(t_f; û) = Π_{k=0}^{N−1}(I + hA(t_k)) x_0 + h Σ_{k=0}^{N−1} [Π_{j=k+1}^{N−1}(I + hA(t_j))] B(t_k) u_k (8)

with the (ordered) matrix product defined as

Π_{j=k}^{N−1} M_j := M_{N−1} · M_{N−2} · … · M_k (the empty product being the identity I).

Introducing this expression for x_app(t_f; û) in (LP^1_l) and dropping the constant term yields the linear programme

(LP^2_l) max h Σ_{k=0}^{N−1} l⊤ [Π_{j=k+1}^{N−1}(I + hA(t_j))] B(t_k) u_k subject to u_k ∈ U, k = 0, …, N − 1.

Note that this linear programme has the same solution û as (LP^1_l), whereas the optimal objective function values differ, because we neglected constant terms.
To compute the objective function in (LP^2_l) very efficiently, we introduce additional artificial variables

λ_N := l, λ_k := (I + hA(t_k))⊤ λ_{k+1}, k = N − 1, …, 0.

These artificial variables are calculated backward in time and correspond to the discretized adjoint variable of (OCP_l) (cf. [27]).

Then, (LP^2_l) can be replaced by

(LP^3_l) max h Σ_{k=0}^{N−1} (B(t_k)⊤λ_{k+1})⊤ u_k subject to u_k ∈ U, k = 0, …, N − 1,

with h Σ_{k=0}^{N−1} δ*(B(t_k)⊤λ_{k+1}, U) as optimal value of (LP^3_l) and hence (u_0, u_1, …, u_{N−1}) with supporting points u_k ∈ Y(B(t_k)⊤λ_{k+1}, U) as a solution.
In the special case of box constraints, i.e.

U = [a_1, b_1] × … × [a_m, b_m],

the objective function of (LP^3_l) with S_k := B(t_k)⊤λ_{k+1} is maximized if each term S^j_k · u^j_k is maximized. Hence, the solution of (LP^3_l) is given for j = 1, …, m, k = 0, …, N − 1 by the choice

u^j_k := b_j if S^j_k ≥ 0, u^j_k := a_j otherwise.
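The backward adjoint recursion and the bang-bang choice above can be sketched in a few lines (Python with NumPy; for brevity A and B are taken time-invariant, as in the examples of section 4, and the function name is our own):

```python
import numpy as np

def support_point_euler_box(A, B, x0, t0, tf, N, lo, hi, l):
    """Supporting point of the discrete (Euler) reachable set in direction l
    for x' = A x + B u with box control constraints lo <= u <= hi.
    Sketch of (LP^3_l): backward adjoint recursion lambda_k, then the
    componentwise bang-bang choice u_k^j = hi_j if S_k^j >= 0 else lo_j."""
    n = len(x0)
    h = (tf - t0) / N
    # Backward recursion: lambda_N = l, lambda_k = (I + h A)^T lambda_{k+1}.
    lam = np.zeros((N + 1, n))
    lam[N] = l
    for k in range(N - 1, -1, -1):
        lam[k] = (np.eye(n) + h * A).T @ lam[k + 1]
    # Bang-bang controls maximizing each term S_k^j u_k^j, S_k = B^T lambda_{k+1}.
    u = np.zeros((N, B.shape[1]))
    for k in range(N):
        S = B.T @ lam[k + 1]
        u[k] = np.where(S >= 0, hi, lo)
    # Forward Euler sweep with the optimal controls yields the supporting point.
    x = np.array(x0, dtype=float)
    for k in range(N):
        x = x + h * (A @ x + B @ u[k])
    return x

# Double integrator x1' = x2, x2' = u, |u| <= 1, x0 = 0, direction l = (0, 1):
A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0.], [1.]])
x = support_point_euler_box(A, B, [0., 0.], 0.0, 1.0, 100,
                            np.array([-1.]), np.array([1.]), np.array([0., 1.]))
```

For this direction the maximizing control is u ≡ 1, so the second state component of the returned supporting point equals 1; only O(N) work per direction is needed, which is the efficiency gain of (LP^3_l) over a generic LP solve.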

Discrete reachable sets
Discrete reachable sets are the reachable sets of the discretized equations and could be defined as endpoints of discrete solutions of the following problem.
Given the data in Problem 2.1, the discretized problem depends on the choice of the set U app of all discretized control functions and on the Runge-Kutta scheme.
Problem 3.1 For a time discretization (4) with step-size h = (t_f − t_0)/N and a given discretized control function u_app(·; û), we are looking for a solution x_app(·; û) at the grid points, given by the Runge-Kutta scheme (5). The discrete reachable set

R_N(t_f, t_0, x_0) := { x_app(t_N; û) | û ∈ U^P },

i.e. the union of all endpoints of discrete solutions over all feasible discretized control functions, is again a nonempty convex compact set.

Proof For a chosen discretized control function u_app(·; û), the discrete solution is defined by (5). The discrete reachable set coincides with the union of all such discrete solutions for all feasible discretized control functions. In (5), common appearances of u_k, k = 0, …, P − 1, and x_app(t_i; û) must be grouped together (first within η^(j)_i and within Φ(x_app(t_i; û), û, h), then also within x_app(t_{i+1}; û)). Repeating this step for all i = 0, …, N − 1, one arrives at a representation of the form

x_app(t_N; û) = Â_N x_0 + Σ_{k=0}^{P−1} B̂_k u_k

with appropriate matrices Â_N and B̂_k, k = 0, …, P − 1. Finally, the union over all independent selections u_k ∈ U gives, with Definition 2.6,

R_N(t_f, t_0, x_0) = Â_N x_0 + Σ_{k=0}^{P−1} B̂_k U,

which shows together with Lemma 2.7 the assertion.
In the special case of Euler's method and piecewise constant approximation of the controls, this corresponds to the union over all vectors û ∈ U^N ⊂ R^{mN}. The explicit form of the discrete reachable set for Euler's method (cf. (8)) is

R_N(t_f, t_0, x_0) = Π_{k=0}^{N−1}(I + hA(t_k)) x_0 + h Σ_{k=0}^{N−1} [Π_{j=k+1}^{N−1}(I + hA(t_j))] B(t_k) U.

Although only demonstrated for Euler's method in section 3.2, the calculation of (CP^1_l) (respectively, its reformulation (LP^3_l)) for every explicit Runge-Kutta scheme and selection strategy (i)–(iii) automatically involves the computation of supporting points in direction l of the discrete reachable set R_N(t_f, t_0, x_0) (the maximizers in (CP^1_l)). The optimal value of problem (CP^1_l) coincides with the support function δ*(l, R_N(t_f, t_0, x_0)).

Propositions 2.5 and 3.3 justify the analogue results to (2) and (3):

R_N(t_f, t_0, x_0) = co( ∪_{l ∈ S^{n−1}} {x_app(t_N; û*(l))} ), (10)
δ*(l, R_N(t_f, t_0, x_0)) = l⊤x_app(t_N; û*(l)) for all l ∈ S^{n−1}. (11)
In the implementation, only a finite number of different normed directions l_i, i = 1, …, M, is chosen in (10) and (11). This corresponds to the fact that for complex problems, we can compute a solution of (OCP_l) neither analytically nor numerically for all directions l. Hence, we suggest approximating (OCP_l) numerically and considering only a finite number of directions l_i, i = 1, …, M. This yields an approximation of the reachable set, which is studied in the next proposition. The results in ref. [7] are slightly improved by the following result.

PROPOSITION 3.4 Proposition 2.5 suggests to use one of the approximations based on a finite grid of directions G_M := {l_1, …, l_M} ⊂ S^{n−1}.

Proof Set ε := max_{l ∈ S^{n−1}} dist(l, G_M). For the estimation of C_M, it is clear that C ⊂ C_M. Let us estimate ‖C_M‖ by ‖C‖. For this purpose, choose an arbitrary l ∈ S^{n−1} and an appropriate l_i ∈ G_M with ‖l − l_i‖ = dist(l, G_M). Then, estimating δ*(l, C_M) via δ*(l_i, C_M) and using ‖C_M‖ = max_{l ∈ S^{n−1}} |δ*(l, C_M)|, we arrive at the estimation (1 − ε)‖C_M‖ ≤ ‖C‖. Now, choose l ∈ S^{n−1} such that d(C_M, C) = δ*(l, C_M) − δ*(l, C). Again, take an appropriate l_i ∈ G_M with ‖l − l_i‖ = dist(l, G_M). Hence, we obtain the asserted estimate for ε ≤ 1/2.
One method for the choice of G M in Proposition 3.4 is to parametrize the directions by spherical coordinates and use equidistant partitions on the parameter intervals for the angles (see ref. [7, section 3.1.2]).
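In the planar case, such an equidistant grid of directions is straightforward to generate (a sketch in Python with NumPy; the helper name is ours). The Euclidean gap between neighbouring grid directions is 2 sin(π/M), so the accuracy parameter ε of Proposition 3.4 shrinks like O(1/M):

```python
import numpy as np

def direction_grid(M):
    """Equidistant unit directions l_j = (cos(2*pi*j/M), sin(2*pi*j/M)),
    j = 0, ..., M-1, on the unit circle S^1."""
    phi = 2 * np.pi * np.arange(M) / M
    return np.column_stack([np.cos(phi), np.sin(phi)])

G = direction_grid(4)
# Adjacent grid directions differ by the angle 2*pi/M, i.e. by the Euclidean
# distance 2*sin(pi/M); any unit direction l is within half that angle of G.
gap = np.linalg.norm(G[1] - G[0])
```

For n > 2 the same idea applies with equidistant partitions of the spherical-coordinate angles, as described in ref. [7, section 3.1.2].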

Implementation
In the sequel, we briefly discuss some numerical methods which are suitable for solving the discretized OCP (CP^1_l). Of course, the choice of an appropriate method depends on the explicit representation of the control region U. Hence, we restrict the discussion to convex sets U defined by

U = { u ∈ X | g_i(u) ≤ 0, i = 1, …, r }, (12)

where X := {u ∈ R^m | Ãu = b, u ≥ 0} with a matrix Ã ∈ R^{q×m}, and the functions g_i(·), i = 1, …, r, could be either linear or nonlinear. If the functions g_i in (12) are affine, then problem (CP^1_l) is a linear optimization problem and can be solved by the well-known simplex method or some interior point method [28] suitable for linear programmes. In the special case of an Euler approximation and U defined by box constraints only, a very efficient method is described in section 3.2.
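In the affine case, (LP^1_l) can be handed to any off-the-shelf LP solver. A minimal sketch using SciPy's `linprog` (our choice of solver, not prescribed by the article), for Euler's method with piecewise constant selections, time-invariant data, box bounds on u, and x_0 = 0 so that the constant term of the objective can be dropped:

```python
import numpy as np
from scipy.optimize import linprog

def support_lp_euler(A, B, tgrid, lo, hi, l):
    """Support function of the discrete (Euler) reachable set in direction l,
    computed as an LP over u_0, ..., u_{N-1} (x0 = 0 assumed): maximize
    sum_k h * l^T Prod_{j>k}(I + h A) B u_k with box bounds lo <= u_k <= hi."""
    N = len(tgrid) - 1
    h = tgrid[1] - tgrid[0]
    n, m = B.shape
    # Objective coefficients via the backward adjoint recursion lambda_k.
    lam = l.copy()
    coeff = np.zeros((N, m))
    for k in range(N - 1, -1, -1):
        coeff[k] = h * (B.T @ lam)        # term for u_k uses lambda_{k+1}
        lam = (np.eye(n) + h * A).T @ lam
    c = -coeff.ravel()                    # linprog minimizes, so negate
    bounds = [(lo[j], hi[j]) for _ in range(N) for j in range(m)]
    res = linprog(c, bounds=bounds, method="highs")
    return -res.fun                       # optimal value = support function

# Double integrator x1' = x2, x2' = u, |u| <= 1, over [0, 1], direction (0, 1):
A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0.], [1.]])
t = np.linspace(0.0, 1.0, 51)
val = support_lp_euler(A, B, t, [-1.], [1.], np.array([0., 1.]))
```

For pure box constraints this LP is of course overkill, as the closed-form solution of (LP^3_l) gives the same value directly; the LP route becomes relevant once U involves general affine inequalities or the coupling matrix Ã from (12).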
If the functions g i are convex and smooth, i.e. at least continuously differentiable, then the resulting problem (CP 1 l ) is a convex but nonlinear programming problem and the sequential quadratic programming method is appropriate provided the functions g i are defined for infeasible points [29]. Alternatively, the method of feasible directions is applicable, especially if the functions g i are only defined for admissible points [30].
If the functions g i are convex but nonsmooth, the bundle method, respectively, the bundle trust region method (BT method) is suitable [31,32]. In addition, Kelley's cutting plane method is also applicable [33].
In the case that the support function or the supporting points of the convex control set U are known, general control regions U can be approximated according to Proposition 3.4. Let us now formulate the overall method.

ALGORITHM 1

1. Choose a finite number of normed directions l_i ∈ S^{n−1}, i = 1, …, M (e.g. as in Proposition 3.4 with accuracy ε).
2. Choose a (pointwise) explicit Runge-Kutta method.
3. Choose a suitable selection strategy (e.g. one of (i)–(iii) in section 3.2) for the control approximation (in the optimal case, the order of the set-valued method is known as O(h^p)).
4. For each direction l_i, solve the optimal control problem (OCP_{l_i}) numerically with a direct solution method by discretizing (OCP_{l_i}) with the Runge-Kutta method chosen in (2) and the selection strategy chosen in (3).
5. Solve the resulting finite-dimensional optimization problem (CP^1_{l_i}). The optimal value l_i⊤x_app(t_N; û*) of this problem gives the support function δ*(l_i, R_N(t_f, t_0, x_0)) in direction l_i; the maximizer x_app(t_N; û*) yields a supporting point y(l_i, R_N(t_f, t_0, x_0)) for this direction. The overall accuracy is O(max{h^p, ε}).

Finally, plot the convex polygon with one of the representations (10) and (11).
Often, a speed-up of the earlier algorithm can be achieved by exploiting the observation that the numerical solutions of (OCP l ) for neighbouring directions l, say l i and l i+1 , are usually close. Hence, in a so-called warm-start, the solution for direction l i is used as an initial guess for the neighbouring direction l i+1 .
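Putting the steps together for the planar, time-invariant, box-constrained case, Algorithm 1 can be sketched compactly (Euler's method with piecewise constant selections and the closed-form solution of (LP^3_l) from section 3.2; function names are illustrative, and no warm-start is needed here because each direction is solved in closed form):

```python
import numpy as np

def approx_reachable_set(A, B, x0, t0, tf, N, lo, hi, M):
    """Supporting points of the discrete reachable set R_N(tf, t0, x0)
    in M equidistant directions (Euler + piecewise constant selections,
    box constraints lo <= u <= hi). Returns an M x n array of polygon
    vertices approximating the reachable set."""
    n = len(x0)
    h = (tf - t0) / N
    E = np.eye(n) + h * A                 # Euler transition matrix
    points = []
    for j in range(M):
        phi = 2 * np.pi * j / M
        l = np.array([np.cos(phi), np.sin(phi)])
        # Backward adjoint sweep: lam[k] = lambda_k with lambda_N = l.
        lam = np.empty((N + 1, n))
        lam[N] = l
        for k in range(N - 1, -1, -1):
            lam[k] = E.T @ lam[k + 1]
        # Forward Euler sweep with the bang-bang maximizers of (LP^3_l).
        x = np.array(x0, dtype=float)
        for k in range(N):
            S = B.T @ lam[k + 1]
            u = np.where(S >= 0, hi, lo)
            x = E @ x + h * (B @ u)
        points.append(x)
    return np.array(points)

# Double integrator x1' = x2, x2' = u, |u| <= 1, over [0, 1]:
P = approx_reachable_set(np.array([[0., 1.], [0., 0.]]),
                         np.array([[0.], [1.]]),
                         [0., 0.], 0.0, 1.0, 100,
                         np.array([-1.]), np.array([1.]), M=16)
```

The rows of P are supporting points, so their convex hull is an inner polygonal approximation of R_N(t_f, t_0, x_0); for general control regions U, step 5 would instead call one of the optimization methods discussed above, and there a warm-start across neighbouring directions pays off.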

Examples
In the sequel, we refer to (OCP_l), the differential equation (1a) and (1b), the control constraint (1c) and the control approximations (i)–(iii) discussed in section 3.2.
The following Runge-Kutta methods are studied for the numerical computation of reachable sets: Euler's method, Heun's method and the modified Euler method (cf. section 4). For simplicity, the methods with different selection strategies are tested for time-independent two-dimensional problems (for which one can even calculate a theoretical solution for reference purposes). Nevertheless, the framework presented before remains valid, and the methods can also be used for more complicated problems met in practice (time-dependent and higher-dimensional ones).
From Definition 2.8 and Proposition 2.9, it is clear that the approximation of the reachable set corresponds to a uniform convergence of the optimal value functions, measured with the Hausdorff distance, whereas the approximation of trajectories corresponds to the uniform convergence of the maximizers and hence to the choice of the Demyanov distance.
In all examples below, a combination method from [6,7], consisting of a (pointwise) Runge-Kutta method and a set-valued quadrature method, is used with a huge number N of subintervals to calculate a reference set R_ref(t_f, t_0, x_0). For these reference methods, the order of convergence is already known, so that this reference set yields approximations for the Hausdorff distance (13) and the Demyanov distance (14) via the chosen directions l_i ∈ G_M, i = 1, …, M (the latter requires directions of unique support). By comparing the different values of the optimal value function in the calculation of the discrete reachable set, the order of convergence is estimated by assuming

d_H(R_N(t_f, t_0, x_0), R_ref(t_f, t_0, x_0)) ≈ C h^p = C ((t_f − t_0)/N)^p. (15)

For two subsequent values N_1 and N_2 of N in the computed approximations, the left-hand side in (15) is calculated by the programme, and as a consequence p is obtained from the two equations (15) involving N_1 and N_2. The analogue estimation is done for the Demyanov distance and the corresponding maximizers in the calculation of the discrete reachable set. The angle φ_j of the direction l_j = (cos(φ_j), sin(φ_j))⊤ ∈ R^2, φ_j = 2jπ/M, j = 0, …, M − 1, for which the maximum in (13) and (14) is attained is shown in the following tables.
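The two-sample order estimate behind (15) is a one-line calculation: with measured distances e(N_1) and e(N_2) and the ansatz e(N) ≈ C((t_f − t_0)/N)^p, dividing the two relations eliminates C. A sketch (with synthetic error values for illustration only):

```python
import math

def estimated_order(e1, N1, e2, N2):
    """Solve e(N) = C * ((tf - t0)/N)**p for p from two samples:
    e1/e2 = (N2/N1)**p, hence p = log(e1/e2) / log(N2/N1)."""
    return math.log(e1 / e2) / math.log(N2 / N1)

# Synthetic data with exact second-order behaviour e(N) = 1/N**2:
p = estimated_order(1.0 / 10**2, 10, 1.0 / 20**2, 20)
```

Applied to consecutive rows of the tables below (typically N_2 = 2N_1), this yields the numerically observed orders reported there.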
In figure 1, approximations to the reachable set R(1, 0, x_0) are shown: the left picture shows approximations for Euler's method with piecewise constant selections (first order of convergence), the right one the corresponding approximations for Heun's method with continuous and piecewise linear control approximation (second order of convergence). Here, the reference set is calculated by the combination method of the set-valued iterated trapezoidal rule together with Heun's method introduced in refs. [6,7] with N = 1,000,000 and differs from the reachable set R(1, 0, x_0) by O(1/N^2). In both cases, the reference set is drawn with solid lines. In the left picture, the dashed lines show the approximations for N = 10, 20, 40 for Euler's method (note the halving of the distance at the upper right corner of the sets when the number of subintervals is doubled). In the right picture, the dashed lines show the approximations for N = 1, 2, 4 for Heun's method (a smaller number of subintervals is chosen so that one can still see a difference between the corresponding approximations in figure 1). Note the more rapid convergence even for these small numbers of subintervals in comparison with Euler's method.
In table 1, the approximated distances (13) and (14) to the reference set and the estimated value of the convergence order p from (15) are shown. Table 1 shows the expected convergence order 1 for the reachable set and the trajectories for Euler's method. The approximated Hausdorff distance δ^M_H is attained at the upper right corner. For Heun's method with continuous and piecewise linear control approximation, table 2 shows order of convergence 2 for the reachable set but only order 1 for the trajectories.
Remark 1 Let us consider again the problem stated in Example 4.1. As Veliov explains in ref. [35], the convergence of the trajectories cannot be better than O(h) in this example. During the calculation of the discrete reachable set, the direct solution method calculates first-order approximations to the control and to the state components.

Example 4.2 This example introduces the nonlinear constraint u_1^2 + u_2^2 ≤ 1 for the control variable u = (u_1, u_2) ∈ U.
For the second-order approximations to the reachable set R(2, 0, x_0), both selection strategies seem to converge with order 2, which is confirmed by tables 3 and 4. Nevertheless, figure 4 shows that the choice of the selection strategy for the control should depend on the Runge-Kutta method. In figure 4, the piecewise constant selection strategy is compared with the independent control selections at t_i and t_i + (h/2) for the modified Euler method (see (6)). The latter selection strategy destroys the order of convergence 2 of the Runge-Kutta method. This is verified in tables 5 (order O(h^2)) and 6 (only order O(h)) for the convergence to the reachable set and the trajectories.
An overall comparison of the CPU times for comparable accuracies shows that methods of higher order are superior. The extremely high CPU times for the modified Euler's method with independent selection strategy (iii) in table 6 are due to a highly oscillatory structure of the numerical solutions of the respective optimal control problems.
Modified Euler's method with piecewise constant control approximation is faster than Heun's method with piecewise constant control approximation at the same accuracy and order, cf. tables 3 and 5. Note that it is possible to prove that the discrete reachable sets are identical for both methods because of time invariance. Numerically, small differences appear in tables 3 and 5 due to different formulations of the corresponding optimization problems (CP^1_l). At first glance, Heun's method with continuous and piecewise linear control approximation seems to be slower than the previous two methods in view of CPU times, cf. table 4. But taking into account the Hausdorff distance δ^M_H and comparing CPU times for (approximately) the same accuracy, it turns out that Heun's method with continuous and piecewise linear control approximation is actually not worse than modified Euler's method (compare N = 20 in table 5 with N = 10 in table 4). Table 7 shows a comparison of results concerning the order of convergence for classical ODEs in ref. [26], for optimal control problems in refs. [3,4] and for the set-valued case of the approximation of reachable sets in refs. [5,7,8,10].
In the last column, citations for proofs and corresponding assumptions on smoothness and on the problem are additionally mentioned. The indication '<2', based on the work [3], means that the conditions for order 2 (the main topic of that article) are not all fulfilled. In ref. [4], only results for the independent selection strategy are mentioned, and all weights β_i are assumed to be positive. This is not the case for modified Euler's method, and an example in refs. [3,4] even shows that for a special linear control problem without restrictions on the control and with quadratic cost functional, this method with strategy (iii) of two independent selections in [t_j, t_{j+1}] gives no convergence at all. The minimal order 1 is proven, and an order breakdown to 1 for a special example is noticed, for a set-valued realization of the algorithm in ref. [10]. Note that additional smoothness assumptions on the optimal solution as well as coercivity are formulated in refs. [3,4], contrary to the results cited in the last column.
Let us remark that the last column in table 7 coincides with the numerical experiments of the direct solution method in Examples 4.1 and 4.2.

Outline of further research
It is known that set-valued quadrature methods in ref. [6] can lead to an order of convergence greater than 2 if the problem satisfies additional smoothness conditions, cf. [7]. In this case, selection strategies with piecewise constant controls are no longer appropriate. Preliminary computer experiments with the classical Runge-Kutta method show that an order of convergence greater than 2 is attainable. However, for these Runge-Kutta methods, suitable selection strategies for the approximation of reachable sets have to be studied in more detail. In this context, additional difficulties arise if state constraints are present, because these constraints should be fulfilled also at the intermediate stages of the Runge-Kutta scheme (as in ref. [9]).
Further research can be conducted towards the study of Runge-Kutta schemes as in refs. [36][37][38], where the selection strategy is motivated by Volterra series involving multiple control integrals. In the special case of two selections per Runge-Kutta step, this leads to alternative selection sets Û (subsets of U × U), where the full set U × U corresponds to case (iii) of independent selections in section 3. The set Û can be described by finitely many nonlinear inequalities and equalities, which can easily be imposed as additional constraints in the discretized optimal control problems of Algorithm 1.
The proposed method itself can be easily adapted to the calculation of convex reachable sets for special nonlinear differential inclusions. For the numerical solution of discretized optimal control problems, efficient algorithms are available, cf. [27,34,39]. In the more general case of nonconvex reachable sets, suitable modifications of our approach have to be studied. Theoretical results in this direction can be found in refs. [5,40] for Runge-Kutta methods of orders 1 and 2. A survey of other methods is given in refs. [9,20].
However, those Runge-Kutta methods with appropriate selection strategies which show a higher order of convergence in the linear case are worth being investigated also in the nonlinear case. In addition, these methods have to be compared with set-valued Runge-Kutta methods based on set arithmetic, cf. [9], which also work in the general nonlinear case. First steps in this direction can be found in ref. [9, Example 5.3.1].