NUMERICAL STABILIZATION OF BILINEAR CONTROL SYSTEMS

For bilinear control systems with constrained control values, extremal Lyapunov exponents are computed numerically by solving discounted optimal control problems. Based on this computation, a numerical algorithm for the calculation of stabilizing control functions is developed.

1. Introduction. In this paper we present numerical algorithms for the calculation of extremal Lyapunov exponents and for the stabilization of bilinear control systems in R^d, i.e. systems of the form

    ẋ(t) = A_0 x(t) + Σ_{j=1}^m u_j(t) A_j x(t)   (1.1)

with A_j ∈ R^{d×d}, j = 0, ..., m, and u(·) ∈ U := {u : R → U, u measurable}, where the set of control values U ⊂ R^m is compact and convex with nonvoid interior. The Lyapunov exponent of (1.1) with respect to an initial value x_0 ∈ R^d and a control function u(·) ∈ U is given by

    λ(x_0, u(·)) := lim sup_{t→∞} (1/t) ln ||x(t, x_0, u(·))||,   (1.2)

where x(t, x_0, u(·)) denotes the trajectory of (1.1). Bilinear control systems arise e.g. by linearization, with respect to x, of a nonlinear control system with a common fixed point x* for all control values u ∈ U. They were first systematically studied by R. Mohler [18] in 1973. Lyapunov exponents were introduced by A.M. Lyapunov in 1892 (under the name of order numbers) as a tool to study nonlinear differential equations via their linearizations along trajectories. Recent results about the Lyapunov spectrum of families of time varying matrices (cf. F. Colonius, W. Kliemann [11]) made it possible to characterize the domain of null controllability of bilinear systems using Lyapunov exponents (cf. F. Colonius, W. Kliemann [10]). A basic property of the Lyapunov exponents is that λ(x_0, u(·)) < 0 iff x(t, x_0, u(·)) converges to zero faster than any exponential e^{at} with λ(x_0, u(·)) < a < 0. As an easy consequence, inf_{u(·)∈U} λ(x_0, u(·)) < 0 implies that there exists a control function such that the corresponding trajectory converges to zero. The domain of null controllability (the set of all points x_0 with negative minimal Lyapunov exponent) may be only a part of R^d, and as a consequence stabilization may be possible only on subsets of R^d. Null controllability in this context always means asymptotic null controllability, since the origin is not reachable in finite time from any other point of the state space. This implies that an approach via the minimum time function (cf. e.g. M. Bardi, M. Falcone [1]) does not apply here.
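As a simple numerical illustration of definition (1.2) (not part of the algorithms developed below; the explicit Euler integrator and all function names are our own illustrative choices), the exponent λ(x_0, u(·)) can be estimated by integrating the bilinear system over a finite horizon and accumulating the logarithmic growth of the renormalized trajectory:

```python
import numpy as np

def lyapunov_exponent(A_list, u_func, x0, T=50.0, dt=1e-3):
    """Finite-time estimate of lambda = limsup (1/t) ln ||x(t)|| for the
    bilinear system x' = (A0 + sum_j u_j(t) A_j) x, via explicit Euler steps."""
    x = np.array(x0, dtype=float)
    t, log_norm = 0.0, 0.0
    for _ in range(int(T / dt)):
        u = u_func(t)
        A = A_list[0] + sum(uj * Aj for uj, Aj in zip(u, A_list[1:]))
        x = x + dt * (A @ x)
        # renormalize to avoid overflow/underflow, accumulating log growth
        r = np.linalg.norm(x)
        log_norm += np.log(r)
        x /= r
        t += dt
    return log_norm / T

# Uncontrolled example: A0 = diag(-1, -2), x0 in the first eigendirection,
# so the exponent should be close to the eigenvalue -1.
A0 = np.diag([-1.0, -2.0])
A1 = np.zeros((2, 2))
lam = lyapunov_exponent([A0, A1], lambda t: [0.0], [1.0, 0.0])
```

For this uncontrolled diagonal example the estimate reproduces the real part of the eigenvalue, here -1, up to the discretization error.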
In contrast to the direct approach to this stabilization problem via Lyapunov functions (cf. e.g. R. Chabour, G. Sallet, J. Vivalda [5]), the method developed here is in some sense an indirect approach: First, a numerical approximation of the extremal Lyapunov exponents of (1.1) is calculated. This enables us to characterize the stability properties of (1.1). Once this approximation is known, we stabilize the system (i.e. we find control functions such that the corresponding trajectories converge to zero) by searching for control functions whose corresponding Lyapunov exponent is close to the minimal exponent, or at least negative. In §2 these problems are discussed in terms of optimal control theory. We show that the problem of calculating extremal Lyapunov exponents, which can be expressed as an average yield optimal control problem, can be approximated by discounted optimal control problems.
If we look at the uncontrolled system with U = {0}, it turns out that the Lyapunov exponents are just the real parts of the eigenvalues of A_0. Together with the corresponding eigenspaces they determine the stability properties of the system. For the controlled system we need suitable generalizations of the eigenspaces associated with the Lyapunov exponents. The basic ideas of this concept are presented in §3, followed by an interpretation of the results of §2 in terms of calculating extremal Lyapunov exponents. Section 4 presents algorithms to solve discounted optimal control problems numerically, based on a discretization scheme by I. Capuzzo Dolcetta [2], [4] and M. Falcone [12], [13] connected to the framework of dynamic programming (cf. [3]). Section 5 contains several numerical examples calculated with these algorithms.
2. Discounted and average cost optimal control problems. In this section we will show that average yield optimal control problems can be approximated by discounted optimal control problems. Consider a control system on a connected n-dimensional C^∞-manifold M given by

    ẋ(t) = X(x(t), u(t)) for all t ∈ R   (2.1)
    x(0) = x_0 ∈ M   (2.2)

and assume that

    for all x ∈ M, u(·) ∈ U the trajectory ϕ(t, x, u(·)) exists for all t ∈ R.   (2.6)

We now consider the following two optimal control problems given by the control system (2.1)-(2.6) and a cost function g satisfying

    g : M × U → R continuous on M × U   (2.7)
    |g(x, u)| ≤ M_g for all (x, u) ∈ M × U.   (2.8)

The δ-discounted cost for δ > 0 and the average cost are defined by

    J_δ(x, u(·)) := ∫_0^∞ e^{−δt} g(ϕ(t, x, u(·)), u(t)) dt   (2.9)
    J_0(x, u(·)) := lim sup_{T→∞} (1/T) ∫_0^T g(ϕ(t, x, u(·)), u(t)) dt.   (2.10)

The associated optimal value functions are

    v_δ(x) := inf_{u(·)∈U} J_δ(x, u(·))   (2.11)
    v_0(x) := inf_{u(·)∈U} J_0(x, u(·)).   (2.12)

A basic property of the discounted optimal value function is Bellman's optimality principle: for any t > 0 we have

    v_δ(x) = inf_{u(·)∈U} { ∫_0^t e^{−δs} g(ϕ(s, x, u(·)), u(s)) ds + e^{−δt} v_δ(ϕ(t, x, u(·))) }.   (2.13)

For the average cost a similar estimate (2.14) is valid for any t > 0. Results about the relation between discounted and average cost optimal control problems as the discount rate tends to 0 have been developed by F. Colonius [6] and F. Wirth [20]. Here we will first show the relation between the values of δJ_δ and J_0 along certain trajectories. Then we will use techniques similar to those in [6] and [20] to obtain convergence results for the optimal value functions. The first theorem shows that J_0 is bounded if δJ_δ is bounded. Since J_0 has an infinite time horizon, it is not sufficient that δJ_δ is bounded for the initial value; it has to be bounded for all ϕ(t, x, u(·)), t > 0, and the corresponding shifted control functions.
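The two cost functionals can be compared numerically along a simple trajectory. The following sketch is illustrative only (the scalar example ẋ = −x with g(x, u) = x is our own choice, not taken from the paper); it approximates J_δ and J_0 by Riemann sums and shows δJ_δ approaching J_0 = 0 as δ → 0:

```python
import numpy as np

def discounted_cost(g_along, delta, dt):
    """Approximate J_delta = int_0^inf e^{-delta t} g(t) dt from samples
    g_along[k] = g(phi(k*dt), u(k*dt)) by a truncated Riemann sum."""
    t = np.arange(len(g_along)) * dt
    return np.sum(np.exp(-delta * t) * g_along) * dt

def average_cost(g_along, dt):
    """Approximate J_0 = limsup (1/T) int_0^T g(t) dt by a finite-T average."""
    T = len(g_along) * dt
    return np.sum(g_along) * dt / T

# scalar example x' = -x, g(x,u) = x, x(0) = 1, hence g(t) = e^{-t}
dt, T = 1e-3, 200.0
t = np.arange(int(T / dt)) * dt
g_traj = np.exp(-t)
J0 = average_cost(g_traj, dt)                      # close to 0
dJd = [d * discounted_cost(g_traj, d, dt) for d in (1.0, 0.1, 0.01)]
```

Here delta*J_delta = delta/(delta + 1) analytically, which decreases to J_0 = 0 as the discount rate tends to 0.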
Proof. We may assume C = 0 by using g − C instead of g. In the first step we show that for every t > 0 there exists a τ̄(t) such that (2.15) holds. Abbreviate f(s) := e^{−δ(s−t)} g(ϕ(s, x, u(·)), u(s)). Obviously there exists a τ̄(t) such that (2.15) is true for the shifted discounted functional. Fixing ε > 0 we can define a monotone increasing sequence (τ_i), i ∈ N, by τ_1 := t, τ_2 := t + γ, and so on. From the construction of this sequence it follows that τ_i converges to τ̄(t), and we may truncate the sequence by choosing k ∈ N such that |τ_{k−1} − τ̄(t)| < ε and setting τ_k := τ̄(t). Now we can estimate the resulting sums, which proves (2.15) since ε > 0 was arbitrary.
To prove the theorem we first fix T > 0 and define a sequence of times as in the first step; summing up the resulting estimates yields the assertion, which finishes the proof. Note that it is possible to simply replace ≤ by ≥ and −α by +α to obtain the analogous result for a lower bound of J_0.

Theorem 2.2. (Approximation Theorem II) Consider optimal control systems on M given by (2.1)-(2.6) and (2.7)-(2.10). Assume there exists a control function u(·) ∈ U such that J_0(x, u(·)) ≤ C − α for constants C ∈ R, α > 0. Then there exists a constant R = R(x, u(·), α) > 0 such that δJ_δ(x, u(·)) < C for all δ < R.
Proof. We may again assume C = 0. Hence it follows that there exists T_0 ≥ 0 such that (2.17) holds. Now assume that δJ_δ(x, u(·)) ≥ 0 for arbitrarily small δ > 0. The first step of the proof of Theorem 2.1 for the opposite inequality, with t = 0 and applied to g + α/2, then yields that there exist arbitrarily large times T̄ > 0 such that

    ∫_0^T̄ g(ϕ(t, x, u(·)), u(t)) dt + T̄ α/2 > 0,

which contradicts (2.17). Hence the assertion follows. In contrast to the first Approximation Theorem, it is not possible here to simply replace ≤ by ≥ and −α by +α to obtain an analogous result for the lower bound. Estimate (2.17) holds for the reverse inequality only if in (2.10) the lim sup is replaced by the lim inf.
We will now combine these two theorems with controllability properties to obtain results about the relation between δv 0 and v δ as δ tends to 0. To do this we first introduce some definitions.
(ii) for every x ∈ D there is u(·) ∈ U such that the corresponding trajectory ϕ(t, x, u(·)) stays in D for all t ≥ 0,
(iii) D is maximal with the properties (i) and (ii).
A non-invariant control set is called variant.
In order to avoid degenerate situations we need the following setup: Let L = LA{X(·, u), u ∈ U} denote the Lie algebra generated by the vector fields X(·, u). Let Δ_L denote the distribution generated by L in TM, the tangent bundle of M. Assume that

    dim Δ_L(x) = dim M for all x ∈ M.   (2.18)

This assumption guarantees that the positive and negative orbits of any point x ∈ M up to any time T ≠ 0 have nonvoid interior. Note that the definition of control sets demands only approximate reachability (i.e. the existence of controls steering into any neighbourhood of a given point); as a consequence of assumption (2.18) we have exact controllability in the interior of control sets, more precisely int D ⊂ O^+(x) for all x ∈ D.
The following proposition shows, as an extension of [7, Proposition 2.3], that we have exact controllability in finite time on certain compact subsets:

Proposition 2.5. Consider a control system on M given by (2.1)-(2.6) and satisfying (2.18). Let D ⊂ M be a control set and consider compact sets K_1 and K_2. Then there exists a constant r > 0 such that for every x ∈ K_1, y ∈ K_2 there exists a control function u(·) ∈ U with ϕ(t_0, x, u(·)) = y for some t_0 ≤ r.
Proof. (i) We first show that for every x ∈ K_1, z ∈ K_2 there is an open neighborhood U(x) such that all y ∈ U(x) can be steered to z in bounded time t_0. By (2.18) there are T < ∞ and z_1 ∈ int D ∩ O^−_{≤T}(z), and an open neighborhood U(z_1) ⊂ int D ∩ O^−_{≤T}(z). For x ∈ K_1 there exist a control u(·) ∈ U and a time t_1 < ∞ such that ϕ(t_1, x, u(·)) = z_1 (as a consequence of exact controllability in the interior of control sets). Since the solutions of the system depend continuously on the initial value, there is an open neighborhood U(x) with ϕ(t_1, x_1, u(·)) ∈ U(z_1) for all x_1 ∈ U(x). Putting this together yields U(x) ⊂ O^−_{≤ t_1+T}(z), which proves the assertion with t_0 ≤ t_1 + T.
(ii) For x ∈ K_1, y ∈ K_2 we now show that there exists a time t_y < ∞ such that all z in some open neighborhood of y can be reached from x in time t_y. Let x_1 ∈ int D and u_1(·) ∈ U, t_1 < ∞ be such that ϕ(t_1, x, u_1(·)) = x_1 (the existence of x_1, u_1(·), t_1 follows from (2.18)). Again by (2.18) there exist T < ∞ and y_1 ∈ int D ∩ O^+_{≤T}(x_1); let U(y_1) be an open neighborhood of y_1 contained in int D ∩ O^+_{≤T}(x_1). Now, because of the exact controllability, there exist u_2(·) ∈ U, t_2 < ∞ with ϕ(t_2, y_1, u_2(·)) = y. Since the solution of the control system using the control u_2(·) defines a semigroup of homeomorphisms on M, the open neighborhood U(y_1) is mapped onto some open neighborhood U(y), and U(y) ⊂ O^+_{≤ t_1+T+t_2}(x). This means that all z ∈ U(y) can be reached from x in time t_y = t_1 + T + t_2.
(iii) Since K_1 and K_2 are compact, finitely many of the neighborhoods U(x) from step (i) and U(y) from step (ii) cover K_1 and K_2, respectively, and the proposition follows with r given by the maximum of the finitely many corresponding transition times.
The following proposition summarizes the consequences of these controllability properties for the optimal value functions.
Proof. Just combine (2.13) and (2.14) with the controllability properties stated above.
Now we can formulate the results about the relation between the optimal value functions.
Combining these inequalities finishes the proof. Using the estimate of Proposition 2.6, two results on uniform convergence can be obtained.
Proof. Since C is compact and no trajectory can leave C, the conditions of Theorem 2.10 (with K = C) are fulfilled. Hence assertions (i) and (ii) follow.
If M is compact and C is the unique invariant control set, it follows that O^−(C) = M. From Proposition 2.6 (ii) and (iv) and the compactness of M = O^−(C) it follows for any compact subset Q ⊂ int C that v_0(x) ≤ v_0(y) and δv_δ(x) ≤ δv_δ(y) + ε(δ) for all x ∈ M, y ∈ Q. Since we have uniform convergence on Q, assertion (iii) is proved.
Remark 2.12. Note that these results are not valid in general for the corresponding maximization problems, since the second Approximation Theorem is not valid for the reverse inequality. However, some of the results remain valid and others are valid under additional conditions: (i) The application of the results to the maximization problems is possible if the lim sup in (2.10) can be replaced by a lim inf without changing the value of v_0. This is possible if there exist approximately optimal trajectories and controls (with respect to the maximization problem) such that the lim sup is a limit. From [20, proof of Proposition 1.4, (a)] it is clear that this is the case if there exist approximately optimal trajectories and controls which are periodic. A sufficient condition for this is that there exists an optimal trajectory that stays inside some compact subset K ⊂ int D (cf. [20, Proposition 2.7]).
(ii) Adding this condition to the assumptions of Theorem 2.10, we obtain the corresponding result of F. Wirth [20] under the weaker condition that the optimal trajectories with respect to the discounted problems stay inside a compact subset of a control set instead of a compact subset of the interior of a control set.
(iii) For invariant control sets C we can use [7,Corollary 4.3] to conclude that for any initial value x 0 ∈ intC there exist approximately optimal periodic control functions and trajectories. Hence Theorem 2.11 remains valid for the maximization problem without any additional assumptions.
3. Lyapunov exponents of bilinear control systems. We will now return to the bilinear control systems in R^d, i.e. systems of the form

    ẋ(t) = A_0 x(t) + Σ_{j=1}^m u_j(t) A_j x(t) =: A(u(t)) x(t).   (3.1)

We denote the unique trajectory for any initial value x_0 ∈ R^d and any control function u(·) ∈ U by x(t, x_0, u(·)).
In order to characterize the exponential growth rate of the solutions of (3.1) we define the Lyapunov exponent of a solution by

    λ(x_0, u(·)) := lim sup_{t→∞} (1/t) ln ||x(t, x_0, u(·))||.   (3.2)

The minimal Lyapunov exponent with respect to x_0 ∈ R^d \ {0} is defined by

    λ*(x_0) := inf_{u(·)∈U} λ(x_0, u(·))

and the extremal Lyapunov exponents of the control system by

    κ := sup_{x_0 ≠ 0} sup_{u(·)∈U} λ(x_0, u(·)),   κ* := inf_{x_0 ≠ 0} λ*(x_0),   κ̄ := sup_{x_0 ≠ 0} λ*(x_0).

The Lyapunov exponent can be interpreted as a measure of the exponential growth of trajectories. Our aim is to calculate numerical approximations of the minimal and maximal Lyapunov exponents with respect to the initial values. If λ*(x_0) < 0, the system can be steered asymptotically to the origin from x_0. Using the approximation of the Lyapunov exponents we are then able to calculate controls that stabilize the system.
For a bilinear control system (3.1) the following identity is obvious:

    x(t, αx_0, u(·)) = α x(t, x_0, u(·)) for all α ≠ 0,

hence λ(αx_0, u(·)) = λ(x_0, u(·)). Due to this observation we can identify all x ≠ 0 lying on a straight line through the origin. Hence it is sufficient to consider initial values s_0 in P^{d−1}, the real projective space. To calculate the Lyapunov exponents we can project the system onto the unit sphere S^{d−1} via s_0 := x_0/||x_0||. This yields the projection onto P^{d−1} by identifying opposite points. A simple application of the chain rule shows that the projected system can be written as

    ṡ(t) = h(s(t), u(t)) := [A(u(t)) − (s(t)^T A(u(t)) s(t)) Id] s(t).   (3.7)

The Lyapunov exponent (3.2) with respect to s_0 = x_0/||x_0|| can be written as

    λ(s_0, u(·)) = lim sup_{t→∞} (1/t) ∫_0^t q(s(τ, s_0, u(·)), u(τ)) dτ   with   q(s, u) := s^T A(u) s.

We recall some facts about projected bilinear control systems and their Lyapunov exponents.
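A minimal sketch of this projection (illustrative only; the function names and the explicit Euler stepping are our own choices, and a constant control is absorbed into a fixed matrix A) computes the Lyapunov exponent as the time average of q(s, u) = s^T A(u) s along the projected trajectory:

```python
import numpy as np

def project_step(s, A, dt):
    """One Euler step of the projected system s' = (A - (s^T A s) Id) s,
    followed by renormalization onto the unit sphere."""
    q = s @ A @ s
    s_new = s + dt * (A @ s - q * s)
    return s_new / np.linalg.norm(s_new), q

def exponent_via_projection(A, s0, T=50.0, dt=1e-3):
    """Lyapunov exponent as the time average of q(s) = s^T A s along the
    projected trajectory (constant control absorbed into A)."""
    s = np.array(s0, float) / np.linalg.norm(s0)
    acc = 0.0
    for _ in range(int(T / dt)):
        s, q = project_step(s, A, dt)
        acc += q * dt
    return acc / T

# For A = diag(1, -1) with a generic initial direction the projected
# trajectory converges to the dominant eigendirection, so the time average
# of q tends to the maximal eigenvalue 1.
lam = exponent_via_projection(np.diag([1.0, -1.0]), [0.6, 0.8])
```

The transient before the trajectory reaches the dominant direction only contributes an O(1/T) error to the average.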
For the projected bilinear system, assumption (2.18) reads

    dim Δ_L(s) = d − 1 for all s ∈ P^{d−1}.

Under this assumption the following facts hold (cf. [9, Corollary 4.4], [8, Theorem 3.10]): If κ_1 denotes the maximal Lyapunov exponent of the original system and κ*_2 the minimal exponent of the time reversed system, the identity κ_1 = −κ*_2 holds. For the projected system there exist k control sets with nonvoid interior, where 1 ≤ k ≤ d. These are called the main control sets. They are linearly ordered by D_1 ≺ D_2 ≺ ... ≺ D_k. The control set D_1 is open, the control set C := D_k is closed and invariant. All other main control sets are neither open nor closed. Furthermore we have O^−(p) = P^{d−1} for all p ∈ int C.
The linear order of the control sets implies a linear order on the minimal Lyapunov exponents (which can easily be proved using Proposition 2.6). Under the following condition there is a stronger relation between the control sets of the projected system and the Lyapunov exponents of the bilinear system: Considering the sets of control values ρU := {ρu | u ∈ U} for ρ ≥ 0 and the corresponding sets of control functions U^ρ, we assume the ρ-ρ′ inner pair condition of [11]. Let D^ρ be a main control set corresponding to U^ρ. We define the Lyapunov spectrum of (3.1) over D^ρ by

    Σ_Ly^ρ(D^ρ) := {λ(p, u(·)) | ϕ(t, p, u(·)) ∈ D^ρ for all t ≥ T for some T ≥ 0}

and the Lyapunov spectrum of (3.1) by

    Σ_Ly^ρ := ∪_{i=1,...,k(ρ)} Σ_Ly^ρ(D_i^ρ)   (3.10)

for all except at most countably many ρ < ρ′, where k(ρ) is the number of main control sets D_i^ρ corresponding to U^ρ ([11, Corollary 5.6]). Furthermore the Σ_Ly^ρ(D_i^ρ) are closed intervals, and thus it is sufficient to calculate the minima and maxima of Σ_Ly^ρ(D_i^ρ) to obtain the whole Lyapunov spectrum of the system. These maxima and minima can be approximated by periodic trajectories with initial values in int D_i^ρ. In the case d = 2 these results hold for all ρ > 0 without assuming the ρ-ρ′ inner pair condition ([11, Corollary 4.9]).
We will now give an interpretation of the results of §2 in terms of calculating Lyapunov exponents and stabilization. Since we are going to solve the discounted optimal control problem numerically, we cannot expect to calculate optimal control functions but only ε-optimal control functions. We call a control function u_x(·) ∈ U uniformly ε-optimal with respect to x ∈ M iff

    |δJ_δ(ϕ(t, x, u_x(·)), u_x(t + ·)) − δv_δ(ϕ(t, x, u_x(·)))| < ε for all t ≥ 0.

Then the following estimates hold with ε → 0 as δ tends to 0.
(vii) If κ̄ < 0 and u_s(·) is uniformly ε-optimal with respect to s, then ϕ(t, x, u_s(·)) is asymptotically stable for all x ∈ R^d with s = x/||x||, provided δ and ε are sufficiently small.
(viii) If λ* < 0 in the interior of some control set D, u_s(·) is uniformly ε-optimal with respect to s, and ϕ(t, s, u_s(·)) stays inside a compact subset of O^−(D) for all times, then ϕ(t, x, u_s(·)) is asymptotically stable for all x ∈ R^d with s = x/||x||, provided δ and ε are sufficiently small.
Proof. All assertions follow directly from the results in §2. Assertion (iv) is true since the invariant control set of the projected system is compact. Assertions (v) and (vi) are proved using the fact that the projective space is compact and that there exists a unique invariant control set for the projected system.

Remark 3.2. Knowing the facts cited in this section we can see that even more can be calculated: (i) Property (vi) can be used to calculate κ* by calculating κ of the time reversed system. Hence it is possible to approximate κ, κ* and κ̄ for any bilinear control system satisfying (2.18) by solving discounted optimal control problems.
(ii) For all main control sets D_i we can approximate the minimal Lyapunov exponent over int D_i as follows: Proposition 2.8 yields that δv_δ < λ* + ε uniformly on compact subsets of int D_i. If we find control functions as described in Theorem 3.1 (viii) for ε > 0, we know that there exists a Lyapunov exponent λ* < δv_δ + ε, hence λ* ∈ [δv_δ − ε, δv_δ + ε]. However, the existence of such control functions is not guaranteed; nevertheless, for all examples discussed in §5 it was possible to find them.
(iii) For systems with d = 2, or systems with d > 2 satisfying the ρ-ρ′ inner pair condition, we are also able to compute Σ_Ly^ρ(D) for D = C and D = D_1, at least for all but countably many ρ > 0, since in these cases the upper and lower bounds of these intervals coincide with the extremal exponents of the original or of the time reversed system, respectively. For all other main control sets we can apply the technique from (ii) to both the original and the time reversed system to calculate Σ_Ly^ρ(D_i). (iv) In the case that d > 2 and ρ > 0 is one of the (at most countably many) exceptional points of the spectrum (3.10), we can use the monotonicity of v_δ and Σ_Ly^ρ in ρ. This implies that there exist values ρ_1 < ρ < ρ_2 arbitrarily close to ρ such that the approximated spectrum contains Σ_Ly^{ρ_1} and is contained in Σ_Ly^{ρ_2}.

4. Numerical solution of the discounted optimal control problem. A discretization scheme to solve discounted optimal control problems in R^n has been developed by I. Capuzzo Dolcetta and M. Falcone [2], [3], [4], [12], [13]. The algorithm used here to solve these problems is based on this discretization. We will first describe this discretization scheme and then present the modifications for our case, where the system is given on P^{d−1} instead of R^n.
Hence we first assume that we have a discounted optimal control problem defined by (2.1)-(2.6) and (2.8) with M = R^n. In addition we need the following Lipschitz conditions on X and g:

    ||X(x, u) − X(y, u)|| ≤ L_X ||x − y|| for all x, y ∈ W, u ∈ U, for some L_X ∈ R   (4.2)
    |g(x, u) − g(y, u)| ≤ L_g ||x − y|| for all x, y ∈ W, u ∈ U, for some L_g ∈ R.   (4.3)

The δ-discounted cost functional J_δ and the optimal value function v_δ are defined as in (2.9) and (2.11).
Under the assumptions made above the value function v_δ satisfies

    |v_δ(x) − v_δ(y)| ≤ C ||x − y||^γ for all x, y ∈ R^n   (4.4)

(cf. [4]; the second estimate can be proved by using [4, Lemma 4.1]). For small δ > 0 we have C = M/δ for a constant M independent of δ, and γ is a constant satisfying γ = 1 for δ > L_X, γ = δ/L_X for δ < L_X, and γ ∈ (0, 1) arbitrary for δ = L_X. Furthermore (cf. [17]) v_δ is the unique bounded and uniformly continuous viscosity solution of the Hamilton-Jacobi-Bellman equation

    δ v_δ(x) + sup_{u∈U} { −Dv_δ(x) · X(x, u) − g(x, u) } = 0.   (4.5)

The first discretization step is a discretization in time. Replacing Dv_δ by the difference quotient with time step h, one obtains

    v_h(x) = min_{u∈U} { (1 − δh) v_h(x + hX(x, u)) + h g(x, u) }.   (4.6)

It turns out that the unique bounded solution of this equation is the optimal value function v_h of the discretized optimal control system with respect to the space U_h of all controls constant on each interval [jh, (j+1)h), j ∈ N:

    x_0 := x,  x_{j+1} := x_j + hX(x_j, u_j),  j = 0, 1, 2, ...   (4.7)

with running cost

    J_h(x, u(·)) := h Σ_{j=0}^∞ (1 − δh)^j g(x_j, u_j).   (4.8)

The discretization error can be estimated by

    sup_{x∈R^n} |v_h(x) − v_δ(x)| ≤ C h^γ   (4.9)

for all h ∈ (0, 1/δ). Here we have C = M/δ^2 for small δ > 0, and γ is the constant from (4.4).
The discretization error of the functionals for any u(·) ∈ U_h can be estimated as

    sup_{x∈R^n, u(·)∈U_h} |J_h(x, u(·)) − J_δ(x, u(·))| ≤ C h^γ,   (4.10)

where C = M/δ for small δ > 0 and γ is as above ([4, Lemma 4.1]). In order to reduce (4.6) to a finite dimensional problem we apply a finite difference technique. To do this we assume the existence of an open, bounded and convex subset Ω of the state space R^n which is invariant for (2.1). Thus a triangulation of Ω into a finite number P of simplices S_j with N nodes x_i can be constructed (cf. [13, Proposition 2.5]) such that Ω^k := ∪_{j=1,...,P} S_j is invariant with respect to the discretized trajectories (4.7). Here k := sup{ ||x − y|| : x and y are nodes of S_j, j = 1, ..., P }. We are now looking for a solution of (4.6) in the space of piecewise affine functions W := {w ∈ C(Ω^k) | Dw(x) = c_j in S_j}.
Every point x_i + hX(x_i, u) can be written as a convex combination of the nodes of the simplex containing it, with coefficients λ_ij(u). Let Λ(u) := [λ_ij(u)]_{i,j=1,...,N} be the matrix containing these coefficients and G(u) := [g(x_i, u)]_{i=1,...,N} an N-dimensional vector containing the values of g with control value u at the nodes of the triangulation. Now we can rewrite (4.6) as a fixed point equation

    V = T_h^k(V),   T_h^k(V) := min_{u∈U} { β Λ(u) V + h G(u) },   (4.11)

where the minimum is taken componentwise. It follows that T_h^k is a contraction in R^N with contraction factor β := 1 − δh and therefore has a unique fixed point V*. If v_h^k denotes the function obtained by v_h^k(x_i) := [V*]_i and linear interpolation between the nodes, the discretization error can be estimated with γ as in (4.4) and C = M/δ^2 for small δ > 0 (estimate (4.12); cf. [13, corrigenda]). For the whole discretization error we obtain an estimate (4.13) combining the constants from (4.9) and (4.12).
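The contraction T_h^k can be sketched directly in code. The following fixed point iteration is a hedged illustration, not the implementation used in the paper; the tiny two-node instance, with a "stay" and a "swap" interpolation matrix and costs already scaled by h, is purely artificial:

```python
import numpy as np

def value_iteration(Lam, G, beta, tol=1e-10, max_iter=100000):
    """Solve V = min_u [ beta * Lam(u) @ V + G(u) ] by fixed point iteration
    with the contraction T (contraction factor beta = 1 - delta*h).
    Lam: list of (N,N) row-stochastic interpolation matrices, one per control
    value; G: matching list of N-vectors holding h*g(x_i, u)."""
    V = np.zeros(G[0].shape[0])
    for _ in range(max_iter):
        # componentwise minimum over the (finitely many) control values
        V_new = np.min([beta * L @ V + g for L, g in zip(Lam, G)], axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

# artificial two-node instance: control 0 stays put, control 1 swaps nodes
Lam = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]
G = [np.array([1.0, 0.0]), np.array([0.5, 0.5])]   # entries are h*g(x_i, u)
V_star = value_iteration(Lam, G, beta=0.99)
```

For this instance the fixed point is V* = (0.5, 0): node 2 stays at cost 0, node 1 swaps once at cost 0.5. With beta close to 1 the plain iteration needs many sweeps, which motivates the accelerated methods below.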
with similar constants C and γ.

Remark 4.2. Note that the convergence becomes slow if the discount rate δ becomes small. For the approximation of the average cost functional as described in §2 it is nevertheless necessary to calculate v_δ for small δ > 0. This means that for this purpose we need a fine discretization in time and space to get reliable results. If one uses estimate (4.13), we obtain as an additional condition that k should be smaller than h; using (4.15), convergence for the case k = h is guaranteed.
To handle the optimal control problem on P^{d−1} we use the following modifications of this scheme: We first consider the optimal control problem on S^{d−1} defined by the projected system (3.7). The optimal value function v_δ then again satisfies (4.4) and is the unique bounded and uniformly continuous viscosity solution of (4.5). This can be proved in exactly the same way as in the R^n case by using the metric on S^{d−1} induced by the norm on R^d.
We have seen that the discretization in time of (4.5) corresponds to the Euler discretization of the control system. Hence here we use the following Euler method on S^{d−1}: for h > 0 and any s ∈ S^{d−1} we define

    Φ_h(s, u) := (s + hX(s, u)) / ||s + hX(s, u)||,   (4.16)

i.e. we perform an Euler step in R^d and project the solution back onto S^{d−1}. With this, (4.6) reads

    v_h(s) = min_{u∈U} { (1 − δh) v_h(Φ_h(s, u)) + h g(s, u) }.   (4.17)

The estimates (4.8)-(4.10) remain valid; again all proofs from the R^n case apply by using the metric on S^{d−1} induced by the norm on R^d.
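The projected Euler step (4.16) is a one-liner. The following sketch (the rotational vector field is an assumed stand-in, not the projected bilinear field) only illustrates that the step always returns a point on the unit sphere:

```python
import numpy as np

def phi_h(s, X, u, h):
    """Projected Euler step (4.16): Euler step in R^d, then renormalization
    back onto the unit sphere S^{d-1}."""
    y = s + h * X(s, u)
    return y / np.linalg.norm(y)

# assumed illustrative vector field: rotation in R^2
X_rot = lambda s, u: np.array([-s[1], s[0]])
s = phi_h(np.array([1.0, 0.0]), X_rot, None, 0.1)
```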
We will now use the fact that this discrete time control system on S^{d−1} induces a well defined control system on P^{d−1}: since Φ_h(−s, u) = −Φ_h(s, u), the mapping is well defined on P^{d−1}, and g(s, u) = g(−s, u) implies that v_h does not change if we only consider trajectories in W, one half of the sphere. Hence we can define a discrete time optimal control problem on W via an induced mapping Φ̃_h. To obtain a region Ω ⊂ R^{d−1} suitable for the space discretization we use a parametrization Ψ of S^{d−1} which is invertible on W; by definition of Φ̃_h, the set Ω is invariant for this discrete time system. We can rewrite (4.17) by using Φ_{h,Ω} and g_Ω, denoting the solution by v_{h,Ω}. This solution satisfies v_h(Ψ(x)) = v_{h,Ω}(x), and since Ψ is Lipschitz continuous, estimate (4.4) remains valid for v_{h,Ω}.
Thus we can proceed as in the R n case described above. Keeping in mind that there exists a one-to-one relation between the system on W and the system on Ω we can simplify the notation by writing Φ h , g and v h instead of Φ h,Ω , g Ω and v h,Ω .
We will now turn to the problem of how the fixed point equation (4.11) can be solved numerically. In order to do this it is possible to use the contraction T_h^k to construct an iteration scheme, but since the contraction factor β = 1 − δh is close to 1, this iteration converges rather slowly. An acceleration method for this iteration scheme has been proposed by M. Falcone [12]. Falcone uses the set V of monotone convergence of T_h^k given by V := {V ∈ R^N | T_h^k(V) ≥ V}, where "≥" denotes the componentwise order. A simple computation shows that V is a convex closed subset of R^N. Given a V_0 ∈ V, the operator T_h^k is used to determine an initial direction. The algorithm follows this direction until it crosses the boundary of V, then determines a new direction using T_h^k and continues the same way.
A different algorithm to calculate V* can be developed by observing that V* is the componentwise maximum of V and that V can be written as

    V = { V ∈ R^N | [V]_i ≤ min_{u∈U} ( h g(x_i, u) + β Σ_{j≠i} λ_ij(u) [V]_j ) / (1 − β λ_ii(u)) for all i }.

Note that the fraction on the right side does not depend on [V]_i. Thus we can construct the increasing coordinate algorithm:

Step 1: choose an initial vector V ∈ V.
Step 2: compute sequentially, for i = 1, ..., N,

    [V]_i := min_{u∈U} ( h g(x_i, u) + β Σ_{j≠i} λ_ij(u) [V]_j ) / (1 − β λ_ii(u)).

Step 3: continue with Step 2 and the new vector V.
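A sketch of the increasing coordinate algorithm in code (our own illustration, assuming the coordinate update raises [V]_i to min_u (hg(x_i,u) + β Σ_{j≠i} λ_ij(u)[V]_j)/(1 − β λ_ii(u)); the two-node instance and the choice of the starting vector are artificial):

```python
import numpy as np

def increasing_coordinate(Lam, G, beta, sweeps=5000, tol=1e-12):
    """Increasing coordinate algorithm (sketch). Starting from a vector in
    the monotone set {V : T(V) >= V} (a large negative constant vector lies
    in it), sequentially set each coordinate to the largest admissible value."""
    N = G[0].shape[0]
    V = np.full(N, -np.max(np.abs(G)) / (1.0 - beta))  # lies in the set V
    for _ in range(sweeps):
        change = 0.0
        for i in range(N):
            cand = min(
                (g[i] + beta * (L[i] @ V - L[i, i] * V[i]))  # beta*sum_{j!=i}
                / (1.0 - beta * L[i, i])
                for L, g in zip(Lam, G)
            )
            change = max(change, abs(cand - V[i]))
            V[i] = cand
        if change < tol:
            break
    return V

# same artificial two-node instance as for the plain fixed point iteration
Lam_ex = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]
G_ex = [np.array([1.0, 0.0]), np.array([0.5, 0.5])]   # entries are h*g(x_i, u)
V_ic = increasing_coordinate(Lam_ex, G_ex, beta=0.99)
```

Each coordinate update is about as expensive as one row of an evaluation of T_h^k, which is the cost comparison made in the figure discussion below.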
[Figure: Falcone's accelerated method (left) and the increasing coordinate algorithm (right).] Note that for every arrow in the left picture the intersection between the initial direction and the boundary of V has to be determined. To do this (e.g. by bisection, as in the implementation used here) the operator T_h^k has to be evaluated several times to decide whether a point is inside or outside V. In the increasing coordinate algorithm, N arrows (i.e. two arrows in this figure) are calculated by N evaluations of the fraction in Step 2. These N evaluations are about as expensive as one evaluation of T_h^k. This means that one iteration of the increasing coordinate algorithm corresponds to one evaluation of T_h^k in the acceleration method. The convergence of this algorithm is guaranteed by the following lemma:

Lemma 4.3. Let V^1 be the vector obtained by applying Step 2 for i = 1, ..., N to a vector V^0 ∈ V. Then V^1 ∈ V and V^1 ≥ T_h^k(V^0).

The convergence of the increasing coordinate algorithm therefore is a consequence of the monotone convergence of the iteration scheme using the contraction T_h^k. All iteration methods described here have in common that during the iteration a minimum over all u ∈ U has to be calculated. The following lemma shows that this can be done by minimizing over a finite set U_ε ⊂ U.
Lemma 4.4. Assume that X and g are uniformly Lipschitz continuous in the control u ∈ U with Lipschitz constant L_u. Let U_ε ⊂ U be such that for all u ∈ U there exists ū ∈ U_ε with ||u − ū|| < ε, and let U_ε denote the corresponding set of control functions. Then for all s ∈ S^{d−1} it holds that

    |v_δ(s) − inf_{u(·)∈U_ε} J_δ(s, u(·))| ≤ C ε

for a constant C > 0.

Proof. For all u(·) ∈ U there exists ū(·) ∈ U_ε such that ||u(t) − ū(t)|| < ε for almost all t ∈ R. Hence we have

    ||ϕ(t, s, u(·)) − ϕ(t, s, ū(·))|| ≤ L_u ε t + ∫_0^t L_X ||ϕ(τ, s, u(·)) − ϕ(τ, s, ū(·))|| dτ,

where ||·|| denotes the norm on R^d. Now the Gronwall Lemma and [4, Lemma 4.1] can be used to estimate this integral inequality, and the assertion follows.
For the projected bilinear control system with cost function g = q, the assumptions of Lemma 4.4 are fulfilled, and hence we may use a finite set of control values to calculate v_h^k. Once v_h^k is calculated, it can be used to construct ε-optimal control functions:

Step 1: Let x_0 = x, n = 0.
Step 2: Choose ũ_{x_n,h}^k ∈ U such that h g(x_n, u) + β v_h^k(Φ_h(x_n, u)), u ∈ U, attains its minimum in u = ũ_{x_n,h}^k.
Step 3: Define u_{x,h}^k(t) := ũ_{x_n,h}^k for t ∈ [nh, (n + 1)h).
Step 4: Let x_{n+1} = Φ_h(x_n, ũ_{x_n,h}^k), n = n + 1, and continue with Step 2.

In Step 2 a unique ũ_{x_n,h}^k ∈ U may be found e.g. by using a lexicographic order on U.
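The steps above amount to a greedy minimization of the dynamic programming right-hand side in each time step. A hedged sketch follows; the scalar dynamics, cost and value function are artificial placeholders (not the projected bilinear system), chosen only so that the loop structure is visible:

```python
import numpy as np

def greedy_control(v, Phi_h, g, x0, h, delta, U_eps, n_steps):
    """Construct a piecewise constant control by choosing, in each step, the
    control value minimizing h*g(x,u) + (1 - delta*h) * v(Phi_h(x,u)),
    where v is an (approximated) optimal value function."""
    beta = 1.0 - delta * h
    x = np.array(x0, float)
    controls, traj = [], [x]
    for _ in range(n_steps):
        u_best = min(U_eps, key=lambda u: h * g(x, u) + beta * v(Phi_h(x, u)))
        x = Phi_h(x, u_best)
        controls.append(u_best)
        traj.append(x)
    return controls, traj

# artificial placeholder problem: scalar state pushed toward the origin
Phi = lambda x, u: x + 0.1 * u       # assumed discrete dynamics
g_cost = lambda x, u: x * x          # assumed running cost
v_approx = lambda x: x * x           # assumed value function
ctrl, traj = greedy_control(v_approx, Phi, g_cost, 1.0,
                            h=0.1, delta=0.1, U_eps=(-1.0, 0.0, 1.0),
                            n_steps=30)
```

In this placeholder the greedy rule first selects u = -1 until the state reaches the origin and then switches to u = 0.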
Theorem 4.5. Let u_{x,h}^k denote the control function defined above. Then for every ε > 0 there exist H > 0, K(h) > 0 such that for all h < H, k ≤ K(h) the control u_{x,h}^k is ε-optimal.

Proof. Using (4.12) or (4.14) and the definition of u_{x,h}^{k,i} := u_{x,h}^k |_{[ih,(i+1)h)}, we have for sufficiently small k and x_i from (4.7) that h g(x_i, u) + β v_h^k(Φ_h(x_i, u)), u ∈ U, attains its minimum in u_{x,h}^{k,i}. Putting this together, by induction we can conclude that for every ε > 0, p ∈ N, h > 0 there exists k > 0 such that (4.21) holds. Since β < 1 for all h > 0 and g and v_h^k are bounded on Ω^k, for every ε > 0 we may find a p_h ∈ N such that (4.22) holds. Combining (4.12) or (4.14), (4.21) and (4.22), and using estimates (4.10) and (4.9), the assertion follows.

Remark 4.6. The proof also shows how k and h have to be chosen: first choose h such that (4.10) and (4.9) hold for the desired accuracy, then choose k depending on p_h from (4.22) such that (4.21) is fulfilled.
To construct a control function that is uniformly ε-optimal we can put together the ε-optimal control functions according to the following definition and lemma.
Choosing i ∈ N maximal with τ_i ≤ σ and x_0 := ϕ(τ_i, x, ū_x(·)), the desired estimate follows, which finishes the proof.

Remark 4.9. This lemma does not answer the question of which switching times τ_i are optimal. In estimate (4.23) we have to assume the worst case, i.e. that the error up to time t may be equal to ε for all t > 0, and hence the error becomes large if a = min(τ_{i+1} − τ_i) becomes small. The numerical examples discussed in the next section show that good results can be obtained for small a.
Using the results from Theorem 3.1, we can use the control functions constructed here to formulate an algorithm to stabilize bilinear control systems:

Step 1: Calculate v_h^k, the approximation of the optimal value function, for a small discount rate δ > 0, in order to approximate the minimal Lyapunov exponents of the system (under the assumptions of Theorem 3.1).
Step 2: Given an initial value x ∈ R d with λ * (x) < 0 compute the control function that is ε-optimal along its trajectory according to Definition 4.7 (using the projected system). The trajectory of the bilinear system using this control is asymptotically stable under the assumptions of Theorem 3.1 provided h and k are small enough.
Note that the main numerical expense lies in the calculation of the approximated optimal value function v k h . Once this function is known the algorithm to calculate the control functions is numerically simple and quite fast.
For this algorithm only the information x(t, x_0, u(·))/||x(t, x_0, u(·))|| of the bilinear system is needed. In particular, the calculated control functions are exactly the same for all x_1, x_2 ∈ R^d with x_1/||x_1|| = x_2/||x_2||, and hence the algorithm works for arbitrarily large or small ||x||. It is not necessary to discretize the trajectory of the bilinear system or to lift the discretized solution from S^{d−1} to R^d, which would imply that small discretization errors on S^{d−1} could become large in R^d.
The value function and the corresponding optimal control values for each point can also be used to "verify" assumption (viii) of Theorem 3.1 numerically: if there exists a set on which v_h^k < 0 and this set is invariant with respect to the numerically computed optimal controls, then the corresponding trajectory tends to 0 for any initial value from this set, provided the discretization is fine enough (see also Remark 3.2).
Remark 4.10. The way the stabilizing control functions are constructed leads to the question of whether v_h^k can be used to construct a stabilizing feedback for the bilinear control system. This question is closely related to the optimal switching times τ_i. If it is possible to choose (τ_{i+1} − τ_i) arbitrarily small, it might also be possible to obtain an ε-optimal feedback, e.g. by linear interpolation or averaging of the feedback for the discrete-time system.
The main problem in proving this property of the switching times lies in the fact that the Euler method yields only linear convergence in h, hence quadratic convergence over one time step. Thus the difference between v_h(φ(h, x, u)) (the value that can be reached after the first time step) and v_h(Φ_h(x, u)) (the value that is supposed to be reached) is of the order h^{2γ}. For γ < 1/2 this error will accumulate and convergence is no longer guaranteed. However, there is hope of overcoming this difficulty by using a higher-order method to calculate Φ_h(x, u), which would then require a different proof of the convergence of v_h.

5. Numerical examples.
In this section we will present some numerical examples calculated with the algorithm developed in the previous sections. All examples were computed on an IBM6000 Workstation.
The first example is a bilinear control system in R^2, the two-dimensional linear oscillator given by

    ẍ + 2bẋ + (1 + u)x = 0

or, written as a two-dimensional system,

    ẋ_1 = x_2
    ẋ_2 = −(1 + u)x_1 − 2b x_2.

The projection of the system to S^1 via s = x/‖x‖ reads

    ṡ_1 = s_2 (1 + u s_1^2 + 2b s_1 s_2)
    ṡ_2 = −(1 + u)s_1 − 2b s_2 + s_2^2 (u s_1 + 2b s_2).

For the one-dimensional sphere we may use the parametrization by polar coordinates Ψ(φ) = (cos φ, sin φ), where

    Ψ^{−1}(s) = arcsin(s_2) for φ ∈ [−π/2, π/2],
    Ψ^{−1}(s) = π − arcsin(s_2) for φ ∈ [π/2, 3π/2].

In polar coordinates the cost function reads

    g(φ, u) = −sin φ (u cos φ + 2b sin φ)

and we can choose Ω = (−ε, π + ε) to cover the whole projective space (identified with one half of the sphere).

For ρ = 0.5 the system is asymptotically stable for all control functions, since the maximal Lyapunov exponent is negative. But since the Lyapunov exponents corresponding to D_1 are much smaller than those of the control set D_2, it can be expected that the optimal trajectories with initial value inside D_1 tend to zero much faster.

All trajectories in this section were computed using the extrapolation method for ordinary differential equations by Stoer and Bulirsch [19, §7.2.14]. The parameter a from Definition 4.7 was chosen as a = h (see Remark 4.9).
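As a consistency check of these polar-coordinate formulas, the Lyapunov exponent for a constant control u can be computed by integrating the angle dynamics φ̇ = −(1 + u cos²φ + 2b sin φ cos φ) and averaging g along the trajectory. A short sketch (step size and time horizon are arbitrary choices):

```python
import math

def f(phi, u, b):
    # polar-coordinate dynamics of the projected oscillator on S^1
    return -(1.0 + u * math.cos(phi)**2 + 2.0 * b * math.sin(phi) * math.cos(phi))

def g(phi, u, b):
    # cost function g(phi, u) = -sin(phi) (u cos(phi) + 2b sin(phi))
    return -math.sin(phi) * (u * math.cos(phi) + 2.0 * b * math.sin(phi))

def lyapunov_exponent(u, b, T=200.0, dt=0.005, phi0=0.0):
    # average g along the projected trajectory (RK4 for phi, Riemann sum for g)
    phi, acc = phi0, 0.0
    for _ in range(int(T / dt)):
        acc += dt * g(phi, u, b)
        k1 = f(phi, u, b)
        k2 = f(phi + 0.5 * dt * k1, u, b)
        k3 = f(phi + 0.5 * dt * k2, u, b)
        k4 = f(phi + dt * k3, u, b)
        phi += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return acc / T
```

For b = 0.5 and constant u = 0 the eigenvalues of the oscillator form a complex pair with real part −b, so the computed average should be close to −0.5.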
The second example is the three-dimensional linear oscillator given by

    d³y/dt³ + a ÿ + b ẏ + (c + u)y = 0    (5.3)

or, written as a three-dimensional system,

    ẋ_1 = x_2
    ẋ_2 = x_3
    ẋ_3 = −(c + u)x_1 − b x_2 − a x_3    (5.4)

with a, b, c ∈ R and u ∈ U. The projected system on S^2 reads

    ṡ = A(u)s − (sᵀ A(u) s) s,

where A(u) denotes the system matrix of (5.4). For S^2 the parametrization by spherical coordinates is not suitable, since this parametrization maps two opposite points to a line and hence is not invertible on one half of the sphere. Thus the stereographic projection Ψ = (Ψ_1, Ψ_2, Ψ_3) is used instead.
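The projection ṡ = A(u)s − (sᵀA(u)s)s can be checked numerically by integrating both the linear system and its projection and comparing x/‖x‖ with s. A minimal sketch, using the illustrative parameters a = 1, b = 0, c = 0.5 (these match the first parameter set considered below; step size and initial value are arbitrary):

```python
import numpy as np

def A(u, a=1.0, b=0.0, c=0.5):
    # companion matrix of the third-order oscillator (parameters illustrative)
    return np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0],
                     [-(c + u), -b, -a]])

def h(s, u):
    # projected vector field on S^2: h(s, u) = A(u)s - (s^T A(u) s) s
    As = A(u) @ s
    return As - (s @ As) * s

def rk4(fun, x, dt):
    # one classical Runge-Kutta step
    k1 = fun(x)
    k2 = fun(x + 0.5 * dt * k1)
    k3 = fun(x + 0.5 * dt * k2)
    k4 = fun(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
```

Integrating both systems from matching initial conditions, s stays on the sphere and tracks the normalized linear trajectory.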
The set Ω was chosen as Ω = (−1 − ε, 1 + ε) × (−1 − ε, 1 + ε) to cover the whole of P^2 (identified with the upper half of S^2). All values given have been checked according to Remark 3.2 (ii); in all cases it was possible to find trajectories that realize the values as Lyapunov exponents. Hence the calculated values at least give an approximation of the minimal Lyapunov exponents over the interior of the control sets. To apply the results of Remark 3.2 (iii), i.e. to make sure that this is indeed the Lyapunov spectrum, we have to check the ρ − ρ′-inner pair condition described in Section 3. Unfortunately, it is not yet known how to check this condition analytically. However, the program CS2DIM by Gerhard Häckl [15] has been used to calculate reachable sets of the system numerically for different ρ-parameters. Since these turned out to be strictly increasing in this example, there is strong evidence that the condition is fulfilled.
The first set of parameters considered for this system was a = 1, b = 0, c = 0.5 and U = {−0.3, −0.25, . . . , 0.25, 0.3}. Figure 5.3 shows the two control sets of this system; the control sets were again computed using the program CS2DIM [15]. The numerical parameters used for this example are k = 0.003 around D_1, k = 0.09 elsewhere, h = 0.05 and δ = 0.01. The discounted value function of this system around D_1 is shown in Figure 5.4. The calculated minimal Lyapunov exponent over D_1 is −1.25, the maximal exponent is −1.15. The calculated minimal and maximal exponents over D_2 are 0.019 and 0.24, and the value function is constant outside D_1. Figure 5.5 shows two trajectories of the projected system with initial values inside D_1. Table 5.5 (stabilized trajectory for system (5.4) with a = 1, b = 0, c = 0.5) shows the values of one corresponding trajectory in R^3.

The second set of parameters considered for this system is a = −1, b = −3, c = 0.5 and U = {−1.0, −0.9, . . . , 0.9, 1.0}. Figure 5.6 shows the three control sets of the projected system, the domain of attraction of D_2 (denoted by A^−(D_2)) and the domain of attraction of D_2 for the time-reversed system (denoted by A^+(D_2)).
Here the numerical parameters were k = 0.002 around D_1, k = 0.045 elsewhere, h = 0.05 and δ = 0.01. Figure 5.7 shows the discounted optimal value function around D_1. Table 5.6 shows the corresponding trajectory (x_1, x_2, x_3) in R^3 and another trajectory (y_1, y_2, y_3) in R^3 with projected initial value in D_1. The latter trajectory tends to 0 much faster, which is exactly what one would expect, since the minimal Lyapunov exponent inside D_1 is much smaller.