STABILIZATION OF CONTROLLED DIFFUSIONS AND ZUBOV’S METHOD

We consider a controlled stochastic system which is exponentially stabilizable in probability near an attractor. Our aim is to characterize the set of points which can be driven by a suitable control to the attractor with either positive probability or with probability one. This will be done by associating to the stochastic system a suitable control problem and the corresponding Zubov equation. We then show that this approach can be used as a basis for numerical computations of these sets.


Introduction
Lyapunov's theory supplies necessary and sufficient conditions for the stability of attractors of dynamical systems. This theory, originally developed for deterministic systems, has been extended to stochastic ones using different notions of stability ([14], [16], [18]).
In recent years, increasing interest has been devoted to stability properties of stochastic processes with control inputs. In this case, the basic problem is the existence of an (open-loop) control law steering the system to a desired target. This property, called controllability in the deterministic setting, is called stabilizability in the stochastic context. Suitable Lyapunov characterizations of this property have been obtained and a corresponding theory of control Lyapunov functions (CLFs) has been developed, see e.g. [9], [1], [6]. One of the main features and advantages of Lyapunov theory is that stability may be checked in terms of infinitesimal decrease conditions along a suitable positive definite function. Note, however, that even for uncontrolled diffusions, the analogue of the converse Lyapunov theorem by Kurzweil and Massera only yields a continuous Lyapunov function (this is a result by Kushner, see [17]), and it is not known whether smooth Lyapunov functions exist in general, unless the diffusion is strictly non-degenerate away from the equilibrium (see [14], [16]). Hence it is not reasonable to assume too much regularity of the Lyapunov function, and it is therefore important to reformulate infinitesimal decrease conditions in an appropriate weak sense ([1], [6]). A central problem in this context is the construction of a Lyapunov function in the domain of attraction of the equilibrium. In this paper we address this problem for controlled stochastic systems. Since a CLF is not explicitly known unless the system has a particular structure, it is important to provide constructive techniques yielding such a function. In general, the approaches available in the literature, see e.g. [10], rely heavily on regularity properties of the CLF. In this paper we present methods based on the theory of viscosity solutions of suitable PDEs, which will in general provide nonsmooth CLFs.
In the deterministic case, a characterization of Lyapunov functions as solutions of a first order PDE goes back to the work of Zubov [22]. Recently this idea has been reinterpreted in the framework of Crandall-Lions viscosity solution theory (see [2]). In [5], [3], a Lyapunov function for an (uncontrolled) system that is locally almost surely (a.s.) exponentially stable near an attractor has been characterized as the unique viscosity solution of a second order PDE satisfying a Dirichlet boundary condition on the attractor. This equation is a generalization of the classical Zubov equation to the stochastic case. Moreover, it can be used as the basis for the numerical approximation of the Lyapunov function ([3]).
In this paper we improve the results in [5], [3] in two directions: (i) we consider a controlled stochastic differential equation and we obtain a characterization of a CLF as unique solution of a second order Hamilton-Jacobi-Bellman equation; (ii) we assume a stability condition in probability near the attractor, assuring that the trajectories of the stochastic systems are exponentially stable with a probability decreasing to zero. This condition is weaker than a.s. exponential stability which is assumed in [5]. The latter implies that almost surely for each fixed sample path the system is exponentially stable in the usual sense.
In particular, we characterize two different types of stabilizability domains: the set $D$ of points stabilizable to the attractor with positive probability and the set $D_p$ of points stabilizable with a given probability $p$, for any $p \in [0, 1]$ (which includes as a special case the set of points which can be steered to the attractor a.s.). A characterization of $D$ is obtained by introducing a suitable optimal control problem associated with the stochastic system. The corresponding value function is a CLF on $D$ for the stochastic system, and $D$ may be characterized as a suitable sublevel set of this CLF.
To characterize the set $D_p$ we introduce a discount factor $\delta$ in the optimal control problem and the corresponding value function $v_\delta$. Passing to the limit for $\delta \to 0$, the value functions $v_\delta$ converge monotonically to a lower semicontinuous function $v_0$, and $D_p$ is the set of points where $v_0$ has the value $1 - p$. Moreover, it is shown that the previous characterizations can be effectively used to construct approximate CLFs on the corresponding domain of attraction.
The paper is organized as follows. Section 2 introduces the stochastic control problem and recalls the definitions and the basic properties of the controls which are used. Sections 3 and 4 are devoted, respectively, to the characterization of the sets $D$ and $D_p$. Finally, Section 5 discusses a numerical example based on a financial model.

Assumptions and Preliminaries
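We consider the autonomous stochastic differential equation in $\mathbb{R}^N$

$$dX_t = b(X_t, \alpha_t)\,dt + \sigma(X_t, \alpha_t)\,dW_t, \qquad X_0 = x, \qquad (2.1)$$

where $W$ is a standard $M$-dimensional Brownian motion and $\alpha_t$, the control applied to the system, is a progressively measurable process taking values in a compact set $A \subset \mathbb{R}^L$. We assume that $b$, $\sigma$ are continuous functions defined on $\mathbb{R}^N \times A$, taking values, respectively, in $\mathbb{R}^N$ and in the space of $N \times M$ matrices, and satisfying, for some constant $C_0 > 0$, for all $x, y \in \mathbb{R}^N$ and all $\alpha \in A$,

$$|b(x,\alpha) - b(y,\alpha)| + \|\sigma(x,\alpha) - \sigma(y,\alpha)\| \le C_0\,|x - y|, \qquad (2.2)$$

$$|b(x,\alpha)| + \|\sigma(x,\alpha)\| \le C_0\,(1 + |x|). \qquad (2.3)$$

These assumptions guarantee the existence and uniqueness of a strong solution to (2.1) for any $t > 0$. We denote by $\mathcal{A}$ the set of the admissible control laws $\alpha_t$, see Remark 2.1. Solutions corresponding to an initial value $x$ and a control law $\alpha \in \mathcal{A}$ will be denoted by $X_t(x, \alpha)$ (or $X_t$ if there is no ambiguity). Throughout we denote the distance of $x \in \mathbb{R}^N$ to a set $M \subset \mathbb{R}^N$ by

$$d(x, M) := \inf_{y \in M} |x - y|.$$

For system (2.1) we study the problem of characterizing the domains of stabilizability of a viable, compact set $K$ which is locally exponentially stabilizable in probability for (2.1). A set $K$ is called viable if for any $x \in K$ there exists a control $\alpha \in \mathcal{A}$ such that $X_t(x, \alpha) \in K$ a.s. for any $t > 0$. The property of local exponential stabilizability in probability is defined by the requirement that there exist positive constants $r$, $\lambda$ such that for every $\varepsilon > 0$ there exists a $C > 0$ such that for every $x \in K_r := \{x \in \mathbb{R}^N : d(x, K) \le r\}$ there is a control $\alpha \in \mathcal{A}$ for which

$$P\Big(\sup_{t \ge 0} d(X_t(x, \alpha), K)\, e^{\lambda t} \le C\Big) \ge 1 - \varepsilon. \qquad (2.4)$$

Our aim is to study properties and to find characterizations of the following sets, which describe the stabilizability properties of the process:

$$D := \Big\{x \in \mathbb{R}^N : \text{there exists } \alpha \in \mathcal{A} \text{ such that } P\Big(\lim_{t \to +\infty} d(X_t(x, \alpha), K) = 0\Big) > 0\Big\}$$

and, for $p \in [0, 1]$,

$$D_p := \Big\{x \in \mathbb{R}^N : \sup_{\alpha \in \mathcal{A}} P\Big(\lim_{t \to +\infty} d(X_t(x, \alpha), K) = 0\Big) = p\Big\}. \qquad (2.5)$$

Remark 2.1. We assume that the class of admissible control laws $\mathcal{A}$ satisfies the properties of stability under concatenation and stability under measurable selection.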
The condition for stability under concatenation is the following. For a stopping time T , we define the T -concatenation of two control processes by setting else. Stability under concatenation holds if α 1 ⊕ T α 2 is an admissible control for all admissible controls α 1 , α 2 and all stopping times T . For the more technical condition of stability under measurable selection we require that for all stopping times T and all maps Φ : Ω → A, measurable with respect to the corresponding σ-algebras, there exists a ν ∈ A such that In the following we need both these properties for our class of controls, to ensure a controllability property. We assume for every x in K r the existence of a control α x such that the stability property (2.4) holds. Thus when we steer the process to the set K r we want to switch to the process α x . This can be done as follows. Given an initial condition x 0 and a process α we define the stopping time Then the set V := {ω ∈ Ω | T (ω) < ∞} is F(T ) measurable and the map is measurable from (Ω, F(T )) to (A, B A ). So, if stability under measurable selection holds, there exists ν ∈ A such that Then, if stability under concatenation holds, the T -concatenated control α ⊕ T ν, for any α ∈ A, is an adapted admissible process. The reader should keep in mind the preceding construction in the following.
Observe that the properties of stability under concatenation and stability under measurable selection also guarantee the validity of the Dynamic Programming Principle, see (3.7), under standard regularity assumptions on the coefficients of the problem (see [15] and [20]).
An explicit construction of a class of controls laws satisfying the properties of stability under concatenation and stability under measurable selection can be performed under the convexity condition Then, fixed a priori a probability space (Ω, F, F t , P) with a right continuous increasing filtration, the class of the progressively measurable processes with values in the compact set A satisfies the desired properties. If this convexity condition is not satisfied we need to consider relaxed controls and we refer to [7], [15] and [20] for the construction of a canonical probability space associated to the control problem and the corresponding class of admissible controls satisfying the previous properties.
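The switching construction described in Remark 2.1 can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it uses an Euler-Maruyama discretization of a hypothetical scalar system with K = {0}, follows one control until the first time the state enters K_r, and from then on uses a second control that plays the role of the stabilizing control α_x. The function names, the step size `h`, the finite horizon and the toy coefficients are all assumptions made for illustration.

```python
import math
import random


def simulate_switching(x0, drift, diffusion, explore, stabilize, r,
                       h=1e-3, horizon=5.0, rng=None):
    """Euler-Maruyama path of dX = b(X,a)dt + s(X,a)dW in 1D with K = {0}.

    The control `explore` is used until the first step at which |X| <= r
    (a discrete stand-in for the hitting time T of K_r); afterwards the
    control `stabilize` is used. Returns (path, hit_index), where hit_index
    is None if K_r was not reached before the horizon.
    """
    rng = rng or random.Random()
    x, path, hit_index = x0, [x0], None
    steps = int(round(horizon / h))
    for k in range(steps):
        if hit_index is None and abs(x) <= r:
            hit_index = k  # first grid time at which the state is in K_r
        a = explore(x) if hit_index is None else stabilize(x)
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over one step
        x = x + drift(x, a) * h + diffusion(x, a) * dw
        path.append(x)
    return path, hit_index


def hit_frequency(x0, n_paths, seed=0, **kw):
    """Fraction of simulated paths that reach K_r before the horizon."""
    rng = random.Random(seed)
    hits = sum(simulate_switching(x0, rng=rng, **kw)[1] is not None
               for _ in range(n_paths))
    return hits / n_paths


if __name__ == "__main__":
    # Hypothetical toy system: dX = a X dt + 0.2 X dW with a in [-1, 1].
    toy = dict(drift=lambda x, a: a * x, diffusion=lambda x, a: 0.2 * x,
               explore=lambda x: -0.5, stabilize=lambda x: -1.0, r=0.5)
    print(hit_frequency(1.0, 20, **toy))
```

The empirical frequency returned by `hit_frequency` only approximates the probability of reaching K_r within the chosen horizon for the discretized system; it is not a substitute for the measurable-selection argument above.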
Indeed for every ε > 0, by (2.4) we find α and C for which Remark 2.3. In [3,5], the problem of stability is studied (i.e. no control in (2.1)) and the equilibrium is assumed to be almost surely locally exponentially stable. This is to say that there exist positive constants r, λ and a finite random variable β such that for any x ∈ K r , we have This assumption implies local exponential stability in probability: for every ε > 0 it is sufficient to choose C such that

The domain of null controllability
In this section, we study the properties of the set D, i.e. the set of points which can be steered to the set K with positive probability. Throughout this section all assumptions discussed in Section 2 are assumed to hold. In the following the stopping time τ (x, α), defined as the hitting-time of K r , will play a vital role. It is defined by ii) D is open, connected and contains K r as a proper subset.
Proof. i) It is easy to show that for any α So if x ∈ D then there exists α such that P (τ (x, α) < +∞) > 0.
ii) In order to prove that $D$ is open, observe that if $x \in D$, then there exist $\alpha$ and $T > 0$ such that $P(d(X_T(x, \alpha), K) \le r/2) > 0$. For $\delta$ sufficiently small and $y \in B(x, \delta)$, this implies $P(d(X_T(y, \alpha), K) \le r) > 0$ and therefore Since $D$ is open and $K_r$ is closed we obtain that $K_r$ is a proper subset of $D$. Finally, $D$ is connected since for any $x \in D$ there exist at least one control $\alpha$ and a continuous path $X_t(x, \alpha)$ connecting $x$ to $K_r$.
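To construct a CLF for the stochastic system (2.1), we introduce $v : \mathbb{R}^N \to \mathbb{R}_+$ as the value function of an optimal control problem. Define

$$v(x) := \inf_{\alpha \in \mathcal{A}} E\left[\int_0^{+\infty} g(X_t(x,\alpha), \alpha_t)\, e^{-\int_0^t g(X_s(x,\alpha), \alpha_s)\,ds}\,dt\right] = \inf_{\alpha \in \mathcal{A}} E\left[1 - e^{-\int_0^{+\infty} g(X_t(x,\alpha), \alpha_t)\,dt}\right], \qquad (3.4)$$

where $g : \mathbb{R}^N \times A \to \mathbb{R}$ is a nonnegative bounded function such that $g(x, a) = 0$ if and only if $x \in K$. Furthermore, we assume that there exist $L_g, g_0 > 0$ such that Note that these assumptions imply $\inf_{d(y,K) \ge \delta,\, a \in A} g(y, a) > 0$ for each $\delta > 0$.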
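For a fixed control, the expectation of one minus the exponential of the negative integrated running cost can be estimated by simulation, and any such estimate approximates an upper bound for the infimum defining v(x). The sketch below assumes this form of the cost functional (it agrees with (4.1) for δ = 1) and a scalar state; the name `zubov_cost`, the Euler-Maruyama step `h`, the truncation of the time integral at `horizon` and the toy coefficients are illustrative assumptions, not part of the paper.

```python
import math
import random


def zubov_cost(x0, drift, diffusion, control, g, h=1e-3, horizon=10.0,
               n_paths=1, seed=0):
    """Monte Carlo estimate of E[1 - exp(-int_0^horizon g(X_t, a_t) dt)].

    The state follows an Euler-Maruyama discretization of
    dX = drift(X, a) dt + diffusion(X, a) dW with feedback a = control(X).
    The integral is approximated by a left Riemann sum.
    """
    rng = random.Random(seed)
    steps = int(round(horizon / h))
    total = 0.0
    for _path in range(n_paths):
        x, integral = x0, 0.0
        for _step in range(steps):
            a = control(x)
            integral += g(x, a) * h
            dw = rng.gauss(0.0, math.sqrt(h))
            x += drift(x, a) * h + diffusion(x, a) * dw
        total += 1.0 - math.exp(-integral)
    return total / n_paths


if __name__ == "__main__":
    # Hypothetical example: dX = -X dt + 0.3 X dW, g(x) = min(x^2, 1), K = {0}.
    print(zubov_cost(1.0,
                     drift=lambda x, a: -x,
                     diffusion=lambda x, a: 0.3 * x,
                     control=lambda x: 0.0,
                     g=lambda x, a: min(x * x, 1.0),
                     n_paths=50))
```

In the noise-free case with drift -x and g(x) = x^2 the integral equals x0^2/2, so the estimate for x0 = 1 should be close to 1 - e^{-1/2}, which gives a quick sanity check of the discretization.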
We recall the Dynamic Programming Principle for the value function v.
The proof of this principle relies on standard arguments from the theory of optimal control and on the stability properties of the class of admissible controls A, see Remark 2.1. For details we refer to [20] and [8].
We note that (3.8) is assumed, in order to avoid a situation in which V decreases along solutions in R N \ D, although stabilization to K is not possible. Similarly, the conditions (3.9),(3.10) ensure that decrease in V is only possible within the set D by approaching K (in terms of the sublevel sets of V , which are compact in D by (3.10)).
The notion of Lyapunov function for uncontrolled stochastic systems was introduced by Has'minskii in [14] and Kushner in [16]. They considered twice continuously differentiable Lyapunov functions for which, by the Dynkin formula, condition (3.11) is equivalent to the infinitesimal decrease condition: In [9] Florchinger used twice continuously differentiable control Lyapunov functions in the context of feedback stabilization. Recently, in [6] Cesaroni considered only continuous control Lyapunov functions for stochastic systems. In this case condition (3.11) is equivalent to v being a viscosity supersolution of denotes the generator of the Markov process associated to (2.1).
The next theorem provides a characterization of the set $D$ by means of $v$.
Proof. Note first that $v(x) = 0$ for $x \in K$, since by assumption $K$ is viable and $g(x, a) = 0$ for $x \in K$. The property $v(x) > 0$ for $x \notin K$ and conditions (3.8), (3.9) all follow once we show the characterization (3.12).
We now show that v is proper on D and that it is a continuous function. Towards the first end we are going to show that 14) The continuity of v is then shown by proving v is continuous in D , (3.15) by which the continuity of v on R N follows using that v ≡ 1 in R N \ D.
To prove (3.13), we argue by contradiction. Assume that there exists a sequence of points x n ∈ D converging to x 0 ∈ ∂D such that lim n→∞ v(x n ) ≤ 1 − η for some η > 0. Then for any x n we can find a control α n such that E e − τ (xn,αn) 0 g 0 dt ≥ E e − +∞ 0 g(Xt(xn,αn),αn,t)dt ≥ η/2 , and therefore there exist ε > 0 and T such that P(τ (x n , α n ) ≤ T ) ≥ ε. For n sufficiently large, we have for any α ∈ A. This is possible because of (2.2), (2.3), see e.g. [19, p. 49]. For such a fixed n, arguing as in (3.3) with x n and x 0 in place of x and, respectively, y, we obtain that This is a contradiction to x 0 ∈ R N \ D.
To prove (3.14) note that because of linear growth condition (2.3) for every x / ∈ K r there is a time T (x) such that P(τ (x, α) < T (x)) = 0 for all controls α. Furthermore, As T (x n ) → ∞, the right hand side tends to 1 as n → ∞. This shows the assertion.
We now prove claim (3.15). First of all, we prove that v is continuous at K (recall that v |K ≡ 0). Fix x 0 ∈ K and ε > 0. By (2.6) there exists a C > 0 such that for all x ∈ K r there is an α x such that so that P(B x ) ≤ ε for all x ∈ K r . Fix T > 0 in such a way that Ce −λT ≤ ε and let δ > 0 be such that This shows continuity of v in x 0 . Now, let ε > 0 and x ∈ D \ K. Fix δ 0 > 0 such that, if d(y, K) ≤ 2δ 0 , then v(y) ≤ ε and define g * := inf d(y,K)≥δ 0 /2,a∈A g(y, a) > 0. Finally, choose T > 0 such that Let δ > 0 be such that if y ∈ B(x, δ), α ∈ A, then P(E α ) ≤ ε holds for Let α be an ε-optimal control for x and define the set For all paths we define by η = η(ω) the minimal time for which d(X η (x, α), K) ≤ δ 0 holds with the convention η(ω) = T for paths in B x . Then, recalling the Dynamic Programming Principle (3.7) we obtain for To show a bound for v(x) − v(y) for y ∈ B(x, δ), note that we can argue in the same way, if we choose an ε-optimal control α * for y and define the set B y analogously to (3.16) considering X t (y, α * )). Then similar estimates to the above yield v(x) − v(y) ≤ (4 + L g )ε. This shows (3.15). Finally, using the Dynamic Programming Principle (3.7), if α is an εoptimal control for x ∈ D \ K we have and as ε > 0 is arbitrary we obtain the decrease condition (3.11).
Interestingly, the function $v$ can be characterized as the unique viscosity solution of a Hamilton-Jacobi-Bellman equation (see [8] and [21] for the definition of viscosity solutions).
Proof. In the proof we use the notations $t \wedge s := \min\{t, s\}$ and $t \vee s := \max\{t, s\}$ and the standard sub- and superoptimality principles for viscosity sub- and supersolutions, see [8] or [6]. First we show that if $u$ is an upper semicontinuous bounded subsolution of (3.17) in $\mathbb{R}^N \setminus K$ with $u(x) = 0$ for $x \in K$, then $u(x) \le v(x)$. Since $u$ is upper semicontinuous, for every $\varepsilon > 0$ there exists a $\delta > 0$ (without loss of generality, $\delta \le r$) such that $u(x) \le \varepsilon$ for every $x$ with $d(x, K) \le \delta$. Denote $g_* := \inf_{d(y,K) \ge \delta,\, a \in A} g(y, a) > 0$ and let $u^* > 0$ be an upper bound for $u$ on $\mathbb{R}^N$. Now fix $x \in \mathbb{R}^N$ and $\varepsilon > 0$ and choose $\delta$ as prescribed above. We denote $\tau_\delta(x, \alpha) := \inf\{t \mid d(X_t(x, \alpha), K) \le \delta\}$ and we choose a control $\bar\alpha$ such that For each $t > 0$ we define the set Then by the suboptimality principle we have As $\varepsilon > 0$ was arbitrary, this shows the claim. Now we prove that if $w$ is a lower semicontinuous bounded supersolution of (3.17) in $\mathbb{R}^N \setminus K$ with $w(x) = 0$ for $x \in K$, then $w(x) \ge v(x)$. Fix $\varepsilon > 0$. Then by lower semicontinuity of $w$ and the continuity of $v$ there exists a $\delta > 0$ such that $w(x) \ge -\varepsilon$ and $v(x) \le \varepsilon$, and thus $w(x) - v(x) \ge -2\varepsilon$, for every $x$ with $d(x, K) \le \delta$.
Now fix $x \in \mathbb{R}^N$ and define $\tau_\delta(x, \alpha)$, $g_*$ and $B_t$ as in the proof for $u$ above. Then for any control $\bar\alpha$ and any $t \ge 0$ the Dynamic Programming Principle Let $w_* \le 0$ be a lower bound for $w$ and recall that $v$ is bounded from above by 1. We choose a control $\bar\alpha$ such that, by the superoptimality principle, This yields the assertion because $\varepsilon > 0$ was arbitrary.

Null-controllability with a given probability
In this section we are interested in the sets $D_p$, see (2.5), of the points which are stabilizable to $K$ with a given probability $p$. In order to describe these sets we consider a family of Zubov functions depending on a positive parameter $\delta$. These functions are defined by

$$v_\delta(x) = \inf_{\alpha \in \mathcal{A}} E\left[\int_0^{+\infty} \delta\, g(X_t(x, \alpha), \alpha_t)\, e^{-\int_0^t \delta\, g(X_s(x, \alpha), \alpha_s)\,ds}\,dt\right], \qquad (4.1)$$

where $g$ is a function satisfying all conditions that we imposed for (3.4). Since $\delta > 0$ is only a scaling factor, $v_\delta$ satisfies the same properties as $v$ defined in (3.4). In particular, the characterization provided by Theorem 3.4 holds with $v_\delta$ in place of $v$, for any $\delta > 0$. Moreover $v_\delta$ is the unique bounded continuous viscosity solution of the equation such that $v_\delta(x) = 0$ for $x \in K$. The following result shows how the functions $v_\delta$ may be used to characterize the sets $D_p$.
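The remark that δ acts only as a scaling factor can be checked pathwise. The following short computation is written for a fixed trajectory and control; the abbreviation G(t) is introduced here for convenience and is not notation from the paper. It shows that the integrand in (4.1) is an exact derivative, so the cost is one minus an exponential of the integrated running cost.

```latex
\begin{aligned}
G(t) &:= \int_0^t g(X_s(x,\alpha),\alpha_s)\,ds, \\
\frac{d}{dt}\Bigl(-e^{-\delta G(t)}\Bigr)
  &= \delta\, g(X_t(x,\alpha),\alpha_t)\, e^{-\delta G(t)}, \\
\int_0^{+\infty} \delta\, g(X_t(x,\alpha),\alpha_t)\, e^{-\delta G(t)}\,dt
  &= 1 - e^{-\delta G(+\infty)} \in [0,1],
\end{aligned}
```

with the convention that the exponential vanishes when G(+∞) = +∞. In particular the cost of a path equals 1 exactly when the running cost is not integrable along it, and replacing g by δg does not affect any of the qualitative assumptions on g.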
Remark 4.2. A property corresponding to (4.3) was proved in [3] for the uncontrolled process, under the stronger assumptions of almost sure exponential stability (see (2.6)) of K and a technical condition on E[d(X t , K) q ] for some q ∈ (0, 1] (see (11) in [3]).
To prove the theorem, we need two preliminary lemmas.
To obtain the converse inequality in (4.4), choose α ∈ A, T sufficiently large such that Now fix δ > 0 small enough so that, for t < T , we have e −δt ≥ 1 − ε. Then Since ε > 0 is arbitrary, it follows that The second lemma is an estimate of v δ in K r . Proof. By (2.4), given ε > 0, we can find C > 0 such that for any x ∈ K r there exists an α ∈ A such that P(B) := P(sup t d(X t (x, α), K)e λt ≥ C) ≤ ε.
Select δ in such a way that Cδ < ε. Hence where C is independent of ε. This shows the assertion.
Proof of Theorem 4.1. Using a slight extension of (3.2) in the proof of Proposition 3.1 and Remark 2.2 we see that Thus the statement of the theorem follows immediately from Lemma 4.3 if we prove that Note furthermore that, if $\alpha \in \mathcal{A}$ is $\varepsilon$-optimal, we have and hence lim inf As $\varepsilon > 0$ is arbitrary, this shows the claim. To prove the converse inequality in (4.5) for fixed $\varepsilon > 0$, we choose $T > 0$ large enough such that $e^{-\delta M_g T} \le \varepsilon$, where $M_g$ is an upper bound for $g$. Now $v_\delta$ is continuous, so that we have by the Dynamic Programming Principle that The second term on the right hand side of (4.6) can be bounded from above in the following way By the choice of $T$, the third term satisfies Inserting (4.7) and (4.8) in (4.6), we obtain As $\varepsilon > 0$ is arbitrary, by Lemma 4.4 we get
Remark 4.5. Note that the family $v_\delta$ is monotone in $\delta$: by (4.1) it decreases as $\delta$ decreases to 0. By stability properties of viscosity solutions, this implies that $v_\delta$ converges to a function $v_0$ whose lower semicontinuous envelope (see [8]) is a supersolution of the Hamilton-Jacobi-Bellman equation with $v_0(x) = 0$ on $K$. The previous equation is related to an ergodic control problem for (2.1). In this respect the Zubov equation with positive discount factor can be seen as a regularization of the limit ergodic control problem which gives the appropriate characterization of the sets $D_p$.
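The role of the limit δ → 0 can be seen in a deliberately simplified caricature, which is an assumption of this sketch and not a model from the paper: suppose that under a single fixed control a path converges to K with probability p, accumulating a finite integrated cost G, and otherwise accumulates infinite cost. Then the expected cost is 1 - p·exp(-δG), which decreases to 1 - p as δ decreases to 0.

```python
import math


def v_delta_toy(delta, p, G):
    """1 - E[exp(-delta * I)] when I = G with probability p and I = +inf otherwise."""
    return 1.0 - p * math.exp(-delta * G)


# Hypothetical numbers: stabilization probability 0.7, integrated cost 3.0.
deltas = [1.0, 1e-1, 1e-2, 1e-4]
values = [v_delta_toy(d, p=0.7, G=3.0) for d in deltas]

if __name__ == "__main__":
    for d, val in zip(deltas, values):
        print(f"delta = {d:g}: v_delta = {val:.5f}")
```

The values decrease monotonically towards 1 - p = 0.3, which is the level of the limit function associated with stabilization probability p in this toy setting.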

A numerical example
We illustrate our results by a numerical example. The example is a stochastic version of a deterministic creditworthiness model discussed in [12,13]. Consider In this model $k = x_1$ is the capital stock of an economic agent, $B = x_2$ is the debt, $j = \alpha$ is the rate of investment, $H$ is the external finance premium and $f$ is the agent's net income. The goal of the economic agent is to steer the system to the set $\{x_2 \le 0\}$, i.e., to reduce the debt to 0, and the goal of the analysis is to determine the maximum level of debt $B^*(k_0)$ for which this is possible, depending on the initial capital $k_0$. In other words, we look for the domain of controllability of the set $K = \{(x_1, x_2) \in \mathbb{R}^2 \mid x_2 \le 0\}$. Observe that in practice the problem can be restricted to a finite interval $I$ for the $x_1$-value, so we consider it in a compact set. Under this restriction, conditions (2.2) and (2.3) on the drift and diffusion of the stochastic system hold.
In contrast to other formulations of such problems, here the credit cost, modelled by the external finance premium $H$, is not given by a constant interest rate, i.e., $H(x_2) = \theta x_2$, but by an interest rate which grows with the ratio of debt over capital stock, i.e., the larger $x_2/x_1$ becomes, the higher the interest rate becomes. The main goal of the study of the deterministic model in [12,13] is the analysis of the dependence of the maximum debt level $B^*(k_0)$ on the shape of $H$. Here we pick one particular form of $H$ and add a stochastic uncertainty in the equation for the capital stock $k = x_1$, i.e., the capital is now subject to random perturbations. Instead of a domain of controllability we will now get controllability probabilities, which can be characterized by our method and computed numerically by a suitable numerical scheme.
In order to show that the stochastic version of the model satisfies our exponential controllability assumption we extend $H$ to negative values of $x_2$ via $H(x_1, x_2) = \theta x_2$. Then it is easily seen that for the deterministic model controllability to $K$ becomes equivalent to controllability to $\tilde K = \{(x_1, x_2)^T \in \mathbb{R}^2 \mid x_2 \le -1/2\}$. Furthermore, also for the stochastic model any solution with initial value $(x_1, x_2)$ with $x_2 < -1/4$ will converge to $\tilde K$ for $\alpha \equiv 0$, even in finite time, which implies the assumed exponential controllability to the modified set $\tilde K$, even almost surely.
Using the parameters $\lambda = 0.15$, $\alpha_2 = 100$, $\alpha_1 = (\alpha_2 + 1)^2$, $\mu = 2$, $\theta = 0.1$, $a = 0.29$, $\nu = 1.1$, $\beta = 2$, $\gamma = 0.3$ and the cost function $g(x_1, x_2) = x_2^2$ we have numerically computed the solution $v_\delta$ of the corresponding Zubov equation (4.2) with $\delta = 10^{-4}$ using the scheme described in [3] extended to the controlled case (see [4] for more detailed information). For the numerical solution we used the time step $h = 0.05$ and an adaptive grid (see [11]) covering the domain $\Omega = [0, 2] \times [-1/2, 3]$. For the control values we used the set $A = [0, 0.25]$. As boundary conditions for the outflowing trajectories we used $v_\delta = 1$ on the upper boundary and $v_\delta = 0$ on the lower boundary; on the left boundary no trajectories can exit. On the right boundary we did not impose boundary conditions (since it does not seem reasonable to define this as either "inside" or "outside"). Instead we imposed a state constraint by projecting all trajectories exiting to the right back to $\Omega$. We should remark that both the upper and the right boundary condition affect the attraction probabilities, an effect which has to be taken into account in the interpretation of the numerical results. In order to improve the visibility, we have excluded the values for $x_1 = 0$ from these figures. Observe that for $x_1 = 0$ and $x_2 > 0$ it is impossible to control the system to $K$, hence we obtain $v_\delta \approx 1$ in this case. This can be seen in Figure 2, which shows the result including the values for $x_1 = 0$ for $\sigma = 0.5$. Note that due to the degeneracy of the solution, which is almost discontinuous for $x_1 = 0$ and $x_2 \ge 0$, the use of the adaptive space discretization method from [11] is crucial in order to obtain accurate results.

Figure 2. Numerically determined controllability probabilities for $\sigma = 0.5$, including the values for $x_1 = 0$.
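The scheme of [3] and its controlled extension in [4] are not reproduced here. As a rough illustration of the general idea only, the following sketch applies a semi-Lagrangian value iteration to a hypothetical one-dimensional problem on a uniform grid: the discrete equation is obtained from the Dynamic Programming Principle over one time step h, with the expectation replaced by an average over the two points reached with Brownian increments ±√h, linear interpolation in space, Dirichlet values 0 on K = {0} and 1 on the outer boundary, and projection of exiting points back onto the domain. The toy dynamics dX = X(X - 1 - a) dt + sX dW, the cost g(x) = x^2, the control set and all parameter values are assumptions made for this example.

```python
import math


def solve_zubov_1d(delta=1.0, h=0.05, n=101, x_max=2.0, s=0.1,
                   controls=(0.0, 0.5), tol=1e-9, max_sweeps=5000):
    """Gauss-Seidel value iteration for a toy discretized Zubov equation.

    Fixed-point equation at each interior node x:
        v(x) = min_a { 1 - d + d * 0.5 * [v(y + z) + v(y - z)] },
    with d = exp(-delta*h*x^2), y = x + h*x*(x - 1 - a), z = sqrt(h)*s*x.
    """
    dx = x_max / (n - 1)
    xs = [i * dx for i in range(n)]
    v = [0.0] * n
    v[-1] = 1.0  # Dirichlet data: v = 0 on K = {0}, v = 1 on the outer boundary

    def interp(y):
        # Linear interpolation of v at y after projecting y onto [0, x_max].
        y = min(max(y, 0.0), x_max)
        i = min(int(y / dx), n - 2)
        w = (y - xs[i]) / dx
        return (1.0 - w) * v[i] + w * v[i + 1]

    for _sweep in range(max_sweeps):
        change = 0.0
        for i in range(1, n - 1):
            x = xs[i]
            disc = math.exp(-delta * h * x * x)  # one-step discount, g(x) = x^2
            noise = math.sqrt(h) * s * x
            best = 1.0  # v is bounded by 1, so this also caps round-off
            for a in controls:
                y = x + h * x * (x - 1.0 - a)
                expect = 0.5 * (interp(y + noise) + interp(y - noise))
                best = min(best, 1.0 - disc + disc * expect)
            change = max(change, abs(best - v[i]))
            v[i] = best
        if change < tol:
            break
    return xs, v


if __name__ == "__main__":
    xs, v = solve_zubov_1d()
    for i in range(0, len(xs), 10):
        print(f"x = {xs[i]:.2f}  v = {v[i]:.4f}")
```

In this toy problem values near 0 indicate states from which K is reached with high probability at low cost, and values near 1 indicate states from which the outer boundary is reached first; a uniform grid is used purely for brevity and does not reflect the adaptive discretization of [11].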