On the control of time discretized dynamic contact problems

We consider optimal control problems with distributed control that involve a time-stepping formulation of dynamic one body contact problems as constraints. We link the continuous and the time-stepping formulation by a nonconforming finite element discretization and derive existence of optimal solutions and strong stationarity conditions. We use this information for a steepest descent type optimization scheme based on the resulting adjoint scheme and implement its numerical application.


Introduction
The following work concerns the optimal control of time discretized, dynamic contact problems involving a linearly viscoelastic body and a rigid obstacle in the absence of friction, where a linearized non-penetration condition is employed. This condition is also referred to as the Signorini condition, after first being introduced by Signorini [45] in the static one body context. Contact problems have a multitude of applications in mechanics, engineering and medicine and are well understood in the static context nowadays. They are closely related to obstacle problems and both are modeled through structurally similar, elliptic variational inequalities. Their theoretical properties can therefore oftentimes be examined simultaneously. There are, however, two main additional complications concerning contact problems. While obstacle problems are scalar problems that extend the Poisson problem, contact problems are vector-valued problems, extending linear elasticity. Furthermore, while the constraints in the obstacle problem are formulated on the reference domain, the non-penetration condition in contact problems is imposed on part of the reference boundary.
In [38], Lions and Stampacchia were the first to show the existence of a generally nonlinear but Lipschitz continuous solution operator to these variational inequalities (cf. also [25]) and from Mignot's work in [39], we know the solution operator to even be directionally differentiable in case the admissible set is polyhedric at the solution with respect to the contact forces. Existence of solutions and first order optimality conditions for optimal control problems of variational inequalities and complementarity constrained problems have been investigated, e.g., in [39,50], and in [52], optimization algorithms for optimal control of static contact problems are considered in a medical optimal design application.
Numerically, static contact problems can be solved, e.g., with optimal complexity by the multigrid techniques developed in [29,31] or alternatively by a combination of regularization and semi-smooth Newton [19,46,49].
In time-dependent contact problems there is a complex interplay of the different physical effects. This leads to many different models and analytic techniques, which all cover different aspects of the physical situation. There is a vast number of publications which consider different combinations and variants of aspects like viscosity, friction, adhesion, dynamic effects, damage, plasicity, thermal effects, and ways to model or regularize the contact condition. Since our interest is limited to a specific type of model, we do not discuss all available results in detail, but refer to the standard textbooks that give an overview over the field [13,18,23] and the references therein.
The type of model that we consider, namely fully dynamic, frictionless, viscoelastic contact without regularization, is conceptually relatively simple and has also been considered frequently from an algorithmic point of view. However, analytic results are harder to find. To the best of our knowledge only the authors in [1] investigate the existence of possibly non-unique solutions by studying weak convergence of a time-discretization scheme. However, some crucial steps in the proofs are implausible to us. Existence results for closely related models are [20] (viscoelasticty with singular memory) and, e.g., [10,21,35] (contact with friction). The counter examples in [42] demonstrate that dynamic contact problems are a very delicate matter.
However, as already mentioned, there are several time-stepping schemes for the model under consideration, often based on the Newmark scheme, which was introduced in [40], and that include reasonable adaptations for the contact constrained case. We restrict our examinations to time-stepping formulations of the dynamic contact problems. Oftentimes, solvers for static contact problems are employed for the step computation in those time-stepping schemes. For our purposes, the energy dissipative, contact implicit modification by Kane et al. [22] seems to be suited best. This method, which has been designed for the simulation of problems with complex configurations, relies on a non-smooth formulation and has good stability properties. Moreover, it is relatable to a temporal finite element discretization of the continuous problem. This allows for a consistent derivation of an adjoint scheme in the optimal control context. In the context of spatial finite element discretization, a couple of modifications have been proposed and analyzed [11,17,[26][27][28]32], overviews are given in [12,33]. Spatial adaptivity based on the Newmark scheme is considered in [7]. These variants mostly coincide with [22] in the spatially continuous case. They differ, however, in the way spurious oscillations, that result from spatial discretization of the problem, are treated. A recent alternative spatial discretization of the contact conditions has been proposed in [8]. An additional class of methods, which stress conservation principles and also cover nonlinear contact problems, is presented in [36,37].
Therefore, although information on a control-to-state operator on the continuous level is still not complete, reasonable time-stepping schemes are available, which motivates the consideration of optimal control of dynamic contact problems in a timediscretized, spatially continuous setting.
The aims of this work are thus the following: For the optimal control of dynamic contact problems, we first derive a discontinuous finite element formulation in time that yields the contact implicit Newmark scheme [22] with slight modifications concerning the treatment of external forces and the influence of the control. Then, we deduce basic results for the time-discretized optimal control problem, such as existence of optimal solutions and strong stationarity conditions. These results are used to obtain a backward-in-time scheme for the computation of an adjoint state, which is in turn the basis for a gradient-like method used for the numerical solution of the optimal control problem. In this method, the forward problem is solved by the aformentioned, modified contact implicit scheme using a monotone multigrid solver [29,31] for the computation of solutions in each timestep.
Structure Section 2 gives an introduction into the modeling of one body contact problems. A reformulation of the usual second order hyperbolic variational inequality is used to convert the fully continuous optimal control problem into a system of first order. The subsequent Sect. 3 demonstrates a finite element semi-discretization of the underlying functional spaces to the aforementioned first order system that results in a time-stepping formulation of the contact problem which closely resembles the contact implicit Newmark scheme for contact problems. Section 4 deals with the optimal control of the semi-discretized system and includes the existence of a Lipschitz continuous solution operator to the state equation, i.e., the time-stepping scheme. We can therefore show the existence of minimizers to the optimal control problem under standard assumptions. This operator is shown to be directionally differentiable in case the set of admissible states is polyhedric with respect to the solution and the residual to the variational inequality. Using this differentiability, we provide a rigorously derived system of first order necessary optimality conditions in the polyhedric case. The information on the adjoint state will be used in a preconditioned, steepest descent type optimization algorithm in Sect. 5, where the optimization algorithm is applied in a numerical example. Finally, Sect. 6 concludes the paper with an outlook on possible extensions of the presented framework.
For any set X ⊂ Rñ,ñ ∈ N and a Banach space Y , we write L 2 (X, Y ) for the Bochner-Lebesgue spaces of square integrable functions, C(X, Y ) for the space of continuous functions and H s (X, Y ) for Bochner-Sobolev spaces with s > 0. We always consider measurability with respect to the Lebesgue measure, which we denote ζ X = ζ X 1 × ζ X 2 for product spaces X = X 1 × X 2 ⊂ Rñ. A property is said to hold almost everywhere (a.e.) if it is violated only on sets of measure 0 and quasi everywhere (q.e.) if it is violated only on sets of capacity 0. With capacity theory being a complex topic itself, we refer the interested reader to an overview of Sobolev-capacity concepts and their respective behavior near the boundary of the underlying domain in [9]. In the setting at hand, it turns out that the sets of vanishing capacity, and therefore the associated notion of "q.e.", coincide on the relevant parts of the boundary, regardless of the underlying Sobolev space. One can therefore consider the natural capacity based on the space H 1 ( , R). Whenever X = , Y = R n , we omit the arguments to the Lebesgue and Sobolev spaces and abbreviate L 2 := L 2 ( , R n ), For the treatment of weak time differentiation, we use the space W ([0, T ]) := W 1,2 (I, L 2 , H 1 ), cf. [55,Sec. 23.6], and we denote the weak time derivative of y ∈ Furthermore, whenever we want to identify an H 1 -function with an (H 1 ) *functional, we always consider the mapping We have assumed the boundary to be C 0,1 -regular, so for and a measurable subset˜ ⊂ there exist linear and bounded trace operators associated with the boundary or boundary segment˜ ⊂ . For the sake of brevity, we will notationally suppress trace operators if the meaning is clear from context. On a boundary segment˜ , we consider the standard surface measure, denoted as ζ˜ .
We write the scalar product on a Hilbert space X as (·, ·) X : X × X → R and for a reflexive Banach space Y and its dual space Y * we denote the dual pairing by ·, · Y : Y * × Y → R. Thus, for a Hilbert space X (which is a reflexive Banach space) we distinguish between the scalar product (·, ·) X and the dual pairing ·, · X . Further, we define the polar cone and annihilator for subsets and denote the adjoint operator to an operator A : X → Y by A * : Y * → X * . For a convex subset K ⊂ Y , we write the radial cone and tangent cone to K at y ∈ K as cf. [50], where cl(·) is the norm closure of a set and we denote the interior of a set by int(·). For y ∈ K , r ∈ T K (y) • , we denote the critical cone to K with respect to (y, r ) as The components of a vector y ∈ R n or a vector valued function y : → R n are denoted by y (k) , k = 1 . . . n and we denote the positive and negative parts by y + and y − respectively.

Dynamic one body contact
This paper focuses on optimal control problems with a weak formulation of dynamic, viscoelastic contact problems as side constraints and the following section is dedicated to the presentation of the configuration of interest. The initial modeling of the physical setting is followed by a short overview of the reasoning behind the chosen approach and the limitations of linear contact conditions in general. The modeling will result in the well known second order, hyperbolic variational inequality that describes the contact problem. The second order form will be rewritten as a system of first order and considered in an optimal control problem on the continuous level.

Modeling and contact condition
We model a linearly viscoelastic body that comes into contact with a rigid obstacle on the time interval I ⊂ R and in the absence of friction. The undeformed reference state of the body is described by the domain and on it, we seek displacements y : I × → R n describing the deformation of the body when external forces act on parts of its boundary and interior.
To this end, we identify three disjoint parts D , N , C ⊂ on the boundary, with D ∪ N ∪ C = and dist( D , C ) > 0, see Fig. 1, where the body is clamped with Dirichlet conditions, can experience boundary forces by Neumann conditions or where we consider contact to potentially occur, respectively. We additionally assume C , D to be quasi closed and ∂ C to have vanishing capacity, see, eg., [2,Sec. 5.8.3]. The elastic and viscose properties of the material are described by the two bounded, coercive bilinear forms Fig. 1 Reference configurations of one body contact problems a, b : which are assumed to be of the form for sufficiently smooth tensors E and V . More details can be found, e.g., in [13,23]. For the time dependent problem, we define As usual, homogeneous Dirichlet boundary conditions are incorporated into the state space and we denote accordingly, with a.e. meaning the surface measure sense. Furthermore, the external forces are composed of boundary and volume forces and are modeled by f ext ∈ L 2 (I, We choose the state space for possible displacements as and the rigid obstacle will be modeled by a set of admissible states. The obstacle is described by the closed set O ⊂ R n \ , which implies that the reference configurations is stress free. Contact is modeled by a linear non-penetration condition. To this end, we assume the existence of a contact mapping : C → ∂O, mapping all points on the contact boundary to an associated point on the boundary of the obstacle. This allows for the definition of a contact normal on the contact boundary C of the viscoelastic body, namely where ν : C → R n denotes the geometric outer normal on the contact boundary of the body in the reference configuration. We assume that ν ∈ W 1,∞ ( C , R d ) and refer to the mapping ψ : ω → ω − (ω) as the initial gap function on the contact boundary of the reference configuration, which is assumed to be quasi continuous on C . The choice of the underlying notion of capacity is not relevant in this formulation, as the polar sets coincide on C for all reasonable Sobolev capacities, cf. [9, Cor. 6.2]. In the case of the one body problem with linearized contact condition, the set of admissible states can then be described bȳ 3) The contact condition describes that the contact boundary of the body may not move into a certain direction further than its initial distance from the obstacle. Lastly, we point out that the continuous embedding of W ([0, T ]) → C(I, L 2 ) and the restriction of the state space to functions in C(I, At this point all modeling aspects to the setting have been described and we will focus on a mathematical formulation next.

Second order dynamics
With the preparation of Sect. 2.1 in mind, we can now establish the mathematical model for the optimal control of dynamic contact. The time continuous, viscoelastic contact problem comes to finding a y ∈ Y ∩K with y(0) = y ini ,ẏ(0) = v ini for which the hyperbolic variational inequality holds. This can be stated in a more compact way using the normal cone TK (y) • toK at y, so that the contact problem then reads In order to keep notation compact, we define the set K and the test space P by where the time dependent test functions are dense in L 2 (I, H 1 D ). For y ∈ L 2 (I, H 1 ) the mapping p → T 0 a(y(t), p(t))dt defines a bounded, linear functional on L 2 (I, H 1 ) because: for all p ∈ L 2 (I, H 1 ), since y(·) H 1 , p(·) H 1 are square integrable over time. The same holds for b. We also consider the operators A : Y → P * , f ∈ P * with Ay, (p, p 0 , q 0 ) P = ÿ, p L 2 (I,H 1 ) +a I (y, p)+b I (ẏ, p)+(y(0), q 0 ) which finally allows us to rewrite the continuous problem (2.4) in a more compact way as Here, (2.5b) represents the variational inclusion (2.4b) and enforces the initial values (2.4c) as well, because it splits up intö The last two lines in the latter ensure the initial values in the L 2 -sense and therefore in the H 1 -sense in the case of y(0) and y 0 since the weak derivative is unique and y ini ∈ H 1 D . Therefore, the two inclusions (2.5a)-(2.5b) represent the entire contact problem.

First order dynamics
As mentioned above, Sect. 3 will include a time discretization that can be interpreted as a Newmark type scheme. We will elaborate on this in the appropriate section. In order to describe this time-stepping procedure by a finite element discretization, the time continuous framework with second order dynamics needs to be modified beforehand to obtain a system of first order.
We redefine some of the sets in the previous subsection and fix as well as the modified operators A : this leads to the first order reformulation of the contact problem Here, (2.6b) splits up intȯ v + a I (y, The second line ensures that the velocity and the time derivative of the displacement coincide in the L 2 (I, H 1 ) * -sense.
Since v,ẏ ∈ W ([0, T ]) and the weak time derivative is unique, they coincide in the W ([0, T ])-sense as well. Finally, the first line is simply a restatement of the variational inclusion (2.4b) and the initial values have been adapted to fit the first order system.
Note that the variational inclusion (2.7a) can equivalently be expressed with the help of a multiplier λ ∈ TK (y) •  The multiplier λ can be interpreted as the contact forces acting upon the area of active contact when the unconstrained movement of the body is disrupted by the obstacle.

Continuous optimal control problem
Based on the first order reformulation, we consider the continuous optimal control problem with the dynamic contact problem as constraints as well as distributed control u ∈ U = L 2 (I, L 2 ). With the operator and a cost functional J : Y × V × U → R, this amounts to which is an optimal control problem with a dynamic contact problem as constraints, where the states are controlled in a distributed manner by the forces in the state system.

Semi-discretization of the contact problem
In this section, we present a finite element time discretization of the optimization problem (2.9), where the resulting discretized constraints correspond to the application of the contact implicit Newmark scheme, proposed by Kane et al. [22], to the constrained formulation of second order. The advantage of the contact implicit scheme over the classical Newmark scheme is improved stability in the constrained case, whereas there is no difference to the classic scheme, when no constraints are active, cf. also [26]. The finite element framework allows for the consistent derivation of an adjoint timestepping scheme that will be presented in Sect. 4.5 and leads to an optimal control problem with semi-discretized dynamic contact as constraints.

Finite element discretization
In order to handle the inequality structure in (2.9), we begin by introducing the multiplier λ ∈ TK (y) • , mentioned in (2.8), so the set of constraints (2.5a)-(2.5b) can equivalently be expressed by the system The semi-discretization follows the temporal part of the Petrov-Galerkin discretization presented in [34], where the authors investigate optimal control problems with control constraints for the wave equation. The discretization consists of dividing the temporal and we restrict ourselves to the equidistant case here, assuming |I k | = τ > 0 to be constant. The displacements, velocities, forces, controls, test functions and multipliers are then chosen from finite element spaces in the following way: The discretization is nonconforming with respect to the test functions p, which are discretized discontinuously. Also, the velocity, which is assumed to be the derivative of the piecewise linear state, is assumed piecewise linear itself see Figs. 2, 3. This leads to a symmetric averaging of implicit and explicit information when the states are updated from the velocities in the time-stepping scheme. All in all, we have piecewise linear and continuous states and velocities as well as forces and test functions that are piecewise constant and continuous from the left. Further, the multiplier to the contact condition is a linear combination of vector Dirac measures acting at the subinterval endpoints.
For the respective parts in (3.1b), we obtain: This decouples w.r.t. the test functions' values due to the discontinuous form of the test space and yields a time-stepping scheme. The velocity update (3.1c) in the discretized form reads Recall that we identify H 1 -functions with (H 1 ) * -functionals by use of the L 2 -Riesz isomorphism instead of the H 1 -isomorphism. Therefore, the dual pairings lead to the L 2 -terms seen in (3.2d)-(3.2e) and (3.3).
Since the initial values do not require any time discretization, only the state and multiplier constraints in (3.1a) are left to be discussed. To this end, we introduce the setK The pointwise in time state constraint formulation in (2.3) for a continuous, piecewise linear state with y k = y(t k ), k = 0 . . . N then reduces to y k ∈K τ , k = 0 . . . N due to the convexity of the setK τ . The multiplier constraint λ ∈ TK (y) • for y ∈K results in The variation over ϕ includes the choice

5) decouples and leaves us with the componentwise condition
(3.6)

Time-stepping scheme
The discontinuity of the test functions in the discretization of (3.1) leads to a set of equations that is decoupled with respect to the test functions' degrees of freedom and yields a modified Crank-Nicolson time-stepping scheme in the values ( (3.7e) The Crank-Nicolson scheme for an equivalent system of first order is well known to be equivalent to the symmetric (2β = τ = 0.5) , classic Newmark scheme applied to the corresponding form of a second order ordinary differential equation. The modifications in (3.7) lie in the purely implicit treatment of the contact forces λ k and the volume forces u k , f ext k .
In the case of the contact forces, this is the desired modification to the classical scheme, first presented in [22], guaranteeing energy dissipativity in the appropriate situation.
The implicit treatment of external forces in the time-stepping scheme is due to the discretization of the volume forces as piecewise constant in time, whereas a piecewise linear continuous discretization would yield an averaged input of current and future forces. This step is justified physically, since there is no apparent reason for the system to be influenced in a continuous manner only. Algorithmically, this discretization is sound as well, as we will see in the optimization Sect. 4, where we employ an adjoint based minimization technique and need test functions and controls (volume forces) to be contained in the same space in order to be able to add the computed corrections to the iterates without leaving the control space.
This implicit treatment of the external forces does not spoil the advantage of energy dissipativity gained by the implicit treatment of the contact condition because this only holds for constant external forces in the first place. The proof of energy dissipativity of the modified Newmark scheme in the viscoelastic framework can be obtained by minor modifications of [11,Thm. 2

.1] and its extension in [26, Thm. 2.4.2].
Discussion of the modified discretization In this subsection, we want to justify the particular choice of discretization. Specifically, the reason why the modifications to the temporal part of the Petrov-Galerkin discretization used in [34] were necessary. In the aforementioned paper, the authors present a nonconforming finite element discretization for the wave equation that results in the Crank-Nicolson scheme.
The key differences between the case in [34] and our application are twofold. Firstly, we do not want to obtain a discretization that corresponds to the symmetric Newmark scheme, which is equivalent to the Crank-Nicolson scheme in that case, but instead we want to obtain the contact implicit Newmark scheme. This requirement is due to the poor stability properties of the symmetric Newmark scheme in the contact constrained case, see, e.g., [37]. Secondly, we deal with a hyperbolic variational inequality instead of a hyperbolic partial differential equation. We want this variational inequality to be discretized so that it results in a set of N time independent variational inequalities in which the solutions to the variational inequalities are coupled sequentially and where the multiplier condition λ ∈ TK (y) • decouples completely.
By nature of the variational inequality, the multiplier condition (3.7e) in the continuous formulation is tested with a difference of two ansatz functions from the admissible set, meaning Here, ϕ, y are chosen fromK ⊂ Y and as ansatz functions, they are discretized piecewise linear and continuous. This introduces a coupling in (3.8) unless the multiplier is chosen to act only on the time discretization points t k , k = 1 . . . N , which leaves the vector valued Dirac measures as the only viable option.
The discretization as a whole retains physical relevance because the behavior of realistic displacements and velocities needs to be modeled continuously, while forces may change instantly. Allowing the contact forces to only act locally at the times of discretization to respect the contact constraints at those specific times is justified as well, due to the convex set of piecewise linear admissible states.

Optimal control of the semi-discretized problem
Following the time discretization in the previous section, we will now focus on the optimal control framework for the semi-discretized dynamic contact problem. We shortly state the analytic setting that all of the results in this section will be based upon and which we will assume to be known in this section.
The discrete setting involves the discretized controls u k , k = 1 . . . N , and the discretized tuples of states and velocities (y ini , v ini ), (y k , v k ), k = 0 . . . N . First note the following observation, which allows for a more compact notation: and can therefore be eliminated from the semi-discretized system. The initial values (y ini , v ini ) can be considered as right hand side input, removing y 0 from the unknowns.
Proof This immediately follows by a recursion argument for the velocity coupling (3.3) and by the correspondence between the initial values and the first states and velocities seen in (3.7a).
The examinations in this chapter therefore build on the discretized state-, controland test spaces Following Proposition 4.1, we define the linear operator and for the discretized operators, cf.
The right hand side f τ ∈ P τ * is a result of all affine parts that influence the system, i.e., the (scaled) external forces f ext,τ ∈ P τ * and the part involving all initial value We assume an appropriate representation of the discretized cost functional J : Y × V × U → R to be given as J τ : Y τ × U τ → R and define the set of admissible displacements as This leads to the semi-discretized optimization problem for all p ∈ P τ . The optimal control problem (4.1) includes all of the discretized constraints, since the velocity coupling and the initial values have been incorporated explicitly.
The reason for including the initial values in the right hand side directly, instead of enforcing the equality of y 0 and y ini , is a formal one. While the formulations are equivalent, the variational equation that enforces the equality of the initial values is not influenced by the control and we lose density of the image space of the operator B τ in P τ * , which is needed later on. This also means that we need y ini ∈K τ , which is a reasonable requirement.
In the following subsection, we will establish the existence of a solution operator to the variational inequality (4.1d), which allows us to show the existence of minimizers to the optimal control problem (4.1). We will show directional differentiability of the solution operator under the assumption of certain polyhedricity properties, cf. Definition 4.6, for the set of admissible states and we use the differentiability in order to derive optimality conditions of first order for the minimizers of (4.1).

Solutions of the state problem
In this subsection, we will show the existence of a Lipschitz continuous solution operator to the variational inequality (4.1c)-(4.1d). The considerations are largely based on the time-stepping interpretation of the variational inequality.
We begin by establishing the existence of a solution operator to the variational inequalities in each time step of the discretized dynamic contact problem that will be used in the representation of the solution operator to the complete variational inequality.
associated with the bilinear form is an isomorphism.

There exists a Lipschitz continuous solution operator
that maps any right hand side l ∈ (H 1 D ) * to the solution y of the variational inequality is well defined and Lipschitz continuous.
Proof The form d : Boundedness and coercivity with constantsM(τ ),M(τ ) > 0, both depending on the time discretization, follow by the respective properties of the forms (·, ·) L 2 , a, b: The Lax-Milgram lemma yields the first proposition. [38, Thm. 2.1] yields the existence of the Lipschitz continuous solution operator s : (H 1 D ) * →K τ , sinceK τ is closed and convex. This is due to the existence of a q.e. pointwise convergent subsequence to any H 1 D convergent sequence. Furthermore, for w ∈ P τ * , we denote z = (y 0 , . . . , y k , w) ∈ (H 1 D ) k+1 × P τ * . Then the operator l k+1 (z) : H 1 D → R is linear and because of for ϕ ∈ H 1 D and constantsM a/b , M l k+1 > 0, it is continuous as well and therefore the operator l k+1 : The Lipschitz continuity of z → l k+1 (z) follows from the affine linear structure with bounded linear part.
With the quantities from the previous lemma, we find the existence of a solution to the next time step in the time-stepping scheme.

Lemma 4.3 (Solution of a time step)
Let w ∈ P τ * be given. Under the assumptions of the discretized setting and assuming y ini = y 0 , y 1 , . . . , y k ∈K τ to be the solutions of the first k < N time steps of the discretized dynamic contact problem  (y 0 , . . . , y k , w)) where the operator l k+1 : (H 1 D ) k+1 × P τ * → (H 1 D ) * maps a right hand side for the time-stepping problem to a right hand side of the time step k + 1.
Proof For k = 0 . . . N −1, a time step t k → t k+1 corresponds to solving the variational inclusion which can be seen in the decoupling of (4.1d) with respect to the test functions, cf.  Proof Let w ∈ P τ * . We can recursively define the solution operator S to the state problem as where y k = S k (w) with, S 1 (w), . . . , S k−1 (w), w).
Therefore, the Lipschitz continuity of S : P τ * → K τ follows from the Lipschitz continuity of each component mapping The recursive definition of the time stepping solution operator shows that these Lipschitz constants will generally be dependent on the time discretization, specifically the number of time steps. A straightforward estimation of the Lipschitz constants yields no bound and we expect this to be an aspect that requires a significant amount of work before being able to pass to the time continuous problem in the limit.

Existence of optimal controls
The control-to-state operator now allows for deriving the existence of minimizers to the optimization problem, which is stated in the following theorem. then the optimal control problem min J τ (y, u) (4.8a) admits a solution (ȳ,ū).
Proof We follow the standard technique focusing on weak subsequential convergence of a minimizing sequence where compactness is supplied by the embedding of the L 2 controls into (H 1 ) * . Let (u (i) ) i∈N be a feasible minimizing sequence to J τ , so that J τ (S(u (i) ), u (i) ) → inf U τ J τ (S(·), ·). Due to the coercivity of the functional J τ , the sequence (u (i) ) i∈N is bounded in U τ , we obtain existence of a weakly convergent subsequence, which will also be denoted (u (i) ) i∈N , u (i) u, from the reflexivity of H 1 and L 2 .
From application of Schauder's theorem to the embedding H 1 → L 2 one obtains the compact embedding of (L 2 ) * → (H 1 ) * and we therefore obtain strong convergence The assumptions in Theorem 4.5 hold, e.g., for the tracking functional with quadratic regularization with α > 0, which is convex and continuous w.r.t. all arguments and therefore weakly lower semicontinuous, cf. [15, p. 562], and the regularization part guarantees the coercivity (4.7). The state dependent part in (4.9) follows from an approximation of y − y d L 2 (I,L 2 ) by the trapezoidal rule.

Differentiability properties of the solution operator
Before we can state optimality conditions for the minimizers, the differentiability properties of the solution operator need to be discussed. This subsection addresses these properties of the operator S : P τ * → Y τ . We begin by examining the directional differentiability of the operator s : (H 1 D ) * →K τ in an abstract setting and extend the results to the solution operator S. We end this subsection with examples, in which the proposed conditions are satisfied.

Directional differentiability of s
This subsection focuses on conditions that guarantee directional differentiability of the time-stepping operator s : (H 1 D ) * →K τ . Our examination of the operator's differentiability properties is based on Mignot's central result in [39,Sec. 2], which uses the notion of polyhedricity of a set as a key property and we therefore recall: The set that we will examine with respect to polyhedricity will be the set of admissible states. It is apparent that this is a property of the physical setup and its modeling. Theorem 2.1 of the aforementioned paper states K is polyhedric w.r.t. (y, d(y−w, ·)), then the projection operator P d K is directionally differentiable at w and the derivative at w in direction δw can be computed as the d(·, ·)-orthogonal projection of δw onto the critical cone to K w.r.t. (y, d(y − w, ·)), i.e., K K (y, d(y − w, ·)). Therefore, the previous theorem yields the directional differentiability of s : (H 1 D ) * → K τ together with an explicit expression for its derivative as long as the polyhedricity assumption holds. The aim of this section is therefore to show polyhedricity ofK τ w.r.t. (y, Dy − l) with D and l as in Lemma 4.2.

As stated in
In [39], the case of a simplified, "scalar" contact problem on an n-dimensional domain is studied as an example, where constraints are enforced solely on the boundary of the reference domain, but the unknown displacement is assumed to be a scalar function, modeling a displacement with respect to a prescribed direction. This results in the set of admissible displacements being For our setting of n-dimensional displacements on an n-dimensional domain, additional work is required in order to obtain polyhedricity. The additional difficulty is introduced, because the set C is replaced by a more complex set, involving the vector field ν on the contact boundary C , namelȳ The important case, where the contact normal coincides with the geometric normal on the contact boundary, ν = ν, has been considered by Betz [6], where he extends Mignot's proof of polyhedricity to this case and obtains polyhedricity of the admissible set in the sense of Theorem 4.7. For this special case, the assumptions of Theorem 4.7 are therefore satisfied and one obtains the directional differentiability of s : Recently, a comprehensive study of polyhedric sets by Wachsmuth has been published in [51] that extends to a rather general setting in Banach spaces with vector lattice structure with strongly-weakly lattice operators.
In the following, we will show polyhedricity of the setK τ ⊂ H 1= H 1 ( , R 3 ) in more general frameworks than those of Betz. Our strategy will be to reduce the question of polyhedricity in the vector valued case to a scalar case, that can be analyzed with either the techniques studied by Mignot or the ones of Wachsmuth. Our idea is to regardK τ as the preimage of C under a linear operator. We will derive some abstract results on how polyhedricity is inherited by preimages and then give examples, where these abstract conditions can be verified, later in this subsection.
To this end, let us fix the assumptions for a more general setting, in which we want to investigate polyhedricity. Note that the linear operator L is generally not injective, so it will have a nontrivial kernel. For a set R ⊂ H , the expression L −1 R in the following denotes the preimage of R in H . We begin with the following theorem on the commutativity of the preimage and the interior/closure operations of a set.
Due to the open mapping theorem by Banach and Schauder, L is also an open mapping. Several of the following results do not rely on the Hilbert space structure of H, V and can be extended to hold in Banach spaces. We restrict ourselves to the Hilbert space case for simplicity. This allows us to formulate the following lemma, which gives some insight into how we can express the kernel of L and the tangent cone to K at y ∈ K .
The adjoint operator to L will play a key role from here on. The closed range theorem, cf., e.g., [54,Section VII.5], states that L * : V * → H * is injective and has closed range and therefore im(L * ) = ker(L) ⊥ . (4.10) However, L * generally is not surjective, so we may only consider a linear, bounded inverse operator on its image, meaning L − * : im(L * ) → V * .

Lemma 4.10
In the setting of Assumption 4.8, we have the following additional information on the kernel of L as well as the radial, tangent and normal cones to K at a point y ∈ K : Proof Part 1 follows because for y ∈ K and δy ∈ ker(L) the equation Ly + λLδy = Ly ∈ C holds for any λ ∈ R and therefore δy ∈ R K (y). For part 2, let δy ∈ R K (y) andδy ∈ ker(L), then Ly + λLδy ∈ C for a λ > 0 and Ly + λL(δy +δy) = Ly + λLδy ∈ C holds for the same λ. Therefore δy +δy ∈ R K (y). The relation then implies part 3. Part 4 follows directly from Lemma 4.9 part 2 due to the commutation of preimage and closure: For part 5, fix a μ ∈ T C (Ly) • , then L * μ, δy H = μ, Lδy V ≤ 0 ∀δy ∈ T K (y) since Lδy ∈ T C (Ly). On the other hand, part 1 and the closed range theorem, cf. Consequently, assuming an r ∈ T K (y) • , there exists a μ ∈ V * such that r = L * μ. For any δv ∈ T C (Ly) there exists a w ∈ T K (y) with Lw = v and therefore μ, v = μ, Lw = L * μ, w = r, w ≤ 0.
The missing surjectivity of L * also explains the requirement r ∈ im(L * ) of the following lemma. Proof The idea to this proof is, to rewrite the radial cone and annihilator in the polyhedricity Definition (Def. 4.6) with the help of the linear operator L and use the commutativity of the closure and preimage from Lemma 4.9. We will start by gathering the prerequisites for the actual proof.
Due to Lemma 4.10, we know that R K (y) = L −1 R C (Ly).
By assumption, we know r ∈ im(L * ) = ker(L) ⊥ and we therefore directly obtain ker(L) ⊂ {r } ⊥ from duality. Therefore, Using the polyhedricity properties assumed on C, the commutativity results in Lemma 4.9-4.10 lead to the proof of the initial claim:

Remark 4.12
For the residual r of a variational inequality of the type seen in the time stepping solution operators with respect to a closed, convex admissible set K , we of course obtain r ∈ T K (y) • ⊂ im(L * ) by Lemma 4.10 part 5.
At this point, in order to show polyhedricity of the set K w.r. t. (y, r ), it suffices to give a linear mapping L : H → V so that L(K ) = C where C is polyhedric w.r.t. (Ly, L − * r ). We want to investigate the set C = {v ∈ H 1 D ( ) : v ≤ ψ q.e. on C } and show its polyhedricity as well as construct an operator L that satisfies the requirements of the previous section, which will yield information on the form of the tangent, normal and critical cones that are involved with our problem. There are several aspects to accomplishing this.
Mignot's results in [39] include characterizations of the involved cones. They help, e.g., with the interpretation of the adjoint problem. Polyhedricity of the set is shown, only for specific points in C and specific linear functionals, which are both linked to the scalar product in the Dirichlet space.
Using the polyhedricity results for C with Lemma 4.11 would therefore require the construction of a scalar product on H 1 D ( ) in which these specific directions coincide with the directions required in Lemma 4.11, i.e., to define a bilinear, bounded and coercive form d E on V = H 1 D ( ) and a right hand side g ∈ V * , such that Ly is the unique solution to the variational inequality associated with d E , g and the set C, where the residual coincides with L − * r and (V, d E ) forms a Dirichlet space.
The polyhedricity condition is therefore more easily obtained by the results in [51].

Lemma 4.13 (Polyhedricity of C) The set C
It is therefore a Hilbert space with lattice structure and bounded max operator, which implies a strongly-weakly continuity of the latter. Further, C is a set with upper bound in the sense of [51,Def. 4.9]. Therefore, all requirements for [51,Thm. 4.18] are satisfied, implying the polyhedricity of C. Remark 4.14 The previous lemma actually shows polyhedricity of C with respect to · H 1 ( ) . It is immediately clear from the Definition 4.6 that this property of course holds for all equivalent scalar products on W 1 D ( ).
The remaining task for our method is the construction of the linear operator L. Note that the previous corollary yields directional differentiability of s : (H 1 D ) * →K τ at any f ∈ (H 1 D ) * , not just at specific points in (H 1 D ) * . Remark 4. 16 We have chosen to assume existence ofν as a unit vector field on for simplicity of presentation. In applications, ν may be a given unit vector field, defined on C only. Then we have to extend ν to , e.g., by techniques of differential geometry, with a C 1 mapping ϕ : → C and definingν (ω) := ν (ϕ(ω)) or, alternatively, by a combination of extension results for Lipschitz functions and some mild regularity assumptions on C .

Hadamard differentiability of S
Assuming the time-stepping operator s : (H 1 D ) * →K τ is directionally differentiable, we can extend the differentiability to the time-stepping solution operator S : P τ * → K τ . The structure of the right hand sidesl k+1 (·) in each time step results from the sequential nature of the time-stepping scheme, so the right hand side in a time step depends on the solutions of the previous steps. Since a chain rule generally does not hold for directionally differentiable operators, directional differentiability of the timestepping solution operators s : (H 1 D ) * →K τ may not be sufficient for differentiability of the solution operator S : P τ * → K τ . The differentiability concept of Hadamard allows for an extension of the chain rule to the case of "tangential" directional differentiability and is recalled for the reader's convenience.
• If F and G are Hadamard differentiable, the composition H = F • G : X → Z is Hadamard differentiable as well.
Proof The computations are straightforward and included, e.g., in [44].
We already know s : (H 1 D ) * →K τ to be Lipschitz continuous, so whenever it is directionally differentiable, it is Hadamard differentiable as well and the chain rule holds. Therefore, the properties of the operators in the time steps transfer to the discretized contact problem.  w, δw) . . . l 1 (w, δw)) . . . (4.6). If δl → s (l, δl) is Lipschitz continuous, then so is δw → S (w, δw).
Proof This proof follows from induction. The operatorl 1 : P τ * → (H 1 D ) * , mapping the right hand side w ∈ P τ * to the right hand side of the first time step, has the form which is affine linear with a bounded linear part. Thereforel 1 is Fréchet differentiable, implying Hadamard differentiability with (4.11) Because s : (H 1 D ) * → H 1 D was assumed directionally differentiable and it is Lipschitz continuous, it is Hadamard differentiable as well. Lemma 4.18 then yields the Hadamard differentiability of the operator S 1 : P τ * → H 1 D with S 1 (w, δw) = s (l 1 (w),l 1 (w, δw)).
For 1 < k ≤ N this argument holds analogously. The mappingsl k (·) : P τ * → (H 1 D ) * are compositions of the affine linear maps l k : (H 1 D ) k × P τ * → (H 1 D ) * and the component mappings S i : P τ * → H 1 D , i = 1 . . . k − 1 of the solution operator S. The maps l k have bounded linear part and are therefore Fréchet differentiable, while the component maps S i , i = 1 . . . k, are Hadamard differentiable. Therefore, Hadamard differentiability of the operator S k : P τ * →K τ follows from the chain rule in Lemma 4.18, which also yields the representation of the directional derivative as The Lipschitz continuity of δw → S (w, δw) follows analogously to the Lipschitz continuity of the solution mapping S : P τ * → K τ from the Lipschitz continuity of the component mappings δw → S k (w, δw), which follows from the same type of induction argument.
In the following theorem, we summarize the results of the previous subsections and specify the form of the derivatives in the cases where Mignot's results on polyhedricity can be used.
allows for a unique solution operator with the critical cone K = N k=1 KK τ (y k ,l k (w) − Dy k ).
Proof We have already seen the existence of the Lipschitz continuous solution operator in Sect. 4.1.
A straightforward calculation, using the particular form of thel k and A τ yields the form (4.13) of the derivative. The Lipschitz continuity of δl → s (l, δl) is clear because of the representation as the solution operator of the variational inclusion associated with the critical cone KK τ (y k ,l k (w) − Dy k ) and [38,Thm. 2.1]. Therefore, we obtain the Lipschitz continuity of δw → S (w, δw) by Theorem 4.19.
At this point, we have a set of abstract conditions for existence and differentiability of the solution operator to the discretized variational inclusion that need to be verified in the respective concrete settings.

Examples
A first canonical example for the applicability of our theory is one body unilateral contact with a rigid plane, for which we will verify the assumptions of Theorem 4.20.

Example 4.21 (One body unilateral contact with plane)
The setting is the one, displayed on the left side of the illustration in Fig. 1. The plane may be slanted as well. The description then amounts to 1. the spaces H = H 1 D and V = H 1 D ( , R), 2. the constant contact normal ν : C → R n , 3. a positive gap function ψ ∈ L 2 ( C , R) and 4. the set of admissible statesK τ N withK τ = {y ∈ H 1 D : y · ν ≤ ψ q.e. on C }. The constant extension of ν toν : → R n yields a W 1,∞ ( )-function that satisfies the requirements of Theorem 4.15, soK τ is polyhedric in the sense of Theorem 4.20 and we obtain a Lipschitz continuous, Hadamard differentiable solution operator by the same theorem.
Even though we focus on one body contact problems in this paper, the techniques are applicable to two body problems as well, therefore we want to give a short outlook for the two body problems at this time. An overview on the specifics for modeling two body problems can be found in, e.g., [23]. being a smooth bijection with a smooth inverse and uniformly bounded Jacobian, for which (ω)−ω | (ω)−ω| is constant (Fig. 4). The description then amounts to
We define

First order optimality conditions
First, note the following lemma.

Lemma 4.23 (Density of controls) The image im(B τ ) of the operator B
and that the embedding operator E : H 1 → L 2 has trivial kernel.
Following the same argument used in Wachsmuth's example [50,Sec. 5.1], we can now derive necessary conditions of first order, which are stated in Theorem 4.24, based on a linearization of the optimal control problem and a density argument. The following will include an adjoint state p ∈ (P τ ) * * that can be identified with an element of the primal spacep ∈ P τ due to the reflexivity of P τ , and we denote both by p without further differentiation.
IfK τ is polyhedric in the sense of Theorem 4.20, then there exist multipliers p ∈ P τ , μ ∈ P τ * with Proof Due to the polyhedricity assumptions, we have Hadamard differentiability of the solution operator and the optimality of (ȳ,ū) = (Sū,ū) therefore implies Testing the previous line with ±δu as in [39,50] yields the existence of M(τ ) > 0 with δu defines a bounded functional and can be extended to a functional p ∈ (P τ ) * * = P τ , where ∂ u J τ (x) = B τ * p, see [50]. The density of im(B τ ) in P τ * yields implying that (δy, δξ) = (0, 0) is a global minimizer to the problem Together with the state inequality (4.14c)-(4.14d) the adjoint problem (4.15) and the stationarity condition (4.16) form the first order optimality system. When we refer to (4.15) as the adjoint problem, this is meant to include the constraint on the multiplier μ. We change the sign on the adjoint state p for consistency reasons.

Discussion of the optimality conditions
In this subsection, we take a closer look at the optimality conditions to the problem (4.14) that were established in the previous section, specifically at the adjoint equation.
We show how to interpret the adjoint problem as a sequential step-by-step scheme and shortly discuss existence of solutions to the adjoint problem and their role in the optimality conditions. We defineλ = B τū + f τ − A τȳ as the residual for the elastic problem at the optimizer. Recall thatλ can be interpreted as a set of contact forces in the forward problem. Now, the adjoint problem in (4.15)-(4.16) consists of the conditions for the adjoint state p and the multiplier μ, with as well as the equation Variational form Testing (4.18) with y ∈ Y τ yields which can be rewritten as a(y 1 , p 1 ) cf. the definition of A τ in the beginning of this section. There is a close resemblance to the form in (4.2), where p was the test function. The decoupling into a time-stepping scheme was apparent in that case. Here, y is the test function, but the decoupling is inherent to the form of A τ . When p is discretized, (4.18) decouples as well and reveals the same step-by-step structure when the components y k , k = 1 . . . N are varied independently. The adjoint problem (4.18) can then equivalently be interpreted as the following stepping scheme for being interpreted as adjoint velocities. The adjoint velocities are stated explicitly in p k for k = 1 . . . N , just as the velocities in the forward problem have been earlier in this section, cf. Proposition 4.1. We also recognize the coercive, bounded, bilinear form d : a(·, ·) that defined the operator in all of the time steps to the state problem, c.f. Lemma 4.2. The structure with respect to the multiplier μ is similar as well, as it is treated fully implicitly in each backward step. The linearization of the cost functional contributes to the right hand side of the adjoint problem, as usual in adjoint problems.
Adjoint stepping scheme When we replace the explicit representation of the adjoint velocities (4.21)-(4.22) by a step-based update and include the restrictions (4.17) on p and μ, this leads to the following backward time-stepping scheme with terminal condition and time steps for k = 1 . . . N − 1 The contact forces in each timestep arē as defined in (4.5b). The system decouples with respect to the values p k , q k and involves computing p k from (4.24c) under the constraints (4.24a)-(4.24b). The value to q k is then computed from an explicit update in (4.24d) and the same holds for the terminal condition.
Adjoint boundary conditions In order to better understand the structure of the time steps of the adjoint problem, specifically the boundary conditions for the adjoint problem, we will derive a pointwise interpretation of (4.24a) for k = 1 . . . N . This means that we want to analyze the critical cone TK τ (y k ) ∩ {λ k } ⊥ under Assumption 4.8, i.e., given the extensionν ∈ W 1,∞ ( , R n ) of the contact normal ν on C in Corollary 4.15. To this end, let y ∈ Y τ and be the region of contact, defined up to sets of capacity 0. We seek to obtain a decomposition A k = S k ∪ A k \ S k = S k ∪ W k of the active set into a set of strong contact, i.e., a set where the contact forces actively prevent interpenetration of the body and the obstacle, and a set of weak contact, i.e., a set where the body and the obstacle are in contact without penetration and without any input of contact forces. Since C is quasi closed and is quasi continuous, the functioñ where M + (cl( )) denotes the positive Radon measures on the closure of the domain. Now consider λ k ∈ TK τ (y k ) • and let ξ k be the measure that represents λ k . The set {v ∈ H 1 D ( ) : v = 0 ξ k -a.e.} is a closed ideal, cf. [47], and Theorem 1 of the same note yields a set S k ⊂ cl( ), that is uniquely defined up to capacity 0, such that on W k and y · ν = 0 q.e. on S k }.
[24, Theorem 1.5] implies that S k ⊂ A k , in the sense that S k \ A k is a polar set, and we call S k the region of strong contact and W k = A k \ S k the region of weak contact. The boundary conditions for the adjoint state can therefore be rewritten as p k · ν ≤ 0 q.e. on W k , p k · ν = 0 q.e. on S k , i.e., sliding boundary conditions for p k on the region of strong contact and unidirectional nonpenetration conditions for p k on W k . On C \ A k , there are no restrictions on p k , therefore we have homogeneous Neumann boundary conditions on this part of the boundary.
With these characterizations in mind, we further obtain the following result, which will later help with the choice of a numerical approach. Proof When the critical cone is linear, the time steps in the adjoint problem are linear problems with coercive bilinear forms and thus have a linear solution operator. Linearity of the directional derivative to the solution operator of the contact problem follows directly from the linearity of the critical cone the directional derivative projects onto, cf. Theorem 4.7. Fréchet differentiability of the reduced problem is a result of the compact operator B : U τ → Y τ * and Fréchet differentiability of the cost functional. The form of the Fréchet derivative is obtained by proving that J y (S(Bu + f ))S (Bu + f, ·) also is the unique solution to the linear adjoint problem.
It is clear, that the linearity condition for the critical cone is only satisfied whenever the biactive set W k has vanishing capacity, i.e., there is no weak contact.
Relation to Crank-Nicolson scheme Condition (4.24d) can be restated explicitly for p k and from (4.24c), we can compute an expression for p k+1 − p k that can be plugged into (4.24d). Combining the two resulting conditions, we obtain

Numerics
This section is dedicated to the presentation of numerical results for a simple optimization scheme, based on the adjoint problem in Sect. 4.5. Theorem 4.25 states that if the region of weak contact is a polar set, i.e., of capacity 0, the one body optimal control problem for sufficiently regular C is Fréchet differentiable. Since weak contact is a rather unlikely scenario in many applications, we expect to be able to obtain Fréchet derivatives by solving the linear adjoint problem and using the known representation of the Fréchet derivative via the adjoint state in all but few degenerate scenarios. Our algorithm therefore takes the following form: Algorithm 1 Optimization 1: Fix initial control u 0 , set k = 0; 2: while stopping criterion not satisfied do 3: Compute state y k from control u k ; 4: Compute p k as the unique solution to the adjoint problem but enforce sliding boundary conditions on the entire active set A k ; 5: Use L 2 Riesz representative of B * p k + J u as correction with suitable damping; 6: k → k + 1; In all of our computations, we have used a fixed upper bound on the number of iterations as the stopping condition. The modification in step 4 can be interpreted as treating the entire active set as the set of strong contact. Consequently, when we obtain a state where this is the case anyway, i.e., where the set of weak contact is a polar set, then our method performs a gradient step in step 5. If there is weak contact present, we modify the adjoint problem and obtain a uniquely solvable system. The approach is motivated by the research presented in [41], which suggests that the resulting correction step may coincide with a subgradient step.
Clearly, this approach is a very simple one and convergence analysis is unavailable. A detailed analysis of algorithms to be used for these problems is outside the scope of this article however, and we would like to point out that the numeric results in this section seem to support our thesis in the beginning of this section, i.e., that the non Fréchet differentiable situation is not encountered often or even regularly.
As a testproblem, we consider a problem of the type (4.1) with a linearly viscoelastic body in the shape of a half sphere of radius 15m with a Kelvin-Voigt type response that comes into contact with a rigid plane on the time interval I = [0, 0.075 s], which equals 150 time steps of length τ = 5 · 10 −5 s. The body is considered to be at rest at time t = 0 and homogeneous Dirichlet conditions are prescribed on the top section of the boundary while the contact boundary is assumed to lie within the middle third of the spherical boundary section.
Young's modulus and Poisson's ratio of the body's material were chosen to be E = 10 8 Pa, ν = 0.3 and bulk as well as shear viscosity were taken as 10 4 Pa s.
We search for minimizers to an approximation of the tracking type objective functional where u const ∈ L 2 and u const (ω) = ue y , u ∈ R n for the nth normal vector e n , resulting in a bouncing motion of the ball with contact being established and released several times while being damped by the viscose part. The final time observation entered the cost functional scaled by a parameter γ = 7.5 · 10 −4 . We chose the Tychonoff parameter to be α = 10 −2 . The control u was of first order of magnitude and was scaled by 10 6 N/m 2 when entering the right hand side as a force distribution in order to avoid handling controls with high order magnitudes, which would lead to very small Tychonoff parameters and poor optimizer behavior in the first iterations, especially w.r.t. the scaling of search directions.
This amounts to the optimal control problem min J τ (y, u) = The proposed algorithm to finding minimizers is based on an iterative procedure in the framework of [43], where a one dimensional search space, i.e., the search direction, is computed from the stationarity condition stated in Theorem 4.24 and appropriate stepsize control factors are calculated based on a quadratic regularization technique.
For the numerical treatment, we extend the time discretization to a full discretization with a P1 nodal basis for a spatial triangulation of the domain ⊂ R 2 . The resulting time-independent variational inequalities (4.5) in each of the time steps have been solved by a monotone multigrid solver [30] using (projected) block Gauß-Seidel schemes as base solver and smoothers. No weak contact occurred in the forward problems and therefore no additional treatment was required.
In this example, the stationarity condition (4.16) requires u = 1 α p. The adjoint state p satisfies sliding boundary conditions and homogeneous Dirichlet conditions on the sections of active contact and the Dirichlet boundary section, respectively. Therefore, any minimizing control to the problem will necessarily have no input on the Dirichlet boundary patch and no input in y-direction on active contact patches. Figure 5 shows that the resulting control is being reduced to values close to zero where contact is active and where the body is clamped by the Dirichlet conditions, as expected. The symmetry of the problem is responsible for the entire control being reduced to zero on the contact patch, instead of only its y-component. Over the course of the algorithm, the distance of the iterates to the reference control decreases at first and starts increasing again after the first 200 iterations, see Fig. 6. From 200 iterations on, the Tychonoff term in the costfunctional becomes increasingly relevant.
The development of the functional value during the iteration is shown in Fig. 7 as the difference of the current iterate's value and the functional value of the resulting control after 1000 iterations and behaves almost linearly.
Due to the end point observation in the cost functional, the difference of the result state and the desired state is roughly of the same order over the entire time intervall, Finally, Fig. 9 shows the qualitative behavior of the contact forces on the contact boundary, plotted over the horizontal position in the reference configuration. As expected, the contact forces are symmetric, as the problem is symmetric itself. The oscillatory behavior of the contact forces can be attributed to the spatial discretization and the effect decreases with mesh refinements. Some modifications can be applied to alleviate this effect (cf., e.g., [11,12,17]). Since this paper considers time-discretization only, we did not apply any of these for simplicity.

Conclusion and outlook
In this work, some steps towards the optimal control of dynamic contact problems, particularly in finding numerical solutions, have been taken. While of the optimal control problem in the time continuous case is still out of scope, we were able to establish a satisfactory theory for the time-discretized case.
Key to this analysis and to the numerical solution was the construction of a finite element method in time that represents a variant of the contact implicit Newmark scheme due to Kane et. al. For this discretization, we were able to extend the results of Mignot on strong stationarity from the scalar valued stationary case to the vector valued time-sequential case. Key ideas were the study of inheritance of polyhedricity under linear mappings and the use of Hadamard differentiability. A further extension to the time continuous case seems to be a very difficult, but also rewarding task. The straightforward idea of passing to the limit for τ → 0 involves severe mathematical difficulties.
A major aim of our analysis was the derivation of a time discrete adjoint equation that can be evaluated numerically by a backward time-stepping scheme. This is the foundation for our gradient based algorithm, which enabled us to numerically solve an optimal control problem subject to time discretized dynamic contact. Up to now, this algorithm relies on the circumstance that the non-smoothness due to weak contact plays a minor role in the examples considered so far. It is subject to current research to extend this algorithm to situations where the effects of non-smoothness are more severe.
Up to now, the applied model is only valid for small deformations and thus only for small movements of the elastic body. For practical applications, an extension to larger movements, like rotations, which is often done by factoring out rigid body motions, will be necessary. While things become more involved numerically and notationally, we conjecture that our theoretical findings will carry over to that case. The treatment of dynamic contact in the context of finite strains, where the difficulties of nonlinear elasticity and dynamic contact merge, is a lot more demanding. The optimal control of such problems will certainly require a major research effort in the future.