Penalized Weighted Trace Minimization for Optimal Control Device Design and Placement

James Cheung, Toyon Research Corporation, 6800 Cortona Drive, Goleta, CA 93117.
Abstract.

In this paper, we present a new analytical framework for determining the well-posedness of constrained optimization problems that arise in the study of optimal control device design and placement within the context of infinite-dimensional linear quadratic control systems. We first prove the well-posedness of the newly minted “strong form” of the time-independent operator-valued Riccati equation. This form of the equation then enables the use of trace-class operator analysis and the Lagrange multiplier formalism to analyze operator-valued Riccati equation-constrained optimization problems. Using this fundamental result, we then determine the conditions under which there exist unique solutions to two important classes of penalized trace minimization problems for optimal control device placement and design.

The material presented in this paper was developed independently by the author and is not associated with any work related to his affiliated organization.

1. Introduction

The purpose of this work is to address the fundamental question of well-posedness for optimization problems associated with optimal sensor and actuator placement and design in the context of infinite-dimensional linear systems theory. To consolidate definitions, we will collectively refer to sensors and actuators as control devices. In the context of the linear quadratic regulator (LQR), the optimization problem of interest is the following:

\min_{p\in\mathcal{P}}\left\{\min_{u(\cdot)\in L^{2}(\mathbb{R}_{+};U)}\int_{0}^{+\infty}\left[\left(z(t),\mathbf{Q}z(t)\right)_{H}+\left(u(t;p),\mathbf{R}u(t;p)\right)_{U}\right]dt\right\}

subject to

\left\{\begin{aligned}\frac{\partial z}{\partial t}&=\mathbf{A}z(t)+\mathbf{B}(p)u(t;p)\\ z(0)&=z_{0}\end{aligned}\right.,

for all $t\in[0,\infty)$; where, denoting $H$, $U$, and $\mathcal{P}$ as the separable Hilbert spaces associated with the state, control, and parameter respectively, we define $p\in\mathcal{P}$ to be the generalized design parameter (e.g. actuator placement or geometric design variables), $z(\cdot)\in H$ to be the state variable, $u(\cdot;p)\in L^{2}(\mathbb{R}_{+};U)$ to be the control variable, $\mathbf{Q}\in\mathscr{L}^{s}(H)$ to be the output operator, $\mathbf{R}\in\mathscr{L}(U)$ to be the control weighting operator, $\mathbf{B}(p)\in\mathscr{L}(U;H)$ to be the parametrized control operator associated with the control device (e.g. sensors or actuators), and $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ to define the state process as the generator of a $C_{0}$-semigroup. The Dynamic Programming Principle determines that the optimal control, for a fixed value $p\in\mathcal{P}$, is given by

u_{opt}(t;p)=-\mathbf{R}^{-1}\mathbf{B}^{*}(p)\mathbf{X}(p)z(t),

where $\mathbf{X}(p)\in\mathscr{L}(H)$ is the solution to the weak-form operator-valued Riccati equation

\left(\phi,\left[\mathbf{A}^{*}\mathbf{X}(p)+\mathbf{X}(p)\mathbf{A}-\mathbf{X}(p)\mathbf{B}(p)\mathbf{R}^{-1}\mathbf{B}^{*}(p)\mathbf{X}(p)+\mathbf{Q}\right]\psi\right)_{H}=0

for all $\phi,\psi\in\mathcal{D}(\mathbf{A})$. It then follows [9, Theorem 6.2.4] that

\begin{aligned}\min_{u(\cdot)\in L^{2}(\mathbb{R}_{+};U)}\int_{0}^{+\infty}\left[\left(z(t),\mathbf{Q}z(t)\right)_{H}+\left(u(t;p),\mathbf{R}u(t;p)\right)_{U}\right]dt&=\left(z_{0},\mathbf{X}(p)z_{0}\right)_{H}\\ &=\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right),\end{aligned}

where $\mathbf{W}(\cdot):=z_{0}\left(z_{0},\cdot\right)_{H}$ is the operator generated through the exterior product of the initial condition $z_{0}\in H$ of the system with itself. With these combined observations, the optimal actuator placement and design problem becomes

\min_{p\in\mathcal{P}}\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)

subject to

\left(\phi,\left[\mathbf{A}^{*}\mathbf{X}(p)+\mathbf{X}(p)\mathbf{A}-\mathbf{X}(p)\mathbf{B}(p)\mathbf{R}^{-1}(p)\mathbf{B}^{*}(p)\mathbf{X}(p)+\mathbf{Q}\right]\psi\right)_{H}=\mathbf{0},

for all $\phi,\psi\in\mathcal{D}(\mathbf{A})$. This same optimization problem, albeit with $\mathbf{A}$ and $\mathbf{A}^{*}$ switched in the statement of the operator-valued Riccati equation, carries over to the setting of optimal sensor placement and design for linear state estimation systems (i.e. in the context of the Kálmán filter). Therefore, the study of operator-valued Riccati equation-constrained weighted trace minimization problems has broad-reaching implications for designing optimal systems for both control and state estimation.
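
To make the objective concrete, the following minimal finite-dimensional sketch (not taken from the paper) evaluates trace(X(p)W) for a truncated system; the stable diagonal generator, the Gaussian actuator profile B(p), and all numerical values are hypothetical choices made purely for illustration.

# Finite-dimensional sketch of the design objective trace(X(p) W) (illustrative only).
import numpy as np
from scipy.linalg import solve_continuous_are

n = 40                                            # truncation dimension of H
xs = np.linspace(0.0, 1.0, n)
A = -np.diag(np.arange(1, n + 1, dtype=float))    # stable diagonal stand-in for the generator A
Q = np.eye(n) / n                                 # output weighting operator
R = np.array([[1.0]])                             # control weighting operator
z0 = np.ones(n) / np.sqrt(n)                      # initial condition of the system
W = np.outer(z0, z0)                              # W = z0 (z0, .)_H

def B(p):
    # Hypothetical parametrized control operator: a Gaussian actuator profile centered at p.
    return np.exp(-0.5 * ((xs - p) / 0.05) ** 2).reshape(n, 1)

def objective(p):
    # solve_continuous_are solves A^T X + X A - X B R^{-1} B^T X + Q = 0,
    # the finite-dimensional analog of the weak-form Riccati constraint above.
    X = solve_continuous_are(A, B(p), Q, R)
    return np.trace(X @ W)

print(objective(0.3), objective(0.7))             # compare two candidate placements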

The problem of optimal device placement and design has its origins in the work of [2], where the notion of optimal sensor placement was introduced for infinite-dimensional systems. This theory was then extended to actuator placement in LQR systems in [16]. Additional notable extensions of the theory have been made to device placement in $\mathcal{H}_{\infty}$ control systems [13] and joint parameter estimation and sensor placement in partially observed systems [18]. The theory of optimal device placement and design has been applied to many practical problems, including optimizing thermal control [14], vibration damping [17], and mobile sensor [4] systems. A general observation made when studying the existing literature on optimal device placement and design is that a constrained optimizer $p_{opt}\in\mathcal{P}$ minimizing the weighted operator trace of $\mathbf{X}(p)$ can be shown to exist; however, uniqueness of the optimizer is almost always left as an open problem. This is the key motivator for writing this paper.

We approach the problem of determining the conditions under which a unique solution exists for the constrained trace minimization problem associated with optimal device placement and design through penalization techniques. The first problem we study is the control penalization problem, i.e., to seek a constrained minimizer of

\mathcal{J}_{\beta}(p)=\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)+\frac{\beta}{2}\left\|p\right\|^{2}_{\mathcal{P}},

where we have introduced the penalization parameter $\beta\in\mathbb{R}_{+}$ to regularize the cost functional. This penalization scheme is a classical technique used to improve the conditioning of optimization problems that arise in inverse problems [15] and optimal control [19]. We demonstrate in §4 that, as expected, choosing $\beta$ sufficiently large induces uniqueness of the constrained minimizer $p_{opt}\in\mathcal{P}$.
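
Continuing the hypothetical finite-dimensional sketch above (reusing objective(p) and an illustrative value of β), the control-penalized functional can be minimized with an off-the-shelf optimizer:

# Sketch of minimizing the control-penalized cost functional (illustrative only).
from scipy.optimize import minimize

beta = 10.0                                       # penalization parameter (illustrative value)

def J_beta(p):
    # trace(X(p) W) + (beta/2) ||p||^2, the control-penalized cost functional
    return objective(float(p)) + 0.5 * beta * float(p) ** 2

result = minimize(lambda p: J_beta(p[0]), x0=[0.5], method="Nelder-Mead")
print(result.x)                                   # the numerically obtained penalized minimizer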

The study of the first penalized optimization problem will form the pedagogical basis for the second problem discussed in this work: the determination of conditions under which a unique optimizer $p_{opt}\in\mathcal{P}$ exists on constraint manifolds defined by

\texttt{trace}\left(\mathbf{B}(p)\mathbf{R}^{-1}(p)\mathbf{B}^{*}(p)\right)=\gamma,

where $\gamma\in\mathbb{R}_{+}$ is a positive constant. This constraint on $\mathbf{B}(p)\mathbf{R}^{-1}\mathbf{B}^{*}(p)$ serves the practical purpose of constraining the so-called gain of the control device, e.g. the amount of control effort used by the control feedback law. Following the theme of penalization, this additional trace constraint is approximately enforced through the following penalized cost functional:

\mathcal{J}_{\beta}(p)=\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)+\frac{\beta}{2}\left[\texttt{trace}\left(\mathbf{B}(p)\mathbf{R}^{-1}(p)\mathbf{B}^{*}(p)\right)-\gamma\right]^{2}.

Exact constraint enforcement is achieved in the limit as $\beta\rightarrow+\infty$. We determine the conditions under which there exists a unique constrained minimizer of this cost functional in §5. The penalization introduced in the second problem serves primarily as a mechanism for determining uniqueness on the additional constraint manifold.
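
Again reusing the hypothetical B(p), R, and objective(p) from the sketches above, the gain-constrained variant can be evaluated in the same way; the values of γ and β below are illustrative only.

# Sketch of the gain-constrained penalized functional (illustrative only).
import numpy as np
from scipy.optimize import minimize

gamma, beta = 5.0, 100.0

def gain(p):
    # trace(B(p) R^{-1} B(p)^T), the "gain" of the control device
    Bp = B(p)
    return np.trace(Bp @ np.linalg.inv(R) @ Bp.T)

def J_beta_gain(p):
    return objective(float(p)) + 0.5 * beta * (gain(float(p)) - gamma) ** 2

result = minimize(lambda p: J_beta_gain(p[0]), x0=[0.5], method="Nelder-Mead")
print(result.x, gain(result.x[0]))                # constraint enforced only approximately for finite beta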

The constrained optimization problems are cast into the Lagrange multiplier formalism to study the two penalized constrained trace minimization problems posed above. A fixed-point argument (cf. the Banach fixed-point theorem [7, Theorem 3.7-1]) is used to determine the uniqueness of the solution to the first-order optimality system associated with the Lagrangian functional for each problem. The second-order sufficient optimality condition is then applied to determine that the unique solution of the first-order optimality system is indeed a minimizer.

While this strategy is sound at first glance, we quickly run into technical issues when attempting to apply the Lagrange multiplier formalism to the operator-valued Riccati equation. The problem is that a sufficiently “strong” form of this equation, i.e. of the form

\mathbf{A}^{*}\mathbf{X}+\mathbf{X}\mathbf{A}-\mathbf{X}\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^{*}\mathbf{X}+\mathbf{Q}=\mathbf{0},

has not been shown to be well-posed on all of the Hilbert space $H$ associated with the state of the system. This inhibits the use of the trace-class operator-analytic tools required to rigorously derive the first-order optimality system associated with the constrained optimization problems discussed in this work. The prior literature (see e.g. [3]) only indicates that the operator-valued Riccati equation (without appealing to its Bochner integral form [5]) is well-defined as an operator equation on the domain $\mathcal{D}(\mathbf{A})$. We overcome this technical hurdle in Theorem 1 by determining that the “strong” form of the operator-valued Riccati equation is in fact well posed.

The structure of the paper is the following: First, we begin in §2 with the notation that will be used throughout the work; then in §3 we analyze the well-posedness of the strong form of the operator-valued Riccati equation and its associated dual problem, along with the Lipschitz continuity of their solutions with respect to the control parameter $p\in\mathcal{P}$. Next, in §4 and §5 we analyze the penalized problems of interest, and finally we conclude this paper with a discussion of our findings in §6.

2. Notation

Let $H$ be a separable complex Hilbert space with its inner product denoted as $\left(\cdot,\cdot\right)_{H}:H\times H\rightarrow\mathbb{C}$. Throughout this work, we will denote by $\left\{e_{i}\right\}_{i=1}^{\infty}$ an orthonormal basis of $H$. This means that any element $\phi\in H$ can be represented as

\phi=\sum_{i=1}^{\infty}c_{i}e_{i},

where $\left\{c_{i}\right\}_{i=1}^{\infty}\subset\mathbb{C}$ are scalar coefficients, and that $\left(e_{i},e_{j}\right)_{H}=\delta_{ij}$ with

\delta_{ij}:=\begin{cases}1&\textrm{if }i=j\\ 0&\textrm{otherwise}\end{cases}

denoting the Kronecker delta function. We denote the space of bounded linear operators mapping $H$ into $H$ as $\mathscr{L}\left(H\right)$, where the $\mathscr{L}\left(H\right)$ norm is defined by

\left\|\mathbf{T}\right\|_{\mathscr{L}\left(H\right)}:=\sup_{\begin{subarray}{c}\phi\in H\\ \phi\neq 0\end{subarray}}\frac{\left\|\mathbf{T}\phi\right\|_{H}}{\left\|\phi\right\|_{H}}.

For any $\mathbf{T}\in\mathscr{L}\left(H\right)$, the adjoint operator $\mathbf{T}^{*}\in\mathscr{L}\left(H\right)$ is defined through the inner product by

\left(\phi,\mathbf{T}\psi\right)_{H}=\left(\mathbf{T}^{*}\phi,\psi\right)_{H}

for all $\phi,\psi\in H$.

In this work, we are also interested in the space of bounded linear operators on Banach spaces. Let $V_{1}$ and $V_{2}$ be two complex Banach spaces. We then denote the space of bounded linear operators mapping $V_{1}$ to $V_{2}$ as $\mathscr{L}\left(V_{1};V_{2}\right)$. The $\mathscr{L}\left(V_{1};V_{2}\right)$ norm is then defined by

\left\|\mathbf{T}\right\|_{\mathscr{L}\left(V_{1};V_{2}\right)}:=\sup_{\begin{subarray}{c}\phi\in V_{1}\\ \phi\neq 0\end{subarray}}\frac{\left\|\mathbf{T}\phi\right\|_{V_{2}}}{\left\|\phi\right\|_{V_{1}}}.

With the basic functional analytic notation defined, we now move on to define the more technical notions needed for this work.

2.1. Trace-Class Operators

Trace-class operators generalize the notion of finite-dimensional matrices with finite trace (i.e. the matrix Lie algebra $\mathfrak{gl}(n;\mathbb{C})$) to the setting of infinite-dimensional linear operators. The formal definition for the space of trace-class operators $\mathscr{J}_{1}(H)\subset\mathscr{L}(H)$ is given by

\mathscr{J}_{1}(H):=\left\{\mathbf{T}\in\mathscr{L}(H):|\texttt{trace}\left(\mathbf{T}\right)|<\infty\right\},

where the operator trace $\texttt{trace}\left(\cdot\right):\mathscr{J}_{1}(H)\rightarrow\mathbb{C}$ is defined by

\texttt{trace}\left(\mathbf{T}\right):=\sum_{k=1}^{\infty}\left(e_{k},\mathbf{T}e_{k}\right)_{H}

for all $\mathbf{T}\in\mathscr{J}_{1}(H)$, where $\left\{e_{k}\right\}_{k=1}^{\infty}$ again forms an orthonormal basis of the Hilbert space $H$. The $\mathscr{J}_{1}(H)$ norm is then defined by

\left\|\mathbf{T}\right\|_{1}:=|\texttt{trace}\left(\mathbf{T}\right)|,

where $|\cdot|:\mathbb{C}\rightarrow\mathbb{R}_{+}$ denotes the modulus on the field $\mathbb{C}$. From [8, Theorem 18.11], we have that $\mathscr{J}_{1}(H)$ is a two-sided *-ideal in $\mathscr{L}(H)$, meaning that for any $\mathbf{U}\in\mathscr{L}(H)$ and $\mathbf{V}\in\mathscr{J}_{1}(H)$, we have that $\mathbf{U}\mathbf{V}\in\mathscr{J}_{1}(H)$ and $\mathbf{V}\mathbf{U}\in\mathscr{J}_{1}(H)$.

Using the operator trace and the fact that $\mathscr{J}_{1}(H)$ is a two-sided *-ideal in $\mathscr{L}(H)$, we are able to induce the following definition of the duality pairing $\left<\cdot,\cdot\right>:\mathscr{L}(H)\times\mathscr{J}_{1}(H)\rightarrow\mathbb{C}$:

\left<\mathbf{U},\mathbf{V}\right>:=\texttt{trace}\left(\mathbf{U}^{*}\mathbf{V}\right)

for all $\mathbf{U}\in\mathscr{L}(H)$ and $\mathbf{V}\in\mathscr{J}_{1}(H)$. There is a one-to-one correspondence (i.e. an isometric isomorphism) between $\left<\mathbf{U},\cdot\right>:\mathscr{J}_{1}(H)\rightarrow\mathbb{C}$ and $\mathscr{J}_{1}(H)^{\prime}$, the dual space of $\mathscr{J}_{1}(H)$ [8, Theorem 19.2]. From the definition of the duality pairing and the invariance of the trace under conjugation, we have that

(1) \left<\mathbf{U},\mathbf{V}\right>=\left<\mathbf{I},\mathbf{U}^{*}\mathbf{V}\right>=\left<\mathbf{I},\mathbf{V}^{*}\mathbf{U}\right>=\left<\mathbf{V}^{*},\mathbf{U}^{*}\right>,

where we have denoted $\mathbf{I}$ as the identity element in $\mathscr{L}\left(H\right)$. This identity will be utilized frequently in the derivation of the first-order optimality conditions associated with the penalized optimization problems studied in §4 and §5.
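
The pairing and the first and last equalities in (1) are easy to check numerically in finite dimensions; the sketch below (illustrative only) uses random complex matrices as stand-ins for U in L(H) and V in J_1(H).

# Finite-dimensional sanity check of the pairing <U, V> = trace(U* V) and of identity (1).
import numpy as np

rng = np.random.default_rng(0)
n = 5
U = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
I = np.eye(n)

pair = lambda S, T: np.trace(S.conj().T @ T)      # <S, T> := trace(S* T)

print(np.isclose(pair(U, V), pair(I, U.conj().T @ V)))          # <U, V> = <I, U*V>
print(np.isclose(pair(U, V), pair(V.conj().T, U.conj().T)))     # <U, V> = <V*, U*>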

2.1.1. Symmetric Operators

Of particular interest in this work is the subspace of symmetric operators in $\mathscr{J}_{1}\left(H\right)$ and $\mathscr{L}\left(H\right)$. An operator $\mathbf{T}\in\mathscr{L}\left(H\right)$ is symmetric if

\left(\phi,\mathbf{T}\psi\right)_{H}=\left(\phi,\mathbf{T}^{*}\psi\right)_{H}

for all $\phi,\psi\in H$. The subspace of all symmetric operators in $\mathscr{L}\left(H\right)$ will be denoted as $\mathscr{L}^{s}\left(H\right)$. We will then define $\mathscr{J}_{1}^{s}(H):=\mathscr{J}_{1}(H)\cap\mathscr{L}^{s}(H)$ as the space of symmetric operators in $\mathscr{J}_{1}(H)$. The norms for the spaces $\mathscr{L}^{s}(H)$ and $\mathscr{J}_{1}^{s}(H)$ coincide with the $\mathscr{L}(H)$ and $\mathscr{J}_{1}(H)$ norms respectively.

Semi-definite operators arise frequently during the course of the discussion presented in this work. A symmetric operator $\mathbf{T}\in\mathscr{L}^{s}(H)$ is positive semi-definite if

\left(\phi,\mathbf{T}\phi\right)_{H}\geq 0

for all $\phi\in H$. A symmetric operator is then positive definite if the inequality is strict for all $\phi\neq 0$.

2.2. Exponentially Stable C0C_{0}-Semigroups

Following [12, §2], we define a $C_{0}$-semigroup as a one-parameter family of operators $\left\{\mathbf{S}(t)\in\mathscr{L}(H):t\in\mathbb{R}_{+}\cup\{0\}\right\}$ that satisfies the following:

  i) $\mathbf{S}(t)\mathbf{S}(s)=\mathbf{S}(t+s)$ for each $t,s\in\mathbb{R}_{+}$,

  ii) $\mathbf{S}(0)=\mathbf{I}$, and

  iii) $\mathbf{S}(t)\phi\in H$ is norm-continuous with respect to $t\in\mathbb{R}_{+}$ for all $\phi\in H$.

A $C_{0}$-semigroup is then said to be exponentially stable if there exist positive constants $M,\alpha\in\mathbb{R}_{+}$ satisfying

(2) \left\|\mathbf{S}(t)\right\|_{\mathscr{L}(H)}\leq Me^{-\alpha t}

for any $t\in\mathbb{R}_{+}$. An (unbounded) operator $\mathbf{A}$ is said to be the generator of $\mathbf{S}(t)$ if

(3) \mathbf{A}\phi=\lim_{h\rightarrow 0^{+}}h^{-1}\left[\mathbf{S}(h)\phi-\phi\right]\in H

for all $\phi\in\mathcal{D}(\mathbf{A})$, where

\mathcal{D}(\mathbf{A}):=\left\{\phi\in H:\left\|\mathbf{A}\phi\right\|_{H}<\infty\right\}.

The analysis provided in the proof of [12, Theorem 2.6] indicates that $\mathcal{D}(\mathbf{A})$ is densely defined in $H$.
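
In finite dimensions every $C_{0}$-semigroup is a matrix exponential; the short sketch below (with an arbitrary stable matrix standing in for the generator) illustrates the stability bound (2) and the difference-quotient definition of the generator (3). It is purely an illustration, not part of the analysis.

# Finite-dimensional illustration of (2) and (3): S(t) = expm(A t) for a stable matrix A.
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 2.0], [0.0, -3.0]])          # stable stand-in generator (eigenvalues -1, -3)
phi = np.array([1.0, 1.0])

S = lambda t: expm(A * t)

# Exponential stability: ||S(t)|| <= M exp(-alpha t) for some M, alpha > 0
for t in [0.5, 1.0, 2.0, 4.0]:
    print(t, np.linalg.norm(S(t), 2))             # decays like exp(-t), up to a constant M

# Generator recovered as the limit of the difference quotient (3)
h = 1e-6
print(np.allclose((S(h) @ phi - phi) / h, A @ phi, atol=1e-4))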

The adjoint of $\mathbf{S}(t)$, denoted by $\mathbf{S}^{*}(t)$, is also a bounded operator on $H$. This is easily observed from the following:

\left(\phi,\mathbf{S}(t)\psi\right)_{H}=\left(\mathbf{S}^{*}(t)\phi,\psi\right)_{H}

for all $\phi,\psi\in H$ and all $t\in\mathbb{R}_{+}$. This identity also indicates that

\left\|\mathbf{S}^{*}(t)\right\|_{\mathscr{L}(H)}\leq Me^{-\alpha t}

through the induced operator norm. From [12, Theorem 4.3], we have that $\mathbf{A}^{*}$, the adjoint operator of $\mathbf{A}$, is the generator of $\mathbf{S}^{*}(t)$, i.e.

(4) \mathbf{A}^{*}\psi=\lim_{h\rightarrow 0^{+}}h^{-1}\left[\mathbf{S}^{*}(h)\psi-\psi\right]\in H

for all $\psi\in\mathcal{D}(\mathbf{A}^{*})$, where we have defined

\mathcal{D}(\mathbf{A}^{*}):=\left\{\phi\in H:\lim_{h\rightarrow 0^{+}}h^{-1}\left(\mathbf{S}^{*}(h)\phi-\phi\right)\in H\right\}.

$\mathcal{D}(\mathbf{A}^{*})$ is also a densely defined subset of $H$.

Bounded perturbations of the generator of an exponentially stable $C_{0}$-semigroup also generate a semigroup, i.e. if $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ generates a semigroup $\mathbf{S}(t)$, then $\mathbf{A}-\mathbf{T}$, where $\mathbf{T}\in\mathscr{L}\left(H\right)$ is a bounded positive semi-definite operator, also generates a $C_{0}$-semigroup [12, Theorem 6.4]. The perturbed semigroup generated by $\mathbf{A}-\mathbf{T}$ is then also exponentially stable since $\mathbf{T}\in\mathscr{L}\left(H\right)$ is positive semi-definite.

3. The Operator-Valued Riccati Equation and Its Dual Equation

This section is dedicated to the study of the strong form of the operator-valued Riccati equation and its dual equation, which arises from the derivation of the first-order optimality system associated with the constrained optimization problems discussed in §4 and §5. The focus of this section is on determining the well-posedness and the Lipschitz continuity (with respect to varying control parameters $p\in\mathcal{P}$) of these equations. In this section and the remainder of this work, we will take $\mathbf{G}:=\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^{*}$ and $\mathbf{G}_{p}$ to be its parametrized analog to simplify notation.

3.1. Strong Operator-Valued Riccati Equation

We now motivate the definition of the strong form of the operator-valued Riccati equation. This form is essential in defining the Lagrangian first-order optimality system that we use to determine the well-posedness of the penalized weighted trace minimization problems studied in this work. We begin with the following.

Definition 1.

A symmetric positive semi-definite operator $\mathbf{X}\in\mathscr{L}^{s}(H)$ is said to be a solution of the strong operator-valued Riccati equation if it satisfies

(5) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}\mathbf{X}+\mathbf{Q}=\mathbf{0}

in the $\mathscr{L}(H)$ topology (i.e. $\left\|\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}\mathbf{X}+\mathbf{Q}\right\|_{\mathscr{L}(H)}=0$) with the additional condition that $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\in\mathscr{L}^{s}(H)$, where $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of an exponentially stable $C_{0}$-semigroup and the coefficient operators $\mathbf{G},\mathbf{Q}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite.

In [3, Chapter IV-1, Section 3] the notions of strict and classical solutions are presented to describe solutions to (5). The analysis regarding these solutions was done in the context of $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}$ being an operator on $\mathcal{D}(\mathbf{A}^{*})$. In contrast, we determine in Theorem 1 that $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}$ is actually a bounded operator on all of $H$. This finding opens up the possibility of utilizing trace-class operator theory in the analysis of operator-valued Riccati equations without reformulating them as Bochner integral equations as done in [5]. This is the key result that enables the derivation of the first-order optimality systems utilized in this work.

We now determine that the strong form of the operator-valued Riccati equation (5) is well-defined and is also equivalent to the Bochner integral form of the operator-valued Riccati equation (cf. [5]) in the following. It is further determined that the solution to (5) is a trace-class operator if $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$.

Theorem 1.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of an exponentially stable $C_{0}$-semigroup $\mathbf{S}(t)\in\mathscr{L}(H)$ and that $\mathbf{Q},\mathbf{G}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite operators. Then the unique positive semi-definite solution $\mathbf{X}\in\mathscr{L}^{s}(H)$ to the following Bochner integral equation

(6) \mathbf{X}=\int_{0}^{+\infty}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)dt

is the unique solution to (5).

Furthermore, if we assume that $\mathbf{Q}\in\mathscr{J}^{s}_{1}(H)$, then we have that $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ is a trace-class operator that satisfies the following bound:

(7) \left\|\mathbf{X}\right\|_{1}\leq\frac{M^{2}}{2\alpha}\left\|\mathbf{Q}\right\|_{1},

where $M,\alpha\in\mathbb{R}_{+}$ are the constants associated with the stability bound for $\mathbf{S}(t)\in\mathscr{L}(H)$ given in (2).

Proof.

Let $\mathbf{X}\in\mathscr{L}^{s}(H)$ be defined by (6) and let $\mathcal{D}(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*})$ be the domain of $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}$, i.e.

\mathcal{D}\left(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right):=\left\{\phi\in H:\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right]\phi\in H\right\}.

We will demonstrate that $\mathcal{D}(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*})=H$. Using the definition of the infinitesimal generators $\mathbf{A}$ and $\mathbf{A}^{*}$ given in (3) and (4) respectively and the definition of $\mathbf{X}\in\mathscr{L}^{s}(H)$ given by (6), we have that

\begin{aligned}
\left(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right)\zeta
&=\int_{0}^{+\infty}\left[\mathbf{A}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)+\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\mathbf{A}^{*}\right]\zeta\, dt\\
&=\lim_{h\rightarrow 0^{+}}h^{-1}\int_{0}^{+\infty}\left[\left(\mathbf{S}(t+h)-\mathbf{S}(t)\right)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)+\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\left(\mathbf{S}^{*}(t+h)-\mathbf{S}^{*}(t)\right)\right]\zeta\, dt\\
&=\lim_{h\rightarrow 0^{+}}h^{-1}\int_{0}^{+\infty}\left[\mathbf{S}(t+h)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)-\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\right]\zeta\, dt
\end{aligned}

for all $\zeta\in\mathcal{D}(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*})$. Notice here that we may already take $\zeta$ to be in $H$ without consequence since $\mathbf{S}(t)$ and $\mathbf{S}^{*}(t)$ are bounded operators on $H$. Continuing, we have that

\begin{aligned}
&\lim_{h\rightarrow 0^{+}}h^{-1}\int_{0}^{+\infty}\left[\mathbf{S}(t+h)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)-\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\right]\zeta\, dt\\
&\quad=\lim_{h\rightarrow 0^{+}}h^{-1}\left\{\lim_{\tau\rightarrow+\infty}\int_{0}^{\tau}\left[\mathbf{S}(t+h)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)-\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\right]\zeta\, dt\right\}\\
&\quad=\lim_{h\rightarrow 0^{+}}h^{-1}\left\{\lim_{\tau\rightarrow+\infty}\left[\int_{h}^{\tau+h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt-\int_{0}^{\tau}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt\right]\right\}\\
&\quad=\lim_{h\rightarrow 0^{+}}h^{-1}\left[\lim_{\tau\rightarrow+\infty}\int_{\tau}^{\tau+h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt-\int_{0}^{h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt\right]\\
&\quad=\lim_{\tau\rightarrow+\infty}\left[\lim_{h\rightarrow 0^{+}}h^{-1}\int_{\tau}^{\tau+h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt\right]-\mathbf{Q}\zeta+\mathbf{X}\mathbf{G}\mathbf{X}\zeta\\
&\quad=\lim_{\tau\rightarrow+\infty}\left[\mathbf{S}(\tau)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(\tau)\zeta\right]-\mathbf{Q}\zeta+\mathbf{X}\mathbf{G}\mathbf{X}\zeta\\
&\quad=-\mathbf{Q}\zeta+\mathbf{X}\mathbf{G}\mathbf{X}\zeta
\end{aligned}

for all $\zeta\in H$, after applying the fact that $\mathbf{S}(t)\in\mathscr{L}(H)$, and consequently also $\mathbf{S}^{*}(t)\in\mathscr{L}(H)$, vanish in the $t\rightarrow+\infty$ limit as a consequence of their exponential stability. The interchange of limits used in the above sequence of equalities is allowable owing to the continuity of the function to which the limiting operations are applied. We have therefore demonstrated that

(8) \left(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right)\zeta=\left(-\mathbf{Q}+\mathbf{X}\mathbf{G}\mathbf{X}\right)\zeta

for all $\zeta\in H$. This then implies that $\mathbf{X}\in\mathscr{L}^{s}(H)$, defined as a solution to (6), also necessarily satisfies (5).

Next, we demonstrate that the solution of the strong operator-valued Riccati equation (5) satisfies (6). To do this, we derive the following weak form of the operator-valued Riccati equation from (5):

(9) \left(\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}\mathbf{X}+\mathbf{Q}\right]\phi,\psi\right)_{H}=0

for all $\phi,\psi\in\mathcal{D}(\mathbf{A}^{*})$. It then follows from [6, Proposition 4] that (9) is equivalent to the Bochner integral form of the operator-valued Riccati equation (6). Therefore, a solution $\mathbf{X}\in\mathscr{L}^{s}(H)$ satisfying (6) also satisfies (5). The uniqueness of the $\mathbf{X}\in\mathscr{L}^{s}(H)$ that satisfies (5) is a consequence of the well-posedness of (6) determined in [5]. The argument presented in the previous paragraph, together with this paragraph, indicates that there exists only one positive semi-definite solution to (5) and that this solution coincides with the solution of (6). Symmetry of $\mathbf{X}$ is easily determined through inspection by taking the adjoint of both sides of (5).

We conclude the proof by deriving the solution bound (7) under the assumption that $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$. To that end, we consider the following weak integral form of the operator-valued Riccati equation:

(10) \left(\psi,\mathbf{X}\phi\right)_{H}=\int_{0}^{+\infty}\left(\psi,\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\phi\right)_{H}dt

for all $\phi,\psi\in H$. It is clear that the solution $\mathbf{X}\in\mathscr{L}^{s}(H)$ to (6) also satisfies (10). By choosing $\phi=\psi=e_{i}$, where $\left\{e_{i}\right\}_{i=1}^{\infty}$ forms an orthonormal basis of $H$, we then have that

\left(e_{i},\mathbf{X}e_{i}\right)_{H}+\int_{0}^{+\infty}\left(e_{i},\mathbf{S}(t)\mathbf{X}\mathbf{G}\mathbf{X}\mathbf{S}^{*}(t)e_{i}\right)_{H}dt=\int_{0}^{+\infty}\left(e_{i},\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)e_{i}\right)_{H}dt.

Since $\mathbf{X}\in\mathscr{L}^{s}(H)$ and $\mathbf{G}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite, both terms on the left-hand side of the above equation are nonnegative; indeed,

\left(e_{i},\mathbf{S}(t)\mathbf{X}\mathbf{G}\mathbf{X}\mathbf{S}^{*}(t)e_{i}\right)_{H}=\left\|\mathbf{G}^{\frac{1}{2}}\mathbf{X}\mathbf{S}^{*}(t)e_{i}\right\|_{H}^{2}\geq 0,

where $\mathbf{G}^{\frac{1}{2}}\in\mathscr{L}(H)$ denotes the operator square root [1, Theorem 7.38] of $\mathbf{G}\in\mathscr{L}^{s}(H)$. From this, it follows that

\left(e_{i},\mathbf{X}e_{i}\right)_{H}\leq\int_{0}^{+\infty}\left(e_{i},\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)e_{i}\right)_{H}dt,

where summing both sides over all $i\in\mathbb{N}$ yields

\begin{aligned}
\texttt{trace}\left(\mathbf{X}\right)&\leq\int_{0}^{+\infty}\texttt{trace}\left(\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)\right)dt\\
&\leq\int_{0}^{+\infty}\left\|\mathbf{S}^{*}(t)\right\|^{2}_{\mathscr{L}\left(H\right)}\left\|\mathbf{Q}\right\|_{1}dt\\
&\leq M^{2}\left\|\mathbf{Q}\right\|_{1}\int_{0}^{+\infty}e^{-2\alpha t}dt\\
&=\frac{M^{2}}{2\alpha}\left\|\mathbf{Q}\right\|_{1},
\end{aligned}

and hence $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$, and (7) follows from seeing that $\texttt{trace}\left(\mathbf{X}\right)=\left\|\mathbf{X}\right\|_{1}$ because $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ is symmetric positive semi-definite. Since $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$, we also have that $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\in\mathscr{J}_{1}^{s}(H)$. ∎
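
As a sanity check on Theorem 1, the following finite-dimensional sketch (illustrative only; the matrices are arbitrary stand-ins) verifies that the Riccati solution satisfies the strong form (5), agrees with the Bochner integral form (6) computed by quadrature, and obeys the trace bound (7) with M = 1 for a symmetric negative definite generator.

# Finite-dimensional check of Theorem 1 (illustrative only).
import numpy as np
from scipy.linalg import solve_continuous_are, expm
from scipy.integrate import quad_vec

rng = np.random.default_rng(1)
n = 6
Asym = rng.standard_normal((n, n))
A = -(Asym @ Asym.T + n * np.eye(n))              # symmetric negative definite, so M = 1
B = rng.standard_normal((n, 2))
R = np.eye(2)
Q = np.diag(rng.random(n))                        # symmetric positive semi-definite, trace class
G = B @ np.linalg.inv(R) @ B.T

# Passing A.T makes solve_continuous_are return the solution of
# A X + X A^T - X G X + Q = 0, i.e. the orientation used in (5).
X = solve_continuous_are(A.T, B, Q, R)
print(np.allclose(A @ X + X @ A.T - X @ G @ X + Q, 0, atol=1e-8))   # strong form (5)

# Bochner integral form (6), evaluated by quadrature over a long horizon
integrand = lambda t: expm(A * t) @ (Q - X @ G @ X) @ expm(A * t).T
X_int, _ = quad_vec(integrand, 0.0, 60.0)
print(np.allclose(X, X_int, atol=1e-6))

# Trace bound (7): trace(X) <= M^2/(2 alpha) trace(Q), with M = 1 and alpha = -max(eig(A))
alpha = -np.max(np.linalg.eigvalsh(A))
print(np.trace(X) <= np.trace(Q) / (2 * alpha) + 1e-12)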

We proceed with a discussion on the strong form of the Sylvester equation in the following subsection.

3.2. Sylvester’s Equation

In the analysis presented in the following subsections, we frequently encounter the strong form of the operator-valued Sylvester equation (it is “strong” in the same sense that (5) is the strong form of the operator-valued Riccati equation), given by

(11) \mathbf{A}_{1}\mathbf{T}+\mathbf{T}\mathbf{A}_{2}^{*}=\mathbf{P},

where $\mathbf{A}_{1}:=\mathbf{A}-\mathbf{K}_{1}$ and $\mathbf{A}_{2}:=\mathbf{A}-\mathbf{K}_{2}$ are defined as generators of exponentially stable perturbed $C_{0}$-semigroups $\mathbf{S}_{1}(t)\in\mathscr{L}\left(H\right)$ and $\mathbf{S}_{2}(t)\in\mathscr{L}\left(H\right)$ respectively for all $t\in\mathbb{R}_{+}$, with $\mathbf{K}_{1},\mathbf{K}_{2}\in\mathscr{L}\left(H\right)$ being positive semi-definite operators. Because of the way $\mathbf{A}_{1},\mathbf{A}_{2}$ are defined, their domains coincide with $\mathcal{D}(\mathbf{A})$, and hence (11) is well-defined. With this, we determine the following.

Lemma 1.

Assume that $\mathbf{A}_{1},\mathbf{A}_{2}:\mathcal{D}(\mathbf{A})\rightarrow H$ are generators of exponentially stable $C_{0}$-semigroups $\mathbf{S}_{1}(t)\in\mathscr{L}\left(H\right)$ and $\mathbf{S}_{2}(t)\in\mathscr{L}\left(H\right)$ for all $t\in\mathbb{R}_{+}$, and that $\mathbf{P}\in\mathscr{L}\left(H\right)$. Then the unique solution $\mathbf{T}\in\mathscr{L}\left(H\right)$ of (11) is given by the following Bochner integral representation:

(12) \mathbf{T}=-\int_{0}^{+\infty}\mathbf{S}_{1}(t)\mathbf{P}\mathbf{S}_{2}^{*}(t)dt.

Proof.

We begin by first demonstrating that the integral in (12) is well-defined. This is done by showing that the norm of the integrand is integrable over all of $\mathbb{R}_{+}$ [10, Section 2, Theorem 2]. We verify this claim in the following:

\begin{aligned}
\int_{0}^{+\infty}\left\|\mathbf{S}_{1}(t)\mathbf{P}\mathbf{S}_{2}^{*}(t)\right\|_{\mathscr{L}\left(H\right)}dt&\leq\int_{0}^{+\infty}M_{1}M_{2}e^{-\alpha_{1}t}e^{-\alpha_{2}t}\left\|\mathbf{P}\right\|_{\mathscr{L}\left(H\right)}dt\\
&\leq M_{*}^{2}\left\|\mathbf{P}\right\|_{\mathscr{L}\left(H\right)}\int_{0}^{+\infty}e^{-2\alpha_{*}t}dt\\
&\leq\frac{M_{*}^{2}}{2\alpha_{*}}\left\|\mathbf{P}\right\|_{\mathscr{L}\left(H\right)},
\end{aligned}

where $M_{1},M_{2},\alpha_{1},\alpha_{2}\in\mathbb{R}_{+}$ are the stability constants associated with the exponentially stable $C_{0}$-semigroups $\mathbf{S}_{1}(t),\mathbf{S}_{2}(t)\in\mathscr{L}\left(H\right)$ respectively, $M_{*}:=\max\left\{M_{1},M_{2}\right\}$, and $\alpha_{*}:=\min\left\{\alpha_{1},\alpha_{2}\right\}$. Utilizing (12) in (11) and following a derivation similar to that in the first paragraph of the proof of Theorem 1 demonstrates that $\mathbf{T}$ defined by (12) is a solution of the strong form of Sylvester's equation. Uniqueness follows as a consequence of the linearity of the equation. ∎
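
A finite-dimensional analog of Lemma 1 can be checked directly: the sketch below (with arbitrary stable diagonal matrices standing in for the generators) compares the Sylvester solution with the integral representation (12). It is an illustration under these assumptions, not part of the proof.

# Finite-dimensional sketch of Lemma 1: A1 T + T A2^T = P versus the integral representation (12).
import numpy as np
from scipy.linalg import solve_sylvester, expm
from scipy.integrate import quad_vec

rng = np.random.default_rng(2)
n = 5
A1 = -np.diag(rng.uniform(1.0, 3.0, n))           # stable stand-ins for A - K_1 and A - K_2
A2 = -np.diag(rng.uniform(1.0, 3.0, n))
P = rng.standard_normal((n, n))

T_direct = solve_sylvester(A1, A2.T, P)           # solves A1 T + T A2^T = P
T_integral, _ = quad_vec(lambda t: expm(A1 * t) @ P @ expm(A2 * t).T, 0.0, 50.0)
print(np.allclose(T_direct, -T_integral, atol=1e-6))   # T = -integral, as in (12)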

We now move on to discuss the dual problem to (5) that arises in the first-order optimality system of the penalized optimization problems presented in §4 and §5.

3.3. Dual Problem

The dual problem arises in the derivation of the first-order optimality system by determining the Fréchet derivative of the Lagrangian saddle-point functional with respect to the primal variable $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$. We will go through its derivation in §4.2.1. For now, we will simply state the strong form of the dual problem and determine its well-posedness.

The dual problem to (5) is stated as follows: Seek a $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ that satisfies

(13) \left(\mathbf{A}^{*}-\mathbf{G}\mathbf{X}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left(\mathbf{A}-\mathbf{X}\mathbf{G}\right)=-\mathbf{W},

where $\mathbf{A}$ and $\mathbf{A}^{*}$ are the generators of the exponentially stable $C_{0}$-semigroups $\mathbf{S}(t)$ and $\mathbf{S}^{*}(t)$ respectively, $\mathbf{G}\in\mathscr{J}_{1}^{s}(H)$ and $\mathbf{W}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite operators, and $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ is the solution to (5). The well-posedness of (13) is determined in the following.

Lemma 2.

Let $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ be the solution to (13). Then there exist positive constants $M,\alpha\in\mathbb{R}_{+}$ satisfying

\left\|\boldsymbol{\Lambda}\right\|_{\mathscr{L}\left(H\right)}\leq\frac{M^{2}}{2\alpha}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}.

Furthermore, the solution $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is symmetric positive semi-definite.

Proof.

Let $\mathbf{T}(t)\in\mathscr{L}\left(H\right)$ be the exponentially stable $C_{0}$-semigroup generated by $\mathbf{A}^{*}-\mathbf{G}\mathbf{X}$. Because (13) is a Sylvester equation, we have from Lemma 1 that $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ can be represented in the following Bochner integral form:

(14) \boldsymbol{\Lambda}=\int_{0}^{+\infty}\mathbf{T}(t)\mathbf{W}\mathbf{T}^{*}(t)dt.

Because $-\mathbf{G}\mathbf{X}\in\mathscr{L}\left(H\right)$ is a stabilizing perturbation of $\mathbf{A}^{*}:\mathcal{D}(\mathbf{A}^{*})\rightarrow H$, we have that $\left\|\mathbf{T}(t)\right\|_{\mathscr{L}\left(H\right)}\leq Me^{-\alpha t}$. It then follows that

\left\|\boldsymbol{\Lambda}\right\|_{\mathscr{L}\left(H\right)}\leq M^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\int_{0}^{+\infty}e^{-2\alpha t}dt=\frac{M^{2}}{2\alpha}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)},

from which the bound presented in the lemma follows. The symmetric positive semi-definite nature of $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ comes from inspecting (14), where taking the adjoint of both sides of the equation immediately verifies this claim. ∎
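
In finite dimensions the dual problem (13) reduces to a Lyapunov equation for the closed-loop matrix; the sketch below (with illustrative stand-in matrices only) computes Λ this way and checks its positive semi-definiteness.

# Finite-dimensional sketch of the dual problem (13): a Lyapunov solve with Acl = A - X G.
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(3)
n = 6
A = -np.diag(rng.uniform(1.0, 3.0, n))
B = rng.standard_normal((n, 2)); R = np.eye(2); Q = np.eye(n)
G = B @ np.linalg.inv(R) @ B.T
z0 = rng.standard_normal(n)
W = np.outer(z0, z0)                               # W = z0 (z0, .)_H

X = solve_continuous_are(A.T, B, Q, R)             # primal: A X + X A^T - X G X + Q = 0
Acl = A - X @ G
Lam = solve_continuous_lyapunov(Acl.T, -W)         # solves Acl^T Lam + Lam Acl = -W
print(np.allclose((A.T - G @ X) @ Lam + Lam @ (A - X @ G), -W, atol=1e-8))       # equation (13)
print(np.all(np.linalg.eigvalsh((Lam + Lam.T) / 2) >= -1e-10))                   # Lambda is psd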

3.4. Parametrized Control Device Operator

In many application problems, the operator $\mathbf{G}\in\mathscr{L}^{s}(H)$ is parametrized by a set of parameters, i.e., it is a function that maps a parameter space $\mathcal{P}$ into the operator space $\mathscr{L}^{s}(H)$. This parameter space corresponds to, e.g., control device placement locations [16] or geometric design parameters [11]. We formalize the definition of the parametrized operators $\mathbf{G}_{p}$ for $p\in\mathcal{P}$ in the following.

Let $\mathcal{P}$ be a complex Hilbert space with its norm $\left\|\cdot\right\|_{\mathcal{P}}$ induced by the inner product $\left(\cdot,\cdot\right)_{\mathcal{P}}$, and let $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$ be a trace-class symmetric positive semi-definite operator parametrized by a parameter $p\in\mathcal{P}$ so that the mapping $p\mapsto\mathbf{G}_{p}$ is twice Fréchet differentiable with respect to $p\in\mathcal{P}$, its first derivative $\frac{\partial\mathbf{G}_{p}}{\partial p}\left(\cdot\right)$ is bounded as an operator in $\mathscr{L}\left(\mathcal{P};\mathscr{J}_{1}^{s}(H)\right)$, and its second derivative $\frac{\partial^{2}\mathbf{G}_{p}}{\partial p^{2}}(\cdot,\cdot)$ is bounded in $\mathscr{L}\left(\mathcal{P};\mathscr{L}\left(\mathcal{P};\mathscr{J}_{1}^{s}(H)\right)\right)$. The first assumption we make on $\mathbf{G}_{\left(\cdot\right)}$ is that it is uniformly bounded with respect to any $p\in\mathcal{P}$, i.e. there exists a positive constant $g\in\mathbb{R}_{+}$ that satisfies

(15) \left\|\mathbf{G}_{p}\right\|_{1}\leq g

for all $p\in\mathcal{P}$. Because we have assumed that $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$ is Fréchet differentiable with a bounded derivative for all $p\in\mathcal{P}$, it follows that it is Lipschitz continuous, i.e. there exists a positive constant $L_{\mathbf{G}}\in\mathbb{R}_{+}$ so that

(16) \left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{1}\leq L_{\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for all $p_{1},p_{2}\in\mathcal{P}$. To reduce notational clutter, we will denote $\mathbf{d}\mathbf{G}_{p}\left(\cdot\right):=\frac{\partial\mathbf{G}_{p}}{\partial p}(\cdot)$. We will further assume that $\mathbf{d}\mathbf{G}_{p}$ is Lipschitz continuous, i.e. there exists a positive constant $L_{\mathbf{d}\mathbf{G}}\in\mathbb{R}_{+}$ that satisfies

(17) \left\|\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}\left(\mathcal{P};\mathscr{J}_{1}^{s}(H)\right)}\leq L_{\mathbf{d}\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for all $p_{1},p_{2}\in\mathcal{P}$. We will further assume that

(18) \mathbf{d}\mathbf{G}_{p}\neq 0

for any $p\in\mathcal{P}$ and that

(19) \left[\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{d}\mathbf{G}_{p}\right]^{-1}\in\mathscr{L}\left(\mathcal{P}\right).

Finally, the twice-differentiability assumption implies that

(20) \mathbf{d}^{2}\mathbf{G}_{p}(q,r)<\infty

for all $q,r\in\mathcal{P}$, where we have denoted $\mathbf{d}^{2}\mathbf{G}_{p}(\cdot,\cdot):=\frac{\partial^{2}\mathbf{G}_{p}}{\partial p^{2}}(\cdot,\cdot)$. The satisfaction of this assumption is one of the necessary conditions for the second variation of the Lagrangian functionals studied in this work to be bounded.

Under the parametrization of $\mathbf{G}_{p}$ with respect to $p\in\mathcal{P}$, we have that $\left(\mathbf{X},\boldsymbol{\Lambda}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)$ is the solution to the following coupled equations:

(21a) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=\mathbf{0}
(21b) \left(\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left(\mathbf{A}-\mathbf{X}\mathbf{G}_{p}\right)=-\mathbf{W},

where $\mathbf{X}=\mathbf{X}(p)$ and $\boldsymbol{\Lambda}=\boldsymbol{\Lambda}(p)$. We determine in the following that both $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ are Lipschitz continuous functions of $p\in\mathcal{P}$.
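
As a concrete (and purely hypothetical) instance of such a parametrization, one may take G_p = B(p) R^{-1} B(p)^* with a Gaussian actuator profile centered at the placement parameter p; the sketch below also approximates dG_p by central differences, which is only an illustration of the assumed differentiability, not the paper's construction.

# A hypothetical parametrization G_p on a truncated state space, with a finite-difference dG_p.
import numpy as np

n = 40
xs = np.linspace(0.0, 1.0, n)
R_inv = np.array([[1.0]])

def B(p, width=0.05):
    return np.exp(-0.5 * ((xs - p) / width) ** 2).reshape(n, 1)

def G(p):
    return B(p) @ R_inv @ B(p).T                   # symmetric positive semi-definite, finite trace

def dG(p, eps=1e-6):
    return (G(p + eps) - G(p - eps)) / (2 * eps)   # central-difference approximation of dG_p

# Lipschitz-type behaviour of p -> G_p in the trace (nuclear) norm:
p1, p2 = 0.30, 0.31
print(np.linalg.norm(G(p1) - G(p2), ord="nuc") / abs(p1 - p2))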

3.5. Lipschitz Continuity of the Primal and Dual Solutions

In each of the first-order optimality systems associated with the penalized constrained optimization problems lies a fixed-point equation that must be satisfied. We will determine that the solutions of the primal problem (21a) and the dual problem (21b) are Lipschitz continuous functions of $p\in\mathcal{P}$ if $\mathbf{G}_{\left(\cdot\right)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$ satisfies the assumptions prescribed in §3.4. These Lipschitz continuity bounds will then be used to determine that each fixed-point equation has only one fixed point. A more detailed discussion of these fixed-point problems will be presented in §4 and §5 respectively. For now, we focus exclusively on proving that $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ are Lipschitz continuous with respect to $p\in\mathcal{P}$.

Throughout this section, we will denote by $p_{1},p_{2}\in\mathcal{P}$ any two arbitrary parameters and by $\mathbf{X}_{1},\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}(H)$ the solutions to

(22) \mathbf{A}\mathbf{X}_{i}+\mathbf{X}_{i}\mathbf{A}^{*}-\mathbf{X}_{i}\mathbf{G}_{p_{i}}\mathbf{X}_{i}+\mathbf{Q}=\mathbf{0}

for $i=1,2$. Likewise, we denote by $\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2}\in\mathscr{L}^{s}(H)$ the solutions to

(23) \left(\mathbf{A}^{*}-\mathbf{G}_{p_{i}}\mathbf{X}_{i}\right)\boldsymbol{\Lambda}_{i}+\boldsymbol{\Lambda}_{i}\left(\mathbf{A}-\mathbf{X}_{i}\mathbf{G}_{p_{i}}\right)=-\mathbf{W}

for $i=1,2$. We begin our analysis by determining that $\mathbf{X}(p)$ is a Lipschitz continuous function of $p\in\mathcal{P}$ in the following.

Lemma 3.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of an exponentially stable $C_{0}$-semigroup and that $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$ is a symmetric positive semi-definite operator. Further assume that $\mathbf{G}_{\left(\cdot\right)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$ is Lipschitz continuous on $\mathcal{P}$, satisfying (16). Then the solution $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ to (22) is a Lipschitz continuous function of $p\in\mathcal{P}$. Furthermore, there exist positive constants $L_{\mathbf{G}},M,\alpha\in\mathbb{R}_{+}$ so that

\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\leq\frac{L_{\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|^{2}_{1}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}},

where we have denoted by $\mathbf{X}_{1},\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}(H)$ the solutions to (22) with the coefficient operators $\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}\in\mathscr{J}_{1}^{s}(H)$ determined by $p_{1},p_{2}\in\mathcal{P}$ respectively.

Proof.

We begin by taking the difference between the equations for $\mathbf{X}_{1}\in\mathscr{J}_{1}^{s}\left(H\right)$ and $\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}(H)$ respectively (see (22)). We have that the difference $\mathbf{X}_{1}-\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}\left(H\right)$ is then the solution to the following Sylvester equation:

\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]\left(\mathbf{X}_{1}-\mathbf{X}_{2}\right)+\left(\mathbf{X}_{1}-\mathbf{X}_{2}\right)\left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]=\mathbf{X}_{1}\left(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right)\mathbf{X}_{2}.

It then follows from Lemma 1 that the solution to the above equation satisfies the following integral equation:

(24) \left(\mathbf{X}_{1}-\mathbf{X}_{2}\right)=-\int_{0}^{+\infty}\mathbf{T}_{1}(t)\mathbf{X}_{1}\left(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right)\mathbf{X}_{2}\mathbf{T}_{2}(t)dt,

where $\mathbf{T}_{1}(t),\mathbf{T}_{2}(t)\in\mathscr{L}\left(H\right)$ are the $C_{0}$-semigroups generated by $\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]:\mathcal{D}\left(\mathbf{A}\right)\rightarrow H$ and $\left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]:\mathcal{D}\left(\mathbf{A}^{*}\right)\rightarrow H$ respectively.

Since $\mathbf{A}$ is the generator of an exponentially stable $C_{0}$-semigroup and $\mathbf{X}_{1}\mathbf{G}_{p_{1}}\in\mathscr{J}_{1}^{s}(H)$ and $\mathbf{G}_{p_{2}}\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}\left(H\right)$ are bounded nonnegative operators, we have that $\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]:\mathcal{D}(\mathbf{A})\rightarrow H$ and $\left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]:\mathcal{D}(\mathbf{A}^{*})\rightarrow H$ are also generators of exponentially stable $C_{0}$-semigroups. Furthermore, it follows that

(25) \left\|\mathbf{T}_{1}(t)\right\|_{\mathscr{L}\left(H\right)}\leq Me^{-\alpha t}\textrm{ and }\left\|\mathbf{T}_{2}(t)\right\|_{\mathscr{L}\left(H\right)}\leq Me^{-\alpha t}

for all $t\in\mathbb{R}_{+}$, where $M,\alpha$ are the same constants associated with the unperturbed semigroup $\mathbf{S}(t)\in\mathscr{L}(H)$.

Taking the $\mathscr{J}_{1}\left(H\right)$ norm of both sides of (24) then yields

\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\leq\int_{0}^{+\infty}\left\|\mathbf{T}_{1}(t)\mathbf{X}_{1}\left(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right)\mathbf{X}_{2}\mathbf{T}_{2}(t)\right\|_{1}dt

after applying the definition of the operator trace. It then follows that

\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\leq\frac{M^{2}}{2\alpha}\left\|\mathbf{X}_{1}\right\|_{1}\left\|\mathbf{X}_{2}\right\|_{1}\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(H)}

after applying the fact that $\mathscr{J}_{1}(H)$ is a two-sided *-ideal in $\mathscr{L}(H)$ and the bounds provided in (25). Applying (7) in the statement of Theorem 1 then yields

\begin{aligned}
\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}&\leq\frac{M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|^{2}_{1}\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}\left(H\right)}\\
&\leq\frac{L_{\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|^{2}_{1}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}},
\end{aligned}

after applying the Lipschitz continuity assumption on $\mathbf{G}_{\left(\cdot\right)}:\mathcal{P}\rightarrow\mathscr{L}^{s}(H)$. ∎
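
The Lipschitz behaviour asserted in Lemma 3 can be observed numerically for the hypothetical Gaussian-actuator parametrization used in the earlier sketches; the difference quotients below remain bounded as the parameter increment shrinks (an illustration only, not a proof).

# Numerical illustration of Lemma 3 for a hypothetical parametrization (illustrative only).
import numpy as np
from scipy.linalg import solve_continuous_are

n = 40
xs = np.linspace(0.0, 1.0, n)
A = -np.diag(np.arange(1, n + 1, dtype=float))
Q = np.eye(n) / n
R = np.array([[1.0]])

B = lambda p: np.exp(-0.5 * ((xs - p) / 0.05) ** 2).reshape(n, 1)
X = lambda p: solve_continuous_are(A.T, B(p), Q, R)   # A X + X A^T - X G_p X + Q = 0

for dp in [1e-1, 1e-2, 1e-3]:
    ratio = np.linalg.norm(X(0.5 + dp) - X(0.5), ord="nuc") / dp
    print(dp, ratio)                                   # trace-norm difference quotients stay bounded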

With Lemma 2, we are now able to prove that $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is a Lipschitz continuous function of $p\in\mathcal{P}$ in the following.

Lemma 4.

Let $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ satisfy (21b). Then $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is a Lipschitz continuous function of $p\in\mathcal{P}$, and there exist positive constants $M,\alpha\in\mathbb{R}_{+}$ so that

\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left(H\right)}\leq\left(\frac{M^{10}g}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}\right)L_{\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for any $p_{1},p_{2}\in\mathcal{P}$, where $g:=\sup_{p\in\mathcal{P}}\left\|\mathbf{G}_{p}\right\|_{\mathscr{L}(H)}$.

Proof.

We begin by taking the difference of the equations (23) between $i=1,2$. We arrive at the following Sylvester equation:

(26) \left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]\left(\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right)+\left(\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right)\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]=F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}),

where we have denoted

F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}):=\mathbf{G}_{p_{1}}(\mathbf{X}_{1}-\mathbf{X}_{2})\boldsymbol{\Lambda}_{1}+(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})\mathbf{X}_{2}\boldsymbol{\Lambda}_{1}+\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})+\boldsymbol{\Lambda}_{2}(\mathbf{X}_{1}-\mathbf{X}_{2})\mathbf{G}_{p_{1}}.

Let now $\mathbf{T}_{2}(t)\in\mathscr{L}\left(H\right)$ be the exponentially stable $C_{0}$-semigroup generated by $\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}$ and $\mathbf{T}_{1}(t)\in\mathscr{L}\left(H\right)$ be the exponentially stable $C_{0}$-semigroup generated by $\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}$. With Lemma 1, we have that (26) can be written in the following equivalent Bochner integral form:

(\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2})=-\int_{0}^{+\infty}\mathbf{T}_{2}(t)F\left(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}\right)\mathbf{T}_{1}(t)dt.

Taking the $\mathscr{L}\left(H\right)$ norm of both sides then allows us to see that

(27) \begin{aligned}
\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left(H\right)}&\leq\int_{0}^{+\infty}M^{2}e^{-2\alpha t}\left\|F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}dt\\
&=\frac{M^{2}}{2\alpha}\left\|F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}.
\end{aligned}

We now bound $\left\|F\left(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}\right)\right\|_{\mathscr{L}\left(H\right)}$. Recall (15), where we have assumed $\left\|\mathbf{G}_{p}\right\|_{\mathscr{L}(H)}\leq g$ for all $p\in\mathcal{P}$. We then have

\begin{aligned}
\left\|F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}
&\leq\left\|\mathbf{G}_{p_{1}}(\mathbf{X}_{1}-\mathbf{X}_{2})\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left(H\right)}+\left\|(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})\mathbf{X}_{2}\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left(H\right)}\\
&\qquad+\left\|\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}+\left\|\boldsymbol{\Lambda}_{2}(\mathbf{X}_{1}-\mathbf{X}_{2})\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}\left(H\right)}\\
&\leq\left(\frac{M^{8}g}{8\alpha^{4}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}\right)\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(H)},
\end{aligned}

after applying Theorem 1 and Lemmas 3 and 2. Inserting this bound into (27) then results in

\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left(H\right)}\leq\left(\frac{M^{10}g}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}\right)\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(H)}.

Applying the Lipschitz continuity assumption (16) yields the result of this lemma. ∎

3.6. The Critical Cone

The second-order sufficient optimality condition requires that the Hessian of the Lagrangian, evaluated at a stationary point, be positive definite in the directions lying in the critical cone associated with the constraint. Loosely speaking, the critical cone is the subset of the tangent space of the constraint manifold at $\left(\mathbf{X}_{opt},p_{opt}\right)$ along which the first Gâteaux (directional) derivative of the constraint vanishes. We will utilize the second-order optimality condition to demonstrate that the solution to the first-order optimality system is indeed the unique minimizer of the associated penalized constrained optimization problems studied in this work.

Let us define

c(𝐗,p):=𝐀𝐗+𝐗𝐀𝐗𝐆p𝐗+𝐐c(\mathbf{X},p):=\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}

to be the constraint function associated with the operator-valued Riccati equation, where again 𝐐𝒥1s(H)\mathbf{Q}\in\mathscr{J}_{1}^{s}(H) and 𝐆():𝒫𝒥1s(H)\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H) satisfies (15), (16), and (17). The critical cone for the constrained optimizer is defined by the following set

𝒦(𝐗opt,popt):={(𝚽,q)𝒥1s(H)×𝒫:c𝐗|(𝐗opt,p)opt(𝚽)=0 and cp|(𝐗opt,popt)(q)=0}.\mathcal{K}(\mathbf{X}_{opt},p_{opt}):=\left\{\left({\boldsymbol{\Phi},q}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}:\left.\frac{\partial c}{\partial\mathbf{X}}\right|_{\left({\mathbf{X}_{opt},p}\right)_{opt}}(\boldsymbol{\Phi})=0\textrm{ and }\left.\frac{\partial c}{\partial p}\right|_{\left({\mathbf{X}_{opt},p_{opt}}\right)}(q)=0\right\}.

We now characterize 𝒦(𝐗opt,popt)\mathcal{K}(\mathbf{X}_{opt},p_{opt}). Taking the first Gatéux derivative of c(,)c(\cdot,\cdot) with respect to 𝐗𝒥1s(H)\mathbf{X}\in\mathscr{J}_{1}^{s}(H) yields

\displaystyle\frac{\partial c}{\partial\mathbf{X}}(\boldsymbol{\Phi})=\mathbf{A}\boldsymbol{\Phi}+\boldsymbol{\Phi}\mathbf{A}^{*}-\boldsymbol{\Phi}\mathbf{G}_{p}\mathbf{X}-\mathbf{X}\mathbf{G}_{p}\boldsymbol{\Phi}
\displaystyle=\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)\boldsymbol{\Phi}+\boldsymbol{\Phi}\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)
\displaystyle=0

for all $\boldsymbol{\Phi}\in\mathscr{J}_{1}^{s}(H)$. Because $\boldsymbol{\Phi}\in\mathscr{J}_{1}^{s}(H)$ is now the solution of a Sylvester equation (see (11)) with $\mathbf{0}$ as the data, we have that $\boldsymbol{\Phi}\equiv\mathbf{0}$. Continuing, we have that

\frac{\partial c}{\partial p}(q)=-\mathbf{X}\mathbf{d}\mathbf{G}_{p}(q)\mathbf{X}=0.

In general, the set of $q\in\mathcal{P}$ satisfying the above condition is nonempty. Combining the two observations made above, we have that $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$ can be characterized by the following

(28) \mathcal{K}(\mathbf{X}_{opt},p_{opt}):=\left\{\left({\mathbf{0},q}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}:-\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\mathbf{X}_{opt}=\mathbf{0}\right\}.
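As a concrete illustration of (28), the following finite-dimensional sketch computes the parameter directions $q$ with $\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\mathbf{X}_{opt}=\mathbf{0}$. The matrices, the linear parametrization $\mathbf{d}\mathbf{G}_{p}(q)=q_{1}\mathbf{G}_{1}+q_{2}\mathbf{G}_{2}+q_{3}\mathbf{G}_{3}$, and the deliberately redundant third parameter are hypothetical stand-ins introduced only for this sketch; they are not part of the analysis above.

import numpy as np

rng = np.random.default_rng(4)
n = 3
C = rng.standard_normal((n, n))
X = C @ C.T + np.eye(n)                       # stand-in for X_opt (positive definite)
b1, b2 = rng.standard_normal(n), rng.standard_normal(n)
G1, G2 = np.outer(b1, b1), np.outer(b2, b2)
Gs = [G1, G2, G1 + G2]                        # redundant third parameter on purpose

# Assemble the linear map q -> vec(X dG_p(q) X) and read off its kernel via an SVD.
Mmap = np.column_stack([(X @ Gi @ X).ravel() for Gi in Gs])
_, s, Vt = np.linalg.svd(Mmap)
kernel = Vt[np.sum(s > 1e-10 * s[0]):]
print(kernel)   # one row, proportional to (1, 1, -1): a nontrivial critical direction

The redundancy in the third parameter is what produces a nontrivial kernel here; with independent rank-one pieces the kernel would generically be trivial.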

With the preliminary analysis completed, we are now ready to present and analyze the two penalized constrained optimization problems of interest in the following two sections of this paper.

4. Control Penalized Constrained Trace Minimization

4.1. Problem Statement

The control penalization technique [19] is a well-known and often applied method of regularizing the solution of optimization problems. It has the benefit of implicitly constraining the parameter space and improving the well-posedness properties of the optimization problem. We study this technique in the context of control device design and placement in this section. The ideas presented in this section will serve as a pedagogical stepping stone for the more complex arguments needed to analyze the problem presented in the following section.

The control penalized optimization problem of interest is to seek a $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ that minimizes

(29) \mathcal{J}_{\beta}(\mathbf{X},p):=\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)+\frac{\beta}{2}\left\|p\right\|_{\mathcal{P}}^{2}

constrained by the strong operator-valued Riccati equation

(30) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=\mathbf{0},

where $\mathbf{W}\in\mathscr{L}^{s}(H)$ is a symmetric positive semi-definite weighting operator, $\mathbf{A}:\mathcal{D}(\mathbf{A})\subset H\rightarrow H$ is the generator of the exponentially stable $C_{0}$-semigroup $\mathbf{S}(t)\in\mathscr{L}(H)$, $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$ is a nonnegative operator, and $\mathbf{G}_{\left({\cdot}\right)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$ is the parametrized operator associated with the control device. The definition of $\mathcal{J}_{\beta}(\cdot)$ presented in (29) is equivalent to the cost functional presented in the introduction of this work. This is because for each $p\in\mathcal{P}$, there is only one $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ that satisfies (30).

In the following discussion, we first present the first-order optimality system associated with the penalized optimization problem studied in this section and then provide a derivation of this set of equations. Next, we determine that there exists only one solution to the first-order optimality system and that this solution satisfies the second-order sufficient conditions to qualify as a constrained minimizer of the cost functional (29). In other words, there exists only one global constrained minimizer for (29).

4.2. First-Order Optimality System

The first-order optimality system associated with the constrained optimization problem is given as follows: Seek a $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)\times\mathcal{P}$ that satisfies

Primal Problem:
(31a) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=0
Dual Problem:
(31b) \left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)=-\mathbf{W}
Optimality Condition:
(31c) p=\frac{1}{\beta}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}.

The variable $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is the dual solution that arises from applying the Lagrange multiplier formalism to the constraint (31a). We will determine that there is only one solution $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)\times\mathcal{P}$ that satisfies (31) and that $\left({\mathbf{X}_{opt},p_{opt}}\right)$ is in fact a constrained minimizer of (29).
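Before deriving (31), we note that the primal and dual equations can be previewed in finite dimensions. The sketch below is our own illustration, with hypothetical matrices standing in for $\mathbf{A}$, $\mathbf{Q}$, $\mathbf{W}$, and $\mathbf{G}_{p}$; scipy's algebraic Riccati and Lyapunov solvers play the roles of Theorem 1 and Lemma 2 in this toy setting.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(0)
n = 4
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))   # stable generator analog
Q = np.eye(n)                                        # state weight (trace-class analog)
W = np.eye(n)                                        # weight appearing in trace(XW)
B = rng.standard_normal((n, 2))                      # hypothetical device map
Gp = B @ B.T                                         # G_p = B B^T is nonnegative

# Primal (31a): A X + X A^T - X G_p X + Q = 0; scipy expects the transposed drift.
X = solve_continuous_are(A.T, B, Q, np.eye(2))

# Dual (31b): (A - X G_p)^T Lam + Lam (A - X G_p) = -W.
A_cl = A - X @ Gp
Lam = solve_continuous_lyapunov(A_cl.T, -W)

print(np.trace(X @ W))   # first term of the cost (29) at this fixed p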

4.2.1. Derivation of the Optimality System

We now derive the first-order optimality system (31). We begin by introducing the Lagrange multiplier $\boldsymbol{\Lambda}\in\mathscr{L}(H)$. The Lagrangian functional associated with the cost functional (29) with constraint (30) is the following

(32) \mathcal{L}\left({\mathbf{X},p,\boldsymbol{\Lambda}}\right):=\left<\mathbf{I},\mathbf{X}\mathbf{W}\right>+\frac{\beta}{2}\left\|p\right\|^{2}_{\mathcal{P}}+\left<\boldsymbol{\Lambda},\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right]\right>,

where we have utilized the identity $\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)=\left<\mathbf{I},\mathbf{X}\mathbf{W}\right>$. Because $\left<\boldsymbol{\Lambda},\cdot\right>$ is a functional belonging to $\mathscr{J}_{1}(H)^{\prime}$, the Lagrangian (32) is well-defined: Theorem 1 demonstrates that there exists an $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ satisfying (31a) in the $\mathscr{J}_{1}(H)$ topology.

The first-order necessary condition for optimality, i.e. that $\left({\mathbf{X},p,\boldsymbol{\Lambda}}\right)\in\mathscr{J}_{1}(H)\times\mathcal{P}\times\mathscr{L}(H)$ is a saddle point of (32), is the following

(33) \frac{\partial\mathcal{L}}{\partial\mathbf{X}}(\boldsymbol{\Psi})=0,\quad\frac{\partial\mathcal{L}}{\partial p}(q)=0,\quad\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}(\boldsymbol{\Phi})=0,

for all $\left({\boldsymbol{\Psi},q,\boldsymbol{\Phi}}\right)\in\mathscr{L}(H)\times\mathcal{P}\times\mathscr{J}_{1}(H)$. In this work, we choose to work with the strong form of (33), i.e.

(34) \frac{\partial\mathcal{L}}{\partial\mathbf{X}}=0,\quad\frac{\partial\mathcal{L}}{\partial p}=0,\quad\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}=0,

because we have already derived the theoretical results needed for the well-posedness analysis of the strong-form equations that arise from (34). It is easily seen that if (34) is satisfied then (33) is satisfied as well, so it is sufficient to consider (34) in our analysis. Conversely, (33) implies (34) because $\mathscr{J}_{1}(H)$ and $\mathscr{L}(H)$ form a duality pairing under the trace operator. We proceed by deriving each equation in (34) in the remaining paragraphs of this subsection.

We begin our discussion by deriving (31a). Taking the Gâteaux derivative of $\mathcal{L}(\cdot,\cdot,\cdot)$ with respect to $\boldsymbol{\Lambda}$ yields

\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}(\boldsymbol{\Phi})=\left<\boldsymbol{\Phi},\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right]\right>

for all $\boldsymbol{\Phi}\in\mathscr{L}(H)$. Setting $\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}(\boldsymbol{\Phi})=0$ then yields the following necessary condition

(35) \left<\boldsymbol{\Phi},\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right]\right>=0

for all $\boldsymbol{\Phi}\in\mathscr{L}(H)$. We have that (35) is true if and only if (31a) is satisfied, because the functionals $\left<\boldsymbol{\Phi},\cdot\right>$, $\boldsymbol{\Phi}\in\mathscr{L}(H)$, are in one-to-one correspondence with $\mathscr{J}_{1}(H)^{\prime}$ and because Theorem 1 indicates that (5) is satisfied in the $\mathscr{J}_{1}(H)$ norm topology.

We now derive (31b). The Gâteaux derivative of $\mathcal{L}(\cdot,\cdot,\cdot)$ with respect to $\mathbf{X}$ is the following

\displaystyle\frac{\partial\mathcal{L}}{\partial\mathbf{X}}(\boldsymbol{\Psi})=\left<\mathbf{I},\boldsymbol{\Psi}\mathbf{W}\right>+\left<\boldsymbol{\Lambda},\left[\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)\boldsymbol{\Psi}+\boldsymbol{\Psi}\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\right]\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}^{*}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}^{*},(\mathbf{A}-\mathbf{X}^{*}\mathbf{G}_{p})\boldsymbol{\Psi}^{*}\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}^{*}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<(\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X})\boldsymbol{\Lambda}^{*},\boldsymbol{\Psi}^{*}\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}^{*}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}^{*}\mathbf{G}_{p}}\right),\boldsymbol{\Psi}\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right),\boldsymbol{\Psi}\right>

for all $\boldsymbol{\Psi}\in\mathscr{J}_{1}(H)$. Note that we have made heavy use of (1) in the derivation above. In the final equality, we have applied the property that $\mathbf{X}^{*}=\mathbf{X}$. Setting $\frac{\partial\mathcal{L}}{\partial\mathbf{X}}(\boldsymbol{\Psi})=0$ then results in

(36) \left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right),\boldsymbol{\Psi}\right>=-\left<\mathbf{W},\boldsymbol{\Psi}\right>

for all $\boldsymbol{\Psi}\in\mathscr{J}_{1}(H)$. Because $\left<\left[\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)+\mathbf{W}\right],\cdot\right>$ defines an element of $\mathscr{J}_{1}(H)^{\prime}$, we have that (36) is satisfied if and only if (31b) is satisfied; indeed, the zero element of $\mathscr{J}_{1}(H)^{\prime}$ is the only functional that maps every element of $\mathscr{J}_{1}(H)$ to zero.

We conclude this subsection with the derivation of (31c).

\displaystyle\frac{\partial\mathcal{L}}{\partial p}(q)=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda},\mathbf{X}\mathbf{d}\mathbf{G}_{p}(q)\mathbf{X}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\mathbf{X}^{*}\boldsymbol{\Lambda},\mathbf{d}\mathbf{G}_{p}(q)\mathbf{X}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda}^{*}\mathbf{X},\mathbf{X}^{*}\mathbf{d}\mathbf{G}_{p}(q)^{*}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\mathbf{X}\boldsymbol{\Lambda}^{*}\mathbf{X},\mathbf{d}\mathbf{G}_{p}(q)^{*}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\mathbf{X}^{*}\boldsymbol{\Lambda}\mathbf{X}^{*},\mathbf{d}\mathbf{G}_{p}(q)\right>
\displaystyle=\left({\beta p-\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},q}\right)_{\mathcal{P}}
\displaystyle=0,

after applying the facts that $\mathbf{X}^{*}=\mathbf{X}$ and $\boldsymbol{\Lambda}^{*}=\boldsymbol{\Lambda}$, and making heavy use of (1). We then have that

(37) (p,q)_{\mathcal{P}}=\frac{1}{\beta}\left({\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},q}\right)_{\mathcal{P}}

for all $q\in\mathcal{P}$. We then arrive at (31c): because $\mathcal{P}$ is a Hilbert space, (37) can hold for all $q\in\mathcal{P}$ only if its strong form is satisfied.
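The stationarity computation above can be sanity-checked numerically in finite dimensions. In the sketch below, a hypothetical setup of our own with a linear parametrization $\mathbf{G}_{p}=p_{1}\mathbf{G}_{1}+p_{2}\mathbf{G}_{2}$, the adjoint-based expression $\beta p-\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}$ is compared against a finite-difference gradient of the reduced cost (29); all matrices and constants are illustrative stand-ins.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(1)
n, beta = 4, 5.0
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)   # positive definite pieces

def G(p):                      # hypothetical linear parametrization of G_p
    return p[0] * G1 + p[1] * G2

def X_of(p):                   # primal (31a): A X + X A^T - X G_p X + Q = 0
    B = np.linalg.cholesky(G(p))
    return solve_continuous_are(A.T, B, Q, np.eye(n))

def Lam_of(p, X):              # dual (31b): (A - X G_p)^T L + L (A - X G_p) = -W
    A_cl = A - X @ G(p)
    return solve_continuous_lyapunov(A_cl.T, -W)

def J(p):                      # reduced cost (29)
    return np.trace(X_of(p) @ W) + 0.5 * beta * p @ p

p = np.array([1.0, 0.7])
X = X_of(p)
Lam = Lam_of(p, X)
M = X @ Lam @ X
grad_adjoint = beta * p - np.array([np.trace(G1 @ M), np.trace(G2 @ M)])
eps = 1e-6
grad_fd = np.array([(J(p + eps * e) - J(p - eps * e)) / (2.0 * eps) for e in np.eye(2)])
print(grad_adjoint)
print(grad_fd)       # the two gradients should agree to finite-difference accuracy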

4.2.2. Well-Posedness of the Optimality System

We now determine the conditions that must be satisfied in order for (31) to have only one solution. First, notice that (31a) is well-posed for any $p\in\mathcal{P}$ as a consequence of Theorem 1 and the fact that $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$. Next, notice that $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ enters (31b) as an input parameter, and Lemma 2 indicates that (31b) is well-posed for any $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$. Because (31a) is independent of $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$, there exists a unique $\left({\mathbf{X},\boldsymbol{\Lambda}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)$ that satisfies the coupled equations (22) and (23) for any choice of $p\in\mathcal{P}$. We have therefore established the existence of a mapping $p\mapsto\left({\mathbf{X}(p),\boldsymbol{\Lambda}(p)}\right)$ for any $p\in\mathcal{P}$, and this mapping is continuous as a consequence of Lemmas 3 and 4. Our goal is then to determine the conditions under which we can select a unique $p\in\mathcal{P}$ so that $\left({\mathbf{X}(p),\boldsymbol{\Lambda}(p),p}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)\times\mathcal{P}$ satisfies (31c).

Let us define $f:\mathcal{P}\rightarrow\mathcal{P}$ as follows

f(p):=\frac{1}{\beta}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p),

where $\mathbf{X}(p)$ and $\boldsymbol{\Lambda}(p)$ satisfy (31a) and (31b), respectively. It then becomes clear that (31c) can be written in the following fixed-point form

p=f(p).

It is our goal to invoke the Banach fixed point theorem [7, Theorem 3.7-1] to determine the conditions under which the function $f(\cdot)$ is a contractive map, i.e., $f(\cdot)$ satisfies

\left\|f(p_{1})-f(p_{2})\right\|_{\mathcal{P}}\leq k\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

with $k<1$ for any $p_{1},p_{2}\in\mathcal{P}$. More concretely, we wish to demonstrate for any $p_{1},p_{2}\in\mathcal{P}$ that

(38) \frac{1}{\beta}\left\|\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}-\mathbf{d}\mathbf{G}_{p_{2}}^{*}\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathcal{P}}\leq k\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

We now characterize the constant $k\in\mathbb{R}_{+}$. Lemmas 2, 3, and 4 determine that both $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ are Lipschitz continuous functions of $p\in\mathcal{P}$ under the assumption that (15) and (16) are satisfied. If we further assume that (17) is satisfied, then it follows that

\displaystyle\frac{1}{\beta}\left\|\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}-\mathbf{d}\mathbf{G}_{p_{2}}^{*}\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathcal{P}}\leq\frac{1}{\beta}\bigg(\left\|\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(\mathcal{P};\mathscr{J}_{1}(H))}\left\|\mathbf{X}_{1}\right\|^{2}_{1}\left\|\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}
\displaystyle\qquad+\left\|\mathbf{d}\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}\left({\mathcal{P};\mathscr{J}_{1}(H)}\right)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\left\|\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{2}\right\|_{1}
\displaystyle\qquad+\left\|\mathbf{d}\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}(\mathcal{P};\mathscr{J}_{1}(H))}\left\|\mathbf{X}_{1}\right\|_{1}\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{2}\right\|_{1}
\displaystyle\qquad+\left\|\mathbf{d}\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}\left({\mathcal{P};\mathscr{J}_{1}(H)}\right)}\left\|\mathbf{X}_{1}\right\|_{1}\left\|\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\bigg)
\displaystyle\leq\frac{1}{\beta}\bigg(\frac{M^{6}}{16\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(\mathcal{P};\mathscr{J}_{1}(H))}
\displaystyle\qquad+\frac{C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}
\displaystyle\qquad+\frac{C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}\bigg)
\displaystyle\leq\frac{1}{\beta}\bigg(\frac{L_{\mathbf{d}\mathbf{G}}M^{6}}{16\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{d}\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{10}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|_{1}^{3}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left({\frac{M^{10}\gamma}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}}\right)\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}\bigg)
\displaystyle=:k_{\alpha,\beta,M,\mathbf{Q},\mathbf{W},\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

Therefore, we have demonstrated that (31c) has a unique solution if the penalty parameter $\beta$ is chosen large enough to offset the contributions of $\alpha$, $M$, $\left\|\mathbf{Q}\right\|_{1}$, $\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}$, and the constants associated with $\mathbf{G}_{\left({\cdot}\right)}$.
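The contraction can be observed in finite dimensions by simply iterating the fixed-point map. The sketch below is an illustration under the same hypothetical linear parametrization used earlier in this section's sketches; it is not a statement about any particular infinite-dimensional system, but for a large penalty parameter the iterates settle quickly, as the estimate above suggests.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(2)
n = 4
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)

def fixed_point_map(p, beta):
    # Picard step p <- (1/beta) dG_p^*( X(p) Lam(p) X(p) ), the analog of (31c).
    Gp = p[0] * G1 + p[1] * G2
    X = solve_continuous_are(A.T, np.linalg.cholesky(Gp), Q, np.eye(n))
    Lam = solve_continuous_lyapunov((A - X @ Gp).T, -W)
    M = X @ Lam @ X
    return np.array([np.trace(G1 @ M), np.trace(G2 @ M)]) / beta

beta, p = 50.0, np.array([1.0, 1.0])
for k in range(50):
    p_next = fixed_point_map(p, beta)
    if np.linalg.norm(p_next - p) < 1e-12:
        break
    p = p_next
print(k, p)   # the iterates settle quickly when beta is large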

We now analyze the Hessian operator $\mathbf{d}^{2}\mathcal{L}_{opt}\left[\cdot,\cdot\right]:\left[\mathscr{J}_{1}^{s}(H)\times\mathcal{P}\right]\times\left[\mathscr{J}_{1}^{s}(H)\times\mathcal{P}\right]\rightarrow\mathbb{R}$ associated with the Lagrangian functional (32) evaluated at $\left({\mathbf{X}_{opt},p_{opt},\boldsymbol{\Lambda}_{opt}}\right)$. We aim to demonstrate that the second-order optimality condition is satisfied, i.e. that $\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\boldsymbol{\Phi},q),(\boldsymbol{\Phi},q)\right]$ is positive definite for every $\left({\boldsymbol{\Phi},q}\right)$ in the critical cone $\mathcal{K}\left({\mathbf{X}_{opt},p_{opt}}\right)$ as defined in (28). The Hessian $\mathbf{d}^{2}\mathcal{L}_{opt}\left[\cdot,\cdot\right]$ is given by

\displaystyle\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\boldsymbol{\Phi},q),(\boldsymbol{\Psi},r)\right]=-\left<\boldsymbol{\Lambda}_{opt},\left[\boldsymbol{\Phi}\mathbf{G}_{p_{opt}}\boldsymbol{\Psi}+\boldsymbol{\Psi}\mathbf{G}_{p_{opt}}\boldsymbol{\Phi}\right]\right>
\displaystyle\qquad-\left<\boldsymbol{\Lambda}_{opt},\left[\boldsymbol{\Phi}\mathbf{d}\mathbf{G}_{p_{opt}}(r)\mathbf{X}_{opt}+\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(r)\boldsymbol{\Phi}\right]\right>
\displaystyle\qquad-\left<\boldsymbol{\Lambda}_{opt},\left[\boldsymbol{\Psi}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\mathbf{X}_{opt}+\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\boldsymbol{\Psi}\right]\right>
\displaystyle\qquad+\beta\left({q,r}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}\left({q,r}\right)\mathbf{X}_{opt}\right>

for all $\left({\boldsymbol{\Phi},q}\right),\left({\boldsymbol{\Psi},r}\right)\in\mathscr{J}_{1}(H)\times\mathcal{P}$. Restricting $\left({\boldsymbol{\Phi},q}\right)$ and $\left({\boldsymbol{\Psi},r}\right)$ to the critical cone $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$ then yields

\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\boldsymbol{\Phi},q),(\boldsymbol{\Psi},r)\right]=\beta\left({q,r}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}\left({q,r}\right)\mathbf{X}_{opt}\right>

since $\boldsymbol{\Phi}=\boldsymbol{\Psi}=\mathbf{0}$ must hold for any element belonging to $\mathcal{K}\left({\mathbf{X}_{opt},p_{opt}}\right)$. Taking $q=r$ then indicates that $\mathbf{d}^{2}\mathcal{L}_{opt}$ is a positive definite operator on $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$ if $\beta$ is chosen sufficiently large, thereby satisfying the second-order sufficient condition for $\left({\mathbf{X}_{opt},p_{opt}}\right)$ to qualify as a constrained minimizer of the cost functional (29).
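To make the role of $\beta$ explicit, suppose for illustration that the second derivative of the parametrization admits a bound of the form $\left\|\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\right\|_{1}\leq C_{\mathbf{d}^{2}\mathbf{G}}\left\|q\right\|_{\mathcal{P}}^{2}$, a bound introduced here purely for illustration. Then, for every $(\mathbf{0},q)\in\mathcal{K}(\mathbf{X}_{opt},p_{opt})$,

\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\mathbf{0},q),(\mathbf{0},q)\right]\geq\beta\left\|q\right\|_{\mathcal{P}}^{2}-C_{\mathbf{d}^{2}\mathbf{G}}\left\|\boldsymbol{\Lambda}_{opt}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{opt}\right\|_{1}^{2}\left\|q\right\|_{\mathcal{P}}^{2}=\left({\beta-C_{\mathbf{d}^{2}\mathbf{G}}\left\|\boldsymbol{\Lambda}_{opt}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{opt}\right\|_{1}^{2}}\right)\left\|q\right\|_{\mathcal{P}}^{2},

which is strictly positive for all $q\neq 0$ once $\beta$ exceeds $C_{\mathbf{d}^{2}\mathbf{G}}\left\|\boldsymbol{\Lambda}_{opt}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{opt}\right\|_{1}^{2}$.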

We formalize our findings in the following.

Theorem 2.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of a $C_{0}$-semigroup, $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$, and $\mathbf{W}\in\mathscr{L}^{s}(H)$. If conditions (15), (16), and (17) are satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$, then there exists a unique solution $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}\left({H}\right)\times\mathcal{P}$ that satisfies the first-order optimality system (31) associated with the constrained weighted trace minimization problem, provided that the penalty parameter $\beta\in\mathbb{R}_{+}$ is chosen sufficiently large that $k_{\alpha,\beta,M,\mathbf{Q},\mathbf{W},\mathbf{G}}<1$.

Further assume that (20) is satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$. Then $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ is the unique constrained minimizer for the cost functional (29) if $\beta\in\mathbb{R}_{+}$ is sufficiently large.

In general, Theorem 2 indicates that $\beta$ can always be chosen large enough to force $\mathcal{J}_{\beta}(\cdot)$ to have a unique constrained minimizer.

5. Approximate Control Constraint Enforcement

In this section, we consider a penalization technique that approximately enforces a trace constraint on the operator $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$ for all $p\in\mathcal{P}$. One important property of this penalization approach is that the approximate constraint enforcement becomes exact as the penalization parameter tends to positive infinity. It is through this mechanism that we are able to determine that a wide class of sensor placement and design problems admits a unique constrained minimizer. Many of the techniques used to analyze the problem described in this section have been discussed in the previous section. Because of this, we only briefly touch upon details that have direct analogs in the previous problem and focus our attention on the technicalities associated with the current problem of interest.

5.1. Problem Statement

The penalized optimization problem analyzed in this section is to seek a constrained minimizer $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ for the following cost functional

(39) \mathcal{J}_{\beta}(\mathbf{X},p):=\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)+\frac{\beta}{2}\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]^{2}

subject to

(40) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=\mathbf{0},

where we assume that $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$ is again a positive semi-definite trace-class operator and $\gamma\in\mathbb{R}_{+}$ is the prescribed target value for $\texttt{trace}\left(\mathbf{G}_{p}\right)$.

5.2. First-Order Optimality System

The Lagrangian functional associated with the constrained optimization problem studied in this section is the following

(41) \mathcal{L}(\mathbf{X},p,\boldsymbol{\Lambda}):=\left<\mathbf{I},\mathbf{X}\mathbf{W}\right>+\frac{\beta}{2}\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]^{2}+\left<\boldsymbol{\Lambda},\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right>.

Taking the first Fréchet derivative of $\mathcal{L}(\cdot,\cdot,\cdot)$ with respect to $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$, $p\in\mathcal{P}$, and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ yields the first-order optimality system associated with (41), stated below.

Primal Problem:
(42a) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=0
Dual Problem:
(42b) \left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)=-\mathbf{W}
Optimality Condition:
(42c) p=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}}\left[\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{d}\mathbf{G}_{p}\right]^{-1}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\mathbf{d}\mathbf{G}_{p}p
Control Operator Trace Constraint:
(42d) \texttt{trace}\left(\mathbf{G}_{p}\right)=\gamma+\frac{1}{\beta}\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}

Because the derivation of (42a) and (42b) is nearly identical to that of (31a) and (31b), we will only discuss the derivation of (42c) and (42d) in the following.

Remark 1.

An inspection of (42d) indicates that $\texttt{trace}\left(\mathbf{G}_{p}\right)\rightarrow\gamma$ as $\beta\rightarrow+\infty$. Applying (42d) in (39) then shows that the penalty term equals $\frac{1}{2\beta}\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|^{2}_{\mathscr{L}(H)}$ at the stationary point, so $\mathcal{J}_{\beta}(\mathbf{X},p)$ reduces to $\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)$ as $\beta\rightarrow+\infty$.
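This limiting behavior can be observed directly by minimizing a finite-dimensional analog of (39) for increasing values of $\beta$. The setup below is a hypothetical sketch of our own (the matrices, the squared parametrization of $\mathbf{G}_{p}$, and the use of a derivative-free optimizer are illustrative choices, not part of the analysis); the printed trace approaches the target $\gamma$ as $\beta$ grows.

import numpy as np
from scipy.linalg import solve_continuous_are
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, gamma = 4, 2.0
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)

def Gp(p):
    return p[0] ** 2 * G1 + p[1] ** 2 * G2     # squared entries keep G_p nonnegative

def J_beta(p, beta):
    B = np.linalg.cholesky(Gp(p) + 1e-12 * np.eye(n))   # jitter guards p near 0
    X = solve_continuous_are(A.T, B, Q, np.eye(n))
    return np.trace(X @ W) + 0.5 * beta * (np.trace(Gp(p)) - gamma) ** 2

for beta in (1.0, 10.0, 100.0, 1000.0):
    res = minimize(J_beta, x0=np.array([0.5, 0.5]), args=(beta,), method="Nelder-Mead")
    print(beta, np.trace(Gp(res.x)))   # approaches gamma = 2.0 as beta grows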

5.2.1. Derivation of the Optimality Condition

The chain rule and the necessary stationarity condition imply that

\frac{\partial\mathcal{L}}{\partial p}(q):=\frac{\partial\mathcal{L}}{\partial\mathbf{G}_{p}}\left[\frac{\partial\mathbf{G}_{p}}{\partial p}(q)\right]=0

for all $q\in\mathcal{P}$. Because we have assumed that $\mathbf{d}\mathbf{G}_{p}\neq 0$ in (18), we have that

(43) \displaystyle\frac{\partial\mathcal{L}}{\partial\mathbf{G}_{p}}(\mathbf{H})=\beta\left<\mathbf{I},\mathbf{H}\right>\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]-\left<\boldsymbol{\Lambda},\mathbf{X}\mathbf{H}\mathbf{X}\right>
\displaystyle=\beta[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma]\left<\mathbf{I},\mathbf{H}\right>-\left<\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},\mathbf{H}\right>
\displaystyle=0

for all $\mathbf{H}\in\mathscr{J}_{1}(H)$. This then implies that

\beta\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]\mathbf{I}=\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}

and (42d) follows after taking the $\mathscr{L}(H)$ norm of both sides of the above equation and performing some algebraic manipulation.

Taking $\mathbf{H}=\mathbf{d}\mathbf{G}_{p}(q)$ for all $q\in\mathcal{P}$ in (43) results in

\displaystyle\frac{\partial\mathcal{L}}{\partial\mathbf{G}_{p}}(\mathbf{d}\mathbf{G}_{p}(q))=\beta[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma]\left<\mathbf{I},\mathbf{d}\mathbf{G}_{p}(q)\right>-\left<\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},\mathbf{d}\mathbf{G}_{p}(q)\right>
\displaystyle=\beta\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]\left({\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{I},q}\right)_{\mathcal{P}}-\left({\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},q}\right)_{\mathcal{P}}
\displaystyle=0.

This then implies that

\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{I}=\frac{1}{\beta\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}.

We now apply (42d) to obtain

\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{I}=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}\left({H}\right)}}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}.

Right-multiplying both sides of this equation by $\mathbf{d}\mathbf{G}_{p}p$ then yields

\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{d}\mathbf{G}_{p}p=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}\left({H}\right)}}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\mathbf{d}\mathbf{G}_{p}p

and (42c) is obtained by left-multiplying both sides of the above by $\left[\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{d}\mathbf{G}_{p}\right]^{-1}$ after recalling (19). With this, we are ready to analyze the first-order optimality system (42).

5.2.2. Unique Solution to First-Order Optimality System

We begin our analysis by noticing that there is a continuous mapping $p\mapsto\left({\mathbf{X}(p),\boldsymbol{\Lambda}(p)}\right)$, where $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ correspond to the solutions of (42a) and (42b), respectively. This observation is made using the same argument as in §4.2.2. With this, we may focus on determining that (42c) has a unique fixed point $p\in\mathcal{P}$ on the manifold induced by (42d).

Let us write $f(\cdot):\mathcal{P}\rightarrow\mathcal{P}$ for the nonlinear function defined in (42c), i.e.

f(p):=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}}\left[\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{d}\mathbf{G}_{p}\right]^{-1}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\mathbf{d}\mathbf{G}_{p}p.

It is again our task to determine the conditions under which there exists a positive constant $k<1$ that satisfies

\left\|f(p_{1})-f(p_{2})\right\|_{\mathcal{P}}\leq k\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for any two $p_{1},p_{2}\in\mathcal{P}$.

Notice that (42c) is scale-invariant. Let us take

(44) p:=s\widetilde{p}

where $s\in\mathbb{R}_{+}$ is a positive scaling factor and $\widetilde{p}$ is a reference parameter normalized so that $\left\|\widetilde{p}\right\|_{\mathcal{P}}=1$. The scale invariance arises from the observation that, no matter how one chooses the scaling factor $s$ in (44), it cancels out of the fixed-point problem (42c) under this change of variables and therefore never appears in its definition. This allows us to take $\left\|p\right\|_{\mathcal{P}}=1$ in our analysis without loss of generality.

We now begin our well-posedness analysis by partitioning the fixed point equation into the following

p=\left[(I)(II)(III)\right]p,

where we have denoted

(I):=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}},\quad(II):=\left[\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{d}\mathbf{G}_{p}\right]^{-1},\quad(III):=\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\mathbf{d}\mathbf{G}_{p}.

These partitioned terms are bounded as follows.

(45) |(I)|\leq\frac{1}{\mu},

where $\mu:=\inf_{p\in\mathcal{P}}\left\|\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\right\|_{\mathscr{L}(H)}$.

The constant $\mu$ is strictly positive because $\mathbf{X}=\mathbf{0}$ if and only if $\mathbf{Q}=\mathbf{0}$, and $\boldsymbol{\Lambda}\neq\mathbf{0}$ as long as $\mathbf{W}\neq\mathbf{0}$. To see this, consider the Bochner integral form of the operator-valued Riccati equation (6). First, assume that $\mathbf{X}=\mathbf{0}$; then $\mathbf{Q}$ must be $\mathbf{0}$, since $\int_{0}^{+\infty}\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)dt=\mathbf{0}$ if and only if $\mathbf{Q}=\mathbf{0}$. Conversely, assume $\mathbf{Q}=\mathbf{0}$; then $\int_{0}^{+\infty}\mathbf{S}(t)\left({\mathbf{X}\mathbf{G}_{p}\mathbf{X}}\right)\mathbf{S}^{*}(t)dt=\mathbf{0}$ implies that $\mathbf{X}=\mathbf{0}$. Next, $\boldsymbol{\Lambda}$ satisfies the Bochner integral form of the dual problem (14), and the linearity and well-posedness of (14) imply that $\boldsymbol{\Lambda}=\mathbf{0}$ if and only if $\mathbf{W}=\mathbf{0}$. Hence, we must have that $\inf_{p\in\mathcal{P}}\left\|\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\right\|_{\mathscr{L}\left({H}\right)}>0$.

We then define

K:=\sup_{p\in\mathcal{P}}\left\|(II)\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}.

And finally,

\left\|(III)\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\leq C_{\mathbf{d}\mathbf{G}}^{2}\frac{M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}.

It again suffices to restrict attention to $\left\|p\right\|_{\mathcal{P}}=1$ due to the scale invariance of the fixed-point equation. With this, we determine that

\displaystyle\left\|(I)_{1}(II)_{1}(III)_{1}p_{1}-(I)_{2}(II)_{2}(III)_{2}p_{2}\right\|_{\mathcal{P}}\leq\left|(I)_{1}-(I)_{2}\right|\left\|(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad+|(I)_{1}|\left\|(II)_{1}-(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad+|(I)_{1}|\left\|(II)_{1}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{1}-(III)_{2}\right\|_{\mathscr{L}(\mathcal{P})}\left\|p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad+|(I)_{1}|\left\|(II)_{1}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{1}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\leq\frac{KC^{2}_{\mathbf{d}\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}|(I)_{1}-(I)_{2}|
\displaystyle\quad+\frac{C^{2}_{\mathbf{d}\mathbf{G}}M^{6}}{8\alpha^{3}\mu}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|(II)_{1}-(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad+\frac{K}{\mu}\left\|(III)_{1}-(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad+\frac{KC_{\mathbf{d}\mathbf{G}}^{2}M^{6}}{8\alpha^{3}\mu}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

We now bound every difference term in the above inequality. First, see that

\displaystyle|(I)_{1}-(I)_{2}|=\left|\frac{1}{\left\|\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}}-\frac{1}{\left\|\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}}\right|
\displaystyle\leq\frac{\left\|\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}-\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}}{\left\|\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}}
\displaystyle\leq\frac{1}{\mu^{2}}\bigg(\left\|\mathbf{X}_{2}-\mathbf{X}_{1}\right\|_{\mathscr{L}(H)}\left\|\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}
\displaystyle\qquad+\left\|\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}\left\|\boldsymbol{\Lambda}_{2}-\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}
\displaystyle\qquad+\left\|\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}\left\|\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{2}-\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}\bigg)
\displaystyle\leq\frac{M^{4}}{2\alpha^{2}\mu^{2}}\left\|\mathbf{Q}\right\|_{1}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}+\frac{M^{4}}{4\alpha^{2}\mu^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left({H}\right)}
\displaystyle\leq\frac{L_{\mathbf{G}}M^{10}}{16\alpha^{5}\mu}\left\|\mathbf{Q}\right\|_{1}^{3}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}+\frac{L_{\mathbf{G}}M^{4}}{4\alpha^{2}\mu^{2}}\left({\frac{M^{10}\gamma_{\beta}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{4}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|^{3}_{1}}\right)\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle=:k_{(I),\alpha,\mathbf{Q}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}},

where we have defined $\gamma_{\beta}:=\gamma+\frac{1}{\beta}\sup_{p\in\mathcal{P}}\left\|\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\right\|_{\mathscr{L}\left({H}\right)}$. Next, from the Lipschitz continuity of $\mathbf{d}\mathbf{G}_{(\cdot)}$ with respect to $p\in\mathcal{P}$, we have that

\left\|(II)_{1}-(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\leq L\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

Finally, we derive a Lipschitz bound for the final partitioned difference term. Leveraging the analysis presented in the previous section, we have that

\displaystyle\left\|(III)_{1}-(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad\leq\left\|\left({\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}-\mathbf{d}\mathbf{G}_{p_{2}}^{*}\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}}\right)\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}+\left\|\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\left({\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}}\right)\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad\leq C_{\mathbf{d}\mathbf{G}}\bigg(\frac{L_{\mathbf{d}\mathbf{G}}M^{6}}{16\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{d}\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{10}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|_{1}^{3}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left({\frac{M^{10}\gamma_{\beta}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}}\right)\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}\bigg)
\displaystyle\qquad+\frac{C_{\mathbf{d}\mathbf{G}}L_{\mathbf{d}\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad=:k_{(III),\alpha,\mathbf{Q}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

Putting this all together, we have determined that there exists a constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}\in\mathbb{R}_{+}$, depending on $M$, $\alpha$, $\beta$, $\left\|\mathbf{G}\right\|_{1}$, $\left\|\mathbf{Q}\right\|_{1}$, and $\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}$, such that

\left\|f(p_{1})-f(p_{2})\right\|_{\mathcal{P}}\leq k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for any two $p_{1},p_{2}\in\mathcal{P}$. Inspection of the definition of the constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}$ indicates that if $\alpha,\beta\in\mathbb{R}_{+}$ are sufficiently large and $\gamma$, $\left\|\mathbf{Q}\right\|_{1}$, and $\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}$ are sufficiently small, then (42c) has a unique fixed point.

Evaluating the second variation of $\mathcal{L}(\cdot,\cdot,\cdot)$ at $(\mathbf{X}_{opt},p_{opt},\boldsymbol{\Lambda}_{opt})$ along the directions contained in the critical cone $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$, as defined in (28), then yields

\displaystyle\mathbf{d}^{2}\mathcal{L}_{opt}[(\boldsymbol{\Phi},q),(\boldsymbol{\Phi},q)]=\beta\left[\texttt{trace}\left(\mathbf{G}_{p_{opt}}\right)-\gamma\right]\left<\mathbf{I},\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\right>
\displaystyle\qquad+\beta\left<\mathbf{I},\mathbf{d}\mathbf{G}_{p_{opt}}(q)\right>^{2}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\mathbf{X}_{opt}\right>
\displaystyle=\left\|\mathbf{X}_{opt}\boldsymbol{\Lambda}_{opt}\mathbf{X}_{opt}\right\|_{\mathscr{L}\left({H}\right)}\left<\mathbf{I},\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\right>
\displaystyle\qquad+\beta\left<\mathbf{I},\mathbf{d}\mathbf{G}_{p_{opt}}(q)\right>^{2}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\mathbf{X}_{opt}\right>

for all $\left({\boldsymbol{\Phi},q}\right)\in\mathcal{K}\left({\mathbf{X}_{opt},p_{opt}}\right)$. Inspecting the expression above indicates that $\mathbf{d}^{2}\mathcal{L}_{opt}[(\boldsymbol{\Phi},q),(\boldsymbol{\Phi},q)]$ is positive definite if $\beta\in\mathbb{R}_{+}$ is sufficiently large, after recalling (42d).

Theorem 3.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of a $C_{0}$-semigroup, $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$, and $\mathbf{W}\in\mathscr{L}^{s}(H)$. If conditions (15), (16), (17), and (19) are satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$, then there exists a unique solution $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}\left({H}\right)\times\mathcal{P}$ that satisfies the first-order optimality system (42) associated with the constrained weighted trace minimization problem, provided that the penalty parameter $\beta\in\mathbb{R}_{+}$ is chosen sufficiently large and that $\alpha\in\mathbb{R}_{+}$ is sufficiently large to reduce the contributions of $\left\|\mathbf{Q}\right\|_{1}$, $\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}$, and $M$ in the definition of the constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}$.

Further assume that (20) is satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$. Then $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ is the unique constrained minimizer for the cost functional (39), provided that the conditions described above for uniqueness are satisfied.

Theorem 3 provides the conditions needed for $\mathcal{J}_{\beta}(\cdot)$ to have a constrained minimizer. In general, the constrained minimizer is not guaranteed to be unique unless $\alpha$ is sufficiently large to reduce the effects of $\left\|\mathbf{Q}\right\|_{1}$, $\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}$, and $M$ in the definition of the contraction constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}$. This is in contrast to the constrained optimization problem presented in §4, where $\beta$ alone can be chosen large enough to guarantee uniqueness.

Assume for now that the conditions in Theorem 3 are satisfied and that $\alpha$ is sufficiently large to guarantee the uniqueness of the fixed point once $\beta$ is chosen large enough. Let us define $(\mathbf{X}^{\beta}_{opt},p^{\beta}_{opt})\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ to be the constrained minimizer associated with the choice of $\beta$ in the definition of (39). Theorem 3 indicates that there is a unique constrained minimizer of $\mathcal{J}_{\beta}(\cdot)$ for every $\beta>\beta_{min}$, for some threshold value $\beta_{min}$. Taking $\beta\rightarrow+\infty$ preserves the contractive property of the fixed-point problem (42c) while simultaneously enforcing the constraint $\texttt{trace}\left(\mathbf{G}_{p}\right)=\gamma$. This then implies that the limit $\lim_{\beta\rightarrow\infty}\left({\mathbf{X}^{\beta}_{opt},p^{\beta}_{opt}}\right)=\left({\mathbf{X}^{\infty}_{opt},p^{\infty}_{opt}}\right)$ is well-defined. We also observe that the Hessian operator, while remaining positive, becomes unbounded in this limit. This suggests that the Lagrangian is not twice differentiable at the stationary point associated with the constrained minimizer $\left({\mathbf{X}^{\infty}_{opt},p^{\infty}_{opt}}\right)$. This discussion, along with Remark 1, then implies the following.

Corollary 1.

Assume that the conditions in Theorem 3 are satisfied. Then there exists a unique constrained minimizer $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ for the following constrained minimization problem.

\min_{p\in\mathcal{P}}\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)

subject to

\left\{\begin{aligned}\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}&=\mathbf{0}\\ \texttt{trace}\left(\mathbf{G}_{p}\right)&=\gamma.\end{aligned}\right.

6. Discussion

In this work, we have determined the well-posedness of the strong form of the operator-valued Riccati equation (the primal problem). Upon this foundation, we are able to determine the well-posedness of the dual problem and prove that the solutions to both the primal and dual problems are Lipschitz continuous with respect to the parameter variable that describes the operator associated with the control device. From there, we determine through a fixed-point argument that both constrained weighted trace minimization problems presented in this work have unique constrained minimizers under certain conditions. In the penalized parameter optimization problem, the penalization parameter can be chosen sufficiently large to force uniqueness, whereas this property does not extend to the problem of penalization for approximate constraint enforcement. Under the conditions in which the second optimization problem does have a unique minimizer, we determine that the uniqueness of the minimizer is preserved in the infinite limit of the penalization parameter. This then implies that exact constraint enforcement also leads to a unique minimizer provided that the conditions prescribed in Theorem 3 are satisfied.

The results of this work are mostly theoretical. However, they can be utilized to design efficient sensor and actuator design and placement algorithms. The uniqueness of the minimizer enables the use of gradient-based optimization algorithms, as opposed to more expensive global optimization algorithms, since uniqueness is known a priori. The results regarding the strong form of the operator-valued Riccati equation indicate that its solution is compact in the sense that it maps $\mathcal{D}(\mathbf{A}^{*})^{\prime}$ into $\mathcal{D}(\mathbf{A}^{*})$. We plan on utilizing this observation in a future work in an attempt to derive optimal convergence rates for the numerical approximation of operator-valued Riccati equations that arise from unbounded sensing and actuation problems.
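To indicate how such a gradient-based algorithm might look, the sketch below runs a plain gradient iteration for the reduced cost of §4 using the adjoint-based gradient $\beta p-\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}$. Everything in it (the matrices, the linear parametrization, the step size, and the iteration count) is a hypothetical finite-dimensional stand-in offered only as an illustration of the point above.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(5)
n, beta, step = 4, 20.0, 0.02
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)

def gradient(p):               # adjoint-based gradient of the reduced cost (29)
    Gp = p[0] * G1 + p[1] * G2
    X = solve_continuous_are(A.T, np.linalg.cholesky(Gp), Q, np.eye(n))
    Lam = solve_continuous_lyapunov((A - X @ Gp).T, -W)
    M = X @ Lam @ X
    return beta * p - np.array([np.trace(G1 @ M), np.trace(G2 @ M)])

p = np.array([1.0, 1.0])
for _ in range(300):
    p = p - step * gradient(p)
print(p)   # approaches the unique stationary point guaranteed by Theorem 2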

References

  • [1] Sheldon Axler. Linear algebra done right. Springer Nature, 2024.
  • [2] Alain Bensoussan. Optimization of sensors’ location in a distributed filtering problem. In Stability of Stochastic Dynamical Systems: Proceedings of the International Symposium Organized by “The Control Theory Centre”, University of Warwick, July 10–14, 1972 Sponsored by the “International Union of Theoretical and Applied Mechanics”, pages 62–84. Springer, 2006.
  • [3] Alain Bensoussan, Giuseppe Da Prato, Michel C Delfour, and Sanjoy K Mitter. Representation and control of infinite dimensional systems. Springer, 2007.
  • [4] John A Burns and Carlos N Rautenberg. The infinite-dimensional optimal filtering problem with mobile and stationary sensor networks. Numerical Functional Analysis and Optimization, 36(2):181–224, 2015.
  • [5] John A Burns and Carlos N Rautenberg. Solutions and approximations to the Riccati integral equation with values in a space of compact operators. SIAM Journal on Control and Optimization, 53(5):2846–2877, 2015.
  • [6] James Cheung. On the approximation of operator-valued Riccati equations in Hilbert spaces. Journal of Mathematical Analysis and Applications, 547(1):129250, 2025.
  • [7] Philippe G Ciarlet. Linear and nonlinear functional analysis with applications. SIAM, 2025.
  • [8] John B Conway. A course in operator theory, volume 21. American Mathematical Society, 2025.
  • [9] Ruth F Curtain and Hans Zwart. An introduction to infinite-dimensional linear systems theory, volume 21. Springer Science & Business Media, 2012.
  • [10] J. Diestel and J. J. Uhl. Vector Measures. American Mathematical Society, 1977.
  • [11] M Sajjad Edalatzadeh, Dante Kalise, Kirsten A Morris, and Kevin Sturm. Optimal actuator design for vibration control based on LQR performance and shape calculus. arXiv preprint arXiv:1903.07572, 2019.
  • [12] Jerome A Goldstein. Semigroups of linear operators and applications. Courier Dover Publications, 2017.
  • [13] Michael Hintermüller, Carlos N Rautenberg, Masoumeh Mohammadi, Martin Kanitsar, et al. Optimal sensor placement: A robust approach. SIAM Journal on Control and Optimization, 55(6):3609–3639, 2017.
  • [14] Weiwei Hu, Kirsten Morris, and Yangwen Zhang. Sensor location in a controlled thermal fluid. In 2016 IEEE 55th Conference on Decision and Control (CDC), pages 2259–2264. IEEE, 2016.
  • [15] Andreas Kirsch et al. An introduction to the mathematical theory of inverse problems, volume 120. Springer, 2011.
  • [16] Kirsten Morris. Linear-quadratic optimal actuator location. IEEE Transactions on Automatic Control, 56(1):113–124, 2011.
  • [17] Kirsten Morris and Steven Yang. Comparison of actuator placement criteria for control of structures. Journal of Sound and Vibration, 353:1–18, 2015.
  • [18] Louis Sharrock and Nikolas Kantas. Joint online parameter estimation and optimal sensor placement for the partially observed stochastic advection-diffusion equation. SIAM/ASA Journal on Uncertainty Quantification, 10(1):55–95, 2022.
  • [19] Fredi Tröltzsch. Optimal control of partial differential equations: theory, methods, and applications, volume 112. American Mathematical Soc., 2010.