Penalized Weighted Trace Minimization for Optimal Control Device Design and Placement

James Cheung, Toyon Research Corporation, 6800 Cortona Drive, Goleta, CA 93117.
Abstract.

In this paper, we present a new analytical framework for determining the well-posedness of constrained optimization problems that arise in the study of optimal control device design and placement within the context of infinite-dimensional linear quadratic control systems. We first prove the well-posedness of the newly minted “strong form” of the time-independent operator-valued Riccati equation. This form of the equation then enables the use of trace-class operator analysis and the Lagrange multiplier formalism to analyze operator-valued Riccati equation-constrained optimization problems. Using this fundamental result, we then determine the conditions under which there exist unique solutions to two important classes of penalized trace minimization problems for optimal control device placement and design.

The material presented in this paper was developed independently by the author and is not associated with any work related to his affiliated organization.

1. Introduction

The purpose of this work is to address the fundamental question of well-posedness for optimization problems associated with optimal sensor and actuator placement and design in the context of infinite-dimensional linear systems theory. To consolidate definitions, we will collectively refer to sensors and actuators as control devices. In the context of the linear quadratic regulator (LQR), the optimization problem of interest is the following:

\min_{p\in\mathcal{P}}\left\{\min_{u(\cdot)\in L^{2}(\mathbb{R}_{+};U)}\int_{0}^{+\infty}\left[\left(z(t),\mathbf{Q}z(t)\right)_{H}+\left(u(t;p),\mathbf{R}u(t;p)\right)_{U}\right]dt\right\}

subject to

\left\{\begin{aligned}\frac{\partial z}{\partial t}&=\mathbf{A}z(t)+\mathbf{B}(p)u(t;p)\\ z(0)&=z_{0}\end{aligned}\right.,

for all $t\in[0,\infty)$; where, denoting $H$, $U$, and $\mathcal{P}$ as the separable Hilbert spaces associated with the state, control, and parameter respectively, we define $p\in\mathcal{P}$ to be the generalized design parameter (e.g. actuator placement or geometric design variables), $z(\cdot)\in H$ to be the state variable, $u(\cdot;p)\in L^{2}(\mathbb{R}_{+};U)$ to be the control variable, $\mathbf{Q}\in\mathscr{L}^{s}(H)$ to be the output operator, $\mathbf{R}\in\mathscr{L}(U)$ to be the control weighting operator, $\mathbf{B}(p)\in\mathscr{L}(U;H)$ to be the parametrized control operator associated with the control device (e.g. sensors or actuators), and $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ to define the state process as the generator of a $C_{0}$-semigroup. The Dynamic Programming Principle determines that the optimal control, for a fixed value $p\in\mathcal{P}$, is given by

u_{opt}(t;p)=-\mathbf{R}^{-1}\mathbf{B}^{*}(p)\mathbf{X}(p)z(t),

where $\mathbf{X}(p)\in\mathscr{L}(H)$ is the solution to the weak-form operator-valued Riccati equation

\left(\phi,\left[\mathbf{A}^{*}\mathbf{X}(p)+\mathbf{X}(p)\mathbf{A}-\mathbf{X}(p)\mathbf{B}(p)\mathbf{R}^{-1}\mathbf{B}^{*}(p)\mathbf{X}(p)+\mathbf{Q}\right]\psi\right)_{H}=0

for all $\phi,\psi\in\mathcal{D}(\mathbf{A})$. It then follows [9, Theorem 6.2.4] that

\begin{aligned}\min_{u(\cdot)\in L^{2}(\mathbb{R}_{+};U)}\int_{0}^{+\infty}\left[\left(z(t),\mathbf{Q}z(t)\right)_{H}+\left(u(t;p),\mathbf{R}u(t;p)\right)_{U}\right]dt&=\left(z_{0},\mathbf{X}(p)z_{0}\right)_{H}\\ &=\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right),\end{aligned}

where $\mathbf{W}(\cdot):=z_{0}\left(z_{0},\cdot\right)_{H}$ is the operator generated through the exterior product of the initial condition $z_{0}\in H$ of the system with itself. With these combined observations, the optimal actuator placement and design problem becomes

\min_{p\in\mathcal{P}}\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)

subject to

\left(\phi,\left[\mathbf{A}^{*}\mathbf{X}(p)+\mathbf{X}(p)\mathbf{A}-\mathbf{X}(p)\mathbf{B}(p)\mathbf{R}^{-1}(p)\mathbf{B}^{*}(p)\mathbf{X}(p)+\mathbf{Q}\right]\psi\right)_{H}=\mathbf{0},

for all $\phi,\psi\in\mathcal{D}(\mathbf{A})$. This same optimization problem, albeit with $\mathbf{A}$ and $\mathbf{A}^{*}$ switched in the statement of the operator-valued Riccati equation, carries over to the setting of optimal sensor placement and design for linear state estimation systems (i.e. in the context of the Kálmán filter). Therefore, the study of operator-valued Riccati equation-constrained weighted trace minimization problems has broad-reaching implications for designing optimal systems for both control and state estimation.
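
To make the objective concrete, the following minimal finite-dimensional sketch (not taken from the paper) evaluates trace(X(p)W) for a truncated system; the stable diagonal generator, the Gaussian actuator profile B(p), and all numerical values are hypothetical choices made purely for illustration.

# Finite-dimensional sketch of the design objective trace(X(p) W) (illustrative only).
import numpy as np
from scipy.linalg import solve_continuous_are

n = 40                                            # truncation dimension of H
xs = np.linspace(0.0, 1.0, n)
A = -np.diag(np.arange(1, n + 1, dtype=float))    # stable diagonal stand-in for the generator A
Q = np.eye(n) / n                                 # output weighting operator
R = np.array([[1.0]])                             # control weighting operator
z0 = np.ones(n) / np.sqrt(n)                      # initial condition of the system
W = np.outer(z0, z0)                              # W = z0 (z0, .)_H

def B(p):
    # Hypothetical parametrized control operator: a Gaussian actuator profile centered at p.
    return np.exp(-0.5 * ((xs - p) / 0.05) ** 2).reshape(n, 1)

def objective(p):
    # solve_continuous_are solves A^T X + X A - X B R^{-1} B^T X + Q = 0,
    # the finite-dimensional analog of the weak-form Riccati constraint above.
    X = solve_continuous_are(A, B(p), Q, R)
    return np.trace(X @ W)

print(objective(0.3), objective(0.7))             # compare two candidate placements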

The problem of optimal device placement and design has its origins in the work of [2], where the notion of optimal sensor placement was introduced for infinite-dimensional systems. This theory was then extended to actuator placement in LQR systems in [16]. Additional notable extensions of the theory have been made to device placement in $\mathcal{H}_{\infty}$ control systems [13] and joint parameter estimation and sensor placement in partially observed systems [18]. The theory of optimal device placement and design has been applied to many practical problems, including optimizing thermal control [14], vibration damping [17], and mobile sensor [4] systems. A general observation made when studying the existing literature on optimal device placement and design is that a constrained optimizer $p_{opt}\in\mathcal{P}$ minimizing the weighted operator trace of $\mathbf{X}(p)$ can be shown to exist; however, uniqueness of the optimizer is almost always left as an open problem. This is the key motivator for writing this paper.

We approach the problem of determining the conditions under which a unique solution exists for the constrained trace minimization problem associated with optimal device placement and design through penalization techniques. The first problem we study is the control penalization problem, i.e., to seek a constrained minimizer of

\mathcal{J}_{\beta}(p)=\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)+\frac{\beta}{2}\left\|p\right\|^{2}_{\mathcal{P}},

where we have introduced the penalization parameter $\beta\in\mathbb{R}_{+}$ to regularize the cost functional. This penalization scheme is a classical technique used to improve the conditioning of optimization problems that arise in inverse problems [15] and optimal control [19]. We demonstrate in §4 that, as expected, choosing $\beta$ sufficiently large induces uniqueness of the constrained minimizer $p_{opt}\in\mathcal{P}$.
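
Continuing the hypothetical finite-dimensional sketch above (reusing objective(p) and an illustrative value of β), the control-penalized functional can be minimized with an off-the-shelf optimizer:

# Sketch of minimizing the control-penalized cost functional (illustrative only).
from scipy.optimize import minimize

beta = 10.0                                       # penalization parameter (illustrative value)

def J_beta(p):
    # trace(X(p) W) + (beta/2) ||p||^2, the control-penalized cost functional
    return objective(float(p)) + 0.5 * beta * float(p) ** 2

result = minimize(lambda p: J_beta(p[0]), x0=[0.5], method="Nelder-Mead")
print(result.x)                                   # the numerically obtained penalized minimizer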

The study of the first penalized optimization problem will form the pedagogical basis for the second problem discussed in this work: the determination of conditions under which a unique optimizer $p_{opt}\in\mathcal{P}$ exists on constraint manifolds defined by

\texttt{trace}\left(\mathbf{B}(p)\mathbf{R}^{-1}(p)\mathbf{B}^{*}(p)\right)=\gamma,

where $\gamma\in\mathbb{R}_{+}$ is a positive constant. This constraint on $\mathbf{B}(p)\mathbf{R}^{-1}\mathbf{B}^{*}(p)$ serves the practical purpose of constraining the so-called gain of the control device, e.g. the amount of control effort used by the control feedback law. Following the theme of penalization, this additional trace constraint is approximately enforced through the following penalized cost functional:

\mathcal{J}_{\beta}(p)=\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)+\frac{\beta}{2}\left[\texttt{trace}\left(\mathbf{B}(p)\mathbf{R}^{-1}(p)\mathbf{B}^{*}(p)\right)-\gamma\right]^{2}.

Exact constraint enforcement is achieved in the limit as $\beta\rightarrow+\infty$. We determine the conditions under which there exists a unique constrained minimizer of this cost functional in §5. The penalization introduced in the second problem serves primarily as a mechanism for determining uniqueness on the additional constraint manifold.
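
Again reusing the hypothetical B(p), R, and objective(p) from the sketches above, the gain-constrained variant can be evaluated in the same way; the values of γ and β below are illustrative only.

# Sketch of the gain-constrained penalized functional (illustrative only).
import numpy as np
from scipy.optimize import minimize

gamma, beta = 5.0, 100.0

def gain(p):
    # trace(B(p) R^{-1} B(p)^T), the "gain" of the control device
    Bp = B(p)
    return np.trace(Bp @ np.linalg.inv(R) @ Bp.T)

def J_beta_gain(p):
    return objective(float(p)) + 0.5 * beta * (gain(float(p)) - gamma) ** 2

result = minimize(lambda p: J_beta_gain(p[0]), x0=[0.5], method="Nelder-Mead")
print(result.x, gain(result.x[0]))                # constraint enforced only approximately for finite beta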

The constrained optimization problems are cast into the Lagrange multiplier formalism to study the two penalized constrained trace minimization problems posed above. A fixed-point argument (cf. the Banach fixed-point theorem [7, Theorem 3.7-1]) is used to determine the uniqueness of the solution to the first-order optimality system associated with the Lagrangian functional for each problem. The second-order sufficient optimality condition is then applied to determine that the unique solution of the first-order optimality system is indeed a minimizer.

While this strategy is sound at first glance, we quickly run into technical issues when attempting to apply the Lagrange multiplier formalism to the operator-valued Riccati equation. The problem is that a sufficiently “strong” form of this equation, i.e. of the form

\mathbf{A}^{*}\mathbf{X}+\mathbf{X}\mathbf{A}-\mathbf{X}\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^{*}\mathbf{X}+\mathbf{Q}=\mathbf{0},

has not been shown to be well-posed on all of the Hilbert space $H$ associated with the state of the system. This inhibits the use of the trace-class operator-analytic tools required to rigorously derive the first-order optimality system associated with the constrained optimization problems discussed in this work. The prior literature (see e.g. [3]) only indicates that the operator-valued Riccati equation (without appealing to its Bochner integral form [5]) is well-defined as an operator equation on the domain $\mathcal{D}(\mathbf{A})$. We overcome this technical hurdle in Theorem 1 by determining that the “strong” form of the operator-valued Riccati equation is in fact well posed.

The structure of the paper is the following: First, we begin in §2 with the notation that will be used throughout the work; then in §3 we analyze the well-posedness of the strong form of the operator-valued Riccati equation and its associated dual problem, along with the Lipschitz continuity of their solutions with respect to the control parameter $p\in\mathcal{P}$. Next, in §4 and §5 we analyze the penalized problems of interest, and finally we conclude this paper with a discussion of our findings in §6.

2. Notation

Let $H$ be a separable complex Hilbert space with its inner product denoted as $\left(\cdot,\cdot\right)_{H}:H\times H\rightarrow\mathbb{C}$. Throughout this work, we will denote by $\left\{e_{i}\right\}_{i=1}^{\infty}$ an orthonormal basis of $H$. This means that any element $\phi\in H$ can be represented as

\phi=\sum_{i=1}^{\infty}c_{i}e_{i},

where $\left\{c_{i}\right\}_{i=1}^{\infty}\subset\mathbb{C}$ are scalar coefficients, and that $\left(e_{i},e_{j}\right)_{H}=\delta_{ij}$ with

\delta_{ij}:=\begin{cases}1&\textrm{if }i=j\\ 0&\textrm{otherwise}\end{cases}

denoting the Kronecker delta function. We denote the space of bounded linear operators mapping $H$ into $H$ as $\mathscr{L}\left(H\right)$, where the $\mathscr{L}\left(H\right)$ norm is defined by

\left\|\mathbf{T}\right\|_{\mathscr{L}\left(H\right)}:=\sup_{\begin{subarray}{c}\phi\in H\\ \phi\neq 0\end{subarray}}\frac{\left\|\mathbf{T}\phi\right\|_{H}}{\left\|\phi\right\|_{H}}.

For any $\mathbf{T}\in\mathscr{L}\left(H\right)$, the adjoint operator $\mathbf{T}^{*}\in\mathscr{L}\left(H\right)$ is defined through the inner product by

\left(\phi,\mathbf{T}\psi\right)_{H}=\left(\mathbf{T}^{*}\phi,\psi\right)_{H}

for all $\phi,\psi\in H$.

In this work, we are also interested in the space of bounded linear operators on Banach spaces. Let $V_{1}$ and $V_{2}$ be two complex Banach spaces. We then denote the space of bounded linear operators mapping $V_{1}$ to $V_{2}$ as $\mathscr{L}\left(V_{1};V_{2}\right)$. The $\mathscr{L}\left(V_{1};V_{2}\right)$ norm is then defined by

\left\|\mathbf{T}\right\|_{\mathscr{L}\left(V_{1};V_{2}\right)}:=\sup_{\begin{subarray}{c}\phi\in V_{1}\\ \phi\neq 0\end{subarray}}\frac{\left\|\mathbf{T}\phi\right\|_{V_{2}}}{\left\|\phi\right\|_{V_{1}}}.

With the basic functional analytic notation defined, we now move on to define the more technical notions needed for this work.

2.1. Trace-Class Operators

Trace-class operators generalize the notion of finite-dimensional matrices with finite trace (i.e. the matrix Lie algebra $\mathfrak{gl}(n;\mathbb{C})$) to the setting of infinite-dimensional linear operators. The formal definition for the space of trace-class operators $\mathscr{J}_{1}(H)\subset\mathscr{L}(H)$ is given by

\mathscr{J}_{1}(H):=\left\{\mathbf{T}\in\mathscr{L}(H):|\texttt{trace}\left(\mathbf{T}\right)|<\infty\right\},

where the operator trace $\texttt{trace}\left(\cdot\right):\mathscr{J}_{1}(H)\rightarrow\mathbb{C}$ is defined by

\texttt{trace}\left(\mathbf{T}\right):=\sum_{k=1}^{\infty}\left(e_{k},\mathbf{T}e_{k}\right)_{H}

for all $\mathbf{T}\in\mathscr{J}_{1}(H)$, where $\left\{e_{k}\right\}_{k=1}^{\infty}$ again forms an orthonormal basis of the Hilbert space $H$. The $\mathscr{J}_{1}(H)$ norm is then defined by

\left\|\mathbf{T}\right\|_{1}:=|\texttt{trace}\left(\mathbf{T}\right)|,

where $|\cdot|:\mathbb{C}\rightarrow\mathbb{R}_{+}$ denotes the modulus on the field $\mathbb{C}$. From [8, Theorem 18.11], we have that $\mathscr{J}_{1}(H)$ is a two-sided *-ideal in $\mathscr{L}(H)$, meaning that for any $\mathbf{U}\in\mathscr{L}(H)$ and $\mathbf{V}\in\mathscr{J}_{1}(H)$, we have that $\mathbf{U}\mathbf{V}\in\mathscr{J}_{1}(H)$ and $\mathbf{V}\mathbf{U}\in\mathscr{J}_{1}(H)$.

Using the operator trace and the fact that $\mathscr{J}_{1}(H)$ is a two-sided *-ideal in $\mathscr{L}(H)$, we are able to induce the following definition of the duality pairing $\left<\cdot,\cdot\right>:\mathscr{L}(H)\times\mathscr{J}_{1}(H)\rightarrow\mathbb{C}$:

\left<\mathbf{U},\mathbf{V}\right>:=\texttt{trace}\left(\mathbf{U}^{*}\mathbf{V}\right)

for all $\mathbf{U}\in\mathscr{L}(H)$ and $\mathbf{V}\in\mathscr{J}_{1}(H)$. There is a one-to-one correspondence (i.e. an isometric isomorphism) between $\left<\mathbf{U},\cdot\right>:\mathscr{J}_{1}(H)\rightarrow\mathbb{C}$ and $\mathscr{J}_{1}(H)^{\prime}$, the dual space of $\mathscr{J}_{1}(H)$ [8, Theorem 19.2]. From the definition of the duality pairing and the invariance of the trace under conjugation, we have that

(1) \left<\mathbf{U},\mathbf{V}\right>=\left<\mathbf{I},\mathbf{U}^{*}\mathbf{V}\right>=\left<\mathbf{I},\mathbf{V}^{*}\mathbf{U}\right>=\left<\mathbf{V}^{*},\mathbf{U}^{*}\right>,

where we have denoted $\mathbf{I}$ as the identity element in $\mathscr{L}\left(H\right)$. This identity will be utilized frequently in the derivation of the first-order optimality conditions associated with the penalized optimization problems studied in §4 and §5.
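
The pairing and the first and last equalities in (1) are easy to check numerically in finite dimensions; the sketch below (illustrative only) uses random complex matrices as stand-ins for U in L(H) and V in J_1(H).

# Finite-dimensional sanity check of the pairing <U, V> = trace(U* V) and of identity (1).
import numpy as np

rng = np.random.default_rng(0)
n = 5
U = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
I = np.eye(n)

pair = lambda S, T: np.trace(S.conj().T @ T)      # <S, T> := trace(S* T)

print(np.isclose(pair(U, V), pair(I, U.conj().T @ V)))          # <U, V> = <I, U*V>
print(np.isclose(pair(U, V), pair(V.conj().T, U.conj().T)))     # <U, V> = <V*, U*>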

2.1.1. Symmetric Operators

Of particular interest in this work is the subspace of symmetric operators in $\mathscr{J}_{1}\left(H\right)$ and $\mathscr{L}\left(H\right)$. An operator $\mathbf{T}\in\mathscr{L}\left(H\right)$ is symmetric if

\left(\phi,\mathbf{T}\psi\right)_{H}=\left(\phi,\mathbf{T}^{*}\psi\right)_{H}

for all $\phi,\psi\in H$. The subspace of all symmetric operators in $\mathscr{L}\left(H\right)$ will be denoted as $\mathscr{L}^{s}\left(H\right)$. We will then define $\mathscr{J}_{1}^{s}(H):=\mathscr{J}_{1}(H)\cap\mathscr{L}^{s}(H)$ as the space of symmetric operators in $\mathscr{J}_{1}(H)$. The norms for the spaces $\mathscr{L}^{s}(H)$ and $\mathscr{J}_{1}^{s}(H)$ coincide with the $\mathscr{L}(H)$ and $\mathscr{J}_{1}(H)$ norms respectively.

Semi-definite operators arise frequently during the course of the discussion presented in this work. A symmetric operator $\mathbf{T}\in\mathscr{L}^{s}(H)$ is positive semi-definite if

\left(\phi,\mathbf{T}\phi\right)_{H}\geq 0

for all $\phi\in H$. A symmetric operator is then positive definite if the inequality is strict for all $\phi\neq 0$.

2.2. Exponentially Stable C0C_{0}-Semigroups

Following [12, §2], we define a $C_{0}$-semigroup as a one-parameter family of operators $\left\{\mathbf{S}(t)\in\mathscr{L}(H):t\in\mathbb{R}_{+}\cup\{0\}\right\}$ that satisfies the following:

  i) $\mathbf{S}(t)\mathbf{S}(s)=\mathbf{S}(t+s)$ for each $t,s\in\mathbb{R}_{+}$,

  ii) $\mathbf{S}(0)=\mathbf{I}$, and

  iii) $\mathbf{S}(t)\phi\in H$ is norm-continuous with respect to $t\in\mathbb{R}_{+}$ for all $\phi\in H$.

A $C_{0}$-semigroup is then said to be exponentially stable if there exist positive constants $M,\alpha\in\mathbb{R}_{+}$ satisfying

(2) \left\|\mathbf{S}(t)\right\|_{\mathscr{L}(H)}\leq Me^{-\alpha t}

for any $t\in\mathbb{R}_{+}$. An (unbounded) operator $\mathbf{A}$ is said to be the generator of $\mathbf{S}(t)$ if

(3) \mathbf{A}\phi=\lim_{h\rightarrow 0^{+}}h^{-1}\left[\mathbf{S}(h)\phi-\phi\right]\in H

for all $\phi\in\mathcal{D}(\mathbf{A})$, where

\mathcal{D}(\mathbf{A}):=\left\{\phi\in H:\left\|\mathbf{A}\phi\right\|_{H}<\infty\right\}.

The analysis provided in the proof of [12, Theorem 2.6] indicates that $\mathcal{D}(\mathbf{A})$ is densely defined in $H$.
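
In finite dimensions every $C_{0}$-semigroup is a matrix exponential; the short sketch below (with an arbitrary stable matrix standing in for the generator) illustrates the stability bound (2) and the difference-quotient definition of the generator (3). It is purely an illustration, not part of the analysis.

# Finite-dimensional illustration of (2) and (3): S(t) = expm(A t) for a stable matrix A.
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 2.0], [0.0, -3.0]])          # stable stand-in generator (eigenvalues -1, -3)
phi = np.array([1.0, 1.0])

S = lambda t: expm(A * t)

# Exponential stability: ||S(t)|| <= M exp(-alpha t) for some M, alpha > 0
for t in [0.5, 1.0, 2.0, 4.0]:
    print(t, np.linalg.norm(S(t), 2))             # decays like exp(-t), up to a constant M

# Generator recovered as the limit of the difference quotient (3)
h = 1e-6
print(np.allclose((S(h) @ phi - phi) / h, A @ phi, atol=1e-4))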

The adjoint of $\mathbf{S}(t)$, denoted by $\mathbf{S}^{*}(t)$, is also a bounded operator on $H$. This is easily observed from the following:

\left(\phi,\mathbf{S}(t)\psi\right)_{H}=\left(\mathbf{S}^{*}(t)\phi,\psi\right)_{H}

for all $\phi,\psi\in H$ and all $t\in\mathbb{R}_{+}$. This identity also indicates that

\left\|\mathbf{S}^{*}(t)\right\|_{\mathscr{L}(H)}\leq Me^{-\alpha t}

through the induced operator norm. From [12, Theorem 4.3], we have that $\mathbf{A}^{*}$, the adjoint operator of $\mathbf{A}$, is the generator of $\mathbf{S}^{*}(t)$, i.e.

(4) \mathbf{A}^{*}\psi=\lim_{h\rightarrow 0^{+}}h^{-1}\left[\mathbf{S}^{*}(h)\psi-\psi\right]\in H

for all $\psi\in\mathcal{D}(\mathbf{A}^{*})$, where we have defined

\mathcal{D}(\mathbf{A}^{*}):=\left\{\phi\in H:\lim_{h\rightarrow 0^{+}}h^{-1}\left(\mathbf{S}^{*}(h)\phi-\phi\right)\in H\right\}.

$\mathcal{D}(\mathbf{A}^{*})$ is also a densely defined subset of $H$.

Bounded perturbations of the generator of an exponentially stable $C_{0}$-semigroup also generate a semigroup, i.e. if $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ generates a semigroup $\mathbf{S}(t)$, then $\mathbf{A}-\mathbf{T}$, where $\mathbf{T}\in\mathscr{L}\left(H\right)$ is a bounded positive semi-definite operator, also generates a $C_{0}$-semigroup [12, Theorem 6.4]. The perturbed semigroup generated by $\mathbf{A}-\mathbf{T}$ is then also exponentially stable since $\mathbf{T}\in\mathscr{L}\left(H\right)$ is positive semi-definite.

3. The Operator-Valued Riccati Equation and Its Dual Equation

This section is dedicated to the study of the strong form of the operator-valued Riccati equation and its dual equation, which arises from the derivation of the first-order optimality system associated with the constrained optimization problems discussed in §4 and §5. The focus of this section is on determining the well-posedness and the Lipschitz continuity (with respect to varying control parameters $p\in\mathcal{P}$) of these equations. In this section and the remainder of this work, we will take $\mathbf{G}:=\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^{*}$ and $\mathbf{G}_{p}$ to be its parametrized analog to simplify notation.

3.1. Strong Operator-Valued Riccati Equation

We now motivate the definition of the strong form of the operator-valued Riccati equation. This form is essential in defining the Lagrangian first-order optimality system that we use to determine the well-posedness of the penalized weighted trace minimization problems studied in this work. We begin with the following.

Definition 1.

A symmetric positive semi-definite operator $\mathbf{X}\in\mathscr{L}^{s}(H)$ is said to be a solution of the strong operator-valued Riccati equation if it satisfies

(5) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}\mathbf{X}+\mathbf{Q}=\mathbf{0}

in the $\mathscr{L}(H)$ topology (i.e. $\left\|\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}\mathbf{X}+\mathbf{Q}\right\|_{\mathscr{L}(H)}=0$) with the additional condition that $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\in\mathscr{L}^{s}(H)$, where $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of an exponentially stable $C_{0}$-semigroup and the coefficient operators $\mathbf{G},\mathbf{Q}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite.

In [3, Chapter IV-1, Section 3] the notions of strict and classical solutions are presented to describe solutions to (5). The analysis regarding these solutions was done in the context of $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}$ being an operator on $\mathcal{D}(\mathbf{A}^{*})$. In contrast, we determine in Theorem 1 that $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}$ is actually a bounded operator on all of $H$. This finding opens up the possibility of utilizing trace-class operator theory in the analysis of operator-valued Riccati equations without reformulating them as Bochner integral equations as done in [5]. This is the key result that enables the derivation of the first-order optimality systems utilized in this work.

We now determine that the strong form of the operator-valued Riccati equation (5) is well-defined and is also equivalent to the Bochner integral form of the operator-valued Riccati equation (cf. [5]) in the following. It is further determined that the solution to (5) is a trace-class operator if $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$.

Theorem 1.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of an exponentially stable $C_{0}$-semigroup $\mathbf{S}(t)\in\mathscr{L}(H)$ and that $\mathbf{Q},\mathbf{G}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite operators. Then the unique positive semi-definite solution $\mathbf{X}\in\mathscr{L}^{s}(H)$ to the following Bochner integral equation

(6) \mathbf{X}=\int_{0}^{+\infty}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)dt

is the unique solution to (5).

Furthermore, if we assume that $\mathbf{Q}\in\mathscr{J}^{s}_{1}(H)$, then we have that $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ is a trace-class operator that satisfies the following bound:

(7) \left\|\mathbf{X}\right\|_{1}\leq\frac{M^{2}}{2\alpha}\left\|\mathbf{Q}\right\|_{1},

where $M,\alpha\in\mathbb{R}_{+}$ are the constants associated with the stability bound for $\mathbf{S}(t)\in\mathscr{L}(H)$ given in (2).

Proof.

Let $\mathbf{X}\in\mathscr{L}^{s}(H)$ be defined by (6) and let $\mathcal{D}(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*})$ be the domain of $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}$, i.e.

\mathcal{D}\left(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right):=\left\{\phi\in H:\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right]\phi\in H\right\}.

We will demonstrate that $\mathcal{D}(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*})=H$. Using the definition of the infinitesimal generators $\mathbf{A}$ and $\mathbf{A}^{*}$ given in (3) and (4) respectively and the definition of $\mathbf{X}\in\mathscr{L}^{s}(H)$ given by (6), we have that

\begin{aligned}
\left(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right)\zeta
&=\int_{0}^{+\infty}\left[\mathbf{A}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)+\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\mathbf{A}^{*}\right]\zeta\, dt\\
&=\lim_{h\rightarrow 0^{+}}h^{-1}\int_{0}^{+\infty}\left[\left(\mathbf{S}(t+h)-\mathbf{S}(t)\right)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)+\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\left(\mathbf{S}^{*}(t+h)-\mathbf{S}^{*}(t)\right)\right]\zeta\, dt\\
&=\lim_{h\rightarrow 0^{+}}h^{-1}\int_{0}^{+\infty}\left[\mathbf{S}(t+h)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)-\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\right]\zeta\, dt
\end{aligned}

for all $\zeta\in\mathcal{D}(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*})$. Notice here that we may already take $\zeta$ to be in $H$ without consequence since $\mathbf{S}(t)$ and $\mathbf{S}^{*}(t)$ are bounded operators on $H$. Continuing, we have that

\begin{aligned}
&\lim_{h\rightarrow 0^{+}}h^{-1}\int_{0}^{+\infty}\left[\mathbf{S}(t+h)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)-\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\right]\zeta\, dt\\
&\quad=\lim_{h\rightarrow 0^{+}}h^{-1}\left\{\lim_{\tau\rightarrow+\infty}\int_{0}^{\tau}\left[\mathbf{S}(t+h)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t+h)-\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\right]\zeta\, dt\right\}\\
&\quad=\lim_{h\rightarrow 0^{+}}h^{-1}\left\{\lim_{\tau\rightarrow+\infty}\left[\int_{h}^{\tau+h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt-\int_{0}^{\tau}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt\right]\right\}\\
&\quad=\lim_{h\rightarrow 0^{+}}h^{-1}\left[\lim_{\tau\rightarrow+\infty}\int_{\tau}^{\tau+h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt-\int_{0}^{h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt\right]\\
&\quad=\lim_{\tau\rightarrow+\infty}\left[\lim_{h\rightarrow 0^{+}}h^{-1}\int_{\tau}^{\tau+h}\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\zeta\, dt\right]-\mathbf{Q}\zeta+\mathbf{X}\mathbf{G}\mathbf{X}\zeta\\
&\quad=\lim_{\tau\rightarrow+\infty}\left[\mathbf{S}(\tau)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(\tau)\zeta\right]-\mathbf{Q}\zeta+\mathbf{X}\mathbf{G}\mathbf{X}\zeta\\
&\quad=-\mathbf{Q}\zeta+\mathbf{X}\mathbf{G}\mathbf{X}\zeta
\end{aligned}

for all $\zeta\in H$, after applying the fact that $\mathbf{S}(t)\in\mathscr{L}(H)$, and consequently also $\mathbf{S}^{*}(t)\in\mathscr{L}(H)$, vanish in the $t\rightarrow+\infty$ limit as a consequence of their exponential stability. The interchange of limits used in the above sequence of equalities is allowable owing to the continuity of the function to which the limiting operations are applied. We have therefore demonstrated that

(8) \left(\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\right)\zeta=\left(-\mathbf{Q}+\mathbf{X}\mathbf{G}\mathbf{X}\right)\zeta

for all $\zeta\in H$. This then implies that $\mathbf{X}\in\mathscr{L}^{s}(H)$, defined as a solution to (6), also necessarily satisfies (5).

Next, we demonstrate that the solution of the strong operator-valued Riccati equation (5) satisfies (6). To do this, we derive the following weak form of the operator-valued Riccati equation from (5):

(9) \left(\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}\mathbf{X}+\mathbf{Q}\right]\phi,\psi\right)_{H}=0

for all $\phi,\psi\in\mathcal{D}(\mathbf{A}^{*})$. It then follows from [6, Proposition 4] that (9) is equivalent to the Bochner integral form of the operator-valued Riccati equation (6). Therefore, a solution $\mathbf{X}\in\mathscr{L}^{s}(H)$ satisfying (6) also satisfies (5). The uniqueness of the $\mathbf{X}\in\mathscr{L}^{s}(H)$ that satisfies (5) is a consequence of the well-posedness of (6) determined in [5]. The argument presented in the previous paragraph, together with this paragraph, indicates that there exists only one positive semi-definite solution to (5) and that this solution coincides with the solution of (6). Symmetry of $\mathbf{X}$ is easily determined through inspection by taking the adjoint of both sides of (5).

We conclude the proof by deriving the solution bound (7) under the assumption that $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$. To that end, we consider the following weak integral form of the operator-valued Riccati equation:

(10) \left(\psi,\mathbf{X}\phi\right)_{H}=\int_{0}^{+\infty}\left(\psi,\mathbf{S}(t)\left(\mathbf{Q}-\mathbf{X}\mathbf{G}\mathbf{X}\right)\mathbf{S}^{*}(t)\phi\right)_{H}dt

for all $\phi,\psi\in H$. It is clear that the solution $\mathbf{X}\in\mathscr{L}^{s}(H)$ to (6) also satisfies (10). By choosing $\phi=\psi=e_{i}$, where $\left\{e_{i}\right\}_{i=1}^{\infty}$ forms an orthonormal basis of $H$, we then have that

\left(e_{i},\mathbf{X}e_{i}\right)_{H}+\int_{0}^{+\infty}\left(e_{i},\mathbf{S}(t)\mathbf{X}\mathbf{G}\mathbf{X}\mathbf{S}^{*}(t)e_{i}\right)_{H}dt=\int_{0}^{+\infty}\left(e_{i},\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)e_{i}\right)_{H}dt.

Since $\mathbf{X}\in\mathscr{L}^{s}(H)$ and $\mathbf{G}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite, both terms on the left-hand side of the above equation are nonnegative; indeed,

\left(e_{i},\mathbf{S}(t)\mathbf{X}\mathbf{G}\mathbf{X}\mathbf{S}^{*}(t)e_{i}\right)_{H}=\left\|\mathbf{G}^{\frac{1}{2}}\mathbf{X}\mathbf{S}^{*}(t)e_{i}\right\|_{H}^{2}\geq 0,

where $\mathbf{G}^{\frac{1}{2}}\in\mathscr{L}(H)$ denotes the operator square root [1, Theorem 7.38] of $\mathbf{G}\in\mathscr{L}^{s}(H)$. From this, it follows that

\left(e_{i},\mathbf{X}e_{i}\right)_{H}\leq\int_{0}^{+\infty}\left(e_{i},\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)e_{i}\right)_{H}dt,

where summing both sides over all $i\in\mathbb{N}$ yields

\begin{aligned}
\texttt{trace}\left(\mathbf{X}\right)&\leq\int_{0}^{+\infty}\texttt{trace}\left(\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)\right)dt\\
&\leq\int_{0}^{+\infty}\left\|\mathbf{S}^{*}(t)\right\|^{2}_{\mathscr{L}\left(H\right)}\left\|\mathbf{Q}\right\|_{1}dt\\
&\leq M^{2}\left\|\mathbf{Q}\right\|_{1}\int_{0}^{+\infty}e^{-2\alpha t}dt\\
&=\frac{M^{2}}{2\alpha}\left\|\mathbf{Q}\right\|_{1},
\end{aligned}

and hence $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$, and (7) follows from seeing that $\texttt{trace}\left(\mathbf{X}\right)=\left\|\mathbf{X}\right\|_{1}$ because $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ is symmetric positive semi-definite. Since $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$, we also have that $\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}\in\mathscr{J}_{1}^{s}(H)$. ∎
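
As a sanity check on Theorem 1, the following finite-dimensional sketch (illustrative only; the matrices are arbitrary stand-ins) verifies that the Riccati solution satisfies the strong form (5), agrees with the Bochner integral form (6) computed by quadrature, and obeys the trace bound (7) with M = 1 for a symmetric negative definite generator.

# Finite-dimensional check of Theorem 1 (illustrative only).
import numpy as np
from scipy.linalg import solve_continuous_are, expm
from scipy.integrate import quad_vec

rng = np.random.default_rng(1)
n = 6
Asym = rng.standard_normal((n, n))
A = -(Asym @ Asym.T + n * np.eye(n))              # symmetric negative definite, so M = 1
B = rng.standard_normal((n, 2))
R = np.eye(2)
Q = np.diag(rng.random(n))                        # symmetric positive semi-definite, trace class
G = B @ np.linalg.inv(R) @ B.T

# Passing A.T makes solve_continuous_are return the solution of
# A X + X A^T - X G X + Q = 0, i.e. the orientation used in (5).
X = solve_continuous_are(A.T, B, Q, R)
print(np.allclose(A @ X + X @ A.T - X @ G @ X + Q, 0, atol=1e-8))   # strong form (5)

# Bochner integral form (6), evaluated by quadrature over a long horizon
integrand = lambda t: expm(A * t) @ (Q - X @ G @ X) @ expm(A * t).T
X_int, _ = quad_vec(integrand, 0.0, 60.0)
print(np.allclose(X, X_int, atol=1e-6))

# Trace bound (7): trace(X) <= M^2/(2 alpha) trace(Q), with M = 1 and alpha = -max(eig(A))
alpha = -np.max(np.linalg.eigvalsh(A))
print(np.trace(X) <= np.trace(Q) / (2 * alpha) + 1e-12)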

We proceed with a discussion on the strong form of the Sylvester equation in the following subsection.

3.2. Sylvester’s Equation

In the analysis presented in the following subsections, we frequently encounter the strong form of the operator-valued Sylvester equation (it is “strong” in the same sense that (5) is the strong form of the operator-valued Riccati equation), given by

(11) \mathbf{A}_{1}\mathbf{T}+\mathbf{T}\mathbf{A}_{2}^{*}=\mathbf{P},

where $\mathbf{A}_{1}:=\mathbf{A}-\mathbf{K}_{1}$ and $\mathbf{A}_{2}:=\mathbf{A}-\mathbf{K}_{2}$ are defined as generators of exponentially stable perturbed $C_{0}$-semigroups $\mathbf{S}_{1}(t)\in\mathscr{L}\left(H\right)$ and $\mathbf{S}_{2}(t)\in\mathscr{L}\left(H\right)$ respectively for all $t\in\mathbb{R}_{+}$, with $\mathbf{K}_{1},\mathbf{K}_{2}\in\mathscr{L}\left(H\right)$ being positive semi-definite operators. Because of the way $\mathbf{A}_{1},\mathbf{A}_{2}$ are defined, their domains coincide with $\mathcal{D}(\mathbf{A})$, and hence (11) is well-defined. With this, we determine the following.

Lemma 1.

Assume that $\mathbf{A}_{1},\mathbf{A}_{2}:\mathcal{D}(\mathbf{A})\rightarrow H$ are generators of exponentially stable $C_{0}$-semigroups $\mathbf{S}_{1}(t)\in\mathscr{L}\left(H\right)$ and $\mathbf{S}_{2}(t)\in\mathscr{L}\left(H\right)$ for all $t\in\mathbb{R}_{+}$, and that $\mathbf{P}\in\mathscr{L}\left(H\right)$. Then the unique solution $\mathbf{T}\in\mathscr{L}\left(H\right)$ of (11) is given by the following Bochner integral representation:

(12) \mathbf{T}=-\int_{0}^{+\infty}\mathbf{S}_{1}(t)\mathbf{P}\mathbf{S}_{2}^{*}(t)dt.

Proof.

We begin by first demonstrating that the integral in (12) is well-defined. This is done by showing that the norm of the integrand is integrable over all of $\mathbb{R}_{+}$ [10, Section 2, Theorem 2]. We verify this claim in the following:

\begin{aligned}
\int_{0}^{+\infty}\left\|\mathbf{S}_{1}(t)\mathbf{P}\mathbf{S}_{2}^{*}(t)\right\|_{\mathscr{L}\left(H\right)}dt&\leq\int_{0}^{+\infty}M_{1}M_{2}e^{-\alpha_{1}t}e^{-\alpha_{2}t}\left\|\mathbf{P}\right\|_{\mathscr{L}\left(H\right)}dt\\
&\leq M_{*}^{2}\left\|\mathbf{P}\right\|_{\mathscr{L}\left(H\right)}\int_{0}^{+\infty}e^{-2\alpha_{*}t}dt\\
&\leq\frac{M_{*}^{2}}{2\alpha_{*}}\left\|\mathbf{P}\right\|_{\mathscr{L}\left(H\right)},
\end{aligned}

where $M_{1},M_{2},\alpha_{1},\alpha_{2}\in\mathbb{R}_{+}$ are the stability constants associated with the exponentially stable $C_{0}$-semigroups $\mathbf{S}_{1}(t),\mathbf{S}_{2}(t)\in\mathscr{L}\left(H\right)$ respectively, $M_{*}:=\max\left\{M_{1},M_{2}\right\}$, and $\alpha_{*}:=\min\left\{\alpha_{1},\alpha_{2}\right\}$. Utilizing (12) in (11) and following a derivation similar to that in the first paragraph of the proof of Theorem 1 demonstrates that $\mathbf{T}$ defined by (12) is a solution of the strong form of Sylvester's equation. Uniqueness follows as a consequence of the linearity of the equation. ∎
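
A finite-dimensional analog of Lemma 1 can be checked directly: the sketch below (with arbitrary stable diagonal matrices standing in for the generators) compares the Sylvester solution with the integral representation (12). It is an illustration under these assumptions, not part of the proof.

# Finite-dimensional sketch of Lemma 1: A1 T + T A2^T = P versus the integral representation (12).
import numpy as np
from scipy.linalg import solve_sylvester, expm
from scipy.integrate import quad_vec

rng = np.random.default_rng(2)
n = 5
A1 = -np.diag(rng.uniform(1.0, 3.0, n))           # stable stand-ins for A - K_1 and A - K_2
A2 = -np.diag(rng.uniform(1.0, 3.0, n))
P = rng.standard_normal((n, n))

T_direct = solve_sylvester(A1, A2.T, P)           # solves A1 T + T A2^T = P
T_integral, _ = quad_vec(lambda t: expm(A1 * t) @ P @ expm(A2 * t).T, 0.0, 50.0)
print(np.allclose(T_direct, -T_integral, atol=1e-6))   # T = -integral, as in (12)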

We now move on to discuss the dual problem to (5) that arises in the first-order optimality system of the penalized optimization problems presented in §4 and §5.

3.3. Dual Problem

The dual problem arises in the derivation of the first-order optimality system by determining the Fréchet derivative of the Lagrangian saddle-point functional with respect to the primal variable $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$. We will go through its derivation in §4.2.1. For now, we will simply state the strong form of the dual problem and determine its well-posedness.

The dual problem to (5) is stated as follows: Seek a $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ that satisfies

(13) \left(\mathbf{A}^{*}-\mathbf{G}\mathbf{X}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left(\mathbf{A}-\mathbf{X}\mathbf{G}\right)=-\mathbf{W},

where $\mathbf{A}$ and $\mathbf{A}^{*}$ are the generators of the exponentially stable $C_{0}$-semigroups $\mathbf{S}(t)$ and $\mathbf{S}^{*}(t)$ respectively, $\mathbf{G}\in\mathscr{J}_{1}^{s}(H)$ and $\mathbf{W}\in\mathscr{L}^{s}(H)$ are symmetric positive semi-definite operators, and $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ is the solution to (5). The well-posedness of (13) is determined in the following.

Lemma 2.

Let $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ be the solution to (13). Then there exist positive constants $M,\alpha\in\mathbb{R}_{+}$ satisfying

\left\|\boldsymbol{\Lambda}\right\|_{\mathscr{L}\left(H\right)}\leq\frac{M^{2}}{2\alpha}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}.

Furthermore, the solution $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is symmetric positive semi-definite.

Proof.

Let $\mathbf{T}(t)\in\mathscr{L}\left(H\right)$ be the exponentially stable $C_{0}$-semigroup generated by $\mathbf{A}^{*}-\mathbf{G}\mathbf{X}$. Because (13) is a Sylvester equation, we have from Lemma 1 that $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ can be represented in the following Bochner integral form:

(14) \boldsymbol{\Lambda}=\int_{0}^{+\infty}\mathbf{T}(t)\mathbf{W}\mathbf{T}^{*}(t)dt.

Because $-\mathbf{G}\mathbf{X}\in\mathscr{L}\left(H\right)$ is a stabilizing perturbation of $\mathbf{A}^{*}:\mathcal{D}(\mathbf{A}^{*})\rightarrow H$, we have that $\left\|\mathbf{T}(t)\right\|_{\mathscr{L}\left(H\right)}\leq Me^{-\alpha t}$. It then follows that

\left\|\boldsymbol{\Lambda}\right\|_{\mathscr{L}\left(H\right)}\leq M^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\int_{0}^{+\infty}e^{-2\alpha t}dt=\frac{M^{2}}{2\alpha}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)},

from which the bound presented in the lemma follows. The symmetric positive semi-definite nature of $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ comes from inspecting (14), where taking the adjoint of both sides of the equation immediately verifies this claim. ∎
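
In finite dimensions the dual problem (13) reduces to a Lyapunov equation for the closed-loop matrix; the sketch below (with illustrative stand-in matrices only) computes Λ this way and checks its positive semi-definiteness.

# Finite-dimensional sketch of the dual problem (13): a Lyapunov solve with Acl = A - X G.
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(3)
n = 6
A = -np.diag(rng.uniform(1.0, 3.0, n))
B = rng.standard_normal((n, 2)); R = np.eye(2); Q = np.eye(n)
G = B @ np.linalg.inv(R) @ B.T
z0 = rng.standard_normal(n)
W = np.outer(z0, z0)                               # W = z0 (z0, .)_H

X = solve_continuous_are(A.T, B, Q, R)             # primal: A X + X A^T - X G X + Q = 0
Acl = A - X @ G
Lam = solve_continuous_lyapunov(Acl.T, -W)         # solves Acl^T Lam + Lam Acl = -W
print(np.allclose((A.T - G @ X) @ Lam + Lam @ (A - X @ G), -W, atol=1e-8))       # equation (13)
print(np.all(np.linalg.eigvalsh((Lam + Lam.T) / 2) >= -1e-10))                   # Lambda is psd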

3.4. Parametrized Control Device Operator

In many application problems, the operator $\mathbf{G}\in\mathscr{L}^{s}(H)$ is parametrized by a set of parameters, i.e., it is a function that maps a parameter space $\mathcal{P}$ into the operator space $\mathscr{L}^{s}(H)$. This parameter space corresponds to, e.g., control device placement locations [16] or geometric design parameters [11]. We formalize the definition of the parametrized operators $\mathbf{G}_{p}$ for $p\in\mathcal{P}$ in the following.

Let $\mathcal{P}$ be a complex Hilbert space with its norm $\left\|\cdot\right\|_{\mathcal{P}}$ induced by the inner product $\left(\cdot,\cdot\right)_{\mathcal{P}}$, and let $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$ be a trace-class symmetric positive semi-definite operator parametrized by a parameter $p\in\mathcal{P}$ so that the mapping $p\mapsto\mathbf{G}_{p}$ is twice Fréchet differentiable with respect to $p\in\mathcal{P}$, its first derivative $\frac{\partial\mathbf{G}_{p}}{\partial p}\left(\cdot\right)$ is bounded as an operator in $\mathscr{L}\left(\mathcal{P};\mathscr{J}_{1}^{s}(H)\right)$, and its second derivative $\frac{\partial^{2}\mathbf{G}_{p}}{\partial p^{2}}(\cdot,\cdot)$ is bounded in $\mathscr{L}\left(\mathcal{P};\mathscr{L}\left(\mathcal{P};\mathscr{J}_{1}^{s}(H)\right)\right)$. The first assumption we make on $\mathbf{G}_{\left(\cdot\right)}$ is that it is uniformly bounded with respect to any $p\in\mathcal{P}$, i.e. there exists a positive constant $g\in\mathbb{R}_{+}$ that satisfies

(15) \left\|\mathbf{G}_{p}\right\|_{1}\leq g

for all $p\in\mathcal{P}$. Because we have assumed that $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$ is Fréchet differentiable with a bounded derivative for all $p\in\mathcal{P}$, it follows that it is Lipschitz continuous, i.e. there exists a positive constant $L_{\mathbf{G}}\in\mathbb{R}_{+}$ so that

(16) \left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{1}\leq L_{\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for all $p_{1},p_{2}\in\mathcal{P}$. To reduce notational clutter, we will denote $\mathbf{d}\mathbf{G}_{p}\left(\cdot\right):=\frac{\partial\mathbf{G}_{p}}{\partial p}(\cdot)$. We will further assume that $\mathbf{d}\mathbf{G}_{p}$ is Lipschitz continuous, i.e. there exists a positive constant $L_{\mathbf{d}\mathbf{G}}\in\mathbb{R}_{+}$ that satisfies

(17) \left\|\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}\left(\mathcal{P};\mathscr{J}_{1}^{s}(H)\right)}\leq L_{\mathbf{d}\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for all $p_{1},p_{2}\in\mathcal{P}$. We will further assume that

(18) \mathbf{d}\mathbf{G}_{p}\neq 0

for any $p\in\mathcal{P}$ and that

(19) \left[\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{d}\mathbf{G}_{p}\right]^{-1}\in\mathscr{L}\left(\mathcal{P}\right).

Finally, the twice-differentiability assumption implies that

(20) \mathbf{d}^{2}\mathbf{G}_{p}(q,r)<\infty

for all $q,r\in\mathcal{P}$, where we have denoted $\mathbf{d}^{2}\mathbf{G}_{p}(\cdot,\cdot):=\frac{\partial^{2}\mathbf{G}_{p}}{\partial p^{2}}(\cdot,\cdot)$. The satisfaction of this assumption is one of the necessary conditions for the second variation of the Lagrangian functionals studied in this work to be bounded.

Under the parametrization of $\mathbf{G}_{p}$ with respect to $p\in\mathcal{P}$, we have that $\left(\mathbf{X},\boldsymbol{\Lambda}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)$ is the solution to the following coupled equations:

(21a) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=\mathbf{0}
(21b) \left(\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left(\mathbf{A}-\mathbf{X}\mathbf{G}_{p}\right)=-\mathbf{W},

where $\mathbf{X}=\mathbf{X}(p)$ and $\boldsymbol{\Lambda}=\boldsymbol{\Lambda}(p)$. We determine in the following that both $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ are Lipschitz continuous functions of $p\in\mathcal{P}$.
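
As a concrete (and purely hypothetical) instance of such a parametrization, one may take G_p = B(p) R^{-1} B(p)^* with a Gaussian actuator profile centered at the placement parameter p; the sketch below also approximates dG_p by central differences, which is only an illustration of the assumed differentiability, not the paper's construction.

# A hypothetical parametrization G_p on a truncated state space, with a finite-difference dG_p.
import numpy as np

n = 40
xs = np.linspace(0.0, 1.0, n)
R_inv = np.array([[1.0]])

def B(p, width=0.05):
    return np.exp(-0.5 * ((xs - p) / width) ** 2).reshape(n, 1)

def G(p):
    return B(p) @ R_inv @ B(p).T                   # symmetric positive semi-definite, finite trace

def dG(p, eps=1e-6):
    return (G(p + eps) - G(p - eps)) / (2 * eps)   # central-difference approximation of dG_p

# Lipschitz-type behaviour of p -> G_p in the trace (nuclear) norm:
p1, p2 = 0.30, 0.31
print(np.linalg.norm(G(p1) - G(p2), ord="nuc") / abs(p1 - p2))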

3.5. Lipschitz Continuity of the Primal and Dual Solutions

In each of the first-order optimality systems associated with the penalized constrained optimization problems lies a fixed-point equation that must be satisfied. We will determine that the solutions of the primal problem (21a) and the dual problem (21b) are Lipschitz continuous functions of $p\in\mathcal{P}$ if $\mathbf{G}_{\left(\cdot\right)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$ satisfies the assumptions prescribed in §3.4. These Lipschitz continuity bounds will then be used to determine that each fixed-point equation has only one fixed point. A more detailed discussion of these fixed-point problems will be presented in §4 and §5 respectively. For now, we focus exclusively on proving that $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ are Lipschitz continuous with respect to $p\in\mathcal{P}$.

Throughout this section, we will denote by $p_{1},p_{2}\in\mathcal{P}$ any two arbitrary parameters and by $\mathbf{X}_{1},\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}(H)$ the solutions to

(22) \mathbf{A}\mathbf{X}_{i}+\mathbf{X}_{i}\mathbf{A}^{*}-\mathbf{X}_{i}\mathbf{G}_{p_{i}}\mathbf{X}_{i}+\mathbf{Q}=\mathbf{0}

for $i=1,2$. Likewise, we denote by $\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2}\in\mathscr{L}^{s}(H)$ the solutions to

(23) \left(\mathbf{A}^{*}-\mathbf{G}_{p_{i}}\mathbf{X}_{i}\right)\boldsymbol{\Lambda}_{i}+\boldsymbol{\Lambda}_{i}\left(\mathbf{A}-\mathbf{X}_{i}\mathbf{G}_{p_{i}}\right)=-\mathbf{W}

for $i=1,2$. We begin our analysis by determining that $\mathbf{X}(p)$ is a Lipschitz continuous function of $p\in\mathcal{P}$ in the following.

Lemma 3.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of an exponentially stable $C_{0}$-semigroup and that $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$ is a symmetric positive semi-definite operator. Further assume that $\mathbf{G}_{\left(\cdot\right)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$ is Lipschitz continuous on $\mathcal{P}$, satisfying (16). Then the solution $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ to (22) is a Lipschitz continuous function of $p\in\mathcal{P}$. Furthermore, there exist positive constants $L_{\mathbf{G}},M,\alpha\in\mathbb{R}_{+}$ so that

\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\leq\frac{L_{\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|^{2}_{1}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}},

where we have denoted by $\mathbf{X}_{1},\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}(H)$ the solutions to (22) with the coefficient operators $\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}\in\mathscr{J}_{1}^{s}(H)$ determined by $p_{1},p_{2}\in\mathcal{P}$ respectively.

Proof.

We begin by taking the difference between the equations for $\mathbf{X}_{1}\in\mathscr{J}_{1}^{s}\left(H\right)$ and $\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}(H)$ respectively (see (22)). We have that the difference $\mathbf{X}_{1}-\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}\left(H\right)$ is then the solution to the following Sylvester equation:

\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]\left(\mathbf{X}_{1}-\mathbf{X}_{2}\right)+\left(\mathbf{X}_{1}-\mathbf{X}_{2}\right)\left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]=\mathbf{X}_{1}\left(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right)\mathbf{X}_{2}.

It then follows from Lemma 1 that the solution to the above equation satisfies the following integral equation:

(24) \left(\mathbf{X}_{1}-\mathbf{X}_{2}\right)=-\int_{0}^{+\infty}\mathbf{T}_{1}(t)\mathbf{X}_{1}\left(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right)\mathbf{X}_{2}\mathbf{T}_{2}(t)dt,

where $\mathbf{T}_{1}(t),\mathbf{T}_{2}(t)\in\mathscr{L}\left(H\right)$ are the $C_{0}$-semigroups generated by $\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]:\mathcal{D}\left(\mathbf{A}\right)\rightarrow H$ and $\left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]:\mathcal{D}\left(\mathbf{A}^{*}\right)\rightarrow H$ respectively.

Since $\mathbf{A}$ is the generator of an exponentially stable $C_{0}$-semigroup and $\mathbf{X}_{1}\mathbf{G}_{p_{1}}\in\mathscr{J}_{1}^{s}(H)$ and $\mathbf{G}_{p_{2}}\mathbf{X}_{2}\in\mathscr{J}_{1}^{s}\left(H\right)$ are bounded nonnegative operators, we have that $\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]:\mathcal{D}(\mathbf{A})\rightarrow H$ and $\left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]:\mathcal{D}(\mathbf{A}^{*})\rightarrow H$ are also generators of exponentially stable $C_{0}$-semigroups. Furthermore, it follows that

(25) \left\|\mathbf{T}_{1}(t)\right\|_{\mathscr{L}\left(H\right)}\leq Me^{-\alpha t}\textrm{ and }\left\|\mathbf{T}_{2}(t)\right\|_{\mathscr{L}\left(H\right)}\leq Me^{-\alpha t}

for all $t\in\mathbb{R}_{+}$, where $M,\alpha$ are the same constants associated with the unperturbed semigroup $\mathbf{S}(t)\in\mathscr{L}(H)$.

Taking the $\mathscr{J}_{1}\left(H\right)$ norm of both sides of (24) then yields

\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\leq\int_{0}^{+\infty}\left\|\mathbf{T}_{1}(t)\mathbf{X}_{1}\left(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right)\mathbf{X}_{2}\mathbf{T}_{2}(t)\right\|_{1}dt

after applying the definition of the operator trace. It then follows that

\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\leq\frac{M^{2}}{2\alpha}\left\|\mathbf{X}_{1}\right\|_{1}\left\|\mathbf{X}_{2}\right\|_{1}\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(H)}

after applying the fact that $\mathscr{J}_{1}(H)$ is a two-sided *-ideal in $\mathscr{L}(H)$ and the bounds provided in (25). Applying (7) in the statement of Theorem 1 then yields

\begin{aligned}
\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}&\leq\frac{M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|^{2}_{1}\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}\left(H\right)}\\
&\leq\frac{L_{\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|^{2}_{1}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}},
\end{aligned}

after applying the Lipschitz continuity assumption on $\mathbf{G}_{\left(\cdot\right)}:\mathcal{P}\rightarrow\mathscr{L}^{s}(H)$. ∎
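
The Lipschitz behaviour asserted in Lemma 3 can be observed numerically for the hypothetical Gaussian-actuator parametrization used in the earlier sketches; the difference quotients below remain bounded as the parameter increment shrinks (an illustration only, not a proof).

# Numerical illustration of Lemma 3 for a hypothetical parametrization (illustrative only).
import numpy as np
from scipy.linalg import solve_continuous_are

n = 40
xs = np.linspace(0.0, 1.0, n)
A = -np.diag(np.arange(1, n + 1, dtype=float))
Q = np.eye(n) / n
R = np.array([[1.0]])

B = lambda p: np.exp(-0.5 * ((xs - p) / 0.05) ** 2).reshape(n, 1)
X = lambda p: solve_continuous_are(A.T, B(p), Q, R)   # A X + X A^T - X G_p X + Q = 0

for dp in [1e-1, 1e-2, 1e-3]:
    ratio = np.linalg.norm(X(0.5 + dp) - X(0.5), ord="nuc") / dp
    print(dp, ratio)                                   # trace-norm difference quotients stay bounded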

With Lemma 2, we are now able to prove that $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is a Lipschitz continuous function of $p\in\mathcal{P}$ in the following.

Lemma 4.

Let $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ satisfy (21b). Then $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is a Lipschitz continuous function of $p\in\mathcal{P}$, and there exist positive constants $M,\alpha\in\mathbb{R}_{+}$ so that

\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left(H\right)}\leq\left(\frac{M^{10}g}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}\right)L_{\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for any $p_{1},p_{2}\in\mathcal{P}$, where $g:=\sup_{p\in\mathcal{P}}\left\|\mathbf{G}_{p}\right\|_{\mathscr{L}(H)}$.

Proof.

We begin by taking the difference of the equations (23) between $i=1,2$. We arrive at the following Sylvester equation:

(26) \left[\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}\right]\left(\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right)+\left(\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right)\left[\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}\right]=F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}),

where we have denoted

F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}):=\mathbf{G}_{p_{1}}(\mathbf{X}_{1}-\mathbf{X}_{2})\boldsymbol{\Lambda}_{1}+(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})\mathbf{X}_{2}\boldsymbol{\Lambda}_{1}+\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})+\boldsymbol{\Lambda}_{2}(\mathbf{X}_{1}-\mathbf{X}_{2})\mathbf{G}_{p_{1}}.

Let now $\mathbf{T}_{2}(t)\in\mathscr{L}\left(H\right)$ be the exponentially stable $C_{0}$-semigroup generated by $\mathbf{A}^{*}-\mathbf{G}_{p_{2}}\mathbf{X}_{2}$ and $\mathbf{T}_{1}(t)\in\mathscr{L}\left(H\right)$ be the exponentially stable $C_{0}$-semigroup generated by $\mathbf{A}-\mathbf{X}_{1}\mathbf{G}_{p_{1}}$. With Lemma 1, we have that (26) can be written in the following equivalent Bochner integral form:

(\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2})=-\int_{0}^{+\infty}\mathbf{T}_{2}(t)F\left(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}\right)\mathbf{T}_{1}(t)dt.

Taking the $\mathscr{L}\left(H\right)$ norm of both sides then allows us to see that

(27) \begin{aligned}
\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left(H\right)}&\leq\int_{0}^{+\infty}M^{2}e^{-2\alpha t}\left\|F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}dt\\
&=\frac{M^{2}}{2\alpha}\left\|F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}.
\end{aligned}

We now bound $\left\|F\left(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}}\right)\right\|_{\mathscr{L}\left(H\right)}$. Recall (15), where we have assumed $\left\|\mathbf{G}_{p}\right\|_{\mathscr{L}(H)}\leq g$ for all $p\in\mathcal{P}$. We then have

\begin{aligned}
\left\|F(\mathbf{X}_{1},\mathbf{X}_{2},\boldsymbol{\Lambda}_{1},\boldsymbol{\Lambda}_{2},\mathbf{G}_{p_{1}},\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}
&\leq\left\|\mathbf{G}_{p_{1}}(\mathbf{X}_{1}-\mathbf{X}_{2})\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left(H\right)}+\left\|(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})\mathbf{X}_{2}\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left(H\right)}\\
&\qquad+\left\|\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}(\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}})\right\|_{\mathscr{L}\left(H\right)}+\left\|\boldsymbol{\Lambda}_{2}(\mathbf{X}_{1}-\mathbf{X}_{2})\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}\left(H\right)}\\
&\leq\left(\frac{M^{8}g}{8\alpha^{4}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}\right)\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(H)},
\end{aligned}

after applying Theorem 1 and Lemmas 3 and 2. Inserting this bound into (27) then results in

\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left(H\right)}\leq\left(\frac{M^{10}g}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}\right)\left\|\mathbf{G}_{p_{1}}-\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(H)}.

Applying the Lipschitz continuity assumption (16) yields the result of this lemma. ∎

3.6. The Critical Cone

The second-order sufficient optimality condition requires that the Hessian of the Lagrangian, evaluated at a stationary point, be positive definite in the directions lying in the critical cone associated with the constraint. Loosely speaking, the critical cone is the subset of the tangent space of the constraint manifold at $\left(\mathbf{X}_{opt},p_{opt}\right)$ along which the first Gâteaux (directional) derivative of the constraint vanishes. We will utilize the second-order optimality condition to demonstrate that the solution to the first-order optimality system is indeed the unique minimizer of the associated penalized constrained optimization problems studied in this work.

Let us define

c(𝐗,p):=𝐀𝐗+𝐗𝐀𝐗𝐆p𝐗+𝐐c(\mathbf{X},p):=\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}

to be the constraint function associated with the operator-valued Riccati equation, where again 𝐐𝒥1s(H)\mathbf{Q}\in\mathscr{J}_{1}^{s}(H) and 𝐆():𝒫𝒥1s(H)\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H) satisfies (15), (16), and (17). The critical cone for the constrained optimizer is defined by the following set

𝒦(𝐗opt,popt):={(𝚽,q)𝒥1s(H)×𝒫:c𝐗|(𝐗opt,p)opt(𝚽)=0 and cp|(𝐗opt,popt)(q)=0}.\mathcal{K}(\mathbf{X}_{opt},p_{opt}):=\left\{\left({\boldsymbol{\Phi},q}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}:\left.\frac{\partial c}{\partial\mathbf{X}}\right|_{\left({\mathbf{X}_{opt},p}\right)_{opt}}(\boldsymbol{\Phi})=0\textrm{ and }\left.\frac{\partial c}{\partial p}\right|_{\left({\mathbf{X}_{opt},p_{opt}}\right)}(q)=0\right\}.

We now characterize 𝒦(𝐗opt,popt)\mathcal{K}(\mathbf{X}_{opt},p_{opt}). Taking the first Gatéux derivative of c(,)c(\cdot,\cdot) with respect to 𝐗𝒥1s(H)\mathbf{X}\in\mathscr{J}_{1}^{s}(H) yields

\displaystyle\frac{\partial c}{\partial\mathbf{X}}(\boldsymbol{\Phi})=\mathbf{A}\boldsymbol{\Phi}+\boldsymbol{\Phi}\mathbf{A}^{*}-\boldsymbol{\Phi}\mathbf{G}_{p}\mathbf{X}-\mathbf{X}\mathbf{G}_{p}\boldsymbol{\Phi}
\displaystyle=\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)\boldsymbol{\Phi}+\boldsymbol{\Phi}\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)
\displaystyle=0

for all $\boldsymbol{\Phi}\in\mathscr{J}_{1}^{s}(H)$. Because $\boldsymbol{\Phi}\in\mathscr{J}_{1}^{s}(H)$ is now the solution of a Sylvester equation (see (11)) with $\mathbf{0}$ as the data, we have that $\boldsymbol{\Phi}\equiv\mathbf{0}$. Continuing, we have that

\frac{\partial c}{\partial p}(q)=-\mathbf{X}\mathbf{d}\mathbf{G}_{p}(q)\mathbf{X}=0.

In general, the set of $q\in\mathcal{P}$ satisfying the above condition is nonempty. Combining the two observations made above, we have that $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$ can be characterized by the following

(28) \mathcal{K}(\mathbf{X}_{opt},p_{opt}):=\left\{\left({\mathbf{0},q}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}:-\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\mathbf{X}_{opt}=\mathbf{0}\right\}.
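As a concrete illustration of (28), the following finite-dimensional sketch computes the parameter directions $q$ with $\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\mathbf{X}_{opt}=\mathbf{0}$. The matrices, the linear parametrization $\mathbf{d}\mathbf{G}_{p}(q)=q_{1}\mathbf{G}_{1}+q_{2}\mathbf{G}_{2}+q_{3}\mathbf{G}_{3}$, and the deliberately redundant third parameter are hypothetical stand-ins introduced only for this sketch; they are not part of the analysis above.

import numpy as np

rng = np.random.default_rng(4)
n = 3
C = rng.standard_normal((n, n))
X = C @ C.T + np.eye(n)                       # stand-in for X_opt (positive definite)
b1, b2 = rng.standard_normal(n), rng.standard_normal(n)
G1, G2 = np.outer(b1, b1), np.outer(b2, b2)
Gs = [G1, G2, G1 + G2]                        # redundant third parameter on purpose

# Assemble the linear map q -> vec(X dG_p(q) X) and read off its kernel via an SVD.
Mmap = np.column_stack([(X @ Gi @ X).ravel() for Gi in Gs])
_, s, Vt = np.linalg.svd(Mmap)
kernel = Vt[np.sum(s > 1e-10 * s[0]):]
print(kernel)   # one row, proportional to (1, 1, -1): a nontrivial critical direction

The redundancy in the third parameter is what produces a nontrivial kernel here; with independent rank-one pieces the kernel would generically be trivial.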

With the preliminary analysis completed, we are now ready to present and analyze the two penalized constrained optimization problems of interest in the following two sections of this paper.

4. Control Penalized Constrained Trace Minimization

4.1. Problem Statement

The control penalization technique [19] is a well-known and often applied method of regularizing the solution of optimization problems. It has the benefit of implicitly constraining the parameter space and improving the well-posedness properties of the optimization problem. We study this technique in the context of control device design and placement in this section. The ideas presented in this section will serve as a pedagogical stepping stone for the more complex arguments needed to analyze the problem presented in the following section.

The control penalized optimization problem of interest is to seek a $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ that minimizes

(29) \mathcal{J}_{\beta}(\mathbf{X},p):=\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)+\frac{\beta}{2}\left\|p\right\|_{\mathcal{P}}^{2}

constrained by the strong operator-valued Riccati equation

(30) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=\mathbf{0},

where $\mathbf{W}\in\mathscr{L}^{s}(H)$ is a symmetric positive semi-definite weighting operator, $\mathbf{A}:\mathcal{D}(\mathbf{A})\subset H\rightarrow H$ is the generator of the exponentially stable $C_{0}$-semigroup $\mathbf{S}(t)\in\mathscr{L}(H)$, $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$ is a nonnegative operator, and $\mathbf{G}_{\left({\cdot}\right)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$ is the parametrized operator associated with the control device. The definition of $\mathcal{J}_{\beta}(\cdot)$ presented in (29) is equivalent to the cost functional presented in the introduction of this work. This is because for each $p\in\mathcal{P}$, there is only one $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ that satisfies (30).

In the following discussion, we first present the first-order optimality system associated with the penalized optimization problem studied in this section and then provide a derivation of this set of equations. Next, we determine that there exists only one solution to the first-order optimality system and that this solution satisfies the second-order sufficient conditions to qualify as a constrained minimizer of the cost functional (29). In other words, there exists only one global constrained minimizer for (29).

4.2. First-Order Optimality System

The first-order optimality system associated with the constrained optimization problem is given as follows: Seek a $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)\times\mathcal{P}$ that satisfies

Primal Problem:
(31a) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=0
Dual Problem:
(31b) \left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)=-\mathbf{W}
Optimality Condition:
(31c) p=\frac{1}{\beta}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}.

The variable $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ is the dual solution that arises from applying the Lagrange multiplier formalism to the constraint (31a). We will determine that there is only one solution $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)\times\mathcal{P}$ that satisfies (31) and that $\left({\mathbf{X}_{opt},p_{opt}}\right)$ is in fact a constrained minimizer of (29).
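Before deriving (31), we note that the primal and dual equations can be previewed in finite dimensions. The sketch below is our own illustration, with hypothetical matrices standing in for $\mathbf{A}$, $\mathbf{Q}$, $\mathbf{W}$, and $\mathbf{G}_{p}$; scipy's algebraic Riccati and Lyapunov solvers play the roles of Theorem 1 and Lemma 2 in this toy setting.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(0)
n = 4
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))   # stable generator analog
Q = np.eye(n)                                        # state weight (trace-class analog)
W = np.eye(n)                                        # weight appearing in trace(XW)
B = rng.standard_normal((n, 2))                      # hypothetical device map
Gp = B @ B.T                                         # G_p = B B^T is nonnegative

# Primal (31a): A X + X A^T - X G_p X + Q = 0; scipy expects the transposed drift.
X = solve_continuous_are(A.T, B, Q, np.eye(2))

# Dual (31b): (A - X G_p)^T Lam + Lam (A - X G_p) = -W.
A_cl = A - X @ Gp
Lam = solve_continuous_lyapunov(A_cl.T, -W)

print(np.trace(X @ W))   # first term of the cost (29) at this fixed p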

4.2.1. Derivation of the Optimality System

We now derive the first-order optimality system (31). We begin by introducing the Lagrange multiplier $\boldsymbol{\Lambda}\in\mathscr{L}(H)$. The Lagrangian functional associated with the cost functional (29) with constraint (30) is the following

(32) \mathcal{L}\left({\mathbf{X},p,\boldsymbol{\Lambda}}\right):=\left<\mathbf{I},\mathbf{X}\mathbf{W}\right>+\frac{\beta}{2}\left\|p\right\|^{2}_{\mathcal{P}}+\left<\boldsymbol{\Lambda},\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right]\right>,

where we have utilized the identity $\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)=\left<\mathbf{I},\mathbf{X}\mathbf{W}\right>$. Because $\left<\boldsymbol{\Lambda},\cdot\right>$ is a functional belonging to $\mathscr{J}_{1}(H)^{\prime}$, the Lagrangian (32) is well-defined: Theorem 1 demonstrates that there exists an $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ satisfying (31a) in the $\mathscr{J}_{1}(H)$ topology.

The first-order necessary condition for optimality, i.e. that $\left({\mathbf{X},p,\boldsymbol{\Lambda}}\right)\in\mathscr{J}_{1}(H)\times\mathcal{P}\times\mathscr{L}(H)$ is a saddle point of (32), is the following

(33) \frac{\partial\mathcal{L}}{\partial\mathbf{X}}(\boldsymbol{\Psi})=0,\quad\frac{\partial\mathcal{L}}{\partial p}(q)=0,\quad\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}(\boldsymbol{\Phi})=0,

for all $\left({\boldsymbol{\Psi},q,\boldsymbol{\Phi}}\right)\in\mathscr{L}(H)\times\mathcal{P}\times\mathscr{J}_{1}(H)$. In this work, we choose to work with the strong form of (33), i.e.

(34) \frac{\partial\mathcal{L}}{\partial\mathbf{X}}=0,\quad\frac{\partial\mathcal{L}}{\partial p}=0,\quad\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}=0,

because we have already derived the theoretical results needed for the well-posedness analysis of the strong-form equations that arise from (34). It is easily seen that if (34) is satisfied then (33) is satisfied as well, so it is sufficient to consider (34) in our analysis. Conversely, (33) implies (34) because $\mathscr{J}_{1}(H)$ and $\mathscr{L}(H)$ form a duality pairing under the trace operator. We proceed by deriving each equation in (34) in the remaining paragraphs of this subsection.

We begin our discussion by deriving (31a). Taking the Gâteaux derivative of $\mathcal{L}(\cdot,\cdot,\cdot)$ with respect to $\boldsymbol{\Lambda}$ yields

\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}(\boldsymbol{\Phi})=\left<\boldsymbol{\Phi},\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right]\right>

for all $\boldsymbol{\Phi}\in\mathscr{L}(H)$. Setting $\frac{\partial\mathcal{L}}{\partial\boldsymbol{\Lambda}}(\boldsymbol{\Phi})=0$ then yields the following necessary condition

(35) \left<\boldsymbol{\Phi},\left[\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right]\right>=0

for all $\boldsymbol{\Phi}\in\mathscr{L}(H)$. We have that (35) is true if and only if (31a) is satisfied, because the functionals $\left<\boldsymbol{\Phi},\cdot\right>$, $\boldsymbol{\Phi}\in\mathscr{L}(H)$, are in one-to-one correspondence with $\mathscr{J}_{1}(H)^{\prime}$ and because Theorem 1 indicates that (5) is satisfied in the $\mathscr{J}_{1}(H)$ norm topology.

We now derive (31b). The Gâteaux derivative of $\mathcal{L}(\cdot,\cdot,\cdot)$ with respect to $\mathbf{X}$ is the following

\displaystyle\frac{\partial\mathcal{L}}{\partial\mathbf{X}}(\boldsymbol{\Psi})=\left<\mathbf{I},\boldsymbol{\Psi}\mathbf{W}\right>+\left<\boldsymbol{\Lambda},\left[\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)\boldsymbol{\Psi}+\boldsymbol{\Psi}\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\right]\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}^{*}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}^{*},(\mathbf{A}-\mathbf{X}^{*}\mathbf{G}_{p})\boldsymbol{\Psi}^{*}\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}^{*}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<(\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X})\boldsymbol{\Lambda}^{*},\boldsymbol{\Psi}^{*}\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}^{*}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}^{*}\mathbf{G}_{p}}\right),\boldsymbol{\Psi}\right>
\displaystyle=\left<\mathbf{W},\boldsymbol{\Psi}\right>+\left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right),\boldsymbol{\Psi}\right>

for all $\boldsymbol{\Psi}\in\mathscr{J}_{1}(H)$. Note that we have made heavy use of (1) in the derivation above. In the final equality, we have applied the property that $\mathbf{X}^{*}=\mathbf{X}$. Setting $\frac{\partial\mathcal{L}}{\partial\mathbf{X}}(\boldsymbol{\Psi})=0$ then results in

(36) \left<\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda},\boldsymbol{\Psi}\right>+\left<\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right),\boldsymbol{\Psi}\right>=-\left<\mathbf{W},\boldsymbol{\Psi}\right>

for all $\boldsymbol{\Psi}\in\mathscr{J}_{1}(H)$. Because $\left<\left[\left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)+\mathbf{W}\right],\cdot\right>$ defines an element of $\mathscr{J}_{1}(H)^{\prime}$, we have that (36) is satisfied if and only if (31b) is satisfied; indeed, the zero element of $\mathscr{J}_{1}(H)^{\prime}$ is the only functional that maps every element of $\mathscr{J}_{1}(H)$ to zero.

We conclude this subsection with the derivation of (31c).

\displaystyle\frac{\partial\mathcal{L}}{\partial p}(q)=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda},\mathbf{X}\mathbf{d}\mathbf{G}_{p}(q)\mathbf{X}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\mathbf{X}^{*}\boldsymbol{\Lambda},\mathbf{d}\mathbf{G}_{p}(q)\mathbf{X}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda}^{*}\mathbf{X},\mathbf{X}^{*}\mathbf{d}\mathbf{G}_{p}(q)^{*}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\mathbf{X}\boldsymbol{\Lambda}^{*}\mathbf{X},\mathbf{d}\mathbf{G}_{p}(q)^{*}\right>
\displaystyle=\beta\left({p,q}\right)_{\mathcal{P}}-\left<\mathbf{X}^{*}\boldsymbol{\Lambda}\mathbf{X}^{*},\mathbf{d}\mathbf{G}_{p}(q)\right>
\displaystyle=\left({\beta p-\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},q}\right)_{\mathcal{P}}
\displaystyle=0,

after applying the facts that $\mathbf{X}^{*}=\mathbf{X}$ and $\boldsymbol{\Lambda}^{*}=\boldsymbol{\Lambda}$, and making heavy use of (1). We then have that

(37) (p,q)_{\mathcal{P}}=\frac{1}{\beta}\left({\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},q}\right)_{\mathcal{P}}

for all $q\in\mathcal{P}$. We then arrive at (31c): because $\mathcal{P}$ is a Hilbert space, (37) can hold for all $q\in\mathcal{P}$ only if its strong form is satisfied.
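The stationarity computation above can be sanity-checked numerically in finite dimensions. In the sketch below, a hypothetical setup of our own with a linear parametrization $\mathbf{G}_{p}=p_{1}\mathbf{G}_{1}+p_{2}\mathbf{G}_{2}$, the adjoint-based expression $\beta p-\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}$ is compared against a finite-difference gradient of the reduced cost (29); all matrices and constants are illustrative stand-ins.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(1)
n, beta = 4, 5.0
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)   # positive definite pieces

def G(p):                      # hypothetical linear parametrization of G_p
    return p[0] * G1 + p[1] * G2

def X_of(p):                   # primal (31a): A X + X A^T - X G_p X + Q = 0
    B = np.linalg.cholesky(G(p))
    return solve_continuous_are(A.T, B, Q, np.eye(n))

def Lam_of(p, X):              # dual (31b): (A - X G_p)^T L + L (A - X G_p) = -W
    A_cl = A - X @ G(p)
    return solve_continuous_lyapunov(A_cl.T, -W)

def J(p):                      # reduced cost (29)
    return np.trace(X_of(p) @ W) + 0.5 * beta * p @ p

p = np.array([1.0, 0.7])
X = X_of(p)
Lam = Lam_of(p, X)
M = X @ Lam @ X
grad_adjoint = beta * p - np.array([np.trace(G1 @ M), np.trace(G2 @ M)])
eps = 1e-6
grad_fd = np.array([(J(p + eps * e) - J(p - eps * e)) / (2.0 * eps) for e in np.eye(2)])
print(grad_adjoint)
print(grad_fd)       # the two gradients should agree to finite-difference accuracy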

4.2.2. Well-Posedness of the Optimality System

We now determine the conditions that must be satisfied in order for (31) to have only one solution. First, notice that (31a) is well-posed for any $p\in\mathcal{P}$ as a consequence of Theorem 1 and the fact that $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$. Next, notice that $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ enters (31b) as an input parameter, and Lemma 2 indicates that (31b) is well-posed for any $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$. Because (31a) is independent of $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$, there exists a unique $\left({\mathbf{X},\boldsymbol{\Lambda}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)$ that satisfies the coupled equations (22) and (23) for any choice of $p\in\mathcal{P}$. We have therefore established the existence of a mapping $p\mapsto\left({\mathbf{X}(p),\boldsymbol{\Lambda}(p)}\right)$ for any $p\in\mathcal{P}$, and this mapping is continuous as a consequence of Lemmas 3 and 4. Our goal is then to determine the conditions under which we can select a unique $p\in\mathcal{P}$ so that $\left({\mathbf{X}(p),\boldsymbol{\Lambda}(p),p}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}(H)\times\mathcal{P}$ satisfies (31c).

Let us define $f:\mathcal{P}\rightarrow\mathcal{P}$ as follows

f(p):=\frac{1}{\beta}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p),

where $\mathbf{X}(p)$ and $\boldsymbol{\Lambda}(p)$ satisfy (31a) and (31b), respectively. It then becomes clear that (31c) can be written in the following fixed-point form

p=f(p).

It is our goal to invoke the Banach fixed point theorem [7, Theorem 3.7-1] to determine the conditions under which the function $f(\cdot)$ is a contractive map, i.e., $f(\cdot)$ satisfies

\left\|f(p_{1})-f(p_{2})\right\|_{\mathcal{P}}\leq k\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

with $k<1$ for any $p_{1},p_{2}\in\mathcal{P}$. More concretely, we wish to demonstrate for any $p_{1},p_{2}\in\mathcal{P}$ that

(38) \frac{1}{\beta}\left\|\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}-\mathbf{d}\mathbf{G}_{p_{2}}^{*}\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathcal{P}}\leq k\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

We now characterize the constant $k\in\mathbb{R}_{+}$. Lemmas 2, 3, and 4 determine that both $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ are Lipschitz continuous functions of $p\in\mathcal{P}$ under the assumption that (15) and (16) are satisfied. If we further assume that (17) is satisfied, then it follows that

\displaystyle\frac{1}{\beta}\left\|\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}-\mathbf{d}\mathbf{G}_{p_{2}}^{*}\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathcal{P}}\leq\frac{1}{\beta}\bigg(\left\|\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(\mathcal{P};\mathscr{J}_{1}(H))}\left\|\mathbf{X}_{1}\right\|^{2}_{1}\left\|\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}
\displaystyle\qquad+\left\|\mathbf{d}\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}\left({\mathcal{P};\mathscr{J}_{1}(H)}\right)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\left\|\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{2}\right\|_{1}
\displaystyle\qquad+\left\|\mathbf{d}\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}(\mathcal{P};\mathscr{J}_{1}(H))}\left\|\mathbf{X}_{1}\right\|_{1}\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{2}\right\|_{1}
\displaystyle\qquad+\left\|\mathbf{d}\mathbf{G}_{p_{1}}\right\|_{\mathscr{L}\left({\mathcal{P};\mathscr{J}_{1}(H)}\right)}\left\|\mathbf{X}_{1}\right\|_{1}\left\|\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}\bigg)
\displaystyle\leq\frac{1}{\beta}\bigg(\frac{M^{6}}{16\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}(\mathcal{P};\mathscr{J}_{1}(H))}
\displaystyle\qquad+\frac{C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}
\displaystyle\qquad+\frac{C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}(H)}\bigg)
\displaystyle\leq\frac{1}{\beta}\bigg(\frac{L_{\mathbf{d}\mathbf{G}}M^{6}}{16\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{d}\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{10}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|_{1}^{3}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left({\frac{M^{10}\gamma}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}}\right)\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}\bigg)
\displaystyle=:k_{\alpha,\beta,M,\mathbf{Q},\mathbf{W},\mathbf{G}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

Therefore, we have demonstrated that (31c) has a unique solution if the penalty parameter $\beta$ is chosen large enough to offset the contributions of $\alpha$, $M$, $\left\|\mathbf{Q}\right\|_{1}$, $\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}$, and the constants associated with $\mathbf{G}_{\left({\cdot}\right)}$.
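The contraction can be observed in finite dimensions by simply iterating the fixed-point map. The sketch below is an illustration under the same hypothetical linear parametrization used earlier in this section's sketches; it is not a statement about any particular infinite-dimensional system, but for a large penalty parameter the iterates settle quickly, as the estimate above suggests.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(2)
n = 4
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)

def fixed_point_map(p, beta):
    # Picard step p <- (1/beta) dG_p^*( X(p) Lam(p) X(p) ), the analog of (31c).
    Gp = p[0] * G1 + p[1] * G2
    X = solve_continuous_are(A.T, np.linalg.cholesky(Gp), Q, np.eye(n))
    Lam = solve_continuous_lyapunov((A - X @ Gp).T, -W)
    M = X @ Lam @ X
    return np.array([np.trace(G1 @ M), np.trace(G2 @ M)]) / beta

beta, p = 50.0, np.array([1.0, 1.0])
for k in range(50):
    p_next = fixed_point_map(p, beta)
    if np.linalg.norm(p_next - p) < 1e-12:
        break
    p = p_next
print(k, p)   # the iterates settle quickly when beta is large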

We now analyze the Hessian operator $\mathbf{d}^{2}\mathcal{L}_{opt}\left[\cdot,\cdot\right]:\left[\mathscr{J}_{1}^{s}(H)\times\mathcal{P}\right]\times\left[\mathscr{J}_{1}^{s}(H)\times\mathcal{P}\right]\rightarrow\mathbb{R}$ associated with the Lagrangian functional (32) evaluated at $\left({\mathbf{X}_{opt},p_{opt},\boldsymbol{\Lambda}_{opt}}\right)$. We aim to demonstrate that the second-order optimality condition is satisfied, i.e. that $\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\boldsymbol{\Phi},q),(\boldsymbol{\Phi},q)\right]$ is positive definite for every $\left({\boldsymbol{\Phi},q}\right)$ in the critical cone $\mathcal{K}\left({\mathbf{X}_{opt},p_{opt}}\right)$ as defined in (28). The Hessian $\mathbf{d}^{2}\mathcal{L}_{opt}\left[\cdot,\cdot\right]$ is given by

\displaystyle\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\boldsymbol{\Phi},q),(\boldsymbol{\Psi},r)\right]=-\left<\boldsymbol{\Lambda}_{opt},\left[\boldsymbol{\Phi}\mathbf{G}_{p_{opt}}\boldsymbol{\Psi}+\boldsymbol{\Psi}\mathbf{G}_{p_{opt}}\boldsymbol{\Phi}\right]\right>
\displaystyle\qquad-\left<\boldsymbol{\Lambda}_{opt},\left[\boldsymbol{\Phi}\mathbf{d}\mathbf{G}_{p_{opt}}(r)\mathbf{X}_{opt}+\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(r)\boldsymbol{\Phi}\right]\right>
\displaystyle\qquad-\left<\boldsymbol{\Lambda}_{opt},\left[\boldsymbol{\Psi}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\mathbf{X}_{opt}+\mathbf{X}_{opt}\mathbf{d}\mathbf{G}_{p_{opt}}(q)\boldsymbol{\Psi}\right]\right>
\displaystyle\qquad+\beta\left({q,r}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}\left({q,r}\right)\mathbf{X}_{opt}\right>

for all $\left({\boldsymbol{\Phi},q}\right),\left({\boldsymbol{\Psi},r}\right)\in\mathscr{J}_{1}(H)\times\mathcal{P}$. Restricting $\left({\boldsymbol{\Phi},q}\right)$ and $\left({\boldsymbol{\Psi},r}\right)$ to the critical cone $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$ then yields

\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\boldsymbol{\Phi},q),(\boldsymbol{\Psi},r)\right]=\beta\left({q,r}\right)_{\mathcal{P}}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}\left({q,r}\right)\mathbf{X}_{opt}\right>

since $\boldsymbol{\Phi}=\boldsymbol{\Psi}=\mathbf{0}$ must hold for any element belonging to $\mathcal{K}\left({\mathbf{X}_{opt},p_{opt}}\right)$. Taking $q=r$ then indicates that $\mathbf{d}^{2}\mathcal{L}_{opt}$ is a positive definite operator on $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$ if $\beta$ is chosen sufficiently large, thereby satisfying the second-order sufficient condition for $\left({\mathbf{X}_{opt},p_{opt}}\right)$ to qualify as a constrained minimizer of the cost functional (29).
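To make the role of $\beta$ explicit, suppose for illustration that the second derivative of the parametrization admits a bound of the form $\left\|\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\right\|_{1}\leq C_{\mathbf{d}^{2}\mathbf{G}}\left\|q\right\|_{\mathcal{P}}^{2}$, a bound introduced here purely for illustration. Then, for every $(\mathbf{0},q)\in\mathcal{K}(\mathbf{X}_{opt},p_{opt})$,

\mathbf{d}^{2}\mathcal{L}_{opt}\left[(\mathbf{0},q),(\mathbf{0},q)\right]\geq\beta\left\|q\right\|_{\mathcal{P}}^{2}-C_{\mathbf{d}^{2}\mathbf{G}}\left\|\boldsymbol{\Lambda}_{opt}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{opt}\right\|_{1}^{2}\left\|q\right\|_{\mathcal{P}}^{2}=\left({\beta-C_{\mathbf{d}^{2}\mathbf{G}}\left\|\boldsymbol{\Lambda}_{opt}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{opt}\right\|_{1}^{2}}\right)\left\|q\right\|_{\mathcal{P}}^{2},

which is strictly positive for all $q\neq 0$ once $\beta$ exceeds $C_{\mathbf{d}^{2}\mathbf{G}}\left\|\boldsymbol{\Lambda}_{opt}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{opt}\right\|_{1}^{2}$.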

We formalize our findings in the following.

Theorem 2.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of a $C_{0}$-semigroup, $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$, and $\mathbf{W}\in\mathscr{L}^{s}(H)$. If conditions (15), (16), and (17) are satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$, then there exists a unique solution $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}\left({H}\right)\times\mathcal{P}$ that satisfies the first-order optimality system (31) associated with the constrained weighted trace minimization problem, provided that the penalty parameter $\beta\in\mathbb{R}_{+}$ is chosen sufficiently large that $k_{\alpha,\beta,M,\mathbf{Q},\mathbf{W},\mathbf{G}}<1$.

Further assume that (20) is satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$. Then $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ is the unique constrained minimizer for the cost functional (29) if $\beta\in\mathbb{R}_{+}$ is sufficiently large.

In general, Theorem 2 indicates that $\beta$ can always be chosen large enough to force $\mathcal{J}_{\beta}(\cdot)$ to have a unique constrained minimizer.

5. Approximate Control Constraint Enforcement

In this section, we consider a penalization technique that approximately enforces a trace constraint on the operator $\mathbf{G}_{p}\in\mathscr{J}_{1}^{s}(H)$ for all $p\in\mathcal{P}$. One important property of this penalization approach is that the approximate constraint enforcement becomes exact as the penalization parameter tends to positive infinity. It is through this mechanism that we are able to determine that a wide class of sensor placement and design problems admits a unique constrained minimizer. Many of the techniques used to analyze the problem described in this section have been discussed in the previous section. Because of this, we only briefly touch upon details that have direct analogs in the previous problem and focus our attention on the technicalities associated with the current problem of interest.

5.1. Problem Statement

The penalized optimization problem analyzed in this section is to seek a constrained minimizer $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ for the following cost functional

(39) \mathcal{J}_{\beta}(\mathbf{X},p):=\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)+\frac{\beta}{2}\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]^{2}

subject to

(40) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=\mathbf{0},

where we assume that $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$ is again a positive semi-definite trace-class operator and $\gamma\in\mathbb{R}_{+}$ is the prescribed target value for $\texttt{trace}\left(\mathbf{G}_{p}\right)$.

5.2. First-Order Optimality System

The Lagrangian functional associated with the constrained optimization problem studied in this section is the following

(41) \mathcal{L}(\mathbf{X},p,\boldsymbol{\Lambda}):=\left<\mathbf{I},\mathbf{X}\mathbf{W}\right>+\frac{\beta}{2}\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]^{2}+\left<\boldsymbol{\Lambda},\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}\right>.

Taking the first Fréchet derivative of $\mathcal{L}(\cdot,\cdot,\cdot)$ with respect to $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$, $p\in\mathcal{P}$, and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ yields the first-order optimality system associated with (41), stated below.

Primal Problem:
(42a) \mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}=0
Dual Problem:
(42b) \left({\mathbf{A}^{*}-\mathbf{G}_{p}\mathbf{X}}\right)\boldsymbol{\Lambda}+\boldsymbol{\Lambda}\left({\mathbf{A}-\mathbf{X}\mathbf{G}_{p}}\right)=-\mathbf{W}
Optimality Condition:
(42c) p=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}}\left[\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{d}\mathbf{G}_{p}\right]^{-1}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\mathbf{d}\mathbf{G}_{p}p
Control Operator Trace Constraint:
(42d) \texttt{trace}\left(\mathbf{G}_{p}\right)=\gamma+\frac{1}{\beta}\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}

Because the derivation of (42a) and (42b) is nearly identical to that of (31a) and (31b), we will only discuss the derivation of (42c) and (42d) in the following.

Remark 1.

An inspection of (42d) indicates that $\texttt{trace}\left(\mathbf{G}_{p}\right)\rightarrow\gamma$ as $\beta\rightarrow+\infty$. Applying (42d) in (39) then shows that the penalty term equals $\frac{1}{2\beta}\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|^{2}_{\mathscr{L}(H)}$ at the stationary point, so $\mathcal{J}_{\beta}(\mathbf{X},p)$ reduces to $\texttt{trace}\left(\mathbf{X}\mathbf{W}\right)$ as $\beta\rightarrow+\infty$.
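This limiting behavior can be observed directly by minimizing a finite-dimensional analog of (39) for increasing values of $\beta$. The setup below is a hypothetical sketch of our own (the matrices, the squared parametrization of $\mathbf{G}_{p}$, and the use of a derivative-free optimizer are illustrative choices, not part of the analysis); the printed trace approaches the target $\gamma$ as $\beta$ grows.

import numpy as np
from scipy.linalg import solve_continuous_are
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, gamma = 4, 2.0
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)

def Gp(p):
    return p[0] ** 2 * G1 + p[1] ** 2 * G2     # squared entries keep G_p nonnegative

def J_beta(p, beta):
    B = np.linalg.cholesky(Gp(p) + 1e-12 * np.eye(n))   # jitter guards p near 0
    X = solve_continuous_are(A.T, B, Q, np.eye(n))
    return np.trace(X @ W) + 0.5 * beta * (np.trace(Gp(p)) - gamma) ** 2

for beta in (1.0, 10.0, 100.0, 1000.0):
    res = minimize(J_beta, x0=np.array([0.5, 0.5]), args=(beta,), method="Nelder-Mead")
    print(beta, np.trace(Gp(res.x)))   # approaches gamma = 2.0 as beta grows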

5.2.1. Derivation of the Optimality Condition

The chain rule and the necessary stationarity condition imply that

\frac{\partial\mathcal{L}}{\partial p}(q):=\frac{\partial\mathcal{L}}{\partial\mathbf{G}_{p}}\left[\frac{\partial\mathbf{G}_{p}}{\partial p}(q)\right]=0

for all $q\in\mathcal{P}$. Because we have assumed that $\mathbf{d}\mathbf{G}_{p}\neq 0$ in (18), we have that

(43) \displaystyle\frac{\partial\mathcal{L}}{\partial\mathbf{G}_{p}}(\mathbf{H})=\beta\left<\mathbf{I},\mathbf{H}\right>\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]-\left<\boldsymbol{\Lambda},\mathbf{X}\mathbf{H}\mathbf{X}\right>
\displaystyle=\beta[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma]\left<\mathbf{I},\mathbf{H}\right>-\left<\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},\mathbf{H}\right>
\displaystyle=0

for all $\mathbf{H}\in\mathscr{J}_{1}(H)$. This then implies that

\beta\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]\mathbf{I}=\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}

and (42d) follows after taking the $\mathscr{L}(H)$ norm of both sides of the above equation and performing some algebraic manipulation.

Taking $\mathbf{H}=\mathbf{d}\mathbf{G}_{p}(q)$ for all $q\in\mathcal{P}$ in (43) results in

\displaystyle\frac{\partial\mathcal{L}}{\partial\mathbf{G}_{p}}(\mathbf{d}\mathbf{G}_{p}(q))=\beta[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma]\left<\mathbf{I},\mathbf{d}\mathbf{G}_{p}(q)\right>-\left<\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},\mathbf{d}\mathbf{G}_{p}(q)\right>
\displaystyle=\beta\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]\left({\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{I},q}\right)_{\mathcal{P}}-\left({\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X},q}\right)_{\mathcal{P}}
\displaystyle=0.

This then implies that

\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{I}=\frac{1}{\beta\left[\texttt{trace}\left(\mathbf{G}_{p}\right)-\gamma\right]}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}.

We now apply (42d) to obtain

\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{I}=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}\left({H}\right)}}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}.

Right-multiplying both sides of this equation by $\mathbf{d}\mathbf{G}_{p}p$ then yields

\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{d}\mathbf{G}_{p}p=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}\left({H}\right)}}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\mathbf{d}\mathbf{G}_{p}p

and (42c) is obtained by left-multiplying both sides of the above by $\left[\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{d}\mathbf{G}_{p}\right]^{-1}$ after recalling (19). With this, we are ready to analyze the first-order optimality system (42).

5.2.2. Unique Solution to First-Order Optimality System

We begin our analysis by noticing that there is a continuous mapping $p\mapsto\left({\mathbf{X}(p),\boldsymbol{\Lambda}(p)}\right)$, where $\mathbf{X}\in\mathscr{J}_{1}^{s}(H)$ and $\boldsymbol{\Lambda}\in\mathscr{L}^{s}(H)$ correspond to the solutions of (42a) and (42b), respectively. This observation is made using the same argument as in §4.2.2. With this, we may focus on determining that (42c) has a unique fixed point $p\in\mathcal{P}$ on the manifold induced by (42d).

Let us write $f(\cdot):\mathcal{P}\rightarrow\mathcal{P}$ for the nonlinear function defined in (42c), i.e.

f(p):=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}}\left[\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{d}\mathbf{G}_{p}\right]^{-1}\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\mathbf{d}\mathbf{G}_{p}p.

It is again our task to determine the conditions under which there exists a positive constant $k<1$ that satisfies

\left\|f(p_{1})-f(p_{2})\right\|_{\mathcal{P}}\leq k\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for any two $p_{1},p_{2}\in\mathcal{P}$.

Notice that (42c) is scale-invariant. Let us take

(44) p:=s\widetilde{p}

where $s\in\mathbb{R}_{+}$ is a positive scaling factor and $\widetilde{p}$ is a reference parameter normalized so that $\left\|\widetilde{p}\right\|_{\mathcal{P}}=1$. The scale invariance arises from the observation that, no matter how one chooses the scaling factor $s$ in (44), it cancels out of the fixed-point problem (42c) under this change of variables and therefore never appears in its definition. This allows us to take $\left\|p\right\|_{\mathcal{P}}=1$ in our analysis without loss of generality.

We now begin our well-posedness analysis by partitioning the fixed point equation into the following

p=\left[(I)(II)(III)\right]p,

where we have denoted

(I):=\frac{1}{\left\|\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\right\|_{\mathscr{L}(H)}},\quad(II):=\left[\mathbf{d}\mathbf{G}^{*}_{p}\mathbf{d}\mathbf{G}_{p}\right]^{-1},\quad(III):=\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}\mathbf{d}\mathbf{G}_{p}.

These partitioned terms are bounded as follows.

(45) |(I)|\leq\frac{1}{\mu},

where $\mu:=\inf_{p\in\mathcal{P}}\left\|\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\right\|_{\mathscr{L}(H)}$.

The constant $\mu$ is strictly positive because $\mathbf{X}=\mathbf{0}$ if and only if $\mathbf{Q}=\mathbf{0}$, and $\boldsymbol{\Lambda}\neq\mathbf{0}$ as long as $\mathbf{W}\neq\mathbf{0}$. To see this, consider the Bochner integral form of the operator-valued Riccati equation (6). First, assume that $\mathbf{X}=\mathbf{0}$; then $\mathbf{Q}$ must be $\mathbf{0}$, since $\int_{0}^{+\infty}\mathbf{S}(t)\mathbf{Q}\mathbf{S}^{*}(t)dt=\mathbf{0}$ if and only if $\mathbf{Q}=\mathbf{0}$. Conversely, assume $\mathbf{Q}=\mathbf{0}$; then $\int_{0}^{+\infty}\mathbf{S}(t)\left({\mathbf{X}\mathbf{G}_{p}\mathbf{X}}\right)\mathbf{S}^{*}(t)dt=\mathbf{0}$ implies that $\mathbf{X}=\mathbf{0}$. Next, $\boldsymbol{\Lambda}$ satisfies the Bochner integral form of the dual problem (14), and the linearity and well-posedness of (14) imply that $\boldsymbol{\Lambda}=\mathbf{0}$ if and only if $\mathbf{W}=\mathbf{0}$. Hence, we must have that $\inf_{p\in\mathcal{P}}\left\|\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\right\|_{\mathscr{L}\left({H}\right)}>0$.

We then define

K:=\sup_{p\in\mathcal{P}}\left\|(II)\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}.

And finally,

\left\|(III)\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\leq C_{\mathbf{d}\mathbf{G}}^{2}\frac{M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}.

It again suffices to restrict attention to $\left\|p\right\|_{\mathcal{P}}=1$ due to the scale invariance of the fixed-point equation. With this, we determine that

\displaystyle\left\|(I)_{1}(II)_{1}(III)_{1}p_{1}-(I)_{2}(II)_{2}(III)_{2}p_{2}\right\|_{\mathcal{P}}\leq\left|(I)_{1}-(I)_{2}\right|\left\|(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad+|(I)_{1}|\left\|(II)_{1}-(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad+|(I)_{1}|\left\|(II)_{1}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{1}-(III)_{2}\right\|_{\mathscr{L}(\mathcal{P})}\left\|p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad+|(I)_{1}|\left\|(II)_{1}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|(III)_{1}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\leq\frac{KC^{2}_{\mathbf{d}\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}|(I)_{1}-(I)_{2}|
\displaystyle\quad+\frac{C^{2}_{\mathbf{d}\mathbf{G}}M^{6}}{8\alpha^{3}\mu}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|(II)_{1}-(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad+\frac{K}{\mu}\left\|(III)_{1}-(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad+\frac{KC_{\mathbf{d}\mathbf{G}}^{2}M^{6}}{8\alpha^{3}\mu}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

We now bound every difference term in the above inequality. First, see that

\displaystyle|(I)_{1}-(I)_{2}|=\left|\frac{1}{\left\|\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}}-\frac{1}{\left\|\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}}\right|
\displaystyle\leq\frac{\left\|\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}-\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}}{\left\|\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\right\|_{\mathscr{L}(H)}\left\|\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}}
\displaystyle\leq\frac{1}{\mu^{2}}\bigg(\left\|\mathbf{X}_{2}-\mathbf{X}_{1}\right\|_{\mathscr{L}(H)}\left\|\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}
\displaystyle\qquad+\left\|\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}\left\|\boldsymbol{\Lambda}_{2}-\boldsymbol{\Lambda}_{1}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}
\displaystyle\qquad+\left\|\mathbf{X}_{2}\right\|_{\mathscr{L}\left({H}\right)}\left\|\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{2}-\mathbf{X}_{1}\right\|_{\mathscr{L}\left({H}\right)}\bigg)
\displaystyle\leq\frac{M^{4}}{2\alpha^{2}\mu^{2}}\left\|\mathbf{Q}\right\|_{1}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}\left\|\mathbf{X}_{1}-\mathbf{X}_{2}\right\|_{1}+\frac{M^{4}}{4\alpha^{2}\mu^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\boldsymbol{\Lambda}_{1}-\boldsymbol{\Lambda}_{2}\right\|_{\mathscr{L}\left({H}\right)}
\displaystyle\leq\frac{L_{\mathbf{G}}M^{10}}{16\alpha^{5}\mu}\left\|\mathbf{Q}\right\|_{1}^{3}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}+\frac{L_{\mathbf{G}}M^{4}}{4\alpha^{2}\mu^{2}}\left({\frac{M^{10}\gamma_{\beta}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{4}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|^{3}_{1}}\right)\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle=:k_{(I),\alpha,\mathbf{Q}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}},

where we have defined $\gamma_{\beta}:=\gamma+\frac{1}{\beta}\sup_{p\in\mathcal{P}}\left\|\mathbf{X}(p)\boldsymbol{\Lambda}(p)\mathbf{X}(p)\right\|_{\mathscr{L}\left({H}\right)}$. Next, from the Lipschitz continuity of $\mathbf{d}\mathbf{G}_{(\cdot)}$ with respect to $p\in\mathcal{P}$, we have that

\left\|(II)_{1}-(II)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}\leq L\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

Finally, we derive a Lipschitz bound for the final partitioned difference term. Leveraging the analysis presented in the previous section, we have that

\displaystyle\left\|(III)_{1}-(III)_{2}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad\leq\left\|\left({\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}-\mathbf{d}\mathbf{G}_{p_{2}}^{*}\mathbf{X}_{2}\boldsymbol{\Lambda}_{2}\mathbf{X}_{2}}\right)\mathbf{d}\mathbf{G}_{p_{2}}\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}+\left\|\mathbf{d}\mathbf{G}_{p_{1}}^{*}\mathbf{X}_{1}\boldsymbol{\Lambda}_{1}\mathbf{X}_{1}\left({\mathbf{d}\mathbf{G}_{p_{1}}-\mathbf{d}\mathbf{G}_{p_{2}}}\right)\right\|_{\mathscr{L}\left({\mathcal{P}}\right)}
\displaystyle\quad\leq C_{\mathbf{d}\mathbf{G}}\bigg(\frac{L_{\mathbf{d}\mathbf{G}}M^{6}}{16\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{d}\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{10}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|_{1}^{3}\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\qquad+\frac{L_{\mathbf{G}}C_{\mathbf{d}\mathbf{G}}M^{4}}{2\alpha^{2}}\left\|\mathbf{Q}\right\|_{1}^{2}\left({\frac{M^{10}\gamma_{\beta}}{16\alpha^{5}}\left\|\mathbf{Q}\right\|^{2}_{1}+\frac{M^{6}}{4\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}}\right)\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}\bigg)
\displaystyle\qquad+\frac{C_{\mathbf{d}\mathbf{G}}L_{\mathbf{d}\mathbf{G}}M^{6}}{8\alpha^{3}}\left\|\mathbf{Q}\right\|_{1}^{2}\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}
\displaystyle\quad=:k_{(III),\alpha,\mathbf{Q}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}.

Putting this all together, we have determined that there exists a constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}\in\mathbb{R}_{+}$, depending on $M$, $\alpha$, $\beta$, $\left\|\mathbf{G}\right\|_{1}$, $\left\|\mathbf{Q}\right\|_{1}$, and $\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}$, such that

\left\|f(p_{1})-f(p_{2})\right\|_{\mathcal{P}}\leq k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}\left\|p_{1}-p_{2}\right\|_{\mathcal{P}}

for any two $p_{1},p_{2}\in\mathcal{P}$. Inspection of the definition of the constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}$ indicates that if $\alpha,\beta\in\mathbb{R}_{+}$ are sufficiently large and $\gamma$, $\left\|\mathbf{Q}\right\|_{1}$, and $\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}$ are sufficiently small, then (42c) has a unique fixed point.

Evaluating the second variation of $\mathcal{L}(\cdot,\cdot,\cdot)$ at $(\mathbf{X}_{opt},p_{opt},\boldsymbol{\Lambda}_{opt})$ along the directions contained in the critical cone $\mathcal{K}(\mathbf{X}_{opt},p_{opt})$, as defined in (28), then yields

\displaystyle\mathbf{d}^{2}\mathcal{L}_{opt}[(\boldsymbol{\Phi},q),(\boldsymbol{\Phi},q)]=\beta\left[\texttt{trace}\left(\mathbf{G}_{p_{opt}}\right)-\gamma\right]\left<\mathbf{I},\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\right>
\displaystyle\qquad+\beta\left<\mathbf{I},\mathbf{d}\mathbf{G}_{p_{opt}}(q)\right>^{2}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\mathbf{X}_{opt}\right>
\displaystyle=\left\|\mathbf{X}_{opt}\boldsymbol{\Lambda}_{opt}\mathbf{X}_{opt}\right\|_{\mathscr{L}\left({H}\right)}\left<\mathbf{I},\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\right>
\displaystyle\qquad+\beta\left<\mathbf{I},\mathbf{d}\mathbf{G}_{p_{opt}}(q)\right>^{2}-\left<\boldsymbol{\Lambda}_{opt},\mathbf{X}_{opt}\mathbf{d}^{2}\mathbf{G}_{p_{opt}}(q,q)\mathbf{X}_{opt}\right>

for all $\left({\boldsymbol{\Phi},q}\right)\in\mathcal{K}\left({\mathbf{X}_{opt},p_{opt}}\right)$. Inspecting the expression above indicates that $\mathbf{d}^{2}\mathcal{L}_{opt}[(\boldsymbol{\Phi},q),(\boldsymbol{\Phi},q)]$ is positive definite if $\beta\in\mathbb{R}_{+}$ is sufficiently large, after recalling (42d).

Theorem 3.

Assume that $\mathbf{A}:\mathcal{D}(\mathbf{A})\rightarrow H$ is the generator of a $C_{0}$-semigroup, $\mathbf{Q}\in\mathscr{J}_{1}^{s}(H)$, and $\mathbf{W}\in\mathscr{L}^{s}(H)$. If conditions (15), (16), (17), and (19) are satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$, then there exists a unique solution $\left({\mathbf{X}_{opt},\boldsymbol{\Lambda}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathscr{L}^{s}\left({H}\right)\times\mathcal{P}$ that satisfies the first-order optimality system (42) associated with the constrained weighted trace minimization problem, provided that the penalty parameter $\beta\in\mathbb{R}_{+}$ is chosen sufficiently large and that $\alpha\in\mathbb{R}_{+}$ is sufficiently large to reduce the contributions of $\left\|\mathbf{Q}\right\|_{1}$, $\left\|\mathbf{W}\right\|_{\mathscr{L}(H)}$, and $M$ in the definition of the constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}$.

Further assume that (20) is satisfied for the mapping $\mathbf{G}_{(\cdot)}:\mathcal{P}\rightarrow\mathscr{J}_{1}^{s}(H)$. Then $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ is the unique constrained minimizer for the cost functional (39), provided that the conditions described above for uniqueness are satisfied.

Theorem 3 provides the conditions needed for $\mathcal{J}_{\beta}(\cdot)$ to have a constrained minimizer. In general, the constrained minimizer is not guaranteed to be unique unless $\alpha$ is sufficiently large to reduce the effects of $\left\|\mathbf{Q}\right\|_{1}$, $\left\|\mathbf{W}\right\|_{\mathscr{L}\left({H}\right)}$, and $M$ in the definition of the contraction constant $k_{M,\alpha,\beta,\mathbf{G},\mathbf{Q},\mathbf{W}}$. This is in contrast to the constrained optimization problem presented in §4, where $\beta$ alone can be chosen large enough to guarantee uniqueness.

Assume for now that the conditions in Theorem 3 are satisfied and that $\alpha$ is sufficiently large to guarantee the uniqueness of the fixed point once $\beta$ is chosen large enough. Let us define $(\mathbf{X}^{\beta}_{opt},p^{\beta}_{opt})\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ to be the constrained minimizer associated with the choice of $\beta$ in the definition of (39). Theorem 3 indicates that there is a unique constrained minimizer of $\mathcal{J}_{\beta}(\cdot)$ for every $\beta>\beta_{min}$, for some threshold value $\beta_{min}$. Taking $\beta\rightarrow+\infty$ preserves the contractive property of the fixed-point problem (42c) while simultaneously enforcing the constraint $\texttt{trace}\left(\mathbf{G}_{p}\right)=\gamma$. This then implies that the limit $\lim_{\beta\rightarrow\infty}\left({\mathbf{X}^{\beta}_{opt},p^{\beta}_{opt}}\right)=\left({\mathbf{X}^{\infty}_{opt},p^{\infty}_{opt}}\right)$ is well-defined. We also observe that the Hessian operator, while remaining positive, becomes unbounded in this limit. This suggests that the Lagrangian is not twice differentiable at the stationary point associated with the constrained minimizer $\left({\mathbf{X}^{\infty}_{opt},p^{\infty}_{opt}}\right)$. This discussion, along with Remark 1, then implies the following.

Corollary 1.

Assume that the conditions in Theorem 3 are satisfied. Then there exists a unique constrained minimizer $\left({\mathbf{X}_{opt},p_{opt}}\right)\in\mathscr{J}_{1}^{s}(H)\times\mathcal{P}$ for the following constrained minimization problem.

\min_{p\in\mathcal{P}}\texttt{trace}\left(\mathbf{X}(p)\mathbf{W}\right)

subject to

\left\{\begin{aligned}\mathbf{A}\mathbf{X}+\mathbf{X}\mathbf{A}^{*}-\mathbf{X}\mathbf{G}_{p}\mathbf{X}+\mathbf{Q}&=\mathbf{0}\\ \texttt{trace}\left(\mathbf{G}_{p}\right)&=\gamma.\end{aligned}\right.

6. Discussion

In this work, we have determined the well-posedness of the strong form of the operator-valued Riccati equation (the primal problem). Upon this foundation, we are able to determine the well-posedness of the dual problem and prove that the solutions to both the primal and dual problems are Lipschitz continuous with respect to the parameter variable that describes the operator associated with the control device. From there, we determine through a fixed-point argument that both constrained weighted trace minimization problems presented in this work have unique constrained minimizers under certain conditions. In the penalized parameter optimization problem, the penalization parameter can be chosen sufficiently large to force uniqueness, whereas this property does not extend to the problem of penalization for approximate constraint enforcement. Under the conditions in which the second optimization problem does have a unique minimizer, we determine that the uniqueness of the minimizer is preserved in the infinite limit of the penalization parameter. This then implies that exact constraint enforcement also leads to a unique minimizer provided that the conditions prescribed in Theorem 3 are satisfied.

The results of this work are mostly theoretical. However, they can be utilized to design efficient sensor and actuator design and placement algorithms. The uniqueness of the minimizer enables the use of gradient-based optimization algorithms, as opposed to more expensive global optimization algorithms, since uniqueness is known a priori. The results regarding the strong form of the operator-valued Riccati equation indicate that its solution is compact in the sense that it maps $\mathcal{D}(\mathbf{A}^{*})^{\prime}$ into $\mathcal{D}(\mathbf{A}^{*})$. We plan on utilizing this observation in a future work in an attempt to derive optimal convergence rates for the numerical approximation of operator-valued Riccati equations that arise from unbounded sensing and actuation problems.
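To indicate how such a gradient-based algorithm might look, the sketch below runs a plain gradient iteration for the reduced cost of §4 using the adjoint-based gradient $\beta p-\mathbf{d}\mathbf{G}_{p}^{*}\mathbf{X}\boldsymbol{\Lambda}\mathbf{X}$. Everything in it (the matrices, the linear parametrization, the step size, and the iteration count) is a hypothetical finite-dimensional stand-in offered only as an illustration of the point above.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

rng = np.random.default_rng(5)
n, beta, step = 4, 20.0, 0.02
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))
Q, W = np.eye(n), np.eye(n)
C1, C2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
G1, G2 = C1 @ C1.T + np.eye(n), C2 @ C2.T + np.eye(n)

def gradient(p):               # adjoint-based gradient of the reduced cost (29)
    Gp = p[0] * G1 + p[1] * G2
    X = solve_continuous_are(A.T, np.linalg.cholesky(Gp), Q, np.eye(n))
    Lam = solve_continuous_lyapunov((A - X @ Gp).T, -W)
    M = X @ Lam @ X
    return beta * p - np.array([np.trace(G1 @ M), np.trace(G2 @ M)])

p = np.array([1.0, 1.0])
for _ in range(300):
    p = p - step * gradient(p)
print(p)   # approaches the unique stationary point guaranteed by Theorem 2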

References

  • [1] Sheldon Axler. Linear algebra done right. Springer Nature, 2024.
  • [2] Alain Bensoussan. Optimization of sensors’ location in a distributed filtering problem. In Stability of Stochastic Dynamical Systems: Proceedings of the International Symposium Organized by “The Control Theory Centre”, University of Warwick, July 10–14, 1972 Sponsored by the “International Union of Theoretical and Applied Mechanics”, pages 62–84. Springer, 2006.
  • [3] Alain Bensoussan, Giuseppe Da Prato, Michel C Delfour, and Sanjoy K Mitter. Representation and control of infinite dimensional systems. Springer, 2007.
  • [4] John A Burns and Carlos N Rautenberg. The infinite-dimensional optimal filtering problem with mobile and stationary sensor networks. Numerical Functional Analysis and Optimization, 36(2):181–224, 2015.
  • [5] John A Burns and Carlos N Rautenberg. Solutions and approximations to the Riccati integral equation with values in a space of compact operators. SIAM Journal on Control and Optimization, 53(5):2846–2877, 2015.
  • [6] James Cheung. On the approximation of operator-valued Riccati equations in Hilbert spaces. Journal of Mathematical Analysis and Applications, 547(1):129250, 2025.
  • [7] Philippe G Ciarlet. Linear and nonlinear functional analysis with applications. SIAM, 2025.
  • [8] John B Conway. A course in operator theory, volume 21. American Mathematical Society, 2025.
  • [9] Ruth F Curtain and Hans Zwart. An introduction to infinite-dimensional linear systems theory, volume 21. Springer Science & Business Media, 2012.
  • [10] J. Diestel and J. J. Uhl. Vector Measures. American Mathematical Society, 1977.
  • [11] M Sajjad Edalatzadeh, Dante Kalise, Kirsten A Morris, and Kevin Sturm. Optimal actuator design for vibration control based on LQR performance and shape calculus. arXiv preprint arXiv:1903.07572, 2019.
  • [12] Jerome A Goldstein. Semigroups of linear operators and applications. Courier Dover Publications, 2017.
  • [13] Michael Hintermüller, Carlos N Rautenberg, Masoumeh Mohammadi, Martin Kanitsar, et al. Optimal sensor placement: A robust approach. SIAM Journal on Control and Optimization, 55(6):3609–3639, 2017.
  • [14] Weiwei Hu, Kirsten Morris, and Yangwen Zhang. Sensor location in a controlled thermal fluid. In 2016 IEEE 55th Conference on Decision and Control (CDC), pages 2259–2264. IEEE, 2016.
  • [15] Andreas Kirsch et al. An introduction to the mathematical theory of inverse problems, volume 120. Springer, 2011.
  • [16] Kirsten Morris. Linear-quadratic optimal actuator location. IEEE Transactions on Automatic Control, 56(1):113–124, 2011.
  • [17] Kirsten Morris and Steven Yang. Comparison of actuator placement criteria for control of structures. Journal of Sound and Vibration, 353:1–18, 2015.
  • [18] Louis Sharrock and Nikolas Kantas. Joint online parameter estimation and optimal sensor placement for the partially observed stochastic advection-diffusion equation. SIAM/ASA Journal on Uncertainty Quantification, 10(1):55–95, 2022.
  • [19] Fredi Tröltzsch. Optimal control of partial differential equations: theory, methods, and applications, volume 112. American Mathematical Soc., 2010.