Mixed-Derivative Total Variation

Vincent Guillemet and Michael Unser
Abstract

The formulation of norms on continuous-domain Banach spaces with exact pixel-based discretization is advantageous for solving inverse problems (IPs). In this paper, we investigate a new regularization that is a convex combination of a TV term and the $\mathcal{M}(\mathbb{R}^{2})$ norm of mixed derivatives. We show that the extreme points of the corresponding unit ball are indicator functions of polygons whose edges are aligned with either the $x_{1}$- or the $x_{2}$-axis. We then apply this result to construct a new regularization for IPs, which can be discretized exactly by tensor products of first-order B-splines or, equivalently, pixels. Furthermore, we exactly discretize the loss of the denoising problem on its canonical pixel basis and prove that it admits a unique solution, which is also a solution of the underlying continuous-domain IP.

Index Terms:
Geometric measure theory, continuous-domain inverse problems, tensor product, representer theorem, denoising.

I Introduction

The study of continuous-domain formulations of inverse problems (IPs) is crucial in computational imaging [1, 2]. In this setting, the objective is to recover an unknown image $f:\mathbb{R}^{2}\to\mathbb{R}$ from noisy measurements. Due to the pixel-based geometry of images, such problems are commonly addressed by seeking an approximate solution $f^{\star}$ in

\mathcal{X}_{1}=\left\{f=\sum_{\mathbf{k}\in\mathbb{Z}^{2}}\mathbf{a}[\mathbf{k}]\mathbbm{1}_{E_{\mathbf{k}}}\right\}, \quad (1)

where the sets $(E_{\mathbf{k}})_{\mathbf{k}\in\mathbb{Z}^{2}}$ form a pixel-based decomposition of $\mathbb{R}^{2}$. In several acquisition modalities, the forward measurement process is not naturally associated with a canonical pixel basis, in which case $\mathcal{X}_{1}$ must be designed by the practitioner. In this context, it is advantageous to identify a continuous-domain Banach space $\mathcal{X}$ that contains all pixel-type basis functions and within which $\mathcal{X}_{1}$ can be refined. The search space $\mathcal{X}$ is typically selected jointly with the regularization functional $\mathrm{Reg}(\cdot)$ [3, 4], which encodes prior knowledge about the geometry of $f^{\star}$ when the IP is ill-posed.

I-A TV Regularization

A standard regularization for the pixel basis is the total variation (TV) [5], which, for $f\in\mathcal{X}=\mathrm{BV}(\Omega)$ [6, 7] and in its anisotropic form, is given by

\left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})^{2}}=\left(\left\lVert[\mathrm{I}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})}+\left\lVert[\mathrm{D}\otimes\mathrm{I}]\{f\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})}\right), \quad (2)

where $\left\lVert\cdot\right\rVert_{\mathcal{M}(\mathbb{R}^{2})}$ denotes the norm on the space of bounded measures [8]. This norm has the advantage of being exactly discretizable on the pixel basis, which makes it particularly suitable for image processing. Specifically, when $f\in\mathcal{X}_{1}$ is a linear combination of square pixels with weights $\mathbf{a}$, (2) can be discretized as

c\left(\left\lVert\bm{h}_{0,1}\ast\mathbf{a}\right\rVert_{\ell_{1}}+\left\lVert\bm{h}_{1,0}\ast\mathbf{a}\right\rVert_{\ell_{1}}\right), \quad (3)

where $c$ is determined by the pixel size, and

\bm{h}_{0,1}=\begin{bmatrix}-1\\ 1\end{bmatrix},\quad\bm{h}_{1,0}=\begin{bmatrix}-1&1\end{bmatrix}. \quad (4)
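As a small illustration of (3)-(4) (our own sketch, not code from the paper), the discrete anisotropic TV of a pixel image reduces to $\ell_1$ norms of finite differences; the constant $c$ and the test image below are assumptions.

```python
import numpy as np

def anisotropic_tv(a, c=1.0):
    """Discretization (3) of the anisotropic TV of pixel weights `a`.

    np.diff along each axis realizes the convolutions with the filters
    h_{0,1} and h_{1,0} of (4); `c` encodes the pixel size. Boundary
    differences are dropped, so the support is assumed interior.
    """
    d01 = np.diff(a, axis=0)  # h_{0,1} * a: differences across rows
    d10 = np.diff(a, axis=1)  # h_{1,0} * a: differences across columns
    return c * (np.abs(d01).sum() + np.abs(d10).sum())

# Indicator of a 2x3 rectangle inside a 5x6 image: its anisotropic TV
# is the perimeter, 2 * (2 + 3) = 10.
a = np.zeros((5, 6))
a[1:3, 2:5] = 1.0
print(anisotropic_tv(a))  # 10.0
```

For indicator functions, the quantity returned is exactly the (anisotropic) perimeter of the underlying set, in pixel units.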

The search for the optimal $f^{\star}\in\mathcal{X}_{1}$ thus reduces to a compressed-sensing problem [9, 10, 11], which can be solved using classical algorithms such as FISTA [12] or primal-dual methods [13, 14, 15, 16]. One reason for the popularity of TV regularization is the strong theoretical foundation on which it is built.

The norm $\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})^{2}}$ is well defined not only for pixels, but also for indicator functions of Caccioppoli sets [17, 18], i.e., sets that can be approximated in measure by polygons. Within the framework of representer theorems (RTs) for Banach spaces [19, 20, 21, 22], Bredies and Carioni [23] built upon the results of [24, 25, 26] to show that TV regularization promotes solutions of inverse problems that are finite linear combinations of simple sets, i.e., Caccioppoli sets whose boundary is represented by a unique Jordan curve [26].

In contrast to the spline solutions promoted by the one-dimensional TV counterpart $\left\lVert\mathrm{D}\{\cdot\}\right\rVert_{\mathcal{M}(\mathbb{R})}$, the generality of these simple sets prevents them from benefiting from the convenient spline structure [27, 28, 29, 30, 31].

I-B Mixed-Derivative Regularization

One approach to promoting pixel-type solutions with a spline structure is to impose additional regularity. In [32], the authors considered the search space $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D}}(\mathrm{K})$, consisting of functions $f$ whose generalized gradient $\nabla\{f\}$ and generalized mixed derivative $[\mathrm{D}\otimes\mathrm{D}]\{f\}$ are bounded measures supported in $\mathrm{K}=[0,1]\times[0,1]$. They showed that the extreme points of the unit ball associated with the norm $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})}$ are tensor products of $\mathrm{D}$-splines. However, this formulation is not entirely satisfactory because the regularization $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})}$ does not control the perimeter of sets, but only their number of corners. As a result, it may promote solutions that contain undesirable components, such as indicators of unbounded sets.

I-C Contributions

In this paper, for $\theta\in\,]0,1]$, we regularize with the new norm

\left\lVert\cdot\right\rVert_{\theta}=\theta\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})}+(1-\theta)\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}(\mathbb{R}^{2})^{2}}, \quad (5)

which generalizes the anisotropic TV. Our contributions are fourfold.

Functional Framework: We define the search space $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ as the set of functions $f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D}}(\mathrm{K})$ supported in $\mathrm{K}$. We prove that this space is a dual Banach space for the norm $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}}$, and we establish that all $\left\lVert\cdot\right\rVert_{\theta}$-norms are equivalent.

Extreme Point Characterization: We show that the extreme points of the unit ball of the convex cone of positive functions $\left(\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}),\left\lVert\cdot\right\rVert_{\theta}\right)$ are indicator functions of polygons whose edges are parallel to either the $x_{1}$- or the $x_{2}$-axis. We then apply this result to the regularization of abstract IPs with finite data.

Resolution of IPs: We demonstrate that tensor-product B-splines of order 1 exactly discretize a broad class of IP loss functionals, and that $\left\lVert\cdot\right\rVert_{\theta}$ can be discretized exactly as

\theta\left\lVert\bm{h}_{1,1}\ast\mathbf{a}\right\rVert_{\ell_{1}}+(1-\theta)c\left\lVert\bm{h}_{0,1}\ast\mathbf{a}\right\rVert_{\ell_{1}}+(1-\theta)c\left\lVert\bm{h}_{1,0}\ast\mathbf{a}\right\rVert_{\ell_{1}}, \quad (6)

where $\bm{h}_{1,1}=\bm{h}_{0,1}\otimes\bm{h}_{1,0}$. We further prove that these exact discretizations converge, in an appropriate sense, to the continuous-domain IP as the pixel size tends to zero.
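For concreteness, here is a small sketch (ours, not the paper's code) of the pixel-domain discretization of $\left\lVert\cdot\right\rVert_{\theta}$, with $\theta$ weighting the mixed-difference term and $(1-\theta)$ the gradient terms, matching the weights of the continuous norm (5); the constant $c$ and the test image are assumptions.

```python
import numpy as np

def theta_norm(a, theta, c=1.0):
    """Pixel-domain discretization of the mixed norm.

    First differences along each axis stand in for h_{0,1} and h_{1,0};
    their composition realizes the mixed filter h_{1,1} = h_{0,1} (x) h_{1,0}.
    """
    d01 = np.abs(np.diff(a, axis=0)).sum()                   # ||h_{0,1} * a||_1
    d10 = np.abs(np.diff(a, axis=1)).sum()                   # ||h_{1,0} * a||_1
    d11 = np.abs(np.diff(np.diff(a, axis=0), axis=1)).sum()  # ||h_{1,1} * a||_1
    return theta * d11 + (1 - theta) * c * (d01 + d10)

# For a 2x3 rectangle, the gradient part is the perimeter (10) and the
# mixed-derivative part counts the four corners (4).
a = np.zeros((5, 6))
a[1:3, 2:5] = 1.0
print(theta_norm(a, theta=0.5))  # 0.5 * 4 + 0.5 * 10 = 7.0
```

The mixed term thus penalizes corners, while the gradient term penalizes perimeter, which reflects the roles of the two terms in (5).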

Application to Denoising: We prove that the canonical pixel-based discretization of the denoising loss admits a unique solution, which also solves the corresponding continuous-domain problem. Finally, we numerically assess the efficiency of the proposed method.

I-D Related Works

Higher-Order Generalization of TV. Bredies et al. introduced the generalized total variation (GTV) [33, 34, 23], which incorporates all derivatives up to a specified order $k$. In particular, the second-order TV ($k=2$) is defined on the space of functions with bounded Hessian [35] and, for a smooth function $f$, takes the form

\mathrm{TV}^{(2)}\{f\}=\int_{\mathbb{R}^{2}}\left\lVert\nabla^{2}\{f\}(\mathbf{x})\right\rVert_{F}\,\mathrm{d}\mathbf{x}, \quad (7)

where $\left\lVert\cdot\right\rVert_{F}$ denotes the Frobenius norm. Lefkimmiatis et al. extended this notion by introducing the Hessian TV (HTV), in which the Frobenius norm is replaced by a Schatten-$p$ norm [36, 37, 38]. Regularization with HTV has been employed for the learning of continuous piecewise-linear (CPWL) mappings [39, 40, 41]. Ambrosio et al. showed that the closure, in the unit ball of the space of functions with finite HTV, of the set of CPWL extreme points contains all extremal points [42, 43].

Our regularization lies, in terms of regularity constraints, between TV and HTV. Specifically, it enforces more regularity than TV due to the presence of the $\mathrm{D}\otimes\mathrm{D}$ term, but less than HTV because it omits the terms $\mathrm{I}\otimes\mathrm{D}^{2}$ and $\mathrm{D}^{2}\otimes\mathrm{I}$. In particular, HTV is not well defined on the pixel basis.

Extreme Points of Mixed Norms. We investigate the extreme points of the unit ball associated with our regularization norm, which takes the form

\left\lVert\cdot\right\rVert=\theta\left\lVert\cdot\right\rVert_{A}+(1-\theta)\left\lVert\cdot\right\rVert_{B},

where $\left\lVert\cdot\right\rVert_{A}$ and $\left\lVert\cdot\right\rVert_{B}$ are two norms whose extreme points are known. When the search space $\mathcal{X}$ can be decomposed as a direct sum $\mathcal{X}_{A}\oplus\mathcal{X}_{B}$, with $\left\lVert\cdot\right\rVert_{A}|_{\mathcal{X}_{B}}=0$ and $\left\lVert\cdot\right\rVert_{B}|_{\mathcal{X}_{A}}=0$, the extreme points of $\left\lVert\cdot\right\rVert$ have been systematically characterized [44]. However, when such a direct-sum decomposition is not possible, as is the case in this work, the description of the extreme points becomes significantly more challenging. In finite dimensions, they are located at the intersections of the hyperplanes that define the boundaries of the unit balls [45, 46, 47]. To the best of our knowledge, no analogous characterization is currently available in infinite-dimensional settings.

Continuous Data. TV was originally introduced as a regularization for the denoising loss $\left\lVert f-\cdot\right\rVert_{\mathcal{L}_{2}(\Omega)}$, where $\Omega$ is a domain and $f$ denotes the continuous data [5]. Subsequent works have deepened the theoretical understanding of this approach. The existence and uniqueness of solutions of this problem, as well as of the TV-flow problem, were established in [48, 49, 50]. The regularity of solutions was analysed in [51], while discretization and convergence properties, including convergence rates, were investigated in [52, 53, 54]. Chambolle and Pock provide a comprehensive overview of TV discretization and convergence techniques in [55]. In the present work, we focus on the study of IPs with finite data.

II Theoretical Formulation

II-A Mixed-Derivative Native Space

Let $\mathrm{K}=[0,1]^{2}$ denote the unit square. We define the native space $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ as

\left\{f\in\mathcal{D}^{\prime}(\mathbb{R}^{2}):\begin{cases}[\mathrm{D}\otimes\mathrm{D}]\{f\}\in\mathcal{M}(\mathrm{K}),\\ \forall\phi\in\mathcal{C}_{c}^{\infty}(\mathrm{K}^{c}),\ \langle f,\phi\rangle=0,\end{cases}\right\} \quad (8)

where $\mathcal{M}(\mathrm{K})$ denotes the space of bounded measures supported in $\mathrm{K}$, and $\mathrm{K}^{c}=\mathbb{R}^{2}\setminus\mathrm{K}$. The second condition in (8) enforces that the function $f$ vanishes outside $\mathrm{K}$. In Proposition 1, we provide a more practical, integral-based characterization of the functions $f$ that belong to $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$. To this end, we introduce the integral transform $[\mathrm{D}\otimes\mathrm{D}]^{-1}\{\cdot\}$, defined for all $m\in\mathcal{M}(\mathrm{K})$ by

[\mathrm{D}\otimes\mathrm{D}]^{-1}\{m\}=\int_{\mathrm{K}}u(\cdot-x_{1})\,u(\cdot-x_{2})\,\mathrm{d}m(x_{1},x_{2})=(u\otimes u)\ast m, \quad (9)

where $u$ is the Heaviside function.
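In a discrete analogue (our illustration, not part of the paper), the transform corresponds to a double cumulative sum, i.e., a convolution with the discretized $u\otimes u$, which inverts the mixed finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal((4, 5))

# Discrete mixed derivative [D (x) D]{f}: zero-pad on the left so that
# the boundary jumps are kept, mimicking the compact support of f.
g = np.pad(f, ((1, 0), (1, 0)))
m = np.diff(np.diff(g, axis=0), axis=1)

# Discrete analogue of [D (x) D]^{-1}{m} = (u (x) u) * m: cumulative
# sums along both axes integrate m back to f.
f_rec = np.cumsum(np.cumsum(m, axis=0), axis=1)

print(np.allclose(f_rec, f))  # True
```

The double telescoping sum reproduces exactly the identity $[\mathrm{D}\otimes\mathrm{D}]^{-1}\{[\mathrm{D}\otimes\mathrm{D}]\{f\}\}=f$ that Proposition 1 establishes in the continuous domain.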

Proposition 1.

A function $f$ belongs to $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ if and only if it belongs to

\left\{f\in[\mathrm{D}\otimes\mathrm{D}]^{-1}\{\mathcal{M}(\mathrm{K})\}:\forall\phi\in\mathcal{C}_{c}^{\infty}(\mathrm{K}^{c}),\ \langle f,\phi\rangle=0\right\}. \quad (10)
Proof of Proposition 1.

Clearly, if $f$ satisfies (10), then $f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$. To prove the converse, assume that $f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ and define $m=[\mathrm{D}\otimes\mathrm{D}]\{f\}$. We claim that

[\mathrm{D}\otimes\mathrm{D}]^{-1}\{m\}=f, \quad (11)

which proves the proposition. Observe that $[\mathrm{D}\otimes\mathrm{D}]^{-1}\{m\}$ may differ from $f$ only by an element of the null space of $\mathrm{D}\otimes\mathrm{D}$ in $\mathcal{D}^{\prime}(\mathbb{R}^{2})$. Hence, one can write

f=[\mathrm{D}\otimes\mathrm{D}]^{-1}\{m\}+\phi_{1}\otimes 1+1\otimes\phi_{2} \quad (12)

for some $(\phi_{1},\phi_{2})\in\mathcal{D}^{\prime}(\mathbb{R})^{2}$. Note that

\mathrm{supp}(m)\subset\mathrm{K}\ \Rightarrow\ \mathrm{supp}\left([\mathrm{D}\otimes\mathrm{D}]^{-1}\{m\}\right)\subset[0,\infty[^{2}\ \Rightarrow\ \mathrm{supp}(\phi_{1}\otimes 1+1\otimes\phi_{2})\subset[0,\infty[^{2}, \quad (13)

which directly implies that $\phi_{1}\otimes 1+1\otimes\phi_{2}=0$ and proves the claim. ∎

Let $\theta\in\,]0,1]$. We equip $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ with the mixed norm $\left\lVert\cdot\right\rVert_{\theta}$ defined as

\left\lVert\cdot\right\rVert_{\theta}=\theta\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}(\mathrm{K})}+(1-\theta)\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}(\mathrm{K})^{2}}, \quad (14)

with

\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}(\mathrm{K})^{2}}=\left\lVert[\mathrm{I}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}(\mathrm{K})}+\left\lVert[\mathrm{D}\otimes\mathrm{I}]\{\cdot\}\right\rVert_{\mathcal{M}(\mathrm{K})}. \quad (15)

In the sequel, we write $\left\lVert\cdot\right\rVert_{\mathcal{M}}$ instead of $\left\lVert\cdot\right\rVert_{\mathcal{M}(\mathrm{K})}$ and $\left\lVert\cdot\right\rVert_{\mathcal{M}^{2}}$ instead of $\left\lVert\cdot\right\rVert_{\mathcal{M}(\mathrm{K})^{2}}$. The key properties of $\left(\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}),\left\lVert\cdot\right\rVert_{\theta}\right)$ are summarised in Theorem 1.

Theorem 1.

Let $\theta\in\,]0,1]$. Then, the vector space $\big(\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}),\left\lVert\cdot\right\rVert_{\theta}\big)$ satisfies the following properties:

  • 1.

    All norms $\left\lVert\cdot\right\rVert_{\theta}$ are well defined and equivalent; that is, for all $(\theta,\theta^{\prime})\in\,]0,1]^{2}$, there exists a constant $C>0$ such that

    \forall f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}):\quad\left\lVert f\right\rVert_{\theta}\leq C\left\lVert f\right\rVert_{\theta^{\prime}}. \quad (16)
  • 2.

    It is a Banach space.

  • 3.

    Equipped with the norm $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}}$, it is the dual space of $[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}/M_{0}$, where

    M_{0}=\big\{v\in[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}:\forall f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}),\ \langle f,v\rangle=0\big\}, \quad (17)

    and $[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}$ is equipped with the norm

    \left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\phi\}\right\rVert=\left\lVert\phi\right\rVert_{\mathcal{L}_{\infty}(\mathbb{R}^{2})}. \quad (18)
Proof of Theorem 1.

Item 1. Let $m=[\mathrm{D}\otimes\mathrm{D}]\{f\}$, so that $f=(u\otimes u)\ast m$ by Proposition 1. Then,

\left\lVert[\mathrm{I}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}=\left\lVert(u\otimes\delta)\ast m\right\rVert_{\mathcal{M}}\leq|\mathrm{K}|\left\lVert m\right\rVert_{\mathcal{M}}\ \Rightarrow\ \left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}\leq 2|\mathrm{K}|\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}, \quad (19)

where $|\mathrm{K}|=\int_{\mathrm{K}}1\,\mathrm{d}\mathbf{x}$. Consequently, (16) holds with

C=\max\left(\frac{\theta}{\theta^{\prime}},\frac{1-\theta}{1-\theta^{\prime}}\right) \quad\text{for}\quad\theta^{\prime}\in\,]0,1[, \quad (20)

C=\theta+2(1-\theta)|\mathrm{K}| \quad\text{for}\quad\theta^{\prime}=1. \quad (21)

Item 2. We know from [32] that $[\mathrm{D}\otimes\mathrm{D}]^{-1}\{\mathcal{M}(\mathrm{K})\}$ is a Banach space for the norm $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}}$. Since $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ is a closed subspace of $[\mathrm{D}\otimes\mathrm{D}]^{-1}\{\mathcal{M}(\mathrm{K})\}$, it is also a Banach space for $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}}$. By the norm equivalence, it follows that it is a Banach space for $\left\lVert\cdot\right\rVert_{\theta}$.

Item 3. Here, all quotient spaces are equipped with their quotient norms. We know from [32] that $[\mathrm{D}\otimes\mathrm{D}]^{-1}\{\mathcal{M}(\mathrm{K})\}$ equipped with $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}}$ is the dual space of $[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}/M$, with

M=\big\{v\in[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}:\forall f\in[\mathrm{D}\otimes\mathrm{D}]^{-1}\{\mathcal{M}(\mathrm{K})\},\ \langle f,v\rangle=0\big\}. \quad (22)

Since $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ is a Banach subspace of $[\mathrm{D}\otimes\mathrm{D}]^{-1}\{\mathcal{M}(\mathrm{K})\}$, the predual of $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ is given by the (pre-)annihilator

\left([\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}/M\right)/M_{0}=[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}/M_{0}. \quad (23)

∎

The key takeaway of Theorem 1 is that all $\theta$-norms, for $\theta>0$, are topologically equivalent on $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$. We have shown that $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ also admits a $\mathrm{weak}^{\star}$ topology, such that $(f_{n})_{n=1}^{\infty}$ is $\mathrm{weak}^{\star}$-convergent to $f$ in $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ if and only if

\forall[v]\in[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}/M_{0}:\quad\lim_{n\to\infty}\langle f_{n},[v]\rangle=\langle f,[v]\rangle. \quad (24)

Observe that, even though the quotient seems to complicate the predual structure, one still has that $(f_{n})_{n=1}^{\infty}$ is $\mathrm{weak}^{\star}$-convergent to $f$ if and only if

\forall v\in[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}:\quad\lim_{n\to\infty}\langle f_{n},v\rangle=\langle f,v\rangle, \quad (25)

because changing the representative in the quotient class, by definition, does not change the value of the brackets.

Next, we discuss the relation of $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ with the space $\mathrm{BV}(\Omega)$ of functions of bounded variation, which is naturally equipped with the norm $\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}^{2}}$. We recall that, for a domain $\Omega$,

\mathrm{BV}(\Omega)=\{f\in\mathcal{L}_{1}(\Omega):\nabla\{f\}\in\mathcal{M}(\Omega)^{2}\}. \quad (26)
Proposition 2.

For a domain $\Omega$ such that $\mathrm{K}\subset\Omega$, the following holds:

  • 1.

    the functional $\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}^{2}}$ is a norm on $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$;

  • 2.

    the space $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$ is a subspace of $\mathrm{BV}(\Omega)$, i.e.,

    \mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})\subset\mathrm{BV}(\Omega); \quad (27)
  • 3.

    the vector space $\big(\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}),\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}^{2}}\big)$ is not closed.

Proof of Proposition 2.

Item 1. Absolute homogeneity, the triangle inequality, and non-negativity are straightforward. In addition,

\left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}=0\Leftrightarrow f=c \quad (28)

for some $c\in\mathbb{R}$. Since $f$ is compactly supported, one has that $c=0$.

Item 2. It follows from Item 1 of Theorem 1 that $\nabla\{f\}\in\mathcal{M}(\mathbb{R}^{2})^{2}$. Because $\mathrm{supp}(\nabla\{f\})\subset\mathrm{K}\subset\Omega$, we find that $\nabla\{f\}\in\mathcal{M}(\Omega)^{2}$. The same argument applies to conclude that $f\in\mathcal{L}_{1}(\Omega)$.

Item 3. Let $\mathrm{K}=[0,2]^{2}$ and $\beta(\cdot)=\mathbbm{1}_{[0,1]}(\cdot)$. We define

f_{n+1}(\cdot)=\sum_{k=0}^{n}\beta\left(\frac{\cdot}{2^{-k}}\right)\otimes\beta\left(\frac{\cdot-c_{k}}{2^{-k}}\right) \quad (29)

with $c_{k}=2\left(1-\frac{1}{2^{k}}\right)$. Then, a straightforward calculation shows that the sequence $(f_{n})_{n=1}^{\infty}$, with $f_{n}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$, converges in $\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}^{2}}$ to $f^{\infty}$, which is a staircase with an accumulation of steps towards the $x_{2}$-axis. In addition, we calculate that

[\mathrm{D}\otimes\mathrm{D}]\{f^{\infty}\}=\delta_{(0,0)}-\delta_{(1,0)}+\sum_{k=1}^{\infty}\delta_{(2-c_{k},c_{k})}-\delta_{(2-c_{k+1},c_{k})}, \quad (30)

which is an unbounded measure with $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f^{\infty}\}\right\rVert_{\mathcal{M}}=\infty$. Consequently, $\big(\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}),\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}^{2}}\big)$ is not closed. ∎

II-B Sets, Indecomposable Sets and Approximations

We study the geometrical properties of sets $A$ that satisfy $\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A}\}\right\rVert_{\mathcal{M}}<\infty$. As a first step, we show that an indicator function $\mathbbm{1}_{A}$, as an element of $\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$, forms an equivalence class.

Proposition 3.

Let $A\subset\mathrm{K}$ be a measurable set. If a measurable set $A^{\prime}\subset\mathrm{K}$ differs from $A$ by a set of measure zero, then

\forall\theta\in\,]0,1],\quad\left\lVert\mathbbm{1}_{A}-\mathbbm{1}_{A^{\prime}}\right\rVert_{\theta}=0. \quad (31)

In particular, if $\mathbbm{1}_{A}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$, then $\mathbbm{1}_{A^{\prime}}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$.

Proof of Proposition 3.

We calculate that

\forall\phi\in\mathcal{C}_{c}^{\infty}(\mathbb{R}^{2}),\quad\langle\mathbbm{1}_{A}-\mathbbm{1}_{A^{\prime}},[\mathrm{D}\otimes\mathrm{D}]\{\phi\}\rangle=0 \quad (32)
\Rightarrow\ \forall\phi\in\mathcal{C}_{0}(\mathbb{R}^{2}),\quad\langle\mathbbm{1}_{A}-\mathbbm{1}_{A^{\prime}},[\mathrm{D}\otimes\mathrm{D}]\{\phi\}\rangle=0 \quad (33)
\Rightarrow\ \left\lVert\mathbbm{1}_{A}-\mathbbm{1}_{A^{\prime}}\right\rVert_{1}=0 \quad (34)
\Rightarrow\ \left\lVert\mathbbm{1}_{A}-\mathbbm{1}_{A^{\prime}}\right\rVert_{\theta}=0. \quad (35)

In (32), we used the assumption that $A^{\prime}$ differs from $A$ by a set of measure zero. In (33), we used the fact that $\mathcal{C}_{c}^{\infty}(\mathbb{R}^{2})$ is dense in $\mathcal{C}_{0}(\mathbb{R}^{2})$ and that the linear and continuous functional $\langle\mathbbm{1}_{A}-\mathbbm{1}_{A^{\prime}},\cdot\rangle$ admits a unique (linear and continuous) extension to $\mathcal{C}_{0}(\mathbb{R}^{2})$. In (34), we used Item 3 of Theorem 1. In (35), we used Item 1 of Theorem 1. Finally,

\mathbbm{1}_{A^{\prime}}=\mathbbm{1}_{A}-(\mathbbm{1}_{A}-\mathbbm{1}_{A^{\prime}})\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}). \quad (36)

∎

The fact that $\mathbbm{1}_{A}$ forms an equivalence class will often be left implicit in the sequel, as the appropriate representative is usually clear from the context. Proposition 4 shows that such indicators must take the form of a polygon with finitely many corners and with edges parallel to the $x_{1}$- or the $x_{2}$-axis.

Proposition 4.

If $\mathbbm{1}_{A}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$, then $[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A}\}$ is a discrete measure with finitely many atoms. In addition, one has the representation

\mathbbm{1}_{A}=\sum_{k=1}^{K}a_{k}\,u(\cdot-x_{1,k})\,u(\cdot-x_{2,k}),\quad\text{with}\quad a_{k}\in\{-1,1\} \quad (37)

and $(x_{1,k},x_{2,k})\in\mathrm{K}$.

Proof of Proposition 4.

Step 1. Let $m=[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A}\}$. By Proposition 1, we have the representation

\mathbbm{1}_{A}(\mathbf{t})=\int_{\mathrm{K}}u(t_{1}-x_{1})\,u(t_{2}-x_{2})\,\mathrm{d}m(x_{1},x_{2})=m([0,t_{1}]\times[0,t_{2}]). \quad (38)

Observe that, in (38), the value of $\mathbbm{1}_{A}(\mathbf{t})$ is set by measuring the closed square. Other choices, such as $m([0,t_{1}[\times[0,t_{2}[)$, would yield functions that fall in the same equivalence class as $\mathbbm{1}_{A}$.

Step 2. We assume by contradiction that there exists a set $E$ of the form

E=\,]e_{1}^{-},e_{1}^{+}]\times]e_{2}^{-},e_{2}^{+}]\cap\mathrm{K},\quad\text{with}\quad(e_{1}^{+},e_{2}^{+})\in\mathrm{K}, \quad (39)

such that $m(E)\notin\mathbb{Z}$. Observe that, for $E_{0,0}=[0,e_{1}^{-}]\times[0,e_{2}^{-}]\cap\mathrm{K}$, $E_{1,0}=\,]e_{1}^{-},e_{1}^{+}]\times[0,e_{2}^{-}]\cap\mathrm{K}$, and $E_{0,1}=[0,e_{1}^{-}]\times]e_{2}^{-},e_{2}^{+}]\cap\mathrm{K}$, one has that

m(E+E_{0,0}+E_{1,0}+E_{0,1})=\mathbbm{1}_{A}(e_{1}^{+},e_{2}^{+})\in\{0,1\}, \quad (40)

where the sum of sets denotes their disjoint union.

Consequently, since $m(E)\notin\mathbb{Z}$ while the total in (40) is an integer, at least one of the following holds:

  • $m(E_{0,0})\notin\mathbb{Z}$, in which case $m(E_{0,0})=\mathbbm{1}_{A}(e_{1}^{-},e_{2}^{-})\in\{0,1\}$, which is a contradiction;

  • $m(E_{0,1})\notin\mathbb{Z}$, in which case

    • if $m(E_{0,0})\notin\mathbb{Z}$, then we find a contradiction as before;

    • if $m(E_{0,0})\in\mathbb{Z}$, then $m(E_{0,1}+E_{0,0})\notin\mathbb{Z}$ while $m(E_{0,1}+E_{0,0})=\mathbbm{1}_{A}(e_{1}^{-},e_{2}^{+})\in\{0,1\}$, which is a contradiction;

  • $m(E_{1,0})\notin\mathbb{Z}$, in which case we reach a contradiction as in the $m(E_{0,1})\notin\mathbb{Z}$ case.

Further, one can prove that the measure $m$

  • 1.

    is integer-valued on intersections of sets of the form (39);

  • 2.

    is integer-valued on unions of sets of the form (39);

  • 3.

    is integer-valued on the relative complement of a set of the form (39).

It follows that $m$ is integer-valued on any Borel set [56, Proposition 1.1.5]. Together with the boundedness of $m$, this implies that [57, Chapters 1-2]

m=\sum_{k=1}^{K}a_{k}\delta_{(x_{1,k},x_{2,k})},\quad a_{k}\in\mathbb{Z}\setminus\{0\}. \quad (41)

The fact that $\mathbbm{1}_{A}$ is $\{0,1\}$-valued finally implies that $a_{k}\in\{-1,1\}$. ∎
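A quick discrete sanity check of Proposition 4 (our own illustration, not part of the paper): on a pixel grid, the mixed difference of an axis-aligned rectangle indicator consists of exactly four ±1 atoms, one per corner.

```python
import numpy as np

a = np.zeros((6, 6))
a[1:4, 2:5] = 1.0  # indicator of an axis-aligned rectangle

g = np.pad(a, ((1, 0), (1, 0)))          # enforce the zero boundary
m = np.diff(np.diff(g, axis=0), axis=1)  # discrete [D (x) D]{1_A}

atoms = m[m != 0]
print(sorted(atoms.tolist()))  # [-1.0, -1.0, 1.0, 1.0]
```

Any polygon with axis-aligned edges behaves likewise: the discrete mixed derivative is supported on the corners, with signs alternating as in (37).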

One important class is formed by the indecomposable sets [23, 26], which we relabel as $\nabla$-indecomposable in Definition 1. These sets are central in our description of the unit ball of $(\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}),\left\lVert\cdot\right\rVert_{\theta})$, as the indecomposability property translates into an extreme-point structure. We extend this class to $[\mathrm{D}\otimes\mathrm{D}]$-indecomposable sets in Definition 2.

Definition 1.

Let $A\subset\mathrm{K}$ be such that $\mathbbm{1}_{A}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$. The set $A$ is $\nabla$-decomposable if there exists a partition $A_{1},A_{2}$ of $A$ such that $\mathbbm{1}_{A_{1}}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$, $\mathbbm{1}_{A_{2}}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$,

\left\lVert\nabla\{\mathbbm{1}_{A_{1}}\}\right\rVert_{\mathcal{M}^{2}}+\left\lVert\nabla\{\mathbbm{1}_{A_{2}}\}\right\rVert_{\mathcal{M}^{2}}=\left\lVert\nabla\{\mathbbm{1}_{A}\}\right\rVert_{\mathcal{M}^{2}}<\infty, \quad (42)

and $|A_{1}|>0$, $|A_{2}|>0$. If the set $A$ is not $\nabla$-decomposable, it is $\nabla$-indecomposable. The set of all $\nabla$-indecomposable sets is denoted by

\mathcal{E}_{\nabla}. \quad (43)
Definition 2.

Let $A\subset\mathrm{K}$ be such that $\mathbbm{1}_{A}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$. The set $A$ is $[\mathrm{D}\otimes\mathrm{D}]$-decomposable if there exists a partition $A_{1},A_{2}$ of $A$ such that $\mathbbm{1}_{A_{1}}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$, $\mathbbm{1}_{A_{2}}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})$,

\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A_{1}}\}\right\rVert_{\mathcal{M}}+\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A_{2}}\}\right\rVert_{\mathcal{M}}=\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A}\}\right\rVert_{\mathcal{M}}<\infty, \quad (44)

and $|A_{1}|>0$, $|A_{2}|>0$. If the set $A$ is not $[\mathrm{D}\otimes\mathrm{D}]$-decomposable, it is $[\mathrm{D}\otimes\mathrm{D}]$-indecomposable. The set of all $[\mathrm{D}\otimes\mathrm{D}]$-indecomposable sets is denoted by

\mathcal{E}_{\mathrm{D}\otimes\mathrm{D}}. \quad (45)

The relation between $[\mathrm{D}\otimes\mathrm{D}]$-indecomposability and $\nabla$-indecomposability is revealed in Proposition 5.

Proposition 5.

The following strict inclusion holds:

\mathcal{E}_{\mathrm{D}\otimes\mathrm{D}}\subset\mathcal{E}_{\nabla}. \quad (46)
Proof of Proposition 5.

Item 1. We provide an example of a \nabla-indecomposable set AA that is not [DD][\mathrm{D}\otimes\mathrm{D}]-indecomposable. To do so, we define

𝟙A1(t1,t2)=β(t13)β(t2)\displaystyle\mathbbm{1}_{A_{1}}(t_{1},t_{2})=\beta\left(\frac{t_{1}}{3}\right)\otimes\beta(t_{2}) (47)
𝟙A2(t1,t2)=β(t11)β(t21)\displaystyle\mathbbm{1}_{A_{2}}(t_{1},t_{2})=\beta(t_{1}-1)\otimes\beta(t_{2}-1) (48)

as well as 𝟙A=𝟙A1+𝟙A2\mathbbm{1}_{A}=\mathbbm{1}_{A_{1}}+\mathbbm{1}_{A_{2}}. On the one hand, AA is [DD][\mathrm{D}\otimes\mathrm{D}]-decomposable as, i{1,2}\forall i\in\{1,2\},

[DD]{𝟙Ai}=4,[DD]{𝟙A}=8.\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A_{i}}\}\right\rVert_{\mathcal{M}}=4,\quad\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A}\}\right\rVert_{\mathcal{M}}=8. (49)

On the other hand, we know from [26, Proposition 2] that AA is \nabla-indecomposable.
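The corner counts in (49) can be checked numerically on a unit-pixel grid. Assuming (our reading of the paper's construction) that \beta is the B-spline of degree 0 supported on [0,1], the sets are A_{1}=[0,3]\times[0,1] and A_{2}=[1,2]\times[1,2]. The helper `corner_mass` below is ours, not from the paper; it sums absolute mixed finite differences, which is the discrete counterpart of \left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}} for piecewise-constant images.

```python
import numpy as np

def corner_mass(a):
    """Total mass of [D (x) D]{f} for the piecewise-constant image f
    encoded by pixel values a (unit pixels): the sum of absolute
    mixed finite differences over all grid nodes."""
    p = np.pad(a, 1)  # zero boundary: f vanishes outside the pixel region
    d = p[1:, 1:] - p[1:, :-1] - p[:-1, 1:] + p[:-1, :-1]
    return np.abs(d).sum()

# A1 = [0,3] x [0,1] (three pixels in a row), A2 = [1,2] x [1,2] (one pixel on top)
A1 = np.array([[1, 1, 1],
               [0, 0, 0]], dtype=float)
A2 = np.array([[0, 0, 0],
               [0, 1, 0]], dtype=float)

print(corner_mass(A1), corner_mass(A2), corner_mass(A1 + A2))  # 4.0 4.0 8.0
```

Since 4+4=8, the pair (A_{1},A_{2}) attains equality in (44), so A is indeed [\mathrm{D}\otimes\mathrm{D}]-decomposable, in agreement with (49).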

Item 2. We show that, if (A1,A2)(A_{1},A_{2}) is a \nabla-decomposition of AA, then it is also a [DD][\mathrm{D}\otimes\mathrm{D}]-decomposition. We define m=[DD]{𝟙A}m=[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A}\}, m1=[DD]{𝟙A1}m_{1}=[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A_{1}}\}, and m2=[DD]{𝟙A2}m_{2}=[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A_{2}}\}. It follows from Proposition 4 that mm, m1m_{1}, and m2m_{2} are discrete measures with finitely many atoms, such that

mi=k=1Kiai,kδbi,kδci,kwithai,k{1,1}.m_{i}=\sum_{k=1}^{K_{i}}a_{i,k}\delta_{b_{i,k}}\otimes\delta_{c_{i,k}}\quad\text{with}\quad a_{i,k}\in\{-1,1\}. (50)

Assume, by contradiction, that (A1,A2)(A_{1},A_{2}) fails to be a [DD][\mathrm{D}\otimes\mathrm{D}]-decomposition. Then, there exists (k,k)(k,k^{\prime}) such that

(b1,k,c1,k)=(b2,k,c2,k),anda1,k=a2,k.(b_{1,k},c_{1,k})=(b_{2,k^{\prime}},c_{2,k^{\prime}}),\quad\text{and}\quad a_{1,k}=-a_{2,k^{\prime}}. (51)

This means that (b_{1,k},c_{1,k}) (respectively, (b_{2,k^{\prime}},c_{2,k^{\prime}})) is a corner of A_{1} (respectively, A_{2}). Observe that, for (A_{1},A_{2}) to be a \nabla-decomposition of A, the edges of A_{1} and A_{2} may overlap only at corners, not along segments. This implies that the two corners face each other, in which case a_{1,k} and a_{2,k^{\prime}} must have the same value. This is a contradiction.

Indicator functions \mathbbm{1}_{A}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}) are useful not only for their decomposability structure, but also for their approximation properties. We conclude this section with Lemma 1, which states that any function f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}) can be approximated, in an appropriate sense, by linear combinations of indicator functions. The proof is given in Appendix -A.

Lemma 1.

For fDD,0(K)f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}), there exists a sequence (fn)n=1(f_{n})_{n=1}^{\infty} of functions fnDD,0(K)f_{n}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}), such that [DD]{fn}[\mathrm{D}\otimes\mathrm{D}]\{f_{n}\} is a discrete measure supported on 1n[1n]×1n[1n],\frac{1}{n}[1\cdots n]\times\frac{1}{n}[1\cdots n], and such that

  • 1. it is \mathrm{weak}^{\star}-convergent to f in \mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K});

  • 2. over the mixed derivative, \left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}\geq\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f_{n}\}\right\rVert_{\mathcal{M}} for all n and \left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}=\underset{n\to\infty}{\mathrm{lim}}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f_{n}\}\right\rVert_{\mathcal{M}};

  • 3. it is convergent to f in \mathcal{L}_{1}(\mathrm{K});

  • 4. over the gradient, \left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}=\underset{n\to\infty}{\mathrm{lim}}\left\lVert\nabla\{f_{n}\}\right\rVert_{\mathcal{M}^{2}}.

II-C Coarea and Cocorner Formulas

In the sequel, we will use the (convex) cone DD,0+(K)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}) of positive functions, defined as

{fDD,0(K):ϕ𝒞c(2),ϕ0f,ϕ0},\big\{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}):\forall\phi\in\mathcal{C}_{c}^{\infty}(\mathbb{R}^{2}),\ \phi\geq 0\Rightarrow\langle f,\phi\rangle\geq 0\big\}, (52)

which is closed in both the strong and the weak\mathrm{weak}^{\star} topologies.

We develop tools to split a function fDD,0+(2)f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathbb{R}^{2}) into a sum of indicator functions on indecomposable sets. To this end, we first recall the coarea formula. For fDD,0+(2)f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathbb{R}^{2}) and s[0,[s\in[0,\infty[, we define the functions

P(f,s)={min(f,s)}2P_{-}(f,s)=\left\lVert\nabla\{\min(f,s)\}\right\rVert_{\mathcal{M}^{2}} (53)

and

P+(f,s)={max(fs,0)}2.P_{+}(f,s)=\left\lVert\nabla\{\max(f-s,0)\}\right\rVert_{\mathcal{M}^{2}}. (54)
Theorem 2.

The functions P(f,)P_{-}(f,\cdot) and P+(f,)P_{+}(f,\cdot) are continuous, respectively increasing and decreasing, and satisfy the equalities

P(f,s)=0s{𝟙{x:f(x)α}}2dα,P_{-}(f,s)=\int_{0}^{s}\left\lVert\nabla\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}^{2}}\,\mathrm{d}\alpha, (55)
P+(f,s)=s{𝟙{x:f(x)α}}2dα.P_{+}(f,s)=\int_{s}^{\infty}\left\lVert\nabla\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}^{2}}\,\mathrm{d}\alpha. (56)

In addition, they satisfy the BVBV-coarea formula

{f}2=P(f,s)+P+(f,s)=0{𝟙{x:f(x)α}}2dα.\left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}=P_{-}(f,s)+P_{+}(f,s)\\ =\int_{0}^{\infty}\left\lVert\nabla\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}^{2}}\,\mathrm{d}\alpha. (57)

The coarea formula, for the \ell_{2} norm on the gradient, was proved for Lipschitz functions in [58] and for BV functions in [59], and was later extended to general norms (see, for example, [60]). Equations (55) and (56) follow from applying the coarea formula to \text{min}(f,s) and \text{max}(f-s,0).
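For a nonnegative integer-valued pixel image and the anisotropic (\ell_{1}) discrete gradient, the layer-cake identity behind (57) holds exactly, because |x-y|=\sum_{\alpha\geq 1}|\mathbbm{1}_{x\geq\alpha}-\mathbbm{1}_{y\geq\alpha}| for nonnegative integers. A minimal numerical sketch (the helper `tv_l1` and the test image are our own choices, and the continuous formula uses the measure norm of \nabla rather than this discrete surrogate):

```python
import numpy as np

def tv_l1(a):
    """Anisotropic discrete total variation with zero boundary:
    sum of absolute horizontal and vertical differences."""
    p = np.pad(a, 1)
    return np.abs(np.diff(p, axis=0)).sum() + np.abs(np.diff(p, axis=1)).sum()

f = np.array([[0, 1, 2],
              [1, 3, 1],
              [0, 1, 0]])

# Coarea: TV(f) equals the summed perimeters of the level sets {f >= alpha}.
levels = sum(tv_l1((f >= alpha).astype(int)) for alpha in range(1, f.max() + 1))
print(tv_l1(f) == levels)
```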

Next, we provide a new formula inspired by the coarea one. We define the functions C(f,)C_{-}(f,\cdot) and C+(f,)C_{+}(f,\cdot) as

C(f,s)=[DD]{min(f,s)},C_{-}(f,s)=\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathrm{min}(f,s)\}\right\rVert_{\mathcal{M}}, (58)
C+(f,s)=[DD]{max(fs,0)}.C_{+}(f,s)=\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathrm{max}(f-s,0)\}\right\rVert_{\mathcal{M}}. (59)

In Lemma 2, we adapt Theorem 2 to the operator \mathrm{D}\otimes\mathrm{D} and to piecewise-constant functions.

Lemma 2.

Let fDD,0+(K)f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}) be such that [DD]{f}[\mathrm{D}\otimes\mathrm{D}]\{f\} is a discrete measure with finitely many atoms. Then, the functions C(f,)C_{-}(f,\cdot) and C+(f,)C_{+}(f,\cdot) are continuous, respectively increasing and decreasing, and satisfy the equalities

C(f,s)=0s[DD]{𝟙{x:f(x)α}}dα,C_{-}(f,s)=\int_{0}^{s}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}}\mathrm{d}\alpha, (60)
C+(f,s)=s[DD]{𝟙{x:f(x)α}}dα.C_{+}(f,s)=\int_{s}^{\infty}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}}\mathrm{d}\alpha. (61)

In addition, they verify the cocorner formula

[DD]{f}=\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}= C(f,s)+C+(f,s)\displaystyle\,C_{-}(f,s)+C_{+}(f,s)
=\displaystyle= 0[DD]{𝟙{x:f(x)α}}dα.\displaystyle\int_{0}^{\infty}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}}\mathrm{d}\alpha. (62)
Proof of Lemma 2.

We set [DD]{f}=m=k=1Kakδ𝐱k[\mathrm{D}\otimes\mathrm{D}]\{f\}=m=\sum_{k=1}^{K}a_{k}\delta_{\mathbf{x}_{k}} and define

C~(f,s)=0s[DD]{𝟙{x:f(x)α}}dα.\tilde{C}_{-}(f,s)=\int_{0}^{s}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}}\mathrm{d}\alpha. (63)

Observe that f takes finitely many values v_{0}<v_{1}<\cdots<v_{M}, with v_{0}=0 and the convention v_{M+1}=\infty. Then, for \alpha\in]v_{m},v_{m+1}[ one has that

{x:f(x)α}={x:f(x)>vm}.\{x:f(x)\geq\alpha\}=\{x:f(x)>v_{m}\}. (64)

For s]vm,vm+1[s\in]v_{m^{\prime}},v_{m^{\prime}+1}[ with m[0M1]m^{\prime}\in[0\cdots M-1], we calculate that

C~(f,s)=\displaystyle\tilde{C}_{-}(f,s)= m=1m(vmvm1)[DD]{𝟙{x:f(x)>vm1}}\displaystyle\sum_{m=1}^{m^{\prime}}(v_{m}-v_{m-1})\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)>v_{m-1}\}}\}\right\rVert_{\mathcal{M}}
+(svm)[DD]{𝟙{x:f(x)>vm}}.\displaystyle+(s-v_{m^{\prime}})\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)>v_{m^{\prime}}\}}\}\right\rVert_{\mathcal{M}}. (65)

It follows directly from the representation (65) that \tilde{C}_{-}(f,\cdot) is increasing and continuous. Next, we observe that [\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)>\alpha\}}\} may only be non-zero on the atoms \mathbf{x}_{k}. In addition, for \alpha<\alpha^{\prime}, \mathbf{x}_{k} is in the support of both [\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)>\alpha^{\prime}\}}\} and [\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)>\alpha\}}\} if and only if it is a corner of both \mathbbm{1}_{\{x:f(x)>\alpha^{\prime}\}} and \mathbbm{1}_{\{x:f(x)>\alpha\}}, in which case the two Dirac masses centred at \mathbf{x}_{k} must have amplitudes of the same sign because \{x:f(x)>\alpha^{\prime}\}\subset\{x:f(x)>\alpha\}. It follows from this remark and (65) that

C~(f,s)=[DD]{m=1m(vmvm1)𝟙{x:f(x)>vm1}}+[DD]{(svm)𝟙{x:f(x)>vm}}\tilde{C}_{-}(f,s)=\Bigg\|[\mathrm{D}\otimes\mathrm{D}]\left\{\sum_{m=1}^{m^{\prime}}(v_{m}-v_{m-1})\mathbbm{1}_{\{x:f(x)>v_{m-1}\}}\right\}\\ +[\mathrm{D}\otimes\mathrm{D}]\{(s-v_{m^{\prime}})\mathbbm{1}_{\{x:f(x)>v_{m^{\prime}}\}}\}\Bigg\|_{\mathcal{M}} (66)

and, consequently, that \tilde{C}_{-}(f,s)=C_{-}(f,s). A similar argument shows that C_{+}(f,\cdot) is continuous and decreasing, and that

C+(f,s)=s[DD]{𝟙{x:f(x)α}}dα.\displaystyle C_{+}(f,s)=\int_{s}^{\infty}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}}\mathrm{d}\alpha. (67)

Finally,

C(f,s)+C+(f,s)\displaystyle C_{-}(f,s)+C_{+}(f,s) =0[DD]{𝟙{x:f(x)α}}dα\displaystyle=\int_{0}^{\infty}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{\{x:f(x)\geq\alpha\}}\}\right\rVert_{\mathcal{M}}\mathrm{d}\alpha
=lims[DD]{min(f,s)}\displaystyle=\underset{s\to\infty}{\text{lim}}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\text{min}(f,s)\}\right\rVert_{\mathcal{M}}
=[DD]{f}.\displaystyle=\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}. (68)
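The cocorner formula (62) can be illustrated numerically on a piecewise-constant f with nested level sets (a "staircase" of rectangles), for which the sign-compatibility of shared corners used in the proof is apparent. The discrete mixed-difference surrogate of \left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}} below (helper name and example are ours, unit pixels) makes the layer masses sum exactly to the total corner mass:

```python
import numpy as np

def corner_mass(a):
    """Sum of absolute mixed differences: discrete surrogate of ||[D (x) D]{f}||_M."""
    p = np.pad(a, 1)
    d = p[1:, 1:] - p[1:, :-1] - p[:-1, 1:] + p[:-1, :-1]
    return np.abs(d).sum()

# f = 1 on [0,3]^2 plus an extra 1 on [1,2]^2: two nested level sets.
f = np.array([[1, 1, 1],
              [1, 2, 1],
              [1, 1, 1]], dtype=float)

total = corner_mass(f)                                # ||[D (x) D]{f}||
layers = sum(corner_mass((f >= alpha).astype(float))  # cocorner sum over levels
             for alpha in range(1, int(f.max()) + 1))
print(total, layers)  # 8.0 8.0
```

Each of the two nested squares contributes four corners of mass one, and no cancellation occurs between levels, in line with the argument above.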

In Lemma 3, we extend Lemma 2, through a limit argument, to arbitrary functions in \mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}).

Lemma 3.

If fDD,0+(K)f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}), then the functions C(f,)C_{-}(f,\cdot) and C+(f,)C_{+}(f,\cdot) are continuous and respectively increasing and decreasing. In addition, they verify the equality

\forall s>0:\quad\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}=C_{-}(f,s)+C_{+}(f,s). (69)
Proof of Lemma 3.

Step 1. Consider a sequence of functions (fn)n=1(f_{n})_{n=1}^{\infty} converging to ff as in Lemma 1. Without loss of generality, we assume that fnDD,0+(K)f_{n}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}). Our Claim 1 is that, up to a subsequence,

limnC(fn,s)=C(f,s),\underset{n\to\infty}{\text{lim}}C_{-}(f_{n},s)=C_{-}(f,s), (70)

and

limnC+(fn,s)=C+(f,s).\underset{n\to\infty}{\text{lim}}C_{+}(f_{n},s)=C_{+}(f,s). (71)

To prove this, we first recall that

min(fn,s)=fn+s|fns|2.\text{min}(f_{n},s)=\frac{f_{n}+s-|f_{n}-s|}{2}. (72)

Item 3 of Lemma 1 implies that f_{n}+s and f_{n}-s converge in \mathcal{L}_{1}(\mathrm{K}) to f+s and f-s, which, together with the triangle inequality, yields the convergence of |f_{n}-s| to |f-s|. Therefore, \text{min}(f_{n},s) converges in \mathcal{L}_{1}(\mathrm{K}) to \text{min}(f,s). In particular,

ψ([0,1])([0,1])([0,1]2):limnmin(fn,s),ψ=min(f,s),ψ.\forall\psi\in\mathcal{L}_{\infty}([0,1])\otimes\mathcal{L}_{\infty}([0,1])\subset\mathcal{L}_{\infty}([0,1]^{2}):\\ \underset{n\to\infty}{\text{lim}}\langle\text{min}(f_{n},s),\psi\rangle=\langle\text{min}(f,s),\psi\rangle. (73)

Remark that

D{𝒞c()}|[0,1]D{𝒞c()}|[0,1]([0,1])([0,1])\mathrm{D}\{\mathcal{C}_{c}^{\infty}(\mathbb{R})\}|_{[0,1]}\otimes\mathrm{D}\{\mathcal{C}_{c}^{\infty}(\mathbb{R})\}|_{[0,1]}\subset\mathcal{L}_{\infty}([0,1])\otimes\mathcal{L}_{\infty}([0,1]) (74)

and that D{𝒞c()}D{𝒞c()}\mathrm{D}\{\mathcal{C}_{c}^{\infty}(\mathbb{R})\}\otimes\mathrm{D}\{\mathcal{C}_{c}^{\infty}(\mathbb{R})\} is dense in [DD]{𝒞0(2)}[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}. Furthermore, it follows from Item 2 of Lemma 1 that

[DD]{min(fn,s)}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\text{min}(f_{n},s)\}\right\rVert_{\mathcal{M}} [DD]{fn}\displaystyle\leq\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f_{n}\}\right\rVert_{\mathcal{M}}
[DD]{f}.\displaystyle\leq\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}. (75)

Consequently, the sequence (\text{min}(f_{n},s))_{n=1}^{\infty} is uniformly bounded and, from (73) and (74), \mathrm{weak}^{\star}-convergent in \mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}) to \text{min}(f,s). A similar argument shows that the sequence (\text{max}(f_{n}-s,0))_{n=1}^{\infty} is uniformly bounded and \mathrm{weak}^{\star}-convergent in \mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}) to \text{max}(f-s,0). It follows that

[DD]{min(f,s)}liminf nC(fn,s),\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\text{min}(f,s)\}\right\rVert_{\mathcal{M}}\leq\underset{n\to\infty}{\text{liminf }}C_{-}(f_{n},s), (76)

and that

[DD]{max(fs,0)}liminf nC+(fn,s).\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\text{max}(f-s,0)\}\right\rVert_{\mathcal{M}}\leq\underset{n\to\infty}{\text{liminf }}C_{+}(f_{n},s). (77)

In addition,

C(fn,s)+C+(fn,s)\displaystyle C_{-}(f_{n},s)+C_{+}(f_{n},s) =[DD]{fn}\displaystyle=\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f_{n}\}\right\rVert_{\mathcal{M}} (78)
[DD]{f}.\displaystyle\leq\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}. (79)

Finally, if either (76) or (77) held with a strict inequality, we would conclude that

[DD]{f}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}
\displaystyle\leq [DD]{min(f,s)}+[DD]{max(fs,0)}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\text{min}(f,s)\}\right\rVert_{\mathcal{M}}+\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\text{max}(f-s,0)\}\right\rVert_{\mathcal{M}} (80)
<\displaystyle<  liminf nC(fn,s)+liminf nC+(fn,s)\displaystyle\underset{n\to\infty}{\text{ liminf }}C_{-}(f_{n},s)+\underset{n\to\infty}{\text{liminf }}C_{+}(f_{n},s) (81)
\displaystyle\leq \left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}, (82)

which is a contradiction. In (80) we used the fact that

f=min(f,s)+max(fs,0),f=\text{min}(f,s)+\text{max}(f-s,0), (83)

and the triangle inequality. In (82) we used (79). Consequently, (76) and (77) hold with equality and, up to a subsequence, (70) and (71) hold.
Step 2. The function C(f,)C_{-}(f,\cdot) is increasing. Indeed, Lemma 2 yields that, n\forall n\in\mathbb{N} and s1s2\forall s_{1}\leq s_{2},

C(fn,s1)C(fn,s2)C(f,s1)C(f,s2),\displaystyle C_{-}(f_{n},s_{1})\leq C_{-}(f_{n},s_{2})\Rightarrow C_{-}(f,s_{1})\leq C_{-}(f,s_{2}), (84)

where we used Claim 1 to pass to the limit. The claim that C+(f,)C_{+}(f,\cdot) is decreasing and (69) are proved likewise.
Step 3. We now show that C(f,)C_{-}(f,\cdot) and C+(f,)C_{+}(f,\cdot) are continuous. Let (sn)n=1(s_{n})_{n=1}^{\infty} be a sequence convergent to ss. An argument similar to the one in Step 1 yields that (in parallel with (76) and (77))

C(f,s)\displaystyle C_{-}(f,s) =liminfnC(f,sn),\displaystyle=\underset{n\to\infty}{\mathrm{liminf}}\quad C_{-}(f,s_{n}),
C+(f,s)\displaystyle C_{+}(f,s) =liminfnC+(f,sn),\displaystyle=\underset{n\to\infty}{\mathrm{liminf}}\quad C_{+}(f,s_{n}), (85)

from which it follows that

[DD]{f}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}} =liminfnC(f,sn)+C+(f,sn)\displaystyle=\underset{n\to\infty}{\mathrm{liminf}}\quad C_{-}(f,s_{n})+C_{+}(f,s_{n})
limsupnC(f,sn)+C+(f,sn)\displaystyle\leq\underset{n\to\infty}{\mathrm{limsup}}\quad C_{-}(f,s_{n})+C_{+}(f,s_{n}) (86)
=limsupn[DD]{f}\displaystyle=\underset{n\to\infty}{\mathrm{limsup}}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}} (87)
=[DD]{f}.\displaystyle=\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}. (88)

Equation (87) follows from (69), and the inequality in (86) cannot be strict. It follows that

C(f,s)\displaystyle C_{-}(f,s) =liminfnC(f,sn)=limsupnC(f,sn),\displaystyle=\underset{n\to\infty}{\mathrm{liminf}}\quad C_{-}(f,s_{n})=\underset{n\to\infty}{\mathrm{limsup}}\quad C_{-}(f,s_{n}),
C+(f,s)\displaystyle C_{+}(f,s) =liminfnC+(f,sn)=limsupnC+(f,sn).\displaystyle=\underset{n\to\infty}{\mathrm{liminf}}\quad C_{+}(f,s_{n})=\underset{n\to\infty}{\mathrm{limsup}}\quad C_{+}(f,s_{n}). (89)

II-D Extreme Point Characterization

We are now in a position to state our main result, Theorem 3.

Theorem 3.

The extreme points of the unit ball in DD,0+(K)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}), equipped with θ\left\lVert\cdot\right\rVert_{\theta}, are exactly of the form a𝟙Aa\mathbbm{1}_{A} with a=1𝟙Aθa=\frac{1}{\left\lVert\mathbbm{1}_{A}\right\rVert_{\theta}} and,

  • if θ=1,\theta=1, then ADDA\in\mathcal{E}_{\mathrm{D}\otimes\mathrm{D}};

  • if θ]0,1[,\theta\in]0,1[, then A.A\in\mathcal{E}_{\nabla}.

Proof of Theorem 3.

Step 1. We show that an extreme point must be of the form a𝟙Aa\mathbbm{1}_{A}. Let ff be an extreme point and consider the functions

F(s)=θC(f,s)+(1θ)P(f,s).F_{-}(s)=\theta C_{-}(f,s)+(1-\theta)P_{-}(f,s). (90)
F+(s)=θC+(f,s)+(1θ)P+(f,s).F_{+}(s)=\theta C_{+}(f,s)+(1-\theta)P_{+}(f,s). (91)

We know that F()F_{-}(\cdot) and F+()F_{+}(\cdot) are continuous and respectively increasing and decreasing. In addition, they satisfy the equality

1=fθ=F(s)+F+(s).1=\left\lVert f\right\rVert_{\theta}=F_{-}(s)+F_{+}(s). (92)

By continuity, there exists s1s_{1} such that F(s1)=F+(s1)=0.5F_{-}(s_{1})=F_{+}(s_{1})=0.5, with the decomposition

f=12 2min(f,s1)+12 2max(fs1,0).f=\frac{1}{2}\,2\text{min}(f,s_{1})+\frac{1}{2}\,2\text{max}(f-s_{1},0). (93)

We find that

\left\lVert 2\text{min}(f,s_{1})\right\rVert_{\theta}=2F_{-}(s_{1})=1 (94)

and that

\left\lVert 2\text{max}(f-s_{1},0)\right\rVert_{\theta}=2F_{+}(s_{1})=1. (95)

Since ff is an extreme point, we must have that

2max(fs1,0)=2min(f,s1).2\text{max}(f-s_{1},0)=2\text{min}(f,s_{1}). (96)

On the set \{f\geq s_{1}\}, this equality yields f=2s_{1}; on \{f\leq s_{1}\}, it yields f=0. Hence, f has to be of the form a\mathbbm{1}_{A} with a=\frac{1}{\left\lVert\mathbbm{1}_{A}\right\rVert_{\theta}}. Finally, if \theta\in]0,1[, we assume by contradiction that A is not \nabla-indecomposable. Then, it is not [\mathrm{D}\otimes\mathrm{D}]-indecomposable either and there exist two sets A_{1},A_{2}, such that A=A_{1}\cup A_{2},

[DD]{𝟙A}=[DD]{𝟙A1}+[DD]{𝟙A2},\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A}\}\right\rVert_{\mathcal{M}}\\ =\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A_{1}}\}\right\rVert_{\mathcal{M}}+\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathbbm{1}_{A_{2}}\}\right\rVert_{\mathcal{M}}, (97)

and

{𝟙A}2={𝟙A1}2+{𝟙A2}2.\left\lVert\nabla\{\mathbbm{1}_{A}\}\right\rVert_{\mathcal{M}^{2}}=\left\lVert\nabla\{\mathbbm{1}_{A_{1}}\}\right\rVert_{\mathcal{M}^{2}}+\left\lVert\nabla\{\mathbbm{1}_{A_{2}}\}\right\rVert_{\mathcal{M}^{2}}. (98)

We set f1=𝟙A1𝟙A1θf_{1}=\frac{\mathbbm{1}_{A_{1}}}{\left\lVert\mathbbm{1}_{A_{1}}\right\rVert_{\theta}}, f2=𝟙A2𝟙A2θf_{2}=\frac{\mathbbm{1}_{A_{2}}}{\left\lVert\mathbbm{1}_{A_{2}}\right\rVert_{\theta}} and find that

a\mathbbm{1}_{A}=a\left\lVert\mathbbm{1}_{A_{1}}\right\rVert_{\theta}f_{1}+a\left\lVert\mathbbm{1}_{A_{2}}\right\rVert_{\theta}f_{2}, (99)

where the weights a\left\lVert\mathbbm{1}_{A_{1}}\right\rVert_{\theta} and a\left\lVert\mathbbm{1}_{A_{2}}\right\rVert_{\theta} are positive and, by (97) and (98), sum to one. This is in contradiction with the assumption that a\mathbbm{1}_{A} is an extreme point. The case \theta=1 is similar and omitted.
Step 2. We show that a𝟙Aa\mathbbm{1}_{A} is an extreme point. Assume by contradiction that it is not. Then, there exists λ]0,1[\lambda\in\,]0,1[ and two functions f1f2f_{1}\neq f_{2} in the unit ball such that

a𝟙A=λf1+(1λ)f2.a\mathbbm{1}_{A}=\lambda f_{1}+(1-\lambda)f_{2}. (100)

Let E[0,1]2E\subset[0,1]^{2} be a measurable set. We define

θ,E=θ[DD]{}(E)+(1θ){}(E)2.\left\lVert\cdot\right\rVert_{\theta,E}=\theta\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}(E)}+(1-\theta)\left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}(E)^{2}}. (101)

We must have

a𝟙Aθ,E=λf1θ,E+(1λ)f2θ,E,\left\lVert a\mathbbm{1}_{A}\right\rVert_{\theta,E}=\lambda\left\lVert f_{1}\right\rVert_{\theta,E}+(1-\lambda)\left\lVert f_{2}\right\rVert_{\theta,E}, (102)

otherwise we would reach the contradiction

1\displaystyle 1 =a𝟙Aθ,E+a𝟙Eθ,Ec\displaystyle=\left\lVert a\mathbbm{1}_{A}\right\rVert_{\theta,E}+\left\lVert a\mathbbm{1}_{E}\right\rVert_{\theta,E^{c}}
<λf1θ,E+(1λ)f2θ,E\displaystyle<\lambda\left\lVert f_{1}\right\rVert_{\theta,E}+(1-\lambda)\left\lVert f_{2}\right\rVert_{\theta,E}
+λf1θ,Ec+(1λ)f2θ,Ec=1,\displaystyle\quad+\lambda\left\lVert f_{1}\right\rVert_{\theta,E^{c}}+(1-\lambda)\left\lVert f_{2}\right\rVert_{\theta,E^{c}}=1, (103)

which is impossible.

If θ]0,1[\theta\in\,]0,1[, then (102) implies that supp{{fi}}A.\mathrm{supp}\{\nabla\{f_{i}\}\}\subset\partial A. In particular, {fi}\nabla\{f_{i}\} must vanish in the interior A\overset{\circ}{A}. It follows from the Constancy Theorem [61], [23, Lemma 4.6] that fif_{i} is constant on AA. Finally, since a𝟙A=λf1+(1λ)f2a\mathbbm{1}_{A}=\lambda f_{1}+(1-\lambda)f_{2} with f10f_{1}\geq 0 and f20f_{2}\geq 0, it follows that f1f_{1} and f2f_{2} must vanish on 2A\mathbb{R}^{2}\setminus A and satisfy f1=f2=a𝟙Af_{1}=f_{2}=a\mathbbm{1}_{A}. This contradicts the assumption that f1f2f_{1}\neq f_{2}.

If θ=1\theta=1, then supp{[DD]{f1}}\mathrm{supp}\{[\mathrm{D}\otimes\mathrm{D}]\{f_{1}\}\} and supp{[DD]{f2}}\mathrm{supp}\{[\mathrm{D}\otimes\mathrm{D}]\{f_{2}\}\} must be subsets of the finite set of corners of AA, denoted by AcA_{c}. We use EcE_{c} with subscript cc to denote the set of corners supp([DD]{1E})\mathrm{supp}([\mathrm{D}\otimes\mathrm{D}]\{1_{E}\}) of a set EE. The functions f1f_{1} and f2f_{2} must be piecewise constant on AA and vanish outside. Assume, by contradiction, that f1f_{1} and f2f_{2} are not constant on AA. Then,

f_{i}=\sum_{k=1}^{K}a_{i,k}\mathbbm{1}_{E_{k}},\quad\text{with}\quad a_{i,k}\neq a_{i,k^{\prime}}\ \text{for}\ k\neq k^{\prime}, (104)

and, by construction, a2,k=aa1,ka_{2,k}=a-a_{1,k}. We claim that there exists a k[1K]k\in[1\cdots K], wlog k=1k=1, such that

\begin{cases}A_{c}\subset E_{1,c}\\ \partial A\subset\partial E_{1}.\end{cases} (105)

Assume by contradiction that there exists no such kk. Then, there exists a corner x~\tilde{x} and two indices (k,k)(k,k^{\prime}), wlog (k,k)=(1,2)(k,k^{\prime})=(1,2), such that

{x~Ax~Acx~E1,cx~E2,c.\begin{cases}\tilde{x}\in\partial A\\ \tilde{x}\notin A_{c}\\ \tilde{x}\in E_{1,c}\\ \tilde{x}\in E_{2,c}.\end{cases} (106)

This implies that \tilde{x}\notin\mathrm{supp}([\mathrm{D}\otimes\mathrm{D}]\{f_{i}\}) and, therefore, that one must have a_{i,1}=a_{i,2} for the associated Dirac masses to cancel each other, which contradicts the distinctness of the a_{i,k} and proves the claim. We apply the same argument to find an index k, wlog k=2, such that

{E1AE2E1,cAcE2,c.\begin{cases}\partial E_{1}\setminus\partial A\subset\partial E_{2}\\ E_{1,c}\setminus A_{c}\subset E_{2,c}.\end{cases} (107)

In particular, E_{1} and E_{2} share a corner \tilde{x}\notin\mathrm{supp}([\mathrm{D}\otimes\mathrm{D}]\{f_{i}\}), whose Dirac masses must therefore cancel. This happens if and only if a_{i,1}=a_{i,2}, which is a contradiction. Thus, f_{1}=a_{1,1}\mathbbm{1}_{E_{1}}, f_{2}=a_{2,1}\mathbbm{1}_{E_{1}}, and a_{1,1}=a_{2,1}. Finally, f_{1}=f_{2}, which contradicts the initial assumption.

In Corollary 1, we leverage the characterisation of extreme points from Theorem 3 to analyse the solution set of a data-fidelity term regularised by θ.\left\lVert\cdot\right\rVert_{\theta}.

Corollary 1.

Let the loss functional 𝒥\mathcal{J} be defined as

𝒥()=E(𝐲,,𝝂)+λθ\mathcal{J}(\cdot)=E(\mathbf{y},\langle\cdot,\bm{\nu}\rangle)+\lambda\left\lVert\cdot\right\rVert_{\theta} (108)

with

  • \mathbf{y}\in\mathbb{R}^{M} is a fixed vector of available measurements;

  • the data-fidelity functional E:M×M+{}E:\mathbb{R}^{M}\times\mathbb{R}^{M}\to\mathbb{R}^{+}\cup\{\infty\} is proper, coercive, continuous, and strictly convex in its second argument;

  • the measurement operator ,𝝂:DD,0+(K)M\langle\cdot,\bm{\nu}\rangle:\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K})\to\mathbb{R}^{M} is linear and weak\mathrm{weak}^{\star} continuous.

Then, the solution set

𝒱=argminfDD,0+(K)𝒥(f)\mathcal{V}=\underset{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K})}{\operatorname{argmin}}\,\mathcal{J}(f) (109)

is nonempty, convex, and weak\mathrm{weak}^{\star}-compact. Moreover, its extreme points are of the form

k=1Kak𝟙Ek,withKM,\sum_{k=1}^{K}a_{k}\mathbbm{1}_{E_{k}},\quad\text{with}\quad K\leq M, (110)

where EkE_{k}\in\mathcal{E}_{\nabla} if θ]0,1[\theta\in]0,1[, and EkDDE_{k}\in\mathcal{E}_{\mathrm{D}\otimes\mathrm{D}} if θ=1\theta=1.

Proof of Corollary 1.

Define \mathcal{J}^{\star}=\underset{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K})}{\text{inf}}\mathcal{J}(f) and consider a minimizing sequence (f_{n})_{n=1}^{\infty} such that

<limn𝒥(fn)=𝒥<,-\infty<\underset{n\to\infty}{\text{lim}}\mathcal{J}(f_{n})=\mathcal{J}^{\star}<\infty, (111)

which exists because 𝒥\mathcal{J} is coercive and proper. It follows that (fn)n=1(f_{n})_{n=1}^{\infty} is bounded in θ\left\lVert\cdot\right\rVert_{\theta} and, by norm equivalence, bounded in [DD]{}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}}. According to the Banach–Alaoglu theorem, up to a subsequence, (fn)n=1(f_{n})_{n=1}^{\infty} is weak\mathrm{weak}^{\star}-convergent to some limit ff. It follows from the weak\mathrm{weak}^{\star}-lower semicontinuity of 𝒥\mathcal{J} that

limn𝒥(fn)=𝒥(f)=𝒥.\underset{n\to\infty}{\text{lim}}\mathcal{J}(f_{n})=\mathcal{J}(f)=\mathcal{J}^{\star}. (112)

Consequently, 𝒱\mathcal{V} is nonempty, and the characterisation of its extreme points follows from [22, Corollary 3.8] and Theorem 3. ∎

The norm \left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}} penalises the number of corners of the sets E_{k}, whereas \left\lVert\nabla\{\cdot\}\right\rVert_{\mathcal{M}^{2}} penalises their perimeter. Each has a distinct regularisation effect on the solutions of the optimisation problem. Informally, the former promotes solutions with simpler sets E_{k}, while the latter favours solutions with smaller sets E_{k}. The balance between the two effects is controlled by \theta, which, in practice, must be tuned.
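The contrast between the two penalties can be seen on equal-area shapes: with unit pixels, a 2x2 square and a 1x4 strip both have four corners, but different perimeters, so the corner term does not distinguish them while the gradient term does. A small numerical illustration (the helpers `corner_mass` and `perimeter` are discrete surrogates of our own, not the paper's continuous norms):

```python
import numpy as np

def corner_mass(a):
    """Sum of absolute mixed differences: discrete corner count."""
    p = np.pad(a, 1)
    d = p[1:, 1:] - p[1:, :-1] - p[:-1, 1:] + p[:-1, :-1]
    return np.abs(d).sum()

def perimeter(a):
    """Sum of absolute first differences: discrete perimeter."""
    p = np.pad(a, 1)
    return np.abs(np.diff(p, axis=0)).sum() + np.abs(np.diff(p, axis=1)).sum()

square = np.ones((2, 2))
strip = np.ones((1, 4))

print(corner_mass(square), corner_mass(strip))  # 4.0 4.0: same corner penalty
print(perimeter(square), perimeter(strip))      # 8.0 10.0: perimeter prefers the square
```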

We conclude this section with Proposition 6, which identifies an important class of weak\mathrm{weak}^{\star}-continuous measurement operators ,𝝂\langle\cdot,\bm{\nu}\rangle.

Proposition 6.

If 𝛎=(νm)m=1M\bm{\nu}=(\nu_{m})_{m=1}^{M} is such that νm(K)\nu_{m}\in\mathcal{L}_{\infty}(\mathrm{K}), then the operator

,𝝂:DD,0(K)M,f(f,νm)m=1M\langle\cdot,\bm{\nu}\rangle:\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})\to\mathbb{R}^{M},\quad f\mapsto(\langle f,\nu_{m}\rangle)_{m=1}^{M} (113)

is weak\mathrm{weak}^{\star}-continuous.

Proof of Proposition 6.

From [32] we have

DD,0(K)DD(K)DD(2).\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K})\subset\mathcal{M}_{\mathrm{D}\otimes\mathrm{D}}(\mathrm{K})\subset\mathcal{M}_{\mathrm{D}\otimes\mathrm{D}}(\mathbb{R}^{2}). (114)

The space DD(2)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D}}(\mathbb{R}^{2}), which is essentially the space of distributions f𝒟(2)f\in\mathcal{D}^{\prime}(\mathbb{R}^{2}) such that [DD]{f}(2)[\mathrm{D}\otimes\mathrm{D}]\{f\}\in\mathcal{M}(\mathbb{R}^{2}), is the dual space of 𝒞DD(2)\mathcal{C}_{\mathrm{D}\otimes\mathrm{D}}(\mathbb{R}^{2}). In particular, it is shown in [32] that (K)𝒞DD(2)\mathcal{L}_{\infty}(\mathrm{K})\subset\mathcal{C}_{\mathrm{D}\otimes\mathrm{D}}(\mathbb{R}^{2}). This implies that if 𝝂=(νm)m=1M\bm{\nu}=(\nu_{m})_{m=1}^{M} is such that νm(K)\nu_{m}\in\mathcal{L}_{\infty}(\mathrm{K}), then the operator

,𝝂:DD(2)M,f(f,νm)m=1M\langle\cdot,\bm{\nu}\rangle:\mathcal{M}_{\mathrm{D}\otimes\mathrm{D}}(\mathbb{R}^{2})\to\mathbb{R}^{M},\quad f\mapsto(\langle f,\nu_{m}\rangle)_{m=1}^{M} (115)

is weak\mathrm{weak}^{\star}-continuous. This, together with (114), completes the proof. ∎

III Resolution of the Optimization Problem

The resolution of the problem in (109) is not directly feasible, as the underlying search space DD,0+(K)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}) is infinite-dimensional. Moreover, the fact that the edges and corners of EkE_{k} in (110) may lie anywhere in K\mathrm{K} makes their estimation challenging.

III-A Pixel-Based Exact Discretization

We discretize the search space into a finite-dimensional cone of functions of the form (110), with corners that lie on the dyadic grid

\mathcal{X}_{n}=2^{-n}[0\cdots 2^{n}]\times 2^{-n}[0\cdots 2^{n}], (116)

with (2n+1)2(2^{n}+1)^{2} knots, and associated pixels

En1,n2n=[(n11)2n,n12n]×[(n21)2n,n22n].E^{n}_{n_{1},n_{2}}=\left[\frac{(n_{1}-1)}{2^{n}},\frac{n_{1}}{2^{n}}\right]\times\left[\frac{(n_{2}-1)}{2^{n}},\frac{n_{2}}{2^{n}}\right]. (117)

The discretized search space DD,0+(𝒳n)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n}) is

{fDD,0+(K):[DD]{f}(𝒳n)},\left\{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K}):\quad[\mathrm{D}\otimes\mathrm{D}]\{f\}\in\mathcal{M}(\mathcal{X}_{n})\right\}, (118)

and the associated discretized optimization problem is as in

𝒱n=argminfDD,0+(𝒳n)(E(𝐲,f,𝝂)+λfθ).\mathcal{V}_{n}=\underset{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n})}{\text{argmin}}\left(E(\mathbf{y},\langle f,\bm{\nu}\rangle)+\lambda\left\lVert f\right\rVert_{\theta}\right). (119)

Our objective is now to reformulate (119) as an optimisation problem over +2n×2n\mathbb{R}^{2^{n}\times 2^{n}}_{+}, where +=[0,[\mathbb{R}_{+}=[0,\infty[. We note that any function fDD,0+(𝒳n)f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n}) can be uniquely expressed as

f=n1,n2=1,12n,2nan1,n2𝟙En1,n2n,𝐚=(an1,n2)n1,n2=1,12n,2n+2n×2n,f=\sum_{n_{1},n_{2}=1,1}^{2^{n},2^{n}}a_{n_{1},n_{2}}\mathbbm{1}_{E_{n_{1},n_{2}}^{n}},\\ \mathbf{a}=(a_{n_{1},n_{2}})_{n_{1},n_{2}=1,1}^{2^{n},2^{n}}\in\mathbb{R}^{2^{n}\times 2^{n}}_{+}, (120)

which corresponds to the tensor product of B-splines of order 1. Furthermore, we introduce the analysis operator \mathrm{T}

T:DD,0+(𝒳n)+2n×2n,f(22nf,𝟙En1,n2n)n1,n2=1,12n,2n,\mathrm{T}:\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n})\to\mathbb{R}_{+}^{2^{n}\times 2^{n}},\\ f\mapsto\left(2^{2n}\langle f,\mathbbm{1}_{E^{n}_{n_{1},n_{2}}}\rangle\right)_{n_{1},n_{2}=1,1}^{2^{n},2^{n}}, (121)

and its adjoint, the synthesis operator \mathrm{T}^{\star}

T:+2n×2nDD,0+(𝒳n),𝐚n1,n2=1,12n,2nan1,n2𝟙En1,n2n.\mathrm{T}^{\star}:\mathbb{R}_{+}^{2^{n}\times 2^{n}}\to\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n}),\\ \mathbf{a}\mapsto\sum_{n_{1},n_{2}=1,1}^{2^{n},2^{n}}a_{n_{1},n_{2}}\mathbbm{1}_{E_{n_{1},n_{2}}^{n}}. (122)
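Since f=\mathrm{T}^{\star}\{\mathbf{a}\} is constant on each pixel, the coefficients 2^{2n}\langle f,\mathbbm{1}_{E^{n}_{n_{1},n_{2}}}\rangle are simply pixel averages, so \mathrm{T} recovers \mathbf{a} from \mathrm{T}^{\star}\{\mathbf{a}\}. A sketch with a fine sub-grid standing in for the continuous domain (the grid sizes and random coefficients are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2                      # dyadic level: 2^n x 2^n pixels on K = [0,1]^2
N, r = 2**n, 16            # r x r sub-samples per pixel emulate the continuous f

a = rng.random((N, N))             # nonnegative pixel coefficients
f = np.kron(a, np.ones((r, r)))    # T*{a}: piecewise-constant image on the fine grid

# T{f}: 2^{2n} <f, 1_E> is the average of f over each pixel E
a_rec = f.reshape(N, r, N, r).mean(axis=(1, 3))

print(np.allclose(a_rec, a))  # True: T inverts T* on this space
```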

The operators T\mathrm{T} and T\mathrm{T}^{\star} are implicitly dependent on the grid 𝒳n\mathcal{X}_{n}. In the sequel, we shall occasionally regard 𝐚\mathbf{a} as a sequence indexed over 2\mathbb{Z}^{2}, in which case its entries will be denoted by 𝐚[,]\mathbf{a}[\cdot,\cdot] and taken to be equal to 0 outside [02n]×[02n][0\cdots 2^{n}]\times[0\cdots 2^{n}]. We consider the new optimisation problem

𝒱~n=argmin𝐚+2n×2n(E(𝐲,T{𝐚},𝝂)+λ𝐚θ),\tilde{\mathcal{V}}_{n}=\underset{\mathbf{a}\in\mathbb{R}^{2^{n}\times 2^{n}}_{+}}{\mathrm{argmin}}\left(E(\mathbf{y},\langle\mathrm{T}^{\star}\{\mathbf{a}\},\bm{\nu}\rangle)+\lambda\left\lVert\mathbf{a}\right\rVert_{\theta}\right), (123)

where

𝐚θ=(1θ)2n(𝒉1,0𝐚1+𝒉0,1𝐚1)+θ𝒉1,1𝐚1,\left\lVert\mathbf{a}\right\rVert_{\theta}=(1-\theta)2^{-n}\left(\left\lVert\bm{h}_{1,0}\ast\mathbf{a}\right\rVert_{\ell_{1}}+\left\lVert\bm{h}_{0,1}\ast\mathbf{a}\right\rVert_{\ell_{1}}\right)\\ +\theta\left\lVert\bm{h}_{1,1}\ast\mathbf{a}\right\rVert_{\ell_{1}}, (124)

with

𝒉1,1=[1111],𝒉1,0=[11],𝒉0,1=[11].\bm{h}_{1,1}=\begin{bmatrix}1&-1\\ -1&1\end{bmatrix},\quad\bm{h}_{1,0}=\begin{bmatrix}1\\ -1\end{bmatrix},\quad\bm{h}_{0,1}=\begin{bmatrix}1&-1\\ \end{bmatrix}. (125)
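The discrete norm in (124) is a sum of 1\ell_{1} norms of filtered versions of 𝐚\mathbf{a}. A minimal sketch (our code; the filter-to-operator pairing follows (127)-(129)): SciPy's 'full' convolution mode realizes the zero extension of 𝐚\mathbf{a} outside the grid.

```python
import numpy as np
from scipy.signal import convolve2d

h11 = np.array([[1., -1.], [-1., 1.]])   # mixed-difference filter
h10 = np.array([[1.], [-1.]])            # difference along the first axis
h01 = np.array([[1., -1.]])              # difference along the second axis

def mtv_norm(a, theta, n):
    """Discrete norm ||a||_theta of (124); the default 'full' mode of
    convolve2d extends a by zero outside the grid."""
    tv = (np.abs(convolve2d(a, h10)).sum()
          + np.abs(convolve2d(a, h01)).sum())
    mixed = np.abs(convolve2d(a, h11)).sum()
    return (1 - theta) * 2.0**(-n) * tv + theta * mixed

# single pixel of side 2^-n: TV part = perimeter 4 * 2^-n, mixed part = 4 corners
a = np.zeros((8, 8)); a[3, 4] = 1.0
assert np.isclose(mtv_norm(a, 0.5, 3), 0.5 * 4 * 2.0**-3 + 0.5 * 4)
```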

A key feature of this formulation, as shown in Proposition 7, is that the problem in (123) is an exact discretization of the problem in (119), i.e., it incurs no discretization error.

Proposition 7.

The optimization problem in (123) is equivalent to the one in (119), in the sense that,

f𝒱n\displaystyle f\in\mathcal{V}_{n} T{f}𝒱~n.\displaystyle\Leftrightarrow\mathrm{T}\{f\}\in\tilde{\mathcal{V}}_{n}. (126)
Proof of Proposition 7.

Let 𝐚=T{f}\mathbf{a}=\mathrm{T}\{f\}. Since T\mathrm{T}^{\star} is the inverse of T\mathrm{T}, it suffices to prove that

[DD]{f}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}} =𝒉1,1𝐚1,\displaystyle=\left\lVert\bm{h}_{1,1}\ast\mathbf{a}\right\rVert_{\ell_{1}}, (127)
[ID]{f}\displaystyle\left\lVert[\mathrm{I}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}} =2n𝒉0,1𝐚1,\displaystyle=2^{-n}\left\lVert\bm{h}_{0,1}\ast\mathbf{a}\right\rVert_{\ell_{1}}, (128)
[DI]{f}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{I}]\{f\}\right\rVert_{\mathcal{M}} =2n𝒉1,0𝐚1.\displaystyle=2^{-n}\left\lVert\bm{h}_{1,0}\ast\mathbf{a}\right\rVert_{\ell_{1}}. (129)

We prove only (127), as the reasoning for the other two is analogous. We observe that

𝟙En1,n2n=β(2n(n11))β(2n(n21)),\displaystyle\mathbbm{1}_{E_{n_{1},n_{2}}^{n}}=\beta\left(2^{n}\cdot-(n_{1}-1)\right)\otimes\beta\left(2^{n}\cdot-(n_{2}-1)\right), (130)

and calculate that

[DD]{f}\displaystyle[\mathrm{D}\otimes\mathrm{D}]\{f\}
=\displaystyle= n1,n2=1,12n,2n𝐚[n1,n2](δn12nδ(n11)2n)(δn22nδ(n21)2n)\displaystyle\sum_{n_{1},n_{2}=1,1}^{2^{n},2^{n}}\mathbf{a}[n_{1},n_{2}]\left(\delta_{\frac{n_{1}}{2^{n}}}-\delta_{\frac{(n_{1}-1)}{2^{n}}}\right)\otimes\left(\delta_{\frac{n_{2}}{2^{n}}}-\delta_{\frac{(n_{2}-1)}{2^{n}}}\right)
=\displaystyle= (n1,n2)2𝐚[n1,n2](δn12nδ(n11)2n)(δn22nδ(n21)2n)\displaystyle\sum_{(n_{1},n_{2})\in\mathbb{Z}^{2}}\mathbf{a}[n_{1},n_{2}]\left(\delta_{\frac{n_{1}}{2^{n}}}-\delta_{\frac{(n_{1}-1)}{2^{n}}}\right)\otimes\left(\delta_{\frac{n_{2}}{2^{n}}}-\delta_{\frac{(n_{2}-1)}{2^{n}}}\right)
=\displaystyle= (n1,n2)2δn12n,n22n(𝐚[n1,n2]𝐚[n1+1,n2]\displaystyle\sum_{(n_{1},n_{2})\in\mathbb{Z}^{2}}\delta_{\frac{n_{1}}{2^{n}},\frac{n_{2}}{2^{n}}}\bigg(\mathbf{a}[n_{1},n_{2}]-\mathbf{a}[n_{1}+1,n_{2}]
𝐚[n1,n2+1]+𝐚[n1+1,n2+1]).\displaystyle-\mathbf{a}[n_{1},n_{2}+1]+\mathbf{a}[n_{1}+1,n_{2}+1]\bigg). (131)

Thus

[DD]{f}=(n1,n2)2|(𝒉1,1𝐚)[n1,n2]|.\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}=\sum_{(n_{1},n_{2})\in\mathbb{Z}^{2}}\big|(\bm{h}_{1,1}\ast\mathbf{a})[n_{1},n_{2}]\big|. (132)

∎
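The identity (127) can also be probed numerically (our own sketch): representing f=T{𝐚}f=\mathrm{T}^{\star}\{\mathbf{a}\} on any refinement of the grid leaves the 1\ell_{1} norm of the mixed differences unchanged, since the Diracs of [DD]{f}[\mathrm{D}\otimes\mathrm{D}]\{f\} sit at the corners of the coarse pixels.

```python
import numpy as np
from scipy.signal import convolve2d

h11 = np.array([[1., -1.], [-1., 1.]])
rng = np.random.default_rng(1)
a = rng.random((8, 8))                       # coarse coefficients

lhs = np.abs(convolve2d(a, h11)).sum()       # ||h11 * a||_1

# refine the grid by a factor r: the mixed differences of the upsampled
# array are nonzero only at the coarse-pixel corners, with the same values
r = 4
f = np.kron(a, np.ones((r, r)))
rhs = np.abs(convolve2d(f, h11)).sum()

assert np.isclose(lhs, rhs)
```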

Proposition 7 enables one to obtain a solution fn𝒱nf_{n}^{\star}\in\mathcal{V}_{n} by solving the problem in (123). Moreover, since the operator

T{},𝝂:+2n×2nM\langle\mathrm{T}^{\star}\{\cdot\},\bm{\nu}\rangle:\mathbb{R}^{2^{n}\times 2^{n}}_{+}\to\mathbb{R}^{M} (133)

is linear, it can be represented by a matrix once 𝐚\mathbf{a} is vectorized. Consequently, (123) can be solved using a primal–dual algorithm [16] or a double-staged FISTA [62].
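In practice, matrix-free solvers only require products with the operator in (133) and with its adjoint. A sketch of ours with SciPy's LinearOperator interface, assuming (for illustration) pixel-average measurements as in the denoising setup of Section IV and omitting the area normalization:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

n, N = 4, 2                       # fine grid 2^n, measurement grid 2^N
r = 2**(n - N)                    # fine pixels per measurement pixel

def matvec(a_vec):
    # forward map: mean of the coefficients over each measurement pixel
    a = a_vec.reshape(2**n, 2**n)
    return a.reshape(2**N, r, 2**N, r).mean(axis=(1, 3)).ravel()

def rmatvec(y_vec):
    # adjoint: spread each measurement uniformly over its fine pixels
    y = y_vec.reshape(2**N, 2**N)
    return (np.kron(y, np.ones((r, r))) / r**2).ravel()

A = LinearOperator((4**N, 4**n), matvec=matvec, rmatvec=rmatvec)

# adjoint consistency check: <A x, y> = <x, A* y>
rng = np.random.default_rng(2)
x, y = rng.random(4**n), rng.random(4**N)
assert np.isclose(A.matvec(x) @ y, x @ A.rmatvec(y))
```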

III-B Convergence

The solution fn𝒱nf_{n}^{\star}\in\mathcal{V}_{n} obtained by solving the discretized optimization problem (119) is only an approximation of some f𝒱f^{\star}\in\mathcal{V}. Therefore, for this exact grid-based discretization to be reliable and practical, one must be confident that, in some sense,

limnfn=f.\underset{n\to\infty}{\lim}f_{n}^{\star}=f^{\star}. (134)

This convergence is established in Theorem 4.

Theorem 4.

For any sequence (fn)n=1(f_{n}^{\star})_{n=1}^{\infty} of discretized solutions with fn𝒱n,f_{n}^{\star}\in\mathcal{V}_{n}, and for any solution f𝒱f^{\star}\in\mathcal{V}, the following convergences hold:

  • 1. over the loss functional, limn𝒥(fn)=𝒥(f)\underset{n\to\infty}{\mathrm{lim}}\mathcal{J}(f_{n}^{\star})=\mathcal{J}(f^{\star});

  • 2. over the simulated measurements, limnfn,𝝂=f,𝝂\underset{n\to\infty}{\mathrm{lim}}\langle f_{n}^{\star},\bm{\nu}\rangle=\langle f^{\star},\bm{\nu}\rangle.

Proof of Theorem 4.

Item 1. We fix f𝒱f^{\star}\in\mathcal{V} and use Lemma 1 to find a sequence (f~n)n=1(\tilde{f}_{n})_{n=1}^{\infty} with f~nDD,0+(𝒳n)\tilde{f}_{n}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n}) that is weak\mathrm{weak}^{\star}-convergent to ff^{\star} and satisfies fθ=limnf~nθ\left\lVert f^{\star}\right\rVert_{\theta}=\underset{n\to\infty}{\lim}\left\lVert\tilde{f}_{n}\right\rVert_{\theta}. Then, the weak\mathrm{weak}^{\star}-convergence implies that

limnE(𝐲,f~n,𝝂)=E(𝐲,f,𝝂)limn𝒥(f~n)=𝒥(f).\underset{n\to\infty}{\lim}E(\mathbf{y},\langle\tilde{f}_{n},\bm{\nu}\rangle)=E(\mathbf{y},\langle f^{\star},\bm{\nu}\rangle)\\ \Rightarrow\underset{n\to\infty}{\lim}\mathcal{J}(\tilde{f}_{n})=\mathcal{J}(f^{\star}). (135)

It follows from (135) that

𝒥(f)𝒥(fn)𝒥(f~n)limn𝒥(fn)=𝒥(f).\mathcal{J}(f^{\star})\leq\mathcal{J}(f_{n}^{\star})\leq\mathcal{J}(\tilde{f}_{n})\Rightarrow\underset{n\to\infty}{\lim}\mathcal{J}(f_{n}^{\star})=\mathcal{J}(f^{\star}). (136)

Item 2. The convergence in Item 1 implies that the sequence (𝒥(fn))n=1(\mathcal{J}(f_{n}^{\star}))_{n=1}^{\infty} is bounded. In turn, this boundedness, combined with the assumptions on EE, implies that the sequence (fn)n=1(f_{n}^{\star})_{n=1}^{\infty} is bounded in θ\left\lVert\cdot\right\rVert_{\theta}, and therefore in [DD]{}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}}. By the Banach-Alaoglu theorem, we extract a weak\mathrm{weak}^{\star}-convergent subsequence (fnm)m=1(f_{n_{m}}^{\star})_{m=1}^{\infty} and find that

limmfnm,𝝂=f,𝝂\underset{m\to\infty}{\lim}\langle f_{n_{m}}^{\star},\bm{\nu}\rangle=\langle f^{\star\star},\bm{\nu}\rangle (137)

for some f𝒱f^{\star\star}\in\mathcal{V}. If one assumes by contradiction that the sequence (fn,𝝂)n=1(\langle f_{n}^{\star},\bm{\nu}\rangle)_{n=1}^{\infty} does not converge to f,𝝂\langle f^{\star\star},\bm{\nu}\rangle, then a classical argument [30, Theorem 3], which leverages the strict convexity of E(𝐲,)E(\mathbf{y},\cdot), leads to a contradiction. ∎

IV Application to Denoising

IV-A General Description

Our goal is to denoise an image 𝐲\mathbf{y}, which is assumed to be square and composed of pixels

En1,n2N=[n112N,n12N]×[n212N,n22N]E_{n_{1},n_{2}}^{N}=\left[\frac{n_{1}-1}{2^{N}},\frac{n_{1}}{2^{N}}\right]\times\left[\frac{n_{2}-1}{2^{N}},\frac{n_{2}}{2^{N}}\right] (138)

at resolution NN. The continuous-domain optimization problem is formulated as

𝒱=argminfDD,0+(K)(𝐲𝝂{f}22+λfθ),\mathcal{V}=\underset{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K})}{\mathrm{argmin}}\left(\left\lVert\mathbf{y}-\bm{\nu}\{f\}\right\rVert_{2}^{2}+\lambda\left\lVert f\right\rVert_{\theta}\right), (139)

where 𝐲2N×2N\mathbf{y}\in\mathbb{R}^{2^{N}\times 2^{N}}, and 𝐧[12N]2\forall\mathbf{n}\in[1\cdots 2^{N}]^{2},

[𝝂{f}]𝐧=f,𝟙E𝐧N=E𝐧Nf(𝐱)d𝐱.[\bm{\nu}\{f\}]_{\mathbf{n}}=\langle f,\mathbbm{1}_{E_{\mathbf{n}}^{N}}\rangle=\int_{E_{\mathbf{n}}^{N}}f(\mathbf{x})\mathrm{d}\mathbf{x}. (140)

We know from Corollary 1 that 𝒱\mathcal{V} is nonempty, compact, and convex, and that its extreme points are of the form k=1Kak𝟙Ek\sum_{k=1}^{K}a_{k}\mathbbm{1}_{E_{k}} with K22NK\leq 2^{2N}. The discretized optimization problem is defined as

𝒱n=argminfDD,0+(𝒳n)(𝐲𝝂{f}22+λfθ),\mathcal{V}_{n}=\underset{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n})}{\mathrm{argmin}}\left(\left\lVert\mathbf{y}-\bm{\nu}\{f\}\right\rVert_{2}^{2}+\lambda\left\lVert f\right\rVert_{\theta}\right), (141)

and a solution fn𝒱nf^{\star}_{n}\in\mathcal{V}_{n} can be obtained by solving (123). Finally, from Theorem 4, f𝒱\forall f^{\star}\in\mathcal{V} and 𝐧[12N]2\forall\mathbf{n}\in[1\cdots 2^{N}]^{2}, we obtain

limnE𝐧Nfn(𝐱)d𝐱=E𝐧Nf(𝐱)d𝐱.\underset{n\to\infty}{\lim}\int_{E_{\mathbf{n}}^{N}}f_{n}^{\star}(\mathbf{x})\,\mathrm{d}\mathbf{x}=\int_{E_{\mathbf{n}}^{N}}f^{\star}(\mathbf{x})\,\mathrm{d}\mathbf{x}. (142)

IV-B Resolution on Finite Grid

We argue that, to solve the continuous-domain problem in (139), it is sufficient to find the solution fN𝒱Nf^{\star}_{N}\in\mathcal{V}_{N}. As a first step, we show in Theorem 5 that the solution sets (𝒱n)n=N(\mathcal{V}_{n})_{n=N}^{\infty} are nested.

Theorem 5.

For all N2N1NN_{2}\geq N_{1}\geq N, the following inclusions hold:

𝒱N1𝒱N2𝒱.\mathcal{V}_{N_{1}}\subset\mathcal{V}_{N_{2}}\subset\mathcal{V}. (143)
Proof of Theorem 5.

We define the downsampling operator

R:DD,0+(𝒳n)\displaystyle\mathrm{R}:\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n}) DD,0+(𝒳n1)\displaystyle\to\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{n-1})
𝐧=1,12n,2n𝐚[𝐧]𝟙E𝐧n\displaystyle\sum_{\mathbf{n}=1,1}^{2^{n},2^{n}}\mathbf{a}[\mathbf{n}]\mathbbm{1}_{E_{\mathbf{n}}^{n}} 𝐧=1,12n1,2n1𝐛[𝐧]𝟙E𝐧(n1)\displaystyle\mapsto\sum_{\mathbf{n}=1,1}^{2^{n-1},2^{n-1}}\mathbf{b}[\mathbf{n}]\mathbbm{1}_{E_{\mathbf{n}}^{(n-1)}} (144)

with 𝐛[𝐧]=𝐛[n1,n2]\mathbf{b}[\mathbf{n}]=\mathbf{b}[n_{1},n_{2}] equal to

14(𝐚[2n11,2n21]+𝐚[2n11,2n2]+𝐚[2n1,2n21]+𝐚[2n1,2n2]).\frac{1}{4}\big(\mathbf{a}[2n_{1}-1,2n_{2}-1]+\mathbf{a}[2n_{1}-1,2n_{2}]\\ +\mathbf{a}[2n_{1},2n_{2}-1]+\mathbf{a}[2n_{1},2n_{2}]\big). (145)
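In coefficient form, R\mathrm{R} is plain 2×22\times 2 block averaging. A sketch of ours, together with a check of the measurement-preservation property (147): pixel integrals at resolution NN are unchanged by R\mathrm{R}.

```python
import numpy as np

def downsample(a):
    """Coefficient map of R in (144)-(145): average over 2x2 blocks."""
    m = a.shape[0] // 2
    return a.reshape(m, 2, a.shape[1] // 2, 2).mean(axis=(1, 3))

# pixel integrals at resolution N are preserved, cf. (147)
n, N = 3, 1
rng = np.random.default_rng(3)
a = rng.random((2**n, 2**n))
b = downsample(a)                  # coefficients at resolution n - 1

s = 2**(n - N)                     # fine pixels per measurement pixel side
int_fine = a.reshape(2**N, s, 2**N, s).sum(axis=(1, 3)) * 4.0**-n
int_coarse = b.reshape(2**N, s // 2, 2**N, s // 2).sum(axis=(1, 3)) * 4.0**-(n - 1)
assert np.allclose(int_fine, int_coarse)
```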

Step 1. We claim that, fDD,0+(𝒳N1)\forall f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{N_{1}}) with N1>NN_{1}>N,

𝒥(R{f})𝒥(f).\mathcal{J}(\mathrm{R}\{f\})\leq\mathcal{J}(f). (146)

A straightforward calculation yields that f,𝟙E𝐧N\langle f,\mathbbm{1}_{E_{\mathbf{n}}^{N}}\rangle is equal to R{f},𝟙E𝐧N\langle\mathrm{R}\{f\},\mathbbm{1}_{E_{\mathbf{n}}^{N}}\rangle and, consequently, that

𝐲𝝂{f}22=𝐲𝝂{R{f}}22.\left\lVert\mathbf{y}-\bm{\nu}\{f\}\right\rVert_{2}^{2}=\left\lVert\mathbf{y}-\bm{\nu}\{\mathrm{R}\{f\}\}\right\rVert_{2}^{2}. (147)

Next, we get from (127) that

[DD]{f}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}} =𝐧2|(𝒉1,1𝐚)[𝐧]|\displaystyle=\sum_{\mathbf{n}\in\mathbb{Z}^{2}}|(\bm{h}_{1,1}\ast\mathbf{a})[\mathbf{n}]|
[DD]{R{f}}\displaystyle\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathrm{R}\{f\}\}\right\rVert_{\mathcal{M}} =𝐧2|(𝒉1,1𝐛)[𝐧]|.\displaystyle=\sum_{\mathbf{n}\in\mathbb{Z}^{2}}|(\bm{h}_{1,1}\ast\mathbf{b})[\mathbf{n}]|. (148)

The substitution of (145) into (148) and the application of the triangle inequality (nine times) yield that

|𝒉1,1𝐛|[n1,n2]\displaystyle|\bm{h}_{1,1}\ast\mathbf{b}|[n_{1},n_{2}]\leq 14|(𝒉1,1𝐚)[2n11,2n21]|\displaystyle\frac{1}{4}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1}-1,2n_{2}-1]|
+\displaystyle+ 12|(𝒉1,1𝐚)[2n11,2n2]|\displaystyle\frac{1}{2}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1}-1,2n_{2}]|
+\displaystyle+ 14|(𝒉1,1𝐚)[2n11,2n2+1]|\displaystyle\frac{1}{4}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1}-1,2n_{2}+1]|
+\displaystyle+ 12|(𝒉1,1𝐚)[2n1,2n21]|\displaystyle\frac{1}{2}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1},2n_{2}-1]|
+\displaystyle+ |(𝒉1,1𝐚)[2n1,2n2]|\displaystyle|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1},2n_{2}]|
+\displaystyle+ 12|(𝒉1,1𝐚)[2n1,2n2+1]|\displaystyle\frac{1}{2}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1},2n_{2}+1]|
+\displaystyle+ 14|(𝒉1,1𝐚)[2n1+1,2n21]|\displaystyle\frac{1}{4}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1}+1,2n_{2}-1]|
+\displaystyle+ 12|(𝒉1,1𝐚)[2n1+1,2n2]|\displaystyle\frac{1}{2}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1}+1,2n_{2}]|
+\displaystyle+ 14|(𝒉1,1𝐚)[2n1+1,2n2+1]|.\displaystyle\frac{1}{4}|(\bm{h}_{1,1}\ast\mathbf{a})[2n_{1}+1,2n_{2}+1]|. (149)

Finally, summing (149) over (n1,n2)2(n_{1},n_{2})\in\mathbb{Z}^{2} yields

[DD]{R{f}}[DD]{f},\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\mathrm{R}\{f\}\}\right\rVert_{\mathcal{M}}\leq\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}, (150)

and a similar calculation shows that

{R{f}}2{f}2.\left\lVert\nabla\{\mathrm{R}\{f\}\}\right\rVert_{\mathcal{M}^{2}}\leq\left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}. (151)

The claim (146) follows from the combination of (147), (150), and (151).
Step 2. Let fN1𝒱N1f_{N_{1}}\in\mathcal{V}_{N_{1}} and fN2𝒱N2f_{N_{2}}\in\mathcal{V}_{N_{2}}. One can iterate Step 1 to find that

𝒥(fN1)𝒥(RN2N1{fN2})𝒥(fN2),\mathcal{J}(f_{N_{1}})\leq\mathcal{J}(\mathrm{R}^{N_{2}-N_{1}}\{f_{N_{2}}\})\leq\mathcal{J}(f_{N_{2}}), (152)

where the LHS inequality follows from the inclusion RN2N1{fN2}DD,0+(𝒳N1)\mathrm{R}^{N_{2}-N_{1}}\{f_{N_{2}}\}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{N_{1}}) and the RHS inequality follows from (146). In addition, it follows from the inclusion

DD,0+(𝒳N1)DD,0+(𝒳N2)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{N_{1}})\subset\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{N_{2}}) (153)

that 𝒥(fN2)𝒥(fN1)\mathcal{J}(f_{N_{2}})\leq\mathcal{J}(f_{N_{1}}), which, combined with (152), yields that 𝒥(fN2)=𝒥(fN1)\mathcal{J}(f_{N_{2}})=\mathcal{J}(f_{N_{1}}). This proves the inclusion 𝒱N1𝒱N2.\mathcal{V}_{N_{1}}\subset\mathcal{V}_{N_{2}}. Finally, if (fn)n=1(f_{n}^{\star})_{n=1}^{\infty} is a sequence of solutions with fn𝒱nf_{n}^{\star}\in\mathcal{V}_{n}, then, by Theorem 4, for any f𝒱f^{\star}\in\mathcal{V},

limn𝒥(fn)=𝒥(f).\underset{n\to\infty}{\text{lim}}\mathcal{J}(f_{n}^{\star})=\mathcal{J}(f^{\star}). (154)

One can iterate (146) to find that, nN2\forall n\geq N_{2},

𝒥(f)𝒥(fN2)𝒥(RnN2{fn})𝒥(fn).\mathcal{J}(f^{\star})\leq\mathcal{J}(f_{N_{2}})\leq\mathcal{J}(\mathrm{R}^{n-N_{2}}\{f_{n}\})\leq\mathcal{J}(f_{n}). (155)

The combination of (155) and (154) yields the inclusion 𝒱N2𝒱.\mathcal{V}_{N_{2}}\subset\mathcal{V}. ∎
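Both regularization terms indeed decrease under R\mathrm{R}; the inequalities (150) and (151) can be probed numerically on random coefficient arrays (our own sketch; the factor 2 in the gradient check accounts for the pixel-size scaling 2(n1)2^{-(n-1)} versus 2n2^{-n} in (128)-(129)).

```python
import numpy as np
from scipy.signal import convolve2d

h11 = np.array([[1., -1.], [-1., 1.]])
h10 = np.array([[1.], [-1.]])
h01 = np.array([[1., -1.]])

def downsample(a):
    # coefficient map of R in (144)-(145): average over 2x2 blocks
    m = a.shape[0] // 2
    return a.reshape(m, 2, m, 2).mean(axis=(1, 3))

def l1(a, h):
    # l1 norm of the filtered array, with zero extension ('full' mode)
    return np.abs(convolve2d(a, h)).sum()

rng = np.random.default_rng(4)
for _ in range(100):
    a = rng.random((8, 8))
    b = downsample(a)
    # (150): the mixed-derivative term decreases under downsampling
    assert l1(b, h11) <= l1(a, h11) + 1e-12
    # (151): the gradient term decreases, including the pixel-size scaling
    assert 2 * (l1(b, h10) + l1(b, h01)) <= l1(a, h10) + l1(a, h01) + 1e-12
```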

Theorem 5 reveals that a solution of the continuous-domain problem can be found by solving the optimization problem over DD,0+(𝒳N)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{N}). As stated in Proposition 8, this optimization problem admits a unique solution.

Proposition 8.

There is a unique solution f𝒱f^{\star}\in\mathcal{V} whose knots lie on the grid 𝒳N\mathcal{X}_{N}, i.e.,

𝒱N={f}.\mathcal{V}_{N}=\{f^{\star}\}. (156)
Proof of Proposition 8.

We know from Theorem 4 that there is a unique value 𝐲2N×2N\mathbf{y}^{\star}\in\mathbb{R}^{2^{N}\times 2^{N}} such that, f𝒱N𝒱\forall f^{\star}\in\mathcal{V}_{N}\subset\mathcal{V},

𝐲=𝝂{f}.\mathbf{y}^{\star}=\bm{\nu}\{f^{\star}\}. (157)

We conclude with the observation that, on DD,0+(𝒳N)\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathcal{X}_{N}), the measurement operator 𝝂\bm{\nu} is a bijection. ∎

The combination of Theorem 5 and Proposition 8 yields that (123), specialized to the denoising task, has a unique solution, which is also a solution (through the synthesis operator) of the continuous-domain optimization problem.

IV-C Numerical Experiments

We fix the data-fidelity term to be the squared Euclidean distance and study the optimization problem in

𝒱λ,θ=argminfDD,0+(K)(12𝐲𝝂{f}22+λfθ),\mathcal{V}_{\lambda,\theta}=\underset{f\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}^{+}(\mathrm{K})}{\text{argmin}}\left(\frac{1}{2}\left\lVert\mathbf{y}-\bm{\nu}\{f\}\right\rVert_{2}^{2}+\lambda\left\lVert f\right\rVert_{\theta}\right), (158)

where 𝝂\bm{\nu} is defined in (140) and 𝐲2N×2N\mathbf{y}\in\mathbb{R}^{2^{N}\times 2^{N}} is the image to denoise. We discretize this optimization problem, as in (141), on the canonical pixel basis 𝒳N\mathcal{X}_{N} and denote by fλ,θ𝒱λ,θf_{\lambda,\theta}^{\star}\in\mathcal{V}_{\lambda,\theta} its unique solution (Theorem 5 and Proposition 8). Then, we consider the synthesis formulation (123) of the discretized optimization problem in

{𝐚λ,θ}=argmin𝐚+2N×2N(12𝐲𝐚22+λ𝒉θ𝐚1),\{\mathbf{a}_{\lambda,\theta}^{\star}\}=\underset{\mathbf{a}\in\mathbb{R}^{2^{N}\times 2^{N}}_{+}}{\text{argmin}}\left(\frac{1}{2}\left\lVert\mathbf{y}-\mathbf{a}\right\rVert_{2}^{2}+\lambda\left\lVert\bm{h}_{\theta}\ast\mathbf{a}\right\rVert_{1}\right), (159)

whose regularization term has been reparametrized for convenience, with

𝒉θ=[θ𝒉1,1(1θ2)𝒉1,0(1θ2)𝒉0,1].\bm{h}_{\theta}=\left[\begin{array}[]{rr}\theta\bm{h}_{1,1}\\ \left(1-\frac{\theta}{2}\right)\bm{h}_{1,0}\\ \left(1-\frac{\theta}{2}\right)\bm{h}_{0,1}\\ \end{array}\right]. (160)

The convolution in (159) is applied separately with each of the three filters stacked in 𝒉θ.\bm{h}_{\theta}. Finally, since there is no closed form for the proximal operator of λ𝒉θ𝐚1\lambda\left\lVert\bm{h}_{\theta}\ast\mathbf{a}\right\rVert_{1}, we solve the dual problem, which we optimize with the FISTA algorithm [12]. In practice, our method extends directly to images that are not square. The denoising efficiency of our method, labeled MTV, is assessed on the BSD68 test set alongside two benchmark methods. The results are summarized in Table I.
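As an illustrative alternative to the dual FISTA scheme, the problem (159) can also be attacked directly with a primal–dual algorithm [15, 16]. The sketch below is our own (not the implementation used for the experiments); it uses the filters of (160), the zero-extension ('full') convolution convention, and conservative step sizes.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

def mtv_denoise(y, lam, theta, n_iter=300):
    """Primal-dual sketch of (159):
    min_{a >= 0} 0.5 * ||y - a||^2 + lam * ||h_theta * a||_1."""
    filters = [theta * np.array([[1., -1.], [-1., 1.]]),          # h11
               (1 - theta / 2) * np.array([[1.], [-1.]]),         # h10
               (1 - theta / 2) * np.array([[1., -1.]])]           # h01
    K = lambda a: [convolve2d(a, h) for h in filters]             # zero extension
    Kt = lambda z: sum(correlate2d(zi, h, mode="valid")           # adjoint of K
                       for zi, h in zip(z, filters))
    tau = sigma = 0.2        # tau * sigma * ||K||^2 <= 1 since ||K||^2 <= 24
    a = np.maximum(y, 0.0)
    a_bar = a.copy()
    z = [np.zeros_like(zi) for zi in K(a)]
    for _ in range(n_iter):
        # dual ascent + projection onto the l_inf ball (prox of the l1 conjugate)
        z = [np.clip(zi + sigma * ki, -lam, lam) for zi, ki in zip(z, K(a_bar))]
        # primal descent + prox of the data term, with the constraint a >= 0
        a_new = np.maximum((a - tau * Kt(z) + tau * y) / (1 + tau), 0.0)
        a_bar = 2 * a_new - a
        a = a_new
    return a

# with lam = 0, the solution is the projection of y onto the constraint a >= 0
y = np.random.default_rng(5).random((8, 8)) - 0.2
assert np.allclose(mtv_denoise(y, 0.0, 0.3), np.maximum(y, 0.0), atol=1e-6)
```

In practice one would add a stopping criterion on the primal–dual gap; the step sizes above are merely a safe choice for these small filters.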

σ=5/255\sigma=5/255 σ=15/255\sigma=15/255 σ=25/255\sigma=25/255
TV 36.41 29.91 27.48
MTV 36.83 30.20 27.72
CRR-NN [63] 36.96 30.55 28.11
TABLE I: Denoising performance on the BSD68 test set. The metric is the PSNR, averaged over the 68 images. The parameters λ\lambda and θ\theta for TV and MTV have been optimized to maximize the PSNR. The method CRR-NN, in which the authors trained 24 filters of size 15×1515\times 15, parametrized by neural networks, on a Gaussian-denoising task, is included for comparison.

To investigate the choice of the hyperparameter θ\theta, we denote by θσ\theta^{\star}_{\sigma} the value of θ\theta that maximizes the PSNR for a specific image and noise level σ\sigma, and provide in Table II statistics on the distribution of θσ\theta^{\star}_{\sigma}.

θ5\theta^{\star}_{5} θ15\theta^{\star}_{15} θ25\theta^{\star}_{25} |θ15θ25||\theta^{\star}_{15}-\theta^{\star}_{25}|
Expectation 0.37 0.33 0.33 0.04
Standard deviation 0.07 0.08 0.12 0.04
TABLE II: Statistics of θσ\theta^{\star}_{\sigma} taken over 68 samples, corresponding to the optimal values of θ\theta for the 68 images of the BSD68 test set, for each noise level.

We observe from the expectations in Table II that our method does not reduce to the TV one (θ=0\theta=0). The optimal convex combination of \nabla-based and [DD][\mathrm{D}\otimes\mathrm{D}]-based regularizations is non-trivial and appears necessary for efficient denoising of natural images. Furthermore, the column |θ15θ25||\theta^{\star}_{15}-\theta^{\star}_{25}| shows that the choice of the optimal θ\theta is stable with respect to σ\sigma, which, in turn, implies that the optimal θ\theta depends mostly on the geometry of the image rather than on the noise level.

We observe that the images in BSD68 for which MTV significantly outperforms TV (by more than 0.50.5 dB) typically feature corner-like structures, such as the image shown in Figure 1.

Figure 1: Image 21 in BSD68 dataset. For σ=25\sigma=25, the PSNR for MTV and TV are, respectively, 25.9625.96 and 25.2925.29.

To better illustrate this behavior, we consider a royalty-free 722×1920722\times 1920 image of New York City, shown in Figure 2. For the noise level σ=25/255\sigma=25/255, the TV and MTV methods yield PSNR values of 24.70 and 25.52, respectively. We observe that bands of constant intensity and corners are better reconstructed by MTV.

Figure 2: From top to bottom, each row shows the ground truth, the noisy image, the TV-denoised image, and the MTV-denoised image. Columns 2 and 4 are zooms of the ground truth. Columns 3 and 5 are zooms of Columns 2 and 4.

V Conclusion

In this paper, we have shown that the convex combination of the TV regularization with the tensor-product term [DD]{}\left\lVert[\mathrm{D}\otimes\mathrm{D}]\{\cdot\}\right\rVert_{\mathcal{M}} promotes piecewise-constant functions with edges parallel to the x1x_{1}- or the x2x_{2}-axis as solutions to a continuous-domain optimization problem. In the particular case of denoising, we established that the optimization problem discretized on the canonical, pixel-based grid has a unique solution, which is also a solution of the continuous-domain problem. Finally, we demonstrated that the discretized problem can be solved with the same algorithm as TV, with one additional filter, and that our new regularization significantly improves the reconstruction quality when the image has the appropriate underlying geometry.

Acknowledgment

Vincent Guillemet was supported by the Swiss National Science Foundation (SNSF) under Grant 200020_219356.

References

  • [1] M. Bertero, P. Boccacci, and C. De Mol, Introduction to Inverse Problems in Imaging. CRC press, 2021.
  • [2] M. T. McCann, M. Unser et al., “Biomedical image reconstruction: the foundations to deep neural networks,” Foundations and Trends® in Signal Processing, vol. 13, no. 3, pp. 283–359, 2019.
  • [3] H. Gupta, J. Fageot, and M. Unser, “Continuous-domain solutions of linear inverse problems with Tikhonov versus generalized TV regularization,” IEEE Transactions on Signal Processing, vol. 66, no. 17, pp. 4670–4684, 2018.
  • [4] M. Unser and J. Fageot, “Native Banach spaces for splines and variational inverse problems,” arXiv preprint arXiv:1904.10818, 2019.
  • [5] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D: Nonlinear Phenomena, vol. 60, no. 1-4, pp. 259–268, 1992.
  • [6] L. Ambrosio, N. Fusco, and D. Pallara, Functions of Bounded Variation and Free Discontinuity Problems. Oxford University Press, 2000.
  • [7] L. Evans, Measure Theory and Fine Properties of Functions. Routledge, 2018.
  • [8] W. Rudin, Real and Complex Analysis, 3rd ed. New York: McGraw-Hill, 1987.
  • [9] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
  • [10] E. Candes and J. Romberg, “Sparsity and incoherence in compressive sampling,” Inverse Problems, vol. 23, no. 3, p. 969, 2007.
  • [11] A. M. Bruckstein, D. L. Donoho, and M. Elad, “From sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM Review, vol. 51, no. 1, pp. 34–81, 2009.
  • [12] A. Beck and M. Teboulle, “Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems,” IEEE Transactions on Image Processing, vol. 18, no. 11, pp. 2419–2434, 2009.
  • [13] M. Zhu and T. Chan, “An efficient primal-dual hybrid gradient algorithm for total variation image restoration,” Ucla Cam Report, vol. 34, no. 2, 2008.
  • [14] E. Esser, X. Zhang, and T. F. Chan, “A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science,” SIAM Journal on Imaging Sciences, vol. 3, no. 4, pp. 1015–1046, 2010.
  • [15] A. Chambolle and T. Pock, “A first-order primal-dual algorithm for convex problems with applications to imaging,” Journal of Mathematical Imaging and Vision, vol. 40, pp. 120–145, 2011.
  • [16] L. Condat, “A primal–dual splitting method for convex optimization involving lipschitzian, proximable and linear composite terms,” Journal of Optimization Theory and Applications, vol. 158, no. 2, pp. 460–479, 2013.
  • [17] R. Caccioppoli, “Misura e integrazione sugli insiemi dimensionalmente orientati,” Atti Accad. Naz. Lincei. Rend. Cl. Sci. Fis. Mat. Nat.(8), vol. 12, pp. 3–11, 1952.
  • [18] E. De Giorgi, “Su una teoria generale della misura (r- 1)-dimensionale in uno spazio ad r dimensioni,” Annali di Matematica Pura ed Applicata, vol. 36, pp. 191–213, 1954.
  • [19] A. Flinth and P. Weiss, “Exact solutions of infinite dimensional total-variation regularized problems,” Information and Inference: A Journal of the IMA, vol. 8, no. 3, pp. 407–443, 2019.
  • [20] M. Unser, J. Fageot, and J. P. Ward, “Splines are universal solutions of linear inverse problems with generalized TV regularization,” SIAM Review, vol. 59, no. 4, pp. 769–793, 2017.
  • [21] M. Unser, “A unifying representer theorem for inverse problems and machine learning,” Foundations of Computational Mathematics, vol. 21, no. 4, pp. 941–960, 2021.
  • [22] C. Boyer, A. Chambolle, Y. D. Castro, V. Duval, F. De Gournay, and P. Weiss, “On representer theorems and convex regularization,” SIAM Journal on Optimization, vol. 29, no. 2, pp. 1260–1281, 2019.
  • [23] K. Bredies and M. Carioni, “Sparsity of solutions for variational inverse problems with finite-dimensional data,” Calculus of Variations and Partial Differential Equations, vol. 59, no. 1, p. 14, 2020.
  • [24] W. H. Fleming, “Functions with generalized gradient and generalized surfaces,” Annali di Matematica Pura ed Applicata, vol. 44, no. 1, pp. 93–103, 1957.
  • [25] ——, “Functions whose partial derivatives are measures,” Illinois Journal of Mathematics, vol. 4, no. 3, pp. 452–478, 1960.
  • [26] L. Ambrosio, V. Caselles, S. Masnou, and J.-M. Morel, “Connected components of sets of finite perimeter and applications to image processing,” Journal of the European Mathematical Society, vol. 3, no. 1, pp. 39–92, 2001.
  • [27] M. Unser, “Splines: A perfect fit for signal and image processing,” IEEE Signal Processing Magazine, vol. 16, no. 6, pp. 22–38, 1999.
  • [28] T. Debarre, J. Fageot, H. Gupta, and M. Unser, “B-spline-based exact discretization of continuous-domain inverse problems with generalized TV regularization,” IEEE Transactions on Information Theory, vol. 65, no. 7, pp. 4457–4470, 2019.
  • [29] T. Debarre, Q. Denoyelle, and J. Fageot, “On the uniqueness of solutions for the basis pursuit in the continuum,” Inverse Problems, vol. 38, no. 12, p. 125005, 2022.
  • [30] V. Guillemet, J. Fageot, and M. Unser, “Convergence analysis of the discretization of continuous-domain inverse problems,” Inverse Problems, vol. 41, no. 4, p. 045008, 2025.
  • [31] J. Fageot, “Variational seasonal-trend decomposition with sparse continuous-domain regularization,” arXiv preprint arXiv:2505.10486, 2025.
  • [32] V. Guillemet and M. Unser, “Variational tensor product splines,” in preparation, 2025.
  • [33] K. Bredies, K. Kunisch, and T. Pock, “Total generalized variation,” SIAM Journal on Imaging Sciences, vol. 3, no. 3, pp. 492–526, 2010.
  • [34] K. Bredies and M. Holler, “Regularization of linear inverse problems with total generalized variation,” Journal of Inverse and Ill-Posed Problems, vol. 22, no. 6, pp. 871–913, 2014.
  • [35] F. Demengel, “Fonctions à hessien borné,” in Annales de l’institut Fourier, vol. 34, no. 2, 1984, pp. 155–190.
  • [36] S. Lefkimmiatis, A. Bourquard, and M. Unser, “Hessian-based norm regularization for image restoration with biomedical applications,” IEEE Transactions on Image Processing, vol. 21, no. 3, pp. 983–995, 2011.
  • [37] S. Lefkimmiatis, J. P. Ward, and M. Unser, “Hessian Schatten-norm regularization for linear inverse problems,” IEEE Transactions on Image Processing, vol. 22, no. 5, pp. 1873–1888, 2013.
  • [38] S. Aziznejad, J. Campos, and M. Unser, “Measuring complexity of learning schemes using Hessian-Schatten total variation,” SIAM Journal on Mathematics of Data Science, vol. 5, no. 2, pp. 422–445, 2023.
  • [39] J. Campos, S. Aziznejad, and M. Unser, “Learning of continuous and piecewise-linear functions with Hessian total-variation regularization,” IEEE Open Journal of Signal Processing, vol. 3, pp. 36–48, 2022.
  • [40] M. Pourya, A. Goujon, and M. Unser, “Delaunay-triangulation-based learning with Hessian total-variation regularization,” IEEE Open Journal of Signal Processing, vol. 4, pp. 167–178, 2023.
  • [41] M. Pourya, A. Boquet-Pujadas, and M. Unser, “A box-spline framework for inverse problems with continuous-domain sparsity constraints,” IEEE Transactions on Computational Imaging, 2024.
  • [42] L. Ambrosio, C. Brena, and S. Conti, “Functions with bounded Hessian–Schatten variation: Density, variational, and extremality properties,” Archive for Rational Mechanics and Analysis, vol. 247, no. 6, p. 111, 2023.
  • [43] L. Ambrosio, S. Aziznejad, C. Brena, and M. Unser, “Linear inverse problems with Hessian–Schatten total variation,” Calculus of Variations and Partial Differential Equations, vol. 63, no. 1, p. 9, 2024.
  • [44] M. Unser and S. Aziznejad, “Convex optimization in sums of Banach spaces,” Applied and Computational Harmonic Analysis, vol. 56, pp. 1–25, 2022.
  • [45] L. E. Dubins, “On extreme points of convex sets,” Journal of Mathematical Analysis and Applications, vol. 5, no. 2, pp. 237–244, 1962.
  • [46] R. Parhi and M. Unser, “The sparsity of cycle spinning for wavelet-based solutions of linear inverse problems,” IEEE Signal Processing Letters, vol. 30, pp. 568–572, 2023.
  • [47] M. Unser and S. Ducotterd, “Universal architectures for the learning of polyhedral norms and convex regularizers,” arXiv preprint arXiv:2503.19190, 2025.
  • [48] G. Bellettini, V. Caselles, and M. Novaga, “The total variation flow in \mathbb{R}^{N},” Journal of Differential Equations, vol. 184, no. 2, pp. 475–525, 2002.
  • [49] F. Andreu, C. Ballester, V. Caselles, and J. M. Mazón, “Minimizing total variation flow,” 2001.
  • [50] F. Andreu, V. Caselles, J. I. Díaz, and J. M. Mazón, “Some qualitative properties for the total variation flow,” Journal of Functional Analysis, vol. 188, no. 2, pp. 516–547, 2002.
  • [51] V. Caselles, A. Chambolle, and M. Novaga, “Regularity for solutions of the total variation denoising problem,” Revista Matemática Iberoamericana, vol. 27, no. 1, pp. 233–252, 2011.
  • [52] M.-J. Lai, B. Lucier, and J. Wang, “The convergence of a central-difference discretization of Rudin-Osher-Fatemi model for image denoising,” in Scale Space and Variational Methods in Computer Vision: Second International Conference, SSVM 2009, Voss, Norway, June 1-5, 2009. Proceedings 2. Springer, 2009, pp. 514–526.
  • [53] S. Bartels, “Total variation minimization with finite elements: convergence and iterative solution,” SIAM Journal on Numerical Analysis, vol. 50, no. 3, pp. 1162–1180, 2012.
  • [54] J. Wang and B. J. Lucier, “Error bounds for finite-difference methods for Rudin–Osher–Fatemi image smoothing,” SIAM Journal on Numerical Analysis, vol. 49, no. 2, pp. 845–868, 2011.
  • [55] A. Chambolle and T. Pock, “Approximating the total variation with finite differences or finite elements,” in Handbook of Numerical Analysis. Elsevier, 2021, vol. 22, pp. 383–417.
  • [56] D. L. Cohn, Measure Theory. Springer, 2013, vol. 2.
  • [57] O. Kallenberg et al., Random Measures, Theory and Applications. Springer, 2017, vol. 1.
  • [58] H. Federer, “Curvature measures,” Transactions of the American Mathematical Society, vol. 93, no. 3, pp. 418–491, 1959.
  • [59] W. H. Fleming and R. Rishel, “An integral formula for total gradient variation,” Archiv der Mathematik, vol. 11, pp. 218–222, 1960.
  • [60] L. Rotem, “The anisotropic total variation and surface area measures,” in Geometric Aspects of Functional Analysis: Israel Seminar (GAFA) 2020-2022. Springer, 2023, pp. 297–312.
  • [61] G. Dolzmann and S. Müller, “Microstructures with finite surface energy: the two-well problem,” Archive for Rational Mechanics and Analysis, vol. 132, pp. 101–141, 1995.
  • [62] S. Bonettini, S. Rebegoldi, and V. Ruggiero, “Inertial variable metric techniques for the inexact forward–backward algorithm,” SIAM Journal on Scientific Computing, vol. 40, no. 5, pp. A3180–A3210, 2018.
  • [63] A. Goujon, S. Neumayer, P. Bohra, S. Ducotterd, and M. Unser, “A neural-network-based convex regularizer for inverse problems,” IEEE Transactions on Computational Imaging, vol. 9, pp. 781–795, 2023.

-A Proofs

Proof of Lemma 1.

Construction. Consider the partition (Akn)k=1n(A_{k}^{n})_{k=1}^{n} of [0,1][0,1] with

Akn={]k1n,kn],k>1,[k1n,kn],k=1,A_{k}^{n}=\begin{cases}\left]\frac{k-1}{n},\frac{k}{n}\right],&k>1,\\ \left[\frac{k-1}{n},\frac{k}{n}\right],&k=1,\end{cases} (161)

and the partition (Ak,kn)k,k=1n,n(A_{k,k^{\prime}}^{n})_{k,k^{\prime}=1}^{n,n} of K\mathrm{K} with Ak,kn=Akn×Akn.A_{k,k^{\prime}}^{n}=A_{k}^{n}\times A_{k^{\prime}}^{n}. We define m=[DD]{f}m=[\mathrm{D}\otimes\mathrm{D}]\{f\} and

fn=k=1nk=1nm(Ak,kn)u(kn)u(kn).f_{n}=\sum_{k=1}^{n}\sum_{k^{\prime}=1}^{n}m(A_{k,k^{\prime}}^{n})\,u\left(\cdot-\frac{k}{n}\right)\otimes u\left(\cdot-\frac{k^{\prime}}{n}\right). (162)

We calculate that

mn=[DD]{fn}=k=1nk=1nm(Ak,kn)δknδkn(K),m_{n}=[\mathrm{D}\otimes\mathrm{D}]\{f_{n}\}=\sum_{k=1}^{n}\sum_{k^{\prime}=1}^{n}m(A_{k,k^{\prime}}^{n})\,\delta_{\frac{k}{n}}\otimes\delta_{\frac{k^{\prime}}{n}}\in\mathcal{M}(\mathrm{K}), (163)

and observe that, by construction, fnf_{n} is compactly supported in K\mathrm{K}. Therefore, fnDD,0(K)f_{n}\in\mathcal{M}_{\mathrm{D}\otimes\mathrm{D},0}(\mathrm{K}).
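As an illustrative aside (our addition, not part of the original proof), the construction (162)–(163) can be checked numerically: representing $m$ by its cell masses $m(A_{k,k^{\prime}}^{n})$ and assuming $u$ is the unit step with $u(0)=1$, the samples of $f_{n}$ at the grid points $(k/n,k^{\prime}/n)$ reduce to a double cumulative sum, and the mixed finite difference recovers the cell masses, mirroring (163).

```python
import numpy as np

# Sketch of (162)-(163): M[k-1, k'-1] plays the role of the cell mass
# m(A_{k,k'}^n); with u the unit step (u(0) = 1 assumed), f_n at the
# grid point (k/n, k'/n) is the cumulative sum of all masses with
# l <= k and l' <= k'.
rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))                    # cell masses m(A_{k,k'}^n)

F = np.cumsum(np.cumsum(M, axis=0), axis=1)        # f_n sampled at (k/n, k'/n)

# The mixed difference [D (x) D]{f_n} recovers the atom weights of (163).
Fp = np.pad(F, ((1, 0), (1, 0)))                   # f_n vanishes on the axes
recovered = np.diff(np.diff(Fp, axis=0), axis=1)
assert np.allclose(recovered, M)
```

This is simply the discrete counterpart of the fact that $[\mathrm{D}\otimes\mathrm{D}]$ inverts the double primitive $(u\otimes u)\ast(\cdot)$.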
Item 1. For [DD]{ϕ}[DD]{𝒞0(2)}[\mathrm{D}\otimes\mathrm{D}]\{\phi\}\in[\mathrm{D}\otimes\mathrm{D}]\{\mathcal{C}_{0}(\mathbb{R}^{2})\}, we calculate that

$\displaystyle|\langle f-f_{n},[\mathrm{D}\otimes\mathrm{D}]\{\phi\}\rangle|$
$\displaystyle=\;|\langle m-m_{n},\phi\rangle|$
$\displaystyle\leq\;\sum_{k,k^{\prime}=1}^{n,n}\left|\int_{A_{k,k^{\prime}}^{n}}\phi(\mathbf{x})\,\mathrm{d}m(\mathbf{x})-m(A_{k,k^{\prime}}^{n})\phi\left(\frac{k}{n},\frac{k^{\prime}}{n}\right)\right|$
$\displaystyle\leq\;\sum_{k,k^{\prime}=1}^{n,n}\int_{A_{k,k^{\prime}}^{n}}\left|\phi(\mathbf{x})-\phi\left(\frac{k}{n},\frac{k^{\prime}}{n}\right)\right|\mathrm{d}|m|(\mathbf{x})$
$\displaystyle\leq\;\left\lVert m\right\rVert_{\mathcal{M}}\sup_{\left\lVert\mathbf{x}-\mathbf{t}\right\rVert\leq\frac{\sqrt{2}}{n}}|\phi(\mathbf{x})-\phi(\mathbf{t})|\underset{n\to\infty}{\to}0, (164)

because $\phi$ is uniformly continuous. This establishes the weak$^{\star}$-convergence of $m_{n}$ to $m$.
Item 2. The observation that mnm\left\lVert m_{n}\right\rVert_{\mathcal{M}}\leq\left\lVert m\right\rVert_{\mathcal{M}} and the weak\mathrm{weak}^{\star}-convergence proved in Item 1 imply that

mliminfnmnlimsupnmnm.\left\lVert m\right\rVert_{\mathcal{M}}\leq\underset{n\to\infty}{\text{liminf}}\left\lVert m_{n}\right\rVert_{\mathcal{M}}\leq\underset{n\to\infty}{\text{limsup}}\left\lVert m_{n}\right\rVert_{\mathcal{M}}\leq\left\lVert m\right\rVert_{\mathcal{M}}. (165)
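A minimal numerical illustration of Item 2 (our addition, with an assumed separable, sign-changing density for $m$): along nested partitions, the discrete norms $\left\lVert m_{n}\right\rVert_{\mathcal{M}}=\sum_{k,k^{\prime}}|m(A_{k,k^{\prime}}^{n})|$ are nondecreasing and approach $\left\lVert m\right\rVert_{\mathcal{M}}$ from below.

```python
import numpy as np

# Assumed density p(x1, x2) = cos(2*pi*x1 + 0.3) * cos(2*pi*x2 + 0.7);
# ||m||_M is the integral of |p| over K = [0,1]^2 (midpoint quadrature).
fine = 720                                   # quadrature resolution, divisible by 16
t = (np.arange(fine) + 0.5) / fine
p = np.outer(np.cos(2 * np.pi * t + 0.3), np.cos(2 * np.pi * t + 0.7)) / fine**2
total = np.sum(np.abs(p))                    # ||m||_M

def discrete_norm(n):
    """||m_n||_M of (163): aggregate fine cells into the n-by-n blocks A_{k,k'}^n."""
    s = fine // n
    masses = p.reshape(n, s, n, s).sum(axis=(1, 3))   # m(A_{k,k'}^n)
    return np.sum(np.abs(masses))

norms = [discrete_norm(n) for n in (2, 4, 8, 16)]
assert all(a <= b + 1e-12 for a, b in zip(norms, norms[1:]))  # nested => nondecreasing
assert norms[-1] <= total + 1e-12 and norms[-1] > 0.9 * total
```

The monotonicity reflects the triangle inequality $|m(A)|\leq\sum_{i}|m(A_{i})|$ applied to each coarse cell split into finer ones.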

Item 3. We calculate that

$|f_{n}(x_{1},x_{2})|=\left|\int_{0}^{x_{1}}\int_{0}^{x_{2}}\mathrm{d}m_{n}(t_{1},t_{2})\right|\leq\left\lVert m_{n}\right\rVert_{\mathcal{M}}\leq\left\lVert m\right\rVert_{\mathcal{M}}. (166)

It follows that the family $\{(f_{n})_{n=1}^{\infty},f\}$ is bounded in absolute value by $\left\lVert m\right\rVert_{\mathcal{M}}\mathbbm{1}_{\mathrm{K}}$. In addition, for $(x_{1},x_{2})\in A_{k,k^{\prime}}^{n}$, we calculate that

$\displaystyle f(x_{1},x_{2})-f_{n}(x_{1},x_{2})$
$\displaystyle=\;m([0,x_{1}[\times[0,x_{2}[)-m\left(\left[0,\frac{k-1}{n}\right]\times\left[0,\frac{k^{\prime}-1}{n}\right]\right)$
$\displaystyle=\;m\left(\left[0,\frac{k-1}{n}\right]\times\left]\frac{k^{\prime}-1}{n},x_{2}\right[\right)$
$\displaystyle\quad+m\left(\left]\frac{k-1}{n},x_{1}\right[\times\left[0,\frac{k^{\prime}-1}{n}\right]\right)$
$\displaystyle\quad+m\left(\left]\frac{k-1}{n},x_{1}\right[\times\left]\frac{k^{\prime}-1}{n},x_{2}\right[\right)\underset{n\to\infty}{\to}0. (167)

In (167), we used the observation that taking the measure $m$ of the closed square or of the half-open square yields functions that belong to the same equivalence class. The convergence follows from the continuity from above of measures. Consequently, since the sequence $(f_{n})_{n=1}^{\infty}$ is dominated by an $\mathcal{L}_{1}(\mathrm{K})$ function and converges pointwise almost everywhere to $f$, we conclude by the Lebesgue dominated convergence theorem (LDC) that it converges to $f$ in $\mathcal{L}_{1}(\mathrm{K})$.
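The $\mathcal{L}_{1}$-convergence of Item 3 can likewise be illustrated numerically (a sketch we add under the assumption that $m$ has the density $p(x_{1},x_{2})=(2x_{1})(3x_{2}^{2})$, so that $f(x_{1},x_{2})=x_{1}^{2}x_{2}^{3}$ and the staircase approximant is $f_{n}(x_{1},x_{2})=f(\lfloor nx_{1}\rfloor/n,\lfloor nx_{2}\rfloor/n)$):

```python
import numpy as np

# L1(K) distance between f and its staircase approximant f_n, estimated
# by midpoint quadrature on a fine grid over K = [0,1]^2.
def l1_error(n, fine=400):
    t = (np.arange(fine) + 0.5) / fine                # midpoints avoid cell edges
    X1, X2 = np.meshgrid(t, t, indexing="ij")
    f = X1**2 * X2**3                                  # f(x1, x2) = x1^2 * x2^3
    fn = (np.floor(n * X1) / n) ** 2 * (np.floor(n * X2) / n) ** 3
    return np.mean(np.abs(f - fn))                     # ~ ||f - f_n||_{L1(K)}

errors = [l1_error(n) for n in (4, 8, 16, 32)]
assert all(a > b for a, b in zip(errors, errors[1:]))  # error shrinks as n grows
```

The observed decay is consistent with the $O(1/n)$ rate expected for a function with bounded gradient.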
Item 4. We first recall that, for $\mathrm{K}\subset\Omega$,

{fn}2={fn}(K)2={fn}(Ω)2.\left\lVert\nabla\{f_{n}\}\right\rVert_{\mathcal{M}^{2}}=\left\lVert\nabla\{f_{n}\}\right\rVert_{\mathcal{M}(\mathrm{K})^{2}}=\left\lVert\nabla\{f_{n}\}\right\rVert_{\mathcal{M}(\Omega)^{2}}. (168)

Since fnBV(Ω)f_{n}\in\mathrm{BV}(\Omega), one has from the theory of BV functions that {fn}(Ω)2\left\lVert\nabla\{f_{n}\}\right\rVert_{\mathcal{M}(\Omega)^{2}} is equal to

$\sup\left\{\int_{\Omega}f_{n}(\mathbf{x})\,\mathrm{div}\,\phi(\mathbf{x})\,\mathrm{d}\mathbf{x}:\phi\in\mathcal{C}_{c}^{1}(\Omega,\mathbb{R}^{2}),\left\lVert\phi\right\rVert_{\infty}\leq 1\right\}. (169)

For ϕ𝒞c1(Ω)\phi\in\mathcal{C}_{c}^{1}(\Omega), we calculate that

[ID]{ffn},ϕ\displaystyle\langle[\mathrm{I}\otimes\mathrm{D}]\{f-f_{n}\},\phi\rangle =(uδ0)(mmn),ϕ\displaystyle=\langle(u\otimes\delta_{0})\ast(m-m_{n}),\phi\rangle
=mmn,(uδ0)ϕn0,\displaystyle=\langle m-m_{n},(u^{\vee}\otimes\delta_{0})\ast\phi\rangle\underset{n\to\infty}{\to}0, (170)

because $m-m_{n}$ is compactly supported in $\mathrm{K}$, so we can use the same argument as in (164) with $(u^{\vee}\otimes\delta_{0})\ast\phi\in\mathcal{C}(\mathbb{R}^{2})$. It follows that, for all $\phi\in\mathcal{C}_{c}^{1}(\Omega,\mathbb{R}^{2})$,

{fnf},ϕn0{f}2liminfn{fn}2.\langle\nabla\{f_{n}-f\},\phi\rangle\underset{n\to\infty}{\to}0\Rightarrow\left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}\leq\underset{n\to\infty}{\mathrm{liminf}}\left\lVert\nabla\{f_{n}\}\right\rVert_{\mathcal{M}^{2}}. (171)

To continue, we observe that

[ID]{fn}=(uδ0)mn=01mn([0,t[×{})()dt.\left\lVert[\mathrm{I}\otimes\mathrm{D}]\{f_{n}\}\right\rVert_{\mathcal{M}}=\left\lVert(u\otimes\delta_{0})\ast m_{n}\right\rVert_{\mathcal{M}}=\\ \int_{0}^{1}\left\lVert m_{n}([0,t[\times\{\cdot\})\right\rVert_{\mathcal{M}(\mathbb{R})}\,\mathrm{d}t. (172)

Equation (172) also holds with fnf_{n} replaced by ff, and mnm_{n} by mm. If tAkn,t\in A_{k}^{n}, then

mn([0,t[×{})()\displaystyle\left\lVert m_{n}([0,t[\times\{\cdot\})\right\rVert_{\mathcal{M}(\mathbb{R})}
=\displaystyle= k=1n|mn([0,t[×{kn})|\displaystyle\sum_{k^{\prime}=1}^{n}\left|m_{n}\left([0,t[\times\left\{\frac{k^{\prime}}{n}\right\}\right)\right|
=\displaystyle= k=1n|=1k1m(A,kn)|\displaystyle\sum_{k^{\prime}=1}^{n}\left|\sum_{\ell=1}^{k-1}m(A_{\ell,k^{\prime}}^{n})\right|
=\displaystyle= k=1n|m([0,k1n]×Akn)|\displaystyle\sum_{k^{\prime}=1}^{n}\left|m\left(\left[0,\frac{k-1}{n}\right]\times A_{k^{\prime}}^{n}\right)\right|
\displaystyle\leq k=1n|m([0,t[×Akn)|+k=1n|m(]k1n,t[×Akn)|\displaystyle\sum_{k^{\prime}=1}^{n}\left|m\left(\left[0,t\right[\times A_{k^{\prime}}^{n}\right)\right|+\sum_{k^{\prime}=1}^{n}\left|m\left(\left]\frac{k-1}{n},t\right[\times A_{k^{\prime}}^{n}\right)\right|
\displaystyle\leq m([0,t[×{})()+m(]k1n,t[×{})().\displaystyle\left\lVert m([0,t[\times\{\cdot\})\right\rVert_{\mathcal{M}(\mathbb{R})}+\left\lVert m\left(\left]\frac{k-1}{n},t\right[\times\{\cdot\}\right)\right\rVert_{\mathcal{M}(\mathbb{R})}. (173)

It follows that

[ID]{fn}[ID]{f}+01m(]projn(t),t[×{})()dt,\left\lVert[\mathrm{I}\otimes\mathrm{D}]\{f_{n}\}\right\rVert_{\mathcal{M}}\leq\left\lVert[\mathrm{I}\otimes\mathrm{D}]\{f\}\right\rVert_{\mathcal{M}}\\ +\int_{0}^{1}\left\lVert m(]\text{proj}_{n}(t),t[\times\{\cdot\})\right\rVert_{\mathcal{M}(\mathbb{R})}\,\mathrm{d}t, (174)

where $\mathrm{proj}_{n}(t)$ returns the largest grid point $\frac{k-1}{n}$ such that $\frac{k-1}{n}\leq t$. We write $T_{n}=]\mathrm{proj}_{n}(t),t[$ and observe that a similar argument shows that

$\left\lVert[\mathrm{D}\otimes\mathrm{I}]\{f_{n}\}\right\rVert_{\mathcal{M}}\leq\left\lVert[\mathrm{D}\otimes\mathrm{I}]\{f\}\right\rVert_{\mathcal{M}}+\int_{0}^{1}\left\lVert m(\{\cdot\}\times T_{n})\right\rVert_{\mathcal{M}(\mathbb{R})}\,\mathrm{d}t. (175)
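As a sanity check of the sectional formula (172) (an illustration we add, not part of the proof), one can compare, for the discrete measure $m_{n}$ of (163), the integral of the sectional norms with a direct accumulation of the $x_{2}$-directional jumps of $f_{n}$ on a fine grid:

```python
import numpy as np

# Atoms of m_n sit at (k/n, k'/n), k, k' = 1..n, with masses M[k-1, k'-1].
rng = np.random.default_rng(1)
n = 6
M = rng.standard_normal((n, n))

# Right-hand side of (172): for t in ](j-1)/n, j/n], the section
# m_n([0, t[ x {.}) collects the atoms with k/n < t, i.e. k <= j-1.
partial = np.cumsum(np.vstack([np.zeros(n), M]), axis=0)[:-1]  # row j-1 = sum_{k<j} M[k]
rhs = np.sum(np.abs(partial)) / n

# Left-hand side: sample f_n = sum_k,k' M u(. - k/n) (x) u(. - k'/n)
# (with u(0) = 1 assumed) and accumulate its jumps in the x2-direction.
fine = 600                                   # divisible by n
x1 = (np.arange(fine) + 0.5) / fine          # midpoints never hit an atom abscissa
x2 = np.linspace(0.0, 1.0, fine + 1)         # includes the atom ordinates k'/n
F = np.zeros((fine, fine + 1))
for k in range(n):
    for kp in range(n):
        F += M[k, kp] * np.outer(x1 >= (k + 1) / n, x2 >= (kp + 1) / n)
lhs = np.sum(np.abs(np.diff(F, axis=1))) / fine   # total x2-jump, averaged over x1

assert abs(lhs - rhs) < 1e-8
```

Both sides evaluate the same piecewise-constant integrand in $t$, which is why the agreement is exact up to floating-point rounding.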

Next, the combination of (171), (174), and (175) yields that

{f}2limsupn{fn}2{f}2+C\left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}\leq\underset{n\to\infty}{\mathrm{limsup}}\left\lVert\nabla\{f_{n}\}\right\rVert_{\mathcal{M}^{2}}\leq\left\lVert\nabla\{f\}\right\rVert_{\mathcal{M}^{2}}+C (176)

with

C=limsup n01(m(Tn×{})()+m({}×Tn)())dt.C=\underset{n\to\infty}{\text{limsup }}\int_{0}^{1}\Big(\left\lVert m(T_{n}\times\{\cdot\})\right\rVert_{\mathcal{M}(\mathbb{R})}\\ +\left\lVert m(\{\cdot\}\times T_{n})\right\rVert_{\mathcal{M}(\mathbb{R})}\Big)\,\mathrm{d}t. (177)

Finally, we observe that

(m(Tn×{})()+m({}×Tn)())2m,\Big(\left\lVert m(T_{n}\times\{\cdot\})\right\rVert_{\mathcal{M}(\mathbb{R})}+\left\lVert m(\{\cdot\}\times T_{n})\right\rVert_{\mathcal{M}(\mathbb{R})}\Big)\leq 2\left\lVert m\right\rVert_{\mathcal{M}}, (178)

and that

$\lim_{n\to\infty}\Big(\left\lVert m(T_{n}\times\{\cdot\})\right\rVert_{\mathcal{M}(\mathbb{R})}+\left\lVert m(\{\cdot\}\times T_{n})\right\rVert_{\mathcal{M}(\mathbb{R})}\Big)=0, (179)

because

nTn=.\bigcap_{n}T_{n}=\emptyset. (180)

Equation (179) follows from the continuity from above of the total-variation measure. It follows from (178) and (179) that one can apply the LDC to find that $C=0$. The combination of (171) and (176) implies the desired equality. ∎