Compressive Sensing
Gabriel Peyré
www.numerical-tours.com
Overview


• Compressive Sensing Acquisition

• Theoretical Guarantees

• Fourier Domain Measurements

• Parameter Selection
Single Pixel Camera (Rice)

    y[i] = ⟨f̃, φ_i⟩

    P measurements, N micro-mirrors.

[Figure: reconstructions from P/N = 1, P/N = 0.16 and P/N = 0.02 measurements.]
CS Hardware Model
CS is about designing hardware: input signals f̃ ∈ L²(ℝ²).
Physical hardware resolution limit: target resolution f ∈ ℝ^N.

    f̃ ∈ L²  →  (array resolution)  →  f ∈ ℝ^N  →  (micro-mirrors, operator K)  →  y ∈ ℝ^P
                                   CS hardware

[Figure: the operator K, displayed as the collection of mirror patterns correlated with f.]
Sparse CS Recovery
f0 ∈ ℝ^N sparse in an ortho-basis Ψ:  f0 = Ψ x0,  x0 ∈ ℝ^N.

(Discretized) sampling acquisition:
    y = K f0 + w = K Ψ(x0) + w,    Φ = K Ψ

K drawn from the Gaussian matrix ensemble:
    K_{i,j} ~ N(0, P^{−1/2}) i.i.d.
⇒ Φ drawn from the Gaussian matrix ensemble.

Sparse recovery:
    min_{||Φx − y|| ≤ ||w||} ||x||_1
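A minimal numpy sketch of this pipeline (an illustration, not the Numerical Tours code): a k-sparse x0 is measured through a Gaussian Φ and recovered by ISTA applied to the Lagrangian form (1/2)||Φx − y||² + λ||x||_1 rather than the constrained problem above; the sizes, noise level and λ are arbitrary choices.

```python
import numpy as np

# Compressive sensing sketch: Gaussian measurements of a k-sparse vector,
# recovered by ISTA on (1/2)||Phi x - y||^2 + lam * ||x||_1.
rng = np.random.default_rng(0)
N, P, k = 400, 100, 10

x0 = np.zeros(N)                                  # k-sparse signal
x0[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
Phi = rng.standard_normal((P, N)) / np.sqrt(P)    # Gaussian sensing matrix

sigma = 0.01
y = Phi @ x0 + sigma * rng.standard_normal(P)

def soft(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

lam = 0.05                        # regularization parameter (arbitrary here)
L = np.linalg.norm(Phi, 2) ** 2   # Lipschitz constant of the gradient
x = np.zeros(N)
for _ in range(500):              # ISTA iterations
    grad = Phi.T @ (Phi @ x - y)
    x = soft(x - grad / L, lam / L)

print("relative error:", np.linalg.norm(x - x0) / np.linalg.norm(x0))
```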
CS Simulation Example
Original f0;  Ψ = translation-invariant wavelet frame.
Overview


• Compressive Sensing Acquisition

• Theoretical Guarantees

• Fourier Domain Measurements

• Parameter Selection
CS with RIP
ℓ1 recovery:
    x* ∈ argmin_{||Φx − y|| ≤ ε} ||x||_1    where y = Φ x0 + w and ε ≥ ||w||

Restricted Isometry Constants:
    ∀ x with ||x||_0 ≤ k:   (1 − δ_k) ||x||² ≤ ||Φx||² ≤ (1 + δ_k) ||x||²

Theorem: If δ_{2k} ≤ √2 − 1, then        [Candès 2009]
    ||x0 − x*|| ≤ (C0/√k) ||x0 − x_k||_1 + C1 ε
where x_k is the best k-term approximation of x0.
Singular Values Distributions
Eigenvalues of Φ_I* Φ_I with |I| = k are essentially in [a, b]:
    a = (1 − √β)²  and  b = (1 + √β)²,    where β = k/P.

When k = βP → +∞, the eigenvalue distribution tends to
    f(λ) = (1/(2πβλ)) √((b − λ)₊ (λ − a)₊)        [Marchenko–Pastur]

Large deviation inequality [Ledoux].

[Figure: empirical eigenvalue histograms and the limiting density for P = 200 and k = 10, 30, 50.]

Theorem: If k ≤ C P / log(N/P), then δ_{2k} ≤ √2 − 1 with high probability.
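This concentration is easy to check numerically. The sketch below (plain numpy, arbitrary sizes) draws random supports I, collects the eigenvalues of the Gram matrices Φ_I*Φ_I, and compares their range with the predicted interval [a, b]; the last lines evaluate the Marchenko–Pastur density for comparison with a histogram.

```python
import numpy as np

# Empirical check of the Marchenko-Pastur prediction: eigenvalues of
# Phi_I^* Phi_I for random supports I of size k concentrate on
# [(1 - sqrt(beta))^2, (1 + sqrt(beta))^2] with beta = k/P.
rng = np.random.default_rng(0)
N, P, k = 1000, 200, 30
beta = k / P
a, b = (1 - np.sqrt(beta)) ** 2, (1 + np.sqrt(beta)) ** 2

Phi = rng.standard_normal((P, N)) / np.sqrt(P)
eigs = []
for _ in range(200):                       # Monte-Carlo over random supports
    I = rng.choice(N, k, replace=False)
    G = Phi[:, I].T @ Phi[:, I]            # Gram matrix Phi_I^* Phi_I
    eigs.append(np.linalg.eigvalsh(G))
eigs = np.concatenate(eigs)

print(f"predicted support  [{a:.3f}, {b:.3f}]")
print(f"empirical range    [{eigs.min():.3f}, {eigs.max():.3f}]")

# Marchenko-Pastur density on [a, b] (to overlay on a histogram of eigs):
lam = np.linspace(a, b, 200)
f = np.sqrt(np.maximum(b - lam, 0) * np.maximum(lam - a, 0)) / (2 * np.pi * beta * lam)
```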
Numerics with RIP
Stability constants of A:
    (1 − δ₁(A)) ||α||² ≤ ||A α||² ≤ (1 + δ₂(A)) ||α||²
    1 − δ₁(A) and 1 + δ₂(A): smallest / largest eigenvalues of A*A.

Upper / lower RIC:
    δ_k^i = max_{|I| = k} δ_i(Φ_I),  i = 1, 2
    δ_k = min(δ_k^1, δ_k^2)

Monte-Carlo estimation:  δ̂_k ≤ δ_k.

[Figure: Monte-Carlo estimates δ̂_k^1, δ̂_k^2 as a function of k, compared with the threshold √2 − 1, for N = 4000, P = 1000.]
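A rough Monte-Carlo version of this estimation in numpy (much smaller N, P than the N = 4000, P = 1000 of the figure): random supports give a lower bound δ̂ on the true constants, exactly as stated above.

```python
import numpy as np

# Monte-Carlo lower bound on the restricted isometry constants: draw random
# supports I of size k and track the extreme eigenvalues of Phi_I^* Phi_I
# (the true RIC would require a maximum over all supports).
rng = np.random.default_rng(0)
N, P = 400, 100
Phi = rng.standard_normal((P, N)) / np.sqrt(P)

def ric_estimate(Phi, k, trials=500):
    d1, d2 = 0.0, 0.0            # lower / upper deviations from 1
    for _ in range(trials):
        I = rng.choice(Phi.shape[1], k, replace=False)
        ev = np.linalg.eigvalsh(Phi[:, I].T @ Phi[:, I])
        d1 = max(d1, 1 - ev[0])  # delta_k^1: how far the spectrum dips below 1
        d2 = max(d2, ev[-1] - 1) # delta_k^2: how far it rises above 1
    return d1, d2

for k in (5, 10, 20):
    d1, d2 = ric_estimate(Phi, k)
    print(f"k={k:3d}  delta1_hat={d1:.3f}  delta2_hat={d2:.3f}")
```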
Polytope Noiseless Recovery
Counting faces of random polytopes:        [Donoho]
    All x0 such that ||x0||_0 ≤ C_all(P/N) · P are identifiable.
    Most x0 such that ||x0||_0 ≤ C_most(P/N) · P are identifiable.

    C_all(1/4) ≈ 0.065
    C_most(1/4) ≈ 0.25

→ Sharp constants.
→ No noise robustness.
→ Computation of "pathological" signals        [Dossal, P, Fadili, 2010]

[Figure: RIP, "All" and "Most" identifiability curves.]
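A crude way to see the "most signals" threshold numerically, as a hedged sketch with arbitrary small sizes (so the transition is far less sharp than Donoho's asymptotic constants): basis pursuit is solved exactly as a linear program with scipy.

```python
import numpy as np
from scipy.optimize import linprog

# Rough empirical check of the "most signals" phase transition at P/N = 1/4:
# random k-sparse vectors are measured by a Gaussian matrix and recovered by
# basis pursuit (min ||x||_1 s.t. Phi x = y), written as an LP with
# x = u - v, u >= 0, v >= 0 (linprog's default bounds are non-negative).
rng = np.random.default_rng(0)
N, P = 120, 30                               # P/N = 1/4, kept small for speed
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
A_eq = np.hstack([Phi, -Phi])                # [Phi, -Phi] [u; v] = y
c = np.ones(2 * N)                           # minimize sum(u) + sum(v) = ||x||_1

def bp_recover(y):
    res = linprog(c, A_eq=A_eq, b_eq=y, method="highs")
    return res.x[:N] - res.x[N:]

for k in (3, 6, 9, 12):
    ok = 0
    for _ in range(20):
        x0 = np.zeros(N)
        x0[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
        x = bp_recover(Phi @ x0)
        ok += np.linalg.norm(x - x0) <= 1e-6 * np.linalg.norm(x0)
    # Cmost(1/4) ~ 0.25 predicts success roughly up to k ~ 0.25 * P
    print(f"k={k:2d} (k/P={k/P:.2f})  exact recovery {ok}/20")
```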
Overview


• Compressive Sensing Acquisition

• Theoretical Guarantees

• Fourier Domain Measurements

• Parameter Selection
Tomography and Fourier Measures
    f̂ = FFT2(f)

Fourier slice theorem:
    p̂_θ(ω) = f̂(ω cos θ, ω sin θ)        (1D ↔ 2D Fourier)

Partial Fourier measurements:  {p_{θ_k}(t)}_{0 ≤ k < K},  t ∈ ℝ.
Equivalent to:  K f = (f̂[ω])_{ω ∈ Ω}.
Regularized Inversion
Noisy measurements:  ∀ ω ∈ Ω,  y[ω] = f̂0[ω] + w[ω].
Noise: w[ω] ~ N(0, σ²), white noise.

ℓ1 regularization:
    f* = argmin_f  (1/2) Σ_{ω ∈ Ω} |y[ω] − f̂[ω]|²  +  λ Σ_m |⟨f, ψ_m⟩|
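A small numpy sketch of this inversion (illustration only): the operator Kf = (f̂[ω])_{ω∈Ω} and its adjoint are built from a random frequency mask, and a few ISTA steps minimize the ℓ1-regularized objective. To stay self-contained, the sparsity basis Ψ is taken here to be the pixel (Dirac) basis on a synthetic spiky image instead of the wavelet basis of the slide; the mask density, λ and sizes are arbitrary.

```python
import numpy as np

# Partial-Fourier measurements K f = (f_hat[omega])_{omega in Omega}
# and ISTA for the l1-regularized inversion (sparsity in the pixel basis).
rng = np.random.default_rng(0)
n = 64
f0 = np.zeros((n, n))
f0[rng.integers(0, n, 40), rng.integers(0, n, 40)] = 1.0   # sparse toy image

mask = rng.random((n, n)) < 0.3           # random frequency set Omega

def K(f):                                 # K f = f_hat restricted to Omega
    return np.fft.fft2(f, norm="ortho") * mask

def Kt(y):                                # adjoint: zero-fill, inverse FFT, real part
    return np.real(np.fft.ifft2(y * mask, norm="ortho"))

noise = 0.01 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
y = K(f0) + noise * mask

lam, f = 0.02, np.zeros((n, n))
for _ in range(200):                      # ISTA; step 1 is valid since ||K|| = 1
    g = Kt(K(f) - y)
    f = np.sign(f - g) * np.maximum(np.abs(f - g) - lam, 0)

print("relative error:", np.linalg.norm(f - f0) / np.linalg.norm(f0))
```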
MRI Imaging
From [Lustig et al.]

MRI Reconstruction
From [Lustig et al.]
Fourier sub-sampling pattern: randomization.
[Figure: high-resolution and low-resolution sampling patterns, linear and sparsity-based reconstructions.]
Structured Measurements
Gaussian matrices: intractable for large N.

Random partial orthogonal matrix:  {φ_ω}_ω orthogonal basis,
    K f = (⟨φ_ω, f⟩)_{ω ∈ Ω}    where |Ω| = P, drawn uniformly at random.

Fast measurements (e.g. Fourier basis).

Mutual incoherence:
    µ = √N · max_{ω, m} |⟨φ_ω, ψ_m⟩|  ∈  [1, √N]

Theorem: with high probability on Ω, for Φ = K Ψ,
    if  k ≤ C P / (µ² log(N)⁴),  then  δ_{2k} ≤ √2 − 1.
    [Rudelson, Vershynin, 2006]
→ not universal: requires incoherence.
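For intuition, the incoherence is straightforward to evaluate on small explicit bases. The numpy snippet below (sizes chosen arbitrarily) checks the two extreme cases for the DFT: Dirac spikes give µ = 1, while measuring the Fourier basis against itself gives µ = √N.

```python
import numpy as np

# Mutual incoherence mu = sqrt(N) * max_{omega,m} |<phi_omega, psi_m>|
# between two orthonormal bases, given as the columns of U and V.
N = 64
F = np.fft.fft(np.eye(N), norm="ortho")      # orthonormal DFT basis (columns)

def incoherence(U, V):
    return np.sqrt(U.shape[0]) * np.max(np.abs(U.conj().T @ V))

print("mu(Fourier, Dirac)   =", incoherence(F, np.eye(N)))   # = 1 (maximally incoherent)
print("mu(Fourier, Fourier) =", incoherence(F, F))           # = sqrt(N) (maximally coherent)
```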
Overview


• Compressive Sensing Acquisition

• Theoretical Guarantees

• Fourier Domain Measurements

• Parameter Selection
Risk Minimization
Estimator: e.g.  x_λ(y) ∈ argmin_x  (1/2) ||y − Φx||² + λ ||x||_1
λ > 0 is a regularization parameter. How to choose its value?

Risk-based selection of λ:
    Average risk:  R(λ) = E_w(||x_λ(y) − x0||²)  measures the expected quality of x_λ(y) with respect to x0.
    The optimal (theoretical) λ minimizes the risk:  λ*(y) = argmin_λ R(λ).
    Plug-in estimator:  x_{λ*(y)}(y).

But:
    E_w is not accessible → use one observation.
    x0 is not accessible → need risk estimators.
Can we estimate the risk solely from x_λ(y)?
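To make the definitions concrete, here is a Monte-Carlo evaluation of R(λ) on a toy problem where it is actually computable, because x0 is known and the estimator has a closed form (Φ = Id, so x_λ(y) is soft thresholding of y). In a real problem neither E_w nor x0 is available, which is exactly the motivation for the risk estimators below; sizes and the λ grid are arbitrary.

```python
import numpy as np

# Monte-Carlo approximation of R(lambda) = E_w ||x_lambda(y) - x0||^2 for
# pure denoising (Phi = Id), where x_lambda(y) = soft(y, lambda) is the
# exact minimizer of (1/2)||y - x||^2 + lambda * ||x||_1.
rng = np.random.default_rng(0)
N, k, sigma = 400, 20, 0.5
x0 = np.zeros(N)
x0[rng.choice(N, k, replace=False)] = 5 * rng.standard_normal(k)

def soft(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

lambdas = np.linspace(0.05, 3.0, 40)
risk = np.zeros_like(lambdas)
for _ in range(200):                      # average over noise realizations
    y = x0 + sigma * rng.standard_normal(N)
    for i, lam in enumerate(lambdas):
        risk[i] += np.sum((soft(y, lam) - x0) ** 2)
risk /= 200

print("lambda minimizing the Monte-Carlo risk:", lambdas[np.argmin(risk)])
```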
Prediction Risk Estimation
Prediction:  µ_λ(y) = Φ x_λ(y)

Sensitivity analysis: if µ_λ is weakly differentiable,
    µ_λ(y + δ) = µ_λ(y) + ∂µ_λ(y) · δ + O(||δ||²)

Stein Unbiased Risk Estimator:
    SURE_λ(y) = ||y − µ_λ(y)||² − σ² P + 2 σ² df_λ(y)
    df_λ(y) = tr(∂µ_λ(y)) = div(µ_λ)(y)

Theorem: [Stein, 1981]
    E_w(SURE_λ(y)) = E_w(||Φ x0 − µ_λ(y)||²)

Other estimators: GCV, BIC, AIC, ...
Generalized SURE: estimate  E_w(||P_{ker(Φ)⊥}(x0 − x_λ(y))||²).
Computation for L1 Regularization
Sparse estimator:  x_λ(y) ∈ argmin_x  (1/2) ||y − Φx||² + λ ||x||_1

Theorem: for all y, there exists a solution x* such that Φ_I is injective, where I = supp(x*), and
    df_λ(y) = div(Φ x_λ)(y) = ||x*||_0        [Dossal et al. 2011]
[Figure: compressed sensing using multi-scale wavelet thresholding. Φ ∈ ℝ^{P×N} is a realization of a random matrix with P = N/4, and Ψ is a translation-invariant wavelet frame. Panels: (a) observations y, (b) x_λ(y) at the optimal λ, (c) the maximum-likelihood estimate x_ML, and the quadratic loss (projection risk, GSURE, true risk) plotted against the regularization parameter λ.]
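Using df_λ(y) = ||x*||_0 from the theorem above, SURE is a one-line correction of the residual. The numpy sketch below (arbitrary sizes, σ and λ grid; the lasso solved with plain ISTA) compares SURE with the true prediction risk, which is computable here only because x0 is known.

```python
import numpy as np

# SURE for the l1 estimator, using df_lambda(y) = ||x*||_0:
#     SURE(y) = ||y - Phi x*||^2 - sigma^2 * P + 2 * sigma^2 * ||x*||_0.
rng = np.random.default_rng(0)
N, P, k, sigma = 200, 80, 8, 0.05
x0 = np.zeros(N)
x0[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
y = Phi @ x0 + sigma * rng.standard_normal(P)

def soft(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def lasso_ista(y, Phi, lam, n_iter=2000):
    L = np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = soft(x - Phi.T @ (Phi @ x - y) / L, lam / L)
    return x

for lam in (0.005, 0.02, 0.05, 0.1):
    x = lasso_ista(y, Phi, lam)
    df = np.count_nonzero(np.abs(x) > 1e-8)        # df = ||x*||_0
    sure = np.sum((y - Phi @ x) ** 2) - sigma**2 * P + 2 * sigma**2 * df
    pred_risk = np.sum((Phi @ x - Phi @ x0) ** 2)  # true prediction risk (x0 known here)
    print(f"lambda={lam:.3f}  SURE={sure:8.4f}  true={pred_risk:8.4f}")
```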
Anisotropic Total-Variation
Extension to ℓ1 analysis regularization and total variation.        [Vaiter et al. 2012]

For any z ∈ ℝ^P, ν = ν(z) solves the linear system
    [ Φ*Φ    D_J ] [ ν  ]     [ Φ*z ]
    [ D_J*    0  ] [ ν̃ ]  =  [  0  ]
In practice, by the law of large numbers, the expectation is replaced by an empirical mean, and ν(z) is computed by solving the linear system with a conjugate gradient solver.

Numerical example: super-resolution using (anisotropic) total variation.
    Φ: vertical sub-sampling.    Finite-differences gradient: D = [∂1, ∂2].

[Figure: (a) observations y, (b) x_λ(y) at the optimal λ; quadratic loss (projection risk, GSURE, true risk) as a function of λ.]
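The "empirical mean instead of expectation" step corresponds to a randomized trace estimate of the degrees of freedom. The generic numpy sketch below (not the conjugate-gradient computation of Vaiter et al.; sizes, ε and the probe count are arbitrary) estimates df(y) = div(µ)(y) by finite differences on random probes, and checks it on soft thresholding, whose exact divergence is the number of entries above the threshold.

```python
import numpy as np

# Monte-Carlo divergence estimate: df(y) ~ E_delta[ <delta, (mu(y + eps*delta) - mu(y)) / eps> ],
# usable when mu(y) is only available as a black box (e.g. an iterative solver).
rng = np.random.default_rng(0)

def soft(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

P, lam, eps = 500, 1.0, 1e-4
y = 2 * rng.standard_normal(P)
mu = lambda z: soft(z, lam)              # black-box estimator

n_probes, df_mc = 20, 0.0
for _ in range(n_probes):                # empirical mean over random probes
    delta = rng.standard_normal(P)
    df_mc += delta @ (mu(y + eps * delta) - mu(y)) / eps
df_mc /= n_probes

print("Monte-Carlo df :", df_mc)
print("exact df       :", np.count_nonzero(np.abs(y) > lam))
```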
Conclusion
Sparsity: approximate signals with few atoms from a dictionary.

Compressed sensing ideas:
    Randomized sensors + sparse recovery.
    Number of measurements ∼ signal complexity.
    CS is about designing new hardware.

The devil is in the constants:
    Worst-case analysis is problematic.
    Designing good signal models.