The ORFEO Tool Box Software Guide
Updated for OTB-3.20
OTB Development Team
November 12, 2013
http://www.orfeo-toolbox.org
e-mail: otb@cnes.fr
The ORFEO Toolbox is not a black box.
Ch.D.
FOREWORD
Alongside the development of the Pleiades (PHR) and Cosmo-Skymed (CSK) systems that form ORFEO,
the dual and bilateral (France - Italy) system for Earth Observation, the ORFEO Accompaniment
Program was set up to prepare, accompany and promote the use and exploitation of the images
derived from these sensors.
The creation of a preparatory program1 is needed because of:
• the new capabilities and performance of the ORFEO systems (optical and radar high resolution,
access capability, data quality, and the possibility of simultaneous optical and radar acquisition),
• the resulting need for new methodological developments: new processing methods, or adaptation
of existing methods,
• the need to carry out these new developments in very close cooperation with the final users, for
better integration of the new products into their systems.
This program was initiated by CNES in mid-2003 and will last until mid-2013. It consists of two parts,
between which a strong interaction must be maintained:
• A Thematic part,
• A Methodological part.
The Thematic part covers a wide range of applications (civil and defence) and aims at specifying and
validating the value-added products and services required by end users. This part includes consideration
of product integration into operational systems or processing chains. It also includes careful
thought about the intermediary structures to be developed to help non-autonomous users. Lastly, this part
aims at raising future users' awareness through practical demonstrations and validations.
1 http://smsc.cnes.fr/PLEIADES/A_prog_accomp.htm
The objective of the Methodological part is the definition and development of tools for the operational
exploitation of submetric optical and radar images (three-dimensional aspects, change detection,
texture analysis, pattern matching, optical/radar complementarity). It is mainly based on R&D
studies and on doctoral and post-doctoral research.
In this context, CNES2 decided to develop the ORFEO ToolBox (OTB), a set of algorithms encapsulated
in a software library. The goal of OTB is to capitalise on methodological know-how and to adopt an
incremental development approach, in order to efficiently exploit the results obtained in
the framework of methodological R&D studies.
All the developments are based on FLOSS (Free/Libre Open Source Software)
or on existing CNES developments. OTB is distributed under the CeCILL licence,
http://www.cecill.info/licences/Licence_CeCILL_V2-en.html.
OTB is implemented in C++ and is mainly based on ITK3 (Insight Toolkit).
2 http://www.cnes.fr
3 http://www.itk.org
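As a foretaste of the programming style documented throughout this guide, the short sketch below shows what a minimal OTB pipeline looks like in C++: an image is read from a file and written back to another file, unchanged. The reader and writer classes are standard OTB components; the pixel type and the use of command-line arguments are arbitrary choices made here for illustration, and the Tutorials part of this guide covers this material in detail.

#include <cstdlib>
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"

int main(int argc, char* argv[])
{
  if (argc < 3)
    {
    // Usage: program <input image> <output image>
    return EXIT_FAILURE;
    }

  // A two-dimensional image of unsigned short pixels.
  typedef otb::Image<unsigned short, 2>   ImageType;
  typedef otb::ImageFileReader<ImageType> ReaderType;
  typedef otb::ImageFileWriter<ImageType> WriterType;

  ReaderType::Pointer reader = ReaderType::New();
  WriterType::Pointer writer = WriterType::New();

  // Input and output file names are taken from the command line.
  reader->SetFileName(argv[1]);
  writer->SetFileName(argv[2]);

  // Connect the reader to the writer and trigger the pipeline execution.
  writer->SetInput(reader->GetOutput());
  writer->Update();

  return EXIT_SUCCESS;
}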
CONTENTS
I Introduction 1
1 Welcome 3
1.1 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 How to Learn OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Software Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3.1 Obtaining the Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Downloading OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4.1 Join the Mailing List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4.2 Directory Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4.3 Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4.4 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 The OTB Community and Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6 A Brief History of OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.6.1 ITK’s history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Installation 11
2.1 Installing binary packages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1.1 Windows 2000/XP/Vista/Seven . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1.2 MacOS X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.3 Linux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Ubuntu 10.04 and higher . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
OpenSuse 11.X and higher . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
CentOS 5.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Building from sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.1 Getting the OTB source code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.2 External Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.3 Configuring OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Preparing CMake . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Compiling OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3 Getting Started With OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.3.1 Hello World ! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3 System Overview 25
3.1 System Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.2 Essential System Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2.1 Generic Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2.2 Include Files and Class Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2.3 Object Factories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2.4 Smart Pointers and Memory Management . . . . . . . . . . . . . . . . . . . . . . . 28
3.2.5 Error Handling and Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2.6 Event Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.2.7 Multi-Threading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3 Numerics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.4 Data Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.5 Data Processing Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.6 Spatial Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
II Tutorials 37
4 Building Simple Applications with OTB 39
4.1 Hello world . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Pipeline basics: read and write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.3 Filtering pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.4 Handling types: scaling output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.5 Working with multispectral or color images . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.6 Parsing command line arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.7 Viewer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.8 Going from raw satellite images to useful products . . . . . . . . . . . . . . . . . . . . . . . 56
III User’s guide 61
5 Data Representation 63
5.1 Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.1.1 Creating an Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.1.2 Reading an Image from a File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1.3 Accessing Pixel Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.1.4 Defining Origin and Spacing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.1.5 Accessing Image Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.1.6 RGB Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.1.7 Vector Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.1.8 Importing Image Data from a Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.1.9 Image Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.2 PointSet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.2.1 Creating a PointSet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.2.2 Getting Access to Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.2.3 Getting Access to Data in Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2.4 Vectors as Pixel Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3 Mesh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.1 Creating a Mesh . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.3.2 Inserting Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.3.3 Managing Data in Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.4 Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.4.1 Creating a PolyLineParametricPath . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6 Reading and Writing Images 99
6.1 Basic Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2 Pluggable Factories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.3 IO Streaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.3.1 Implicit Streaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.3.2 Explicit Streaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.4 Reading and Writing RGB Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.5 Reading, Casting and Writing Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.6 Extracting Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.7 Reading and Writing Vector Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.7.1 Reading and Writing Complex Images . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.8 Reading and Writing Multiband Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.8.1 Extracting ROIs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.9 Reading Image Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.10 Extended filename for reader and writer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.10.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.10.2 Reader options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.10.3 Writer options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
7 Reading and Writing Auxiliary Data 123
7.1 Reading DEM Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.2 Elevation management with OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.3 Lidar data Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
7.4 Reading and Writing Shapefiles and KML . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.5 Handling large vector data through OGR . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
8 Basic Filtering 139
8.1 Thresholding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.1.1 Binary Thresholding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
8.1.2 General Thresholding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.1.3 Threshold to Point Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
8.2 Mathematical operations on images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
8.3 Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
8.3.1 Gradient Magnitude . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.3.2 Gradient Magnitude With Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . 152
8.3.3 Derivative Without Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
8.4 Second Order Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.4.1 Laplacian Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Laplacian Filter Recursive Gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.5 Edge Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.5.1 Canny Edge Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.5.2 Ratio of Means Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
8.6 Neighborhood Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.6.1 Mean Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.6.2 Median Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.6.3 Mathematical Morphology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Binary Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Grayscale Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
8.7 Smoothing Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.7.1 Blurring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Discrete Gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.7.2 Edge Preserving Smoothing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Introduction to Anisotropic Diffusion . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Gradient Anisotropic Diffusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Mean Shift filtering and clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
8.7.3 Edge Preserving Speckle Reduction Filters . . . . . . . . . . . . . . . . . . . . . . 180
8.7.4 Edge preserving Markov Random Field . . . . . . . . . . . . . . . . . . . . . . . . 183
8.8 Distance Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
8.9 Rasterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
9 Image Registration 191
9.1 Registration Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
9.2 "Hello World" Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
9.3 Features of the Registration Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
9.3.1 Direction of the Transform Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . 200
9.3.2 Registration is done in physical space . . . . . . . . . . . . . . . . . . . . . . . . . 202
9.4 Multi-Modality Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
9.4.1 Viola-Wells Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
9.5 Centered Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.5.1 Rigid Registration in 2D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.5.2 Centered Affine Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
9.6 Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
9.6.1 Geometrical Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
9.6.2 Transform General Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
9.6.3 Identity Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
9.6.4 Translation Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
9.6.5 Scale Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
9.6.6 Scale Logarithmic Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
9.6.7 Euler2DTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
9.6.8 CenteredRigid2DTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
9.6.9 Similarity2DTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
9.6.10 QuaternionRigidTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
9.6.11 VersorTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
9.6.12 VersorRigid3DTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.6.13 Euler3DTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.6.14 Similarity3DTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.6.15 Rigid3DPerspectiveTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9.6.16 AffineTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
9.6.17 BSplineDeformableTransform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
9.6.18 KernelTransforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
9.7 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
9.7.1 Mean Squares Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Exploring a Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
9.7.2 Normalized Correlation Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
9.7.3 Mean Reciprocal Square Differences . . . . . . . . . . . . . . . . . . . . . . . . . . 241
9.7.4 Mutual Information Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Parzen Windowing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Viola and Wells Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
Mattes et al. Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
9.7.5 Kullback-Leibler distance metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
9.7.6 Normalized Mutual Information Metric . . . . . . . . . . . . . . . . . . . . . . . . 245
9.7.7 Mean Squares Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
9.7.8 Correlation Coefficient Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
9.7.9 Cardinality Match Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
9.7.10 Kappa Statistics Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
9.7.11 Gradient Difference Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
9.8 Optimizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
9.9 Landmark-based registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
10 Disparity Map Estimation 255
10.1 Disparity Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
10.1.1 Geometric deformation modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
10.1.2 Similarity measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
10.1.3 The correlation coefficient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
10.2 Regular grid disparity map estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
10.3 Irregular grid disparity map estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
10.4 Stereo reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
11 Orthorectification and Map Projection 277
11.1 Sensor Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
11.1.1 Types of Sensor Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
11.1.2 Using Sensor Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
11.1.3 Evaluating Sensor Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
11.1.4 Limits of the Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
11.2 Map Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
11.3 Orthorectification with OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
11.4 Vector data projection manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
11.5 Geometries projection manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
11.6 Elevation management with OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
11.7 Vector data area extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
12 Radiometry 301
12.1 Radiometric Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
12.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
12.1.2 NDVI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
12.1.3 ARVI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
12.1.4 AVI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
12.2 Atmospheric Corrections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
13 Image Fusion 319
13.1 Simple Pan Sharpening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
13.2 Bayesian Data Fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
14 Feature Extraction 327
14.1 Textures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
14.1.1 Haralick Descriptors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
14.1.2 PanTex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
14.1.3 Structural Feature Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
14.2 Interest Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
14.2.1 Harris detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
14.2.2 SIFT detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
14.2.3 SURF detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
14.3 Alignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
14.4 Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
14.4.1 Line Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
14.4.2 Segment Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Local Hough Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352
Line Segment Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
14.4.3 Right Angle Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
14.5 Density Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
14.5.1 Edge Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
14.5.2 SIFT Density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
14.6 Geometric Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
14.6.1 Complex Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
Complex Moments for Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
Complex Moments for Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
14.6.2 Hu Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
Hu Moments for Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
14.6.3 Flusser Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Flusser Moments for Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
14.7 Road extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
14.7.1 Road extraction filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
14.7.2 Step by step road extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
14.8 Cloud Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
15 Multi-scale Analysis 381
15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
15.2 Morphological Pyramid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
15.2.1 Morphological Pyramid Exploitation . . . . . . . . . . . . . . . . . . . . . . . . . . 388
16 Image Segmentation 395
16.1 Region Growing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
16.1.1 Connected Threshold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
16.1.2 Otsu Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
16.1.3 Neighborhood Connected . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
16.1.4 Confidence Connected . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
16.2 Segmentation Based on Watersheds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
16.2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
16.2.2 Using the ITK Watershed Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
16.3 Level Set Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
16.3.1 Fast Marching Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
17 Image Simulation 425
17.1 PROSAIL model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
17.2 Image Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
17.2.1 LAI image estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
17.2.2 Sensor RSR Image Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
18 Dimension Reduction 443
18.1 Principal Component Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
18.2 Noise-Adjusted Principal Components Analysis . . . . . . . . . . . . . . . . . . . . . . . . 445
18.3 Maximum Noise Fraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
18.4 Fast Independent Component Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 450
18.5 Maximum Autocorrelation Factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
19 Classification 455
19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
19.2 Unsupervised classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
19.2.1 K-Means Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Simple version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
General approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
k-d Tree Based k-Means Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
19.2.2 Kohonen’s Self Organizing Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
Building a color table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
SOM Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
Multi-band, streamed classification . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
19.2.3 Bayesian Plug-In Classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
19.2.4 Expectation Maximization Mixture Model Estimation . . . . . . . . . . . . . . . . . 480
19.2.5 Statistical Segmentations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
Stochastic Expectation Maximization . . . . . . . . . . . . . . . . . . . . . . . . . . 484
19.2.6 Classification using Markov Random Fields . . . . . . . . . . . . . . . . . . . . . . 487
ITK framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
OTB framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
19.3 Supervised classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
19.3.1 Generic machine learning framework . . . . . . . . . . . . . . . . . . . . . . . . . 501
19.3.2 An example of supervised classification method: Support Vector Machines . . . . . 502
SVM general description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
SVM mathematical formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
19.3.3 Learning from samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
19.3.4 Learning from images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
19.3.5 Multi-band, streamed classification . . . . . . . . . . . . . . . . . . . . . . . . . . . 508
19.3.6 Generic Kernel SVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
Learning with User Defined Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
Classification with user defined kernel . . . . . . . . . . . . . . . . . . . . . . . . . . 512
19.4 Fusion of Classification maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
19.4.1 General approach of image fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
19.4.2 Majority voting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
General description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
An example of majority voting fusion . . . . . . . . . . . . . . . . . . . . . . . . . . 513
19.4.3 Dempster Shafer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
General description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
Mathematical formulation of the combination algorithm . . . . . . . . . . . . . . . . 515
An example of Dempster Shafer fusion . . . . . . . . . . . . . . . . . . . . . . . . . 515
19.5 Classification map regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
20 Object-based Image Analysis 521
20.1 From Images to Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522
20.2 Object Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523
20.3 Object Filtering based on radiometric and statistics attributes . . . . . . . . . . . . . . . . . . 525
20.4 Hoover metrics to compare segmentations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
21 Change Detection 531
21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
21.1.1 Surface-based approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
21.2 Change Detection Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
21.3 Simple Detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
21.3.1 Mean Difference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
21.3.2 Ratio Of Means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
21.4 Statistical Detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
21.4.1 Distance between local distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 542
21.4.2 Local Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
21.5 Multi-Scale Detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
21.5.1 Kullback-Leibler Distance between distributions . . . . . . . . . . . . . . . . . . . 547
21.6 Multi-components detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
21.6.1 Multivariate Alteration Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
22 Geospatial analysis 553
22.1 Reading from and Writing to Geospatial DBs . . . . . . . . . . . . . . . . . . . . . . . . . . 553
23 Hyperspectral 555
23.1 Unmixing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
23.1.1 Linear mixing model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
23.1.2 Simplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
23.1.3 State of the art unmixing algorithms selection . . . . . . . . . . . . . . . . . . . . . 559
Family 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
Family 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
Family 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
Further remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
Basic hyperspectral unmixing example . . . . . . . . . . . . . . . . . . . . . . . . . 562
23.2 Dimensionality reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563
23.3 Anomaly detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
24 Image Visualization and output 569
24.1 Viewer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
24.2 Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
24.2.1 Grey Level Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
24.2.2 Multiband Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
24.2.3 Indexed Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
24.2.4 Altitude Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
25 Online data 579
25.1 Name to Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
25.2 Open Street Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580
IV Developer’s guide 585
26 Iterators 587
26.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
26.2 Programming Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588
26.2.1 Creating Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588
26.2.2 Moving Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588
26.2.3 Accessing Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
26.2.4 Iteration Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
26.3 Image Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592
26.3.1 ImageRegionIterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592
26.3.2 ImageRegionIteratorWithIndex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594
26.3.3 ImageLinearIteratorWithIndex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
26.4 Neighborhood Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
26.4.1 NeighborhoodIterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
Basic neighborhood techniques: edge detection . . . . . . . . . . . . . . . . . . . . . 604
Convolution filtering: Sobel operator . . . . . . . . . . . . . . . . . . . . . . . . . . 607
Optimizing iteration speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608
Separable convolution: Gaussian filtering . . . . . . . . . . . . . . . . . . . . . . . . 610
Random access iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
26.4.2 ShapedNeighborhoodIterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
Shaped neighborhoods: morphological operations . . . . . . . . . . . . . . . . . . . . 615
27 Image Adaptors 619
27.1 Image Casting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620
27.2 Adapting RGB Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622
27.3 Adapting Vector Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
27.4 Adaptors for Simple Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
27.5 Adaptors and Writers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
28 Streaming and Threading 629
28.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
28.2 Streaming and threading in OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
28.3 Division strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
29 How To Write A Filter 631
29.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
29.2 Overview of Filter Creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
29.3 Streaming Large Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
29.3.1 Overview of Pipeline Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634
29.3.2 Details of Pipeline Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636
UpdateOutputInformation() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636
PropagateRequestedRegion() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
UpdateOutputData() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638
29.4 Threaded Filter Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638
29.5 Filter Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 639
29.5.1 Optional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640
29.5.2 Useful Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640
29.6 How To Write A Composite Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
29.6.1 Implementing a Composite Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
29.6.2 A Simple Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
30 Persistent filters 647
30.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
30.2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
30.2.1 The persistent filter class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
30.2.2 The streaming decorator class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
30.3 An end-to-end example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
30.3.1 First step: writing a persistent filter . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
30.3.2 Second step: Decorating the filter and using it . . . . . . . . . . . . . . . . . . . . . 651
30.3.3 Third step: one class to rule them all . . . . . . . . . . . . . . . . . . . . . . . . . . 651
31 How to write an application 653
31.1 Application design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653
31.2 Architecture of the class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
31.2.1 DoInit() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
31.2.2 DoUpdateParameters() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
31.2.3 DoExecute() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
31.2.4 Parameters selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655
31.2.5 Parameters description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656
31.3 Compile your application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656
31.4 Execute your application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657
31.5 Testing your application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657
31.6 Application Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657
V Appendix 661
32 Frequently Asked Questions 663
32.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
32.1.1 What is OTB? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
32.1.2 What is ORFEO? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 664
Where can I get more information about ORFEO? . . . . . . . . . . . . . . . . . . . 664
32.1.3 What is the ORFEO Accompaniment Program? . . . . . . . . . . . . . . . . . . . . 664
Where can I get more information about the ORFEO Accompaniment Program? . . . 665
32.1.4 Who is responsible for the OTB development? . . . . . . . . . . . . . . . . . . . . . 665
32.2 Licence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665
32.2.1 Which is the OTB licence? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665
32.2.2 If I write an application using OTB am I forced to distribute that application? . . . . 666
32.2.3 If I write an application using OTB am I forced to contribute the code into the official repositories? 666
32.2.4 If I wanted to distribute an application using OTB what license would I need to use? 666
32.2.5 I am a commercial user. Is there any restriction on the use of OTB? . . . . . . . . . . 666
32.3 Getting OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
32.3.1 Who can download the OTB? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
32.3.2 Where can I download the OTB? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
32.3.3 How to get the latest bleeding-edge version? . . . . . . . . . . . . . . . . . . . . . . 667
32.4 Compiling and installing OTB from source . . . . . . . . . . . . . . . . . . . . . . . . . . . 667
32.4.1 Which platforms are supported? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667
32.4.2 Which libraries/packages are needed before compiling and installing OTB? . . . . . 667
32.4.3 Main steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669
Unix/Linux Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669
Microsoft Visual Studio (Express 2008/Express 2010) . . . . . . . . . . . . . . . . . 671
32.4.4 Specific platform issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
Visual Studio 2010/Express 2010 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
MacOSX 10.6 Snow Leopard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
Debian Linux / Ubuntu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
32.5 Using OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672
32.5.1 Where to start ? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672
32.5.2 What is the image size limitation of OTB ? . . . . . . . . . . . . . . . . . . . . . . 672
32.6 Getting help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
32.6.1 Is there any mailing list? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
32.6.2 Which is the main source of documentation? . . . . . . . . . . . . . . . . . . . . . . 673
32.7 Contributing to OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
32.7.1 I want to contribute to OTB, where to begin? . . . . . . . . . . . . . . . . . . . . . 673
32.7.2 What are the benefits of contributing to OTB? . . . . . . . . . . . . . . . . . . . . . 674
32.7.3 What functionality can I contribute? . . . . . . . . . . . . . . . . . . . . . . . . . . 674
32.8 Running the tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
32.8.1 What are the tests? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674
32.8.2 How to run the tests? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
32.8.3 How to get the test data? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
32.8.4 How to submit the results? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
32.9 OTB’s Roadmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
32.9.1 Which will be the next version of OTB? . . . . . . . . . . . . . . . . . . . . . . . . 676
What is a major version? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
What is a minor version? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
What is a bugfix version? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 677
32.9.2 When will the next version of OTB be available? . . . . . . . . . . . . . . . . . . . 677
32.9.3 What features will the OTB include and when? . . . . . . . . . . . . . . . . . . . . 677
33 Release Notes 679
34 Wrappings to other languages 733
34.1 OTB-Wrapping: bindings to Java language . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
34.1.1 Mangling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
34.1.2 How to Use OTB-Wrapping in Java . . . . . . . . . . . . . . . . . . . . . . . . . . 734
Import OTB classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
Compile Java programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
Java programs execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 735
34.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
34.1.4 Use OTB-Wrapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
Download sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
Required Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
Compile OTB-Wrapping sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
Download binaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
Use OTB-Wrapping with Eclipse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
34.2 Java tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
34.2.1 Hello World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
34.2.2 Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739
34.2.3 Filtering pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739
34.2.4 Smarter Filtering Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740
34.2.5 Scaling Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 742
34.2.6 MultiSpectral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743
34.2.7 OrthoFusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 745
34.3 Python tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
34.3.1 Hello World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
34.3.2 Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 749
34.3.3 Filtering pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 749
34.3.4 Smarter Filtering Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750
34.3.5 Scaling Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 752
34.3.6 MultiSpectral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753
34.4 Developer Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
34.4.1 Add a new Library : Creating a new CMakeList.txt file . . . . . . . . . . . . . . . . 756
34.4.2 Add a new cmake file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
A simple Example : otb::StreamingShrinkImageFilter . . . . . . . . . . . . . . . . . 757
34.4.3 Predefined Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
OTB-Wrapping predefined variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 758
OTB-Wrapping predefined lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 758
Macro MANGLE NAME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760
34.4.4 HTML JavaDoc generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760
Generate JavaDoc documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760
Generate JavaDoc while compilation . . . . . . . . . . . . . . . . . . . . . . . . . . 761
35 Contributors 763
Index 777
LIST OF FIGURES
2.1 CMake user interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.1 OTB Image Geometrical Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.2 PointSet with Vectors as PixelType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.1 Collaboration diagram of the ImageIO classes . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.2 Use cases of ImageIO factories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3 Class diagram of ImageIO factories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.4 Initial SPOT 5 image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.5 ROI of a SPOT5 image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.1 DEM To Image generator Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.2 Image from lidar data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
8.1 BinaryThresholdImageFilter transfer function . . . . . . . . . . . . . . . . . . . . . . . . . . 140
8.2 BinaryThresholdImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
8.3 ThresholdImageFilter using the threshold-below mode. . . . . . . . . . . . . . . . . . . . . . 142
8.4 ThresholdImageFilter using the threshold-above mode . . . . . . . . . . . . . . . . . . . . . 143
8.5 ThresholdImageFilter using the threshold-outside mode . . . . . . . . . . . . . . . . . . . . . 143
8.6 Band Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
8.7 GradientMagnitudeImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
8.8 GradientMagnitudeRecursiveGaussianImageFilter output . . . . . . . . . . . . . . . . . . . . 153
8.9 Effect of the Derivative filter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.10 Output of the RecursiveGaussianImageFilter. . . . . . . . . . . . . . . . . . . . . . . . . . . 159
8.11 Output of the LaplacianRecursiveGaussianImageFilter. . . . . . . . . . . . . . . . . . . . . . 160
8.12 CannyEdgeDetectorImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.13 Touzi Edge Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.14 Effect of the MedianImageFilter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.15 Effect of the Median filter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
8.16 Effect of erosion and dilation in a binary image. . . . . . . . . . . . . . . . . . . . . . . . . . 170
8.17 Effect of erosion and dilation in a grayscale image. . . . . . . . . . . . . . . . . . . . . . . . 172
8.18 DiscreteGaussianImageFilter Gaussian diagram. . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.19 DiscreteGaussianImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
8.20 GradientAnisotropicDiffusionImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . 178
8.21 Mean Shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
8.22 Lee Filter Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
8.23 Frost Filter Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
8.24 MRF restoration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
8.25 DanielssonDistanceMapImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
8.26 Rasterized SRTM water bodies near Arcachon, France. . . . . . . . . . . . . . . . . . . . . . 190
9.1 Image Registration Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
9.2 Registration Framework Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
9.3 Fixed and Moving images in registration framework . . . . . . . . . . . . . . . . . . . . . . . 197
9.4 HelloWorld registration output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
9.5 Pipeline structure of the registration example . . . . . . . . . . . . . . . . . . . . . . . . . . 199
9.6 Registration Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
9.7 Multi-Modality Registration Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
9.8 Multi-Modality Registration outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.9 Rigid2D Registration input images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
9.10 Rigid2D Registration output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
9.11 Rigid2D Registration input images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
9.12 Rigid2D Registration output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
9.13 AffineTransform registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
9.14 AffineTransform output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
9.15 Geometrical representation objects in ITK . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
9.16 Parzen Windowing in Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
9.17 Class diagram of the Optimizer hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
9.18 Estimation of affine transformation using least square optimisation from SIFT points . . . . . 253
10.1 Estimation of the correlation surface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.2 Deformation field and resampling from fine registration . . . . . . . . . . . . . . . . . . . . . 263
10.3 Deformation field and resampling from disparity map estimation . . . . . . . . . . . . . . . . 270
10.4 From stereo pair to elevation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
11.1 Image Ortho-registration Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
12.1 ARVI Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
12.2 ARVI Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
12.3 AVI Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
13.1 Simple pan-sharpening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
13.2 Pan sharpening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
13.3 Bayesian Data Fusion Example inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
13.4 Bayesian Data Fusion results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
14.1 Results of applying Haralick contrast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
14.2 PanTex Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
14.3 Right Angle Detection Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
14.4 Harris Filter Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
14.5 SIFT Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
14.6 SURF Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
14.7 Alignment Detection Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
14.8 Line Ratio Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
14.9 Line Correlation Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
14.10 Line Correlation Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . 353
14.11 Line Correlation Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . 354
14.12 LSD Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
14.13 Right Angle Detection Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
14.14 Edge Density Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
14.15 SIFT Density Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
14.16 Road extraction filter application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
14.17 Spectral Angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
14.18 Road extraction filter application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
14.19 Road extraction filter application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
14.20 Cloud Detection Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
15.1 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
15.2 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
15.3 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
15.4 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
15.5 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
15.6 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
15.7 Morphological pyramid analysis and synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . 389
15.8 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
16.1 ConnectedThreshold segmentation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
16.2 OtsuThresholdImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
16.3 OtsuThresholdImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
16.4 NeighborhoodConnectedThreshold segmentation results . . . . . . . . . . . . . . . . . . . . 405
16.5 ConfidenceConnected segmentation results . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
16.6 Watershed Catchment Basins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
16.7 Watersheds Hierarchy of Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
16.8 Watersheds filter composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
16.9 Watershed segmentation output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
16.10 Zero Set Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
16.11 Grid position of the embedded level-set surface. . . . . . . . . . . . . . . . . . . . . . 416
16.12 FastMarchingImageFilter collaboration diagram . . . . . . . . . . . . . . . . . . . . . 417
16.13 FastMarchingImageFilter intermediate output . . . . . . . . . . . . . . . . . . . . . . 423
16.14 FastMarchingImageFilter segmentations . . . . . . . . . . . . . . . . . . . . . . . . . 424
17.1 LAIFromNDVIImageTransform Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
18.1 PCA Filter (forward transformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
18.2 PCA Filter (forward transformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
18.3 PCA Filter (forward transformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
18.4 PCA Filter (forward transformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
18.5 Maximum Autocorrelation Factor results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
19.1 Output of the KMeans classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
19.2 Two normal distributions plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
19.3 Kohonen’s Self Organizing Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
19.4 SOM Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
19.5 SOM Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
19.6 Bayesian plug-in classifier for two Gaussian classes . . . . . . . . . . . . . . . . . . . . . . . 476
19.7 SEM Classification results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
19.8 Output of the ScalarImageMarkovRandomField . . . . . . . . . . . . . . . . . . . . . . . . . 492
19.9 OTB Markov Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
19.10 MRF restoration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
19.11 MRF restoration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
19.12 MRF restoration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
19.13 MRF restoration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
20.1 Image to Label Object Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523
20.2 Object based extraction based on . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
21.1 Spot Images for Change Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
21.2 Difference Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
21.3 Radarsat Images for Change Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
21.4 Ratio Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
21.5 Kullback-Leibler Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
21.6 ERS Images for Change Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
21.7 Correlation Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
21.8 Kullback-Leibler profile Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . 549
21.9 Multivariate Alteration Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
23.1 Hyperspectral cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
23.2 Linear mixing model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
23.3 Decomposition of the LMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
23.4 Hyperspectral cube vectorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
23.5 Simplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
23.6 Unmixing Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
23.7 Concept of detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
23.8 Anomaly detection block diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566
23.9 Sliding window and parameters definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
24.1 Image visualization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
24.2 Scaling images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
24.3 Scaling images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
24.4 Scaling images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
24.5 Grayscale to color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
24.6 Hill shading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
25.1 Open street map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583
26.1 ITK image iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
26.2 Copying an image subregion using ImageRegionIterator . . . . . . . . . . . . . . . . . . . . 595
26.3 Using the ImageRegionIteratorWithIndex . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
26.4 Neighborhood iterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599
26.5 Some possible neighborhood iterator shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
26.6 Sobel edge detection results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
26.7 Gaussian blurring by convolution filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 612
26.8 Finding local minima . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
26.9 Binary image morphology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618
27.1 ImageAdaptor concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620
27.2 Image Adaptor for performing computations . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
29.1 Relationship between DataObjects and ProcessObjects . . . . . . . . . . . . . . . . . . . . . 632
29.2 The Data Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634
29.3 Sequence of the Data Pipeline updating mechanism . . . . . . . . . . . . . . . . . . . . . . . 635
29.4 Composite Filter Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
29.5 Composite Filter Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
LIST OF TABLES
2.1 Available installation procedures with respect to system configuration and target usage . . . . 12
9.1 Geometrical Elementary Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
9.2 Identity Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
9.3 Translation Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
9.4 Scale Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
9.5 Scale Logarithmic Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . 226
9.6 Euler2D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
9.7 CenteredRigid2D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
9.8 Similarity2D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
9.9 QuaternionRigid Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
9.10 Versor Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
9.11 Versor Rigid3D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.12 Euler3D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.13 Similarity3D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9.14 Rigid3DPerspective Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . 235
9.15 Affine Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
9.16 BSpline Deformable Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . 237
10.1 Characterization of the geometric deformation sources . . . . . . . . . . . . . . . . . . . . . 257
10.2 Approaches to image registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
12.1 Vegetation indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
12.2 Soil indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
12.3 Water indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
12.4 Built-up indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
14.1 Haralick features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
16.1 ConnectedThreshold example parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
16.2 NeighborhoodConnectedThreshold example parameters . . . . . . . . . . . . . . . . . . . . . 405
16.3 ConnectedThreshold example parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
16.4 FastMarching segmentation example parameters . . . . . . . . . . . . . . . . . . . . . . . . . 422
32.1 Libraries used in the OTB
34.1 Mangling of the basic types used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
34.2 Mangling of the usual types used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
34.3 C++ code translation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
Part I
Introduction
CHAPTER
ONE
WELCOME
Welcome to the ORFEO ToolBox (OTB) Software Guide.
This document presents the essential concepts used in OTB. It will guide you along the road
of learning and using OTB. The Doxygen documentation for the OTB application programming
interface is available online at http://orfeo-toolbox.sourceforge.net/Doxygen/html.
1.1 Organization
This software guide is divided into several parts, each of which is further divided into several chap-
ters. Part I is a general introduction to OTB, with—in the next chapter—a description of how to
install the ORFEO Toolbox on your computer. Part I also introduces basic system concepts such
as an overview of the system architecture, and how to build applications in the C++ programming
language. Part II is a short guide of gradually increasing difficulty to get you started programming with OTB.
Part III describes the system from the user's point of view. Dozens of examples are used to illustrate
important system features. Part IV is for the OTB developer. It explains how to create your own
classes and extend the system.
1.2 How to Learn OTB
There are two broad categories of users of OTB. First are class developers, those who create classes
in C++. The second, users, employ existing C++ classes to build applications. Class developers
must be proficient in C++, and if they are extending or modifying OTB, they must also be familiar
with OTB’s internal structures and design (material covered in Part IV).
The key to learning how to use OTB is to become familiar with its palette of objects and the ways
of combining them. We recommend that you learn the system by studying the examples and then,
if you are a class developer, study the source code. Start with the first few tutorials in Part II to get
familiar with the build process and the general program organization, follow with Chapter 3,
which provides an overview of some of the key concepts in the system, and then review the examples
in Part III. You may also wish to compile and run the dozens of examples distributed with the
source code found in the directory OTB/Examples. (Please see the file OTB/Examples/README.txt
for a description of the examples contained in the various subdirectories.) There are also several
hundreds of tests found in the source distribution in OTB/Testing/Code, most of which are mini-
mally documented testing code. However, they may be useful to see how classes are used together
in OTB, especially since they are designed to exercise as much of the functionality of each class as
possible.
1.3 Software Organization
The following sections describe the directory contents, summarize the software functionality in each
directory, and locate the documentation and data.
1.3.1 Obtaining the Software
Periodic releases of the software are available on the OTB Web site. These official releases are
available a few times a year and announced on the ORFEO Web pages and mailing lists.
This software guide assumes that you are working with the latest official OTB release (available on
the OTB Web site).
1.4 Downloading OTB
OTB can be downloaded without cost from the following web site:
http://www.orfeo-toolbox.org/
In order to track the kind of applications for which OTB is being used, you will be asked to complete
a form prior to downloading the software. The information you provide in this form will help
developers to get a better idea of the interests and skills of the toolkit users.
Once you fill out this form you will have access to the download page. This page can be bookmarked
to facilitate subsequent visits to the download site without having to complete any form again.
Then choose the tarball that best fits your system. The options are .zip and .tgz files. The
first type is better suited for MS-Windows while the second one is the preferred format for UNIX
systems.
Once you unzip or untar the file, a directory called OTB will be created on your disk and you will be
ready to start the configuration process described in Section 2.2.3 on page 18.
You can also get the current version following instructions in Section 32.3.3, on page 667.
1.4.1 Join the Mailing List
It is strongly recommended that you join the users mailing list. This is one of the primary resources
for guidance and help regarding the use of the toolkit. You can subscribe to the users list online at
http://groups.google.com/group/otb-users
The otb-users mailing list is also the best mechanism for expressing your opinions about the toolbox
and for letting developers know about features that you find useful, desirable or even unnecessary. OTB
developers are committed to creating a self-sustaining open-source OTB community. Feedback from
users is fundamental to achieving this goal.
1.4.2 Directory Structure
To begin your OTB odyssey, you will first need to know something about OTB’s software organiza-
tion and directory structure. It is helpful to know enough to navigate through the code base to find
examples, code, and documentation.
OTB is organized into three modules: OTB, OTB-Documents and
OTB-Wrapping. The source code, examples and applications are found in the OTB module;
documentation, tutorials, and material related to the design and marketing of OTB are found in
OTB-Documents; and resources to wrap OTB classes into other languages (such as Java or Python)
are available from OTB-Wrapping. Usually you will work with the OTB module unless you are a
developer, are teaching a course, or are looking at the details of various design documents.
The OTB module contains the following subdirectories:
• OTB/Code—the heart of the software; the location of the majority of the source code.
• OTB/Applications—a set of application modules that can be launched in different ways
(command-line, graphical interface, Python/Java); refer to the OTB Cookbook for more
information
• OTB/CMake—internal files used during the configuration process
• OTB/Copyright—the copyright information of OTB and all the dependencies included in the
OTB source tree
• OTB/Examples—a suite of simple, well-documented examples used by this guide and to il-
lustrate important OTB concepts.
• OTB/Testing—a large number of small programs used to test OTB. These examples tend to
be minimally documented but may be useful to demonstrate various system concepts.
• OTB/Utilities—supporting software for the OTB source code. For example, libraries such
as ITK and GDAL.
The source code directory structure—found in OTB/Code—is important to understand since other
directory structures (such as the Testing directory) shadow the structure of the OTB/Code directory.
• OTB/Code/ApplicationEngine—the core library for building applications based on OTB
• OTB/Code/Common—core classes, macro definitions, typedefs, and other software constructs
central to OTB.
• OTB/Code/BasicFilters—basic image processing filters.
• OTB/Code/Fusion—image fusion algorithms, such as pansharpening.
• OTB/Code/FeatureExtraction—the location of many feature extraction algorithms.
• OTB/Code/ChangeDetection—a set of remote sensing image change detection algorithms.
• OTB/Code/DisparityMap—tools for estimating disparities – deformations – between im-
ages.
• OTB/Code/Fuzzy—fuzzy logic based algorithms, with Dempster-Shafer theory related
classes
• OTB/Code/GeospatialAnalysis—classes for connecting to geospatial databases such as
PostGIS.
• OTB/Code/Gui—very basic widgets for building graphical user interfaces, such as progress
bars for filters, etc.
• OTB/Code/Hyperspectral—hyperspectral image analysis
• OTB/Code/IO—classes that support the reading and writing of data.
• OTB/Code/Learning—several functionalities for supervised learning and classification.
• OTB/Code/Markov—implementation of Markov Random Fields regularization and segmen-
tation.
• OTB/Code/MultiScale—a set of functionalities for multiscale image analysis and synthesis.
• OTB/Code/MultiTemporal—time series interpolation related algorithms
• OTB/Code/OBIA—Object Based Image Analysis filters and data structures
• OTB/Code/ObjectDetection—Object detection chain based on local feature extraction
• OTB/Code/Projections—classes for dealing with sensor models and cartographic
projections.
• OTB/Code/Radiometry—classes for computing vegetation indices and radiometric
corrections.
• OTB/Code/SARPolarimetry—some add-ons for SAR polarimetry synthesis and analysis.
• OTB/Code/Segmentation—several functionalities for image segmentation.
• OTB/Code/Simulation—Sensor simulator
• OTB/Code/SpatialReasoning—several functionalities for high-level image analysis using
spatial reasoning techniques.
• OTB/Code/Testing—internal classes used in the testing framework
• OTB/Code/Visualization—utilities for simple image visualization.
• OTB/Code/Wrappers—wrappers of applications in several access points (command-line, QT
Gui, SWIG...).
The OTB-Documents module contains the following subdirectories:
• OTB-Documents/CourseWare—material related to teaching OTB.
• OTB-Documents/Latex—LaTeX styles to produce this work as well as other documents.
• OTB-Documents/SoftwareGuide—LaTeX files used to create this guide. (Note that the code
found in OTB/Examples is used in conjunction with these LaTeX files.)
1.4.3 Documentation
Besides this text, there are other documentation resources that you should be aware of.
Doxygen Documentation. The Doxygen documentation is an essential resource when working
with OTB. These extensive Web pages describe in detail every class and method in the
system. The documentation also contains inheritance and collaboration diagrams, listing
of event invocations, and data members. The documentation is heavily hyper-linked to
other classes and to the source code. The Doxygen documentation is available on-line at
http://www.orfeo-toolbox.org/doxygen/.
Header Files. Each OTB class is implemented with a .h and .cxx/.txx file (.txx file for templated
classes). All methods found in the .h header files are documented and provide a quick way to
find documentation for a particular method. (Indeed, Doxygen uses the header documentation
to produce its output.)
1.4.4 Data
The OTB Toolkit was designed to support the ORFEO Accompaniment Program and its associated
data. This data is available at http://smsc.cnes.fr/PLEIADES/index.htm.
1.5 The OTB Community and Support
OTB was created from its inception as a collaborative, community effort. Research, teaching, and
commercial uses of the toolkit are expected. If you would like to participate in the community, there
are a number of possibilities.
• Users may actively report bugs, defects in the system API, and/or submit feature requests.
Currently the best way to do this is through the OTB users mailing list.
• Developers may contribute classes or improve existing classes. If you are a developer, you
may request permission to join the OTB developers mailing list. Please do so by sending
email to otb “at” cnes.fr. To become a developer you need to demonstrate both a level of
competence as well as trustworthiness. You may wish to begin by submitting fixes to the OTB
users mailing list.
• Research partnerships with members of the ORFEO Accompaniment Program are encouraged.
CNES will encourage the use of OTB in proposed work and research projects.
• Educators may wish to use OTB in courses. Materials are being developed for this purpose,
e.g., a one-day, conference course and semester-long graduate courses. Watch the OTB web
pages or check in the OTB-Documents/CourseWare directory for more information.
1.6 A Brief History of OTB
Beside the Pleiades (PHR) and Cosmo-Skymed (CSK) systems developments forming ORFEO,
the dual and bilateral system (France - Italy) for Earth Observation, the ORFEO Accompaniment
Program was set up, to prepare, accompany and promote the use and the exploitation of the images
derived from these sensors.
The creation of a preparatory program1 is needed because of :
• the new capabilities and performances of the ORFEO systems (optical and radar high resolu-
tion, access capability, data quality, possibility to acquire simultaneously in optic and radar),
• the implied need of new methodological developments : new processing methods, or adapta-
tion of existing methods,
1http://smsc.cnes.fr/PLEIADES/A prog accomp.htm
• the need to realise those new developments in very close cooperation with the final users for
better integration of new products in their systems.
This program was initiated by CNES mid-2003 and will last until 2010 at least. It consists of two
parts, between which it is necessary to keep a strong interaction:
• A Thematic part
• A Methodological part.
The Thematic part covers a large range of applications (civil and defence ones), and aims at spec-
ifying and validating value added products and services required by end users. This part includes
consideration about products integration in the operational systems or processing lines. It also in-
cludes a careful thought on intermediary structures to be developed to help non-autonomous users.
Lastly, this part aims at raising future users awareness, through practical demonstrations and valida-
tions.
The Methodological part objective is the definition and the development of tools for the operational
exploitation of the future submetric optic and radar images (tridimensional aspects, change detec-
tion, texture analysis, pattern matching, optic radar complementarities). It is mainly based on R&D
studies and doctorate and post-doctorate research.
In this context, CNES2 decided to develop the ORFEO ToolBox (OTB), a set of algorithms encapsu-
lated in a software library. The goal of OTB is to capitalise methodological savoir faire in order
to adopt an incremental development approach aiming to efficiently exploit the results obtained in the
frame of methodological R&D studies.
All the developments are based on FLOSS (Free/Libre Open Source Software) or existing CNES
developments.
OTB is implemented in C++ and is mainly based on ITK3 (Insight Toolkit):
• ITK is used as the core element of OTB
• OTB classes inherit from ITK classes
• The software development procedure of OTB is strongly inspired by ITK's (Extreme Pro-
gramming, test-based coding, Generic Programming, etc.)
• The documentation production procedure is the same as for ITK
• Several chapters of the Software Guide are literally copied from ITK's Software Guide (with
permission).
• Many examples are taken from ITK.
2http://www.cnes.fr
3http://www.itk.org
1.6.1 ITK’s history
In 1999 the US National Library of Medicine of the National Institutes of Health awarded six
three-year contracts to develop an open-source registration and segmentation toolkit that eventu-
ally came to be known as the Insight Toolkit (ITK) and formed the basis of the Insight Software
Consortium. ITK’s NIH/NLM Project Manager was Dr. Terry Yoo, who coordinated the six prime
contractors composing the Insight consortium. These consortium members included three com-
mercial partners—GE Corporate R&D, Kitware, Inc., and MathSoft (the company name is now
Insightful)—and three academic partners—University of North Carolina (UNC), University of Ten-
nessee (UT) (Ross Whitaker subsequently moved to University of Utah), and University of Penn-
sylvania (UPenn). The Principal Investigators for these partners were, respectively, Bill Lorensen
at GE CRD, Will Schroeder at Kitware, Vikram Chalana at Insightful, Stephen Aylward with Luis
Ibáñez at UNC (Luis is now at Kitware), Ross Whitaker with Josh Cates at UT (both now at Utah),
and Dimitri Metaxas at UPenn (now at Rutgers). In addition, several subcontractors rounded out the
consortium including Peter Ratiu at Brigham & Women's Hospital, Celina Imielinska and Pat Molholt
at Columbia University, Jim Gee at UPenn's Grasp Lab, and George Stetten at the University of
Pittsburgh.
In 2002 the first official public release of ITK was made available.
CHAPTER
TWO
INSTALLATION
This section describes the process for installing OTB on your system. OTB is a toolbox, and as such,
once it is installed on your computer, it provides by default a set of useful libraries. You can use these
libraries to build your own applications. Besides the toolbox itself, OTB also provides a large set of
test files and examples that will introduce you to OTB concepts and show you how to use OTB in
your own projects.
Since release 3.12, OTB embeds a specific framework to generate applications in a more user-
friendly way. If you activate the option BUILD_APPLICATIONS, OTB builds one shared library (also
known as a plugin) for each application. This plugin can be auto-loaded into appropriate tools
without recompiling, and is able to fully describe its own parameters, behaviour and documentation.
The tools to use these plugins can be extended, but OTB is shipped with the following:
• A command-line launcher,
• A graphical launcher, with an auto-generated Qt interface, providing ergonomic parameter
setting, display of documentation, and progress reporting,
• A C and SWIG interface, which means that any application can be loaded, set up and executed
from a high-level language such as Python.
The classical OTB-powered applications are still available and will still be maintained:
Monteverdi An integrated application giving graphical access to a lot of OTB functionalities.
OTB-Wrapping Calling OTB from Python or Java (see chapter 34).
There are two ways to install OTB on your system: installing from a binary distribution, or compiling
from sources (Table 2.1). What you need depends on your system, and on what you intend to do.
Platform                                   | OTB library and internal applications       | Monteverdi                                       | Wrapping (Java and Python)
Any platform                               | Build from source (see section 2.2)         | Build from source (see section 2.2)              | Build from source (see section 2.2)
Windows 2000/XP/Vista/Seven                | OSGeo4W installer for internal applications | Windows or OSGeo4W installer (see section 2.1.1) | OSGeo4W installer (see section 2.1.1)
MacOS X 10.6 and higher                    | MacPorts (see section 2.1.2)                | app's DMG file (see section 2.1.2)               | -
Ubuntu Linux 10.04 and higher (32/64 bits) | APT repository (see section 2.1.3)          | APT repository (see section 2.1.3)               | -
OpenSuse 11.X and higher (32/64 bits)      | RPM package (see section 2.1.3)             | RPM package (see section 2.1.3)                  | -
CentOS 6.X and higher (32/64 bits)         | RPM package (see section 2.1.3)             | RPM package (see section 2.1.3)                  | RPM package (see section 2.1.3)
Table 2.1: Available installation procedures with respect to system configuration and target usage
2.1 Installing binary packages
2.1.1 Windows 2000/XP/Vista/Seven
For Windows 2000/XP/Vista/Seven users, there is a classical standalone installation program for
Monteverdi, available from the OTB download page. There is another way to install Monteverdi,
directly through OSGeo4W. This way, you can also install other OTB packages, as described in the
following paragraph.
Since version 3.12, we provide OTB packages through OSGeo4W for Windows 2000/XP/Vista/Seven
users. Packages for the OTB applications, Monteverdi and OTB-Wrapping are available
directly in the OSGeo4W installer with the following names:
• otb-bin for command line and QT applications
• otb-python for python applications
• otb-monteverdi for the Monteverdi application
• otb-wrapping for low level Python/Java bindings.
Follow the instructions in the installer and select the packages you want to add. The installer will
proceed with the installation of selected packages and all their dependencies. In the case of
otb-monteverdi, Monteverdi will be installed directly in the OSGeo4W repository and a shortcut will
be added to your desktop and to the start menu (in the OSGeo4W folder). You can then use Monteverdi
directly from your desktop, from the start menu, or from an OSGeo4W shell with the command
monteverdi. The otb-bin packages are available directly in the OSGeo4W shell; for
example, run
otbgui_BandMath.
We recommend this way of installing the internal applications of OTB on Windows. If you want
to use the otb-python package, for example in QGIS, please follow the installation notes found in the
OTB wiki. You can simply check the list of available applications from an OSGeo4W shell:
python
import otbApplication
print str( otbApplication.Registry.GetAvailableApplications() )
For more information you can read the OTB Cookbook which provides useful information about
these OTB applications.
2.1.2 MacOS X
Orfeo Toolbox is now available on MacPorts. The port name is orfeotoolbox. You can follow
the MacPorts documentation to install MacPorts first, then install the orfeotoolbox port.
A standard DMG package of Monteverdi is available for MacOS X 10.8. Please go to the
OTB download page. Click on the file to launch Monteverdi.
2.1.3 Linux
Ubuntu 10.04 and higher
For Ubuntu 10.04 and higher, OTB, OTB applications, Monteverdi and OTB-Wrapping packages
may be available as Debian packages through APT repositories.
Since release 3.14.1, OTB packages are available in the ubuntugis-unstable repository.
You can add it by using these command-lines:
sudo aptitude install add-apt-repository
sudo apt-add-repository ppa:ubuntugis/ubuntugis-unstable
Now run:
sudo aptitude update
If you want only OTB libraries, please use:
sudo aptitude install libotb
If you want OTB Applications, please use:
sudo aptitude install otb-bin otb-bin-qt python-otb
If you want the Monteverdi application, please use:
sudo aptitude install monteverdi
If you want Python/Java wrapping, please use:
sudo aptitude install otb-wrapping-python otb-wrapping-java
If you are using Synaptic, you can add the repositories, update and install the packages through the
graphical interface.
For further information about Ubuntu packages, go to the ubuntugis-unstable Launchpad page and
click on Read about installing.
Moreover, nightly builds of all OTB projects are also available through APT repositories. However,
you should use them carefully. If you are using apt command-line tools, use these command-lines to
add the otb repository to the apt sources:
sudo aptitude install add-apt-repository
sudo add-apt-repository ppa:otb/orfeotoolbox-nightly
Now run:
sudo aptitude update
As with the previous cases, select which packages you will install.
sudo aptitude install libotb otb-bin otb-bin-qt python-otb monteverdi
Be careful not to add both repositories, since they may cause incompatibilities.
apt-add-repository will try to retrieve the GPG keys of the repositories to certify the origin of the
packages. If you are behind an HTTP proxy, this step won't work and apt-add-repository will stall and
eventually quit. You can temporarily ignore this error and proceed with the update step. Following
this, aptitude update will issue a warning about a signature problem. This warning won’t prevent
you from installing the packages.
OpenSuse 11.X and higher
For OpenSuse 11.X and higher, OTB, Monteverdi and OTB-Wrapping packages are available
through zypper.
First, you need to add the appropriate repositories with these command-lines (please replace 11.4
by your OpenSuse version):
sudo zypper ar
http://download.opensuse.org/repositories/games/openSUSE_11.4/ Games
sudo zypper ar
http://download.opensuse.org/repositories/Application:/Geo/openSUSE_11.4/ GEO
sudo zypper ar
http://download.opensuse.org/repositories/home:/tzotsos/openSUSE_11.4/ tzotsos
Now run:
sudo zypper refresh
sudo zypper install OrfeoToolbox
sudo zypper install OrfeoToolbox-devel
sudo zypper install Monteverdi
sudo zypper install OrfeoToolbox-python
Alternatively you can use the One-Click Installer from the openSUSE Download page or add the
above repositories and install through Yast Package Management.
In case you wish to test the latest version of the packages, you can run:
sudo zypper refresh
sudo zypper install OrfeoToolbox-beta
sudo zypper install OrfeoToolbox-beta-devel
sudo zypper install Monteverdi-beta
There is also support for the recently introduced 'rolling' openSUSE distribution named 'Tumbleweed'.
For Tumbleweed you need to add the following repositories with these command-lines:
sudo zypper ar
http://download.opensuse.org/repositories/games/openSUSE_Tumbleweed/ Games
sudo zypper ar
http://download.opensuse.org/repositories/Application:/Geo/openSUSE_Tumbleweed/ GEO
sudo zypper ar
http://download.opensuse.org/repositories/home:/tzotsos/openSUSE_Tumbleweed/ tzotsos
and then add the OTB packages as shown above.
CentOS 5.5
Since version 3.12, OTB packages (OTB, Monteverdi) are available for the CentOS 5.5 distribution.
In the following lines, we describe how to install the OTB and Monteverdi packages on this
distribution. If you want to use OTB-Wrapping, please follow the instructions found in this file.
Preliminary steps (common to all packages):
1. Configure your system to pass through your HTTP(S) proxy,
2. Add the required extra repositories (EPEL, ELGIS and centosplus); details can be found in the
file mentioned above.
If you want the OTB and Monteverdi packages, you first need to install a standard GDAL RPM
package which is available on the OTB website:
wget http://www.orfeo-toolbox.org/packages/centos/5.5/i386/3.11.0/gdal-1.8.0-5.i386.rpm
yum --nogpgcheck install gdal-1.8.0-5.i386.rpm
We can now get and install (replace the version number with the current one):
1. OTB libraries and applications package:
wget http://www.orfeo-toolbox.org/packages/centos/5.5/i386/3.11.0/OrfeoToolbox-3.11.0-1.i386.rpm
yum --nogpgcheck install OrfeoToolbox-3.11.0-1.i386.rpm
2. monteverdi package:
wget http://www.orfeo-toolbox.org/packages/centos/5.5/i386/3.11.0/Monteverdi-1.9.0-0.i386.rpm
yum --nogpgcheck install Monteverdi-1.9.0-0.i386.rpm
2.2 Building from sources
OTB has been developed and tested across different combinations of operating systems, compil-
ers, and hardware platforms including MS-Windows, Linux on Intel-compatible hardware and Mac
OSX. It is known to work with the following compilers:
• Visual Studio C++ 9.0 .NET 2008, Visual Studio Express 2008 and 2010 on MS-Windows
• GCC (4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7) on Unix/Linux systems
• GCC on MacOSX (10.6 and higher) systems
Given the advanced usage of C++ features in the toolbox, some compilers may have difficulties
processing the code. If you are currently using an outdated compiler this may be an excellent excuse
for upgrading this old piece of software!
Please note that though this section only describes how to compile OTB from sources, Monteverdi
and OTB-Wrapping can be compiled in a similar way.
2.2.1 Getting the OTB source code
There are three ways of getting the OTB source code:
• Download the latest release from the OTB download page,
• Clone the current release with Mercurial from the OTB mercurial server,
• Clone the latest revision with Mercurial from the OTB mercurial server.
These last two options need a proper Mercurial installation. To get source code from Mercurial, do:
hg clone http://hg.orfeo-toolbox.org/OTB
Using Mercurial, you can easily navigate through the different versions. For instance, this brings you
to the 3.16.0 source code version:
hg update -r 3.16.0
And this brings you to the latest development version:
hg update
2.2.2 External Libraries
OTB depends on several libraries:
• ITK: The medical image processing library is embedded in OTB; it is considered an
internal library.
• GDAL: The support of remote sensing imagery formats is ensured through the use of the
GDAL library. Please see http://www.remotesensing.org/gdal/ for information on
how to install.
• OSSIM: Image geometry library; it handles, for example, the sensor geometric models. Please
see http://www.ossim.org for information on how to download and install this
library on your system. OTB requires at least version 1.6.
• Fltk: this library is used for the visualization functionalities. See http://www.fltk.org/
for details about downloading and installing Fltk. OTB has been tested with versions 1.1.9,
1.1.10 and 1.3. You also have the choice to use OTB's internal version (which is 1.1.9) if you
don't have FLTK already installed.
• gettext (optional): Since version 3.2, OTB includes experimental support for internationaliza-
tion. Gettext is required to activate this functionality.
See section 32.4 for quick installation guidelines for GDAL or Fltk if you don’t have access to
standard packages.
2.2.3 Configuring OTB
The challenge of supporting OTB across platforms has been solved through the use of CMake,
a cross-platform, open-source build system. CMake is used to control the software compilation
process using simple platform and compiler independent configuration files. CMake generates native
makefiles and workspaces that can be used in the compiler environment of your choice. CMake is
quite sophisticated—it supports complex environments requiring system configuration, compiler
feature testing, and code generation.
CMake generates Makefiles under UNIX systems and generates Visual Studio workspaces under
Windows (and appropriate build files for other compilers like Borland). The information used by
CMake is provided by CMakeLists.txt files that are present in every directory of the OTB source
tree. These files contain information that the user provides to CMake at configuration time. Typical
information includes paths to utilities in the system and the selection of software options specified
by the user.
Preparing CMake
CMake can be downloaded at no cost from
http://www.cmake.org
OTB requires at least CMake version 2.8. You can download binary versions for most of the popular
platforms including Windows, Solaris, IRIX, HP, Mac and Linux. Alternatively you can download
the source code and build CMake on your system. Follow the instructions in the CMake Web page
for downloading and installing the software.
For Unix/Linux systems, additional help for compiling CMake from source can be found in
section 32.4, subsection Unix/Linux Platforms.
Running CMake initially requires that you provide two pieces of information:
• Where the source code directory is located (ex.: OTB_SOURCE_DIR),
• Where the object code is to be produced (ex.: OTB_BINARY_DIR).
These are referred to as the source directory and the binary directory. We recommend setting the
binary directory to be different from the source directory (an out-of-source build), but OTB will still
build if it is set to a directory inside the source directory (an in-source build).
Compiling OTB
CMake runs in an interactive mode in that you iteratively select options and configure according to
these options. The iteration proceeds until no more options remain to be selected. At this point, a
generation step produces the appropriate build files for your configuration.
This interactive configuration process can be better understood if you imagine that you are walking
through a decision tree. Every option that you select introduces the possibility that new, dependent
options may become relevant. These new options are presented by CMake at the top of the options
list in its interface. Only when no new options appear after a configuration iteration can you be sure
that the necessary decisions have all been made. At this point build files are generated for the current
configuration.
As the following figures show, CMake has a different interface depending on your system.
On Windows: The OTB needs some external libraries to work, as described in table 32.1 of
section 32.4. On Windows, we recommend using the OSGeo4W tool to get those libraries.
However, as several of these libraries are optional or internal to OTB, the minimum requirement for
compiling OTB is to install GDAL.
1. Download and start the OSGeo4W setup in advanced mode
2. Select the gdal package located in Libs folder
3. Optionally select the other libraries you want to use as external
4. Proceed with the installation.
On Windows, you will first create a directory to hold the binaries to be built. Your directory structure
may look as follows:
* OTB
|- OTB_SOURCE_DIR
|- OTB_BINARY_DIR
where OTB_SOURCE_DIR is the base directory of the OTB source code, and
OTB_BINARY_DIR is empty for now.
Launch the CMake executable. On the top of the GUI (Figure 2.1), select the source and binary
directories, then press Configure. Variables appear at the center of the CMake GUI. You must set
some of these variables to tell CMake about your system. The first thing to do is to enable the
Advanced checkbox and change the value of the OSGEO4W_ROOT variable to the directory where you
installed OSGeo4W. Then press Configure again.
The Generate button will be accessible only when all required CMake variables are correctly set. At
this time, push Generate, wait for the end of the generation process then close the CMake window.
Now, you can go into your binary directory (here OTB_BINARY_DIR). CMake has generated Makefiles
or Visual Studio project files. You can open the solution and build the ALL_BUILD project. If you
plan to run executables from within the Visual Studio environment, you should open the Visual
Studio solution file from the OSGeo4W shell. To do this, simply drag-and-drop the ”.sln” file to an
OSGeo4W shell and press Enter.
Windows platforms didn’t support to build with shared libraries. If you experience any problem
with that, you can set BUILD SHARED LIBS to OFF but the built size might reach 1 GB.
That should be all! Otherwise, subscribe to otb-users@googlegroups.com and you will get some
help.
On Linux/Unix: On Unix, the binary directory is created by the user and CMake is invoked with
the path to the source directory. For example:
mkdir OTB_BINARY_DIR
cd OTB_BINARY_DIR
ccmake ../OTB_SOURCE_DIR
Once the interface opens, you can press c (for Configure) and/or t (for Toggle, to access the
advanced variables). If you have followed section 32.4, all the needed libraries are in /usr and
will be found automatically by CMake. Set the following recommended CMake variables:
• OTB_USE_EXTERNAL_BOOST:BOOL=ON
• OTB_USE_CURL:BOOL=ON
• OTB_USE_EXTERNAL_EXPAT:BOOL=ON
• OTB_USE_LIBLAS:BOOL=ON
• OTB_USE_EXTERNAL_LIBLAS:BOOL=ON
• OTB_USE_GETTEXT:BOOL=OFF
• OTB_USE_JPEG2000:BOOL=ON
Press c to update CMake until the Generate option is accessible in the GUI, then press g for
Generate. The CMake interface will close itself. Then, go into your build directory (here
OTB_BINARY_DIR) and run make to launch the compilation.
If you want to put OTB in a standard location, you can proceed with:
make install
Compilation awareness The build process will typically take anywhere from 15 to 30 minutes
depending on the performance of your system. If you decide to enable testing as part of the normal
build process, about 2500 small test programs will be compiled. This will verify that the basic
components of OTB have been correctly built on your system.
Setting the CMake variables BUILD_TESTING and BUILD_EXAMPLES to ON activates the compilation
of the examples and the tests and slows down the build process. The examples distributed with the
toolbox are a helpful resource for learning how to use OTB components but are not essential for the
use of the toolbox itself. The testing section includes a large number of small programs that exercise
the capabilities of OTB classes. Due to the large number of tests, enabling the testing option will
considerably increase the build time. It is not desirable to enable this option for a first build of the
toolbox.
Setting the CMake variable BUILD_APPLICATIONS to ON activates the compilation of the applications
and of the launchers needed to use them. A complete list of these applications, described in the OTB
Cookbook, can be found on the OTB website.
2.3 Getting Started With OTB
The simplest way to create a new project with OTB is to create a new directory somewhere on your
disk and create two files in it. The first one is a CMakeLists.txt file that will be used by CMake
to generate a Makefile (if you are using UNIX) or a Visual Studio workspace (if you are using MS-
Windows). The second file is an actual C++ program that will exercise some of the large number of
classes available in OTB. The details of these files are described in the following section.
Once both files are in your directory you can run CMake in order to configure your project. Un-
der UNIX, you can cd to your newly created directory and type “ccmake .”. Note the “.” in the
command line for indicating that the CMakeLists.txt file is in the current directory. The curses in-
terface will require you to provide the directory where OTB was built. This is the same path that you
indicated for the OTB_BINARY_DIR variable at the time of configuring OTB. Under Windows you
can run CMakeSetup and provide your newly created directory as being both the source directory
and the binary directory for your new project (i.e., an in-source build). Then CMake will require
you to provide the path to the binary directory where OTB was built. The OTB binary directory will
contain a file named OTBConfig.cmake generated during the configuration process at the time OTB
was built. From this file, CMake will recover all the information required to configure your new
OTB project.
2.3.1 Hello World !
Here is the content of the two files to write in your new project. These two files can be found in the
OTB/Examples/Installation directory. The CMakeLists.txt file contains the following lines:
PROJECT(HelloWorld)
FIND_PACKAGE(OTB REQUIRED)
INCLUDE(${OTB_USE_FILE})
ADD_EXECUTABLE(HelloWorld HelloWorld.cxx )
TARGET_LINK_LIBRARIES(HelloWorld OTBIO)
The first line defines the name of your project as it appears in Visual Studio (it will have no effect
under UNIX). The second line loads a CMake file with a predefined strategy for finding OTB 1. If
the strategy for finding OTB fails, CMake will prompt you for the directory where OTB is installed
in your system. In that case you will write this information in the OTB_DIR variable. The line
INCLUDE(${OTB_USE_FILE}) loads the UseOTB.cmake file to set all the configuration information
from OTB.
The line ADD_EXECUTABLE defines as its first argument the name of the executable that will be
produced as a result of this project. The remaining arguments of ADD_EXECUTABLE are the names
of the source files to be compiled and linked. Finally, the TARGET_LINK_LIBRARIES line specifies
which OTB libraries will be linked against this project.
The source code for this example can be found in the file
Examples/Installation/HelloWorld.cxx.
The following code is an implementation of a small OTB program. It tests including header files
and linking with OTB libraries.
#include "otbImage.h"
#include <iostream >
int main()
{
typedef otb::Image <unsigned short, 2> ImageType ;
ImageType ::Pointer image = ImageType ::New ();
1Similar files are provided in CMake for other commonly used libraries, all of them named Find*.cmake
2.3. Getting Started With OTB 23
std::cout << "OTB Hello World !" << std::endl;
return EXIT_SUCCESS;
}
This code instantiates an image whose pixels are represented with type unsigned short. The
image is then constructed and assigned to an itk::SmartPointer. Although later in the text we
will discuss SmartPointers in detail, for now think of it as a handle on an instance of an object
(see section 3.2.4 for more information). The itk::Image class will be described in Section 5.1.
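To make the handle idea concrete, here is a minimal sketch (an illustration only, not one of the
distributed examples) of how the reference count behind a SmartPointer evolves as handles are
copied and destroyed:

#include "otbImage.h"
#include <cstdlib>

int main()
{
  typedef otb::Image<unsigned short, 2> ImageType;

  // New() returns a SmartPointer; the object it points to is reference counted.
  ImageType::Pointer image = ImageType::New();

  {
    // A second handle on the same instance: the reference count increases.
    ImageType::Pointer alias = image;
  } // 'alias' goes out of scope here and the count drops back to one.

  // When 'image' goes out of scope the count reaches zero and the object
  // deletes itself; no explicit call to delete is needed.
  return EXIT_SUCCESS;
}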
At this point you have successfully installed and compiled OTB, and created your first simple pro-
gram. If you have difficulties, please join the otb-users mailing list (Section 1.4.1 on page 5) and
post questions there.
Figure 2.1: CMake interface. Top) ccmake, the UNIX version based on curses. Bottom) CMakeSetup, the
MS-Windows version based on MFC.
CHAPTER
THREE
SYSTEM OVERVIEW
The purpose of this chapter is to provide you with an overview of the ORFEO Toolbox system. We
recommend that you read this chapter to gain an appreciation for the breadth and area of application
of OTB. In this chapter, we will make reference either to OTB features or ITK features without
distinction. Bear in mind that OTB uses ITK as its core element, so all the fundamental elements
of OTB come from ITK. OTB extends the functionalities of ITK for the remote sensing image
processing community. We benefit from the Open Source development approach chosen for ITK,
which allows us to provide an impressive set of functionalities with much less effort than would
have been the case in a closed source universe!
3.1 System Organization
The ORFEO Toolbox consists of several subsystems. A brief description of these subsystems
follows. Later sections in this chapter—and in some cases additional chapters—cover these con-
cepts in more detail. (Note: the previous chapter briefly described two other modules, OTB-Documents
and OTB-Wrapping.)
Essential System Concepts. Like any software system, OTB is built around some core design con-
cepts. OTB uses those of ITK. Some of the more important concepts include generic program-
ming, smart pointers for memory management, object factories for adaptable object instanti-
ation, event management using the command/observer design paradigm, and multithreading
support.
Numerics. OTB, like ITK, uses VXL's VNL numerics libraries. These are easy-to-use C++ wrappers
around the Netlib Fortran numerical analysis routines (http://www.netlib.org).
Data Representation and Access. Two principal classes are used to represent data: the
otb::Image and itk::Mesh classes. In addition, various types of iterators and contain-
ers are used in ITK to hold and traverse the data. Other important but less popular classes are
also used to represent data such as histograms.
ITK’s Data Processing Pipeline. The data representation classes (known as data objects) are op-
erated on by filters that in turn may be organized into data flow pipelines. These pipelines
maintain state and therefore execute only when necessary. They also support multi-threading,
and are streaming capable (i.e., can operate on pieces of data to minimize the memory foot-
print).
IO Framework. Associated with the data processing pipeline are sources, filters that initiate the
pipeline, and mappers, filters that terminate the pipeline. The standard examples of sources
and mappers are readers and writers respectively. Readers input data (typically from a file),
and writers output data from the pipeline. Viewers are another example of mappers.
Spatial Objects. Geometric shapes are represented in OTB using the ITK spatial object hierarchy.
These classes are intended to support modeling of anatomical structures in ITK. OTB uses
them in order to model cartographic elements. Using a common basic interface, the spatial
objects are capable of representing regions of space in a variety of different ways. For ex-
ample: mesh structures, image masks, and implicit equations may be used as the underlying
representation scheme. Spatial objects are a natural data structure for communicating the re-
sults of segmentation methods and for introducing geometrical priors in both segmentation
and registration methods.
ITK’s Registration Framework. A flexible framework for registration supports four different
types of registration: image registration, multiresolution registration, PDE-based registration,
and FEM (finite element method) registration.
FEM Framework. ITK includes a subsystem for solving general FEM problems, in particular non-
rigid registration. The FEM package includes mesh definition (nodes and elements), loads,
and boundary conditions.
Level Set Framework. The level set framework is a set of classes for creating filters to solve partial
differential equations on images using an iterative, finite difference update scheme. The level
set framework consists of finite difference solvers including a sparse level set solver, a generic
level set segmentation filter, and several specific subclasses including threshold, Canny, and
Laplacian based methods.
Wrapping. ITK uses a unique, powerful system for producing interfaces (i.e., “wrappers”) to inter-
preted languages such as Tcl and Python. The GCC-XML tool is used to produce an XML
description of arbitrarily complex C++ code; CSWIG is then used to transform the XML
description into wrappers using the SWIG package. OTB does not use this system at present.
3.2 Essential System Concepts
This section describes some of the core concepts and implementation features found in ITK and
therefore also in OTB.
3.2.1 Generic Programming
Generic programming is a method of organizing libraries consisting of generic—or reusable—
software components [98]. The idea is to make software that is capable of “plugging together”
in an efficient, adaptable manner. The essential ideas of generic programming are containers to hold
data, iterators to access the data, and generic algorithms that use containers and iterators to create
efficient, fundamental algorithms such as sorting. Generic programming is implemented in C++
with the template programming mechanism and the use of the STL (Standard Template Library) [8].
C++ templating is a programming technique allowing users to write software in terms of one or
more unknown types T. To create executable code, the user of the software must specify all types T
(known as template instantiation) and successfully process the code with the compiler. The T may
be a native type such as float or int, or T may be a user-defined type (e.g., class). At compile-
time, the compiler makes sure that the templated types are compatible with the instantiated code and
that the types are supported by the necessary methods and operators.
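As a brief, generic C++ illustration of this mechanism (a sketch, not an OTB class), the following
function is written in terms of an unknown type T; each caller fixes T at compile time, and the
compiler then checks that the operations required of T actually exist:
template <typename T>
T Square(const T& value)
{
  // The only requirement on T is that operator* is defined for it.
  return value * value;
}

int main()
{
  int    a = Square<int>(3);       // T instantiated as int
  double b = Square<double>(1.5);  // T instantiated as double
  return (a == 9 && b == 2.25) ? 0 : 1;
}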
ITK uses the techniques of generic programming in its implementation. The advantage of this
approach is that an almost unlimited variety of data types are supported simply by defining the
appropriate template types. For example, in OTB it is possible to create images consisting of almost
any type of pixel. In addition, the type resolution is performed at compile-time, so the compiler can
optimize the code to deliver maximal performance. The disadvantage of generic programming is that
many compilers still do not support these advanced concepts and cannot compile OTB. And even
if they do, they may produce completely undecipherable error messages due to even the simplest
syntax errors. If you are not familiar with templated code and generic programming, we recommend
the two books cited above.
3.2.2 Include Files and Class Definitions
In ITK and OTB classes are defined by a maximum of two files: a header .h file and an implemen-
tation file—.cxx if a non-templated class, and a .txx if a templated class. The header files contain
class declarations and formatted comments that are used by the Doxygen documentation system to
automatically produce HTML manual pages.
In addition to class headers, there are a few other important header files.
itkMacro.h is found in the Utilities/ITK/Code/Common directory and defines standard
system-wide macros (such as Set/Get, constants, and other parameters).
itkNumericTraits.h is found in the Utilities/ITK/Code/Common directory and defines
numeric characteristics for native types such as their maximum and minimum possible values.
itkWin32Header.h is found in the Utilities/ITK/Code/Common directory and is used to define
operating system parameters that control the compilation process.
3.2.3 Object Factories
Most classes in OTB are instantiated through an object factory mechanism. That is, rather than using
the standard C++ class constructor and destructor, instances of an OTB class are created with the
static class New() method. In fact, the constructor and destructor are protected, so it is generally
not possible to construct an OTB instance on the stack. (Note: this behavior pertains to classes
that are derived from itk::LightObject. In some cases the need for speed or reduced memory
footprint dictates that a class not be derived from LightObject, and in this case instances may be
created on the stack. An example of such a class is itk::EventObject.)
The object factory enables users to control run-time instantiation of classes by registering one or
more factories with itk::ObjectFactoryBase. These registered factories support the method
CreateInstance(classname) which takes as input the name of a class to create. The factory can
choose to create the class based on a number of factors including the computer system configura-
tion and environment variables. For example, in a particular application an OTB user may wish to
deploy their own class implemented using specialized image processing hardware (i.e., to realize a
performance gain). By using the object factory mechanism, it is possible at run-time to replace the
creation of a particular OTB filter with such a custom class. (Of course, the class must provide the
exact same API as the one it is replacing.) To do this, the user compiles her class (using the same
compiler, build options, etc.) and inserts the object code into a shared library or DLL. The library is
then placed in a directory referred to by the OTB_AUTOLOAD_PATH environment variable. On instan-
tiation, the object factory will locate the library, determine that it can create a class of a particular
name with the factory, and use the factory to create the instance. (Note: if the CreateInstance()
method cannot find a factory that can create the named class, then the instantiation of the class falls
back to the usual constructor.)
In practice object factories are used mainly (and generally transparently) by the OTB input/output
(IO) classes. For most users the greatest impact is on the use of the New() method to create a class.
Generally the New() method is declared and implemented via the macro itkNewMacro() found in
Utilities/ITK/Code/Common/itkMacro.h.
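As a hedged sketch of this mechanism (it relies only on the standard itk::ObjectFactoryBase
interface; the exact list printed depends on which factories have been registered by the linked
libraries, or auto-loaded through OTB_AUTOLOAD_PATH, by the time the program runs), the factories
known to the system can be listed at run time:
#include "itkObjectFactoryBase.h"

#include <cstdlib>
#include <iostream>
#include <list>

int main()
{
  // Factories register themselves when the corresponding libraries are
  // initialized; additional ones may be auto-loaded from OTB_AUTOLOAD_PATH.
  std::list<itk::ObjectFactoryBase*> factories =
    itk::ObjectFactoryBase::GetRegisteredFactories();

  std::cout << factories.size() << " factories registered:" << std::endl;

  for (std::list<itk::ObjectFactoryBase*>::const_iterator it = factories.begin();
       it != factories.end(); ++it)
    {
    std::cout << "  " << (*it)->GetNameOfClass() << std::endl;
    }

  return EXIT_SUCCESS;
}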
3.2.4 Smart Pointers and Memory Management
By their nature object-oriented systems represent and operate on data through a variety of object
types, or classes. When a particular class is instantiated to produce an instance of that class, mem-
ory allocation occurs so that the instance can store data attribute values and method pointers (i.e.,
the vtable). This object may then be referenced by other classes or data structures during normal
operation of the program. Typically during program execution all references to the instance may
disappear at which point the instance must be deleted to recover memory resources. Knowing when
to delete an instance, however, is difficult. Deleting the instance too soon results in program crashes;
deleting it too late results in memory leaks (or excessive memory consumption). This process
of allocating and releasing memory is known as memory management.
In ITK, memory management is implemented through reference counting. This compares to another
popular approach—garbage collection—used by many systems including Java. In reference count-
ing, a count of the number of references to each instance is kept. When the reference goes to zero,
the object destroys itself. In garbage collection, a background process sweeps the system identifying
instances no longer referenced in the system and deletes them. The problem with garbage collection
is that the actual point in time at which memory is deleted is variable. This is unacceptable when
an object size may be gigantic (think of a large 3D volume gigabytes in size). Reference counting
deletes memory immediately (once all references to an object disappear).
Reference counting is implemented through a Register()/Delete() member function interface.
All instances of an OTB object have a Register() method invoked on them by any other object that
references them. The Register() method increments the instance's reference count. When the
reference to the instance disappears, a Delete() method is invoked on the instance that decrements
the reference count—this is equivalent to an UnRegister() method. When the reference count
returns to zero, the instance is destroyed.
This protocol is greatly simplified by using a helper class called itk::SmartPointer. The smart
pointer acts like a regular pointer (e.g. supports operators -> and *) but automagically performs a
Register() when referring to an instance, and an UnRegister() when it no longer points to the
instance. Unlike most other instances in OTB, SmartPointers can be allocated on the program stack,
and are automatically deleted when the scope in which the SmartPointer was created is closed. As a
result, you should rarely if ever call Register() or Delete() in OTB. For example:
MyRegistrationFunction()
{ <----- Start of scope
// here an interpolator is created and associated to the
// SmartPointer "interp".
InterpolatorType::Pointer interp = InterpolatorType::New();
} <------ End of scope
In this example, reference counted objects are created (with the New() method) with a reference
count of one. Assignment to the SmartPointer interp does not change the reference count. At the
end of scope, interp is destroyed, the reference count of the actual interpolator object (referred to
by interp) is decremented, and if it reaches zero, then the interpolator is also destroyed.
Note that in ITK SmartPointers are always used to refer to instances of classes derived from
itk::LightObject. Method invocations and function calls often return “real” pointers to instances,
but they are immediately assigned to a SmartPointer. Raw pointers are used for non-LightObject
classes when the need for speed and/or memory demands a smaller, faster class.
3.2.5 Error Handling and Exceptions
In general, OTB uses exception handling to manage errors during program execution. Exception
handling is a standard part of the C++ language and generally takes the form as illustrated below:
try
{
//...try executing some code here...
}
catch ( itk::ExceptionObject & exp )
{
//...if an exception is thrown catch it here
}
where a particular class may throw an exception as demonstrated below (this code snippet is taken
from itk::ByteSwapper):
switch ( sizeof(T) )
{
//non-error cases go here followed by error case
default:
ByteSwapperError e(__FILE__, __LINE__);
e.SetLocation("SwapBE");
e.SetDescription("Cannot swap number of bytes requested");
throw e;
}
Note that itk::ByteSwapperError is a subclass of itk::ExceptionObject. (In fact in OTB all
exceptions should be derived from itk::ExceptionObject.) In this example a special constructor
and the C++ preprocessor variables __FILE__ and __LINE__ are used to instantiate the exception
object and provide additional information to the user. You can choose to catch a particular exception
and hence a specific OTB error, or you can trap any OTB exception by catching ExceptionObject.
3.2.6 Event Handling
Event handling in OTB is implemented using the Subject/Observer design pattern [49] (sometimes
referred to as the Command/Observer design pattern). In this approach, objects indicate that they are
watching for a particular event, invoked by a particular instance, by registering with the instance
that they are watching. For example, filters in OTB periodically invoke the itk::ProgressEvent.
Objects that have registered their interest in this event are notified when the event occurs. The notifi-
cation occurs via an invocation of a command (i.e., function callback, method invocation, etc.) that
is specified during the registration process. (Note that events in OTB are subclasses of EventObject;
look in itkEventObject.h to determine which events are available.)
To recap via example: various objects in OTB will invoke specific events as they execute (from
ProcessObject):
this->InvokeEvent( ProgressEvent() );
To watch for such an event, registration is required that associates a command (e.g., a callback
function) with the event, using the Object::AddObserver() method:
unsigned long progressTag =
filter->AddObserver(ProgressEvent(), itk::Command*);
When the event occurs, all registered observers are notified via invocation of the associated
Command::Execute() method. Note that several subclasses of Command are available supporting
const and non-const member functions as well as C-style functions. (Look in Common/Command.h
to find pre-defined subclasses of Command. If nothing suitable is found, derivation is another pos-
sibility.)
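To make the mechanism concrete, here is a minimal, self-contained sketch of watching a filter's
ProgressEvent with a C-style callback. It assumes only the standard Command classes declared in
itkCommand.h; the reader and filter types are purely illustrative.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "itkGradientMagnitudeImageFilter.h"
#include "itkCommand.h"

#include <cstdlib>
#include <iostream>

// Callback invoked each time the observed filter fires a ProgressEvent.
void PrintProgress(itk::Object* caller, const itk::EventObject&, void*)
{
  itk::ProcessObject* process = dynamic_cast<itk::ProcessObject*>(caller);
  if (process)
    {
    std::cout << "Progress: " << process->GetProgress() << std::endl;
    }
}

int main(int argc, char* argv[])
{
  if (argc != 2)
    {
    std::cerr << "Usage: " << argv[0] << " <input_filename>" << std::endl;
    return EXIT_FAILURE;
    }

  typedef otb::Image<double, 2> ImageType;
  typedef otb::ImageFileReader<ImageType> ReaderType;
  typedef itk::GradientMagnitudeImageFilter<ImageType, ImageType> FilterType;

  ReaderType::Pointer reader = ReaderType::New();
  reader->SetFileName(argv[1]);

  FilterType::Pointer filter = FilterType::New();
  filter->SetInput(reader->GetOutput());

  // Associate the callback with the filter's ProgressEvent.
  itk::CStyleCommand::Pointer observer = itk::CStyleCommand::New();
  observer->SetCallback(PrintProgress);
  filter->AddObserver(itk::ProgressEvent(), observer);

  filter->Update();

  return EXIT_SUCCESS;
}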
3.2.7 Multi-Threading
Multithreading is handled in OTB through ITK’s high-level design abstraction. This approach pro-
vides portable multithreading and hides the complexity of differing thread implementations on the
many systems supported by OTB. For example, the class itk::MultiThreader provides support
for multithreaded execution using sproc() on an SGI, or pthread_create() on any platform
supporting POSIX threads.
Multithreading is typically employed by an algorithm during its execution phase. MultiThreader
can be used to execute a single method on multiple threads, or to specify a method per thread.
For example, in the class itk::ImageSource (a superclass for most image processing filters) the
GenerateData() method uses the following methods:
multiThreader->SetNumberOfThreads(int);
multiThreader->SetSingleMethod(ThreadFunctionType, void* data);
multiThreader->SingleMethodExecute();
In this example each thread invokes the same method. The multithreaded filter takes care to divide
the image into different regions that do not overlap for write operations.
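A stand-alone sketch of the SetSingleMethod() interface described above (the printed message is
only illustrative; note that writes to std::cout from several threads may interleave):
#include "itkMultiThreader.h"

#include <cstdlib>
#include <iostream>

// The function executed by every thread; the argument gives, among other
// things, the thread identifier and the total number of threads.
ITK_THREAD_RETURN_TYPE SayHello(void* arg)
{
  itk::MultiThreader::ThreadInfoStruct* info =
    static_cast<itk::MultiThreader::ThreadInfoStruct*>(arg);

  std::cout << "Hello from thread " << info->ThreadID
            << " of " << info->NumberOfThreads << std::endl;

  return ITK_THREAD_RETURN_VALUE;
}

int main()
{
  itk::MultiThreader::Pointer threader = itk::MultiThreader::New();

  threader->SetNumberOfThreads(4);
  threader->SetSingleMethod(SayHello, NULL);
  threader->SingleMethodExecute();

  return EXIT_SUCCESS;
}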
The general philosophy in ITK regarding thread safety is that accessing different instances of a class
(and its methods) is a thread-safe operation. Invoking methods on the same instance in different
threads is to be avoided.
3.3 Numerics
OTB, like ITK, uses the VNL numerics library to provide resources for numerical programming com-
bining the ease of use of packages like Mathematica and Matlab with the speed of C and the elegance
of C++. It provides a C++ interface to the high-quality Fortran routines made available in the pub-
lic domain by numerical analysis researchers. ITK extends the functionality of VNL by including
interface classes between VNL and ITK proper.
The VNL numerics library includes classes for
Matrices and vectors. Standard matrix and vector support and operations on these types.
Specialized matrix and vector classes. Several special matrix and vector classes with special
numerical properties are available. Class vnl_diagonal_matrix provides a fast and convenient
diagonal matrix, while fixed size matrices and vectors allow "fast-as-C" computations (see
vnl_matrix_fixed<T,n,m> and example subclasses vnl_double_3x3 and vnl_double_3).
Matrix decompositions. Classes vnl_svd<T>, vnl_symmetric_eigensystem<T>, and
vnl_generalized_eigensystem.
Real polynomials. Class vnl_real_polynomial stores the coefficients of a real polynomial, and
provides methods of evaluation of the polynomial at any x, while class vnl_rpoly_roots
provides a root finder.
Optimization. Classes vnl_levenberg_marquardt, vnl_amoeba, vnl_conjugate_gradient, and
vnl_lbfgs allow optimization of user-supplied functions either with or without user-supplied
derivatives.
Standardized functions and constants. Class vnl_math defines constants (pi, e, eps...) and
simple functions (sqr, abs, rnd...). Class numeric_limits is from the ISO standard
document, and provides a way to access basic limits of a type. For example
numeric_limits<short>::max() returns the maximum value of a short.
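As a small, self-contained sketch of VNL in practice (assuming the vnl headers shipped with
ITK/OTB are on the include path), the SVD class listed above can be used to solve a 2x2 linear
system:
#include <vnl/vnl_matrix.h>
#include <vnl/vnl_vector.h>
#include <vnl/algo/vnl_svd.h>

#include <cstdlib>
#include <iostream>

int main()
{
  // A small 2x2 system A x = b.
  vnl_matrix<double> A(2, 2);
  A(0, 0) = 2.0;  A(0, 1) = 1.0;
  A(1, 0) = 1.0;  A(1, 1) = 3.0;

  vnl_vector<double> b(2);
  b(0) = 3.0;
  b(1) = 5.0;

  // The SVD decomposition doubles as a (least-squares) linear solver.
  vnl_svd<double> svd(A);
  vnl_vector<double> x = svd.solve(b);

  std::cout << "solution: " << x << std::endl;
  std::cout << "rank of A: " << svd.rank() << std::endl;

  return EXIT_SUCCESS;
}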
Most VNL routines are implemented as wrappers around the high-quality Fortran routines
that have been developed by the numerical analysis community over the last forty years and
placed in the public domain. The central repository for these programs is the ”netlib” server
http://www.netlib.org/. The National Institute of Standards and Technology (NIST) provides
an excellent search interface to this repository in its Guide to Available Mathematical Software
(GAMS) at http://gams.nist.gov, both as a decision tree and a text search.
ITK also provides additional numerics functionality. A suite of optimizers that use VNL under
the hood and integrate with the registration framework is available. A large collection of statistics
functions—not available from VNL—are also provided in the Insight/Numerics/Statistics
directory. In addition, a complete finite element (FEM) package is available, primarily to support
the deformable registration in ITK.
3.4 Data Representation
There are two principal types of data represented in OTB: images and meshes. This functionality is
implemented in the classes Image and Mesh, both of which are subclasses of itk::DataObject.
In OTB, data objects are classes that are meant to be passed around the system and may participate
in data flow pipelines (see Section 3.5 on page 34 for more information).
otb::Image represents an n-dimensional, regular sampling of data. The sampling direction is paral-
lel to each of the coordinate axes, and the origin of the sampling, inter-pixel spacing, and the number
of samples in each direction (i.e., image dimension) can be specified. The sample, or pixel, type in
OTB is arbitrary—a template parameter TPixel specifies the type upon template instantiation. (The
dimensionality of the image must also be specified when the image class is instantiated.) The key is
that the pixel type must support certain operations (for example, addition or difference) if the code is
to compile in all cases (for example, to be processed by a particular filter that uses these operations).
In practice the OTB user will use a C++ simple type (e.g., int, float) or a pre-defined pixel type
and will rarely create a new type of pixel class.
One of the important ITK concepts regarding images is that rectangular, continuous pieces of the
image are known as regions. Regions are used to specify which part of an image to process, for
example in multithreading, or which part to hold in memory. In ITK there are three common types
of regions:
1. LargestPossibleRegion—the image in its entirety.
2. BufferedRegion—the portion of the image retained in memory.
3. RequestedRegion—the portion of the region requested by a filter or other class when oper-
ating on the image.
The otb::Image class extends the functionalities of the itk::Image in order to take into account
particular remote sensing features such as geographical projections, etc.
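The following minimal sketch shows these pieces by creating an otb::Image by hand (in typical
programs the readers and filters set up the regions for you); it only uses the region interface that
otb::Image inherits from itk::Image:
#include "otbImage.h"

#include <cstdlib>

int main()
{
  typedef otb::Image<unsigned short, 2> ImageType;

  ImageType::Pointer image = ImageType::New();

  // Define a 200 x 100 region starting at index (0, 0).
  ImageType::IndexType start;
  start.Fill(0);

  ImageType::SizeType size;
  size[0] = 200;  // number of columns
  size[1] = 100;  // number of rows

  ImageType::RegionType region;
  region.SetIndex(start);
  region.SetSize(size);

  // Here the LargestPossibleRegion and the BufferedRegion coincide.
  image->SetRegions(region);
  image->Allocate();
  image->FillBuffer(0);

  return EXIT_SUCCESS;
}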
The Mesh class represents an n-dimensional, unstructured grid. The topology of the mesh is rep-
resented by a set of cells defined by a type and connectivity list; the connectivity list in turn refers
to points. The geometry of the mesh is defined by the n-dimensional points in combination with
associated cell interpolation functions. Mesh is designed as an adaptive representational structure
that changes depending on the operations performed on it. At a minimum, points and cells are re-
quired in order to represent a mesh; but it is possible to add additional topological information. For
example, links from the points to the cells that use each point can be added; this provides implicit
neighborhood information assuming the implied topology is the desired one. It is also possible to
specify boundary cells explicitly, to indicate different connectivity from the implied neighborhood
relationships, or to store information on the boundaries of cells.
The mesh is defined in terms of three template parameters: 1) a pixel type associated with the
points, cells, and cell boundaries; 2) the dimension of the points (which in turn limits the maximum
dimension of the cells); and 3) a “mesh traits” template parameter that specifies the types of the
containers and identifiers used to access the points, cells, and/or boundaries. By using the mesh
traits carefully, it is possible to create meshes better suited for editing, or those better suited for
“read-only” operations, allowing a trade-off between representation flexibility, memory, and speed.
Mesh is a subclass of itk::PointSet. The PointSet class can be used to represent point clouds or
randomly distributed landmarks, etc. The PointSet class has no associated topology.
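For instance, a PointSet can serve as a plain container of landmarks with a value attached to each
point; a brief sketch, assuming ITK's PointSet interface:
#include "itkPointSet.h"

#include <cstdlib>

int main()
{
  // A 2-D point set holding a float value per landmark.
  typedef itk::PointSet<float, 2> PointSetType;

  PointSetType::Pointer landmarks = PointSetType::New();

  PointSetType::PointType p;
  p[0] = 12.5;
  p[1] = 42.0;

  landmarks->SetPoint(0, p);         // landmark with identifier 0
  landmarks->SetPointData(0, 3.0f);  // value attached to that landmark

  return EXIT_SUCCESS;
}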
3.5 Data Processing Pipeline
While data objects (e.g., images and meshes) are used to represent data, process objects are classes
that operate on data objects and may produce new data objects. Process objects are classed as
sources, filter objects, or mappers. Sources (such as readers) produce data, filter objects take in data
and process it to produce new data, and mappers accept data for output either to a file or some other
system. Sometimes the term filter is used broadly to refer to all three types.
The data processing pipeline ties together data objects (e.g., images and meshes) and process objects.
The pipeline supports an automatic updating mechanism that causes a filter to execute if and only
if its input or its internal state changes. Further, the data pipeline supports streaming, the ability
to automatically break data into smaller pieces, process the pieces one by one, and reassemble the
processed data into a final result.
Typically data objects and process objects are connected together using the SetInput() and
GetOutput() methods as follows:
typedef otb::Image<float,2> FloatImage2DType;
itk::RandomImageSource<FloatImage2DType>::Pointer random;
random = itk::RandomImageSource<FloatImage2DType>::New();
random->SetMin(0.0);
random->SetMax(1.0);
itk::ShrinkImageFilter<FloatImage2DType,FloatImage2DType>::Pointer shrink;
shrink = itk::ShrinkImageFilter<FloatImage2DType,FloatImage2DType>::New();
shrink->SetInput(random->GetOutput());
shrink->SetShrinkFactors(2);
otb::ImageFileWriter<FloatImage2DType>::Pointer writer;
writer = otb::ImageFileWriter<FloatImage2DType>::New();
writer->SetInput(shrink->GetOutput());
writer->SetFileName("test.raw");
writer->Update();
In this example the source object itk::RandomImageSource is connected to
the itk::ShrinkImageFilter, and the shrink filter is connected to the mapper
otb::ImageFileWriter. When the Update() method is invoked on the writer, the data
processing pipeline causes each of these filters to execute in order, culminating in writing the final
data to a file on disk.
3.6 Spatial Objects
The ITK spatial object framework supports the philosophy that the task of image segmentation and
registration is actually the task of object processing. The image is but one medium for representing
objects of interest, and much processing and data analysis can and should occur at the object level
and not based on the medium used to represent the object.
ITK spatial objects provide a common interface for accessing the physical location and geometric
properties of and the relationship between objects in a scene that is independent of the form used
to represent those objects. That is, the internal representation maintained by a spatial object may
be a list of points internal to an object, the surface mesh of the object, a continuous or parametric
representation of the object’s internal points or surfaces, and so forth.
The capabilities provided by the spatial objects framework support their use in object segmentation,
registration, surface/volume rendering, and other display and analysis functions. The spatial object
framework extends the concept of a “scene graph” that is common to computer rendering packages
so as to support these new functions. With the spatial objects framework you can:
1. Specify a spatial object’s parent and children objects. In this way, a city may contain roads
and those roads can be organized in a tree structure.
2. Query if a physical point is inside an object or (optionally) any of its children.
3. Request the value and derivatives, at a physical point, of an associated intensity function, as
specified by an object or (optionally) its children.
4. Specify the coordinate transformation that maps a parent object’s coordinate system into a
child object’s coordinate system.
5. Compute the bounding box of a spatial object and (optionally) its children.
6. Query the resolution at which the object was originally computed. For example, you can
query the resolution (i.e., pixel spacing) of the image used to generate a particular instance of
an itk::LineSpatialObject.
Currently implemented types of spatial objects include: Blob, Ellipse, Group, Image, Line, Surface,
and Tube. The itk::Scene object is used to hold a list of spatial objects that may in turn have
children. Each spatial object can be assigned a color property. Each spatial object type has its own
capabilities. For example, itk::TubeSpatialObjects indicate to what point on their parent tube
they connect.
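As a brief sketch of the point-membership query listed as capability 2 above (assuming ITK's
EllipseSpatialObject interface):
#include "itkEllipseSpatialObject.h"

#include <cstdlib>
#include <iostream>

int main()
{
  typedef itk::EllipseSpatialObject<2> EllipseType;

  // A circle of radius 10 centered at the origin of its own space.
  EllipseType::Pointer ellipse = EllipseType::New();
  ellipse->SetRadius(10.0);

  EllipseType::PointType point;
  point[0] = 3.0;
  point[1] = 4.0;

  std::cout << "Point is inside: " << ellipse->IsInside(point) << std::endl;

  return EXIT_SUCCESS;
}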
There are a limited number of spatial objects and their methods in ITK, but their number is growing
and their potential is huge. Using the nominal spatial object capabilities, methods such as mutual
information registration can be applied to objects regardless of their internal representation. By
having a common API, the same method can be used to register a parametric representation of a
building with an image or to register two different segmentations of a particular object in object-
based change detection.
Part II
Tutorials
CHAPTER
FOUR
BUILDING SIMPLE APPLICATIONS
WITH OTB
Well, that’s it, you’ve just downloaded and installed OTB, lured by the promise that you will be able
to do everything with it. That’s true, you will be able to do everything but - there is always a but -
some effort is required.
OTB uses the very powerful system of generic programming; many classes are already available
and some powerful tools are defined to help you with recurrent tasks, but it is not an easy world to enter.
These tutorials are designed to help you enter this world and grasp the logic behind OTB. Each
of these tutorials should not take more than 10 minutes (typing included) and each is designed to
highlight a specific point. You may not be concerned by the later tutorials, but it is strongly advised
to go through the first few, which cover the basics you'll use almost everywhere.
4.1 Hello world
Let’s start by the typical Hello world program. We are going to compile this C++ program linking
to your new OTB.
First, create a new folder to put your new programs (all the examples from this tutorial) in and go
into this folder.
Since all programs using OTB are handled using the CMake system, we need to create a
CMakeLists.txt that will be used by CMake to compile our program. An example of this file
can be found in the OTB/Examples/Tutorials directory. The CMakeLists.txt will be very simi-
lar between your projects.
Open the CMakeLists.txt file and write in the following lines:
PROJECT(Tutorials)
cmake_minimum_required(VERSION 2.6)
FIND_PACKAGE(OTB)
IF(OTB_FOUND)
INCLUDE(${OTB_USE_FILE})
ELSE(OTB_FOUND)
MESSAGE(FATAL_ERROR
"Cannot build OTB project without OTB. Please set OTB_DIR.")
ENDIF(OTB_FOUND)
ADD_EXECUTABLE(HelloWorldOTB HelloWorldOTB.cxx )
TARGET_LINK_LIBRARIES(HelloWorldOTB OTBCommon OTBIO)
The first line defines the name of your project as it appears in Visual Studio (it will have no effect
under UNIX or Linux). The second line loads a CMake file with a predefined strategy for finding
OTB (similar files are provided with CMake for other commonly used libraries, all of them named
Find*.cmake). If the strategy for finding OTB fails, CMake will prompt you for the directory where
OTB is installed on your system. In that case you will write this information in the OTB_DIR variable.
The line INCLUDE(${OTB_USE_FILE}) loads the UseOTB.cmake file to set all the configuration
information from OTB.
The line ADD_EXECUTABLE defines as its first argument the name of the executable that will be
produced as result of this project. The remaining arguments of ADD_EXECUTABLE are the names
of the source files to be compiled and linked. Finally, the TARGET_LINK_LIBRARIES line specifies
which OTB libraries will be linked against this project.
The source code for this example can be found in the file
Examples/Tutorials/HelloWorldOTB.cxx.
The following code is an implementation of a small OTB program. It tests including header files
and linking with OTB libraries.
#include "otbImage.h"
#include <iostream >
int main(int argc , char * argv[])
{
typedef otb::Image <unsigned short, 2> ImageType ;
ImageType ::Pointer image = ImageType ::New ();
std::cout << "OTB Hello World !" << std::endl;
return EXIT_SUCCESS;
}
This code instantiates an image whose pixels are represented with type unsigned short. The
image is then created and assigned to an itk::SmartPointer. Later in the text we will discuss
SmartPointers in detail; for now think of it as a handle on an instance of an object (see section
3.2.4 for more information).
Once the file is written, run ccmake on the current directory (that is ccmake ./ under Linux/Unix).
If OTB is installed in a non-standard place, you will have to tell CMake where it is. Once you're done
with CMake (you shouldn't have to run it again), run make.
You finally have your program. When you run it, you will see OTB Hello World ! printed.
Ok, well done! You’ve just compiled and executed your first OTB program. Actually, using OTB
for that is not very useful, and we doubt that you downloaded OTB only to do that. It’s time to move
on to a more advanced level.
4.2 Pipeline basics: read and write
OTB is designed to read images, process them and write them to disk or view the result. In this
tutorial, we are going to see how to read and write images and the basics of the pipeline system.
First, let’s add the following lines at the end of the CMakeLists.txt file:
ADD_EXECUTABLE(Pipeline Pipeline.cxx )
TARGET_LINK_LIBRARIES(Pipeline OTBCommon OTBIO)
Now, create a Pipeline.cxx file.
The source code for this example can be found in the file
Examples/Tutorials/Pipeline.cxx.
Start by including some necessary headers and with the usual main declaration:
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
int main(int argc , char * argv[])
{
if (argc != 3)
  {
  std::cerr << "Usage: "
            << argv[0]
            << " <input_filename> <output_filename>"
            << std::endl;
  // Stop here so that the missing filenames are not read from argv.
  return EXIT_FAILURE;
  }
Declare the image as an otb::Image, the pixel type is declared as an unsigned char (one byte) and
the image is specified as having two dimensions.
typedef otb::Image <unsigned char, 2> ImageType ;
To read the image, we need an otb::ImageFileReader which is templated with the image type.
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
Then, we need an otb::ImageFileWriter also templated with the image type.
typedef otb::ImageFileWriter <ImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
The filenames are passed as arguments to the program. We keep it simple for now and we don’t
check their validity.
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
Now that we have all the elements, we connect the pipeline, plugging the output of the reader into the
input of the writer.
writer ->SetInput(reader ->GetOutput ());
And finally, we trigger the pipeline execution calling the Update() method on the last element of the
pipeline. The last element will make sure to update all previous elements in the pipeline.
writer ->Update();
return EXIT_SUCCESS;
}
Once this file is written you just have to run make. The ccmake call is not required anymore.
Get one image from the OTB-Data/Examples directory of the OTB-Data repository.
You can get it by cloning the OTB data repository (hg clone
http://hg.orfeo-toolbox.org/OTB-Data), but that might take quite a while as
this also fetches the data needed to run the tests. Alternatively, you can get it from
http://www.orfeo-toolbox.org/packages/OTB-Data-Examples.tgz. Take for example
QB_Suburb.png.
Now, run your new program as Pipeline QB_Suburb.png output.png. You obtain the file
output.png which is the same image as QB_Suburb.png. When you triggered the Update()
method, OTB opened the original image and wrote it back under another name.
Well...that’s nice but a bit complicated for a copy program!
Wait a minute! We didn't specify the file format anywhere! Let's try Pipeline QB_Suburb.png
output.jpg. And voila! The output image is a jpeg file.
That’s starting to be a bit more interesting: this is not just a program to copy image files, but also to
convert between image formats.
You have just experienced the pipeline structure which executes the filters only when needed and
the automatic image format detection.
Now it’s time to do some processing in between.
4.3 Filtering pipeline
We are now going to insert a simple filter to do some processing between the reader and the writer.
Let’s first add the 2 following lines to the CMakeLists.txt file:
ADD_EXECUTABLE(FilteringPipeline FilteringPipeline.cxx )
TARGET_LINK_LIBRARIES(FilteringPipeline OTBCommon OTBIO)
The source code for this example can be found in the file
Examples/Tutorials/FilteringPipeline.cxx.
We are going to use the itk::GradientMagnitudeImageFilter to compute the gradient of the
image. The beginning of the file is similar to Pipeline.cxx.
We include the required headers, without forgetting to add the header for the
itk::GradientMagnitudeImageFilter.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkGradientMagnitudeImageFilter.h"
int main(int argc , char * argv[])
{
if (argc != 3)
  {
  std::cerr << "Usage: "
            << argv[0]
            << " <input_filename> <output_filename>"
            << std::endl;
  // Stop here so that the missing filenames are not read from argv.
  return EXIT_FAILURE;
  }
We declare the image type, the reader and the writer as before:
typedef otb::Image <unsigned char, 2> ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
typedef otb::ImageFileWriter <ImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
Now we have to declare the filter. It is templated with the input image type and the output image
type like many filters in OTB. Here we are using the same type for the input and the output images:
typedef itk::GradientMagnitudeImageFilter
<ImageType , ImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
Let’s plug the pipeline:
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
And finally, we trigger the pipeline execution calling the Update() method on the writer
writer ->Update();
return EXIT_SUCCESS;
}
Compile with make and execute as FilteringPipeline QB_Suburb.png output.png.
You have the filtered version of your image in the output.png file.
Now, you can practice a bit and try to replace the filter by one of the 150+ filters which inherit from
the otb::ImageToImageFilter class. You will definitely find some useful filters here!
4.4 Handling types: scaling output
If you tried some other filter in the previous example, you may have noticed that in some cases
it does not make sense to save the output directly as an integer. This is the case if you tried the
itk::CannyEdgeDetectionImageFilter: if you use it directly in the previous example,
you will get some warnings about converting from double to unsigned char.
The output of the Canny edge detection is a floating point number. A simple solution would be to
use double as the pixel type. Unfortunately, most image formats use integer types, and you should
convert the result to an integer image if you still want to visualize your images with your usual
viewer (we will see in a later tutorial how you can avoid that using the built-in viewer).
To perform this conversion, we will use the itk::RescaleIntensityImageFilter.
Add the following two lines to the CMakeLists.txt file:
ADD_EXECUTABLE(ScalingPipeline ScalingPipeline.cxx )
TARGET_LINK_LIBRARIES(ScalingPipeline OTBCommon OTBIO)
The source code for this example can be found in the file
Examples/Tutorials/ScalingPipeline.cxx.
This example illustrates the use of the itk::RescaleIntensityImageFilter to convert the result
for proper display.
We include the required headers, including the header for the itk::CannyEdgeDetectionImageFilter
and the itk::RescaleIntensityImageFilter.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkCannyEdgeDetectionImageFilter.h"
#include "itkRescaleIntensityImageFilter.h"
int main(int argc , char * argv[])
{
if (argc != 3)
  {
  std::cerr << "Usage: "
            << argv[0]
            << " <input_filename> <output_filename>"
            << std::endl;
  // Stop here so that the missing filenames are not read from argv.
  return EXIT_FAILURE;
  }
We need to declare two different image types, one for the internal processing and one to output the
results:
typedef double PixelType ;
typedef otb::Image <PixelType , 2> ImageType ;
typedef unsigned char OutputPixelType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
We declare the reader with the image template using the pixel type double. It is worth noticing that
this instantiation does not imply anything about the type of the input image. The original image can
be anything; the reader will just convert the result to double.
The writer is templated with the unsigned char image to be able to save the result on one byte images
(like png for example).
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
Now we are declaring the edge detection filter which is going to work with double input and output.
typedef itk::CannyEdgeDetectionImageFilter
<ImageType , ImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
Here comes the interesting part: we declare the itk::RescaleIntensityImageFilter. The input
image type is the output type of the edge detection filter. The output type is the same as the input
type of the writer.
Desired minimum and maximum values for the output are specified by the methods
SetOutputMinimum() and SetOutputMaximum().
This filter will actually rescale all the pixels of the image but also cast the type of these pixels.
typedef itk::RescaleIntensityImageFilter
<ImageType , OutputImageType > RescalerType;
RescalerType::Pointer rescaler = RescalerType::New ();
rescaler ->SetOutputMinimum(0);
rescaler ->SetOutputMaximum(255);
Let’s plug the pipeline:
filter ->SetInput(reader ->GetOutput ());
rescaler ->SetInput (filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
And finally, we trigger the pipeline execution calling the Update() method on the writer
writer ->Update();
return EXIT_SUCCESS;
}
As you should be getting used to it by now, compile with make and execute as ScalingPipeline
QB_Suburb.png output.png.
You have the filtered version of your image in the output.png file.
4.5 Working with multispectral or color images
So far, as you may have noticed, we have been working with grey level images, i.e. with only one
spectral band. If you tried to process a color image with some of the previous examples you have
probably obtained a disappointing grey result.
Often, satellite images combine several spectral bands to help the identification of materials: this is
called multispectral imagery. In this tutorial, we are going to explore some of the mechanisms used
by OTB to process multispectral images.
Add the following lines in the CMakeLists.txt file:
ADD_EXECUTABLE(Multispectral Multispectral.cxx )
TARGET_LINK_LIBRARIES(Multispectral OTBCommon OTBIO)
The source code for this example can be found in the file
Examples/Tutorials/Multispectral.cxx.
First, we are going to use otb::VectorImage instead of the now traditional otb::Image. So we
include the required header:
#include "otbVectorImage.h"
We also include some other headers which will be useful later. Note that we are still using the
otb::Image in this example for some of the output.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbMultiToMonoChannelExtractROI.h"
#include "itkShiftScaleImageFilter.h"
#include "otbPerBandVectorImageFilter.h"
int main(int argc , char * argv[])
{
if (argc != 4)
  {
  std::cerr << "Usage: "
            << argv[0]
            << " <input_filename> <output_extract> <output_shifted_scaled>"
            << std::endl;
  // Stop here so that the missing filenames are not read from argv.
  return EXIT_FAILURE;
  }
We want to read a multispectral image so we declare the image type and the reader. As we have
done in the previous example we get the filename from the command line.
typedef unsigned short int PixelType ;
typedef otb::VectorImage <PixelType , 2> VectorImageType;
typedef otb::ImageFileReader <VectorImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (argv[1]);
Sometimes, you need to process only one spectral band of the image. To get only one of the spectral
bands we use the otb::MultiToMonoChannelExtractROI. The declaration is as usual:
typedef otb::MultiToMonoChannelExtractROI<PixelType , PixelType >
ExtractChannelType;
ExtractChannelType::Pointer extractChannel = ExtractChannelType::New();
We need to pass the parameters to the filter for the extraction. This filter also allows extracting only a
spatial subset of the image. However, we will extract the whole channel in this case.
To do that, we need to pass the desired region using the SetExtractionRegion() method (methods
such as SetStartX() and SetSizeX() are also available). We get the region from the reader with the
GetLargestPossibleRegion() method. Before doing that we need to read the metadata from the
file: this is done by calling UpdateOutputInformation() on the reader's output. The difference
with Update() is that the pixel array is not allocated (yet!), which reduces the memory usage.
reader ->UpdateOutputInformation();
extractChannel ->SetExtractionRegion(
reader ->GetOutput ()->GetLargestPossibleRegion());
We choose the channel number to extract (starting from 1) and we plug the pipeline.
extractChannel ->SetChannel (3);
extractChannel ->SetInput(reader ->GetOutput ());
To output this image, we need a writer. As the output of the
otb::MultiToMonoChannelExtractROI is an otb::Image, we need to template the writer
with this type.
typedef otb::Image <PixelType , 2> ImageType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (argv[2]);
writer ->SetInput(extractChannel ->GetOutput ());
writer ->Update();
After this, we have a one band image that we can process with most OTB filters.
In some situations, you may want to apply the same process to all bands of the image. You don't have
to extract each band and process it separately. There are several situations:
• if the filter (or the combination of filters) you want to use does operations that are well
defined for itk::VariableLengthVector (which is the pixel type), then you don't have to
do anything special;
• if this is not working, you can look for the equivalent filter specially designed for vector
images;
• if some of the filters you need to use apply operations undefined for
itk::VariableLengthVector, then you can use the otb::PerBandVectorImageFilter,
specially designed for this purpose.
Let’s see how this filter is working. We chose to apply the itk::ShiftScaleImageFilter to each
of the spectral band. We start by declaring the filter on a normal otb::Image. Note that we don’t
need to specify any input for this filter.
typedef itk::ShiftScaleImageFilter <ImageType , ImageType > ShiftScaleType;
ShiftScaleType::Pointer shiftScale = ShiftScaleType::New ();
shiftScale ->SetScale (0.5);
shiftScale ->SetShift (10);
We declare the otb::PerBandVectorImageFilter which has three template parameters: the input
image type, the output image type and the filter type to apply to each band.
The filter is selected using the SetFilter() method and the input by the usual SetInput() method.
typedef otb::PerBandVectorImageFilter
<VectorImageType , VectorImageType , ShiftScaleType > VectorFilterType;
VectorFilterType::Pointer vectorFilter = VectorFilterType::New ();
vectorFilter ->SetFilter (shiftScale );
vectorFilter ->SetInput(reader ->GetOutput ());
Now, we just have to save the image using a writer templated over an otb::VectorImage:
typedef otb::ImageFileWriter <VectorImageType > VectorWriterType;
VectorWriterType::Pointer writerVector = VectorWriterType::New ();
writerVector ->SetFileName (argv[3]);
writerVector ->SetInput(vectorFilter ->GetOutput ());
writerVector ->Update();
return EXIT_SUCCESS;
}
Compile with make and execute as ./Multispectral qb_RoadExtract.tif qb_blue.tif
qb_shiftscale.tif.
4.6 Parsing command line arguments
Well, if you play with some other filters in the previous example, you probably noticed that in
many cases, you need to set some parameters of the filters. Ideally, you want to set some of these
parameters from the command line.
In OTB, there is a mechanism to help you parse the command line parameters. Let's try it!
Add the following lines in the CMakeLists.txt file:
ADD_EXECUTABLE(SmarterFilteringPipeline SmarterFilteringPipeline.cxx )
TARGET_LINK_LIBRARIES(SmarterFilteringPipeline OTBCommon OTBIO)
The source code for this example can be found in the file
Examples/Tutorials/SmarterFilteringPipeline.cxx.
We are going to use the otb::HarrisImageFilter to find the points of interest in one image.
The derivative computation is performed by a convolution with the derivative of a Gaussian kernel
of variance σD (derivation scale) and the smoothing of the image is performed by convolving with
a Gaussian kernel of variance σI (integration scale). This allows the computation of the following
matrix:
µ(x, σI, σD) = σD² g(σI) ⋆ [ Lx²(x, σD)    LxLy(x, σD) ]
                           [ LxLy(x, σD)   Ly²(x, σD)  ]
The output of the detector is det(µ) − α trace²(µ).
We want to set 3 parameters of this filter through the command line: σD (SigmaD), σI (SigmaI) and
α (Alpha).
We are also going to do things properly and catch the exceptions.
Let's first add the two following headers:
#include "itkMacro.h"
#include "otbCommandLineArgumentParser.h"
The first one is to handle the exceptions, the second one to help us parse the command line.
We include the other required headers, without forgetting to add the header for the
otb::HarrisImageFilter. Then we start the usual main function.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkRescaleIntensityImageFilter.h"
#include "otbHarrisImageFilter.h"
int main(int argc , char * argv[])
{
To handle the exceptions properly, we need to put all the instructions inside a try block.
try
{
Now, we can declare the otb::CommandLineArgumentParser which is going to parse the com-
mand line, select the proper variables, handle the missing compulsory arguments and print an error
message if necessary.
Let’s declare the parser:
typedef otb::CommandLineArgumentParser ParserType ;
ParserType ::Pointer parser = ParserType ::New();
It’s now time to tell the parser what are the options we want. Special options are available for input
and output images with the AddInputImage() and AddOutputImage() methods.
For the other options, we need to use the AddOption() method. This method allows us to specify
• the name of the option
• a message to explain the meaning of this option
• a shortcut for this option
• the number of expected parameters for this option
• whether or not this option is compulsory
parser ->SetProgramDescription(
"This program applies a Harris detector on the input image");
parser ->AddInputImage();
parser ->AddOutputImage();
parser ->AddOption ("--SigmaD",
"Set the sigmaD parameter . Default is 1.0.",
"-d",
1,
false);
parser ->AddOption ("--SigmaI",
"Set the sigmaI parameter . Default is 1.0.",
"-i",
1,
false);
parser ->AddOption ("--Alpha",
"Set the alpha parameter . Default is 1.0.",
"-a",
1,
false);
Now that the parser has all this information, it can actually look at the command line and parse it. We
have to do this within a try/catch block to handle exceptions nicely.
typedef otb::CommandLineArgumentParseResult ParserResultType;
ParserResultType::Pointer parseResult = ParserResultType::New ();
try
{
parser ->ParseCommandLine(argc , argv , parseResult );
}
catch (itk::ExceptionObject& err)
{
std::string descriptionException = err.GetDescription();
if (descriptionException.find("ParseCommandLine(): Help Parser")
!= std::string::npos)
{
return EXIT_SUCCESS;
}
if (descriptionException.find("ParseCommandLine(): Version Parser")
!= std::string::npos)
{
return EXIT_SUCCESS;
}
return EXIT_FAILURE;
}
Now, we can declare the image type, the reader and the writer as before:
typedef double PixelType ;
typedef otb::Image <PixelType , 2> ImageType ;
typedef unsigned char OutputPixelType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New();
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New();
We are getting the filenames for the input and the output images directly from the parser:
reader ->SetFileName (parseResult ->GetInputImage(). c_str());
writer ->SetFileName (parseResult ->GetOutputImage(). c_str());
Now we have to declare the filter. It is templated with the input image type and the output image
type like many filters in OTB. Here we are using the same type for the input and the output images:
typedef otb::HarrisImageFilter
<ImageType , ImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New();
We set the filter parameters from the parser. The method IsOptionPresent() lets us know whether an
optional argument was provided on the command line.
if (parseResult ->IsOptionPresent("--SigmaD"))
filter ->SetSigmaD (
parseResult ->GetParameterDouble("--SigmaD"));
if (parseResult ->IsOptionPresent("--SigmaI"))
filter ->SetSigmaI (
parseResult ->GetParameterDouble("--SigmaI"));
if (parseResult ->IsOptionPresent("--Alpha"))
filter ->SetAlpha (
parseResult ->GetParameterDouble("--Alpha"));
We add the rescaler filter as before
typedef itk::RescaleIntensityImageFilter
<ImageType , OutputImageType > RescalerType;
RescalerType::Pointer rescaler = RescalerType::New();
rescaler ->SetOutputMinimum(0);
rescaler ->SetOutputMaximum(255);
Let’s plug the pipeline:
filter ->SetInput(reader ->GetOutput ());
rescaler ->SetInput (filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
We trigger the pipeline execution calling the Update() method on the writer
writer ->Update ();
}
Finally, we have to handle exceptions we may have raised before
catch (itk::ExceptionObject& err)
{
std::cout << "Following otbException catch :" << std::endl;
std::cout << err << std::endl;
return EXIT_FAILURE;
}
catch (std::bad_alloc & err)
{
std::cout << "Exception bad_alloc : " << (char*) err.what() << std::endl;
return EXIT_FAILURE;
}
catch (...)
{
std::cout << "Unknown Exception found !" << std::endl;
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
Compile with make as usual. The execution is a bit different now as we have an automatic parsing
of the command line. First, try to execute as SmarterFilteringPipeline without any argument.
The usage message (automatically generated) appears:
’--InputImage’ option is obligatory !!!
Usage : ./SmarterFilteringPipeline
[--help|-h] : Help
[--version|-v] : Version
--InputImage|-in : input image file name (1 parameter)
--OutputImage|-out : output image file name (1 parameter)
[--SigmaD|-d] : Set the sigmaD parameter of the Harris points of
interest algorithm. Default is 1.0. (1 parameter)
[--SigmaI|-i] : Set the SigmaI parameter of the Harris points of
interest algorithm. Default is 1.0. (1 parameter)
[--Alpha|-a] : Set the alpha parameter of the Harris points of
interest algorithm. Default is 1.0. (1 parameter)
That looks a bit more professional: another user should be able to play with your program. As this
is automatic, that’s a good way not to forget to document your programs.
So now you have a better idea of the possible command line options. Try
SmarterFilteringPipeline -in QB_Suburb.png -out output.png for a basic version with
the default values.
If you want a result that looks a bit better, you have to adjust the parameters,
for example with SmarterFilteringPipeline -in QB_Suburb.png -out output.png -d 1.5 -i 2 -a
0.1.
4.7 Viewer
So far, we had to save the image and use an external viewer every time we wanted to see the result
of our processing. That is not very convenient, especially for somewhat exotic formats (16 bits, floating
point ...). Thankfully, OTB comes with its own visualization tool.
This tool is accessible through the otb::StandardImageViewer class. We will now design a simple, minimalistic
example to illustrate the use of this viewer.
First you need to add the following lines in the CMakeLists.txt file:
ADD_EXECUTABLE(SimpleViewer SimpleViewer.cxx )
TARGET_LINK_LIBRARIES(SimpleViewer OTBCommon OTBIO OTBGui OTBVisualization)
Notice that you have to link against two additional OTB libraries: OTBGui and OTBVisualization. In other situa-
tions you might have to add other OTB libraries, for example OTBLearning, OTBFeatureExtraction,
etc. In short, add the corresponding library whenever you use domain-specific
filters and the linker reports errors such as undefined reference to.
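For instance, a minimal sketch of what the CMakeLists.txt lines could look like when feature extraction filters are used (the target name MyFeatureApp is purely hypothetical):
ADD_EXECUTABLE(MyFeatureApp MyFeatureApp.cxx)
TARGET_LINK_LIBRARIES(MyFeatureApp OTBCommon OTBIO OTBFeatureExtraction)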
The source code for this example can be found in the file
Examples/Tutorials/SimpleViewer.cxx.
Now, we are going to illustrate the use of the otb::StandardImageViewer to display an image or
the result of an algorithm without saving the image to disk.
We include the required headers, including those for the
itk::GradientMagnitudeImageFilter and the otb::StandardImageViewer.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "itkGradientMagnitudeImageFilter.h"
#include "otbStandardImageViewer.h"
int main(int argc , char * argv[])
{
We declare the image type and the reader, and set the input file name:
typedef double PixelType ;
typedef otb::Image <PixelType , 2> ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (argv[1]);
The edge detection filter, which would work with double input and output, is left commented out in this simplified version:
// typedef itk::GradientMagnitudeImageFilter
// <ImageType, ImageType> FilterType;
// FilterType::Pointer filter = FilterType::New();
The otb::StandardImageViewer is the new, highly versatile component to display images in
OTB. There is a system of plugins to fully customize its behavior. But we’ll keep it simple for now.
typedef otb::StandardImageViewer <ImageType > ViewerType ;
ViewerType ::Pointer viewer = ViewerType ::New ();
Let’s plug the pipeline: for the viewer the method is SetImage().
// filter->SetInput(reader->GetOutput());
viewer ->SetImage(reader ->GetOutput ());
When you open several readers, it is handy to be able to identify each one. This can be done using
SetLabel().
viewer ->SetLabel(argv[1]);
We trigger the pipeline execution and the image display with the Update() method of the viewer.
viewer ->Update();
A call to Fl::run() is mandatory to ask the program to listen to mouse and keyboard events until
the viewer is closed.
Fl::run();
return EXIT_SUCCESS;
}
After compiling you can execute the program with SimpleViewer QB_Suburb.png. The image is
displayed (since the edge detection filter is commented out in this simplified version, the raw image
is shown). Notice that you can call this simple program with a big image (say 30000 × 30000 pixels).
For all multithreaded filters (filters which implement a ThreadedGenerateData() method),
the image is split into pieces and only the piece on display is processed.
4.8 Going from raw satellite images to useful products
Quite often, when you buy satellite images, you end up with several images. In the case of optical
satellites, you often have a panchromatic spectral band with the highest spatial resolution and a
multispectral product of the same area with a lower resolution. The resolution ratio is likely to be
around 4.
To get the best out of the image processing algorithms, you want to combine these data to produce a new
image with the highest spatial resolution and several spectral bands. This step is called fusion and you
can find more details about it in Chapter 13. However, fusion supposes that your two images represent
exactly the same area. There are different ways to process your data to reach this situation.
Here we are going to use the metadata available with the images to produce an orthorectification, as
detailed in Chapter 11.
First you need to add the following lines in the CMakeLists.txt file:
ADD_EXECUTABLE(OrthoFusion OrthoFusion.cxx)
TARGET_LINK_LIBRARIES(OrthoFusion OTBProjections OTBCommon OTBIO)
The source code for this example can be found in the file
Examples/Tutorials/OrthoFusion.cxx.
Start by including some necessary headers, along with the usual main declaration. Apart from the
classical headers related to image input and output, we need the headers related to the fusion and the
orthorectification. One more header is required to be able to process vector images (the XS one) with
the orthorectification.
#include "otbImage.h"
#include "otbVectorImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbOrthoRectificationFilter.h"
#include "otbGenericMapProjection.h"
#include "otbSimpleRcsPanSharpeningFusionImageFilter.h"
#include "otbStandardFilterWatcher.h"
int main(int argc , char* argv[])
{
We initialize ossim which is required for the orthorectification and we check that all parameters are
provided. Basically, we need:
• the name of the input PAN image;
• the name of the input XS image;
• the desired name for the output;
• as the coordinates are given in UTM, we need the UTM zone number;
• of course, we need the UTM coordinates of the final image;
• the size in pixels of the final image;
• and the sampling of the final image.
We check that all those parameters are provided.
if (argc != 12)
{
std::cout << argv[0] << " <input_pan_filename > <input_xs_filename > ";
std::cout << "<output_filename > <utm zone > <hemisphere N/S> ";
std::cout << "<x_ground_upper_left_corner> <y_ground_upper_left_corner> ";
std::cout << "<x_Size > <y_Size > ";
std::cout << "<x_groundSamplingDistance> ";
std::cout << "<y_groundSamplingDistance "
<< "(negative since origin is upper left)>"
<< std::endl;
return EXIT_FAILURE;
}
We declare the different images, readers and writer:
typedef otb::Image <unsigned int, 2> ImageType ;
typedef otb::VectorImage <unsigned int, 2> VectorImageType;
typedef otb::Image <double, 2> DoubleImageType;
typedef otb::VectorImage <double, 2> DoubleVectorImageType;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileReader <VectorImageType > VectorReaderType;
typedef otb::ImageFileWriter <VectorImageType > WriterType ;
ReaderType ::Pointer readerPAN = ReaderType ::New();
VectorReaderType::Pointer readerXS = VectorReaderType::New ();
WriterType ::Pointer writer = WriterType ::New ();
readerPAN ->SetFileName (argv[1]);
readerXS ->SetFileName (argv[2]);
writer ->SetFileName (argv[3]);
We declare the projection (here we chose the UTM projection; other choices are possible) and
retrieve the parameters from the command line:
• the UTM zone
• the hemisphere
typedef otb::GenericMapProjection <otb:: TransformDirection::INVERSE> InverseProjectionType;
InverseProjectionType::Pointer utmMapProjection = InverseProjectionType::New ();
utmMapProjection ->SetWkt("Utm");
utmMapProjection ->SetParameter("Zone", argv[4]);
utmMapProjection ->SetParameter("Hemisphere ", argv[5]);
We will need to pass several parameters to the orthorectification concerning the desired output re-
gion:
ImageType ::IndexType start;
start[0] = 0;
start[1] = 0;
ImageType ::SizeType size;
size[0] = atoi(argv[8]);
size[1] = atoi(argv[9]);
ImageType ::SpacingType spacing;
spacing [0] = atof(argv[10]);
spacing [1] = atof(argv[11]);
ImageType ::PointType origin;
origin[0] = strtod(argv[6], NULL);
origin[1] = strtod(argv[7], NULL);
We declare the orthorectification filter and provide its parameters:
typedef otb::OrthoRectificationFilter<ImageType , DoubleImageType ,
InverseProjectionType >
OrthoRectifFilterType;
OrthoRectifFilterType::Pointer orthoRectifPAN =
OrthoRectifFilterType::New ();
orthoRectifPAN ->SetMapProjection(utmMapProjection);
orthoRectifPAN ->SetInput(readerPAN ->GetOutput ());
orthoRectifPAN ->SetOutputStartIndex(start);
orthoRectifPAN ->SetOutputSize(size);
orthoRectifPAN ->SetOutputSpacing(spacing );
orthoRectifPAN ->SetOutputOrigin(origin);
Now we are able to obtain the orthorectified area from the PAN image. We just have to follow a
similar process for the XS image.
typedef otb::OrthoRectificationFilter<VectorImageType ,
DoubleVectorImageType , InverseProjectionType >
VectorOrthoRectifFilterType;
VectorOrthoRectifFilterType::Pointer orthoRectifXS =
VectorOrthoRectifFilterType::New();
orthoRectifXS ->SetMapProjection(utmMapProjection);
orthoRectifXS ->SetInput(readerXS ->GetOutput ());
orthoRectifXS ->SetOutputStartIndex(start);
orthoRectifXS ->SetOutputSize(size);
orthoRectifXS ->SetOutputSpacing(spacing );
orthoRectifXS ->SetOutputOrigin(origin);
It’s time to declare the fusion filter and set its inputs:
typedef otb:: SimpleRcsPanSharpeningFusionImageFilter
<DoubleImageType , DoubleVectorImageType , VectorImageType > FusionFilterType;
FusionFilterType::Pointer fusion = FusionFilterType::New ();
fusion ->SetPanInput (orthoRectifPAN ->GetOutput ());
fusion ->SetXsInput (orthoRectifXS ->GetOutput ());
And we can plug it into the writer. To be able to process the images by tiles, we use the
SetAutomaticTiledStreaming() method of the writer. We trigger the pipeline execution with
the Update() method.
writer ->SetInput(fusion ->GetOutput ());
writer ->SetAutomaticTiledStreaming();
otb:: StandardFilterWatcher watcher(writer , "OrthoFusion ");
writer ->Update();
return EXIT_SUCCESS;
}
Part III
User’s guide
CHAPTER FIVE
DATA REPRESENTATION
This chapter introduces the basic classes responsible for representing data in OTB. The most com-
mon classes are the otb::Image, the itk::Mesh and the itk::PointSet.
5.1 Image
The otb::Image class follows the spirit of Generic Programming, where types are separated from
the algorithmic behavior of the class. OTB supports images with any pixel type and any spatial
dimension.
5.1.1 Creating an Image
The source code for this example can be found in the file
Examples/DataRepresentation/Image/Image1.cxx.
This example illustrates how to manually construct an otb::Image class. The following is the
minimal code needed to instantiate, declare and create the image class.
First, the header file of the Image class must be included.
#include "otbImage.h"
Then we must decide with what type to represent the pixels and what the dimension of the image
will be. With these two parameters we can instantiate the image class. Here we create a 2D image,
which is the most common case in remote sensing applications, with unsigned short pixel
data.
typedef otb::Image <unsigned short, 2> ImageType ;
The image can then be created by invoking the New() operator from the corresponding image type
and assigning the result to an itk::SmartPointer.
ImageType ::Pointer image = ImageType ::New ();
In OTB, images exist in combination with one or more regions. A region is a subset of the image
and indicates a portion of the image that may be processed by other classes in the system. One of the
most common regions is the LargestPossibleRegion, which defines the image in its entirety. Other
important regions found in OTB are the BufferedRegion, which is the portion of the image actually
maintained in memory, and the RequestedRegion, which is the region requested by a filter or other
class when operating on the image.
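As a hint of how these regions can be queried, here is a minimal sketch, assuming the image created above (right after creation, and before SetRegions() is called below, all three regions are empty):
std::cout << "Largest possible region: " << image->GetLargestPossibleRegion() << std::endl;
std::cout << "Buffered region:         " << image->GetBufferedRegion() << std::endl;
std::cout << "Requested region:        " << image->GetRequestedRegion() << std::endl;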
In OTB, manually creating an image requires that the image is instantiated as previously shown, and
that regions describing the image are then associated with it.
A region is defined by two classes: the itk::Index and itk::Size classes. The origin of the
region within the image with which it is associated is defined by Index. The extent, or size, of the
region is defined by Size. Index is represented by an n-dimensional array where each component is
an integer indicating—in topological image coordinates—the initial pixel of the image. When an
image is created manually, the user is responsible for defining the image size and the index at which
the image grid starts. These two parameters make it possible to process selected regions.
The starting point of the image is defined by an Index class that is an n-dimensional array where
each component is an integer indicating the grid coordinates of the initial pixel of the image.
ImageType ::IndexType start;
start[0] = 0; // first index on X
start[1] = 0; // first index on Y
The region size is represented by an array of the same dimension of the image (using the Size class).
The components of the array are unsigned integers indicating the extent in pixels of the image along
every dimension.
ImageType ::SizeType size;
size[0] = 200; // size along X
size[1] = 200; // size along Y
Having defined the starting index and the image size, these two parameters are used to create an
ImageRegion object which basically encapsulates both concepts. The region is initialized with the
starting index and size of the image.
ImageType ::RegionType region;
region.SetSize(size);
region.SetIndex(start);
Finally, the region is passed to the Image object in order to define its extent and origin. The
SetRegions method sets the LargestPossibleRegion, BufferedRegion, and RequestedRegion simul-
taneously. Note that none of the operations performed to this point have allocated memory for the
image pixel data. It is necessary to invoke the Allocate() method to do this. Allocate does not
require any arguments since all the information needed for memory allocation has already been
provided by the region.
image ->SetRegions (region);
image ->Allocate ();
In practice it is rare to allocate and initialize an image directly. Images are typically read from a
source, such as a file or data acquisition hardware. The following example illustrates how an image
can be read from a file.
5.1.2 Reading an Image from a File
The source code for this example can be found in the file
Examples/DataRepresentation/Image/Image2.cxx.
The first thing required to read an image from a file is to include the header file of the
otb::ImageFileReader class.
#include "otbImageFileReader.h"
Then, the image type should be defined by specifying the type used to represent pixels and the
dimensions of the image.
typedef unsigned char PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > ImageType ;
Using the image type, it is now possible to instantiate the image reader class. The image type is used
as a template parameter to define how the data will be represented once it is loaded into memory.
This type does not have to correspond exactly to the type stored in the file. However, a conversion
based on C-style type casting is used, so the type chosen to represent the data on disk must be
sufficient to characterize it accurately. Readers do not apply any transformation to the pixel data
other than casting from the pixel type of the file to the pixel type of the ImageFileReader. The
following illustrates a typical instantiation of the ImageFileReader type.
typedef otb::ImageFileReader <ImageType > ReaderType ;
The reader type can now be used to create one reader object. An itk::SmartPointer (defined by
the ::Pointer notation) is used to receive the reference to the newly created reader. The New()
method is invoked to create an instance of the image reader.
ReaderType ::Pointer reader = ReaderType ::New ();
The minimum information required by the reader is the filename of the image to be loaded in mem-
ory. This is provided through the SetFileName() method. The file format here is inferred from
the filename extension. The user may also explicitly specify the data format using an
itk::ImageIO object (see Section 6.1 for more information):
const char * filename = argv[1];
reader ->SetFileName (filename );
Reader objects are referred to as pipeline source objects; they respond to pipeline update requests
and initiate the data flow in the pipeline. The pipeline update mechanism ensures that the reader
only executes when a data request is made to the reader and the reader has not read any data. In the
current example we explicitly invoke the Update() method because the output of the reader is not
connected to other filters. In normal applications the reader’s output is connected to the input of an
image filter and the update invocation on the filter triggers an update of the reader. The following
line illustrates how an explicit update is invoked on the reader.
reader ->Update();
Access to the newly read image can be gained by calling the GetOutput() method on the reader.
This method can also be called before the update request is sent to the reader. The reference to the
image will be valid even though the image will be empty until the reader actually executes.
ImageType ::Pointer image = reader ->GetOutput ();
Any attempt to access image data before the reader executes will yield an image with no pixel data.
It is likely that a program crash will result since the image will not have been properly initialized.
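Since reading may fail at run time (missing file, unsupported format, and so on), the Update() call is usually wrapped in a try/catch block, as done in the tutorial examples. A minimal sketch:
try
  {
  reader->Update();
  }
catch (itk::ExceptionObject& err)
  {
  std::cerr << "Error while reading the image: " << err << std::endl;
  return EXIT_FAILURE;
  }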
5.1.3 Accessing Pixel Data
The source code for this example can be found in the file
Examples/DataRepresentation/Image/Image3.cxx.
This example illustrates the use of the SetPixel() and GetPixel() methods. These two methods
provide direct access to the pixel data contained in the image. Note that these two methods are
relatively slow and should not be used in situations where high-performance access is required.
Image iterators are the appropriate mechanism to efficiently access image pixel data.
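Image iterators are described in detail later in this guide. As a minimal sketch of what efficient access looks like (it assumes the ImageType defined in this example and the itkImageRegionIterator.h header):
#include "itkImageRegionIterator.h"

itk::ImageRegionIterator<ImageType> it(image, image->GetBufferedRegion());
for (it.GoToBegin(); !it.IsAtEnd(); ++it)
  {
  it.Set(it.Get() + 1); // increment every pixel of the buffered region
  }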
The individual position of a pixel inside the image is identified by a unique index. An index is an
array of integers that defines the position of the pixel along each coordinate dimension of the image.
The IndexType is automatically defined by the image and can be accessed using the scope operator
like itk::Index. The length of the array will match the dimensions of the associated image.
The following code illustrates the declaration of an index variable and the assignment of values to
each of its components. Please note that Index does not use SmartPointers to access it. This is
because Index is a light-weight object that is not intended to be shared between objects. It is more
efficient to produce multiple copies of these small objects than to share them using the SmartPointer
mechanism.
The following lines declare an instance of the index type and initialize its content in order to associate
it with a pixel position in the image.
ImageType ::IndexType pixelIndex ;
pixelIndex [0] = 27; // x position
pixelIndex [1] = 29; // y position
Having defined a pixel position with an index, it is then possible to access the content of the pixel in
the image. The GetPixel() method allows us to get the value of the pixels.
ImageType ::PixelType pixelValue = image ->GetPixel(pixelIndex );
The SetPixel() method allows us to set the value of the pixel.
image ->SetPixel(pixelIndex , pixelValue + 1);
Please note that GetPixel() returns the pixel value using copy and not reference semantics. Hence,
the method cannot be used to modify image data values.
Remember that both SetPixel() and GetPixel() are inefficient and should only be used for de-
bugging or for supporting interactions like querying pixel values by clicking with the mouse.
5.1.4 Defining Origin and Spacing
The source code for this example can be found in the file
Examples/DataRepresentation/Image/Image4.cxx.
Even though OTB can be used to perform general image processing tasks, the primary purpose of
the toolkit is the processing of remote sensing image data. In that respect, additional information
about the images is considered mandatory. In particular the information associated with the physical
spacing between pixels and the position of the image in space with respect to some world coordinate
system are extremely important.
Image origin and spacing are fundamental to many applications. Registration, for example, is per-
formed in physical coordinates. Improperly defined spacing and origins will result in inconsistent
results in such processes. Remote sensing images with no spatial information should not be used for
image analysis, feature extraction, GIS input, etc. In other words, remote sensing images lacking
spatial information are not only useless but also hazardous.
Figure 5.1 illustrates the main geometrical concepts associated with the otb::Image. In this figure,
circles are used to represent the center of pixels. The value of the pixel is assumed to exist as a
Dirac Delta Function located at the pixel center. Pixel spacing is measured between the pixel centers
and can be different along each dimension. The image origin is associated with the coordinates
of the first pixel in the image. A pixel is considered to be the rectangular region surrounding the
pixel center holding the data value. This can be viewed as the Voronoi region of the image grid, as
[Figure: example image grid with Size=7x6, Spacing=(20.0, 30.0), Origin=(60.0, 70.0) and Physical extent=(140.0, 180.0); annotations indicate the image origin, pixel coordinates, pixel coverage, Voronoi region, Delaunay region, linear interpolation region, Spacing[0] and Spacing[1].]
Figure 5.1: Geometrical concepts associated with the OTB image.
illustrated in the right side of the figure. Linear interpolation of image values is performed inside
the Delaunay region whose corners are pixel centers.
Image spacing is represented in a FixedArray whose size matches the dimension of the image. In
order to manually set the spacing of the image, an array of the corresponding type must be created.
The elements of the array should then be initialized with the spacing between the centers of adjacent
pixels. The following code illustrates the methods available in the Image class for dealing with
spacing and origin.
ImageType ::SpacingType spacing;
// Note: measurement units (e.g., meters, feet, etc.) are defined by the application.
spacing [0] = 0.70; // spacing along X
spacing [1] = 0.70; // spacing along Y
The array can be assigned to the image using the SetSpacing() method.
image ->SetSpacing (spacing );
The spacing information can be retrieved from an image by using the GetSpacing() method. This
method returns a reference to a FixedArray. The returned object can then be used to read the
contents of the array. Note the use of the const keyword to indicate that the array will not be
modified.
const ImageType :: SpacingType & sp = image ->GetSpacing ();
std::cout << "Spacing = ";
std::cout << sp[0] << ", " << sp[1] << std::endl;
The image origin is managed in a similar way to the spacing. A Point of the appropriate dimension
must first be allocated. The coordinates of the origin can then be assigned to every component. These
coordinates correspond to the position of the first pixel of the image with respect to an arbitrary
reference system in physical space. It is the user’s responsibility to make sure that multiple images
used in the same application are using a consistent reference system. This is extremely important in
image registration applications.
The following code illustrates the creation and assignment of a variable suitable for initializing the
image origin.
ImageType ::PointType origin;
origin[0] = 0.0; // coordinates of the
origin[1] = 0.0; // first pixel in 2-D
image ->SetOrigin (origin);
The origin can also be retrieved from an image by using the GetOrigin() method. This will return
a reference to a Point. The reference can be used to read the contents of the array. Note again the
use of the const keyword to indicate that the array contents will not be modified.
const ImageType :: PointType & orgn = image ->GetOrigin ();
std::cout << "Origin = ";
std::cout << orgn[0] << ", " << orgn[1] << std::endl;
Once the spacing and origin of the image have been initialized, the image will correctly map pixel
indices to and from physical space coordinates. The following code illustrates how a point in phys-
ical space can be mapped into an image index for the purpose of reading the content of the closest
pixel.
First, an itk::Point type must be declared. The point type is templated over the type used to
represent coordinates and over the dimension of the space. In this particular case, the dimension of
the point must match the dimension of the image.
typedef itk::Point <double, ImageType ::ImageDimension > PointType ;
The Point class, like an itk::Index, is a relatively small and simple object. For this reason, it is not
reference-counted like the large data objects in OTB. Consequently, it is also not manipulated with
itk::SmartPointers. Point objects are simply declared as instances of any other C++ class. Once
the point is declared, its components can be accessed using traditional array notation. In particular,
the [] operator is available. For efficiency reasons, no bounds checking is performed on the index
used to access a particular point component. It is the user’s responsibility to make sure that the index
is in the range {0,Dimension − 1}.
PointType point;
point[0] = 1.45; // x coordinate
point[1] = 7.21; // y coordinate
The image will map the point to an index using the values of the current spacing and origin. An index
object must be provided to receive the results of the mapping. The index object can be instantiated
by using the IndexType defined in the Image type.
ImageType ::IndexType pixelIndex ;
The TransformPhysicalPointToIndex() method of the image class will compute the pixel index
closest to the point provided. The method checks for this index to be contained inside the current
buffered pixel data. The method returns a boolean indicating whether the resulting index falls inside
the buffered region or not. The output index should not be used when the returned value of the
method is false.
The following lines illustrate the point to index mapping and the subsequent use of the pixel index
for accessing pixel data from the image.
bool isInside = image ->TransformPhysicalPointToIndex(point , pixelIndex );
if (isInside )
{
ImageType ::PixelType pixelValue = image ->GetPixel(pixelIndex );
pixelValue += 5;
image ->SetPixel(pixelIndex , pixelValue );
}
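The inverse mapping is also available through the TransformIndexToPhysicalPoint() method, which converts an index back into physical coordinates. A minimal sketch, reusing the point and index types declared above:
PointType physicalPoint;
image->TransformIndexToPhysicalPoint(pixelIndex, physicalPoint);
std::cout << "Index maps to physical point " << physicalPoint << std::endl;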
Remember that GetPixel() and SetPixel() are very inefficient methods for accessing pixel data.
Image iterators should be used when massive access to pixel data is required.
5.1.5 Accessing Image Metadata
The source code for this example can be found in the file
Examples/IO/MetadataExample.cxx.
This example illustrates the access to metadata image information with OTB. By metadata, we mean
data which is typically stored with remote sensing images, like geographical coordinates of pixels,
pixel spacing or resolution, etc. Of course, the availability of these data depends on the image
format used and on the fact that the image producer must fill the available metadata fields. The
image formats which typically support metadata are for example CEOS and GeoTiff.
The metadata support is embedded in OTB’s IO functionalities and is accessible through the
otb::Image and otb::VectorImage classes. You should avoid using the itk::Image class if
you want to have metadata support.
This simple example will consist of reading an image from a file and writing the metadata to an
output ASCII file. As usual we start by defining the types needed for the image to be read.
typedef unsigned char InputPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::ImageFileReader <InputImageType > ReaderType ;
We can now instantiate the reader and get a pointer to the input image.
ReaderType ::Pointer reader = ReaderType ::New();
InputImageType::Pointer image = InputImageType::New ();
reader ->SetFileName (inputFilename);
reader ->Update();
image = reader ->GetOutput ();
Once the image has been read, we can access the metadata information. We will copy this informa-
tion to an ASCII file, so we create an output file stream for this purpose.
std::ofstream file;
file.open(outputAsciiFilename);
We can now call the different available methods for accessing the metadata. Useful methods are:
• GetSpacing: the sampling step;
• GetOrigin: the coordinates of the origin of the image;
• GetProjectionRef: the image projection reference;
• GetGCPProjection: the projection used for the ground control points, if any;
• GetGCPCount: the number of GCPs available;
file << "Spacing " << image ->GetSpacing () << std::endl;
file << "Origin " << image ->GetOrigin () << std::endl;
file << "Projection REF " << image ->GetProjectionRef() << std::endl;
file << "GCP Projection " << image ->GetGCPProjection() << std::endl;
unsigned int GCPCount = image ->GetGCPCount ();
file << "GCP Count " << image ->GetGCPCount () << std::endl;
One can also get the GCPs by number, as well as their coordinates in image and geographical space.
for (unsigned int GCPnum = 0; GCPnum < GCPCount ; GCPnum ++)
{
file << "GCP[" << GCPnum << "] Id " << image ->GetGCPId(GCPnum) << std::endl;
file << "GCP[" << GCPnum << "] Info " << image ->GetGCPInfo (GCPnum) <<
std::endl;
file << "GCP[" << GCPnum << "] Row " << image ->GetGCPRow (GCPnum) <<
std::endl;
file << "GCP[" << GCPnum << "] Col " << image ->GetGCPCol (GCPnum) <<
std::endl;
file << "GCP[" << GCPnum << "] X " << image ->GetGCPX(GCPnum) << std::endl;
file << "GCP[" << GCPnum << "] Y " << image ->GetGCPY(GCPnum) << std::endl;
file << "GCP[" << GCPnum << "] Z " << image ->GetGCPZ(GCPnum) << std::endl;
file << "----------------" << std::endl;
}
If a geographical transformation is available, it can be recovered as follows.
InputImageType::VectorType tab = image ->GetGeoTransform();
file << "Geo Transform " << std::endl;
for (unsigned int i = 0; i < tab.size(); ++i)
{
file << " " << i << " -> " << tab[i] << std::endl;
}
tab.clear();
tab = image ->GetUpperLeftCorner();
file << "Corners " << std::endl;
for (unsigned int i = 0; i < tab.size(); ++i)
{
file << " UL[" << i << "] -> " << tab[i] << std::endl;
}
tab.clear();
tab = image ->GetUpperRightCorner();
for (unsigned int i = 0; i < tab.size(); ++i)
{
file << " UR[" << i << "] -> " << tab[i] << std::endl;
}
tab.clear();
tab = image ->GetLowerLeftCorner();
for (unsigned int i = 0; i < tab.size(); ++i)
{
file << " LL[" << i << "] -> " << tab[i] << std::endl;
}
tab.clear();
tab = image ->GetLowerRightCorner();
for (unsigned int i = 0; i < tab.size(); ++i)
{
file << " LR[" << i << "] -> " << tab[i] << std::endl;
}
tab.clear();
file.close();
5.1.6 RGB Images
The term RGB (Red, Green, Blue) stands for a color representation commonly used in digital imag-
ing. RGB is a representation of the human physiological capability to analyze visual light using
three spectral-selective sensors [94, 150]. The human retina possesses different types of light sensitive
cells. Three of them, known as cones, are sensitive to color [52] and their regions of sensitivity
loosely match regions of the spectrum that will be perceived as red, green and blue respectively. The
rods on the other hand provide no color discrimination and favor high resolution and high sensitiv-
ity1. A fifth type of receptors, the ganglion cells, also known as circadian2 receptors are sensitive
to the lighting conditions that differentiate day from night. These receptors evolved as a mechanism
for synchronizing the physiology with the time of the day. Cellular controls for circadian rhythms are
present in every cell of an organism and are known to be exquisitely precise [90].
The RGB space has been constructed as a representation of a physiological response to light by the
three types of cones in the human eye. RGB is not a Vector space. For example, negative numbers
are not appropriate in a color space because they will be the equivalent of “negative stimulation” on
the human eye. In the context of colorimetry, negative color values are used as an artificial construct
for color comparison in the sense that
ColorA = ColorB−ColorC (5.1)
just as a way of saying that we can produce ColorB by combining ColorA and ColorC. However,
we must be aware that (at least in emitted light) it is not possible to subtract light. So when we
mention Equation 5.1 we actually mean
ColorB = ColorA+ColorC (5.2)
On the other hand, when dealing with printed color and with paint, as opposed to emitted light like
in computer screens, the physical behavior of color allows for subtraction. This is because strictly
speaking the objects that we see as red are those that absorb all light frequencies except those in the
red section of the spectrum [150].
The concept of addition and subtraction of colors has to be carefully interpreted. In fact, RGB has a
different definition regarding whether we are talking about the channels associated to the three color
sensors of the human eye, or to the three phosphors found in most computer monitors or to the color
inks that are used for printing reproduction. Color spaces are usually non-linear and do not even
form a group. For example, not all visible colors can be represented in RGB space [150].
1The human eye is capable of perceiving a single isolated photon.
2The term Circadian refers to the cycle of day and night, that is, events that are repeated with 24 hours intervals.
ITK introduces the itk::RGBPixel type as a support for representing the values of an RGB color
space. As such, the RGBPixel class embodies a different concept from the one of an itk::Vector
in space. For this reason, the RGBPixel lacks many of the operators that may be naively expected
from it. In particular, there are no defined operations for subtraction or addition.
When you anticipate performing a “mean” operation on an RGB type, you are assuming that the
color lying in the middle of two colors can be found by a linear operation on their numerical
representations. This is unfortunately not the case in color spaces, due to the fact that they are based
on a human physiological response [94].
If you decide to interpret RGB images as simply three independent channels then you should rather
use the itk::Vector type as pixel type. In this way, you will have access to the set of operations
that are defined in Vector spaces. The current implementation of the RGBPixel in ITK presumes
that RGB color images are intended to be used in applications where a formal interpretation of color
is desired, therefore only the operations that are valid in a color space are available in the RGBPixel
class.
The following example illustrates how RGB images can be represented in OTB.
The source code for this example can be found in the file
Examples/DataRepresentation/Image/RGBImage.cxx.
Thanks to the flexibility offered by the Generic Programming style on which OTB is based, it is
possible to instantiate images of arbitrary pixel type. The following example illustrates how a color
image with RGB pixels can be defined.
A class intended to support the RGB pixel type is available in ITK. You could also define your own
pixel class and use it to instantiate a custom image type. In order to use the itk::RGBPixel class,
it is necessary to include its header file.
#include "itkRGBPixel .h"
The RGB pixel class is templated over a type used to represent each one of the red, green and blue
pixel components. A typical instantiation of the templated class is as follows.
typedef itk::RGBPixel <unsigned char> PixelType ;
The type is then used as the pixel template parameter of the image.
typedef otb::Image <PixelType , 2> ImageType ;
The image type can be used to instantiate other filters, for example, an otb::ImageFileReader
object that will read the image from a file.
typedef otb::ImageFileReader <ImageType > ReaderType ;
Access to the color components of the pixels can now be performed using the methods provided by
the RGBPixel class.
PixelType onePixel = image ->GetPixel(pixelIndex );
PixelType ::ValueType red = onePixel.GetRed ();
PixelType ::ValueType green = onePixel.GetGreen ();
PixelType ::ValueType blue = onePixel.GetBlue ();
The subindex notation can also be used since the itk::RGBPixel inherits the [] operator from the
itk::FixedArray class.
red = onePixel [0]; // extract Red component
green = onePixel [1]; // extract Green component
blue = onePixel [2]; // extract Blue component
std::cout << "Pixel values:" << std ::endl;
std::cout << "Red = "
<< itk::NumericTraits <PixelType ::ValueType >:: PrintType (red)
<< std::endl;
std::cout << "Green = "
<< itk::NumericTraits <PixelType ::ValueType >:: PrintType (green)
<< std::endl;
std::cout << "Blue = "
<< itk::NumericTraits <PixelType ::ValueType >:: PrintType (blue)
<< std::endl;
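The corresponding setters are also provided by the RGBPixel class. As a minimal sketch (assuming the image and pixelIndex used above), a pixel can be modified and written back as follows:
onePixel.SetRed(255);
onePixel.SetGreen(128);
onePixel.SetBlue(0);
image->SetPixel(pixelIndex, onePixel);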
5.1.7 Vector Images
The source code for this example can be found in the file
Examples/DataRepresentation/Image/VectorImage.cxx.
Many image processing tasks require images of non-scalar pixel type. A typical example is a multi-
spectral image. The following code illustrates how to instantiate and use an image whose pixels are
of vector type.
We could use the itk::Vector class to define the pixel type. The Vector class is intended to
represent a geometrical vector in space. It is not intended to be used as an array container like the
std::vector in STL. If you are interested in containers, the itk::VectorContainer class may
provide the functionality you want.
However, the itk::Vector is a fixed size array and it assumes that the number of channels of
the image is known at compile time. Therefore, we prefer to use the otb::VectorImage class,
which allows the number of channels of the image to be chosen at runtime. The pixels will be of type
itk::VariableLengthVector.
The first step is to include the header file of the VectorImage class.
#include "otbVectorImage.h"
The VectorImage class is templated over the type used to represent the pixel components and over
the dimension of the space. In this example, we want to represent Pléiades images, which have 4
bands.
typedef unsigned char PixelType ;
typedef otb::VectorImage <PixelType , 2> ImageType ;
Since the pixel dimensionality is chosen at runtime, one has to pass this parameter to the image
before memory allocation.
image ->SetNumberOfComponentsPerPixel(4);
image ->Allocate ();
The VariableLengthVector class overloads the operator []. This makes it possible to access the
Vector’s components using index notation. The user must not forget to allocate the memory for each
individual pixel by using the Reserve method.
ImageType ::PixelType pixelValue ;
pixelValue .Reserve (4);
pixelValue [0] = 1; // Blue component
pixelValue [1] = 6; // Green component
pixelValue [2] = 100; // Red component
pixelValue [3] = 100; // NIR component
We can now store this vector in one of the image pixels by defining an index and invoking the
SetPixel() method.
image ->SetPixel(pixelIndex , pixelValue );
The GetPixel() method can also be used to read vector pixels from the image:
ImageType ::PixelType value = image ->GetPixel(pixelIndex );
Let’s repeat that both SetPixel() and GetPixel() are inefficient and should only be used for de-
bugging purposes or for implementing interactions with a graphical user interface, such as querying
pixel values by clicking with the mouse.
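When many pixels have to be visited, the standard image iterators also work with otb::VectorImage; each Get() call then returns an itk::VariableLengthVector. A minimal sketch, assuming the 4-band image allocated above:
#include "itkImageRegionConstIterator.h"

itk::ImageRegionConstIterator<ImageType> it(image, image->GetLargestPossibleRegion());
for (it.GoToBegin(); !it.IsAtEnd(); ++it)
  {
  ImageType::PixelType p = it.Get(); // one VariableLengthVector per pixel
  std::cout << "NIR value: " << static_cast<int>(p[3]) << std::endl;
  }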
5.1.8 Importing Image Data from a Buffer
The source code for this example can be found in the file
Examples/DataRepresentation/Image/Image5.cxx.
This example illustrates how to import data into the otb::Image class. This is particularly useful
for interfacing with other software systems. Many systems use a contiguous block of memory as a
buffer for image pixel data. The current example assumes this is the case and feeds the buffer into
an otb::ImportImageFilter, thereby producing an Image as output.
For fun we create a synthetic image with a centered sphere in a locally allocated buffer and pass this
block of memory to the ImportImageFilter. This example is set up so that on execution, the user
must provide the name of an output file as a command-line argument.
First, the header file of the ImportImageFilter class must be included.
#include "otbImage.h"
#include "otbImportImageFilter.h"
Next, we select the data type to use to represent the image pixels. We assume that the external block
of memory uses the same data type to represent the pixels.
typedef unsigned char PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > ImageType ;
The type of the ImportImageFilter is instantiated in the following line.
typedef otb::ImportImageFilter <ImageType > ImportFilterType;
A filter object created using the New() method is then assigned to a SmartPointer.
ImportFilterType::Pointer importFilter = ImportFilterType::New ();
This filter requires the user to specify the size of the image to be produced as output. The
SetRegion() method is used to this end. The image size should exactly match the number of
pixels available in the locally allocated buffer.
ImportFilterType::SizeType size;
size[0] = 200; // size along X
size[1] = 200; // size along Y
ImportFilterType::IndexType start;
start.Fill(0);
ImportFilterType::RegionType region;
region.SetIndex(start);
region.SetSize(size);
importFilter ->SetRegion (region);
The origin of the output image is specified with the SetOrigin() method.
double origin[Dimension ];
origin[0] = 0.0; // X coordinate
origin[1] = 0.0; // Y coordinate
importFilter ->SetOrigin (origin);
The spacing of the image is passed with the SetSpacing() method.
double spacing[Dimension ];
spacing [0] = 1.0; // along X direction
spacing [1] = 1.0; // along Y direction
importFilter ->SetSpacing (spacing );
Next we allocate the memory block containing the pixel data to be passed to the ImportImageFilter.
Note that we use exactly the same size that was specified with the SetRegion() method. In a
practical application, you may get this buffer from some other library using a different data structure
to represent the images.
const unsigned int numberOfPixels = size[0] * size[1];
PixelType * localBuffer = new PixelType [numberOfPixels];
Here we fill up the buffer with a binary sphere. We use simple for() loops here similar to
those found in the C or FORTRAN programming languages. Note that OTB does not use for()
loops in its internal code to access pixels. All pixel access tasks are instead performed using
otb::ImageIterators that support the management of n-dimensional images.
const double radius2 = radius * radius;
PixelType * it = localBuffer ;
for (unsigned int y = 0; y < size[1]; y++)
{
const double dy = static_cast<double>(y) - static_cast<double>(size[1]) /
2.0;
for (unsigned int x = 0; x < size[0]; x++)
{
const double dx = static_cast<double>(x) - static_cast<double>(size[0]) /
2.0;
const double d2 = dx * dx + dy * dy;
*it++ = (d2 < radius2) ? 255 : 0;
}
}
The buffer is passed to the ImportImageFilter with the SetImportPointer(). Note that the last
argument of this method specifies who will be responsible for deleting the memory block once it
is no longer in use. A false value indicates that the ImportImageFilter will not try to delete the
buffer when its destructor is called. A true value, on the other hand, will allow the filter to delete
the memory block upon destruction of the import filter.
For the ImportImageFilter to appropriately delete the memory block, the memory must be allocated
with the C++ new() operator. Memory allocated with other memory allocation mechanisms, such as
C malloc or calloc, will not be deleted properly by the ImportImageFilter. In other words, it is the
application programmer’s responsibility to ensure that ImportImageFilter is only given permission
to delete the C++ new operator-allocated memory.
const bool importImageFilterWillOwnTheBuffer = true;
importFilter ->SetImportPointer(localBuffer , numberOfPixels ,
importImageFilterWillOwnTheBuffer);
Finally, we can connect the output of this filter to a pipeline. For simplicity we just use a writer here,
but it could be any other filter.
writer ->SetInput(dynamic_cast<ImageType *>(importFilter ->GetOutput ()));
Note that we do not call delete on the buffer since we pass true as the last argument of
SetImportPointer(). Now the buffer is owned by the ImportImageFilter.
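Conversely, when false is passed as the last argument, the application keeps ownership of the buffer and must release it explicitly once the pipeline no longer needs it. A minimal sketch of that variant:
// Variant: the application retains ownership of the buffer.
importFilter->SetImportPointer(localBuffer, numberOfPixels, false);
writer->Update();      // the pipeline has finished using the buffer
delete[] localBuffer;  // the application must release it itself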
5.1.9 Image Lists
The source code for this example can be found in the file
Examples/DataRepresentation/Image/ImageListExample.cxx.
This example illustrates the use of the otb::ImageList class. This class provides the functionality
needed in order to integrate image lists as data objects into the OTB pipeline. Indeed, if
a std::list< ImageType > was used, the update operations on the pipeline might not have the
desired effects.
In this example, we will only present the basic operations which can be applied to an
otb::ImageList object.
The first thing required is to include the header file of the otb::ImageList class.
#include "otbImageList.h"
As usual, we start by defining the types for the pixel and image types, as well as those for the readers
and writers.
const unsigned int Dimension = 2;
typedef unsigned char InputPixelType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <InputImageType > WriterType ;
We can now define the type for the image list. The otb::ImageList class is templated over the
type of image contained in it. This means that all images in a list must have the same type.
typedef otb::ImageList <InputImageType > ImageListType;
Let us assume now that we want to read an image from a file and store it in a list. The first thing to
do is to instantiate the reader and set the image file name. We effectively read the image by calling
the Update() method.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFilename);
reader ->Update();
We create an image list by using the New() method.
ImageListType::Pointer imageList = ImageListType::New();
In order to store the image in the list, the PushBack() method is used.
imageList ->PushBack (reader ->GetOutput ());
We could repeat this operation for other readers or the outputs of filters. We will now write an image
of the list to a file. We therefore instantiate a writer, set the image file name and set the input image
for it. This is done by calling the Back() method of the list, which allows us to get the last element.
// Getting the image from the list and writing it to file
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputFilename);
writer ->SetInput(imageList ->Back());
writer ->Update();
Other useful methods are:
• SetNthElement() and GetNthElement(), which allow random access to any element of the list.
• Front(), to access the first element of the list.
• Erase(), to remove an element.
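As a brief sketch of these accessors (assuming the imageList built above; the index 0 is just an example):
InputImageType::Pointer first = imageList->Front();          // first image in the list
InputImageType::Pointer same  = imageList->GetNthElement(0); // random access by index
imageList->Erase(0);                                          // remove the element at index 0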
Also, iterator classes are defined in order to have an efficient means of moving through the list.
Finally, the otb::ImageListToImageListFilter is provided in order to implement filters which
operate on image lists and produce image lists.
5.2 PointSet
5.2.1 Creating a PointSet
The source code for this example can be found in the file
Examples/DataRepresentation/Mesh/PointSet1.cxx.
The itk::PointSet is a basic class intended to represent geometry in the form of a set of points
in n-dimensional space. It is the base class for the itk::Mesh, providing the methods necessary to
manipulate sets of points. Points can have values associated with them. The type of such values is
defined by a template parameter of the itk::PointSet class (i.e., TPixelType). Two basic inter-
action styles of PointSets are available in ITK. These styles are referred to as static and dynamic.
The first style is used when the number of points in the set is known in advance and is not expected
to change as a consequence of the manipulations performed on the set. The dynamic style, on the
other hand, is intended to support insertion and removal of points in an efficient manner. Distin-
guishing between the two styles is meant to facilitate the fine tuning of a PointSet’s behavior while
optimizing performance and memory management.
In order to use the PointSet class, its header file should be included.
#include "itkPointSet .h"
Then we must decide what type of value to associate with the points. This is generally called the
PixelType in order to make the terminology consistent with the itk::Image. The PointSet is
also templated over the dimension of the space in which the points are represented. The following
declaration illustrates a typical instantiation of the PointSet class.
typedef itk::PointSet <unsigned short, 2> PointSetType;
A PointSet object is created by invoking the New() method on its type. The resulting object must be
assigned to a SmartPointer. The PointSet is then reference-counted and can be shared by multiple
objects. The memory allocated for the PointSet will be released when the number of references to
the object is reduced to zero. This simply means that the user does not need to be concerned with
invoking the Delete() method on this class. In fact, the Delete() method should never be called
directly within any of the reference-counted ITK classes.
PointSetType::Pointer pointsSet = PointSetType::New ();
Following the principles of Generic Programming, the PointSet class has a set of associated defined
types to ensure that interacting objects can be declared with compatible types. This set of type
definitions is commonly known as a set of traits. Among them we can find the PointType type,
for example. This is the type used by the point set to represent points in space. The following
declaration takes the point type as defined in the PointSet traits and renames it to be conveniently
used in the global namespace.
typedef PointSetType::PointType PointType ;
The PointType can now be used to declare point objects to be inserted in the PointSet. Points are
fairly small objects, so it is inconvenient to manage them with reference counting and smart pointers.
They are simply instantiated as typical C++ classes. The Point class inherits the [] operator from
the itk::Array class. This makes it possible to access its components using index notation. For
efficiency’s sake no bounds checking is performed during index access. It is the user’s responsibility
to ensure that the index used is in the range {0,Dimension−1}. Each of the components in the point
is associated with space coordinates. The following code illustrates how to instantiate a point and
initialize its components.
PointType p0;
p0[0] = -1.0; // x coordinate
p0[1] = -1.0; // y coordinate
Points are inserted in the PointSet by using the SetPoint() method. This method requires the user
to provide a unique identifier for the point. The identifier is typically an unsigned integer that will
enumerate the points as they are being inserted. The following code shows how three points are
inserted into the PointSet.
pointsSet ->SetPoint (0, p0);
pointsSet ->SetPoint (1, p1);
pointsSet ->SetPoint (2, p2);
It is possible to query the PointSet in order to determine how many points have been inserted into it.
This is done with the GetNumberOfPoints() method as illustrated below.
const unsigned int numberOfPoints = pointsSet ->GetNumberOfPoints();
std::cout << numberOfPoints << std::endl;
Points can be read from the PointSet by using the GetPoint() method and the integer identifier. The
point is stored in a pointer provided by the user. If the identifier provided does not match an existing
point, the method will return false and the contents of the point will be invalid. The following code
illustrates point access using defensive programming.
PointType pp;
bool pointExists = pointsSet ->GetPoint (1, &pp);
if (pointExists )
{
std::cout << "Point is = " << pp << std ::endl;
}
GetPoint() and SetPoint() are not the most efficient methods to access points in the PointSet. It
is preferable to get direct access to the internal point container defined by the traits and use iterators
to walk sequentially over the list of points (as shown in the following example).
5.2.2 Getting Access to Points
The source code for this example can be found in the file
Examples/DataRepresentation/Mesh/PointSet2.cxx.
The itk::PointSet class uses an internal container to manage the storage of itk::Points. It is
more efficient, in general, to manage points by using the access methods provided directly on the
points container. The following example illustrates how to interact with the point container and how
to use point iterators.
The type is defined by the traits of the PointSet class. The following line conveniently takes the
PointsContainer type from the PointSet traits and declares it in the global namespace.
typedef PointSetType::PointsContainer PointsContainer;
The actual type of the PointsContainer depends on what style of PointSet is being used. The dynamic
PointSet uses the itk::MapContainer while the static PointSet uses the itk::VectorContainer.
The vector and map containers are basically ITK wrappers around the STL classes std::map and
std::vector. By default, the PointSet uses a static style, hence the default type of point container is
a VectorContainer. Both the map and vector containers are templated over the type of the elements
they contain. In this case they are templated over PointType. Containers are reference-counted
objects. They are then created with the New() method and assigned to an itk::SmartPointer after
creation. The following line creates a point container compatible with the type of the PointSet from
which the trait has been taken.
PointsContainer::Pointer points = PointsContainer::New();
Points can now be defined using the PointType trait from the PointSet.
typedef PointSetType::PointType PointType ;
PointType p0;
PointType p1;
p0[0] = -1.0;
p0[1] = 0.0; // Point 0 = {-1, 0 }
p1[0] = 1.0;
p1[1] = 0.0; // Point 1 = { 1, 0 }
The created points can be inserted in the PointsContainer using the generic method
InsertElement() which requires an identifier to be provided for each point.
unsigned int pointId = 0;
points ->InsertElement(pointId++, p0);
points ->InsertElement(pointId++, p1);
Finally the PointsContainer can be assigned to the PointSet. This will substitute any previously
existing PointsContainer on the PointSet. The assignment is done using the SetPoints() method.
pointSet ->SetPoints (points);
The PointsContainer object can be obtained from the PointSet using the GetPoints() method. This
method returns a pointer to the actual container owned by the PointSet which is then assigned to a
SmartPointer.
PointsContainer::Pointer points2 = pointSet ->GetPoints ();
The most efficient way to sequentially visit the points is to use the iterators provided by PointsCon-
tainer. The Iterator type belongs to the traits of the PointsContainer classes. It behaves pretty
much like the STL iterators.3 The Points iterator is not a reference counted class, so it is created
directly from the traits without using SmartPointers.
typedef PointsContainer::Iterator PointsIterator;
3If you dig deep enough into the code, you will discover that these iterators are actually ITK wrappers around STL
iterators.
The subsequent use of the iterator follows what you may expect from an STL iterator. The iterator
to the first point is obtained from the container with the Begin() method and assigned to another
iterator.
PointsIterator pointIterator = points ->Begin();
The ++ operator on the iterator can be used to advance from one point to the next. The actual value
of the Point to which the iterator is pointing can be obtained with the Value() method. The loop for
walking through all the points can be controlled by comparing the current iterator with the iterator
returned by the End() method of the PointsContainer. The following lines illustrate the typical loop
for walking through the points.
PointsIterator end = points->End();
while (pointIterator != end)
{
PointType p = pointIterator.Value(); // access the point
std::cout << p << std::endl; // print the point
++pointIterator; // advance to next point
}
Note that as in STL, the iterator returned by the End() method is not a valid iterator. This is called
a past-end iterator in order to indicate that it is the value resulting from advancing one step after
visiting the last element in the container.
The number of elements stored in a container can be queried with the Size() method. In the case
of the PointSet, the following two lines of code are equivalent, both of them returning the number
of points in the PointSet.
std::cout << pointSet->GetNumberOfPoints() << std::endl;
std::cout << pointSet->GetPoints()->Size() << std::endl;
5.2.3 Getting Access to Data in Points
The source code for this example can be found in the file
Examples/DataRepresentation/Mesh/PointSet3.cxx.
The itk::PointSet class was designed to interact with the Image class. For this reason it was
found convenient to allow the points in the set to hold values that could be computed from images.
The value associated with a point is referred to as PixelType in order to be consistent with
image terminology. Users can define the type as they please thanks to the flexibility offered by the
Generic Programming approach used in the toolkit. The PixelType is the first template parameter
of the PointSet.
The following code defines a particular type for a pixel type and instantiates a PointSet class with it.
typedef unsigned short PixelType;
typedef itk::PointSet<PixelType, 2> PointSetType;
Data can be inserted into the PointSet using the SetPointData() method. This method requires the
user to provide an identifier. The data in question will be associated with the point holding the same
identifier. It is the user’s responsibility to verify the appropriate matching between inserted data and
inserted points. The following line illustrates the use of the SetPointData() method.
unsigned int dataId = 0;
PixelType value = 79;
pointSet ->SetPointData(dataId++, value);
Data associated with points can be read from the PointSet using the GetPointData() method. This
method requires the user to provide the identifier of the point and a valid pointer to a location where
the pixel data can be safely written. If the identifier does not match any existing identifier on
the PointSet, the method will return false and the pixel value returned will be invalid. It is the user’s
responsibility to check the returned boolean before using the pixel value.
const bool found = pointSet ->GetPointData(dataId , &value);
if (found)
{
std::cout << "Pixel value = " << value << std::endl;
}
The SetPointData() and GetPointData() methods are not the most efficient way to get access
to point data. It is far more efficient to use the Iterators provided by the PointDataContainer.
Data associated with points is internally stored in PointDataContainers. In the same way as
with points, the actual container type used depends on whether the style of the PointSet is static
or dynamic. Static point sets will use an itk::VectorContainer while dynamic point sets will
use an itk::MapContainer. The type of the data container is defined as one of the traits in the
PointSet. The following declaration illustrates how the type can be taken from the traits and used to
conveniently declare a similar type on the global namespace.
typedef PointSetType::PointDataContainer PointDataContainer;
Using this type it is now possible to create an instance of the data container. This is a standard
reference-counted object, hence it is created with the New() method and the newly created object is
assigned to a SmartPointer.
PointDataContainer::Pointer pointData = PointDataContainer::New ();
Pixel data can be inserted in the container with the InsertElement() method. This method requires
an identifier to be provided for each point data.
unsigned int pointId = 0;
PixelType value0 = 34;
PixelType value1 = 67;
pointData->InsertElement(pointId++, value0);
pointData->InsertElement(pointId++, value1);
Finally the PointDataContainer can be assigned to the PointSet. This will substitute any previously
existing PointDataContainer on the PointSet. The assignment is done using the SetPointData()
method.
pointSet ->SetPointData(pointData );
The PointDataContainer can be obtained from the PointSet using the GetPointData() method.
This method returns a pointer (assigned to a SmartPointer) to the actual container owned by the
PointSet.
PointDataContainer::Pointer pointData2 = pointSet ->GetPointData();
The most efficient way to sequentially visit the data associated with points is to use the iterators
provided by the PointDataContainer. The Iterator type belongs to the traits of the PointDataContainer
class. The iterator is not a reference-counted class, so it is just created directly from the traits
without using SmartPointers.
typedef PointDataContainer::Iterator PointDataIterator;
The subsequent use of the iterator follows what you may expect from a STL iterator. The iterator
to the first point is obtained from the container with the Begin() method and assigned to another
iterator.
PointDataIterator pointDataIterator = pointData2 ->Begin();
The ++ operator on the iterator can be used to advance from one data point to the next. The actual
value of the PixelType to which the iterator is pointing can be obtained with the Value() method.
The loop for walking through all the point data can be controlled by comparing the current iterator
with the iterator returned by the End() method of the PointDataContainer. The following lines
illustrate the typical loop for walking through the point data.
PointDataIterator end = pointData2->End();
while (pointDataIterator != end)
{
PixelType p = pointDataIterator.Value(); // access the pixel data
std::cout << p << std::endl; // print the pixel data
++pointDataIterator; // advance to next pixel/point
}
Note that as in STL, the iterator returned by the End() method is not a valid iterator. This is called
a past-end iterator in order to indicate that it is the value resulting from advancing one step after
visiting the last element in the container.
5.2.4 Vectors as Pixel Type
The source code for this example can be found in the file
Examples/DataRepresentation/Mesh/PointSetWithVectors.cxx.
This example illustrates how a point set can be parameterized to manage a particular pixel type.
It is quite common to associate vector values with points for producing geometric representations
or storing multi-band information. The following code shows how vector values can be used as
pixel type on the PointSet class. The itk::Vector class is used here as the pixel type. This class
is appropriate for representing the relative position between two points. It could then be used, for
example, to manage displacements in disparity map estimations.
In order to use the vector class it is necessary to include its header file along with the header of the
point set.
#include "itkVector .h"
#include "itkPointSet .h"
Figure 5.2: Vectors as PixelType.
The Vector class is templated over the type used to represent the spatial coordinates and over the
space dimension. Since the PixelType is independent of the PointType, we are free to select any
dimension for the vectors to be used as pixel type. However, for the sake of producing an interesting
example, we will use vectors that represent displacements of the points in the PointSet. Those
vectors are then selected to be of the same dimension as the PointSet.
const unsigned int Dimension = 2;
typedef itk::Vector<float, Dimension> PixelType;
Then we use the PixelType (which is actually a Vector) to instantiate the PointSet type and
subsequently create a PointSet object.
typedef itk::PointSet<PixelType, Dimension> PointSetType;
PointSetType::Pointer pointSet = PointSetType::New();
The following code generates a circle and assigns vector values to the points. The components
of the vectors in this example are computed to represent the tangents to the circle, as shown in
Figure 5.2.
PointSetType::PixelType tangent;
PointSetType::PointType point;
unsigned int pointId = 0;
const double radius = 300.0;
for (unsigned int i = 0; i < 360; ++i)
{
const double angle = i * atan(1.0) / 45.0;
point[0] = radius * sin(angle);
point[1] = radius * cos(angle);
tangent[0] = cos(angle);
tangent[1] = -sin(angle);
pointSet->SetPoint(pointId, point);
pointSet->SetPointData(pointId, tangent);
pointId++;
}
We can now visit all the points and use the vector stored in the pixel values to apply a displacement
to the points. This is in the spirit of what a deformable model would do at each one of its iterations.
typedef PointSetType::PointDataContainer::ConstIterator PointDataIterator;
PointDataIterator pixelIterator = pointSet->GetPointData()->Begin();
PointDataIterator pixelEnd = pointSet->GetPointData()->End();
typedef PointSetType::PointsContainer::Iterator PointIterator;
PointIterator pointIterator = pointSet->GetPoints()->Begin();
PointIterator pointEnd = pointSet->GetPoints()->End();
while (pixelIterator != pixelEnd && pointIterator != pointEnd)
{
pointIterator.Value() = pointIterator.Value() + pixelIterator.Value();
++pixelIterator;
++pointIterator;
}
Note that the ConstIterator was used here instead of the normal Iterator since the pixel values
are only intended to be read and not modified. ITK supports const-correctness at the API level.
The itk::Vector class has overloaded the + operator with the itk::Point. In other words,
vectors can be added to points in order to produce new points. This property is exploited in the
body of the loop in order to update the point positions with a single statement.
We can finally visit all the points and print out their new values.
pointIterator = pointSet->GetPoints()->Begin();
pointEnd = pointSet->GetPoints()->End();
while (pointIterator != pointEnd)
{
std::cout << pointIterator.Value() << std::endl;
++pointIterator;
}
Note that itk::Vector is not the appropriate class for representing normals to surfaces or gradients
of functions. This is due to the way in which vectors behave under affine transforms. ITK has a
specific class for representing normals and function gradients: the itk::CovariantVector class.
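As an illustration only (this snippet is not part of the original example), a PointSet holding surface
normals could be instantiated with itk::CovariantVector as its pixel type, in a way completely
analogous to the Vector case above; the names used here are chosen for the sketch.
#include "itkCovariantVector.h"
#include "itkPointSet.h"
// Illustrative sketch: normals stored as CovariantVectors in a PointSet.
const unsigned int Dimension = 2;
typedef itk::CovariantVector<float, Dimension> NormalType;
typedef itk::PointSet<NormalType, Dimension> NormalPointSetType;
NormalPointSetType::Pointer normalSet = NormalPointSetType::New();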
5.3 Mesh
5.3.1 Creating a Mesh
The source code for this example can be found in the file
Examples/DataRepresentation/Mesh/Mesh1.cxx.
The itk::Mesh class is intended to represent shapes in space. It derives from the itk::PointSet
class and hence inherits all the functionality related to points and access to the pixel-data associated
with the points. The mesh class is also n-dimensional which allows a great flexibility in its use.
In practice a Mesh class can be seen as a PointSet to which cells (also known as elements) of many
different dimensions and shapes have been added. Cells in the mesh are defined in terms of the
existing points using their point-identifiers.
In the same way as for the PointSet, two basic styles of Meshes are available in ITK. They are
referred to as static and dynamic. The first one is used when the number of points in the set can be
known in advance and it is not expected to change as a consequence of the manipulations performed
on the set. The dynamic style, on the other hand, is intended to support insertion and removal of
points in an efficient manner. The reason for making the distinction between the two styles is to
facilitate fine tuning its behavior with the aim of optimizing performance and memory management.
In the case of the Mesh, the dynamic/static aspect is extended to the management of cells.
In order to use the Mesh class, its header file should be included.
#include "itkMesh.h"
Then, the type associated with the points must be selected and used for instantiating the Mesh type.
typedef float PixelType;
The Mesh type extensively uses the capabilities provided by Generic Programming. In particular
the Mesh class is parameterized over the PixelType and the dimension of the space. PixelType is the
type of the value associated with every point just as is done with the PointSet. The following line
illustrates a typical instantiation of the Mesh.
const unsigned int Dimension = 2;
typedef itk::Mesh<PixelType, Dimension> MeshType;
Meshes are expected to take large amounts of memory. For this reason they are reference counted
objects and are managed using SmartPointers. The following line illustrates how a mesh is cre-
ated by invoking the New() method of the MeshType and the resulting object is assigned to a
itk::SmartPointer.
MeshType::Pointer mesh = MeshType::New();
The management of points in the Mesh is exactly the same as in the PointSet. The point type
associated with the mesh can be obtained through the PointType trait. The following code shows
the creation of points compatible with the mesh type defined above and the assignment of values to
its coordinates.
MeshType::PointType p0;
MeshType::PointType p1;
MeshType::PointType p2;
MeshType::PointType p3;
p0[0] = -1.0;
p0[1] = -1.0; // first point ( -1, -1 )
p1[0] = 1.0;
p1[1] = -1.0; // second point ( 1, -1 )
p2[0] = 1.0;
p2[1] = 1.0; // third point ( 1, 1 )
p3[0] = -1.0;
p3[1] = 1.0; // fourth point ( -1, 1 )
The points can now be inserted in the Mesh using the SetPoint() method. Note that points are
copied into the mesh structure. This means that the local instances of the points can now be modified
without affecting the Mesh content.
mesh->SetPoint(0, p0);
mesh->SetPoint(1, p1);
mesh->SetPoint(2, p2);
mesh->SetPoint(3, p3);
The current number of points in the Mesh can be queried with the GetNumberOfPoints() method.
std::cout << "Points = " << mesh ->GetNumberOfPoints() << std::endl;
The points can now be efficiently accessed using the Iterator to the PointsContainer as it was done
in the previous section for the PointSet. First, the point iterator type is extracted through the mesh
traits.
typedef MeshType::PointsContainer::Iterator PointsIterator;
A point iterator is initialized to the first point with the Begin() method of the PointsContainer.
PointsIterator pointIterator = mesh ->GetPoints ()->Begin();
The ++ operator on the iterator is now used to advance from one point to the next. The actual value
of the Point to which the iterator is pointing can be obtained with the Value() method. The loop
for walking through all the points is controlled by comparing the current iterator with the iterator
returned by the End() method of the PointsContainer. The following lines illustrate the typical loop
for walking through the points.
PointsIterator end = mesh->GetPoints()->End();
while (pointIterator != end)
{
MeshType::PointType p = pointIterator.Value(); // access the point
std::cout << p << std::endl; // print the point
++pointIterator; // advance to next point
}
5.3.2 Inserting Cells
The source code for this example can be found in the file
Examples/DataRepresentation/Mesh/Mesh2.cxx.
An itk::Mesh can contain a variety of cell types. Typical cells are the itk::LineCell,
itk::TriangleCell, itk::QuadrilateralCell and itk::TetrahedronCell. The latter will
not be used very often in the remote sensing context. Additional flexibility is provided for managing
cells, at the price of a bit more complexity than in the case of point management.
The following code creates a polygonal line in order to illustrate the simplest case of cell manage-
ment in a Mesh. The only cell type used here is the LineCell. The header file of this class has to be
included.
#include "itkLineCell .h"
In order to be consistent with the Mesh, cell types have to be configured with a number of custom
types taken from the mesh traits. The set of traits relevant to cells are packaged by the Mesh class
into the CellType trait. This trait needs to be passed to the actual cell types at the moment of their
instantiation. The following line shows how to extract the Cell traits from the Mesh type.
typedef MeshType::CellType CellType;
The LineCell type can now be instantiated using the traits taken from the Mesh.
typedef itk::LineCell<CellType> LineType;
The main difference in the way cells and points are managed by the Mesh is that points are stored by
copy on the PointsContainer while cells are stored in the CellsContainer using pointers. The reason
for using pointers is that cells use C++ polymorphism on the mesh. This means that the mesh is only
aware of having pointers to a generic cell which is the base class of all the specific cell types. This
architecture makes it possible to combine different cell types in the same mesh. Points, on the other
hand, are of a single type and have a small memory footprint, which makes it efficient to copy them
directly into the container.
Managing cells by pointers adds another level of complexity to the Mesh, since it is now necessary to
establish a protocol making clear who is responsible for allocating and releasing the cells’ memory.
This protocol is implemented in the form of a specific type of pointer called the CellAutoPointer.
This pointer, based on the itk::AutoPointer, differs in many respects from the SmartPointer.
The CellAutoPointer has an internal pointer to the actual object and a boolean flag that indicates
if the CellAutoPointer is responsible for releasing the cell memory whenever the time comes for
its own destruction. It is said that a CellAutoPointer owns the cell when it is responsible for
its destruction. Many CellAutoPointers can point to the same cell, but at any given time only one
CellAutoPointer can own the cell.
The CellAutoPointer trait is defined in the MeshType and can be extracted as illustrated in the
following line.
typedef CellType::CellAutoPointer CellAutoPointer;
Note that the CellAutoPointer is pointing to a generic cell type. It is not aware of the actual type
of the cell, which can be for example LineCell, TriangleCell or TetrahedronCell. This fact will
influence the way in which we access cells later on.
At this point we can actually create a mesh and insert some points into it.
MeshType::Pointer mesh = MeshType::New();
MeshType::PointType p0;
MeshType::PointType p1;
MeshType::PointType p2;
p0[0] = -1.0;
p0[1] = 0.0;
p1[0] = 1.0;
p1[1] = 0.0;
p2[0] = 1.0;
p2[1] = 1.0;
mesh->SetPoint(0, p0);
mesh->SetPoint(1, p1);
mesh->SetPoint(2, p2);
The following code creates two CellAutoPointers and initializes them with newly created cell ob-
jects. The actual cell type created in this case is LineCell. Note that cells are created with the normal
new C++ operator. The CellAutoPointer takes ownership of the received pointer by using the method
TakeOwnership(). Even though this may seem verbose, it is necessary in order to make it explicit
from the code that the responsibility of memory release is assumed by the AutoPointer.
CellAutoPointer line0;
CellAutoPointer line1;
line0.TakeOwnership(new LineType);
line1.TakeOwnership(new LineType);
The LineCells should now be associated with points in the mesh. This is done using the identifiers
assigned to points when they were inserted in the mesh. Every cell type has a specific number of
points that must be associated with it.4 For example a LineCell requires two points, a TriangleCell
requires three and a TetrahedronCell requires four. Cells use an internal numbering system for
4Some cell types, like polygons, have a variable number of points associated with them.
points. It is simply an index in the range {0, NumberOfPoints-1}. The association of points and
cells is done with the SetPointId() method, which requires the user to provide the internal index of
the point in the cell and the corresponding PointIdentifier in the Mesh. The internal cell index is the
first parameter of SetPointId() while the mesh point-identifier is the second.
line0->SetPointId(0, 0); // line between points 0 and 1
line0->SetPointId(1, 1);
line1->SetPointId(0, 1); // line between points 1 and 2
line1->SetPointId(1, 2);
Cells are inserted in the mesh using the SetCell() method. It requires an identifier and the
AutoPointer to the cell. The Mesh will take ownership of the cell to which the AutoPointer is pointing.
This is done internally by the SetCell() method. In this way, the destruction of the CellAutoPointer
will not induce the destruction of the associated cell.
mesh->SetCell(0, line0);
mesh->SetCell(1, line1);
After serving as an argument of the SetCell() method, a CellAutoPointer no longer holds owner-
ship of the cell. It is important not to use this same CellAutoPointer again as argument to SetCell()
without first securing ownership of another cell.
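The following fragment is only a sketch (it is not part of Mesh2.cxx) of how a single CellAutoPointer
variable could be reused safely: ownership of a freshly allocated cell is taken again before every
call to SetCell(), following the same pattern used in the Mesh3.cxx example further below.
CellAutoPointer line;
line.TakeOwnership(new LineType); // the AutoPointer owns a new cell
line->SetPointId(0, 0);
line->SetPointId(1, 1);
mesh->SetCell(0, line); // the mesh now owns the first cell
line.TakeOwnership(new LineType); // secure ownership of another cell
line->SetPointId(0, 1);
line->SetPointId(1, 2);
mesh->SetCell(1, line); // the mesh now owns the second cell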
The number of Cells currently inserted in the mesh can be queried with the GetNumberOfCells()
method.
std::cout << "Cells = " << mesh ->GetNumberOfCells() << std::endl;
In a way analogous to points, cells can be accessed using Iterators to the CellsContainer in the mesh.
The trait for the cell iterator can be extracted from the mesh and used to define a local type.
typedef MeshType::CellsContainer::Iterator CellIterator;
Then the iterators to the first and past-end cell in the mesh can be obtained respectively with the
Begin() and End() methods of the CellsContainer. The CellsContainer of the mesh is returned by
the GetCells() method.
CellIterator cellIterator = mesh->GetCells()->Begin();
CellIterator end = mesh->GetCells()->End();
Finally a standard loop is used to iterate over all the cells. Note the use of the Value() method
to get the actual pointer to the cell from the CellIterator. Note also that the values returned are
pointers to the generic CellType. These pointers have to be down-cast in order to be used as actual
LineCell types. Safe down-casting is performed with the dynamic_cast operator, which will throw
an exception if the conversion cannot be safely performed.
while (cellIterator != end)
{
MeshType::CellType * cellptr = cellIterator.Value();
LineType * line = dynamic_cast<LineType *>(cellptr);
std::cout << line->GetNumberOfPoints() << std::endl;
++cellIterator;
}
5.3.3 Managing Data in Cells
The source code for this example can be found in the file
Examples/DataRepresentation/Mesh/Mesh3.cxx.
In the same way that custom data can be associated with points in the mesh, it is also possible to
associate custom data with cells. The type of the data associated with the cells can be different
from the data type associated with points. By default, however, these two types are the same. The
following example illustrates how to access data associated with cells. The approach is analogous
to the one used to access point data.
Consider the example of a mesh containing lines on which values are associated with each line. The
mesh and cell header files should be included first.
#include "itkMesh.h"
#include "itkLineCell .h"
Then the PixelType is defined and the mesh type is instantiated with it.
typedef float PixelType;
typedef itk::Mesh<PixelType, 2> MeshType;
The itk::LineCell type can now be instantiated using the traits taken from the Mesh.
typedef MeshType::CellType CellType;
typedef itk::LineCell<CellType> LineType;
Let’s now create a Mesh and insert some points into it. Note that the dimension of the points matches
the dimension of the Mesh. Here we insert a sequence of points that look like a plot of the log()
function.
MeshType::Pointer mesh = MeshType::New();
typedef MeshType::PointType PointType;
PointType point;
const unsigned int numberOfPoints = 10;
for (unsigned int id = 0; id < numberOfPoints; id++)
{
point[0] = static_cast<PointType::ValueType>(id); // x
point[1] = log(static_cast<double>(id)); // y
mesh->SetPoint(id, point);
}
A set of line cells is created and associated with the existing points by using point identifiers. In this
simple case, the point identifiers can be deduced from cell identifiers since the line cells are ordered
in the same way.
CellType::CellAutoPointer line;
const unsigned int numberOfCells = numberOfPoints - 1;
for (unsigned int cellId = 0; cellId < numberOfCells; cellId++)
{
line.TakeOwnership(new LineType);
line->SetPointId(0, cellId); // first point
line->SetPointId(1, cellId + 1); // second point
mesh->SetCell(cellId, line); // insert the cell
}
Data associated with cells is inserted in the itk::Mesh by using the SetCellData() method. It
requires the user to provide an identifier and the value to be inserted. The identifier should match
one of the inserted cells. In this simple example, the square of the cell identifier is used as cell data.
Note the use of a static_cast to PixelType in the assignment.
for (unsigned int cellId = 0; cellId < numberOfCells; cellId++)
{
mesh->SetCellData(cellId, static_cast<PixelType>(cellId * cellId));
}
Cell data can be read from the Mesh with the GetCellData() method. It requires the user to provide
the identifier of the cell for which the data is to be retrieved. The user should also provide a valid
pointer to a location where the data can be copied.
for (unsigned int cellId = 0; cellId < numberOfCells; cellId++)
{
PixelType value = itk::NumericTraits<PixelType>::Zero;
mesh->GetCellData(cellId, &value);
std::cout << "Cell " << cellId << " = " << value << std::endl;
}
Neither SetCellData() nor GetCellData() is an efficient way to access cell data. More efficient
access to cell data can be achieved by using the Iterators built into the CellDataContainer.
typedef MeshType::CellDataContainer::ConstIterator CellDataIterator;
Note that the ConstIterator is used here because the data is only going to be read. This approach
is exactly the same already illustrated for getting access to point data. The iterator to the first cell
data item can be obtained with the Begin() method of the CellDataContainer. The past-end iterator
is returned by the End() method. The cell data container itself can be obtained from the mesh with
the method GetCellData().
CellDataIterator cellDataIterator = mesh->GetCellData()->Begin();
CellDataIterator end = mesh->GetCellData()->End();
Finally a standard loop is used to iterate over all the cell data entries. Note the use of the Value()
method used to get the actual value of the data entry. PixelType elements are copied into the local
variable cellValue.
while (cellDataIterator != end)
{
PixelType cellValue = cellDataIterator.Value();
std::cout << cellValue << std::endl;
++ cellDataIterator;
}
More details about the use of itk::Mesh can be found in the ITK Software Guide.
5.4 Path
5.4.1 Creating a PolyLineParametricPath
The source code for this example can be found in the file
Examples/DataRepresentation/Path/PolyLineParametricPath1.cxx.
This example illustrates how to use the itk::PolyLineParametricPath. This class is typically
used for representing, in a concise way, the output of an image segmentation algorithm in 2D. See
section 14.3 for an example in the context of alignment detection. The PolyLineParametricPath can
however also be used for representing any open or closed curve in N dimensions as a piecewise
linear approximation.
First, the header file of the PolyLineParametricPath class must be included.
#include "itkPolyLineParametricPath.h"
The path is instantiated over the dimension of the image.
const unsigned int Dimension = 2;
typedef otb::Image<unsigned char, Dimension> ImageType;
typedef itk::PolyLineParametricPath<Dimension> PathType;
ImageType::ConstPointer image = reader->GetOutput();
PathType::Pointer path = PathType::New();
path->Initialize();
typedef PathType::ContinuousIndexType ContinuousIndexType;
ContinuousIndexType cindex;
typedef ImageType::PointType ImagePointType;
ImagePointType origin = image->GetOrigin();
ImageType::SpacingType spacing = image->GetSpacing();
ImageType::SizeType size = image->GetBufferedRegion().GetSize();
ImagePointType point;
point[0] = origin[0] + spacing[0] * size[0];
point[1] = origin[1] + spacing[1] * size[1];
image->TransformPhysicalPointToContinuousIndex(origin, cindex);
path->AddVertex(cindex);
image->TransformPhysicalPointToContinuousIndex(point, cindex);
path->AddVertex(cindex);
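Once the vertices have been added, the polyline can be evaluated for any parameter value between
StartOfInput() and EndOfInput(). The following fragment is a small sketch, not part of the original
example, which samples the path at a few positions and prints the corresponding continuous indices.
// Illustrative sketch: sample the polyline at a few parameter values.
const PathType::InputType start = path->StartOfInput();
const PathType::InputType end = path->EndOfInput();
const unsigned int numberOfSamples = 5;
for (unsigned int i = 0; i <= numberOfSamples; ++i)
  {
  const PathType::InputType t =
    start + (end - start) * static_cast<double>(i) / numberOfSamples;
  std::cout << "path(" << t << ") = " << path->Evaluate(t) << std::endl;
  }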
CHAPTER SIX
READING AND WRITING IMAGES
This chapter describes the toolkit architecture supporting reading and writing of images to files.
OTB does not enforce any particular file format, instead, it provides a structure inherited from ITK,
supporting a variety of formats that can be easily extended by the user as new formats become
available.
We begin the chapter with some simple examples of file I/O.
6.1 Basic Example
The source code for this example can be found in the file
Examples/IO/ImageReadWrite.cxx.
The classes responsible for reading and writing images are located at the beginning and end of the
data processing pipeline. These classes are known as data sources (readers) and data sinks (writers).
Generally speaking they are referred to as filters, although readers have no pipeline input and writers
have no pipeline output.
The reading of images is managed by the class otb::ImageFileReader while writing is performed
by the class otb::ImageFileWriter. These two classes are independent of any particular file
format. The actual low level task of reading and writing specific file formats is done behind the
scenes by a family of classes of type itk::ImageIO. The OTB image readers and writers are actually
very similar to those of ITK, but provide additional functionalities which are specific to remote
sensing images.
The first step for performing reading and writing is to include the following headers.
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
Then, as usual, a decision must be made about the type of pixel used to represent the image processed
by the pipeline. Note that when reading and writing images, the pixel type of the image is not
necessarily the same as the pixel type stored in the file. Your choice of the pixel type (and hence
template parameter) should be driven mainly by two considerations:
• It should be possible to cast the pixel type in the file to the pixel type you select. This
casting will be performed using the standard C-language rules, so you will have to make sure
that the conversion does not result in information being lost.
• The pixel type in memory should be appropriate to the type of processing you intend to
apply to the images.
A typical selection for remote sensing images is illustrated in the following lines.
typedef unsigned short PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > ImageType ;
Note that the dimension of the image in memory should match the one of the image in the file. There
are a couple of special cases in which this condition may be relaxed, but in general it is better to
ensure that both dimensions match. This is not a real issue in remote sensing, unless you want to
consider multi-band images as volumes (3D) of data.
We can now instantiate the types of the reader and writer. These two classes are parameterized over
the image type.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
Then, we create one object of each type using the New() method and assigning the result to a
itk::SmartPointer.
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
The name of the file to be read or written is passed with the SetFileName() method.
reader ->SetFileName (inputFilename);
writer ->SetFileName (outputFilename);
We can now connect these readers and writers to filters to create a pipeline. For example, we can
create a short pipeline by passing the output of the reader directly to the input of the writer.
writer ->SetInput(reader ->GetOutput ());
At first glance, this may seem like a rather useless program, but it actually implements a powerful
file format conversion tool! The execution of the pipeline is triggered by the invocation of the
Update() method on one of the final objects. In this case, the final data pipeline object is the writer.
It is a wise defensive programming practice to insert any Update() call inside a try/catch block in
case exceptions are thrown during the execution of the pipeline.
Figure 6.1: Collaboration diagram of the ImageIO classes.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}
Note that exceptions should only be caught by pieces of code that know what to do with them. In
a typical application this catch block should probably reside on the GUI code. The action on the
catch block could inform the user about the failure of the IO operation.
The IO architecture of the toolkit makes it possible to avoid explicit specification of the file format
used to read or write images.1 The object factory mechanism enables the ImageFileReader and
ImageFileWriter to determine at run-time which file format they are working with. Typically, file
formats are chosen based on the filename extension, but the architecture supports arbitrarily complex
processes to determine whether a file can be read or written. Alternatively, the user can specify the
data file format by explicitly instantiating and assigning the appropriate itk::ImageIO subclass.
To better understand the IO architecture, please refer to Figures 6.1, 6.2, and 6.3.
The following section describes the internals of the IO architecture provided in the toolbox.
6.2 Pluggable Factories
The principle behind the input/output mechanism used in ITK and therefore OTB is known
as pluggable-factories [49]. This concept is illustrated in the UML diagram in Figure 6.1.
1In this example no file format is specified; this program can be used as a general file conversion utility.
Figure 6.2: Use cases of ImageIO factories.
Figure 6.3: Class diagram of the ImageIO factories.
From the user’s point of view the objects responsible for reading and writing files are the
otb::ImageFileReader and otb::ImageFileWriter classes. These two classes, however, are
not aware of the details involved in reading or writing particular file formats like PNG or GeoTIFF.
What they do is to dispatch the user’s requests to a set of specific classes that are aware of the details
of image file formats. These classes are the itk::ImageIO classes. The ITK delegation mecha-
nism enables users to extend the number of supported file formats by just adding new classes to the
ImageIO hierarchy.
Each instance of ImageFileReader and ImageFileWriter has a pointer to an ImageIO object. If this
pointer is empty, it will be impossible to read or write an image, so the image file reader/writer
must determine which ImageIO class to use to perform IO operations. This is done basically by
passing the filename to a centralized class, the itk::ImageIOFactory, and asking it to identify any
subclass of ImageIO capable of reading or writing the user-specified file. This is illustrated by the
use cases on the right side of Figure 6.2. The ImageIOFactory acts here as a dispatcher that helps to
locate the actual IO factory classes corresponding to each file format.
Each class derived from ImageIO must provide an associated factory class capable of producing an
instance of the ImageIO class. For example, for PNG files, there is an itk::PNGImageIO object
that knows how to read these image files and there is an itk::PNGImageIOFactory class capable
of constructing a PNGImageIO object and returning a pointer to it. Each time a new file format is
added (i.e., a new ImageIO subclass is created), a factory must be implemented as a derived class of
the ObjectFactoryBase class, as illustrated in Figure 6.3.
For example, in order to read PNG files, a PNGImageIOFactory is created and registered with the
central ImageIOFactory singleton2 class, as illustrated in the left side of Figure 6.2. When the
ImageFileReader asks the ImageIOFactory for an ImageIO capable of reading the file identified by
filename, the ImageIOFactory will iterate over the list of registered factories and will ask each one of
them if it knows how to read the file. The factory that responds affirmatively will be used to create
the specific ImageIO instance that will be returned to the ImageFileReader and used to perform the
read operations.
With respect to the ITK formats, OTB adds most of the remote sensing image formats. In order to
do so, the Geospatial Data Abstraction Library, GDAL http://www.gdal.org/, is encapsulated
in an ImageIO factory. GDAL is a translator library for raster geospatial data formats that is released
under an X/MIT style Open Source license. As a library, it presents a single abstract data model to
the calling application for all supported formats, which include CEOS, GeoTIFF, ENVI, and many
more. See http://www.gdal.org/formats_list.html for the full format list.
Since GDAL is itself a multi-format library, the GDAL IO factory is able to choose the appropriate
resource for reading and writing images.
In most cases the mechanism is transparent to the user, who only interacts with the ImageFileReader
and ImageFileWriter. It is possible, however, to explicitly select the type of ImageIO object to use,
as sketched below. Please see the ITK Software Guide for more details about this.
2Singleton means that there is only one instance of this class in a particular application
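As a minimal sketch (assuming the reader exposes the SetImageIO() method inherited from
itk::ImageFileReader, and using a hypothetical file name), the GDAL ImageIO could be selected
explicitly instead of relying on the factory mechanism:
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbGDALImageIO.h"
// Illustrative sketch: bypass the ImageIOFactory by assigning the ImageIO
// object explicitly. "image.tif" is a hypothetical file name.
typedef otb::Image<unsigned short, 2> ImageType;
typedef otb::ImageFileReader<ImageType> ReaderType;
ReaderType::Pointer reader = ReaderType::New();
otb::GDALImageIO::Pointer gdalIO = otb::GDALImageIO::New();
reader->SetFileName("image.tif");
reader->SetImageIO(gdalIO); // assumed to be inherited from itk::ImageFileReader
reader->Update();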
6.3 IO Streaming
6.3.1 Implicit Streaming
The source code for this example can be found in the file
Examples/IO/StreamingImageReadWrite.cxx.
As we have seen, the reading of images is managed by the class otb::ImageFileReader while
writing is performed by the class otb::ImageFileWriter. ITK’s pipeline implements streaming.
This means that a filter for which the ThreadedGenerateData method is implemented will only
produce the data for the region requested by the following filter in the pipeline. Therefore, in order
to use the streaming functionality one needs to use a filter at the end of the pipeline which requests
adjacent regions of the image to be processed. In ITK, the itk::StreamingImageFilter class
is used for this purpose. However, ITK does not implement streaming from/to files. This means that
even if the pipeline has a small memory footprint, the images have to be stored in memory at least
after the read operation and before the write operation.
OTB implements read/write streaming. For the image file reading, this is transparent for the pro-
grammer, and if a streaming loop is used at the end of the pipeline, the read operation will be
streamed. For the file writing, the otb::ImageFileWriter has to be used.
The first step for performing streamed reading and writing is to include the following headers.
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
Then, as usual, a decision must be made about the type of pixel used to represent the image processed
by the pipeline.
typedef unsigned char PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > ImageType ;
We can now instantiate the types of the reader and writer. These two classes are parameterized over
the image type. We will rescale the intensities of the image as an example of an intermediate
processing step.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef itk::RescaleIntensityImageFilter<ImageType , ImageType > RescalerType;
typedef otb::ImageFileWriter <ImageType > WriterType ;
Then, we create one object of each type using the New() method and assigning the result to a
itk::SmartPointer.
ReaderType ::Pointer reader = ReaderType ::New();
RescalerType::Pointer rescaler = RescalerType::New ();
WriterType ::Pointer writer = WriterType ::New();
The name of the file to be read or written is passed with the SetFileName() method. We also choose
the range of intensities for the rescaler.
reader ->SetFileName (inputFilename);
rescaler ->SetOutputMinimum(0);
rescaler ->SetOutputMaximum(255);
writer ->SetFileName (outputFilename);
We can now connect these readers and writers to filters to create a pipeline.
rescaler ->SetInput(reader ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
We can now trigger the pipeline execution by calling the Update method on the writer.
writer ->Update ();
The writer will ask its preceding filter to provide different portions of the image. Each filter in the
pipeline will do the same until the request arrives to the reader. In this way, the pipeline will be
executed for each requested region and the whole input image will be read, processed and written
without being fully loaded in memory.
6.3.2 Explicit Streaming
The source code for this example can be found in the file
Examples/IO/ExplicitStreamingExample.cxx.
Usually, the streaming process is hidden within the pipeline. This allows the user to get rid of the
annoying task of splitting the images into tiles, and so on. However, for some kinds of processing,
we do not really need a pipeline: no writer is needed, only read access to pixel values is wanted.
In these cases, one has to explicitly set up the streaming procedure. Fortunately, OTB offers a high
level of abstraction for this task. We will need to include the following header files:
#include "itkImageRegionSplitter.h"
#include "otbStreamingTraits.h"
The otb::StreamingTraits class manages the streaming approaches which are possible with
the image type over which it is templated. The class itk::ImageRegionSplitter is templated
over the number of dimensions of the image and will perform the actual image splitting. More
information on splitters can be found in section 28.3.
typedef otb::StreamingTraits <ImageType > StreamingTraitsType;
typedef itk::ImageRegionSplitter <2> SplitterType;
Once a region of the image is available, we will use classical region iterators to get the pixels.
typedef ImageType ::RegionType RegionType ;
typedef itk::ImageRegionConstIterator<ImageType > IteratorType;
We instantiate the image file reader, but in order to avoid reading the whole image, we call the
GenerateOutputInformation() method instead of the Update() one.
GenerateOutputInformation() makes available the information about sizes, bands, resolutions,
etc. After that, we can access the largest possible region of the input image.
ImageReaderType::Pointer reader = ImageReaderType::New();
reader ->SetFileName (infname );
reader ->GenerateOutputInformation();
RegionType largestRegion = reader ->GetOutput ()->GetLargestPossibleRegion();
We now set up the local streaming capabilities by asking the streaming traits to compute the number
of regions to split the image into, given the splitter, the user-defined number of lines, and the input
image information.
SplitterType::Pointer splitter = SplitterType::New ();
unsigned int numberOfStreamDivisions =
StreamingTraitsType::CalculateNumberOfStreamDivisions(
reader ->GetOutput (),
largestRegion ,
splitter ,
otb::SET_BUFFER_NUMBER_OF_LINES,
0, 0, nbLinesForStreaming);
We can now get the split regions and iterate through them.
unsigned int piece = 0;
RegionType streamingRegion;
for (piece = 0;
piece < numberOfStreamDivisions;
piece++)
{
We get the region
streamingRegion =
splitter ->GetSplit(piece , numberOfStreamDivisions , largestRegion);
std::cout << "Processing region: " << streamingRegion << std::endl;
We ask the reader to provide the region.
reader ->GetOutput ()->SetRequestedRegion(streamingRegion);
reader ->GetOutput ()->PropagateRequestedRegion();
reader ->GetOutput ()->UpdateOutputData();
We declare an iterator and walk through the region.
IteratorType it(reader->GetOutput(), streamingRegion);
it.GoToBegin();
while (!it.IsAtEnd())
{
std::cout << it.Get() << std::endl;
++it;
}
} // end of the streaming loop over the split regions
6.4 Reading and Writing RGB Images
The source code for this example can be found in the file
Examples/IO/RGBImageReadWrite.cxx.
RGB images are commonly used for representing data acquired from multispectral sensors. This
example illustrates how to read and write RGB color images to and from a file. This requires the
following headers as shown.
#include "itkRGBPixel .h"
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
The itk::RGBPixel class is templated over the type used to represent each one of the red, green
and blue components. A typical instantiation of the RGB image class might be as follows.
typedef itk::RGBPixel <unsigned char> PixelType ;
typedef otb::Image <PixelType , 2> ImageType ;
The image type is used as a template parameter to instantiate the reader and writer.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
The filenames of the input and output files must be provided to the reader and writer respectively.
reader ->SetFileName (inputFilename);
writer ->SetFileName (outputFilename);
Finally, execution of the pipeline can be triggered by invoking the Update() method in the writer.
writer ->Update();
You may have noticed that apart from the declaration of the PixelType there is nothing in this code
that is specific for RGB images. All the actions required to support color images are implemented
internally in the itk::ImageIO objects.
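As a side note, the individual color components of an itk::RGBPixel can be accessed through its
GetRed(), GetGreen() and GetBlue() methods. The following fragment is a small sketch, not part of
the original example, which assumes the reader has been updated and samples one arbitrarily
chosen pixel.
reader->Update(); // make sure the image is available in memory
ImageType::Pointer image = reader->GetOutput();
ImageType::IndexType index;
index[0] = 10; // arbitrary position, assumed to lie inside the image
index[1] = 20;
PixelType pixel = image->GetPixel(index);
std::cout << "R = " << (int) pixel.GetRed()
          << " G = " << (int) pixel.GetGreen()
          << " B = " << (int) pixel.GetBlue() << std::endl;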
6.5 Reading, Casting and Writing Images
The source code for this example can be found in the file
Examples/IO/ImageReadCastWrite.cxx.
Given that ITK and OTB are based on the Generic Programming paradigm, most of the types are
defined at compilation time. It is sometimes important to anticipate conversion between different
types of images. The following example illustrates the common case of reading an image of one
pixel type and writing it on a different pixel type. This process not only involves casting but also
rescaling the image intensity since the dynamic range of the input and output pixel types can be quite
different. The itk::RescaleIntensityImageFilter is used here to linearly rescale the image
values.
The first step in this example is to include the appropriate headers.
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkRescaleIntensityImageFilter.h"
Then, as usual, a decision should be made about the pixel type that should be used to represent the
images. Note that when reading an image, this pixel type is not necessarily the pixel type of the
image stored in the file. Instead, it is the type that will be used to store the image as soon as it is read
into memory.
typedef float InputPixelType;
typedef unsigned char OutputPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
We can now instantiate the types of the reader and writer. These two classes are parameterized over
the image type.
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
Below we instantiate the RescaleIntensityImageFilter class that will linearly scale the image inten-
sities.
typedef itk::RescaleIntensityImageFilter<
InputImageType ,
OutputImageType > FilterType ;
A filter object is constructed and the minimum and maximum values of the output are selected using
the SetOutputMinimum() and SetOutputMaximum() methods.
FilterType ::Pointer filter = FilterType ::New ();
filter ->SetOutputMinimum(0);
filter ->SetOutputMaximum(255);
Then, we create the reader and writer and connect the pipeline.
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
The names of the files to be read and written are passed with the SetFileName() method.
reader ->SetFileName (inputFilename);
writer ->SetFileName (outputFilename);
Finally we trigger the execution of the pipeline with the Update() method on the writer. The output
image will then be the scaled and cast version of the input image.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}
6.6 Extracting Regions
The source code for this example can be found in the file
Examples/IO/ImageReadRegionOfInterestWrite.cxx.
This example should arguably be placed in the filtering chapter. However, its usefulness for typical
IO operations makes it interesting to mention here. The purpose of this example is to read an
image, extract a subregion and write this subregion to a file. This is a common task when we want
to apply a computationally intensive method to the region of interest of an image.
As usual with OTB IO, we begin by including the appropriate header files.
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
The otb::ExtractROI is the filter used to extract a region from an image. Its header is included
below.
#include "otbExtractROI.h"
Image types are defined below.
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
The types for the otb::ImageFileReader and otb::ImageFileWriter are instantiated using the
image types.
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The ExtractROI type is instantiated using the input and output pixel types. Using the pixel
types as template parameters, instead of the image types, restricts the use of this class
to otb::Images with scalar pixel types. See section 6.8.1 for the extraction of
ROIs on otb::VectorImages. A filter object is created with the New() method and assigned to an
itk::SmartPointer.
typedef otb::ExtractROI <InputImageType::PixelType ,
OutputImageType::PixelType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
The ExtractROI requires a region to be defined by the user. This is done by defining a rectangle
with the following methods (the filter assumes that a 2D image is being processed, for N-D region
extraction, you can use the itk::RegionOfInterestImageFilter class).
filter ->SetStartX (atoi(argv[3]));
filter ->SetStartY (atoi(argv[4]));
filter ->SetSizeX(atoi(argv[5]));
filter ->SetSizeY(atoi(argv[6]));
Below, we create the reader and writer using the New() method and assigning the result to a Smart-
Pointer.
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
The name of the file to be read or written is passed with the SetFileName() method.
reader ->SetFileName (inputFilename);
writer ->SetFileName (outputFilename);
Below we connect the reader, filter and writer to form the data processing pipeline.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
Finally we execute the pipeline by invoking Update() on the writer. The call is placed in a try/catch
block in case exceptions are thrown.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}
6.7 Reading and Writing Vector Images
Images whose pixel type is a Vector, a CovariantVector, an Array, or a Complex are quite common
in image processing. One of the uses of these types of images is the processing of SLC SAR images,
which are complex.
6.7.1 Reading and Writing Complex Images
The source code for this example can be found in the file
Examples/IO/ComplexImageReadWrite.cxx.
This example illustrates how to read and write an image of pixel type std::complex. The complex
type is defined as an integral part of the C++ language.
We start by including the headers of the complex class, the image, and the reader and writer classes.
#include <complex>
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
The image dimension and pixel type must be declared. In this case we use the std::complex<> as
the pixel type. Using the dimension and pixel type we proceed to instantiate the image type.
const unsigned int Dimension = 2;
typedef std::complex <float> PixelType ;
typedef otb::Image <PixelType , Dimension > ImageType ;
The image file reader and writer types are instantiated using the image type. We can then create
objects for both of them.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
Filenames should be provided for both the reader and the writer. In this particular example we take
those filenames from the command line arguments.
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
Here we simply connect the output of the reader as input to the writer. This simple program could
be used for converting complex images from one file format to another.
writer ->SetInput(reader ->GetOutput ());
The execution of this short pipeline is triggered by invoking the Update() method of the writer. This
invocation must be placed inside a try/catch block since its execution may result in exceptions being
thrown.
try
{
writer ->Update ();
}
catch (itk::ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}
For a more interesting use of this code, you may want to add a filter in between the reader and the
writer and perform any complex image to complex image operation.
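For instance, assuming the itk::ComplexToModulusImageFilter is available in your ITK version, a
small variation of this program could write the modulus of the complex image instead. Note that the
output image, and therefore the writer, is then instantiated over a real-valued pixel type; this is only
a sketch, not part of the original example, and the output file name is hypothetical.
#include "itkComplexToModulusImageFilter.h"
// Illustrative sketch: write the modulus of the complex image.
typedef otb::Image<float, Dimension> ModulusImageType;
typedef itk::ComplexToModulusImageFilter<ImageType, ModulusImageType> ModulusFilterType;
typedef otb::ImageFileWriter<ModulusImageType> ModulusWriterType;
ModulusFilterType::Pointer modulus = ModulusFilterType::New();
ModulusWriterType::Pointer modulusWriter = ModulusWriterType::New();
modulus->SetInput(reader->GetOutput());
modulusWriter->SetInput(modulus->GetOutput());
modulusWriter->SetFileName("modulus.tif"); // hypothetical output file name
modulusWriter->Update();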
6.8 Reading and Writing Multiband Images
The source code for this example can be found in the file
Examples/IO/MultibandImageReadWrite.cxx.
The otb::Image class with a vector pixel type could be used for representing multispectral images,
with one band per vector component. However, this is not practical, since the dimensionality
of the vector must be known at compile time. OTB offers the otb::VectorImage, where the
dimensionality of the vector stored for each pixel can be chosen at runtime. This is needed by the
image file readers in order to dynamically set the number of bands of an image read from a file.
The OTB Readers and Writers are able to deal with otb::VectorImages transparently for the user.
The first step for performing reading and writing is to include the following headers.
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
Then, as usual, a decision must be made about the type of pixel used to represent the image processed
by the pipeline. The pixel type corresponds to the scalar type stored in the vector components.
Therefore, for a multiband Pléiades image we will do:
typedef unsigned short PixelType ;
const unsigned int Dimension = 2;
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
We can now instantiate the types of the reader and writer. These two classes are parameterized over
the image type.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
Then, we create one object of each type using the New() method and assigning the result to a
itk::SmartPointer.
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
The name of the file to be read or written is passed with the SetFileName() method.
reader ->SetFileName (inputFilename);
writer ->SetFileName (outputFilename);
We can now connect these readers and writers to filters to create a pipeline. The only thing to take
care of when executing the program is choosing an output image file format which supports multiband
images.
writer ->SetInput(reader ->GetOutput ());
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}
6.8.1 Extracting ROIs
The source code for this example can be found in the file
Examples/IO/ExtractROI.cxx.
This example shows the use of the otb::MultiChannelExtractROI and
otb::MultiToMonoChannelExtractROI filters, which allow the extraction of ROIs from multiband
images stored in otb::VectorImages. The first one provides a vector image as output, while the
second one provides a classical otb::Image with a scalar pixel type. The present example shows
how to extract a ROI from a 4-band SPOT 5 image in order to produce a multi-band 3-channel
image and a mono-channel one for the SWIR band.
We start by including the needed header files.
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbMultiChannelExtractROI.h"
#include "otbMultiToMonoChannelExtractROI.h"
The program arguments define the image file names as well as the rectangular area to be extracted.
const char * inputFilename = argv[1];
const char * outputFilenameRGB = argv[2];
const char * outputFilenameMIR = argv[3];
unsigned int startX((unsigned int) ::atoi(argv[4]));
unsigned int startY((unsigned int) ::atoi(argv[5]));
unsigned int sizeX((unsigned int) ::atoi(argv[6]));
unsigned int sizeY((unsigned int) ::atoi(argv[7]));
As usual, we define the input and output pixel types.
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
First of all, we extract the multiband part by using the otb::MultiChannelExtractROI class,
which is templated over the input and output pixel types. This class is not templated over the
image types, in order to force these images to be of otb::VectorImage type.
typedef otb::MultiChannelExtractROI <InputPixelType ,
OutputPixelType > ExtractROIFilterType;
We create the extractor filter by using the New method of the class and we set its parameters.
ExtractROIFilterType::Pointer extractROIFilter = ExtractROIFilterType::New();
extractROIFilter ->SetStartX (startX);
extractROIFilter ->SetStartY (startY);
extractROIFilter ->SetSizeX (sizeX);
extractROIFilter ->SetSizeY (sizeY);
We must tell the filter which channels to use. When selecting contiguous bands, we can use
the SetFirstChannel() and SetLastChannel() methods. Otherwise, we select individual channels
using the SetChannel() method.
extractROIFilter ->SetFirstChannel(1);
extractROIFilter ->SetLastChannel(3);
We will use the OTB readers and writers for file access.
typedef otb::ImageFileReader <ExtractROIFilterType::InputImageType > ReaderType ;
typedef otb::ImageFileWriter <ExtractROIFilterType::InputImageType > WriterType ;
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
Since the number of bands of the input image is dynamically set at runtime, the
UpdateOutputInformation method of the reader must be called before using the extractor filter.
reader ->SetFileName (inputFilename);
reader ->UpdateOutputInformation();
writer ->SetFileName (outputFilenameRGB);
We can then build the pipeline as usual.
extractROIFilter ->SetInput (reader ->GetOutput ());
writer ->SetInput(extractROIFilter ->GetOutput ());
And execute the pipeline by calling the Update method of the writer.
writer ->Update();
The usage of the otb::MultiToMonoChannelExtractROI is similar to the one of the
otb::MultiChannelExtractROI described above.
The goal now is to extract a ROI from a multi-band image and generate a mono-channel image as
output.
We could use the otb::MultiChannelExtractROI and select a single channel, but using
the otb::MultiToMonoChannelExtractROI we generate an otb::Image instead of an
otb::VectorImage. This is useful from a computing and memory usage point of view. This class
is also templated over the pixel types.
typedef otb::MultiToMonoChannelExtractROI<InputPixelType ,
OutputPixelType >
ExtractROIMonoFilterType;
For this filter, only one output channel has to be selected.
extractROIMonoFilter ->SetChannel (4);
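The rest of the wiring is not reproduced in this excerpt; a minimal, consolidated sketch of the mono-channel pipeline (including the call shown above, with a hypothetical monoWriter added and the same reader and ROI parameters reused) could look like this:

ExtractROIMonoFilterType::Pointer extractROIMonoFilter = ExtractROIMonoFilterType::New();
extractROIMonoFilter->SetStartX(startX);
extractROIMonoFilter->SetStartY(startY);
extractROIMonoFilter->SetSizeX(sizeX);
extractROIMonoFilter->SetSizeY(sizeY);
extractROIMonoFilter->SetChannel(4); // SWIR band of the SPOT 5 image
extractROIMonoFilter->SetInput(reader->GetOutput());

typedef otb::ImageFileWriter<ExtractROIMonoFilterType::OutputImageType> MonoWriterType;
MonoWriterType::Pointer monoWriter = MonoWriterType::New();
monoWriter->SetFileName(outputFilenameMIR);
monoWriter->SetInput(extractROIMonoFilter->GetOutput());
monoWriter->Update();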
Figure 6.4: Quicklook of the original SPOT 5 image.
Figure 6.5: Result of the extraction. Left: 3-channel image. Right: mono-band image.
Figure 6.5 illustrates the result of the application of both extraction filters on the image presented in
figure 6.4.
6.9 Reading Image Series
The source code for this example can be found in the file
Examples/IO/ImageSeriesIOExample.cxx.
This example shows how to read a list of images and concatenate them into a vector image. We
will write a program which is able to perform this operation taking advantage of the streaming
functionalities of the processing pipeline. We will assume that all the input images have the same
size and a single band.
The following header files will be needed:
#include "otbImage.h"
#include "otbVectorImage.h"
#include "otbImageFileReader.h"
#include "otbObjectList.h"
#include "otbImageList.h"
#include "otbImageListToVectorImageFilter.h"
#include "otbImageFileWriter.h"
We will start by defining the types for the input images and the associated readers.
typedef unsigned short int PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > InputImageType;
typedef otb::ImageFileReader <InputImageType > ImageReaderType;
We will use a list of image file readers in order to open all the input images at once. For this, we use
the otb::ObjectList object and we template it over the type of the readers.
typedef otb::ObjectList <ImageReaderType > ReaderListType;
ReaderListType::Pointer readerList = ReaderListType::New ();
We will also build a list of input images in order to store the smart pointers obtained at the output of
each reader. This allows us to build the pipeline without actually reading the images, which would
use a lot of RAM. The otb::ImageList object will be used.
typedef otb::ImageList <InputImageType > ImageListType;
ImageListType::Pointer imageList = ImageListType::New();
We can now loop over the input image list in order to populate the reader list and the input image
list.
for (unsigned int i = 0; i < NbImages; ++i)
{
ImageReaderType::Pointer imageReader = ImageReaderType::New ();
imageReader ->SetFileName (argv[i + 2]);
std::cout << "Adding image " << argv[i + 2] << std::endl;
imageReader ->UpdateOutputInformation();
imageList ->PushBack (imageReader ->GetOutput ());
readerList ->PushBack(imageReader );
}
All the input images will be concatenated into a single output vector image. For this matter, we will
use the otb::ImageListToVectorImageFilter which is templated over the input image list type
and the output vector image type.
typedef otb::VectorImage <PixelType , Dimension > VectorImageType;
typedef otb::ImageListToVectorImageFilter<ImageListType , VectorImageType >
ImageListToVectorImageFilterType;
ImageListToVectorImageFilterType::Pointer iL2VI =
ImageListToVectorImageFilterType::New();
We plug the image list as input of the filter and use an otb::ImageFileWriter to write the result
image to a file, so that the streaming capabilities of all the readers and the filter are used.
iL2VI ->SetInput(imageList );
typedef otb::ImageFileWriter <VectorImageType > ImageWriterType;
ImageWriterType::Pointer imageWriter = ImageWriterType::New();
imageWriter ->SetFileName (argv[1]);
We can tune the size of the image tiles, so that the total memory footprint of the pipeline is constant
for any execution of the program.
unsigned long memoryConsumptionInMB = 10;
std::cout << "Memory consumption : " << memoryConsumptionInMB << std::endl;
imageWriter ->SetAutomaticTiledStreaming(memoryConsumptionInMB);
imageWriter ->SetInput (iL2VI ->GetOutput ());
imageWriter ->Update();
6.10 Extended filename for reader and writer
6.10.1 Syntax
The reader and writer extended file name support is based on the same syntax; only the options are
different. To benefit from the extended file name mechanism, the following syntax is to be used:
Path/Image.ext?&key1=<value1>&key2=<value2>
IMPORTANT: Note that you'll probably need to quote the filename on the command line (for example
with double quotes), so that the shell does not interpret the special characters.
6.10.2 Reader options
Available Options:
• &geom=<path/filename.geom>
– Contains the file name of a valid geom file
– Use the content of the specified geom file instead of the image-embedded geometric information
– Empty by default; the image-embedded information is used if available
• &sdataidx=<(int)idx>
– Select the sub-dataset to read
– 0 by default
• &resol=<(int)resolution factor>
– Select the JPEG2000 sub-resolution image to read
– 0 by default
• &skipcarto=<(bool)true>
– Skip the cartographic information
– Clears the projectionref, sets the origin to [0,0] and the spacing to
[1/max(1,resolution factor),1/max(1,resolution factor)]
– Keeps the keyword list
– false by default
• &skipgeom=<(bool)true>
– Skip geometric information
– Clears the keyword list
– Keeps the projectionref and the origin/spacing informations
– false by default.
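For instance, assuming an otb::ImageFileReader named reader as in the previous examples, an extended filename could combine these reader options (the filenames below are placeholders):

// Placeholder filenames: use an external geom file and skip the cartographic information.
reader->SetFileName("image.tif?&geom=image.geom&skipcarto=true");
reader->UpdateOutputInformation();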
6.10.3 Writer options
Available Options:
• &writegeom=<(bool)false>
– To activate writing of external geom file
– true by default
• &gdal:co:<GDALKEY>=<VALUE>
– To specify a gdal creation option
– For gdal creation option information, see dedicated gdal documentation
– None by default
• &streaming:type=<VALUE>
– Activates configuration of streaming through extended filenames
– Overrides any previous configuration of streaming
– Allows configuring the kind of streaming to perform
– Available values are:
∗ auto : tiled or stripped streaming mode chosen automatically depending on TileHint
read from input files
∗ tiled : tiled streaming mode
∗ stripped : stripped streaming mode
∗ none : explicitly deactivate streaming
– Not set by default
• &streaming:sizemode=<VALUE>
– Allows choosing how the size of the streaming pieces is computed
– Available values are:
∗ auto : size is estimated from the available memory setting by evaluating pipeline
memory print
∗ height : size is set by setting height of strips or tiles
∗ nbsplits : size is computed from a given number of splits
– Default is auto
• &streaming:sizevalue=<VALUE>
– Parameter for size of streaming pieces computation
– Value is :
∗ if sizemode=auto : available memory in Mb
∗ if sizemode=height : height of the strip or tile in pixels
∗ if sizemode=nbsplits : number of requested splits for streaming
– If not provided, the default value is set to 0, which results in different behaviours depending
on sizemode (if set to height or nbsplits, streaming is deactivated; if set to auto, the value is
fetched from the configuration or the CMake configuration file)
• &box=<startx>:<starty>:<sizex>:<sizey>
– User defined parameters of output image region
– The region must be set with 4 unsigned integers (the separator used is the colon ’:’).
Values are:
∗ startx: first index on X
∗ starty: first index on Y
∗ sizex: size along X
∗ sizey: size along Y
– The definition of the region follows the same convention as the itk::Region definition in
C++. A region is defined by two classes: the itk::Index and itk::Size classes. The origin
of the region within the image with which it is associated is defined by the Index.
The available syntax for boolean options is:
• ON, On, on, true, True, 1 are available for setting a ’true’ boolean value
• OFF, Off, off, false, False, 0 are available for setting a ’false’ boolean value
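Similarly, for a writer, several options can be combined in one extended filename; a small sketch with placeholder values (TILED=YES is a standard GDAL GeoTIFF creation option):

// Placeholder output filename: GDAL creation option, automatic streaming limited to
// roughly 256 MB of memory, and no external geom file written.
writer->SetFileName("output.tif?&gdal:co:TILED=YES"
                    "&streaming:type=auto&streaming:sizemode=auto&streaming:sizevalue=256"
                    "&writegeom=false");
writer->Update();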
CHAPTER SEVEN
READING AND WRITING AUXILIARY DATA
As we have seen in the previous chapter, OTB has a great capability to read and process images.
However, images are not the only type of data we will need to manipulate. Images are characterized
by a regular sampling grid. For some data, such as Digital Elevation Models (DEM) or Lidar, this is
too restrictive and we need other representations.
Vector data are also used to represent cartographic objects, segmentation results, etc.: basically,
everything which can be seen as points, lines or polygons. OTB provides functionalities for accessing
this kind of data.
7.1 Reading DEM Files
The source code for this example can be found in the file
Examples/IO/DEMToImageGenerator.cxx.
The following example illustrates the use of the otb::DEMToImageGenerator class. The aim of
this class is to generate an image from SRTM data, given the latitude and longitude of the extraction
starting point. Each pixel is a geographic point and its intensity is the altitude of the point. If SRTM
has no altitude information for a point, the altitude value is set to -32768 (the no-data value of the
SRTM standard).
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the otb::DEMToImageGenerator class must be included.
#include "otbDEMToImageGenerator.h"
The image type is now defined using pixel type and dimension. The output image is defined as an
otb::Image.
const unsigned int Dimension = 2;
typedef otb::Image <double, Dimension > ImageType ;
The DEMToImageGenerator is defined using the image pixel type as a template parameter. After
that, the object can be instantiated.
typedef otb::DEMToImageGenerator <ImageType > DEMToImageGeneratorType;
DEMToImageGeneratorType::Pointer object = DEMToImageGeneratorType::New ();
Input parameter types are defined to set the value in the otb::DEMToImageGenerator.
typedef DEMToImageGeneratorType::SizeType SizeType;
typedef DEMToImageGeneratorType::SpacingType SpacingType ;
typedef DEMToImageGeneratorType::PointType PointType ;
The path to the DEM folder is given to the otb::DEMHandler.
otb::DEMHandler ::Instance ()->OpenDEMDirectory(folderPath );
The origin (Longitude/Latitude) of the output image in the DEM is given to the filter.
PointType origin;
origin[0] = ::atof(argv[3]);
origin[1] = ::atof(argv[4]);
object ->SetOutputOrigin(origin);
The size (in pixels) of the output image is given to the filter.
SizeType size;
size[0] = ::atoi(argv[5]);
size[1] = ::atoi(argv[6]);
object ->SetOutputSize(size);
The spacing (step between two consecutive pixels) is given to the filter. By default, this spacing is
set to 0.001.
SpacingType spacing;
spacing [0] = ::atof(argv[7]);
spacing [1] = ::atof(argv[8]);
object ->SetOutputSpacing(spacing );
The output image name is given to the writer and the filter output is linked to the writer input.
writer ->SetFileName (outputName );
writer ->SetInput(object ->GetOutput ());
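The writer used above is assumed to have been declared earlier in the example; a minimal sketch of the missing declarations, consistent with the other examples of this guide:

typedef otb::ImageFileWriter<ImageType> WriterType;
WriterType::Pointer writer = WriterType::New();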
Figure 7.1: DEMToImageGenerator image.
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
recommended to place update calls in a try/catch block in case errors occur and exceptions are
thrown.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& err)
{
std::cout << "Exception itk:: ExceptionObject thrown !" << std::endl;
std::cout << err << std::endl;
return EXIT_FAILURE;
}
Let's now run this example using as input the SRTM data contained in the DEM srtm folder. Figure
7.1 shows the obtained DEM. Invalid data values (hidden areas due to SAR shadowing) are set to
zero.
7.2 Elevation management with OTB
The source code for this example can be found in the file
Examples/IO/DEMHandlerExample.cxx.
OTB relies on OSSIM for elevation handling. Since release 3.16, there is a single configuration
class, otb::DEMHandler, to manage elevation (in image projections or localization functions, for
example). This configuration is managed by a proper instantiation and parameter setting of
this class. These instantiations must be done before any call to geometric filters or functionalities.
OSSIM internal accesses to elevation are also configured by this class, which ensures consistency
throughout the library.
This class is a singleton; the New() method is deprecated and will be removed in a future release. We
need to use the Instance() method instead.
otb::DEMHandler ::Pointer demHandler = otb ::DEMHandler ::Instance ();
It allows configuring a directory containing DEM tiles (DTED or SRTM supported) using the
OpenDEMDirectory() method. The OpenGeoidFile() method allows providing a geoid file as well.
Lastly, a default height above ellipsoid can be set using the SetDefaultHeightAboveEllipsoid()
method.
demHandler ->SetDefaultHeightAboveEllipsoid(defaultHeight);
if(!demHandler ->IsValidDEMDirectory(demdir.c_str()))
{
std::cerr <<"IsValidDEMDirectory("<<demdir <<") = false"<<std ::endl;
fail = true;
}
demHandler ->OpenDEMDirectory(demdir);
demHandler ->OpenGeoidFile(geoid);
We can now retrieve height above ellipsoid or height above Mean Sea Level (MSL) using the meth-
ods GetHeightAboveEllipsoid() and GetHeightAboveMSL(). Outputs of these methods depend
on the configuration of the class otb::DEMHandler and the different cases are:
For GetHeightAboveEllipsoid():
• DEM and geoid both available: dem value + geoid offset
• No DEM but geoid available: geoid offset
• DEM available, but no geoid: dem value
• No DEM and no geoid available: default height above ellipsoid
For GetHeightAboveMSL():
• DEM and geoid both available: srtm value
• No DEM but geoid available: 0
• DEM available, but no geoid: srtm value
• No DEM and no geoid available: 0
otb::DEMHandler :: PointType point;
point[0] = longitude ;
point[1] = latitude ;
double height = -32768;
height = demHandler ->GetHeightAboveMSL(point);
std::cout <<"height above MSL ("<<longitude <<","
<<latitude <<") = "<<height <<" meters"<<std::endl;
height = demHandler ->GetHeightAboveEllipsoid(point);
std::cout <<"height above ellipsoid ("<<longitude
<<", "<<latitude <<") = "<<height <<" meters"<<std::endl;
Note that OSSIM internal calls for sensor modelling use the height above ellipsoid, and follow the
same logic as the GetHeightAboveEllipsoid() method.
More examples about representing DEM are presented in section 24.2.4.
7.3 Lidar data Files
The source code for this example can be found in the file
Examples/IO/LidarReaderExample.cxx.
This example describes how to read a lidar data file and display basic information about it.
Here, OTB makes use of libLAS.
The first step toward the use of these filters is to include the proper header files.
#include "otbPointSetFileReader.h"
#include "itkPointSet .h"
#include <fstream>
int main(int argc , char * argv[])
{
We now need to declare the data types that we will be using and instantiate the reader (which is an
otb::PointSetFileReader).
typedef itk::PointSet <double, 2> PointSetType;
typedef otb::PointSetFileReader <PointSetType > PointSetFileReaderType;
PointSetFileReaderType::Pointer reader = PointSetFileReaderType::New();
reader ->SetFileName (argv[1]);
reader ->Update();
We can already display some interesting information such as the data extent and the number of
points:
std::cout << "*** Data area *** " << std::endl;
std::cout << std:: setprecision(15);
std::cout << " - Easting min: " << reader ->GetMinX () << std::endl;
std::cout << " - Easting max: " << reader ->GetMaxX () << std::endl;
std::cout << " - Northing min: " << reader ->GetMinY () << std::endl;
std::cout << " - Northing max: " << reader ->GetMaxY () << std::endl;
std::cout << "Data points: " << reader ->GetNumberOfPoints() << std::endl;
We can also loop over the points to see each point with its coordinates and value. Be careful here:
data sets can have hundreds of thousands of points:
PointSetType::Pointer data = reader ->GetOutput ();
unsigned long nPoints = data ->GetNumberOfPoints();
for (unsigned long i = 0; i < nPoints; ++i)
{
PointSetType::PointType point;
data ->GetPoint(i, &point);
std::cout << point << " : ";
PointSetType::PixelType value =
itk::NumericTraits <PointSetType::PixelType >:: Zero;
data ->GetPointData(i, &value);
std::cout << value << std::endl;
}
return EXIT_SUCCESS;
}
The source code for this example can be found in the file
Examples/IO/LidarToImageExample.cxx.
This example describes how to convert a point set obtained from lidar data to an image file. A lidar
produces a point set which is irregular in terms of spatial sampling. To be able to generate an image,
an interpolation is required.
The interpolation is done using the itk::BSplineScatteredDataPointSetToImageFilter
which uses BSplines. The method is fully described in [135] and [87].
The first step toward the use of these filters is to include the proper header files.
#include "itkPointSet .h"
#include "otbImage.h"
#include "otbPointSetFileReader.h"
#include "itkBSplineScatteredDataPointSetToImageFilter.h"
#include "otbImageFileWriter.h"
An important note here: the itk::BSplineScatteredDataPointSetToImageFilter is in the
Review directory of ITK, which means that it won't be compiled by default. If you want to use it,
you have to set ITK_USE_REVIEW to ON when configuring OTB with ccmake.
Then, we declare the type of the data that we are going to use. As lidar describes the altitude of the
point (often to centimeter accuracy), it is required to use a real type to represent it.
typedef double RealType ;
typedef itk::Vector <RealType , 1> VectorType ;
typedef itk::PointSet <VectorType , 2> PointSetType;
typedef otb::Image <RealType , 2> ImageType ;
typedef otb::Image <VectorType , 2> VectorImageType;
Lidar data are read into a point set using the otb::PointSetFileReader. Its usage is very similar
to the otb::ImageFileReader:
typedef otb::PointSetFileReader <PointSetType > PointSetFileReaderType;
PointSetFileReaderType::Pointer reader = PointSetFileReaderType::New();
reader ->SetFileName (argv[1]);
reader ->Update();
We can now prepare the parameters to pass to the interpolation filter: you have to be aware that the
origin of the image is at the upper left corner and thus corresponds to the minimum easting but the
maximum northing.
double resolution = atof(argv[4]);
int splineOrder = atoi(argv[5]);
int level = atoi(argv[6]);
ImageType ::IndexType start;
start[0] = 0;
start[1] = 0;
ImageType ::SizeType size;
size[0] = static_cast<long int>(ceil(
(vcl_ceil(reader ->GetMaxX ()) -
vcl_floor (reader ->GetMinX ()) +
1) / resolution
)) + 1;
size[1] = static_cast<long int>(ceil(
(vcl_ceil(reader ->GetMaxY ()) -
vcl_floor (reader ->GetMinY ()) +
1) / resolution
)) + 1;
ImageType ::PointType origin;
origin[0] = reader ->GetMinX ();
origin[1] = reader ->GetMaxY ();
ImageType :: SpacingType spacing;
spacing [0] = resolution ;
spacing [1] = -resolution ;
All these parameters are passed to the interpolation filter:
typedef itk:: BSplineScatteredDataPointSetToImageFilter
<PointSetType , VectorImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
filter ->SetSplineOrder(splineOrder );
FilterType ::ArrayType ncps;
ncps.Fill(6);
filter ->SetNumberOfControlPoints(ncps);
filter ->SetNumberOfLevels(level);
filter ->SetOrigin (origin);
filter ->SetSpacing (spacing );
filter ->SetSize(size);
filter ->SetInput(reader ->GetOutput ());
filter ->Update();
The result of this filter is an image in which every pixel is a vector (with only one element). For
now, the OTB writer does not know how to process that (hopefully soon!), so we have to manually
copy each element into a standard image:
typedef otb::Image <RealType , 2> RealImageType;
RealImageType::Pointer image = RealImageType::New ();
ImageType ::RegionType region;
region.SetSize(size);
region.SetIndex(start);
image ->SetRegions (region);
image ->Allocate ();
itk::ImageRegionIteratorWithIndex<RealImageType >
Itt(image , image ->GetLargestPossibleRegion());
for (Itt.GoToBegin (); !Itt.IsAtEnd (); ++Itt)
{
Itt.Set(filter ->GetOutput ()->GetPixel (Itt.GetIndex ())[0]);
}
Everything is ready so we can just write the image:
typedef otb::ImageFileWriter <ImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(image);
writer ->SetFileName (argv[2]);
writer ->Update();
Figure 7.2 shows the output images obtained with two sets of parameters.
7.4 Reading and Writing Shapefiles and KML
The source code for this example can be found in the file
Examples/IO/VectorDataIOExample.cxx.
Unfortunately, many vector data formats do not share the models for the data they represent. However,
in some cases, when simple data is stored, it can be decomposed into simple objects such as
polylines, polygons and points. This is the case for the Shapefile and the KML (Keyhole
Markup Language) formats, for instance.
Figure 7.2: Image obtained with 4 level spline interpolation (left) and 8 levels (right)
Even though specific readers/writers for Shapefile and the Google KML format are available in OTB, we
designed a generic approach for the IO of this kind of data.
The reader/writer for VectorData in OTB is able to access a variety of vector file formats (all OGR
supported formats).
In section 11.4, you will find more information on how projections work for the vector data and how
you can export the results obtained with OTB to the real world.
This example illustrates the use of OTB’s vector data IO framework.
We will start by including the header files for the classes describing the vector data and the corre-
sponding reader and writer.
#include "otbVectorData.h"
#include "otbVectorDataFileReader.h"
#include "otbVectorDataFileWriter.h"
We will also need to include the header files for the classes which model the individual objects that
we get from the vector data structure.
#include "itkPreOrderTreeIterator.h"
#include "otbObjectList.h"
#include "otbPolygon .h"
We define the types for the vector data structure and the corresponding file reader.
typedef otb::VectorData <PixelType , 2> VectorDataType;
typedef otb::VectorDataFileReader <VectorDataType >
VectorDataFileReaderType;
We can now instantiate the reader and read the data.
VectorDataFileReaderType::Pointer reader = VectorDataFileReaderType::New();
reader ->SetFileName (argv[1]);
reader ->Update();
The vector data obtained from the reader will provide a tree of nodes containing the actual objects
of the scene. This tree will be accessed using an itk::PreOrderTreeIterator.
typedef VectorDataType:: DataTreeType DataTreeType;
typedef itk::PreOrderTreeIterator <DataTreeType > TreeIteratorType;
In this example we will only read polygon objects from the input file before writing them to the
output file. We define the type for the polygon object as well as an iterator to the vertices. The
polygons obtained will be stored in an otb::ObjectList.
typedef otb::Polygon <double> PolygonType ;
typedef PolygonType ::VertexListConstIteratorType PolygonIteratorType;
typedef otb::ObjectList <PolygonType > PolygonListType;
typedef PolygonListType::Iterator PolygonListIteratorType;
PolygonListType::Pointer polygonList = PolygonListType::New();
We get the data tree and instantiate an iterator to walk through it.
TreeIteratorType it(reader ->GetOutput ()->GetDataTree ());
it.GoToBegin ();
We check that the current object is a polygon using the IsPolygonFeature() method and get its
exterior ring in order to store it into the list.
while (!it.IsAtEnd ())
{
if (it.Get()->IsPolygonFeature())
{
polygonList ->PushBack (it.Get()->GetPolygonExteriorRing());
}
++it;
}
Before writing the polygons to the output file, we have to build the vector data structure. This
structure will be built up of nodes. We define the types needed for that.
VectorDataType::Pointer outVectorData = VectorDataType::New();
typedef VectorDataType:: DataNodeType DataNodeType;
We fill the data structure with the nodes. The root node is a document which is composed of folders.
A list of polygons can be seen as a multi polygon object.
DataNodeType::Pointer document = DataNodeType::New ();
document ->SetNodeType (otb::DOCUMENT );
document ->SetNodeId ("polygon");
DataNodeType::Pointer folder = DataNodeType::New();
folder ->SetNodeType (otb::FOLDER);
DataNodeType::Pointer multiPolygon = DataNodeType::New();
multiPolygon ->SetNodeType (otb:: FEATURE_MULTIPOLYGON);
We assign these objects to the data tree stored by the vector data object.
DataTreeType::Pointer tree = outVectorData ->GetDataTree ();
DataNodeType::Pointer root = tree ->GetRoot()->Get ();
tree ->Add(document , root);
tree ->Add(folder , document );
tree ->Add(multiPolygon , folder);
We can now iterate through the polygon list and fill the vector data structure.
for (PolygonListType::Iterator it = polygonList ->Begin();
it != polygonList ->End(); ++it)
{
DataNodeType::Pointer newPolygon = DataNodeType::New();
newPolygon ->SetPolygonExteriorRing(it.Get ());
tree ->Add(newPolygon , multiPolygon);
}
And finally we write the vector data to a file using a generic otb::VectorDataFileWriter.
typedef otb::VectorDataFileWriter <VectorDataType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(outVectorData);
writer ->SetFileName (argv[2]);
writer ->Update();
This example can convert an ESRI Shapefile to a MapInfo file, but the same OTB source code can
also access a PostgreSQL datasource, using a connection string such as:
PG:"dbname='databasename' host='addr' port='5432' user='x' password='y'". Starting with
GDAL 1.6.0, the set of tables to be scanned can be overridden by appending tables=schema.table
to the connection string.
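As a sketch (the connection parameters below are placeholders and require an OGR build with PostgreSQL support), the same reader could be pointed at such a datasource:

// Placeholder connection string; any OGR-supported datasource name works here.
reader->SetFileName("PG:dbname='mydb' host='localhost' port='5432' "
                    "user='otbuser' password='secret' tables=public.roads");
reader->Update();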
7.5 Handling large vector data through OGR
The source code for this example can be found in the file
Examples/IO/OGRWrappersExample.cxx.
Starting with version 3.14.0 of the OTB library, a wrapper around the OGR API is provided. The
purposes of the wrapper are:
• to permit OTB to handle very large vector data sets;
• and to offer a modern (in the RAII sense) interface to handle vector data.
As OGR already provides a rich set of geometry-related data types, as well as the algorithms to manipulate
and serialize them, we decided to wrap it into a new exception-safe interface.
This example illustrates the use of OTB’s OGR wrapper framework. This program takes a source of
polygons (a shape file for instance) as input and produces a datasource of multi-polygons as output.
We will start by including the header files for the OGR wrapper classes, plus other header files that
are out of scope here.
#include "otbOGRDataSourceWrapper.h"
The following declarations will make it possible to merge the otb::ogr::Fields from each
otb::ogr::Feature into list-fields. We'll get back to this point later.
#include <string >
#include <vector >
#include <boost/variant.hpp>
#include "otbJoinContainer.h"
typedef std::vector <int> IntList_t ;
typedef std::vector <std ::string > StringList_t;
typedef std::vector <double> RealList_t ;
// TODO: handle non recognized fields
typedef boost::variant <IntList_t , StringList_t , RealList_t > AnyListField_t;
typedef std::vector <AnyListField_t > AnyListFieldList_t;
AnyListFieldList_t prepareNewFields(
OGRFeatureDefn /*const*/& defn , otb::ogr::Layer & destLayer );
void printField (
otb::ogr::Field const& field , AnyListField_t const& newListField);
void assignField (
otb::ogr::Field field , AnyListField_t const& newListFieldValue);
void pushFieldsToFieldLists(
otb::ogr::Feature const& inputFeature , AnyListFieldList_t & field);
We can first instantiate the input otb::ogr::DataSource.
otb::ogr::DataSource ::Pointer source = otb::ogr::DataSource ::New(
argv[1], otb::ogr ::DataSource ::Modes::Read);
Then, we can instantiate the output otb::ogr::DataSource and its unique otb::ogr::Layer of
multi-polygons.
otb::ogr::DataSource ::Pointer destination = otb::ogr::DataSource ::New(
argv[2], otb::ogr ::DataSource ::Modes::Update_LayerCreateOnly);
otb::ogr::Layer destLayer = destination ->CreateLayer (
argv[2], 0, wkbMultiPolygon);
The data obtained from the reader mimics the interface of OGRDataSource. To access
the geometric objects stored, we first need to iterate over the otb::ogr::Layers from the
otb::ogr::DataSource, then over the otb::ogr::Features from each layer.
// for (auto const& inputLayer : *source)
for (otb::ogr:: DataSource ::const_iterator lb=source ->begin(), le=source ->end()
; lb != le
; ++lb)
{
otb::ogr::Layer const& inputLayer = *lb;
In this example we will only read polygon objects from the input file before writing them to the
output file. As all features from a layer share the same geometric type, we can filter on the layer
geometric type.
if (inputLayer .GetGeomType () != wkbPolygon )
{
std::cout << "Warning: Ignoring layer: ";
inputLayer .PrintSelf (std::cout , 2);
continue; // skip to next layer
}
In order to prepare the fields for the new layer, we first need to extract the field definitions from the
input layer, so as to deduce the new fields of the result layer.
OGRFeatureDefn & sourceFeatureDefn = inputLayer .GetLayerDefn();
AnyListFieldList_t fields = prepareNewFields(sourceFeatureDefn , destLayer );
The result layer will contain only one feature per input layer, which stores a multi-polygon shape. All
geometric shapes are plain OGRGeometry objects.
OGRMultiPolygon destGeometry; // todo: use UniqueGeometryPtr
The transformation algorithm is as simple as aggregating all the polygons from the features from the
input layer into the destination multi-polygon geometric object.
Note that otb::ogr::Feature::GetGeometry() provides direct access to a non-mutable
OGRGeometry pointer and that OGRGeometryCollection::addGeometry() copies the received
pointer. As a consequence, the following code is optimal regarding the geometric objects
manipulated.
It is also at this point that we fetch the field values from the input features to accumulate them
into the field lists.
// for (auto const& inputFeature : inputLayer)
for (otb::ogr::Layer::const_iterator fb=inputLayer .begin(), fe=inputLayer .end()
; fb != fe
; ++fb)
{
otb::ogr::Feature const& inputFeature = *fb;
destGeometry.addGeometry (inputFeature.GetGeometry ());
pushFieldsToFieldLists(inputFeature ,fields);
} // for each feature
Then the new geometric object can be added to a new feature, which will eventually be added to the
destination layer.
otb::ogr::Feature newFeature (destLayer .GetLayerDefn());
newFeature .SetGeometry (&destGeometry); // SetGeom -> copies
Here we set the fields of the new feature with the values accumulated over the features from the input
layer.
for (size_t i=0, N=sourceFeatureDefn.GetFieldCount(); i!=N; ++i)
{
printField (newFeature [i], fields[i]);
assignField (newFeature [i], fields[i]);
}
Finally we add (with otb::ogr::Layer::CreateFeature()) the new feature to the destination
layer, and we can process the next layer from the input datasource.
destLayer .CreateFeature(newFeature ); // adds feature to the layer
} // for each layer
In order to simplify the manipulation of otb::ogr::Fields and to avoid copy-paste for each possible
case of field type, this example relies on boost.Variant.
As such, we have defined AnyListField_t as a variant type over all possible types of field.
Then, the manipulation of the variant field values is done through the templatized functions
otb::ogr::Field::SetValue<>() and otb::ogr::Field::GetValue<>(), from the various
variant-visitors.
Before using the visitors, we need to switch on the exact type of each field from the input
layers. An empty field-values container is first added to the set of field containers. Finally, the
destination layer is completed with a new field of the right deduced type.
AnyListFieldList_t prepareNewFields(
OGRFeatureDefn /*const*/& defn , otb::ogr::Layer & destLayer )
{
AnyListFieldList_t fields;
for (size_t i=0, N=defn.GetFieldCount(); i!=N; ++i)
{
const char* name = defn.GetFieldDefn(i)->GetNameRef ();
OGRFieldType type = static_cast<OGRFieldType >(-1);
switch (defn.GetFieldDefn(i)->GetType ())
{
case OFTInteger :
fields.push_back (IntList_t ());
type = OFTIntegerList;
break;
case OFTString :
fields.push_back (StringList_t());
type = OFTStringList;
break;
case OFTReal:
fields.push_back (RealList_t ());
type = OFTRealList ;
break;
default:
std::cerr << "Unsupported field type: " <<
OGRFieldDefn::GetFieldTypeName(defn.GetFieldDefn(i)->GetType ())
<< " for " << name << "n";
break;
}
OGRFieldDefn newFieldDefn(name , type); // name is duplicated here => no dangling pointer
destLayer .CreateField (newFieldDefn , false);
}
return fields;
}
The first visitor, PushVisitor(), takes the value from one field and pushes it into a container of list-variants.
The type of the field to fetch is deduced from the type of the values stored in the container. It
is called by pushFieldsToFieldLists(), which, for each field of the input feature, applies the visitor
to the corresponding container.
struct PushVisitor : boost::static_visitor <>
{
PushVisitor (otb::ogr::Field const& f) : m_f(f) {}
template <typename T> void operator()(T & container ) const
{
typedef typename T::value_type value_type ;
value_type const value = m_f.GetValue <value_type >();
container .push_back (value);
}
private:
otb::ogr ::Field const& m_f;
};
void pushFieldsToFieldLists(
otb::ogr ::Feature const& inputFeature , AnyListFieldList_t & fields)
{
// For each field
for (size_t i=0, N=inputFeature.GetSize (); i!=N; ++i)
{
otb::ogr::Field field = inputFeature[i];
boost::apply_visitor(PushVisitor (field), fields[i]);
}
}
A second simple visitor, PrintVisitor, is defined to trace the values of each field (which contains
a list of typed data: integers, strings or reals).
struct PrintVisitor : boost::static_visitor <>
{
template <typename T> void operator()(T const& container ) const
{
otb::Join(std::cout , container , ", ");
}
};
void printField (otb::ogr::Field const& field , AnyListField_t const& newListField)
{
std::cout << field.GetName () << " -> ";
boost::apply_visitor(PrintVisitor(), newListField);
}
The third visitor, SetFieldVisitor, sets the field of the destination features, which have been
accumulated in the list of typed values.
struct SetFieldVisitor : boost::static_visitor <>
{
SetFieldVisitor(otb::ogr::Field f) : m_f(f) {}
// operator() from visitors are expected to be const
template <typename T> void operator()(T const& container ) const
{
m_f.SetValue (container );
}
private:
otb::ogr::Field mutable m_f; // this is a proxy -> a reference in a sort
};
void assignField (otb::ogr::Field field , AnyListField_t const& newListFieldValue)
{
boost::apply_visitor(SetFieldVisitor(field), newListFieldValue);
}
Note that this example does not handle the case where the input layers do not share the same field
definitions.
CHAPTER EIGHT
BASIC FILTERING
This chapter introduces the most commonly used filters found in OTB. Most of these filters are
intended to process images. They will accept one or more images as input and will produce one or
more images as output. OTB is based on ITK's data pipeline architecture, in which the output of one
filter is passed as input to another filter. (See Section 3.5 on page 34 for more information.)
8.1 Thresholding
The thresholding operation is used to change or identify pixel values based on one or more specified
values (called threshold values). The following sections describe how to perform thresholding
operations using OTB.
8.1.1 Binary Thresholding
The source code for this example can be found in the file
Examples/Filtering/BinaryThresholdImageFilter.cxx.
Figure 8.1: Transfer function of the BinaryThresholdImageFilter.
This example illustrates the use of the binary threshold image filter. This filter is used to transform
an image into a binary image by changing the pixel values according to the rule illustrated in
Figure 8.1. The user defines two thresholds (Upper and Lower) and two intensity values (Inside and
Outside). For each pixel in the input image, the value of the pixel is compared with the lower and
upper thresholds. If the pixel value is inside the range defined by [Lower, Upper], the output pixel
is assigned the InsideValue. Otherwise the output pixels are assigned the OutsideValue. Thresholding
is commonly applied as the last operation of a segmentation pipeline.
The first step required to use the itk::BinaryThresholdImageFilter is to include its header file.
#include "itkBinaryThresholdImageFilter.h"
The next step is to decide which pixel types to use for the input and output images.
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
The input and output image types are now defined using their respective pixel types and dimensions.
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter type can be instantiated using the input and output image types defined above.
typedef itk::BinaryThresholdImageFilter<
InputImageType , OutputImageType > FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file. (See
Section 6 on page 99 for more information about reading and writing data.)
typedef otb::ImageFileReader <InputImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to itk::SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The image obtained with the reader is passed as input to the BinaryThresholdImageFilter.
filter ->SetInput(reader ->GetOutput ());
The method SetOutsideValue() defines the intensity value to be assigned to those pixels
whose intensities are outside the range defined by the lower and upper thresholds. The method
SetInsideValue() defines the intensity value to be assigned to pixels with intensities falling in-
side the threshold range.
filter ->SetOutsideValue(outsideValue);
filter ->SetInsideValue(insideValue );
The methods SetLowerThreshold() and SetUpperThreshold() define the range of the input
image intensities that will be transformed into the InsideValue. Note that the lower and upper
thresholds are values of the type of the input image pixels, while the inside and outside values are of
the type of the output image pixels.
filter ->SetLowerThreshold(lowerThreshold);
filter ->SetUpperThreshold(upperThreshold);
The execution of the filter is triggered by invoking the Update() method. If the filter's output has
been passed as input to subsequent filters, the Update() call on any downstream filter in the pipeline
will indirectly trigger the update of this filter.
filter ->Update();
Figure 8.2 illustrates the effect of this filter on a ROI of a Spot 5 image of an agricultural area. This
figure shows the limitations of this filter for performing segmentation by itself. These limitations
are particularly noticeable in noisy images and in images lacking spatial uniformity.
The following classes provide similar functionality:
• itk::ThresholdImageFilter
8.1.2 General Thresholding
The source code for this example can be found in the file
Examples/Filtering/ThresholdImageFilter.cxx.
This example illustrates the use of the itk::ThresholdImageFilter. This filter can be used to
transform the intensity levels of an image in three different ways.
Figure 8.2: Effect of the BinaryThresholdImageFilter on a ROI of a Spot 5 image.
Figure 8.3: ThresholdImageFilter using the threshold-below mode.
Figure 8.4: ThresholdImageFilter using the threshold-above mode.
Figure 8.5: ThresholdImageFilter using the threshold-outside mode.
• First, the user can define a single threshold. Any pixels with values below this threshold will
be replaced by a user defined value, called here the OutsideValue. Pixels with values above
the threshold remain unchanged. This type of thresholding is illustrated in Figure 8.3.
• Second, the user can define a particular threshold such that all the pixels with values above
the threshold will be replaced by the OutsideValue. Pixels with values below the threshold
remain unchanged. This is illustrated in Figure 8.4.
• Third, the user can provide two thresholds. All the pixels with intensity values inside the
range defined by the two thresholds will remain unchanged. Pixels with values outside this
range will be assigned to the OutsideValue. This is illustrated in Figure 8.5.
The following methods choose among the three operating modes of the filter.
• ThresholdBelow()
• ThresholdAbove()
• ThresholdOutside()
The first step required to use this filter is to include its header file.
#include "itkThresholdImageFilter.h"
Then we must decide what pixel type to use for the image. This filter is templated over a single
image type because the algorithm only modifies pixel values outside the specified range, passing the
rest through unchanged.
typedef unsigned char PixelType ;
The image is defined using the pixel type and the dimension.
typedef otb::Image <PixelType , 2> ImageType ;
The filter can be instantiated using the image type defined above.
typedef itk::ThresholdImageFilter <ImageType > FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file.
typedef otb::ImageFileReader <ImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <ImageType > WriterType ;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The image obtained with the reader is passed as input to the itk::ThresholdImageFilter.
filter ->SetInput(reader ->GetOutput ());
The method SetOutsideValue() defines the intensity value to be assigned to those pixels whose
intensities are outside the range defined by the lower and upper thresholds.
filter ->SetOutsideValue(0);
The method ThresholdBelow() defines the intensity value below which pixels of the input image
will be changed to the OutsideValue.
filter ->ThresholdBelow(40);
The filter is executed by invoking the Update() method. If the filter is part of a larger image
processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.
filter ->Update();
The output of this example is shown in Figure 8.3. The second operating mode of the filter is now
enabled by calling the method ThresholdAbove().
filter ->ThresholdAbove(100);
filter ->SetOutsideValue(255);
filter ->Update();
Updating the filter with this new setting produces the output shown in Figure 8.4. The third operating
mode of the filter is enabled by calling ThresholdOutside().
filter ->ThresholdOutside(40, 100);
filter ->SetOutsideValue(0);
filter ->Update();
The output of this third, “band-pass” thresholding mode is shown in Figure 8.5.
The examples in this section also illustrate the limitations of the thresholding filter for performing
segmentation by itself. These limitations are particularly noticeable in noisy images and in images
lacking spatial uniformity.
The following classes provide similar functionality:
• itk::BinaryThresholdImageFilter
8.1.3 Threshold to Point Set
The source code for this example can be found in the file
Examples/FeatureExtraction/ThresholdToPointSetExample.cxx.
Sometimes, it may be more valuable not to get an image from the thresholding step but rather a list of
coordinates. This can be done with the otb::ThresholdImageToPointSetFilter.
The following example illustrates the use of the otb::ThresholdImageToPointSetFilter, which
provides a list of points within given thresholds. Point sets are described in section 5.2 on page 80.
The first step required to use this filter is to include the header
#include "otbThresholdImageToPointSetFilter.h"
#include "itkPointSet .h"
The next step is to decide which pixel types to use for the input image and the Point Set as well as
their dimension.
typedef unsigned char PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > ImageType ;
typedef itk::PointSet <PixelType , Dimension > PointSetType;
A reader is instantiated to read the input image:
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
const char * filenamereader = argv[1];
reader ->SetFileName (filenamereader);
We get the parameters for the threshold filter from the command line. The lower and upper threshold
parameters are similar to those of the itk::BinaryThresholdImageFilter (see Section 8.1.1 on
page 139 for more information).
int lowerThreshold = atoi(argv[2]);
int upperThreshold = atoi(argv[3]);
Then we create the ThresholdImageToPointSetFilter and we pass the parameters.
typedef otb:: ThresholdImageToPointSetFilter
<ImageType , PointSetType > FilterThresholdType;
FilterThresholdType::Pointer filterThreshold = FilterThresholdType::New ();
filterThreshold ->SetLowerThreshold(lowerThreshold);
filterThreshold ->SetUpperThreshold(upperThreshold);
filterThreshold ->SetInput (0, reader ->GetOutput ());
To manipulate and display the result of this filter, we manually instantiate a point set and we call
the Update() method on the threshold filter to trigger the pipeline execution.
After this step, the pointSet variable contains the point set.
PointSetType::Pointer pointSet = PointSetType::New();
pointSet = filterThreshold ->GetOutput ();
filterThreshold ->Update ();
To display each point, we create an iterator on the list of points, which is accessible through the
method GetPoints() of the PointSet.
typedef PointSetType::PointsContainer ContainerType;
ContainerType* pointsContainer = pointSet ->GetPoints ();
typedef ContainerType::Iterator IteratorType;
IteratorType itList = pointsContainer ->Begin();
A while loop enables us to iterate through the list and display the coordinates of each point.
while (itList != pointsContainer ->End ())
{
std::cout << itList.Value() << std::endl;
++itList;
}
8.2 Mathematical operations on images
OTB and ITK provide many filters for performing basic operations on image layers (thresholding,
ratios, band combinations, and so on). These operations can be chained, one step at a time, within
the data pipeline. But the library also offers the possibility to perform more general, complex
mathematical operations on images in a single filter: the otb::BandMathImageFilter.
The source code for this example can be found in the file
Examples/BasicFilters/BandMathFilterExample.cxx.
This filter is based on the mathematical parser library muParser. The list of built-in functions and
operators is available at: http://muparser.sourceforge.net/mup_features.html.
In order to use this filter, at least one input image must be set. An associated variable name can
optionally be specified using the corresponding SetNthInput method. For the nth input image, if no
associated variable name has been specified, a default variable name is given by concatenating the
letter "b" (for band) and the corresponding input index.
The next step is to set the expression according to the variable names. For example, in the default
case with three input images the following expression is valid: "(b1+b2)*b3".
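A small sketch of that default case (band1, band2 and band3 are assumed to be otb::Image instances created elsewhere):

typedef otb::Image<double, 2>                   BandImageType;
typedef otb::BandMathImageFilter<BandImageType> BandMathFilterType;

BandMathFilterType::Pointer bandMath = BandMathFilterType::New();
bandMath->SetNthInput(0, band1); // no name given: becomes variable b1
bandMath->SetNthInput(1, band2); // becomes b2
bandMath->SetNthInput(2, band3); // becomes b3
bandMath->SetExpression("(b1+b2)*b3");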
We start by including the needed header file. The aim of this example is to compute the Normalized
Difference Vegetation Index (NDVI) from a multispectral image and to threshold this index to
extract areas containing a dense vegetation canopy.
#include "otbBandMathImageFilter.h"
We start with the classical typedefs needed for reading and writing the images. The
otb::BandMathImageFilter class works with otb::Image as input, so we need to define
additional filters to extract each layer of the multispectral image.
typedef double                                           PixelType;
typedef otb::VectorImage<PixelType, 2>                   InputImageType;
typedef otb::Image<PixelType, 2>                         OutputImageType;
typedef otb::ImageList<OutputImageType>                  ImageListType;
typedef OutputImageType::PixelType                       VPixelType;
typedef otb::VectorImageToImageListFilter<InputImageType, ImageListType>
                                                         VectorImageToImageListType;
typedef otb::ImageFileReader<InputImageType>             ReaderType;
typedef otb::ImageFileWriter<OutputImageType>            WriterType;
We can now define the type for the filter:
typedef otb::BandMathImageFilter <OutputImageType > FilterType ;
We instantiate the filter, the reader, and the writer:
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
writer ->SetInput(filter ->GetOutput ());
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
We now need to extract each band from the input otb::VectorImage; this illustrates
the use of the otb::VectorImageToImageListFilter. Each extracted layer is an input of the
otb::BandMathImageFilter:
VectorImageToImageListType::Pointer imageList = VectorImageToImageListType::New();
imageList ->SetInput (reader ->GetOutput ());
imageList ->UpdateOutputInformation();
const unsigned int nbBands = reader ->GetOutput ()->GetNumberOfComponentsPerPixel();
for(unsigned int j = 0; j < nbBands; ++j)
{
Figure 8.6: From left to right: original image, thresholded NDVI index.
filter ->SetNthInput (j, imageList ->GetOutput ()->GetNthElement(j));
}
Now we can define the mathematical expression to perform on the layers (b1, b2, b3, b4). The filter
takes advantage of the parsing capabilities of the muParser library and allows setting the expression
as on a digital calculator.
The expression below returns 255 if the ratio (NIR - RED)/(NIR + RED) is greater than 0.4 and 0
otherwise.
filter ->SetExpression("if((b4-b3)/( b4+b3) > 0.4, 255, 0)");
We can now plug the pipeline and run it.
writer ->Update();
The muParser library also offers the possibility to extend the existing built-in functions. For example,
you can use the OTB expression "ndvi(b3, b4)" with the filter. The mathematical expression would
in this case be if(ndvi(b3, b4) > 0.4, 255, 0). It will return the same result.
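In code, this alternative expression simply replaces the one set above (a sketch equivalent to the previous SetExpression() call):

filter->SetExpression("if(ndvi(b3, b4) > 0.4, 255, 0)");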
Figure 8.6 shows the result of the NDVI thresholding applied to a Quickbird image.
8.3 Gradients
Computation of gradients is a fairly common operation in image processing. The term “gradient”
may refer in some contexts to the gradient vectors and in others to the magnitude of the gradient
vectors. ITK filters attempt to reduce this ambiguity by including the magnitude term when appropriate.
ITK provides filters for computing both the image of gradient vectors and the image of magnitudes.
8.3.1 Gradient Magnitude
The source code for this example can be found in the file
Examples/Filtering/GradientMagnitudeImageFilter.cxx.
The magnitude of the image gradient is extensively used in image analysis, mainly to help
in the determination of object contours and the separation of homogeneous regions. The
itk::GradientMagnitudeImageFilter computes the magnitude of the image gradient at each
pixel location using a simple finite differences approach. For example, in the case of 2D the
computation is equivalent to convolving the image with the horizontal mask [-1 0 1] and the vertical
mask [1 0 -1]^T, then adding the sum of their squares and computing the square root of the sum.
This filter will work on images of any dimension thanks to the internal use of
itk::NeighborhoodIterator and itk::NeighborhoodOperator.
The first step required to use this filter is to include its header file.
#include "itkGradientMagnitudeImageFilter.h"
Types should be chosen for the pixels of the input and output images.
typedef float InputPixelType;
typedef float OutputPixelType;
The input and output image types can be defined using the pixel types.
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The type of the gradient magnitude filter is defined by the input image and the output image types.
typedef itk::GradientMagnitudeImageFilter<
InputImageType , OutputImageType > FilterType ;
A filter object is created by invoking the New() method and assigning the result to an
itk::SmartPointer.
Figure 8.7: Effect of the GradientMagnitudeImageFilter.
FilterType ::Pointer filter = FilterType ::New ();
The input image can be obtained from the output of another filter. Here, the source is an image
reader.
filter ->SetInput(reader ->GetOutput ());
Finally, the filter is executed by invoking the Update() method.
filter ->Update();
If the output of this filter has been connected to other filters in a pipeline, updating any of the
downstream filters will also trigger an update of this filter. For example, the gradient magnitude
filter may be connected to an image writer.
rescaler ->SetInput(filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
writer ->Update();
Figure 8.7 illustrates the effect of the gradient magnitude. The figure shows the sensitivity of this
filter to noisy data.
Attention should be paid to the image type chosen to represent the output image since the dynamic
range of the gradient magnitude image is usually smaller than the dynamic range of the input image.
As always, there are exceptions to this rule, for example, images of man-made objects that contain
high contrast objects.
This filter does not apply any smoothing to the image before computing the gradients. The results
can therefore be very sensitive to noise and may not be the best choice for scale space analysis.
8.3.2 Gradient Magnitude With Smoothing
The source code for this example can be found in the file
Examples/Filtering/GradientMagnitudeRecursiveGaussianImageFilter.cxx.
Differentiation is an ill-defined operation over digital data. In practice it is convenient to define a
scale in which the differentiation should be performed. This is usually done by preprocessing the
data with a smoothing filter. It has been shown that a Gaussian kernel is the most convenient choice
for performing such smoothing. By choosing a particular value for the standard deviation (σ) of the
Gaussian, an associated scale is selected that ignores high frequency content, commonly considered
image noise.
The itk::GradientMagnitudeRecursiveGaussianImageFilter computes the magnitude of the
image gradient at each pixel location. The computational process is equivalent to first smoothing the
image by convolving it with a Gaussian kernel and then applying a differential operator. The user
selects the value of σ.
Internally this is done by applying an IIR (Infinite Impulse Response) filter that approximates a convolution with the derivative
of the Gaussian kernel. Traditional convolution will produce a more accurate result, but the IIR
approach is much faster, especially using large σs [37, 38].
GradientMagnitudeRecursiveGaussianImageFilter will work on images of any dimension by taking
advantage of the natural separability of the Gaussian kernel and its derivatives.
The first step required to use this filter is to include its header file.
#include "itkGradientMagnitudeRecursiveGaussianImageFilter.h"
Types should be instantiated based on the pixels of the input and output images.
typedef float InputPixelType;
typedef float OutputPixelType;
With them, the input and output image types can be instantiated.
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter type is now instantiated using both the input image and the output image types.
typedef itk::GradientMagnitudeRecursiveGaussianImageFilter<
InputImageType , OutputImageType > FilterType ;
Figure 8.8: Effect of the GradientMagnitudeRecursiveGaussianImageFilter.
A filter object is created by invoking the New() method and assigning the result to an
itk::SmartPointer.
FilterType ::Pointer filter = FilterType ::New ();
The input image can be obtained from the output of another filter. Here, an image reader is used as
source.
filter ->SetInput(reader ->GetOutput ());
The standard deviation of the Gaussian smoothing kernel is now set.
filter ->SetSigma(sigma);
Finally the filter is executed by invoking the Update() method.
filter ->Update();
If connected to other filters in a pipeline, this filter will automatically update when any downstream
filters are updated. For example, we may connect this gradient magnitude filter to an image file
writer and then update the writer.
rescaler ->SetInput(filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
writer ->Update();
Figure 8.8 illustrates the effect of this filter using σ values of 3 (left) and 5 (right). The figure shows
how the sensitivity to noise can be regulated by selecting an appropriate σ. This type of scale-tunable
filter is suitable for performing scale-space analysis.
Attention should be paid to the image type chosen to represent the output image since the dynamic
range of the gradient magnitude image is usually smaller than the dynamic range of the input image.
8.3.3 Derivative Without Smoothing
The source code for this example can be found in the file
Examples/Filtering/DerivativeImageFilter.cxx.
The itk::DerivativeImageFilter is used for computing the partial derivative of an image, that is, the
derivative of the image along a particular axial direction.
The header file corresponding to this filter should be included first.
#include "itkDerivativeImageFilter.h"
Next, the pixel types for the input and output images must be defined and, with them, the image
types can be instantiated. Note that it is important to select a signed type for the image, since the
values of the derivatives will be positive as well as negative.
typedef float InputPixelType;
typedef float OutputPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
Using the image types, it is now possible to define the filter type and create the filter object.
typedef itk::DerivativeImageFilter <
InputImageType , OutputImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
The order of the derivative is selected with the SetOrder() method. The direction along which the
derivative will be computed is selected with the SetDirection() method.
filter ->SetOrder(atoi(argv[4]));
filter ->SetDirection(atoi(argv[5]));
The input to the filter can be taken from any other filter, for example a reader. The output can be
passed down the pipeline to other filters, for example, a writer. An update call on any downstream
filter will trigger the execution of the derivative filter.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
writer ->Update();
Figure 8.9 illustrates the effect of the DerivativeImageFilter. The derivative is taken along the x
direction. The sensitivity to noise in the image is evident from this result.
Figure 8.9: Effect of the Derivative filter.
8.4 Second Order Derivatives
8.4.1 Laplacian Filters
Laplacian Filter Recursive Gaussian
The source code for this example can be found in the file
Examples/Filtering/LaplacianRecursiveGaussianImageFilter1.cxx.
This example illustrates how to use the itk::RecursiveGaussianImageFilter for computing
the Laplacian of an image.
The first step required to use this filter is to include its header file.
#include "itkRecursiveGaussianImageFilter.h"
Types should be selected based on the desired input and output pixel types.
typedef float InputPixelType;
typedef float OutputPixelType;
The input and output image types are instantiated using the pixel types.
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter type is now instantiated using both the input image and the output image types.
typedef itk::RecursiveGaussianImageFilter<
InputImageType , OutputImageType > FilterType ;
This filter applies the approximation of the convolution along a single dimension. It is therefore
necessary to concatenate several of these filters to produce smoothing in all directions. In this
example, we create two pairs of filters since we are processing a 2D image. The filters are created by
invoking the New() method and assigning the result to an itk::SmartPointer.
We need two filters for computing the X component of the Laplacian and two other filters for
computing the Y component.
FilterType ::Pointer filterX1 = FilterType ::New();
FilterType ::Pointer filterY1 = FilterType ::New();
FilterType ::Pointer filterX2 = FilterType ::New();
FilterType ::Pointer filterY2 = FilterType ::New();
Since each one of the newly created filters has the potential to perform filtering along any dimension,
we have to restrict each one to a particular direction. This is done with the SetDirection() method.
filterX1 ->SetDirection(0); // 0 --> X direction
filterY1 ->SetDirection(1); // 1 --> Y direction
filterX2 ->SetDirection(0); // 0 --> X direction
filterY2 ->SetDirection(1); // 1 --> Y direction
The itk::RecursiveGaussianImageFilter can approximate the convolution with the Gaussian
or with its first and second derivatives. We select one of these options by using the SetOrder()
method. Note that the argument is an enum whose values can be ZeroOrder, FirstOrder and
SecondOrder. For example, to compute the x partial derivative we should select FirstOrder for
x and ZeroOrder for y. Here we want to compute the Laplacian, so in each pair of filters we select
SecondOrder in one direction and ZeroOrder (pure smoothing) in the other.
filterX1 ->SetOrder (FilterType ::ZeroOrder );
filterY1 ->SetOrder (FilterType :: SecondOrder );
filterX2 ->SetOrder (FilterType :: SecondOrder );
filterY2 ->SetOrder (FilterType ::ZeroOrder );
There are two typical ways of normalizing Gaussians depending on their application. For scale-
space analysis it is desirable to use a normalization that will preserve the maximum value of the
input. This normalization is represented by the following equation.
$\frac{1}{\sigma\sqrt{2\pi}}$   (8.1)
In applications that use the Gaussian as a solution of the diffusion equation it is desirable to use a
normalization that preserves the integral of the signal. This last approach can be seen as a conservation
of mass principle. This is represented by the following equation.

$\frac{1}{\sigma^2\sqrt{2\pi}}$   (8.2)
The itk::RecursiveGaussianImageFilter has a boolean flag that allows users to select between
these two normalization options. Selection is done with the method SetNormalizeAcrossScale().
Enable this flag when analyzing an image across scale-space. In the current example, this setting has
no impact because we are actually renormalizing the output to the dynamic range of the reader, so
we simply disable the flag.
const bool normalizeAcrossScale = false;
filterX1 ->SetNormalizeAcrossScale(normalizeAcrossScale);
filterY1 ->SetNormalizeAcrossScale(normalizeAcrossScale);
filterX2 ->SetNormalizeAcrossScale(normalizeAcrossScale);
filterY2 ->SetNormalizeAcrossScale(normalizeAcrossScale);
The input image can be obtained from the output of another filter. Here, an image reader is used as
the source. The image is passed to the x filter and then to the y filter. The reason for keeping these
two filters separate is that it is usual in scale-space applications to compute not only the smoothing
but also combinations of derivatives at different orders and smoothing. Some factorization is pos-
sible when separate filters are used to generate the intermediate results. Here this capability is less
interesting, though, since we only want to smooth the image in all directions.
filterX1 ->SetInput(reader ->GetOutput ());
filterY1 ->SetInput(filterX1 ->GetOutput ());
filterY2 ->SetInput(reader ->GetOutput ());
filterX2 ->SetInput(filterY2 ->GetOutput ());
It is now time to select the σ of the Gaussian used to smooth the data. Note that σ must be passed
to all four filters and that sigma is considered to be in the units of the image spacing. That is, at
the moment of applying the smoothing process, the filter will take into account the spacing values
defined in the image.
filterX1 ->SetSigma(sigma);
filterY1 ->SetSigma(sigma);
filterX2 ->SetSigma(sigma);
filterY2 ->SetSigma(sigma);
Finally the two components of the Laplacian should be added together. The itk::AddImageFilter
is used for this purpose.
typedef itk::AddImageFilter <
OutputImageType ,
OutputImageType ,
OutputImageType > AddFilterType;
AddFilterType::Pointer addFilter = AddFilterType::New();
addFilter ->SetInput1 (filterY1 ->GetOutput ());
addFilter ->SetInput2 (filterX2 ->GetOutput ());
The filters are triggered by invoking Update() on the Add filter at the end of the pipeline.
try
{
addFilter ->Update();
}
catch (itk::ExceptionObject& err)
{
std::cout << "ExceptionObject caught !" << std::endl;
std::cout << err << std::endl;
return EXIT_FAILURE;
}
The resulting image could be saved to a file using the otb::ImageFileWriter class.
typedef float WritePixelType;
typedef otb::Image <WritePixelType , 2> WriteImageType;
typedef otb::ImageFileWriter <WriteImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(addFilter ->GetOutput ());
writer ->SetFileName (argv[2]);
writer ->Update();
Figure 8.10 illustrates the effect of this filter using σ values of 3 (left) and 5 (right). The figure shows
how the attenuation of noise can be regulated by selecting the appropriate standard deviation. This
type of scale-tunable filter is suitable for performing scale-space analysis.
The source code for this example can be found in the file
Examples/Filtering/LaplacianRecursiveGaussianImageFilter2.cxx.
The previous example showed how to use the itk::RecursiveGaussianImageFilter
for computing the equivalent of a Laplacian of an image after smoothing with a Gaus-
sian. The elements used in this previous example have been packaged together in the
itk::LaplacianRecursiveGaussianImageFilter in order to simplify its usage. This current
example shows how to use this convenience filter for achieving the same results as the previous
example.
The first step required to use this filter is to include its header file.
#include "itkLaplacianRecursiveGaussianImageFilter.h"
Figure 8.10: Effect of the RecursiveGaussianImageFilter.
Types should be selected based on the desired input and output pixel types.
typedef float InputPixelType;
typedef float OutputPixelType;
The input and output image types are instantiated using the pixel types.
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter type is now instantiated using both the input image and the output image types.
typedef itk::LaplacianRecursiveGaussianImageFilter<
InputImageType , OutputImageType > FilterType ;
This filter packages all the components illustrated in the previous example. The filter is created by
invoking the New() method and assigning the result to an itk::SmartPointer.
FilterType ::Pointer laplacian = FilterType ::New();
The option for normalizing across scale space can also be selected in this filter.
laplacian ->SetNormalizeAcrossScale(false);
The input image can be obtained from the output of another filter. Here, an image reader is used as
the source.
laplacian ->SetInput (reader ->GetOutput ());
Figure 8.11: Effect of the LaplacianRecursiveGaussianImageFilter.
It is now time to select the σ of the Gaussian used to smooth the data. Note that sigma is considered
to be in the units of the image spacing. That is, at
the moment of applying the smoothing process, the filter will take into account the spacing values
defined in the image.
laplacian ->SetSigma (sigma);
Finally the pipeline is executed by invoking the Update() method.
try
{
laplacian ->Update();
}
catch (itk::ExceptionObject& err)
{
std::cout << "ExceptionObject caught !" << std::endl;
std::cout << err << std::endl;
return EXIT_FAILURE;
}
Figure 8.11 illustrates the effect of this filter using σ values of 3 (left) and 5 (right). The figure shows
how the attenuation of noise can be regulated by selecting the appropriate standard deviation. This
type of scale-tunable filter is suitable for performing scale-space analysis.
Figure 8.12: Effect of the CannyEdgeDetectorImageFilter on a ROI of a Spot 5 image.
8.5 Edge Detection
8.5.1 Canny Edge Detection
The source code for this example can be found in the file
Examples/Filtering/CannyEdgeDetectionImageFilter.cxx.
This example introduces the use of the itk::CannyEdgeDetectionImageFilter. This filter is
widely used for edge detection since it is the optimal solution satisfying the constraints of good
sensitivity, localization and noise robustness.
The first step required for using this filter is to include its header file.
#include "itkCannyEdgeDetectionImageFilter.h"
As the Canny filter works with real values, we can instantiate the reader using an image with pixels
as double. This does not imply anything about the actual image coding format, which will be cast into
double.
typedef otb::ImageFileReader <RealImageType > ReaderType ;
The itk::CannyEdgeDetectionImageFilter is instantiated using the float image type.
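A minimal sketch of plugging the filter into a pipeline, assuming a rescaler and a writer are
available and using purely illustrative parameter values:

typedef itk::CannyEdgeDetectionImageFilter<RealImageType, RealImageType> CannyFilterType;
CannyFilterType::Pointer cannyFilter = CannyFilterType::New();

cannyFilter->SetInput(reader->GetOutput());
// Smoothing variance and hysteresis thresholds control the sensitivity of the detector
// (illustrative values).
cannyFilter->SetVariance(1.0);
cannyFilter->SetUpperThreshold(15.0);
cannyFilter->SetLowerThreshold(3.0);

// The real-valued output is usually rescaled before being written to an 8-bit file.
rescaler->SetInput(cannyFilter->GetOutput());
writer->SetInput(rescaler->GetOutput());
writer->Update();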
Figure 8.12 illustrates the effect of this filter on a ROI of a Spot 5 image of an agricultural area.
8.5.2 Ratio of Means Detector
The source code for this example can be found in the file
Examples/FeatureExtraction/TouziEdgeDetectorExample.cxx.
This example illustrates the use of the otb::TouziEdgeDetectorImageFilter. This filter belongs
to the family of the fixed false alarm rate edge detectors but it is appropriate for SAR images, where
the speckle noise is considered as multiplicative. By analogy with the classical gradient-based edge
detectors which are suited to the additive noise case, this filter computes a ratio of local means on
both sides of the edge [132]. In order to have a normalized response, the following computation is
performed:
$r = 1 - \min\left\{\frac{\mu_A}{\mu_B}, \frac{\mu_B}{\mu_A}\right\}$   (8.3)
where µA and µB are the local means computed at both sides of the edge. In order to detect edges
with any orientation, r is computed for the 4 principal directions and the maximum response is kept.
The first step required to use this filter is to include its header file.
#include "otbTouziEdgeDetectorImageFilter.h"
Then we must decide what pixel type to use for the image. We choose to make all computations with
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef float InternalPixelType;
typedef unsigned char OutputPixelType;
The images are defined using the pixel type and the dimension.
typedef otb::Image <InternalPixelType , 2> InternalImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter can be instantiated using the image types defined above.
typedef otb::TouziEdgeDetectorImageFilter<InternalImageType ,
InternalImageType > FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file.
typedef otb::ImageFileReader <InternalImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The intensity rescaling of the results will be carried out by the
itk::RescaleIntensityImageFilter which is templated by the input and output image
types.
typedef itk::RescaleIntensityImageFilter<InternalImageType ,
OutputImageType > RescalerType;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The same is done for the rescaler and the writer.
RescalerType::Pointer rescaler = RescalerType::New ();
WriterType ::Pointer writer = WriterType ::New();
The itk::RescaleIntensityImageFilter needs to know the minimum and maximum
values of the generated output image. These can be chosen in a generic way by using the
NumericTraits functions, since they are templated over the pixel type.
rescaler ->SetOutputMinimum(itk::NumericTraits <OutputPixelType >:: min ());
rescaler ->SetOutputMaximum(itk::NumericTraits <OutputPixelType >:: max ());
The image obtained with the reader is passed as input to the
otb::TouziEdgeDetectorImageFilter. The pipeline is built as follows.
filter ->SetInput(reader ->GetOutput ());
rescaler ->SetInput(filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
The method SetRadius() defines the size of the window to be used for the computation of the local
means.
FilterType ::SizeType Radius;
Radius[0] = atoi(argv[4]);
Radius[1] = atoi(argv[4]);
filter ->SetRadius (Radius);
The filter is executed by invoking the Update() method. If the filter is part of a larger image
processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.
filter ->Update();
We can also obtain the direction of the edges by invoking the GetOutputDirection() method.
rescaler ->SetInput(filter ->GetOutputDirection());
writer ->SetInput(rescaler ->GetOutput ());
writer ->Update();
Figure 8.13 shows the result of applying the Touzi edge detector filter to a SAR image.
Figure 8.13: Result of applying the otb::TouziEdgeDetectorImageFilter to a SAR image. From left to
right: original image, edge intensity and edge orientation.
8.6 Neighborhood Filters
The concept of locality is frequently encountered in image processing in the form of filters that
compute every output pixel using information from a small region in the neighborhood of the input
pixel. The classical form of these filters are the 3 × 3 filters in 2D images. Convolution masks
based on these neighborhoodscan perform diverse tasks ranging from noise reduction, to differential
operations, to mathematical morphology.
The Insight toolkit implements an elegant approach to neighborhood-based image filtering. The
input image is processed using a special iterator called the itk::NeighborhoodIterator. This
iterator is capable of moving over all the pixels in an image and, for each position, it can address
the pixels in a local neighborhood. Operators are defined that apply an algorithmic operation in
the neighborhood of the input pixel to produce a value for the output pixel. The following section
describes some of the more commonly used filters that take advantage of this construction. (See
Chapter 26 on page 587 for more information about iterators.)
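As a minimal illustration of this mechanism, a 3 × 3 mean could be computed directly with the
neighborhood iterator, assuming two already allocated images named input and output of a float
image type:

#include "itkNeighborhoodIterator.h"
#include "itkImageRegionIterator.h"

typedef otb::Image<float, 2>                 ImageType;
typedef itk::NeighborhoodIterator<ImageType> NeighborhoodIteratorType;
typedef itk::ImageRegionIterator<ImageType>  OutputIteratorType;

NeighborhoodIteratorType::RadiusType radius;
radius.Fill(1); // 3x3 neighborhood

NeighborhoodIteratorType nit(radius, input, input->GetRequestedRegion());
OutputIteratorType       out(output, input->GetRequestedRegion());

nit.GoToBegin();
out.GoToBegin();
while (!nit.IsAtEnd())
  {
  float sum = 0.f;
  for (unsigned int i = 0; i < nit.Size(); ++i)
    {
    sum += nit.GetPixel(i); // neighborhood pixels, boundaries handled by the iterator
    }
  out.Set(sum / nit.Size());
  ++nit;
  ++out;
  }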
8.6.1 Mean Filter
The source code for this example can be found in the file
Examples/Filtering/MeanImageFilter.cxx.
The itk::MeanImageFilter is commonly used for noise reduction. The filter computes the value
of each output pixel by finding the statistical mean of the neighborhood of the corresponding input
pixel. The following figure illustrates the local effect of the MeanImageFilter. The statistical mean
of the neighborhood on the left is passed as the output value associated with the pixel at the center
of the neighborhood.
25 30 32
27 25 29   -->   30.22   -->   30
28 26 50
Note that this algorithm is sensitive to the presence of outliers in the neighbor-
hood. This filter will work on images of any dimension thanks to the internal use of
itk::SmartNeighborhoodIterator and itk::NeighborhoodOperator. The size of the neigh-
borhood over which the mean is computed can be set by the user.
The header file corresponding to this filter should be included first.
#include "itkMeanImageFilter.h"
Then the pixel types for input and output image must be defined and, with them, the image types
can be instantiated.
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
Using the image types it is now possible to instantiate the filter type and create the filter object.
typedef itk::MeanImageFilter <
InputImageType , OutputImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
The size of the neighborhood is defined along every dimension by passing a SizeType object with
the corresponding values. The value on each dimension is used as the semi-size of a rectangular
box. For example, in 2D a size of 1,2 will result in a 3 × 5 neighborhood.
InputImageType::SizeType indexRadius ;
indexRadius [0] = 1; // radius along x
indexRadius [1] = 1; // radius along y
filter ->SetRadius (indexRadius );
The input to the filter can be taken from any other filter, for example a reader. The output can be
passed down the pipeline to other filters, for example, a writer. An update call on any downstream
filter will trigger the execution of the mean filter.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
writer ->Update();
Figure 8.14: Effect of the MeanImageFilter.
Figure 8.14 illustrates the effect of this filter using neighborhood radii of 1,1 which corresponds to
a 3 × 3 classical neighborhood. It can be seen from this picture that edges are rapidly degraded by
the diffusion of intensity values among neighbors.
8.6.2 Median Filter
The source code for this example can be found in the file
Examples/Filtering/MedianImageFilter.cxx.
The itk::MedianImageFilter is commonly used as a robust approach for noise reduction. This
filter is particularly efficient against salt-and-pepper noise. In other words, it is robust to the presence
of gray-level outliers. MedianImageFilter computes the value of each output pixel as the statistical
median of the neighborhood of values around the corresponding input pixel. The following figure
illustrates the local effect of this filter. The statistical median of the neighborhood on the left is
passed as the output value associated with the pixel at the center of the neighborhood.
25 30 32
27 25 29   -->   28
28 26 50
This filter will work on images of any dimension thanks to the internal use of
itk::NeighborhoodIterator and itk::NeighborhoodOperator. The size of the neighborhood
over which the median is computed can be set by the user.
The header file corresponding to this filter should be included first.
#include "itkMedianImageFilter.h"
Then the pixel and image types of the input and output must be defined.
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
Using the image types, it is now possible to define the filter type and create the filter object.
typedef itk::MedianImageFilter <
InputImageType , OutputImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
The size of the neighborhood is defined along every dimension by passing a SizeType object with
the corresponding values. The value on each dimension is used as the semi-size of a rectangular
box. For example, in 2D a size of 1,2 will result in a 3 × 5 neighborhood.
InputImageType::SizeType indexRadius ;
indexRadius [0] = 1; // radius along x
indexRadius [1] = 1; // radius along y
filter ->SetRadius (indexRadius );
The input to the filter can be taken from any other filter, for example a reader. The output can be
passed down the pipeline to other filters, for example, a writer. An update call on any downstream
filter will trigger the execution of the median filter.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
writer ->Update();
Figure 8.15 illustrates the effect of the MedianImageFilter using a neighborhood radius of 1,1, which
corresponds to a 3 × 3 classical neighborhood. The filtered image demonstrates the moderate ten-
dency of the median filter to preserve edges.
8.6.3 Mathematical Morphology
Mathematical morphology has proved to be a powerful resource for image processing and anal-
ysis [126]. ITK implements mathematical morphology filters using NeighborhoodIterators and
itk::NeighborhoodOperators. The toolkit contains two types of image morphology algorithms,
filters that operate on binary images and filters that operate on grayscale images.
Figure 8.15: Effect of the MedianImageFilter.
Binary Filters
The source code for this example can be found in the file
Examples/Filtering/MathematicalMorphologyBinaryFilters.cxx.
The following section illustrates the use of filters that perform basic mathematical
morphology operations on binary images. The itk::BinaryErodeImageFilter and
itk::BinaryDilateImageFilter are described here. The filter names clearly specify the type
of image on which they operate. The header files required to construct a simple example of the use
of the mathematical morphology filters are included below.
#include "itkBinaryErodeImageFilter.h"
#include "itkBinaryDilateImageFilter.h"
#include "itkBinaryBallStructuringElement.h"
The following code defines the input and output pixel types and their associated image types.
const unsigned int Dimension = 2;
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
Mathematical morphology operations are implemented by applying an operator over the neighbor-
hood of each input pixel. The combination of the rule and the neighborhood is known as structuring
element. Although some rules have become de facto standards for image processing, there is a good
deal of freedom as to what kind of algorithmic rule should be applied to the neighborhood. The
implementation in ITK follows the typical rule of minimum for erosion and maximum for dilation.
The structuring element is implemented as a NeighborhoodOperator. In particular, the default struc-
turing element is the itk::BinaryBallStructuringElement class. This class is instantiated
using the pixel type and dimension of the input image.
typedef itk::BinaryBallStructuringElement<
InputPixelType ,
Dimension > StructuringElementType;
The structuring element type is then used along with the input and output image types for instanti-
ating the type of the filters.
typedef itk::BinaryErodeImageFilter <
InputImageType ,
OutputImageType ,
StructuringElementType > ErodeFilterType;
typedef itk::BinaryDilateImageFilter <
InputImageType ,
OutputImageType ,
StructuringElementType > DilateFilterType;
The filters can now be created by invoking the New() method and assigning the result to
itk::SmartPointers.
ErodeFilterType::Pointer binaryErode = ErodeFilterType::New();
DilateFilterType::Pointer binaryDilate = DilateFilterType::New ();
The structuring element is not a reference counted class. Thus it is created as a C++ stack object
instead of using New() and SmartPointers. The radius of the neighborhood associated with the struc-
turing element is defined with the SetRadius() method and the CreateStructuringElement()
method is invoked in order to initialize the operator. The resulting structuring element is passed to
the mathematical morphology filter through the SetKernel() method, as illustrated below.
StructuringElementType structuringElement;
structuringElement.SetRadius (1); // 3x3 structuring element
structuringElement.CreateStructuringElement();
binaryErode ->SetKernel (structuringElement);
binaryDilate ->SetKernel (structuringElement);
A binary image is provided as input to the filters. This image might be, for example, the output of a
binary threshold image filter.
thresholder ->SetInput (reader ->GetOutput ());
InputPixelType background = 0;
InputPixelType foreground = 255;
thresholder ->SetOutsideValue(background );
thresholder ->SetInsideValue(foreground );
thresholder ->SetLowerThreshold(lowerThreshold);
thresholder ->SetUpperThreshold(upperThreshold);
binaryErode ->SetInput (thresholder ->GetOutput ());
binaryDilate ->SetInput(thresholder ->GetOutput ());

Figure 8.16: Effect of erosion and dilation in a binary image.
The values that correspond to “objects” in the binary image are specified with the methods
SetErodeValue() and SetDilateValue(). The value passed to these methods will be consid-
ered the value over which the dilation and erosion rules will apply.
binaryErode ->SetErodeValue(foreground );
binaryDilate ->SetDilateValue(foreground );
The filter is executed by invoking its Update() method, or by updating any downstream filter, like,
for example, an image writer.
writerDilation ->SetInput(binaryDilate ->GetOutput ());
writerDilation ->Update ();
Figure 8.16 illustrates the effect of the erosion and dilation filters. The figure shows how these
operations can be used to remove spurious details from segmented images.
Grayscale Filters
The source code for this example can be found in the file
Examples/Filtering/MathematicalMorphologyGrayscaleFilters.cxx.
The following section illustrates the use of filters for performing basic mathematical mor-
phology operations on grayscale images. The itk::GrayscaleErodeImageFilter and
itk::GrayscaleDilateImageFilter are covered in this example. The filter names clearly spec-
ify the type of image on which they operate. The header files required for a simple example of the
use of grayscale mathematical morphology filters are presented below.
#include "itkGrayscaleErodeImageFilter.h"
#include "itkGrayscaleDilateImageFilter.h"
#include "itkBinaryBallStructuringElement.h"
The following code defines the input and output pixel types and their associated image types.
const unsigned int Dimension = 2;
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
Mathematical morphology operations are based on the application of an operator over a neighbor-
hood of each input pixel. The combination of the rule and the neighborhood is known as structuring
element. Although some rules have become the de facto standard in image processing there is a
good deal of freedom as to what kind of algorithmic rule should be applied on the neighborhood.
The implementation in ITK follows the typical rule of minimum for erosion and maximum for dila-
tion.
The structuring element is implemented as a itk::NeighborhoodOperator. In particular, the
default structuring element is the itk::BinaryBallStructuringElement class. This class is
instantiated using the pixel type and dimension of the input image.
typedef itk::BinaryBallStructuringElement<
InputPixelType ,
Dimension > StructuringElementType;
The structuring element type is then used along with the input and output image types for instanti-
ating the type of the filters.
typedef itk::GrayscaleErodeImageFilter<
InputImageType ,
OutputImageType ,
StructuringElementType > ErodeFilterType;
typedef itk::GrayscaleDilateImageFilter<
InputImageType ,
OutputImageType ,
StructuringElementType > DilateFilterType;
The filters can now be created by invoking the New() method and assigning the result to SmartPoint-
ers.
ErodeFilterType::Pointer grayscaleErode = ErodeFilterType::New ();
DilateFilterType::Pointer grayscaleDilate = DilateFilterType::New ();
Figure 8.17: Effect of erosion and dilation in a grayscale image.
The structuring element is not a reference counted class. Thus it is created as a C++ stack object
instead of using New() and SmartPointers. The radius of the neighborhood associated with the struc-
turing element is defined with the SetRadius() method and the CreateStructuringElement()
method is invoked in order to initialize the operator. The resulting structuring element is passed to
the mathematical morphology filter through the SetKernel() method, as illustrated below.
StructuringElementType structuringElement;
structuringElement.SetRadius (1); // 3x3 structuring element
structuringElement.CreateStructuringElement();
grayscaleErode ->SetKernel (structuringElement);
grayscaleDilate ->SetKernel (structuringElement);
A grayscale image is provided as input to the filters. This image might be, for example, the output
of a reader.
grayscaleErode ->SetInput(reader ->GetOutput ());
grayscaleDilate ->SetInput (reader ->GetOutput ());
The filter is executed by invoking its Update() method, or by updating any downstream filter, like,
for example, an image writer.
writerDilation ->SetInput(grayscaleDilate ->GetOutput ());
writerDilation ->Update ();
Figure 8.17 illustrates the effect of the erosion and dilation filters. The figure shows how these
operations can be used to remove spurious details from segmented images.
8.7 Smoothing Filters
Real image data has a level of uncertainty that is manifested in the variability of measures assigned
to pixels. This uncertainty is usually interpreted as noise and considered an undesirable component
of the image data. This section describes several methods that can be applied to reduce noise on
images.
8.7.1 Blurring
Blurring is the traditional approach for removing noise from images. It is usually implemented in
the form of a convolution with a kernel. The effect of blurring on the image spectrum is to attenuate
high spatial frequencies. Different kernels attenuate frequencies in different ways. One of the most
commonly used kernels is the Gaussian. Two implementations of Gaussian smoothing are available
in the toolkit. The first one is based on a traditional convolution while the other is based on the
application of IIR filters that approximate the convolution with a Gaussian [37, 38].
Discrete Gaussian
The source code for this example can be found in the file
Examples/Filtering/DiscreteGaussianImageFilter.cxx.
Figure 8.18: Discretized Gaussian.
The itk::DiscreteGaussianImageFilter
computes the convolution of the input image
with a Gaussian kernel. This is done in ND
by taking advantage of the separability of the
Gaussian kernel. A one-dimensional Gaussian
function is discretized on a convolution kernel.
The size of the kernel is extended until there
are enough discrete points in the Gaussian to
ensure that a user-provided maximum error is
not exceeded. Since the size of the kernel is
unknown a priori, it is necessary to impose a
limit to its growth. The user can thus provide a
value to be the maximum admissible size of the kernel. Discretization error is defined as the dif-
ference between the area under the discrete Gaussian curve (which has finite support) and the area
under the continuous Gaussian.
Gaussian kernels in ITK are constructed according to the theory of Tony Lindeberg [89] so that
smoothing and derivative operations commute before and after discretization. In other words, finite
difference derivatives on an image I that has been smoothed by convolution with the Gaussian are
equivalent to finite differences computed on I by convolving with a derivative of the Gaussian.
The first step required to use this filter is to include its header file.
#include "itkDiscreteGaussianImageFilter.h"
Types should be chosen for the pixels of the input and output images. Image types can be instantiated
using the pixel type and dimension.
typedef float InputPixelType;
typedef float OutputPixelType;
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The discrete Gaussian filter type is instantiated using the input and output image types. A corre-
sponding filter object is created.
typedef itk::DiscreteGaussianImageFilter<
InputImageType , OutputImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
The input image can be obtained from the output of another filter. Here, an image reader is used as
its input.
filter ->SetInput(reader ->GetOutput ());
The filter requires the user to provide a value for the variance associated with the Gaussian kernel.
The method SetVariance() is used for this purpose. The discrete Gaussian is constructed as a
convolution kernel. The maximum kernel size can be set by the user. Note that the combination of
variance and kernel-size values may result in a truncated Gaussian kernel.
filter ->SetVariance (gaussianVariance);
filter ->SetMaximumKernelWidth(maxKernelWidth);
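The admissible discretization error discussed above can also be set explicitly; the value below is
purely illustrative.

filter->SetMaximumError(0.01);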
Finally, the filter is executed by invoking the Update() method.
filter ->Update();
If the output of this filter has been connected to other filters down the pipeline, updating any of the
downstream filters would have triggered the execution of this one. For example, a writer could have
been used after the filter.
rescaler ->SetInput (filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
writer ->Update();
Figure 8.19 illustrates the effect of this filter.
Note that large Gaussian variances will produce large convolution kernels and correspondingly
slower computation times. Unless a high degree of accuracy is required, it may be more desirable to
use the approximating itk::RecursiveGaussianImageFilter with large variances.
Figure 8.19: Effect of the DiscreteGaussianImageFilter.
8.7.2 Edge Preserving Smoothing
Introduction to Anisotropic Diffusion
The drawback of image denoising (smoothing) is that it tends to blur away the sharp boundaries in
the image that help to distinguish between the larger-scale anatomical structures that one is trying
to characterize (which also limits the size of the smoothing kernels in most applications). Even in
cases where smoothing does not obliterate boundaries, it tends to distort the fine structure of the
image and thereby changes subtle aspects of the anatomical shapes in question.
Perona and Malik [109] introduced an alternative to linear-filtering that they called anisotropic diffu-
sion. Anisotropic diffusion is closely related to the earlier work of Grossberg [54], who used similar
nonlinear diffusion processes to model human vision. The motivation for anisotropic diffusion (also
called nonuniform or variable conductance diffusion) is that a Gaussian smoothed image is a single
time slice of the solution to the heat equation, that has the original image as its initial conditions.
Thus, the solution to

$\frac{\partial g(x,y,t)}{\partial t} = \nabla \cdot \nabla g(x,y,t)$   (8.4)

where g(x,y,0) = f(x,y) is the input image, is $g(x,y,t) = G(\sqrt{2t}) \otimes f(x,y)$, where G(σ) is a
Gaussian with standard deviation σ.
Anisotropic diffusion includes a variable conductance term that, in turn, depends on the differential
structure of the image. Thus, the variable conductance can be formulated to limit the smoothing at
“edges” in images, as measured by high gradient magnitude, for example.
$g_t = \nabla \cdot c(|\nabla g|)\nabla g$   (8.5)
where, for notational convenience, we leave off the independent parameters of g and use the sub-
scripts with respect to those parameters to indicate partial derivatives. The function c(|∇g|) is a
fuzzy cutoff that reduces the conductance at areas of large |∇g|, and can be any one of a number of
functions. The literature has shown
$c(|\nabla g|) = e^{-\frac{|\nabla g|^2}{2k^2}}$   (8.6)
to be quite effective. Notice that the conductance term introduces a free parameter k, the conductance
parameter, that controls the sensitivity of the process to edge contrast. Thus, anisotropic diffu-
sion entails two free parameters: the conductance parameter, k, and the time parameter, t, that is
analogous to σ, the effective width of the filter when using Gaussian kernels.
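From the heat-equation solution given above, $g(x,y,t) = G(\sqrt{2t}) \otimes f(x,y)$, the two
quantities are related by

$\sigma = \sqrt{2t}, \qquad \text{i.e.} \qquad t = \frac{\sigma^2}{2}$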
Equation 8.5 is a nonlinear partial differential equation that can be solved on a discrete grid using
finite forward differences. Thus, the smoothed image is obtained only by an iterative process, not a
convolution or non-stationary, linear filter. Typically, the number of iterations required for practical
results are small, and large 2D images can be processed in several tens of seconds using carefully
written code running on modern, general purpose, single-processor computers. The technique ap-
plies readily and effectively to 3D images, but requires more processing time.
In the early 1990’s several research groups [50, 142] demonstrated the effectiveness of anisotropic
diffusion on medical images. In a series of papers on the subject [147, 144, 146, 142, 143, 145],
Whitaker described a detailed analytical and empirical analysis, introduced a smoothing term in
the conductance that made the process more robust, invented a numerical scheme that virtually
eliminated directional artifacts in the original algorithm, and generalized anisotropic diffusion to
vector-valued images, an image processing technique that can be used on vector-valued medical
data (such as the color cryosection data of the Visible Human Project).
For a vector-valued input $F : U \rightarrow \Re^m$ the process takes the form

$F_t = \nabla \cdot c(\mathcal{D}F)F$   (8.7)
where DF is a dissimilarity measure of F, a generalization of the gradient magnitude to vector-
valued images, that can incorporate linear and nonlinear coordinate transformations on the range of
F. In this way, the smoothing of the multiple images associated with vector-valued data is coupled
through the conductance term, that fuses the information in the different images. Thus vector-
valued, nonlinear diffusion can combine low-level image features (e.g. edges) across all “channels”
of a vector-valued image in order to preserve or enhance those features in all of the image “channels”.
Vector-valued anisotropic diffusion is useful for denoising data from devices that produce multi-
ple values such as MRI or color photography. When performing nonlinear diffusion on a color
image, the color channels are diffused separately, but linked through the conductance term. Vector-
valued diffusion is also useful for processing registered data from different devices or for denoising
higher-order geometric or statistical features from scalar-valued images [145, 154].
The output of anisotropic diffusion is an image or set of images that demonstrates reduced noise and
texture but preserves, and can also enhance, edges. Such images are useful for a variety of processes
including statistical classification, visualization, and geometric feature extraction. Previous work
has shown [143] that anisotropic diffusion, over a wide range of conductance parameters, offers
quantifiable advantages over linear filtering for edge detection in medical images.
Since the effectiveness of nonlinear diffusion was first demonstrated, numerous variations of this ap-
proach have surfaced in the literature [131]. These include alternatives for constructing dissimilarity
measures [124], directional (i.e., tensor-valued) conductance terms [140, 7] and level set interpreta-
tions [148].
Gradient Anisotropic Diffusion
The source code for this example can be found in the file
Examples/Filtering/GradientAnisotropicDiffusionImageFilter.cxx.
The itk::GradientAnisotropicDiffusionImageFilter implements an N-dimensional version
of the classic Perona-Malik anisotropic diffusion equation for scalar-valued images [109].
The conductance term for this implementation is chosen as a function of the gradient magnitude of
the image at each point, reducing the strength of diffusion at edge pixels.
$C(x) = e^{-\left(\frac{\|\nabla U(x)\|}{K}\right)^2}$   (8.8)
The numerical implementation of this equation is similar to that described in the Perona-Malik paper
[109], but uses a more robust technique for gradient magnitude estimation and has been generalized
to N-dimensions.
The first step required to use this filter is to include its header file.
#include "itkGradientAnisotropicDiffusionImageFilter.h"
Types should be selected based on the pixel types required for the input and output images. The
image types are defined using the pixel type and the dimension.
typedef float InputPixelType;
typedef float OutputPixelType;
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter type is now instantiated using both the input image and the output image types. The filter
object is created by the New() method.
typedef itk::GradientAnisotropicDiffusionImageFilter<
InputImageType , OutputImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
The input image can be obtained from the output of another filter. Here, an image reader is used as
source.
filter ->SetInput(reader ->GetOutput ());
This filter requires three parameters, the number of iterations to be performed, the time
step and the conductance parameter used in the computation of the level set evolution.
Figure 8.20: Effect of the GradientAnisotropicDiffusionImageFilter.
These parameters are set using the methods SetNumberOfIterations(), SetTimeStep() and
SetConductanceParameter() respectively. The filter can be executed by invoking Update().
filter ->SetNumberOfIterations(numberOfIterations);
filter ->SetTimeStep (timeStep );
filter ->SetConductanceParameter(conductance );
filter ->Update();
A typical value for the time step is 0.125. The number of iterations is typically set to 5; more
iterations result in further smoothing and will increase the computing time linearly.
Figure 8.20 illustrates the effect of this filter. In this example the filter was run with a time step of
0.125, and 5 iterations. The figure shows how homogeneous regions are smoothed and edges are
preserved.
The following classes provide similar functionality:
• itk::BilateralImageFilter
• itk::CurvatureAnisotropicDiffusionImageFilter
• itk::CurvatureFlowImageFilter
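As a minimal sketch, the itk::CurvatureFlowImageFilter listed above could be substituted in the
pipeline with illustrative parameter values:

#include "itkCurvatureFlowImageFilter.h"

typedef itk::CurvatureFlowImageFilter<InputImageType, OutputImageType> CurvatureFlowFilterType;
CurvatureFlowFilterType::Pointer smoother = CurvatureFlowFilterType::New();

smoother->SetInput(reader->GetOutput());
smoother->SetNumberOfIterations(5);  // illustrative value
smoother->SetTimeStep(0.125);        // illustrative value
smoother->Update();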
Mean Shift filtering and clustering
The source code for this example can be found in the file
Examples/BasicFilters/MeanShiftSegmentationFilterExample.cxx.
This example demonstrates the use of the otb::MeanShiftSegmentationFilter class which
implements filtering and clustering using the mean shift algorithm [30]. For a given pixel, the
mean shift will build a set of neighboring pixels within a given spatial radius and a color range. The
spatial and color center of this set is then computed and the algorithm iterates with this new spatial
and color center. The Mean Shift can be used for edge-preserving smoothing, or for clustering.
We start by including the needed header file.
#include "otbMeanShiftSegmentationFilter.h"
We then declare the classical typedefs needed for reading and writing the images.
const unsigned int Dimension = 2;
typedef float PixelType ;
typedef unsigned int LabelPixelType;
typedef itk::RGBPixel <unsigned char> ColorPixelType;
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::Image <LabelPixelType , Dimension > LabelImageType;
typedef otb::Image <ColorPixelType , Dimension > RGBImageType;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
typedef otb::ImageFileWriter <LabelImageType > LabelWriterType;
typedef otb::MeanShiftSegmentationFilter<ImageType , LabelImageType , ImageType > FilterType ;
We instantiate the filter, the reader, and two writers (for the labeled and clustered images).
FilterType ::Pointer filter = FilterType ::New ();
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer1 = WriterType ::New ();
LabelWriterType::Pointer writer2 = LabelWriterType::New ();
We set the file names for the reader and the writers:
reader ->SetFileName (infname );
writer1 ->SetFileName (clusteredfname);
writer2 ->SetFileName (labeledfname);
We can now set the parameters for the filter. There are 3 main parameters: the spatial radius used
for defining the neighborhood, the range radius used for defining the interval in the color space and
the minimum size for the regions to be kept after clustering.
filter ->SetSpatialBandwidth(spatialRadius);
filter ->SetRangeBandwidth(rangeRadius );
filter ->SetMinRegionSize(minRegionSize);
Two other parameters can be set: the maximum number of iterations and the convergence threshold.
The iterative scheme will stop if convergence has not been reached after the maximum number of
iterations. The threshold parameter defines the mean-shift vector convergence value: the iterative
scheme will stop if the mean-shift vector is below this threshold or if the maximum number of
iterations has been reached.
Figure 8.21: From top to bottom and left to right: Original image, image filtered by mean shift after
clustering, and labeled image.
filter ->SetMaxIterationNumber(maxiter );
filter ->SetThreshold(thres);
We can now plug the pipeline and run it.
filter ->SetInput(reader ->GetOutput ());
writer1 ->SetInput(filter ->GetClusteredOutput());
writer2 ->SetInput(filter ->GetLabelOutput());
writer1 ->Update ();
writer2 ->Update ();
Figure 8.21 shows the result of applying the mean shift to a Quickbird image.
8.7.3 Edge Preserving Speckle Reduction Filters
The source code for this example can be found in the file
Examples/BasicFilters/LeeImageFilter.cxx.
This example illustrates the use of the otb::LeeImageFilter. This filter belongs to the family of
the edge-preserving smoothing filters which are usually used for speckle reduction in radar images.
The Lee filter [85] applies a linear regression which minimizes the mean-square error in the frame of
a multiplicative speckle model.
The first step required to use this filter is to include its header file.
#include "otbLeeImageFilter.h"
Then we must decide what pixel type to use for the image.
typedef unsigned char PixelType ;
The images are defined using the pixel type and the dimension.
typedef otb::Image <PixelType , 2> InputImageType;
typedef otb::Image <PixelType , 2> OutputImageType;
The filter can be instantiated using the image types defined above.
typedef otb::LeeImageFilter <InputImageType , OutputImageType > FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file.
typedef otb::ImageFileReader <InputImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The image obtained with the reader is passed as input to the otb::LeeImageFilter.
filter ->SetInput(reader ->GetOutput ());
The method SetRadius() defines the size of the window to be used for the computation of the local
statistics. The method SetNbLooks() sets the number of looks of the input image.
FilterType ::SizeType Radius;
Radius[0] = atoi(argv[3]);
Radius[1] = atoi(argv[3]);
filter ->SetRadius (Radius);
filter ->SetNbLooks (atoi(argv[4]));
Figure 8.22 shows the result of applying the Lee filter to a SAR image.
The following classes provide similar functionality:
• otb::FrostImageFilter
Figure 8.22: Result of applying the otb::LeeImageFilter to a SAR image.
The source code for this example can be found in the file
Examples/BasicFilters/FrostImageFilter.cxx.
This example illustrates the use of the otb::FrostImageFilter. This filter belongs to the fam-
ily of the edge-preserving smoothing filters which are usually used for speckle reduction in radar
images.
This filter uses a negative exponential convolution kernel. The output of the filter for pixel p is:
$\hat{I}_s = \sum_{p \in \eta_p} m_p I_p$

where

$m_p = \frac{K C_s^2 \exp(-K C_s^2 d_{s,p})}{\sum_{p \in \eta_p} K C_s^2 \exp(-K C_s^2 d_{s,p})}$   and   $d_{s,p} = (i - i_p)^2 + (j - j_p)^2$
• K : the decrease coefficient
• (i, j) : the coordinates of the pixel inside the region defined by ηs
• (ip, jp) : the coordinates of the pixels belonging to ηp ⊂ ηs
• Cs : the variation coefficient computed over ηp
Most of this example is similar to the previous one and only the differences will be highlighted.
First, we need to include the header:
#include "otbFrostImageFilter.h"
The filter can be instantiated using the image types defined previously.
typedef otb::FrostImageFilter <InputImageType , OutputImageType > FilterType ;
The image obtained with the reader is passed as input to the otb::FrostImageFilter.
filter ->SetInput(reader ->GetOutput ());
Figure 8.23: Result of applying the otb::FrostImageFilter to a SAR image.
The method SetRadius() defines the size of the window to be used for the computation of the local
statistics. The method SetDeramp() sets the K coefficient.
FilterType ::SizeType Radius;
Radius[0] = atoi(argv[3]);
Radius[1] = atoi(argv[3]);
filter ->SetRadius (Radius);
filter ->SetDeramp (atof(argv[4]));
Figure 8.23 shows the result of applying the Frost filter to a SAR image.
The following classes provide similar functionality:
• otb::LeeImageFilter
8.7.4 Edge Preserving Markov Random Field
The Markov Random Field framework for OTB is described in more detail in section 19.2.6 (p. 492).
The source code for this example can be found in the file
Examples/Markov/MarkovRestaurationExample.cxx.
The Markov Random Field framework can be used to apply edge-preserving filtering, thus playing
a restoration role.
This example applies the otb::MarkovRandomFieldFilter for image restoration. The structure
of the example is similar to the other MRF example. The original image is assumed to be coded
in one byte, thus 256 states are possible for each pixel. The only other modifications reside in the
energy function chosen for the fidelity and for the regularization.
For the regularization energy function, we choose an edge preserving function:
\[
\Phi(u) = \frac{u^2}{1 + u^2} \tag{8.9}
\]
and for the fidelity function, we choose a Gaussian model.
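As a reminder (this expression is not reproduced from the source, but a Gaussian model classically leads, up to constants, to a quadratic data term), the fidelity energy compares the current value \(x_s\) of pixel s with the observed value \(y_s\):
\[
U_{\text{fidelity}}(x_s, y_s) = (x_s - y_s)^2
\]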
The starting state of the Markov Random Field is given by the image itself as the final state should
not be too far from it.
The first step toward the use of this filter is the inclusion of the proper header files:
#include "otbMRFEnergyEdgeFidelity.h"
#include "otbMRFEnergyGaussian.h"
#include "otbMRFOptimizerMetropolis.h"
#include "otbMRFSamplerRandom.h"
We declare the usual types:
const unsigned int Dimension = 2;
typedef double InternalPixelType;
typedef unsigned char LabelledPixelType;
typedef otb::Image <InternalPixelType , Dimension > InputImageType;
typedef otb::Image <LabelledPixelType , Dimension > LabelledImageType;
We need to declare an additional reader for the initial state of the MRF. This reader has to be instan-
tiated on the LabelledImageType.
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileReader <LabelledImageType > ReaderLabelledType;
typedef otb::ImageFileWriter <LabelledImageType > WriterType ;
ReaderType ::Pointer reader = ReaderType ::New();
ReaderLabelledType::Pointer reader2 = ReaderLabelledType::New ();
WriterType ::Pointer writer = WriterType ::New();
const char * inputFilename = argv[1];
const char * labelledFilename = argv[2];
const char * outputFilename = argv[3];
reader ->SetFileName (inputFilename);
reader2 ->SetFileName (labelledFilename);
writer ->SetFileName (outputFilename);
We declare all the necessary types for the MRF:
typedef otb::MarkovRandomFieldFilter
<InputImageType , LabelledImageType > MarkovRandomFieldFilterType;
typedef otb::MRFSamplerRandom <InputImageType , LabelledImageType > SamplerType ;
typedef otb::MRFOptimizerMetropolis OptimizerType;
The regularization and the fidelity energies are declared and instantiated:
typedef otb:: MRFEnergyEdgeFidelity
<LabelledImageType , LabelledImageType > EnergyRegularizationType;
typedef otb::MRFEnergyGaussian
<InputImageType , LabelledImageType > EnergyFidelityType;
MarkovRandomFieldFilterType::Pointer markovFilter =
MarkovRandomFieldFilterType::New();
EnergyRegularizationType::Pointer energyRegularization =
EnergyRegularizationType::New ();
EnergyFidelityType::Pointer energyFidelity = EnergyFidelityType::New();
OptimizerType::Pointer optimizer = OptimizerType::New();
SamplerType ::Pointer sampler = SamplerType ::New ();
The number of possible states for each pixel is 256 as the image is assumed to be coded on one byte
and we pass the parameters to the markovFilter.
unsigned int nClass = 256;
optimizer ->SetSingleParameter(atof(argv[6]));
markovFilter ->SetNumberOfClasses(nClass);
markovFilter ->SetMaximumNumberOfIterations(atoi(argv[5]));
markovFilter ->SetErrorTolerance(0.0);
markovFilter ->SetLambda (atof(argv[4]));
markovFilter ->SetNeighborhoodRadius(1);
markovFilter ->SetEnergyRegularization(energyRegularization);
markovFilter ->SetEnergyFidelity(energyFidelity);
markovFilter ->SetOptimizer(optimizer );
markovFilter ->SetSampler (sampler );
The original state of the MRF filter is passed through the SetTrainingInput() method:
markovFilter ->SetTrainingInput(reader2->GetOutput ());
And we plug the pipeline:
markovFilter ->SetInput(reader ->GetOutput ());
typedef itk:: RescaleIntensityImageFilter
<LabelledImageType , LabelledImageType > RescaleType ;
RescaleType ::Pointer rescaleFilter = RescaleType ::New();
rescaleFilter ->SetOutputMinimum(0);
rescaleFilter ->SetOutputMaximum(255);
rescaleFilter ->SetInput(markovFilter ->GetOutput ());
writer ->SetInput(rescaleFilter ->GetOutput ());
writer ->Update();
Figure 8.24: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird
image for restoration. From left to right: original image, restored image with edge preservation.
Figure 8.24 shows the output of the Markov Random Field restoration.
8.8 Distance Map
The source code for this example can be found in the file
Examples/Filtering/DanielssonDistanceMapImageFilter.cxx.
This example illustrates the use of the itk::DanielssonDistanceMapImageFilter. This filter
generates a distance map from the input image using the algorithm developed by Danielsson [33].
As secondary outputs, a Voronoi partition of the input elements is produced, as well as a vector
image with the components of the distance vector to the closest point. The input to the map is
assumed to be a set of points on the input image. Each point/pixel is considered to be a separate
entity even if they share the same gray level value.
The first step required to use this filter is to include its header file.
#include "itkDanielssonDistanceMapImageFilter.h"
Then we must decide what pixel types to use for the input and output images. Since the output will
contain distances measured in pixels, the pixel type should be able to represent at least the width of
the image or, in N-D terms, the maximum extent along any of the dimensions. The input and
output image types are now defined using their respective pixel type and dimension.
typedef unsigned char InputPixelType;
typedef unsigned short OutputPixelType;
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
Figure 8.25: DanielssonDistanceMapImageFilter output. Set of pixels, distance map and Voronoi partition.
The filter type can be instantiated using the input and output image types defined above. A filter
object is created with the New() method.
typedef itk::DanielssonDistanceMapImageFilter<
InputImageType , OutputImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
The input to the filter is taken from a reader and its output is passed to an
itk::RescaleIntensityImageFilter and then to a writer.
filter ->SetInput(reader ->GetOutput ());
scaler ->SetInput(filter ->GetOutput ());
writer ->SetInput(scaler ->GetOutput ());
The type of input image has to be specified. In this case, a binary image is selected.
filter ->InputIsBinaryOn();
Figure 8.25 illustrates the effect of this filter on a binary image with a set of points. The input image
is shown at left, the distance map at the center and the Voronoi partition at right. This filter computes
distance maps in N-dimensions and is therefore capable of producing N − D Voronoi partitions.
The Voronoi map is obtained with the GetVoronoiMap() method. In the lines below we connect
this output to the intensity rescaler and save the result in a file.
scaler ->SetInput(filter ->GetVoronoiMap());
writer ->SetFileName (voronoiMapFileName);
writer ->Update();
Execution of the writer is triggered by the invocation of the Update() method. Since this method
can potentially throw exceptions it must be placed in a try/catch block.
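A minimal sketch of such a block, following the convention used by the other examples in this guide (the error handling shown is only illustrative):
try
  {
  writer->Update();
  }
catch (itk::ExceptionObject& err)
  {
  std::cerr << "ExceptionObject caught!" << std::endl;
  std::cerr << err << std::endl;
  return -1;
  }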
8.9 Rasterization
Rasterization is the process of rendering vector data on a raster grid. This rasterization can be either
binary or more complete, including different styles and labels for the vector features. For rasterization
purposes, OTB uses the Mapnik library through the otb::VectorDataToImageFilter.
Hence, rasterization will only be available to users compiling OTB with the Mapnik CMake option
set to ON, or using a binary package with Mapnik activated.
The source code for this example can be found in the file
Examples/Filtering/RasterizationExample.cxx.
The otb::VectorDataToMapFilter allows one to rasterize a given vector data set as a
binary mask. This example will demonstrate how to use this filter to perform rasterization of the
SRTM water body masks available here: http://dds.cr.usgs.gov/srtm/version2_1/SWBD/.
The first step to use this filter is to include the appropriate headers:
#include "otbVectorData.h"
#include "otbImage.h"
#include "otbVectorDataToMapFilter.h"
Then, we need to define the appropriate VectorData and Image type.
typedef unsigned char PixelType ;
typedef otb::Image <PixelType , 2> ImageType ;
typedef otb::VectorData <> VectorDataType;
Using these typedefs, we can define and instantiate the otb::VectorDataToMapFilter.
typedef otb::VectorDataToMapFilter <VectorDataType ,
ImageType > VectorDataToMapFilterType;
VectorDataToMapFilterType::Pointer vectorDataRendering
= VectorDataToMapFilterType::New();
We will also define an otb::VectorDataFileReader to read the VectorData, as well as an
otb::VectorDataProjectionFilter to reproject our data into a given map projection.
typedef otb::VectorDataFileReader <VectorDataType > VectorDataFileReaderType;
VectorDataFileReaderType::Pointer reader = VectorDataFileReaderType::New();
reader ->SetFileName (argv[1]);
typedef otb::VectorDataProjectionFilter<VectorDataType ,
VectorDataType > ProjectionFilterType;
ProjectionFilterType::Pointer projection = ProjectionFilterType::New ();
projection ->SetInput (reader ->GetOutput ());
The next step is to specify the map projection into which to reproject our vector data.
projection ->SetOutputProjectionRef(projectionRefWkt);
Since the input vector data can be quite large, we will extract the region of interest using
the otb::VectorDataExtractROI.
The first step is to define the region of interest.
typedef otb::RemoteSensingRegion <double> RegionType ;
ImageType ::SizeType size;
size[0] = 500;
size[1] = 500;
ImageType ::PointType origin;
origin[0] = 633602; //UL easting
origin[1] = 4961679; //UL northing
ImageType :: SpacingType spacing;
spacing [0] = 56;
spacing [1] = -56;
RegionType region;
RegionType ::SizeType sizeInUnit ;
sizeInUnit [0] = size[0] * spacing [0];
sizeInUnit [1] = size[1] * spacing [1];
region.SetSize(sizeInUnit );
region.SetOrigin (origin);
region.SetRegionProjection(projectionRefWkt);
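As a quick check of the arithmetic above (values taken from the snippet), the extracted footprint spans 500 × 56 m = 28 000 m in easting and 500 × (−56) m = −28 000 m in northing; the negative Y extent simply reflects the negative Y spacing, so the region extends southwards from the upper-left origin.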
Then, we define and set up the otb::VectorDataExtractROI filter using the region.
typedef otb::VectorDataExtractROI <VectorDataType > ExtractROIType;
ExtractROIType::Pointer extractROI = ExtractROIType::New ();
extractROI ->SetRegion (region);
extractROI ->SetInput (projection ->GetOutput ());
Now, we can plug the ROI filter into the otb::VectorDataToMapFilter.
vectorDataRendering ->SetInput(extractROI ->GetOutput ());
vectorDataRendering ->SetSize(size);
vectorDataRendering ->SetOrigin (origin);
vectorDataRendering ->SetSpacing (spacing );
Since we are interested in binary rendering, we need to set the appropriate rendering style.
vectorDataRendering ->SetRenderingStyleType(VectorDataToMapFilterType::Binary);
The rendering filter will return a binary image with label 0 outside the rasterized vector features
and 255 inside. To get a fancier rendering, we will substitute a blue color for the background
value and a green one for the foreground value.
typedef itk::RGBAPixel <unsigned char> RGBAPixelType;
Figure 8.26: Rasterized SRTM water bodies near Arcachon, France.
typedef otb::Image <RGBAPixelType , 2> RGBAImageType;
typedef itk::ChangeLabelImageFilter <ImageType ,
RGBAImageType > ChangeLabelImageFilterType;
ChangeLabelImageFilterType::Pointer
changeLabelFilter = ChangeLabelImageFilterType::New();
RGBAPixelType green , blue;
green.SetAlpha (255);
green.SetGreen (255);
blue.SetAlpha (255);
blue.SetBlue (255);
changeLabelFilter ->SetChange (0, blue);
changeLabelFilter ->SetChange (255, green);
changeLabelFilter ->SetInput (vectorDataRendering ->GetOutput ());
The last step is to write the image to disk using an otb::ImageFileWriter.
typedef otb::ImageFileWriter <RGBAImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(changeLabelFilter ->GetOutput ());
writer ->SetFileName (argv[2]);
writer ->Update();
Figure 8.26 illustrates the use of the rasterization filter on SRTM water bodies mask near Arcachon
in France. Ocean appears in blue while land appears in green.
CHAPTER NINE
IMAGE REGISTRATION
Figure 9.1: Image registration is the task of finding a spatial transform mapping one image into another.
This chapter introduces OTB’s (actually mainly ITK’s) capabilities for performing image registration.
Please note that the disparity map estimation approach presented in chapter 10 is very closely
related to image registration. Image registration is the process of determining the spatial transform
that maps points from one image to homologous points on an object in the second image. This
concept is schematically represented in Figure 9.1. In OTB, registration is performed within a
framework of pluggable components that can easily be interchanged. This flexibility means that a
combinatorial variety of registration methods can be created, allowing users to pick and choose the
right tools for their specific application.
9.1 Registration Framework
The components of the registration framework and their interconnections are shown in Figure 9.2.
The basic input data to the registration process are two images: one is defined as the fixed image f(X)
and the other as the moving image m(X), where X represents a position in N-dimensional space.
Registration is treated as an optimization problem with the goal of finding the spatial mapping that
will bring the moving image into alignment with the fixed image.
The transform component T(X) represents the spatial mapping of points from the fixed image space
to points in the moving image space. The interpolator is used to evaluate moving image intensities
at non-grid positions. The metric component S(f,m◦ T) provides a measure of how well the fixed
image is matched by the transformed moving image. This measure forms the quantitative criterion
to be optimized by the optimizer over the search space defined by the parameters of the transform.
Figure 9.2: The basic components of the registration framework are two input images, a transform, a metric,
an interpolator and an optimizer.
These various OTB/ITK registration components will be described in later sections. First, we begin
with some simple registration examples.
9.2 ”Hello World” Registration
The source code for this example can be found in the file
Examples/Registration/ImageRegistration1.cxx.
This example illustrates the use of the image registration framework in ITK/OTB. It should be read
as a ”Hello World” for registration, which means that, for now, you should not ask “why?”. Instead,
use the example as an introduction to the elements that are typically involved in solving an image
registration problem.
A registration method requires the following set of components: two input images, a transform,
a metric, an interpolator and an optimizer. Some of these components are parameterized by the
image type for which the registration is intended. The following header files provide declarations of
common types used for these components.
#include "itkImageRegistrationMethod.h"
#include "itkTranslationTransform.h"
#include "itkMeanSquaresImageToImageMetric.h"
#include "itkLinearInterpolateImageFunction.h"
#include "itkRegularStepGradientDescentOptimizer.h"
#include "otbImage.h"
The types of each one of the components in the registration method should be instantiated first.
For that purpose, we start by selecting the image dimension and the type used for representing
image pixels.
const unsigned int Dimension = 2;
typedef float PixelType ;
The types of the input images are instantiated by the following lines.
typedef otb::Image <PixelType , Dimension > FixedImageType;
typedef otb::Image <PixelType , Dimension > MovingImageType;
The transform that will map the fixed image space into the moving image space is defined below.
typedef itk::TranslationTransform <double, Dimension > TransformType;
An optimizer is required to explore the parameter space of the transform in search of optimal values
of the metric.
typedef itk:: RegularStepGradientDescentOptimizer OptimizerType;
The metric will compare how well the two images match each other. Metric types are usually
parameterized by the image types as it can be seen in the following type declaration.
typedef itk::MeanSquaresImageToImageMetric<
FixedImageType ,
MovingImageType > MetricType ;
Finally, the type of the interpolator is declared. The interpolator will evaluate the intensities of the
moving image at non-grid positions.
typedef itk::LinearInterpolateImageFunction<
MovingImageType ,
double> InterpolatorType;
The registration method type is instantiated using the types of the fixed and moving images. This
class is responsible for interconnecting all the components that we have described so far.
typedef itk::ImageRegistrationMethod <
FixedImageType ,
MovingImageType > RegistrationType;
Each one of the registration components is created using its New() method and is assigned to its
respective itk::SmartPointer.
MetricType ::Pointer metric = MetricType ::New ();
TransformType::Pointer transform = TransformType::New();
OptimizerType::Pointer optimizer = OptimizerType::New();
InterpolatorType::Pointer interpolator = InterpolatorType::New ();
RegistrationType::Pointer registration = RegistrationType::New ();
Each component is now connected to the instance of the registration method.
registration ->SetMetric (metric);
registration ->SetOptimizer(optimizer );
registration ->SetTransform(transform );
registration ->SetInterpolator(interpolator);
Since we are working with high resolution images and the expected shifts are larger than the resolution,
we will need to smooth the images in order to prevent the optimizer from getting stuck in local minima.
To do this, we will use a simple mean filter.
typedef itk::MeanImageFilter <
FixedImageType , FixedImageType > FixedFilterType;
typedef itk::MeanImageFilter <
MovingImageType , MovingImageType > MovingFilterType;
FixedFilterType::Pointer fixedFilter = FixedFilterType::New ();
MovingFilterType::Pointer movingFilter = MovingFilterType::New ();
FixedImageType::SizeType indexFRadius;
indexFRadius[0] = 4; // radius along x
indexFRadius[1] = 4; // radius along y
fixedFilter ->SetRadius (indexFRadius);
MovingImageType::SizeType indexMRadius;
indexMRadius[0] = 4; // radius along x
indexMRadius[1] = 4; // radius along y
movingFilter ->SetRadius (indexMRadius);
fixedFilter ->SetInput (fixedImageReader ->GetOutput ());
movingFilter ->SetInput(movingImageReader ->GetOutput ());
Now we can plug the output of the smoothing filters at the input of the registration method.
registration ->SetFixedImage(fixedFilter ->GetOutput ());
registration ->SetMovingImage(movingFilter ->GetOutput ());
The registration can be restricted to consider only a particular region of the fixed image as input to
the metric computation. This region is defined with the SetFixedImageRegion() method. You
could use this feature to reduce the computational time of the registration or to avoid unwanted
objects present in the image from affecting the registration outcome. In this example we use the full
available content of the image. This region is identified by the BufferedRegion of the fixed image.
Note that for this region to be valid the reader must first invoke its Update() method.
fixedFilter ->Update();
registration ->SetFixedImageRegion(
fixedFilter ->GetOutput ()->GetBufferedRegion());
The parameters of the transform are initialized by passing them in an array. This can be used to set up
an initial known correction of the misalignment. In this particular case, a translation transform is
being used for the registration. The array of parameters for this transform is simply composed of the
translation values along each dimension. Setting the values of the parameters to zero initializes the
transform to an Identity transform. Note that the array constructor requires the number of elements
to be passed as an argument.
typedef RegistrationType::ParametersType ParametersType;
ParametersType initialParameters(transform ->GetNumberOfParameters());
initialParameters[0] = 0.0; // Initial offset in mm along X
initialParameters[1] = 0.0; // Initial offset in mm along Y
registration ->SetInitialTransformParameters(initialParameters);
At this point the registration method is ready for execution. The optimizer is the component that
drives the execution of the registration. However, the ImageRegistrationMethod class orchestrates
the ensemble to make sure that everything is in place before control is passed to the optimizer.
It is usually desirable to fine tune the parameters of the optimizer. Each optimizer has particular
parameters that must be interpreted in the context of the optimization strategy it implements. The
optimizer used in this example is a variant of gradient descent that attempts to prevent it from taking
steps that are too large. At each iteration, this optimizer will take a step along the direction of the
itk::ImageToImageMetric derivative. The initial length of the step is defined by the user. Each
time the direction of the derivative abruptly changes, the optimizer assumes that a local extremum has
been passed and reacts by reducing the step length by half. After several reductions of the step
length, the optimizer may be moving in a very restricted area of the transform parameter space. The
user can define how small the step length should be to consider convergence to have been reached.
This is equivalent to defining the precision with which the final transform should be known.
The initial step length is defined with the method SetMaximumStepLength(), while the tolerance
for convergence is defined with the method SetMinimumStepLength().
optimizer ->SetMaximumStepLength(3);
optimizer ->SetMinimumStepLength(0.01);
In case the optimizer never succeeds in reaching the desired precision tolerance, it is prudent to estab-
lish a limit on the number of iterations to be performed. This maximum number is defined with the
method SetNumberOfIterations().
optimizer ->SetNumberOfIterations(200);
The registration process is triggered by an invocation to the Update() method. If something goes
wrong during the initialization or execution of the registration an exception will be thrown. We
should therefore place the Update() method inside a try/catch block as illustrated in the following
lines.
try
{
registration ->Update ();
}
catch (itk:: ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return -1;
}
In a real life application, you may attempt to recover from the error by taking more effective actions
in the catch block. Here we are simply printing out a message and then terminating the execution of
the program.
The result of the registration process is an array of parameters that defines the spatial transformation
in a unique way. This final result is obtained using the GetLastTransformParameters() method.
ParametersType finalParameters = registration ->GetLastTransformParameters();
In the case of the itk::TranslationTransform, there is a straightforward interpretation of the
parameters. Each element of the array corresponds to a translation along one spatial dimension.
const double TranslationAlongX = finalParameters[0];
const double TranslationAlongY = finalParameters[1];
The optimizer can be queried for the actual number of iterations performed to reach convergence.
The GetCurrentIteration() method returns this value. A large number of iterations may be an
indication that the maximum step length has been set too small, which is undesirable since it results
in long computational times.
const unsigned int numberOfIterations = optimizer ->GetCurrentIteration();
The value of the image metric corresponding to the last set of parameters can be obtained with the
GetValue() method of the optimizer.
const double bestValue = optimizer ->GetValue ();
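These values are typically printed to standard output; a minimal sketch using the variables defined above (the exact formatting used by the full example is an assumption):
std::cout << "Result = " << std::endl;
std::cout << " Translation X = " << TranslationAlongX << std::endl;
std::cout << " Translation Y = " << TranslationAlongY << std::endl;
std::cout << " Iterations    = " << numberOfIterations << std::endl;
std::cout << " Metric value  = " << bestValue << std::endl;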
Let’s execute this example over two of the images provided in Examples/Data:
• QB Suburb.png
• QB Suburb13x17y.png
The second image is the result of intentionally translating the first image by (13,17) pixels. Both
images have unit-spacing and are shown in Figure 9.3. The registration takes 18 iterations and the
resulting transform parameters are:
Translation X = 12.0192
Translation Y = 16.0231
Figure 9.3: Fixed and Moving image provided as input to the registration method.
As expected, these values match quite well the misalignment that we intentionally introduced in the
moving image.
It is common, as the last step of a registration task, to use the resulting transform to map the moving
image into the fixed image space. This is easily done with the itk::ResampleImageFilter. First,
a ResampleImageFilter type is instantiated using the image types. It is convenient to use the fixed
image type as the output type since it is likely that the transformed moving image will be compared
with the fixed image.
typedef itk::ResampleImageFilter <
MovingImageType ,
FixedImageType > ResampleFilterType;
A resampling filter is created and the moving image is connected as its input.
ResampleFilterType::Pointer resampler = ResampleFilterType::New ();
resampler ->SetInput (movingImageReader ->GetOutput ());
The Transform that is produced as output of the Registration method is also passed as input to the
resampling filter. Note the use of the methods GetOutput() and Get(). This combination is needed
here because the registration method acts as a filter whose output is a transform decorated in the form
of an itk::DataObject. For details on this construction you may want to read the documentation of
the itk::DataObjectDecorator.
resampler ->SetTransform(registration ->GetOutput ()->Get());
The ResampleImageFilter requires additional parameters to be specified, in particular, the spacing,
origin and size of the output image. The default pixel value is also set to a distinct gray level in order
to highlight the regions that are mapped outside of the moving image.
FixedImageType::Pointer fixedImage = fixedImageReader ->GetOutput ();
resampler ->SetSize(fixedImage ->GetLargestPossibleRegion(). GetSize ());
Figure 9.4: Mapped moving image and its difference with the fixed image before and after registration
resampler ->SetOutputOrigin(fixedImage ->GetOrigin ());
resampler ->SetOutputSpacing(fixedImage ->GetSpacing ());
resampler ->SetDefaultPixelValue(100);
The output of the filter is passed to a writer that will store the image in a file. An
itk::CastImageFilter is used to convert the pixel type of the resampled image to the final type
used by the writer. The cast and writer filters are instantiated below.
typedef unsigned char OutputPixelType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
typedef itk::CastImageFilter <FixedImageType ,
OutputImageType > CastFilterType;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The filters are created by invoking their New() method.
WriterType ::Pointer writer = WriterType ::New ();
CastFilterType::Pointer caster = CastFilterType::New();
The filters are connected together and the Update() method of the writer is invoked in order to
trigger the execution of the pipeline.
caster ->SetInput(resampler ->GetOutput ());
writer ->SetInput(caster ->GetOutput ());
writer ->Update();
The fixed image and the transformed moving image can easily be compared using the
itk::SubtractImageFilter. This pixel-wise filter computes the difference between homologous
pixels of its two input images.
typedef itk::SubtractImageFilter <
FixedImageType ,
FixedImageType ,
FixedImageType > DifferenceFilterType;
DifferenceFilterType::Pointer difference = DifferenceFilterType::New ();
Figure 9.5: Pipeline structure of the registration example.
difference ->SetInput1 (fixedImageReader ->GetOutput ());
difference ->SetInput2 (resampler ->GetOutput ());
Note that the use of subtraction as a method for comparing the images is appropriate here because
we chose to represent the images using the pixel type float. A different filter would have been used
if the pixel type of the images were any of the unsigned integer types.
Since the differences between the two images may correspond to very low intensity values, we
rescale those intensities with an itk::RescaleIntensityImageFilter in order to make them more
visible. This rescaling also makes it possible to visualize the negative values even if we save the
difference image in a file format that only supports unsigned pixel values1. We also reduce the
DefaultPixelValue to “1” in order to prevent that value from absorbing the dynamic range of the
differences between the two images.
typedef itk::RescaleIntensityImageFilter<
FixedImageType ,
OutputImageType > RescalerType;
RescalerType::Pointer intensityRescaler = RescalerType::New();
intensityRescaler ->SetInput (difference ->GetOutput ());
intensityRescaler ->SetOutputMinimum(0);
intensityRescaler ->SetOutputMaximum(255);
resampler ->SetDefaultPixelValue(1);
Its output can be passed to another writer.
WriterType ::Pointer writer2 = WriterType ::New ();
writer2 ->SetInput(intensityRescaler ->GetOutput ());
For the purpose of comparison, the difference between the fixed image and the moving image before
registration can also be computed by simply setting the transform to an identity transform. Note
that the resampling is still necessary because the moving image does not necessarily have the same
spacing, origin and number of pixels as the fixed image. Therefore a pixel-by-pixel operation cannot
1This is the case of PNG, BMP, JPEG and TIFF among other common file formats.
in general be performed. The resampling process with an identity transform will ensure that we have
a representation of the moving image in the grid of the fixed image.
TransformType::Pointer identityTransform = TransformType::New ();
identityTransform ->SetIdentity ();
resampler ->SetTransform(identityTransform);
The complete pipeline structure of the current example is presented in Figure 9.5. The components
of the registration method are depicted as well. Figure 9.4 (left) shows the result of resampling the
moving image in order to map it onto the fixed image space. The top and right borders of the image
appear in the gray level selected with the SetDefaultPixelValue() in the ResampleImageFilter.
The center image shows the difference between the fixed image and the original moving image.
That is, the difference before the registration is performed. The right image shows the difference
between the fixed image and the transformed moving image. That is, after the registration has been
performed. Both difference images have been rescaled in intensity in order to highlight those pixels
where differences exist. Note that the final registration is still off by a fraction of a pixel, which
results in bands around the edges of image structures appearing in the difference image. A perfect
registration would have produced a null difference image.
9.3 Features of the Registration Framework
This section presents a discussion on the two most common difficulties that users encounter when
they start using the ITK registration framework. They are, in order of difficulty:
• The direction of the Transform mapping
• The fact that registration is done in physical coordinates
Probably the reason why these two topics tend to create confusion is that they are implemented in
different ways in other systems and therefore users tend to have different expectations regarding
how things should work in OTB. The situation is further complicated by the fact that most people
describe image operations as if they were manually performed on a picture on paper.
9.3.1 Direction of the Transform Mapping
The Transform that is optimized in the ITK registration framework is the one that maps points from
the physical space of the fixed image into the physical space of the moving image. This is illustrated
in Figure 9.6. This implies that the Transform will accept as input points from the fixed image and
it will compute the coordinates of the analogous points in the moving image. What tends to create
confusion is the fact that when the Transform shifts a point on the positive X direction, the visual
effect of this mapping, once the moving image is resampled, is equivalent to manually shifting the
Figure 9.6: Different coordinate systems involved in the image registration process. Note that the transform
being optimized is the one mapping from the physical space of the fixed image into the physical space of the
moving image.
moving image along the negative X direction. In the same way, when the Transform applies a
clockwise rotation to the fixed image points, the visual effect of this mapping once the moving image
has been resampled is equivalent to manually rotating the moving image counterclockwise.
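As a small illustration of this mapping direction (a sketch, not part of the original example), a translation of +10 units along X maps fixed-image points 10 units to the right in moving-image space, which is exactly why the resampled moving image appears shifted to the left:
#include "itkTranslationTransform.h"

typedef itk::TranslationTransform<double, 2> TransformType;
TransformType::Pointer transform = TransformType::New();

TransformType::ParametersType parameters(2);
parameters[0] = 10.0;   // shift of +10 physical units along X
parameters[1] = 0.0;
transform->SetParameters(parameters);

TransformType::InputPointType fixedPoint;
fixedPoint[0] = 100.0;
fixedPoint[1] = 50.0;

// The transform maps a point given in the fixed image physical space
// to its homologous position in the moving image physical space.
TransformType::OutputPointType movingPoint = transform->TransformPoint(fixedPoint);
// movingPoint is now (110.0, 50.0)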
The reason why this direction of mapping has been chosen for the ITK implementation of the regis-
tration framework is that this is the direction that better fits the fact that the moving image is expected
to be resampled using the grid of the fixed image. The nature of the resampling process is such that
an algorithm must go through every pixel of the fixed image and compute the intensity that should
be assigned to this pixel from the mapping of the moving image. This computation involves taking
the integral coordinates of the pixel in the image grid, usually called the “(i,j)” coordinates, mapping
them into the physical space of the fixed image (transform T1 in Figure 9.6), mapping those physical
coordinates into the physical space of the moving image (Transform to be optimized), then mapping
the physical coordinates of the moving image into the integral coordinates of the discrete grid of the
moving image (transform T2 in the figure), where the value of the pixel intensity will be computed
by interpolation.
If we had used the Transform that maps coordinates from the moving image physical space into the
fixed image physical space, then the resampling process could not guarantee that every pixel in the
grid of the fixed image would receive one and only one value. In other words, the resampling
would have resulted in an image with holes and with redundant or overlapping pixel values.
As you have seen in the previous examples, and you will corroborate in the remaining examples in
this chapter, the Transform computed by the registration framework is the Transform that can be
used directly in the resampling filter in order to map the moving image into the discrete grid of the
fixed image.
There are exceptional cases in which the transform that you want is actually the inverse transform of
the one computed by the ITK registration framework. Only in those cases may you have to resort to
invoking the GetInverse() method that most transforms offer. Make sure that, before you consider
following that dark path, you interact with the examples of resampling in order to get familiar with
the correct interpretation of the transforms.
9.3.2 Registration is done in physical space
The second common difficulty that users encounter with the ITK registration framework is related
to the fact that ITK performs registration in the context of physical space and not in the discrete
space of the image grid. Figure 9.6 shows this concept by crossing out the transform that goes between
the two image grids. One important consequence of this fact is that having the correct image origin
and image pixel size is fundamental for the success of the registration process in ITK. Users must
make sure that they provide correct values for the origin and spacing of both the fixed and moving
images.
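A quick way to verify what will actually be used is to print these values once the readers have been updated; a minimal sketch (the reader names follow the previous examples):
fixedImageReader->Update();
movingImageReader->Update();

std::cout << "Fixed  origin : " << fixedImageReader->GetOutput()->GetOrigin()   << std::endl;
std::cout << "Fixed  spacing: " << fixedImageReader->GetOutput()->GetSpacing()  << std::endl;
std::cout << "Moving origin : " << movingImageReader->GetOutput()->GetOrigin()  << std::endl;
std::cout << "Moving spacing: " << movingImageReader->GetOutput()->GetSpacing() << std::endl;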
A typical case that helps to understand this issue is to consider the registration of two images where
one has a pixel size different from the other: for example, a SPOT 5 image and a QuickBird image.
Typically, a QuickBird image will have a pixel size on the order of 0.6 m, while a SPOT 5 image will
have a pixel size of 2.5 m.
A user performing registration between a SPOT 5 image and a QuickBird image may naively
expect that, because the SPOT 5 image has fewer pixels, a scaling factor is required in the Transform
in order to map this image into the QuickBird image. At that point, this person is attempting to
interpret the registration process directly between the two image grids, or in pixel space. What ITK
will do in this case is to take into account the pixel size that the user has provided and it will use that
pixel size in order to compute a scaling factor for Transforms T1 and T2 in Figure 9.6. Since these
two transforms take care of the required scaling factor, the spatial Transform to be computed during
the registration process does not need to be concerned about such scaling. The transform that ITK
computes is the one that will physically map the landscape of the moving image into the landscape
of the fixed image.
In order to better understand these concepts, it is very useful to draw sketches of the fixed and moving
images to scale in the same physical coordinate system. That is the geometrical configuration that
the ITK registration framework uses as its context. Keeping this in mind helps a lot in correctly
interpreting the results of a registration process performed with ITK.
9.4 Multi-Modality Registration
Some of the most challenging cases of image registration arise when images of different modalities
are involved. In such cases, metrics based on direct comparison of gray levels are not applicable.
It has been extensively shown that metrics based on the evaluation of mutual information are well
suited for overcoming the difficulties of multi-modality registration.
The concept of Mutual Information is derived from Information Theory and its application to image
registration has been proposed in different forms by different groups [29, 93, 138]; a more detailed
review can be found in [70]. The OTB, through ITK, currently provides five different implementa-
tions of Mutual Information metrics (see section 9.7 for details). The following example illustrates
the practical use of some of these metrics.
9.4.1 Viola-Wells Mutual Information
The source code for this example can be found in the file
Examples/Registration/ImageRegistration2.cxx.
The following simple example illustrates how multiple imaging modalities can be registered us-
ing the ITK registration framework. The first difference between this and previous examples is
the use of the itk::MutualInformationImageToImageMetric as the cost-function to be opti-
mized. The second difference is the use of the itk::GradientDescentOptimizer. Due to the
stochastic nature of the metric computation, the values are too noisy to work successfully with
the itk::RegularStepGradientDescentOptimizer. Therefore, we will use the simpler
GradientDescentOptimizer with a user-defined learning rate. The following headers declare the basic
components of this registration method.
#include "itkImageRegistrationMethod.h"
#include "itkTranslationTransform.h"
#include "itkMutualInformationImageToImageMetric.h"
#include "itkLinearInterpolateImageFunction.h"
#include "itkGradientDescentOptimizer.h"
#include "otbImage.h"
One way to simplify the computation of the mutual information is to normalize the statistical distri-
bution of the two input images. The itk::NormalizeImageFilter is the perfect tool for this task.
It rescales the intensities of the input images in order to produce an output image with zero mean
and unit variance.
#include "itkNormalizeImageFilter.h"
Additionally, low-pass filtering of the images to be registered will also increase robustness against
noise. In this example, we will use the itk::DiscreteGaussianImageFilter for that purpose.
The characteristics of this filter have been discussed in Section 8.7.1.
#include "itkDiscreteGaussianImageFilter.h"
The moving and fixed image types should be instantiated first.
const unsigned int Dimension = 2;
typedef unsigned short PixelType ;
typedef otb::Image <PixelType , Dimension > FixedImageType;
typedef otb::Image <PixelType , Dimension > MovingImageType;
It is convenient to work with an internal image type because mutual information will perform better
on images with a normalized statistical distribution. The fixed and moving images will be normal-
ized and converted to this internal type.
typedef float InternalPixelType;
typedef otb::Image <InternalPixelType , Dimension > InternalImageType;
The rest of the image registration components are instantiated as illustrated in Section 9.2 with the
use of the InternalImageType.
typedef itk::TranslationTransform <double, Dimension > TransformType;
typedef itk::GradientDescentOptimizer OptimizerType;
typedef itk::LinearInterpolateImageFunction<InternalImageType ,
double> InterpolatorType;
typedef itk::ImageRegistrationMethod <InternalImageType ,
InternalImageType > RegistrationType;
The mutual information metric type is instantiated using the image types.
typedef itk::MutualInformationImageToImageMetric<
InternalImageType ,
InternalImageType > MetricType ;
The metric is created using the New() method and then connected to the registration object.
MetricType ::Pointer metric = MetricType ::New ();
registration ->SetMetric (metric);
The metric requires a number of parameters to be selected, including the standard deviation of
the Gaussian kernel for the fixed image density estimate, the standard deviation of the kernel for
the moving image density and the number of samples used to compute the densities and entropy
values. Details on the concepts behind the computation of the metric can be found in Section 9.7.4.
Experience has shown that a kernel standard deviation of 0.4 works well for images which have been
normalized to a mean of zero and unit variance. We will follow this empirical rule in this example.
metric ->SetFixedImageStandardDeviation(0.4);
metric ->SetMovingImageStandardDeviation(0.4);
The normalization filters are instantiated using the fixed and moving image types as input and the
internal image type as output.
typedef itk::NormalizeImageFilter <
FixedImageType ,
InternalImageType
> FixedNormalizeFilterType;
typedef itk::NormalizeImageFilter <
MovingImageType ,
InternalImageType
> MovingNormalizeFilterType;
FixedNormalizeFilterType::Pointer fixedNormalizer =
FixedNormalizeFilterType::New ();
MovingNormalizeFilterType::Pointer movingNormalizer =
MovingNormalizeFilterType::New();
The blurring filters are declared using the internal image type as both the input and output types. In
this example, we will set the variance for both blurring filters to 4.0.
typedef itk::DiscreteGaussianImageFilter<
InternalImageType ,
InternalImageType
> GaussianFilterType;
GaussianFilterType::Pointer fixedSmoother = GaussianFilterType::New();
GaussianFilterType::Pointer movingSmoother = GaussianFilterType::New();
fixedSmoother ->SetVariance (4.0);
movingSmoother ->SetVariance (4.0);
The output of the readers becomes the input to the normalization filters. The output of the normal-
ization filters is connected as input to the blurring filters. The input to the registration method is
taken from the blurring filters.
fixedNormalizer ->SetInput (fixedImageReader ->GetOutput ());
movingNormalizer ->SetInput (movingImageReader ->GetOutput ());
fixedSmoother ->SetInput(fixedNormalizer ->GetOutput ());
movingSmoother ->SetInput(movingNormalizer ->GetOutput ());
registration ->SetFixedImage(fixedSmoother ->GetOutput ());
registration ->SetMovingImage(movingSmoother ->GetOutput ());
We should now define the number of spatial samples to be considered in the metric computation.
Note that we were forced to postpone this setting until we had done the preprocessing of the images
because the number of samples is usually defined as a fraction of the total number of pixels in the
fixed image.
The number of spatial samples can usually be as low as 1% of the total number of pixels in the fixed
image. Increasing the number of samples improves the smoothness of the metric from one iteration
to another and therefore helps when this metric is used in conjunction with optimizers that rely on
the continuity of the metric values. The trade-off, of course, is that a larger number of samples results
in longer computation times for every evaluation of the metric.
It has been demonstrated empirically that the number of samples is not a critical parameter for the
registration process. When you start fine-tuning your own registration process, you should start by
using high values for the number of samples, for example in the range of 20% to 50% of the number of
pixels in the fixed image. Once you have succeeded in registering your images, you can then reduce the
number of samples progressively until you find a good compromise on the time it takes to compute
one evaluation of the Metric. Note that it is not useful to have very fast evaluations of the Metric if
the noise in their values results in more iterations being required by the optimizer to converge.
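The fixedImageRegion variable used below is not defined in this extract; a minimal sketch of how it can be obtained and passed to the registration object, assuming (as is common in this kind of example) that the buffered region of the normalized fixed image is used:
fixedNormalizer->Update();
FixedImageType::RegionType fixedImageRegion =
  fixedNormalizer->GetOutput()->GetBufferedRegion();
registration->SetFixedImageRegion(fixedImageRegion);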
const unsigned int numberOfPixels = fixedImageRegion.GetNumberOfPixels();
const unsigned int numberOfSamples =
static_cast<unsigned int>(numberOfPixels * 0.01);
metric ->SetNumberOfSpatialSamples(numberOfSamples);
Since larger values of mutual information indicate better matches than smaller values, we need to
maximize the cost function in this example. By default the GradientDescentOptimizer class is set to
minimize the value of the cost-function. It is therefore necessary to modify its default behavior by
invoking the MaximizeOn() method. Additionally, we need to define the optimizer’s step size using
the SetLearningRate() method.
optimizer ->SetLearningRate(150.0);
optimizer ->SetNumberOfIterations(300);
optimizer ->MaximizeOn ();
Figure 9.7: A SAR image (fixed image) and an aerial photograph (moving image) are provided as input to the
registration method.
Note that large values of the learning rate will make the optimizer unstable. Small values, on the
other hand, may result in the optimizer needing too many iterations in order to walk to the extrema of
the cost function. The easy way of fine tuning this parameter is to start with small values, probably
in the range of {5.0,10.0}. Once the other registration parameters have been tuned for producing
convergence, you may want to revisit the learning rate and start increasing its value until you observe
that the optimization becomes unstable. The ideal value for this parameter is the one that results in
a minimum number of iterations while still keeping a stable path on the parametric space of the
optimization. Keep in mind that this parameter is a multiplicative factor applied on the gradient of
the Metric. Therefore, its effect on the optimizer step length is proportional to the Metric values
themselves. Metrics with large values will require you to use smaller values for the learning rate in
order to maintain a similar optimizer behavior.
Let’s execute this example over two of the images provided in Examples/Data:
• RamsesROISmall.png
• ADS40RoiSmall.png
The moving image after resampling is presented on the left side of Figure 9.8. The center and
right figures present a checkerboard composite of the fixed and moving images before and after
registration. Since the real deformation between the two images is not simply a shift, some registration
errors remain, but the left part of the images is correctly registered.
Figure 9.8: Mapped moving image (left) and composition of fixed and moving images before (center) and after
(right) registration.
9.5 Centered Transforms
The OTB/ITK image coordinate origin is typically located in one of the image corners (see section
5.1.4 for details). This results in counter-intuitive transform behavior when rotations and scaling are
involved. Users tend to assume that rotations and scaling are performed around a fixed point at the
center of the image. In order to compensate for this difference in natural interpretation, the concept
of centered transforms has been introduced into the toolkit. The following sections describe the
main characteristics of such transforms.
9.5.1 Rigid Registration in 2D
The source code for this example can be found in the file
Examples/Registration/ImageRegistration5.cxx.
This example illustrates the use of the itk::CenteredRigid2DTransform for performing rigid
registration in 2D. The example code is for the most part identical to that presented in Sec-
tion 9.2. The main difference is the use of the CenteredRigid2DTransform here instead of the
itk::TranslationTransform.
In addition to the headers included in previous examples, the following header must also be included.
#include "itkCenteredRigid2DTransform.h"
The transform type is instantiated using the code below. The only template parameter for this class
is the representation type of the space coordinates.
typedef itk::CenteredRigid2DTransform<double> TransformType;
The transform object is constructed below and passed to the registration method.
TransformType::Pointer transform = TransformType::New();
registration ->SetTransform(transform );
Since we are working with high resolution images and the expected shifts are larger than the resolution,
we will need to smooth the images in order to prevent the optimizer from getting stuck in local minima.
To do this, we will use a simple mean filter.
typedef itk::MeanImageFilter <
FixedImageType , FixedImageType > FixedFilterType;
typedef itk::MeanImageFilter <
MovingImageType , MovingImageType > MovingFilterType;
FixedFilterType::Pointer fixedFilter = FixedFilterType::New();
MovingFilterType::Pointer movingFilter = MovingFilterType::New ();
FixedImageType::SizeType indexFRadius;
indexFRadius[0] = 4; // radius along x
indexFRadius[1] = 4; // radius along y
fixedFilter ->SetRadius (indexFRadius);
MovingImageType:: SizeType indexMRadius;
indexMRadius[0] = 4; // radius along x
indexMRadius[1] = 4; // radius along y
movingFilter ->SetRadius (indexMRadius);
fixedFilter ->SetInput (fixedImageReader ->GetOutput ());
movingFilter ->SetInput(movingImageReader ->GetOutput ());
Now we can plug the output of the smoothing filters at the input of the registration method.
registration ->SetFixedImage(fixedFilter ->GetOutput ());
registration ->SetMovingImage(movingFilter ->GetOutput ());
In this example, the input images are taken from readers. The code below updates the readers in
order to ensure that the image parameters (size, origin and spacing) are valid when used to initialize
the transform. We intend to use the center of the fixed image as the rotation center and then use the
vector between the fixed image center and the moving image center as the initial translation to be
applied after the rotation.
fixedImageReader ->Update ();
movingImageReader ->Update ();
The center of rotation is computed using the origin, size and spacing of the fixed image.
FixedImageType::Pointer fixedImage = fixedImageReader ->GetOutput ();
const SpacingType fixedSpacing = fixedImage ->GetSpacing ();
const OriginType fixedOrigin = fixedImage ->GetOrigin ();
const RegionType fixedRegion = fixedImage ->GetLargestPossibleRegion();
const SizeType fixedSize = fixedRegion .GetSize ();
TransformType::InputPointType centerFixed ;
centerFixed [0] = fixedOrigin [0] + fixedSpacing[0] * fixedSize [0] / 2.0;
centerFixed [1] = fixedOrigin [1] + fixedSpacing[1] * fixedSize [1] / 2.0;
The center of the moving image is computed in a similar way.
MovingImageType::Pointer movingImage = movingImageReader ->GetOutput ();
const SpacingType movingSpacing = movingImage ->GetSpacing ();
const OriginType movingOrigin = movingImage ->GetOrigin ();
const RegionType movingRegion = movingImage ->GetLargestPossibleRegion();
const SizeType movingSize = movingRegion.GetSize ();
TransformType::InputPointType centerMoving;
centerMoving[0] = movingOrigin[0] + movingSpacing[0] * movingSize [0] / 2.0;
centerMoving[1] = movingOrigin[1] + movingSpacing[1] * movingSize [1] / 2.0;
The most straightforward method of initializing the transform parameters is to configure the trans-
form and then get its parameters with the method GetParameters(). Here we initialize the trans-
form by passing the center of the fixed image as the rotation center with the SetCenter() method.
Then the translation is set as the vector relating the center of the moving image to the center of the
fixed image. This last vector is passed with the method SetTranslation().
transform ->SetCenter (centerFixed );
transform ->SetTranslation(centerMoving - centerFixed );
Let’s finally initialize the rotation with a zero angle.
transform ->SetAngle (0.0);
Now we pass the current transform’s parameters as the initial parameters to be used when the regis-
tration process starts.
registration ->SetInitialTransformParameters(transform ->GetParameters());
Keeping in mind that the scale of units in rotation and translation is quite different, we take advan-
tage of the scaling functionality provided by the optimizers. We know that the first element of the
parameters array corresponds to the angle, which is measured in radians, while the other parameters
correspond to translations that are measured in the units of the spacing (pixels in our case). For this
reason we use small factors in the scales associated with the translations and the coordinates of the
rotation center.
typedef OptimizerType::ScalesType OptimizerScalesType;
OptimizerScalesType optimizerScales(transform ->GetNumberOfParameters());
const double translationScale = 1.0 / 1000.0;
optimizerScales[0] = 1.0;
optimizerScales[1] = translationScale;
optimizerScales[2] = translationScale;
optimizerScales[3] = translationScale;
optimizerScales[4] = translationScale;
optimizer ->SetScales (optimizerScales);
Next we set the normal parameters of the optimization method. In this case we are using an
itk::RegularStepGradientDescentOptimizer. Below, we define the optimization parameters
like the relaxation factor, initial step length, minimal step length and number of iterations. These
last two act as stopping criteria for the optimization.
double initialStepLength = 0.1;
optimizer ->SetRelaxationFactor(0.6);
optimizer ->SetMaximumStepLength(initialStepLength);
optimizer ->SetMinimumStepLength(0.001);
optimizer ->SetNumberOfIterations(200);
Let’s execute this example over two of the images provided in Examples/Data:
• QB Suburb.png
• QB SuburbRotated10.png
The second image is the result of intentionally rotating the first image by 10 degrees around the
geometrical center of the image. Both images have unit-spacing and are shown in Figure 9.9. The
registration takes 21 iterations and produces the results:
[0.176168, 134.515, 103.011, -0.00182313, 0.0717891]
These results are interpreted as
• Angle = 0.176168 radians
• Center = (134.515,103.011) pixels
• Translation = (−0.00182313,0.0717891) pixels
Figure 9.9: Fixed and moving images are provided as input to the registration method using the Centered-
Rigid2D transform.
Figure 9.10: Resampled moving image (left). Differences between the fixed and moving images, before (cen-
ter) and after (right) registration using the CenteredRigid2D transform.
As expected, these values match the misalignment intentionally introduced into the moving image
quite well, since 10 degrees is about 0.174532 radians.
Figure 9.10 shows from left to right the resampled moving image after registration, the difference
between fixed and moving images before registration, and the difference between fixed and resam-
pled moving image after registration. It can be seen from the last difference image that the rotational
component has been solved but that a small centering misalignment persists.
Let’s now consider the case in which rotations and translations are present in the initial registration,
as in the following pair of images:
• QB_Suburb.png
• QB_SuburbR10X13Y17.png
The second image is the result of intentionally rotating the first image by 10 degrees and then trans-
lating it 13 pixels in X and 17 pixels in Y. Both images have unit-spacing and are shown in Figure
9.11. In order to accelerate convergence it is convenient to use a larger step length as shown here.
optimizer->SetMaximumStepLength( 1.0 );
The registration now takes 34 iterations and produces the following results:
[0.176125, 135.553, 102.159, -11.9102, -15.8045]
These parameters are interpreted as
• Angle = 0.176125 radians
• Center = (135.553,102.159) pixels
• Translation = (−11.9102,−15.8045) pixels
These values approximately match the initial misalignment intentionally introduced into the moving
image, since 10 degrees is about 0.174532 radians. The horizontal translation is well resolved while
the vertical translation ends up being off by about one pixel.
Figure 9.12 shows the output of the registration. The rightmost image of this figure shows the
difference between the fixed image and the resampled moving image after registration.
9.5.2 Centered Affine Transform
The source code for this example can be found in the file
Examples/Registration/ImageRegistration9.cxx.
Figure 9.11: Fixed and moving images provided as input to the registration method using the CenteredRigid2D
transform.
Figure 9.12: Resampled moving image (left). Differences between the fixed and moving images, before (cen-
ter) and after (right) registration with the CenteredRigid2D transform.
This example illustrates the use of the itk::AffineTransform for performing registration. The
example code is, for the most part, identical to previous ones. The main difference is the use of
the AffineTransform here instead of the itk::CenteredRigid2DTransform. We will focus on the
most relevant changes in the current code and skip the basic elements already explained in previous
examples.
Let’s start by including the header file of the AffineTransform.
#include "itkAffineTransform.h"
We then define the types of the images to be registered.
const unsigned int Dimension = 2;
typedef float PixelType;
typedef otb::Image<PixelType, Dimension> FixedImageType;
typedef otb::Image<PixelType, Dimension> MovingImageType;
The transform type is instantiated using the code below. The template parameters of this class are
the representation type of the space coordinates and the space dimension.
typedef itk::AffineTransform<double, Dimension> TransformType;
The transform object is constructed below and passed to the registration method.
TransformType::Pointer transform = TransformType::New();
registration->SetTransform(transform);
Since we are working with high resolution images and the expected shifts are larger than the resolution,
we will need to smooth the images in order to prevent the optimizer from getting stuck in local minima.
In order to do this, we will use a simple mean filter.
typedef itk::MeanImageFilter<FixedImageType, FixedImageType>   FixedFilterType;
typedef itk::MeanImageFilter<MovingImageType, MovingImageType> MovingFilterType;
FixedFilterType::Pointer  fixedFilter  = FixedFilterType::New();
MovingFilterType::Pointer movingFilter = MovingFilterType::New();
FixedImageType::SizeType indexFRadius;
indexFRadius[0] = 4; // radius along x
indexFRadius[1] = 4; // radius along y
fixedFilter->SetRadius(indexFRadius);
MovingImageType::SizeType indexMRadius;
indexMRadius[0] = 4; // radius along x
indexMRadius[1] = 4; // radius along y
movingFilter->SetRadius(indexMRadius);
fixedFilter->SetInput(fixedImageReader->GetOutput());
movingFilter->SetInput(movingImageReader->GetOutput());
Now we can plug the output of the smoothing filters at the input of the registration method.
registration->SetFixedImage(fixedFilter->GetOutput());
registration->SetMovingImage(movingFilter->GetOutput());
In this example, we use the itk::CenteredTransformInitializer helper class in order to com-
pute a reasonable value for the initial center of rotation and the translation. The initializer is set to
use the center of mass of each image as the initial correspondence correction.
typedef itk::CenteredTransformInitializer<TransformType,
                                          FixedImageType,
                                          MovingImageType> TransformInitializerType;
TransformInitializerType::Pointer initializer = TransformInitializerType::New();
initializer->SetTransform(transform);
initializer->SetFixedImage(fixedImageReader->GetOutput());
initializer->SetMovingImage(movingImageReader->GetOutput());
initializer->MomentsOn();
initializer->InitializeTransform();
Now we pass the parameters of the current transform as the initial parameters to be used when the
registration process starts.
registration->SetInitialTransformParameters(transform->GetParameters());
Keeping in mind that the scale of units in scaling, rotation and translation are quite different, we take
advantage of the scaling functionality provided by the optimizers. We know that the first N × N
elements of the parameters array correspond to the rotation matrix factor, and the last N are the
components of the translation to be applied after multiplication with the matrix is performed.
typedef OptimizerType::ScalesType OptimizerScalesType;
OptimizerScalesType optimizerScales(transform->GetNumberOfParameters());
optimizerScales[0] = 1.0;
optimizerScales[1] = 1.0;
optimizerScales[2] = 1.0;
optimizerScales[3] = 1.0;
optimizerScales[4] = translationScale;
optimizerScales[5] = translationScale;
optimizer->SetScales(optimizerScales);
We also set the usual parameters of the optimization method. In this case we are using an
itk::RegularStepGradientDescentOptimizer. Below, we define the optimization parameters
like initial step length, minimal step length and number of iterations. These last two act as stopping
criteria for the optimization.
optimizer->SetMaximumStepLength(steplength);
optimizer->SetMinimumStepLength(0.0001);
optimizer->SetNumberOfIterations(maxNumberOfIterations);
We also set the optimizer to do minimization by calling the MinimizeOn() method.
optimizer->MinimizeOn();
Finally we trigger the execution of the registration method by calling the Update() method. The
call is placed in a try/catch block in case any exceptions are thrown.
try
  {
  registration->Update();
  }
catch (itk::ExceptionObject & err)
  {
  std::cerr << "ExceptionObject caught!" << std::endl;
  std::cerr << err << std::endl;
  return -1;
  }
Once the optimization converges, we recover the parameters from the registration method. This
is done with the GetLastTransformParameters() method. We can also recover the final
value of the metric with the GetValue() method and the final number of iterations with the
GetCurrentIteration() method.
OptimizerType::ParametersType finalParameters =
  registration->GetLastTransformParameters();
const double finalRotationCenterX = transform->GetCenter()[0];
const double finalRotationCenterY = transform->GetCenter()[1];
const double finalTranslationX = finalParameters[4];
const double finalTranslationY = finalParameters[5];
const unsigned int numberOfIterations = optimizer->GetCurrentIteration();
const double bestValue = optimizer->GetValue();
Let’s execute this example over two of the images provided in Examples/Data:
• QB_Suburb.png
• QB_SuburbR10X13Y17.png
Figure 9.13: Fixed and moving images provided as input to the registration method using the AffineTransform.
The second image is the result of intentionally rotating the first image by 10 degrees and then trans-
lating by (13,17). Both images have unit-spacing and are shown in Figure 9.13. We execute the code
using the following parameters: step length=1.0, translation scale= 0.0001 and maximum number of
iterations = 300. With these images and parameters the registration takes 83 iterations and produces
the following results:
20.2134 [0.983291, -0.173507, 0.174626, 0.983028, -12.1899, -16.0882]
These results are interpreted as
• Iterations = 83
• Final Metric = 20.2134
• Center = (134.152,104.067) pixels
• Translation = (−12.1899,−16.0882) pixels
• Affine scales = (0.999024,0.997875)
The second component of the matrix values is usually associated with sinθ. We obtain the rota-
tion through SVD of the affine matrix. The value is 10.0401 degrees, which is approximately the
intentional misalignment of 10.0 degrees.
Figure 9.14 shows the output of the registration. The right most image of this figure shows the
squared magnitude difference between the fixed image and the resampled moving image.
Figure 9.14: The resampled moving image (left), and the difference between the fixed and moving images
before (center) and after (right) registration with the AffineTransform transform.
Figure 9.15: Geometric representation objects in ITK (Point, Vector and CovariantVector).
9.6 Transforms
In OTB, we use the Insight Toolkit itk::Transform objects to encapsulate the mapping of points
and vectors from an input space to an output space. If a transform is invertible, back transform
methods are also provided. Currently, ITK provides a variety of transforms from simple translation,
rotation and scaling to general affine and kernel transforms. Note that, while in this section we
discuss transforms in the context of registration, transforms are general and can be used for other
applications. Some of the most commonly used transforms will be discussed in detail later. Let’s
begin by introducing the objects used in ITK for representing basic spatial concepts.
9.6.1 Geometrical Representation
ITK implements a consistent geometric representation of the space. The characteristics of classes
involved in this representation are summarized in Table 9.1. In this regard, ITK takes full advantage
of the capabilities of Object Oriented programming and resists the temptation of using simple arrays
of float or double in order to represent geometrical objects. The use of basic arrays would have
blurred the important distinction between the different geometrical concepts and would have allowed
for the innumerable conceptual and programming errors that result from using a vector where a point
is needed or vice versa.

itk::Point: Position in space. In N-dimensional space it is represented by an array of N numbers associated with space coordinates.

itk::Vector: Relative position between two points. In N-dimensional space it is represented by an array of N numbers, each one associated with the distance along a coordinate axis. Vectors do not have a position in space. A vector is defined as the subtraction of two points.

itk::CovariantVector: Orthogonal direction to a (N − 1)-dimensional manifold in space. For example, in 3D it corresponds to the vector orthogonal to a surface. This is the appropriate class for representing Gradients of functions. Covariant vectors do not have a position in space. Covariant vectors should not be added to Points, nor to Vectors.

Table 9.1: Summary of objects representing geometrical concepts in ITK.
Additional uses of the itk::Point, itk::Vector and itk::CovariantVector classes have been
discussed in Chapter 5. Each one of these classes behaves differently under spatial transformations.
It is therefore quite important to keep their distinction clear. Figure 9.15 illustrates the differences
between these concepts.
Transform classes provide different methods for mapping each one of the basic space-
representation objects. Points, vectors and covariant vectors are transformed using the methods
TransformPoint(), TransformVector() and TransformCovariantVector() respectively.
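As a quick illustration, the following is a minimal sketch (not part of the example files; the scale
factors and input values are arbitrary) that maps the three kinds of objects with an itk::ScaleTransform
and shows that each one is handled differently:

#include "itkScaleTransform.h"
#include "itkPoint.h"
#include "itkVector.h"
#include "itkCovariantVector.h"
#include <iostream>

int main()
{
  typedef itk::ScaleTransform<double, 2> TransformType;
  TransformType::Pointer transform = TransformType::New();

  TransformType::ScaleType scale;
  scale[0] = 2.0; // scaling along X
  scale[1] = 3.0; // scaling along Y
  transform->SetScale(scale);

  itk::Point<double, 2>           point;           point.Fill(1.0);
  itk::Vector<double, 2>          vector;          vector.Fill(1.0);
  itk::CovariantVector<double, 2> covariantVector; covariantVector.Fill(1.0);

  // Points and Vectors are multiplied by the scale factors, while
  // CovariantVectors are divided by them (see Section 9.6.5).
  std::cout << transform->TransformPoint(point) << std::endl;
  std::cout << transform->TransformVector(vector) << std::endl;
  std::cout << transform->TransformCovariantVector(covariantVector) << std::endl;

  return 0;
}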
One of the classes that deserves further comment is the itk::Vector. This ITK class tends to
be misinterpreted as a container of elements instead of a geometrical object. This is a common
misconception originating from the fact that Computer Scientists and Software Engineers misuse the
term “Vector”. The actual word “Vector” is relatively young. It was coined by William Hamilton
in his book “Elements of Quaternions”, published posthumously in 1866 [56]. In the same text
Hamilton coined the terms “Scalar”, “Versor” and “Tensor”, although the modern term “Tensor”
is used in Calculus in a different sense from what Hamilton defined in his book at the time [41].
A “Vector” is, by definition, a mathematical object that embodies the concept of “direction in space”.
Strictly speaking, a Vector describes the relationship between two Points in space, and captures both
their relative distance and orientation.
Computer scientists and software engineers misused the term vector in order to represent the concept
of an “Indexed Set” [8]. Mechanical engineers and civil engineers, who deal with the real world
of physical objects, will not commit this mistake and will keep the word “Vector” attached to a
geometrical concept. Biologists, on the other hand, will associate “Vector” with a “vehicle” that allows
them to direct something in a particular direction, for example, a virus that allows them to insert
pieces of code into a DNA strand [90].
Textbooks in programming do not help to clarify those concepts and loosely use the term “Vector”
for the purpose of representing an “enumerated set of common elements”. STL follows this trend
and continues using the word “Vector” for what it was not supposed to be used [8, 3]. Linear algebra
separates the “Vector” from its notion of geometric reality and makes it an abstract set of numbers
with arithmetic operations associated.
For those of you who are looking for the “Vector” in the Software Engineering sense, please look
at the itk::Array and itk::FixedArray classes that actually provide such functionalities. Ad-
ditionally, the itk::VectorContainer and itk::MapContainer classes may be of interest too.
These container classes are intended for algorithms that need to insert and delete elements, and
that may have large numbers of elements.
The Insight Toolkit deals with real objects that inhabit the physical space. This is particularly true
in the context of the image registration framework. We chose to give the appropriate name to the
mathematical objects that describe geometrical relationships in N-Dimensional space. It is for this
reason that we explicitly make clear the distinction between Point, Vector and CovariantVector,
despite the fact that most people would be happy with a simple use of double[3] for the three
concepts and would then proceed to perform all sorts of conceptually flawed operations such as
• Adding two Points
• Dividing a Point by a Scalar
• Adding a Covariant Vector to a Point
• Adding a Covariant Vector to a Vector
In order to enforce the correct use of the Geometrical concepts in ITK we organized these classes
in a hierarchy that supports reuse of code and yet compartmentalizes the behavior of the individual
classes. The use of the itk::FixedArray as base class of the itk::Point, the itk::Vector and
the itk::CovariantVector was a design decision based on calling things by their correct name.
An itk::FixedArray is an enumerated collection with a fixed number of elements. You can
instantiate a fixed array of letters, or a fixed array of images, or a fixed array of transforms, or a
fixed array of geometrical shapes. Therefore, the FixedArray only implements the functionality that
is necessary to access those enumerated elements. No assumptions can be made at this point on
any other operations required by the elements of the FixedArray, except the fact of having a default
constructor.
The itk::Point is a type that represents the coordinates of a spatial location. Based on
geometrical concepts we defined the valid operations of the Point class. In particular we made
sure that no operator+() was defined between Points, and that no operator*( scalar ) nor
operator/( scalar ) were defined for Points.
In other words, you could do in ITK operations such as:
• Vector = Point - Point
• Point += Vector
• Point -= Vector
• Point = BarycentricCombination( Point, Point )
and you cannot (because you should not) do operations such as
• Point = Point * Scalar
• Point = Point + Point
• Point = Point / Scalar
The itk::Vector is, by Hamilton’s definition, the subtraction between two points. Therefore a
Vector must satisfy the following basic operations:
• Vector = Point - Point
• Point = Point + Vector
• Point = Point - Vector
• Vector = Vector + Vector
• Vector = Vector - Vector
An itk::Vector object is intended to be instantiated over elements that support mathematical
operations such as addition, subtraction and multiplication by scalars.
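A minimal sketch (not part of the example files; the coordinate values are arbitrary) of the operations
that ITK allows, and forbids, between Points and Vectors could look as follows:

#include "itkPoint.h"
#include "itkVector.h"

int main()
{
  itk::Point<double, 2> p1, p2;
  p1[0] = 1.0; p1[1] = 2.0;
  p2[0] = 4.0; p2[1] = 6.0;

  itk::Vector<double, 2> v = p2 - p1; // Vector = Point - Point
  p1 += v;                            // Point += Vector
  itk::Point<double, 2> p3 = p1 - v;  // Point = Point - Vector

  // The following lines would not compile, by design:
  // itk::Point<double, 2> bad1 = p1 + p2;  // Point + Point
  // itk::Point<double, 2> bad2 = p1 * 2.0; // Point * Scalar

  (void) p3;
  return 0;
}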
9.6.2 Transform General Properties
Each transform class typically has several methods for setting its parameters. For example,
itk::Euler2DTransform provides methods for specifying the offset, angle, and the entire rota-
tion matrix. However, for use in the registration framework, the parameters are represented by a
flat Array of doubles to facilitate communication with generic optimizers. In the case of the Eu-
ler2DTransform, the transform is also defined by three doubles: the first representing the angle,
and the last two the offset. The flat array of parameters is defined using SetParameters(). A
description of the parameters and their ordering is documented in the sections that follow.
In the context of registration, the transform parameters define the search space for optimizers. That
is, the goal of the optimization is to find the set of parameters defining a transform that results in
the best possible value of an image metric. The more parameters a transform has, the longer its
Behavior: Maps every point to itself, every vector to itself and every covariant vector to itself.
Number of Parameters: 0
Parameter Ordering: NA
Restrictions: Only defined when the input and output space has the same number of dimensions.

Table 9.2: Characteristics of the identity transform.
computational time will be when used in a registration method since the dimension of the search
space will be equal to the number of transform parameters.
Another requirement that the registration framework imposes on the transform classes is the com-
putation of their Jacobians. In general, metrics require the knowledge of the Jacobian in order to
compute Metric derivatives. The Jacobian is a matrix whose elements are the partial derivatives of
the output point with respect to the array of parameters that defines the transform:2
J = \begin{bmatrix}
      \dfrac{\partial x_1}{\partial p_1} & \dfrac{\partial x_1}{\partial p_2} & \cdots & \dfrac{\partial x_1}{\partial p_m} \\
      \dfrac{\partial x_2}{\partial p_1} & \dfrac{\partial x_2}{\partial p_2} & \cdots & \dfrac{\partial x_2}{\partial p_m} \\
      \vdots & \vdots & \ddots & \vdots \\
      \dfrac{\partial x_n}{\partial p_1} & \dfrac{\partial x_n}{\partial p_2} & \cdots & \dfrac{\partial x_n}{\partial p_m}
    \end{bmatrix}    (9.1)
where {pi} are the transform parameters and {xi} are the coordinates of the output point. Within
this framework, the Jacobian is represented by an itk::Array2D of doubles and is obtained from
the transform by method GetJacobian(). The Jacobian can be interpreted as a matrix that indicates
for a point in the input space how much its mapping on the output space will change as a response
to a small variation in one of the transform parameters. Note that the values of the Jacobian matrix
depend on the point in the input space, so the Jacobian can actually be denoted as J(X), where
X = {xi}. The use of transform Jacobians enables the efficient computation of metric derivatives.
When Jacobians are not available, metric derivatives have to be computed using finite differences,
at the price of 2M evaluations of the metric value, where M is the number of transform parameters.
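As an illustration, the following minimal sketch (using the ITK 3.x GetJacobian() interface available
in this version of OTB; the transform and point values are arbitrary) queries the Jacobian of an
AffineTransform at a given input point:

#include "itkAffineTransform.h"
#include <iostream>

int main()
{
  typedef itk::AffineTransform<double, 2> TransformType;
  TransformType::Pointer transform = TransformType::New();

  TransformType::InputPointType point;
  point[0] = 10.0;
  point[1] = 20.0;

  // J(X) is an itk::Array2D with one row per output coordinate and one
  // column per transform parameter, evaluated at the given input point.
  const TransformType::JacobianType & jacobian = transform->GetJacobian(point);

  std::cout << jacobian << std::endl;
  return 0;
}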
The following sections describe the main characteristics of the transform classes available in ITK.
9.6.3 Identity Transform
The identity transform itk::IdentityTransform is mainly used for debugging purposes. It is
provided to methods that require a transform and in cases where we want to have the certainty
that the transform will have no effect whatsoever on the outcome of the process. It is just a NULL
operation. The main characteristics of the identity transform are summarized in Table 9.2.

2 Note that the term Jacobian is also commonly used for the matrix representing the derivatives of output point coordinates
with respect to input point coordinates. Sometimes the term is loosely used to refer to the determinant of such a matrix. [41]

Behavior: Represents a simple translation of points in the input space and has no effect on vectors or covariant vectors.
Number of Parameters: Same as the input space dimension.
Parameter Ordering: The i-th parameter represents the translation in the i-th dimension.
Restrictions: Only defined when the input and output space has the same number of dimensions.

Table 9.3: Characteristics of the TranslationTransform class.
9.6.4 Translation Transform
The itk::TranslationTransform is probably the simplest yet one of the most useful transforma-
tions. It maps all Points by adding a Vector to them. Vectors and covariant vectors remain unchanged
under this transformation since they are not associated with a particular position in space. Trans-
lation is the best transform to use when starting a registration method. Before attempting to solve
for rotations or scaling it is important to overlap the anatomical objects in both images as much as
possible. This is done by resolving the translational misalignment between the images. Translations
also have the advantage of being fast to compute and having parameters that are easy to interpret.
The main characteristics of the translation transform are presented in Table 9.3.
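A minimal sketch of the transform used on its own (not part of the example files; the offset values
are arbitrary) could look as follows:

#include "itkTranslationTransform.h"
#include <iostream>

int main()
{
  typedef itk::TranslationTransform<double, 2> TransformType;
  TransformType::Pointer transform = TransformType::New();

  TransformType::OutputVectorType offset;
  offset[0] = 13.0; // translation along X
  offset[1] = 17.0; // translation along Y
  transform->SetOffset(offset);

  TransformType::InputPointType point;
  point[0] = 0.0;
  point[1] = 0.0;

  std::cout << transform->TransformPoint(point) << std::endl; // prints [13, 17]
  return 0;
}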
9.6.5 Scale Transform
The itk::ScaleTransform represents a simple scaling of the vector space. Different scaling
factors can be applied along each dimension. Points are transformed by multiplying each one of their
coordinates by the corresponding scale factor for the dimension. Vectors are transformed in the same
way as points. Covariant vectors, on the other hand, are transformed differently since anisotropic
scaling does not preserve angles. Covariant vectors are transformed by dividing their components
by the scale factor of the corresponding dimension. In this way, if a covariant vector was orthogonal
to a vector, this orthogonality will be preserved after the transformation. The following equations
summarize the effect of the transform on the basic geometric objects.
Point:            P' = T(P) : \; P'_i = P_i \cdot S_i
Vector:           V' = T(V) : \; V'_i = V_i \cdot S_i
CovariantVector:  C' = T(C) : \; C'_i = C_i / S_i    (9.2)

where P_i, V_i and C_i are the i-th components of the point, vector and covariant vector, while S_i is the
scaling factor along the i-th dimension. The following equation illustrates the effect of the scaling
transform on a 3D point.

\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} =
\begin{bmatrix} S_1 & 0 & 0 \\ 0 & S_2 & 0 \\ 0 & 0 & S_3 \end{bmatrix} \cdot
\begin{bmatrix} x \\ y \\ z \end{bmatrix}    (9.3)

Behavior: Points are transformed by multiplying each one of their coordinates by the corresponding scale factor for the dimension. Vectors are transformed as points. Covariant vectors are transformed by dividing their components by the scale factor in the corresponding dimension.
Number of Parameters: Same as the input space dimension.
Parameter Ordering: The i-th parameter represents the scaling in the i-th dimension.
Restrictions: Only defined when the input and output space has the same number of dimensions.

Table 9.4: Characteristics of the ScaleTransform class.
Scaling appears to be a simple transformation but there are actually a number of issues to keep
in mind when using different scale factors along every dimension. There are subtle effects—for
example, when computing image derivatives. Since derivatives are represented by covariant vectors,
their values are not intuitively modified by scaling transforms.
One of the difficulties with managing scaling transforms in a registration process is that typical opti-
mizers manage the parameter space as a vector space where addition is the basic operation. Scaling
is better treated in the frame of a logarithmic space where additions result in regular multiplicative
increments of the scale. Gradient descent optimizers have trouble updating step length, since the
effect of an additive increment on a scale factor diminishes as the factor grows. In other words, a
scale factor variation of (1.0 + ε) is quite different from a scale variation of (5.0 + ε).
Registrations involving scale transforms require careful monitoring of the optimizer parameters in
order to keep it progressing at a stable pace. Note that some of the transforms discussed in following
sections, for example, the AffineTransform, have hidden scaling parameters and are therefore subject
to the same vulnerabilities of the ScaleTransform.

In cases involving misalignments with simultaneous translation, rotation and scaling components it
may be desirable to solve for these components independently. The main characteristics of the scale
transform are presented in Table 9.4.

Behavior: Points are transformed by multiplying each one of their coordinates by the corresponding scale factor for the dimension. Vectors are transformed as points. Covariant vectors are transformed by dividing their components by the scale factor in the corresponding dimension.
Number of Parameters: Same as the input space dimension.
Parameter Ordering: The i-th parameter represents the scaling in the i-th dimension.
Restrictions: Only defined when the input and output space has the same number of dimensions. The difference between this transform and the ScaleTransform is that here the scaling factors are passed as logarithms; in this way their behavior is closer to the one of a Vector space.

Table 9.5: Characteristics of the ScaleLogarithmicTransform class.
9.6.6 Scale Logarithmic Transform
The itk::ScaleLogarithmicTransform is a simple variation of the itk::ScaleTransform. It
is intended to improve the behavior of the scaling parameters when they are modified by optimiz-
ers. The difference between this transform and the ScaleTransform is that the parameter factors
are passed here as logarithms. In this way, multiplicative variations in the scale become additive
variations in the logarithm of the scaling factors.
9.6.7 Euler2DTransform
itk::Euler2DTransform implements a rigid transformation in 2D. It is composed of a plane
rotation and a two-dimensional translation. The rotation is applied first, followed by the translation.
The following equation illustrates the effect of this transform on a 2D point,
\begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \cdot
\begin{bmatrix} x \\ y \end{bmatrix} +
\begin{bmatrix} T_x \\ T_y \end{bmatrix}    (9.4)

where θ is the rotation angle and (Tx, Ty) are the components of the translation.

Behavior: Represents a 2D rotation and a 2D translation. Note that the translation component has no effect on the transformation of vectors and covariant vectors.
Number of Parameters: 3
Parameter Ordering: The first parameter is the angle in radians and the last two parameters are the translation in each dimension.
Restrictions: Only defined for two-dimensional input and output spaces.

Table 9.6: Characteristics of the Euler2DTransform class.
A challenging aspect of this transformation is the fact that translations and rotations do not form
a vector space and cannot be managed as linear independent parameters. Typical optimizers make
the loose assumption that parameters exist in a vector space and rely on the step length to be small
enough for this assumption to hold approximately.
In addition to the non-linearity of the parameter space, the most common difficulty found when using
this transform is the difference in units used for rotations and translations. Rotations are measured
in radians; hence, their values are in the range [−π,π]. Translations are measured in millimeters and
their actual values vary depending on the image modality being considered. In practice, translations
have values on the order of 10 to 100. This scale difference between the rotation and translation
parameters is undesirable for gradient descent optimizers because they deviate from the trajectories
of descent and make optimization slower and more unstable. In order to compensate for these
differences, ITK optimizers accept an array of scale values that are used to normalize the parameter
space.
Registrations involving angles and translations should take advantage of the scale normalization
functionality in order to obtain the best performance out of the optimizers. The main characteristics
of the Euler2DTransform class are presented in Table 9.6.
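As an illustration, a minimal sketch configuring an Euler2DTransform directly (not part of the
example files; the angle and translation values are arbitrary) is shown below:

#include "itkEuler2DTransform.h"
#include <iostream>

int main()
{
  typedef itk::Euler2DTransform<double> TransformType;
  TransformType::Pointer transform = TransformType::New();

  transform->SetAngle(0.1745); // approximately 10 degrees, in radians

  TransformType::OutputVectorType translation;
  translation[0] = 13.0;
  translation[1] = 17.0;
  transform->SetTranslation(translation);

  // The flat parameter array used by the registration framework: [angle, Tx, Ty]
  std::cout << transform->GetParameters() << std::endl;
  return 0;
}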
9.6.8 CenteredRigid2DTransform
itk::CenteredRigid2DTransform implements a rigid transformation in 2D. The main difference
between this transform and the itk::Euler2DTransform is that here we can specify an arbitrary
center of rotation, while the Euler2DTransform always uses the origin of the coordinate system as
the center of rotation. This distinction is quite important in image registration since ITK images
usually have their origin in the corner of the image rather than the middle. Rotational mis-registrations
usually exist, however, as rotations around the center of the image, or at least as rotations around a
point in the middle of the anatomical structure captured by the image. Using gradient descent
optimizers, it is almost impossible to solve non-origin rotations using a transform with origin rotations
since the deep basin of the real solution is usually located across a high ridge in the topography of
the cost function.

Behavior: Represents a 2D rotation around a user-provided center followed by a 2D translation.
Number of Parameters: 5
Parameter Ordering: The first parameter is the angle in radians. Second and third are the center of rotation coordinates and the last two parameters are the translation in each dimension.
Restrictions: Only defined for two-dimensional input and output spaces.

Table 9.7: Characteristics of the CenteredRigid2DTransform class.
In practice, the user must supply the center of rotation in the input space, the angle of rotation
and a translation to be applied after the rotation. With these parameters, the transform initializes a
rotation matrix and a translation vector that together perform the equivalent of translating the center
of rotation to the origin of coordinates, rotating by the specified angle, translating back to the center
of rotation and finally translating by the user-specified vector.
As with the Euler2DTransform, this transform suffers from the difference in units used for rotations
and translations. Rotations are measured in radians; hence, their values are in the range [−π,π].
The center of rotation and the translations are measured in millimeters, and their actual values vary
depending on the image modality being considered. Registrations involving angles and translations
should take advantage of the scale normalization functionality of the optimizers in order to get the
best performance out of them.
The following equation illustrates the effect of the transform on an input point (x,y) that maps to the
output point (x′,y′),
\begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \cdot
\begin{bmatrix} x - C_x \\ y - C_y \end{bmatrix} +
\begin{bmatrix} T_x + C_x \\ T_y + C_y \end{bmatrix}    (9.5)

where θ is the rotation angle, (Cx, Cy) are the coordinates of the rotation center and (Tx, Ty) are the
components of the translation. Note that the center coordinates are subtracted before the rotation and
added back after the rotation. The main features of the CenteredRigid2DTransform are presented in
Table 9.7.

Behavior: Represents a 2D rotation, homogeneous scaling and a 2D translation. Note that the translation component has no effect on the transformation of vectors and covariant vectors.
Number of Parameters: 4
Parameter Ordering: The first parameter is the scaling factor for all dimensions, the second is the angle in radians, and the last two parameters are the translations in (x, y) respectively.
Restrictions: Only defined for two-dimensional input and output spaces.

Table 9.8: Characteristics of the Similarity2DTransform class.
9.6.9 Similarity2DTransform
The itk::Similarity2DTransform can be seen as a rigid transform combined with an isotropic
scaling factor. This transform preserves angles between lines. In its 2D implementation, the four
parameters of this transformation combine the characteristics of the itk::ScaleTransform and
itk::Euler2DTransform. In particular, those relating to the non-linearity of the parameter space
and the non-uniformity of the measurement units. Gradient descent optimizers should be used with
caution on such parameter spaces since the notions of gradient direction and step length are ill-
defined.
The following equation illustrates the effect of the transform on an input point (x,y) that maps to the
output point (x′,y′),
\begin{bmatrix} x' \\ y' \end{bmatrix} =
\begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \cdot
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \cdot
\begin{bmatrix} x - C_x \\ y - C_y \end{bmatrix} +
\begin{bmatrix} T_x + C_x \\ T_y + C_y \end{bmatrix}    (9.6)
where λ is the scale factor, θ is the rotation angle, (Cx,Cy) are the coordinates of the rotation center
and (Tx,Ty) are the components of the translation. Note that the center coordinates are subtracted
before the rotation and scaling, and they are added back afterwards. The main features of the Simi-
larity2DTransform are presented in Table 9.8.
Behavior: Represents a 3D rotation and a 3D translation. The rotation is specified as a quaternion, defined by a set of four numbers q. The relationship between quaternion and rotation about vector n by angle θ is as follows: q = (n sin(θ/2), cos(θ/2)). Note that if the quaternion is not of unit length, scaling will also result.
Number of Parameters: 7
Parameter Ordering: The first four parameters define the quaternion and the last three parameters the translation in each dimension.
Restrictions: Only defined for three-dimensional input and output spaces.

Table 9.9: Characteristics of the QuaternionRigidTransform class.

A possible approach for controlling optimization in the parameter space of this transform is to
dynamically modify the array of scales passed to the optimizer. The effect produced by the parameter
scaling can be used to steer the walk in the parameter space (by giving preference to some of the
parameters over others). For example, perform some iterations updating only the rotation angle, then
balance the array of scale factors in the optimizer and perform another set of iterations updating only
the translations.
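A minimal sketch of this strategy (the scale values and the two-stage schedule are only illustrative,
and the optimizer is assumed to be plugged into a registration method elsewhere) is given below:

#include "itkRegularStepGradientDescentOptimizer.h"

int main()
{
  typedef itk::RegularStepGradientDescentOptimizer OptimizerType;
  OptimizerType::Pointer optimizer = OptimizerType::New();

  // Similarity2D parameter order: [ scale, angle, Tx, Ty ]
  OptimizerType::ScalesType scales(4);

  // First stage: a very large scale value makes a parameter nearly frozen,
  // so only the rotation angle is effectively updated.
  scales.Fill(1.0e6);
  scales[1] = 1.0;
  optimizer->SetScales(scales);
  // ... run a first set of registration iterations ...

  // Second stage: only the translations are free to move.
  scales.Fill(1.0e6);
  scales[2] = 1.0;
  scales[3] = 1.0;
  optimizer->SetScales(scales);
  // ... run another set of iterations ...

  return 0;
}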
9.6.10 QuaternionRigidTransform
The itk::QuaternionRigidTransform class implements a rigid transformation in 3D space. The
rotational part of the transform is represented using a quaternion while the translation is represented
with a vector. Quaternion components do not form a vector space and hence raise the same concerns
as the itk::Similarity2DTransform when used with gradient descent optimizers.
The itk::QuaternionRigidTransformGradientDescentOptimizer was introduced into the
toolkit to address these concerns. This specialized optimizer implements a variation of a gradi-
ent descent algorithm adapted for a quaternion space. This class ensures that after advancing in any
direction on the parameter space, the resulting set of transform parameters is mapped back into the
permissible set of parameters. In practice, this comes down to normalizing the newly-computed
quaternion to make sure that the transformation remains rigid and no scaling is applied. The main
characteristics of the QuaternionRigidTransform are presented in Table 9.9.
The Quaternion rigid transform also accepts a user-defined center of rotation. In this way, the trans-
form can easily be used for registering images where the rotation is mostly relative to the center
of the image instead of one of the corners. The coordinates of this rotation center are not subject to
optimization. They only participate in the computation of the mappings for Points and in the
computation of the Jacobian. The transformations for Vectors and CovariantVectors are not affected
by the selection of the rotation center.

Behavior: Represents a 3D rotation. The rotation is specified by a versor or unit quaternion. The rotation is performed around a user-specified center of rotation.
Number of Parameters: 3
Parameter Ordering: The three parameters define the versor.
Restrictions: Only defined for three-dimensional input and output spaces.

Table 9.10: Characteristics of the VersorTransform class.
9.6.11 VersorTransform
By definition, a Versor is the rotational part of a Quaternion. It can also be defined as a unit-
quaternion [56, 75]. Versors only have three independent components, since they are restricted to
reside in the space of unit-quaternions. The implementation of versors in the toolkit uses a set of
three numbers. These three numbers correspond to the first three components of a quaternion. The
fourth component of the quaternion is computed internally such that the quaternion is of unit length.
The main characteristics of the itk::VersorTransform are presented in Table 9.10.
This transform exclusively represents rotations in 3D. It is intended to rapidly solve the rotational
component of a more general misalignment. The efficiency of this transform comes from using a
parameter space of reduced dimensionality. Versors are the best possible representation for rotations
in 3D space. Sequences of versors allow the creation of smooth rotational trajectories; for this
reason, they behave stably under optimization methods.
The space formed by versor parameters is not a vector space. Standard gradient descent algorithms
are not appropriate for exploring this parameter space. An optimizer specialized for the versor space
is available in the toolkit under the name of itk::VersorTransformOptimizer. This optimizer
implements versor derivatives as originally defined by Hamilton [56].
The center of rotation can be specified by the user with the SetCenter() method. The center is not
part of the parameters to be optimized, therefore it remains the same during an optimization process.
Its value is used during the computations for transforming Points and when computing the Jacobian.
Behavior: Represents a 3D rotation and a 3D translation. The rotation is specified by a versor or unit quaternion, while the translation is represented by a vector. Users can specify the coordinates of the center of rotation.
Number of Parameters: 6
Parameter Ordering: The first three parameters define the versor and the last three parameters the translation in each dimension.
Restrictions: Only defined for three-dimensional input and output spaces.

Table 9.11: Characteristics of the VersorRigid3DTransform class.
9.6.12 VersorRigid3DTransform
The itk::VersorRigid3DTransform implements a rigid transformation in 3D space. It is a vari-
ant of the itk::QuaternionRigidTransform and the itk::VersorTransform. It can be seen as
an itk::VersorTransform plus a translation defined by a vector. The advantage of this class with
respect to the QuaternionRigidTransform is that it exposes only six parameters, three for the versor
components and three for the translational components. This reduces the search space for the op-
timizer to six dimensions instead of the seven used by the QuaternionRigidTransform.
This transform also allows the users to set a specific center of rotation. The center coordinates are
not modified during the optimization performed in a registration process. The main features of this
transform are summarized in Table 9.11. This transform is probably the best option to use when
dealing with rigid transformations in 3D.
Given that the space of Versors is not a Vector space, typical gradient descent opti-
mizers are not well suited for exploring the parametric space of this transform. The
itk::VersorRigid3DTransformOptimizer has been introduced in the ITK toolkit with the pur-
pose of providing an optimizer that is aware of the Versor space properties on the rotational part of
this transform, as well as the Vector space properties on the translational part of the transform.
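As an illustration, the following minimal sketch (not part of the example files; the rotation axis,
angle and translation are arbitrary) initializes a VersorRigid3DTransform:

#include "itkVersorRigid3DTransform.h"
#include <iostream>

int main()
{
  typedef itk::VersorRigid3DTransform<double> TransformType;
  TransformType::Pointer transform = TransformType::New();

  TransformType::VersorType             rotation;
  TransformType::VersorType::VectorType axis;
  axis[0] = 0.0; axis[1] = 0.0; axis[2] = 1.0; // rotation about the Z axis
  rotation.Set(axis, 0.1745);                  // approximately 10 degrees
  transform->SetRotation(rotation);

  TransformType::OutputVectorType translation;
  translation[0] = 13.0; translation[1] = 17.0; translation[2] = 0.0;
  transform->SetTranslation(translation);

  // Six parameters: the three versor components followed by the translation.
  std::cout << transform->GetParameters() << std::endl;
  return 0;
}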
9.6.13 Euler3DTransform
The itk::Euler3DTransform implements a rigid transformation in 3D space. It can be seen as
a rotation followed by a translation. This class exposes six parameters, three for the Euler angles
that represent the rotation and three for the translational components. This transform also allows the
users to set a specific center of rotation. The center coordinates are not modified during the opti-
mization performed in a registration process. The main features of this transform are summarized in
Table 9.12.

Behavior: Represents a rigid rotation in 3D space. That is, a rotation followed by a 3D translation. The rotation is specified by three angles representing rotations to be applied around the X, Y and Z axis one after another. The translation part is represented by a Vector. Users can also specify the coordinates of the center of rotation.
Number of Parameters: 6
Parameter Ordering: The first three parameters are the rotation angles around the X, Y and Z axis, and the last three parameters are the translations along each dimension.
Restrictions: Only defined for three-dimensional input and output spaces.

Table 9.12: Characteristics of the Euler3DTransform class.
The fact that the three rotational parameters are non-linear and do not behave like Vector spaces must
be taken into account when selecting an optimizer to work with this transform and when fine tuning
the parameters of such optimizer. It is strongly recommended to use this transform by introducing
very small variations on the rotational components. A small rotation will be in the range of 1 degree,
which in radians is approximately 0.01745.
You should not expect this transform to be able to compensate for large rotations just by being driven
with the optimizer. In practice you must provide a reasonable initialization of the transform angles
and only need to correct for residual rotations in the order of 10 or 20 degrees.
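A minimal sketch of the kind of small-angle initialization recommended above (not part of the
example files; the values are arbitrary) is given below:

#include "itkEuler3DTransform.h"
#include <iostream>

int main()
{
  typedef itk::Euler3DTransform<double> TransformType;
  TransformType::Pointer transform = TransformType::New();

  // Small rotation angles, on the order of one degree (~0.01745 radians).
  transform->SetRotation(0.017, 0.0, 0.0);

  TransformType::OutputVectorType translation;
  translation[0] = 5.0; translation[1] = 0.0; translation[2] = 0.0;
  transform->SetTranslation(translation);

  // Six parameters: the three rotation angles followed by the translation.
  std::cout << transform->GetParameters() << std::endl;
  return 0;
}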
9.6.14 Similarity3DTransform
The itk::Similarity3DTransform implements a similarity transformation in 3D space. It can
be seen as a homogeneous scaling followed by an itk::VersorRigid3DTransform. This class
exposes seven parameters, one for the scaling factor, three for the versor components and three for
the translational components. This transform also allows the users to set a specific center of rotation.
The center coordinates are not modified during the optimization performed in a registration process.
Both the rotation and scaling operations are performed with respect to the center of rotation. The
main features of this transform are summarized in Table 9.13.
Behavior: Represents a 3D rotation, a 3D translation and homogeneous scaling. The scaling factor is specified by a scalar, the rotation is specified by a versor, and the translation is represented by a vector. Users can also specify the coordinates of the center of rotation, that is the same center used for scaling.
Number of Parameters: 7
Parameter Ordering: The first parameter is the scaling factor, the next three parameters define the versor and the last three parameters the translation in each dimension.
Restrictions: Only defined for three-dimensional input and output spaces.

Table 9.13: Characteristics of the Similarity3DTransform class.
The fact that the scaling and rotational spaces are non-linear and do not behave like Vector spaces
must be taken into account when selecting an optimizer to work with this transform and when fine
tuning the parameters of such optimizer.
9.6.15 Rigid3DPerspectiveTransform
The itk::Rigid3DPerspectiveTransform implements a rigid transformation in 3D space fol-
lowed by a perspective projection. This transform is intended to be used in 3D/2D registration
problems where a 3D object is projected onto a 2D plane. This is the case of Fluoroscopic images
used for image guided intervention, and it is also the case for classical radiography. Users must
provide a value for the focal distance to be used during the computation of the perspective trans-
form. This transform also allows users to set a specific center of rotation. The center coordinates
are not modified during the optimization performed in a registration process. The main features of
this transform are summarized in Table 9.14. This transform is also used when creating Digitally
Reconstructed Radiographs (DRRs).
The strategies for optimizing the parameters of this transform are the same ones used for optimiz-
ing the VersorRigid3DTransform. In particular, you can use the same VersorRigid3DTransform-
Optimizer in order to optimize the parameters of this class.
Behavior: Represents a rigid 3D transformation followed by a perspective projection. The rotation is specified by a Versor, while the translation is represented by a Vector. Users can specify the coordinates of the center of rotation. They must also specify a focal distance to be used for the perspective projection. The rotation center and the focal distance parameters are not modified during the optimization process.
Number of Parameters: 6
Parameter Ordering: The first three parameters define the Versor and the last three parameters the translation in each dimension.
Restrictions: Only defined for three-dimensional input and two-dimensional output spaces. This is one of the few transforms where the input space has a different dimension from the output space.

Table 9.14: Characteristics of the Rigid3DPerspectiveTransform class.
Behavior: Represents an affine transform composed of rotation, scaling, shearing and translation. The transform is specified by a N × N matrix and a N × 1 vector where N is the space dimension.
Number of Parameters: (N + 1) × N
Parameter Ordering: The first N × N parameters define the matrix in column-major order (where the column index varies the fastest). The last N parameters define the translations for each dimension.
Restrictions: Only defined when the input and output space have the same dimension.

Table 9.15: Characteristics of the AffineTransform class.
9.6.16 AffineTransform
The itk::AffineTransform is one of the most popular transformations used for image registra-
tion. Its main advantage comes from the fact that it is represented as a linear transformation. The
main features of this transform are presented in Table 9.15.
The set of AffineTransform coefficients can actually be represented in a vector space of dimension
(N + 1) × N. This makes it possible for optimizers to be used appropriately on this search space.
However, the high dimensionality of the search space also implies a high computational complexity
of cost-function derivatives. The best compromise in the reduction of this computational time is to
use the transform’s Jacobian in combination with the image gradient for computing the cost-function
derivatives.
The coefficients of the N ×N matrix can represent rotations, anisotropic scaling and shearing. These
coefficients are usually of a very different dynamic range compared to the translation coefficients.
Coefficients in the matrix tend to be in the range [−1 : 1], but are not restricted to this interval.
Translation coefficients, on the other hand, can be on the order of 10 to 100, and are basically
related to the image size and pixel spacing.
This difference in scale makes it necessary to take advantage of the functionality offered by the
optimizers for rescaling the parameter space. This is particularly relevant for optimizers based on
gradient descent approaches. This transform lets the user set an arbitrary center of rotation. The
coordinates of the rotation center are not part of the parameters array passed to the optimizer.
Equation 9.7 illustrates the effect of applying the AffineTransform to a point in 3D space.
\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} =
\begin{bmatrix} M_{00} & M_{01} & M_{02} \\ M_{10} & M_{11} & M_{12} \\ M_{20} & M_{21} & M_{22} \end{bmatrix} \cdot
\begin{bmatrix} x - C_x \\ y - C_y \\ z - C_z \end{bmatrix} +
\begin{bmatrix} T_x + C_x \\ T_y + C_y \\ T_z + C_z \end{bmatrix}    (9.7)

Behavior: Represents a free-form deformation by providing a deformation field from the interpolation of deformations in a coarse grid.
Number of Parameters: M × N, where M is the number of nodes in the BSpline grid and N is the dimension of the space.
Restrictions: Only defined when the input and output space have the same dimension. This transform has the advantage of allowing the computation of deformable registrations. It also has the disadvantage of having a very high dimensional parametric space, therefore requiring long computation times.

Table 9.16: Characteristics of the BSplineDeformableTransform class.
A registration based on the affine transform may be more effective when applied after simpler trans-
formations have been used to remove the major components of misalignment. Otherwise it will
incur an overwhelming computational cost. For example, using an affine transform, the first set of
optimization iterations would typically focus on removing large translations. This task could instead
be accomplished by a translation transform in a parameter space of size N instead of the (N +1)×N
associated with the affine transform.
Tracking the evolution of a registration process that uses AffineTransforms can be challenging, since
it is difficult to represent the coefficients in a meaningful way. A simple printout of the transform
coefficients generally does not offer a clear picture of the current behavior and trend of the optimiza-
tion. A better implementation uses the affine transform to deform a wire-frame cube, which is then
shown in a 3D visualization display.
9.6.17 BSplineDeformableTransform
The itk::BSplineDeformableTransform is designed to be used for solving deformable registra-
tion problems. This transform is equivalent to generating a deformation field where a deformation
vector is assigned to every point in space. The deformation vectors are computed using BSpline
interpolation from the deformation values of points located in a coarse grid, which is usually referred
to as the BSpline grid.
The BSplineDeformableTransform is not flexible enough to account for large rotations, shearing,
or scaling differences. In order to compensate for this limitation, it provides the functionality of
being composed with an arbitrary transform. This transform is known as the Bulk transform and it
is applied to points before they are mapped with the displacement field.
This transform does not provide functionality for mapping Vectors or CovariantVectors; only Points
can be mapped. The reason is that the variations of a vector under a deformable transform actually
depend on the location of the vector in space. In other words, Vectors only make sense as the relative
position between two points.
The BSplineDeformableTransform has a very large number of parameters and therefore is well
suited for the itk::LBFGSOptimizer and itk::LBFGSBOptimizer. The use of this transform
was proposed in the following papers [123, 95, 96].
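As an illustration, the following minimal sketch (not part of the example files; the grid geometry
values are arbitrary) sets up a 2D BSplineDeformableTransform on a small coarse grid:

#include "itkBSplineDeformableTransform.h"
#include <iostream>

int main()
{
  const unsigned int Dimension = 2;
  const unsigned int SplineOrder = 3;
  typedef itk::BSplineDeformableTransform<double, Dimension, SplineOrder> TransformType;
  TransformType::Pointer transform = TransformType::New();

  // Geometry of the coarse grid supporting the BSpline coefficients.
  TransformType::RegionType           gridRegion;
  TransformType::RegionType::SizeType gridSize;
  gridSize.Fill(8); // 8 x 8 nodes
  gridRegion.SetSize(gridSize);

  TransformType::SpacingType gridSpacing;
  gridSpacing.Fill(32.0); // node spacing in physical units
  TransformType::OriginType gridOrigin;
  gridOrigin.Fill(0.0);

  transform->SetGridRegion(gridRegion);
  transform->SetGridSpacing(gridSpacing);
  transform->SetGridOrigin(gridOrigin);

  // One coefficient per grid node and per dimension (M x N parameters).
  TransformType::ParametersType parameters(transform->GetNumberOfParameters());
  parameters.Fill(0.0); // all zeros: identity deformation
  transform->SetParameters(parameters);

  std::cout << transform->GetNumberOfParameters() << " parameters" << std::endl;
  return 0;
}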
9.6.18 KernelTransforms
Kernel Transforms are a set of Transforms that are also suitable for performing deformable registra-
tion. These transforms compute on the fly the displacements corresponding to a deformation field.
The displacement values corresponding to every point in space are computed by interpolation from
the vectors defined by a set of Source Landmarks and a set of Target Landmarks.
Several variations of these transforms are available in the toolkit. They differ in the type of interpo-
lation kernel that is used when computing the deformation at a particular point of space. Note that
these transforms are computationally expensive and that their numerical complexity is proportional
to the number of landmarks and the space dimension.
The following is the list of Transforms based on the KernelTransform.
• itk::ElasticBodySplineKernelTransform
• itk::ElasticBodyReciprocalSplineKernelTransform
• itk::ThinPlateSplineKernelTransform
• itk::ThinPlateR2LogRSplineKernelTransform
• itk::VolumeSplineKernelTransform
Details about the mathematical background of these transforms can be found in the paper by Davis
et al. [34] and the papers by Rohr et al. [120, 121].
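As an illustration, a minimal sketch of a thin plate spline kernel transform driven by a few landmark
pairs (not part of the example files; the landmark coordinates are arbitrary) could look as follows:

#include "itkThinPlateSplineKernelTransform.h"
#include <iostream>

int main()
{
  typedef itk::ThinPlateSplineKernelTransform<double, 2> TransformType;
  typedef TransformType::PointSetType PointSetType;

  PointSetType::Pointer sourceLandmarks = PointSetType::New();
  PointSetType::Pointer targetLandmarks = PointSetType::New();

  // Four landmarks: a square mapped onto a slightly shifted square.
  double source[4][2] = { {0, 0}, {100, 0}, {100, 100}, {0, 100} };
  double target[4][2] = { {5, 3}, {105, 3}, {105, 103}, {5, 103} };

  for (unsigned int i = 0; i < 4; ++i)
  {
    TransformType::InputPointType p;
    p[0] = source[i][0];  p[1] = source[i][1];
    sourceLandmarks->SetPoint(i, p);

    TransformType::OutputPointType q;
    q[0] = target[i][0];  q[1] = target[i][1];
    targetLandmarks->SetPoint(i, q);
  }

  TransformType::Pointer transform = TransformType::New();
  transform->SetSourceLandmarks(sourceLandmarks);
  transform->SetTargetLandmarks(targetLandmarks);
  transform->ComputeWMatrix(); // solves for the spline coefficients

  TransformType::InputPointType p;
  p[0] = 50.0;  p[1] = 50.0;
  std::cout << transform->TransformPoint(p) << std::endl; // about [55, 53]
  return 0;
}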
9.7 Metrics
In OTB, itk::ImageToImageMetric objects quantitatively measure how well the transformed
moving image fits the fixed image by comparing the gray-scale intensity of the images. These
metrics are very flexible and can work with any transform or interpolation method and do not require
reduction of the gray-scale images to sparse extracted information such as edges.
The metric component is perhaps the most critical element of the registration framework. The
selection of which metric to use is highly dependent on the registration problem to be solved. For
example, some metrics have a large capture range while others require initialization close to the
optimal position. In addition, some metrics are only suitable for comparing images obtained from
the same type of sensor, while others can handle multi-sensor comparisons. Unfortunately, there are
no clear-cut rules as to how to choose a metric.
The basic inputs to a metric are: the fixed and moving images, a transform and an interpolator. The
method GetValue() can be used to evaluate the quantitative criterion at the transform parameters
specified in the argument. Typically, the metric samples points within a defined region of the fixed
image. For each point, the corresponding moving image position is computed using the transform
with the specified parameters, then the interpolator is used to compute the moving image intensity
at the mapped position.
The metrics also support region based evaluation. The SetFixedImageMask() and
SetMovingImageMask() methods may be used to restrict evaluation of the metric within a spec-
ified region. The masks may be of any type derived from itk::SpatialObject.
Besides the measure value, gradient-based optimization schemes also require derivatives of
the measure with respect to each transform parameter. The methods GetDerivatives() and
GetValueAndDerivatives() can be used to obtain the gradient information.
The following is the list of metrics currently available in OTB:
• Mean squares
itk::MeanSquaresImageToImageMetric
• Normalized correlation
itk::NormalizedCorrelationImageToImageMetric
• Mean reciprocal squared difference
itk::MeanReciprocalSquareDifferenceImageToImageMetric
• Mutual information by Viola and Wells
itk::MutualInformationImageToImageMetric
• Mutual information by Mattes
itk::MattesMutualInformationImageToImageMetric
• Kullback Liebler distance metric by Kullback and Liebler
itk::KullbackLeiblerCompareHistogramImageToImageMetric
• Normalized mutual information
itk::NormalizedMutualInformationHistogramImageToImageMetric
• Mean squares histogram
itk::MeanSquaresHistogramImageToImageMetric
• Correlation coefficient histogram
itk::CorrelationCoefficientHistogramImageToImageMetric
• Cardinality Match metric
itk::MatchCardinalityImageToImageMetric
• Kappa Statistics metric
itk::KappaStatisticImageToImageMetric
• Gradient Difference metric
itk::GradientDifferenceImageToImageMetric
In the following sections, we describe each metric type in detail. For ease of notation, we will refer
to the fixed image f(X) and transformed moving image (m ◦ T(X)) as images A and B.
9.7.1 Mean Squares Metric
The itk::MeanSquaresImageToImageMetric computes the mean squared pixel-wise difference
in intensity between image A and B over a user defined region:
MS(A,B) = \frac{1}{N} \sum_{i=1}^{N} (A_i - B_i)^2   (9.8)

A_i is the i-th pixel of Image A
B_i is the i-th pixel of Image B
N is the number of pixels considered
The optimal value of the metric is zero. Poor matches between images A and B result in large values
of the metric. This metric is simple to compute and has a relatively large capture radius.
This metric relies on the assumption that intensity representing the same homologous point must be
the same in both images. Hence, its use is restricted to images of the same modality. Additionally,
any linear changes in the intensity result in a poor match value.
Exploring a Metric
Getting familiar with the characteristics of the Metric as a cost function is fundamental in order
to find the best way of setting up an optimization process that will use this metric for solving a
registration problem.
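As a quick illustration (a sketch, not from the guide), one could sample the metric along a range of
translations, reusing the metric and parameters configured in the sketch of the previous section, and
inspect the resulting profile:

// Hypothetical exploration of the metric profile: evaluate the mean squares
// value for a range of translations along the first image axis.
for (double tx = -10.0; tx <= 10.0; tx += 1.0)
  {
  parameters[0] = tx;   // translation along x
  parameters[1] = 0.0;  // translation along y
  std::cout << tx << "\t" << metric->GetValue(parameters) << std::endl;
  }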
9.7.2 Normalized Correlation Metric
The itk::NormalizedCorrelationImageToImageMetric computes pixel-wise cross-correlation
and normalizes it by the square root of the autocorrelation of the images:
NC(A,B) = -1 \times \frac{\sum_{i=1}^{N} (A_i \cdot B_i)}{\sqrt{\sum_{i=1}^{N} A_i^2 \cdot \sum_{i=1}^{N} B_i^2}}   (9.9)

A_i is the i-th pixel of Image A
B_i is the i-th pixel of Image B
N is the number of pixels considered
Note the −1 factor in the metric computation. This factor is used to make the metric be optimal when
its minimum is reached. The optimal value of the metric is then minus one. Misalignment between
the images results in small measure values. The use of this metric is limited to images obtained
using the same imaging modality. The metric is insensitive to multiplicative factors – illumination
changes – between the two images. This metric produces a cost function with sharp peaks and well
defined minima. On the other hand, it has a relatively small capture radius.
9.7.3 Mean Reciprocal Square Differences
The itk::MeanReciprocalSquareDifferenceImageToImageMetric computes pixel-wise differences
and adds them after passing them through a bell-shaped function \frac{1}{1+x^2}:

PI(A,B) = \sum_{i=1}^{N} \frac{1}{1 + \frac{(A_i - B_i)^2}{\lambda^2}}   (9.10)

A_i is the i-th pixel of Image A
B_i is the i-th pixel of Image B
N is the number of pixels considered
\lambda controls the capture radius
The optimal value is N and poor matches result in small measure values. The characteristics of this
metric have been studied by Penney and Holden [59][108].
This image metric has the advantage of producing poor values when few pixels are considered. This
makes it consistent when its computation is subject to the size of the overlap region between the
images. The capture radius of the metric can be regulated with the parameter λ. The profile of this
metric is very peaky. The sharp peaks of the metric help to measure spatial misalignment with high
precision. Note that the notion of capture radius is used here in terms of the intensity domain, not
the spatial domain. In that regard, λ should be given in intensity units and be associated with the
intensity difference that makes the metric drop by 50%.
The metric is limited to images of the same image modality. The fact that its derivative is large at the
central peak is a problem for some optimizers that rely on the derivative to decrease as the extrema
are reached. This metric is also sensitive to linear changes in intensity.
9.7.4 Mutual Information Metric
The itk::MutualInformationImageToImageMetric computes the mutual information between
image A and image B. Mutual information (MI) measures how much information one random vari-
able (image intensity in one image) tells about another random variable (image intensity in the other
image). The major advantage of using MI is that the actual form of the dependency does not have
to be specified. Therefore, complex mapping between two images can be modeled. This flexibility
makes MI well suited as a criterion of multi-modality registration [112].
Mutual information is defined in terms of entropy. Let
H(A) = -\int p_A(a) \log p_A(a) \, da   (9.11)
be the entropy of random variable A, H(B) the entropy of random variable B and
H(A,B) = -\int p_{AB}(a,b) \log p_{AB}(a,b) \, da \, db   (9.12)
be the joint entropy of A and B. If A and B are independent, then
pAB(a,b) = pA(a)pB(b) (9.13)
and
H(A,B) = H(A)+ H(B). (9.14)
However, if there is any dependency, then
H(A,B) < H(A)+ H(B). (9.15)
The difference is called Mutual Information : I(A,B)
I(A,B) = H(A)+ H(B)− H(A,B) (9.16)
Parzen Windowing
Figure 9.16: In Parzen windowing, a continuous density function is constructed by superimposing kernel
functions (Gaussian functions of width Sigma in this case) centered on the intensity samples (gray levels)
obtained from the image.
In a typical registration problem, direct access
to the marginal and joint probability densities
is not available and hence the densities must be
estimated from the image data. Parzen win-
dows (also known as kernel density estima-
tors) can be used for this purpose. In this
scheme, the densities are constructed by taking
intensity samples S from the image and super-
imposing kernel functions K(·) centered on
the elements of S, as illustrated in Figure 9.16.
A variety of functions can be used as the
smoothing kernel with the requirement that
they are smooth, symmetric, have zero mean
and integrate to one. For example, box-
car, Gaussian and B-spline functions are suit-
able candidates. A smoothing parameter is
used to scale the kernel function. The larger
the smoothing parameter, the wider the kernel
function used and hence the smoother the density estimate. If the parameter is too large, features
such as modes in the density will get smoothed out. On the other hand, if the smoothing parameter is
too small, the resulting density may be too noisy. The estimation is given by the following equation.
p(a) \approx P^{*}(a) = \frac{1}{N} \sum_{s_j \in S} K(a - s_j)   (9.17)
Choosing the optimal smoothing parameter is a difficult research problem and beyond the scope of
this software guide. Typically, the optimal value of the smoothing parameter will depend on the data
and the number of samples used.
Viola and Wells Implementation
OTB, through ITK, has multiple implementations of the mutual information metric. One of the
most commonly used is itk::MutualInformationImageToImageMetric and follows the method
specified by Viola and Wells in [138].
In this implementation, two separate intensity samples S and R are drawn from the image: the first
to compute the density, and the second to approximate the entropy as a sample mean:
H(A) = -\frac{1}{N} \sum_{r_j \in R} \log P^{*}(r_j).   (9.18)
Gaussian density is used as a smoothing kernel, where the standard deviation σ acts as the smoothing
parameter.
The number of spatial samples used for computation is defined using the
SetNumberOfSpatialSamples() method. Typical values range from 50 to 100. Note that the
computation involves an N × N loop and hence the computational burden becomes very high
when a large number of samples is used.
The quality of the density estimates depends on the choice of the standard deviation of the
Gaussian kernel. The optimal choice will depend on the content of the images. In our expe-
rience with the toolkit, we have found that a standard deviation of 0.4 works well for images
that have been normalized to have a mean of zero and standard deviation of 1.0. The stan-
dard deviation of the fixed image and moving image kernel can be set separately using methods
SetFixedImageStandardDeviation() and SetMovingImageStandardDeviation().
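As a small sketch (the values below follow the advice above but are assumptions, not mandatory
settings), the metric could be configured as follows:

// Hedged sketch of Viola-Wells mutual information parameters; images are
// assumed to have been normalized to zero mean and unit standard deviation.
#include "itkMutualInformationImageToImageMetric.h"

typedef itk::MutualInformationImageToImageMetric<ImageType, ImageType> MIMetricType;
MIMetricType::Pointer miMetric = MIMetricType::New();
miMetric->SetNumberOfSpatialSamples(100);       // typical range: 50 to 100
miMetric->SetFixedImageStandardDeviation(0.4);  // kernel bandwidth on the fixed image
miMetric->SetMovingImageStandardDeviation(0.4); // kernel bandwidth on the moving image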
Mattes et al. Implementation
Another form of mutual information metric available in ITK follows the method specified by Mattes
et al. in [95] and is implemented by the itk::MattesMutualInformationImageToImageMetric
class.
In this implementation, only one set of intensity samples is drawn from the image. Using this set,
the marginal and joint probability density function (PDF) is evaluated at discrete positions or bins
uniformly spread within the dynamic range of the images. Entropy values are then computed by
summing over the bins.
The number of spatial samples used is set using method SetNumberOfSpatialSamples(). The
number of bins used to compute the entropy values is set via SetNumberOfHistogramBins().
Since the fixed image PDF does not contribute to the metric derivatives, it does not need to be
smooth. Hence, a zero order (boxcar) B-spline kernel is used for computing the PDF. On the other
hand, to ensure smoothness, a third order B-spline kernel is used to compute the moving image
intensity PDF. The advantage of using a B-spline kernel over a Gaussian kernel is that the B-spline
kernel has a finite support region. This is computationally attractive, as each intensity sample only
affects a small number of bins and hence does not require a N ×N loop to compute the metric value.
During the PDF calculations, the image intensity values are linearly scaled to have a minimum of
zero and maximum of one. This rescaling means that a fixed B-spline kernel bandwidth of one can
be used to handle image data with arbitrary magnitude and dynamic range.
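A comparable sketch for the Mattes variant; the bin and sample counts below are only illustrative
assumptions:

// Hedged sketch of Mattes mutual information parameters.
#include "itkMattesMutualInformationImageToImageMetric.h"

typedef itk::MattesMutualInformationImageToImageMetric<ImageType, ImageType> MattesMetricType;
MattesMetricType::Pointer mattesMetric = MattesMetricType::New();
mattesMetric->SetNumberOfHistogramBins(50);      // bins of the joint PDF
mattesMetric->SetNumberOfSpatialSamples(10000);  // samples drawn from the fixed image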
9.7.5 Kullback-Leibler distance metric
The itk::KullbackLeiblerCompareHistogramImageToImageMetric is yet another informa-
tion based metric. Kullback-Leibler distance measures the relative entropy between two discrete
probability distributions. The distributions are obtained from the histograms of the two input im-
ages, A and B.
The Kullback-Leibler distance between two histograms is given by

KL(A,B) = \sum_{i}^{N} p_A(i) \times \log \frac{p_A(i)}{p_B(i)}   (9.19)
The distance is always non-negative and is zero only if the two distributions are the same. Note
that the distance is not symmetric: in general, KL(A,B) ≠ KL(B,A). Nevertheless, if the
distributions are not too dissimilar, the difference between KL(A,B) and KL(B,A) is small.
The implementation in ITK is based on [26].
9.7.6 Normalized Mutual Information Metric
Given two images, A and B, the normalized mutual information may be computed as
NMI(A,B) = 1 + \frac{I(A,B)}{H(A,B)} = \frac{H(A) + H(B)}{H(A,B)}   (9.20)
where the entropies of the images, H(A) and H(B), the mutual information I(A,B), and the joint
entropy H(A,B) are computed as described in section 9.7.4. Details of the implementation may be
found in [55].
9.7.7 Mean Squares Histogram
The itk::MeanSquaresHistogramImageToImageMetric is an alternative implementation of the
Mean Squares Metric. In this implementation the joint histogram of the fixed and the mapped
moving image is built first. The user selects the number of bins to use in this joint histogram.
Once the joint histogram is computed, the bins are visited with an iterator. Given that each bin is
associated with a pair of intensities of the form {fixed intensity, moving intensity}, along with the
number of pixel pairs in the images that fell into this bin, it is then possible to compute the sum of
squared differences between the intensities of both images at the quantization levels defined by the
joint histogram bins.
This metric can be represented with Equation 9.21
MSH = \sum_{f} \sum_{m} H(f,m) (f - m)^2   (9.21)
where H(f,m) is the count on the joint histogram bin identified with fixed image intensity f and
moving image intensity m.
9.7.8 Correlation Coefficient Histogram
The itk::CorrelationCoefficientHistogramImageToImageMetric computes the cross correlation
coefficient between the intensities of the fixed image and the intensities of the mapped moving
image. This metric is intended to be used on images of the same modality, where the relationship
between the intensities of the fixed image and the intensities of the moving image is linear.
The correlation coefficient is computed from the joint histogram as

CC = \frac{\sum_{f} \sum_{m} H(f,m) \, f \cdot m \; - \; \bar{f} \cdot \bar{m}}{\sqrt{\sum_{f} H(f) (f - \bar{f})^2 \cdot \sum_{m} H(m) (m - \bar{m})^2}}   (9.22)

where H(f,m) is the joint histogram count for the bin identified with the fixed image intensity f
and the moving image intensity m. The values \bar{f} and \bar{m} are the mean values of the fixed and
moving images respectively, and H(f) and H(m) are the marginal histogram counts of the fixed and
moving images respectively. The optimal value of the correlation coefficient is 1, which would
indicate a perfect straight line in the joint histogram.
9.7.9 Cardinality Match Metric
The itk::MatchCardinalityImageToImageMetric computes cardinality of the set of pixels that
match exactly between the moving and fixed images. In other words, it computes the number of
pixel matches and mismatches between the two images. The match is designed for label maps. All
pixel mismatches are considered equal, whether they are between label 1 and label 2 or between label
1 and label 500. In other words, the magnitude of an individual label mismatch is not relevant; only
the occurrence of a label mismatch is important.
The spatial correspondence between the fixed and moving images is established using an
itk::Transform set with the SetTransform() method and an interpolator set with
SetInterpolator(). Given that we are matching pixels with labels, it is advisable to use nearest
neighbor interpolation, as sketched below.
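A minimal sketch of such a setup (LabelImageType is an assumed label image type, not defined in
the guide) could be:

// Hedged sketch: cardinality match metric with nearest neighbor interpolation.
#include "itkMatchCardinalityImageToImageMetric.h"
#include "itkNearestNeighborInterpolateImageFunction.h"

typedef itk::MatchCardinalityImageToImageMetric<LabelImageType, LabelImageType> CardinalityMetricType;
typedef itk::NearestNeighborInterpolateImageFunction<LabelImageType, double>    NNInterpolatorType;

CardinalityMetricType::Pointer cardinalityMetric = CardinalityMetricType::New();
NNInterpolatorType::Pointer    nnInterpolator    = NNInterpolatorType::New();
cardinalityMetric->SetInterpolator(nnInterpolator);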
9.7.10 Kappa Statistics Metric
The itk::KappaStatisticImageToImageMetric computes the spatial intersection of two binary
images. The metric is designed for matching pixels in two images with exactly the same value,
which may be set using SetForegroundValue(). Given two images A and B, the κ coefficient is
computed as
\kappa = \frac{|A \cap B|}{|A| + |B|}   (9.23)
Figure 9.17: Class diagram of the optimizers hierarchy. The lower right corner of the diagram shows the
vxl/vnl optimizers (vnl_levenberg_marquardt, vnl_amoeba, vnl_conjugate_gradient, vnl_lbfgs) wrapped
by the corresponding ITK adaptor classes.
where |A| is the number of foreground pixels in image A. This computes the fraction of area in the
two images that is common to both of them. Only foreground pixels are considered in the computation
of the metric.
9.7.11 Gradient Difference Metric
The itk::GradientDifferenceImageToImageMetric evaluates the difference in the derivatives of
the moving and fixed images. The derivatives are passed through a function \frac{1}{1+x} and then
added. The purpose of this metric is to focus the registration on the edges of structures in the images.
In this way the borders exert a larger influence on the result of the registration than the interiors of
homogeneous regions of the image.
9.8 Optimizers
Optimization algorithms are encapsulated as itk::Optimizer objects within OTB. Optimizers are
generic and can be used for applications other than registration. Within the registration framework,
subclasses of itk::SingleValuedNonLinearOptimizer are used to optimize the metric criterion
with respect to the transform parameters.
The basic input to an optimizer is a cost function object. In the context of registration, the
itk::ImageToImageMetric classes provide this functionality. The initial parameters are set using
SetInitialPosition() and the optimization algorithm is invoked by
StartOptimization(). Once the optimization has finished, the final parameters can be obtained
using GetCurrentPosition().
Some optimizers also allow rescaling of their individual parameters. This is convenient for normal-
izing parameter spaces where some parameters have different dynamic ranges. For example, the
first parameter of itk::Euler2DTransform represents an angle while the last two parameters rep-
resent translations. A unit change in angle has a much greater impact on an image than a unit change
in translation. This difference in scale appears as long narrow valleys in the search space, making
the optimization problem more difficult. Rescaling the translation parameters can help to fix this
problem. Scales are represented as an itk::Array of doubles and set using SetScales().
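A hedged sketch of such a setup is given below; the choice of itk::Euler2DTransform, the scale
values, and the metric object (assumed to be configured as in the earlier sketches) are illustrative
assumptions, not prescriptions:

// Hedged sketch: regular step gradient descent with rescaled parameters.
#include "itkEuler2DTransform.h"
#include "itkRegularStepGradientDescentOptimizer.h"

typedef itk::Euler2DTransform<double>            EulerTransformType;
typedef itk::RegularStepGradientDescentOptimizer OptimizerSketchType;

EulerTransformType::Pointer  eulerTransform = EulerTransformType::New();
OptimizerSketchType::Pointer optimizer      = OptimizerSketchType::New();

// One scale per transform parameter: angle, translation x, translation y.
OptimizerSketchType::ScalesType scales(eulerTransform->GetNumberOfParameters());
scales[0] = 1.0;          // rotation angle (radians)
scales[1] = 1.0 / 1000.0; // translation along x
scales[2] = 1.0 / 1000.0; // translation along y
optimizer->SetScales(scales);

optimizer->SetMaximumStepLength(1.0);
optimizer->SetMinimumStepLength(0.01);
optimizer->SetNumberOfIterations(200);

// "metric" is an already configured itk::ImageToImageMetric (see section 9.7).
optimizer->SetCostFunction(metric);
optimizer->SetInitialPosition(eulerTransform->GetParameters());
optimizer->StartOptimization();
std::cout << "Final position: " << optimizer->GetCurrentPosition() << std::endl;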
There are two main types of optimizers in OTB. In the first type we find optimizers that are suitable
for dealing with cost functions that return a single value. These are indeed the most common type of
cost functions, and are known as Single Valued functions, therefore the corresponding optimizers are
known as Single Valued optimizers. The second type of optimizers are those suitable for managing
cost functions that return multiple values at each evaluation. These cost functions are common in
model-fitting problems and are known as Multi Valued or Multivariate functions. The corresponding
optimizers are therefore called MultipleValued optimizers in OTB.
The itk::SingleValuedNonLinearOptimizer is the base class for the first type of optimizers
while the itk::MultipleValuedNonLinearOptimizer is the base class for the second type of
optimizers.
The types of single valued optimizer currently available in OTB are:
• Amoeba: Nelder-Mead downhill simplex. This optimizer is actually implemented in the
vxl/vnl numerics toolkit. The ITK class itk::AmoebaOptimizer is merely an adaptor
class.
• Conjugate Gradient: Fletcher-Reeves form of the conjugate gradient with or without pre-
conditioning ( itk::ConjugateGradientOptimizer). It is also an adaptor to an optimizer
in vnl.
• Gradient Descent: Advances parameters in the direction of the gradient where the step size
is governed by a learning rate ( itk::GradientDescentOptimizer).
• Quaternion Rigid Transform Gradient Descent: A specialized version of GradientDescen-
tOptimizer for QuaternionRigidTransform parameters, where the parameters representing the
quaternion are normalized to a magnitude of one at each iteration to represent a pure rotation
( itk::QuaternionRigidTransformGradientDescent).
• LBFGS: Limited memory Broyden, Fletcher, Goldfarb and Shanno minimization. It is an
adaptor to an optimizer in vnl ( itk::LBFGSOptimizer).
• LBFGSB: A modified version of the LBFGS optimizer that allows specifying bounds for the
parameters in the search space. It is an adaptor to an optimizer in netlib. Details on this
optimizer can be found in [19, 20] ( itk::LBFGSBOptimizer).
• One Plus One Evolutionary: Strategy that simulates the biological evolution of a set of
samples in the search space ( itk::OnePlusOneEvolutionaryOptimizer). Details on this
optimizer can be found in [130].
• Regular Step Gradient Descent: Advances parameters in the direction of
the gradient where a bipartition scheme is used to compute the step size (
itk::RegularStepGradientDescentOptimizer).
• Powell Optimizer: Powell optimization method. For an N-dimensional parameter space,
each iteration minimizes (maximizes) the function in N (initially orthogonal) directions. This
optimizer is described in [114] ( itk::PowellOptimizer).
• SPSA Optimizer: Simultaneous Perturbation Stochastic Approximation Method.
This optimizer is described in http://www.jhuapl.edu/SPSA and in [128]. (
itk::SPSAOptimizer).
• Versor Transform Optimizer: A specialized version of the RegularStepGradientDescentOp-
timizer for VersorTransform parameters, where the current rotation is composed with the gra-
dient rotation to produce the new rotation versor. It follows the definition of versor gradients
defined by Hamilton [56] ( itk::VersorTransformOptimizer).
• Versor Rigid3D Transform Optimizer: A specialized version of the RegularStepGradi-
entDescentOptimizer for VersorRigid3DTransform parameters, where the current rotation
is composed with the gradient rotation to produce the new rotation versor. The translational
part of the transform parameters is updated as usual in a vector space (
itk::VersorRigid3DTransformOptimizer).
A parallel hierarchy exists for optimizing multiple-valued cost functions. The base optimizer in this
branch of the hierarchy is the itk::MultipleValuedNonLinearOptimizer whose only current
derived class is:
• Levenberg Marquardt: Non-linear least squares minimization. It is an adaptor to an optimizer
in vnl ( itk::LevenbergMarquardtOptimizer). This optimizer is described in [114].
Figure 9.17 illustrates the full class hierarchy of optimizers in OTB. Optimizers in the lower right
corner are adaptor classes to optimizers existing in the vxl/vnl numerics toolkit. The optimizers
interact with the itk::CostFunction class. In the registration framework this cost function is
reimplemented in the form of ImageToImageMetric.
9.9 Landmark-based registration
The source code for this example can be found in the file
Examples/Patented/EstimateAffineTransformationExample.cxx.
This example demonstrates the use of the otb::KeyPointSetsMatchingFilter for transformation
estimation between two images. The idea here is to match SIFT points extracted from both
the fixed and the moving images. The use of SIFT points is demonstrated in section 14.2.2. The
otb::LeastSquareAffineTransformEstimator will then be used to estimate an affine transform
from the matched point set by least squares optimization.
The first step toward the use of these filters is to include the appropriate header files.
#include "otbKeyPointSetsMatchingFilter.h"
#include "otbImageToSIFTKeyPointSetFilter.h"
// Disabling deprecation warning if on visual
#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable :4996)
#endif
#include "itkDeformationFieldSource.h"
// Enabling remaining deprecation warning
#ifdef _MSC_VER
#pragma warning(pop)
#endif
Then we must decide what pixel type to use for the image. We choose to do all the computations in
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef double RealType;
typedef unsigned char OutputPixelType;
typedef otb::Image <RealType , Dimension > ImageType ;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
The SIFTs obtained for the matching will be stored in vector form inside a point set. So we need the
following types:
typedef itk::VariableLengthVector <RealType > RealVectorType;
typedef itk::PointSet <RealVectorType , Dimension > PointSetType;
The filter for computing the SIFTs has a type defined as follows:
typedef otb::ImageToSIFTKeyPointSetFilter<ImageType , PointSetType >
ImageToSIFTKeyPointSetFilterType;
Although many choices for evaluating the distances during the matching procedure exist, we choose
here to use a simple Euclidean distance. We can then define the type for the matching filter.
typedef itk::Statistics ::EuclideanDistance <RealVectorType > DistanceType;
typedef otb::KeyPointSetsMatchingFilter<PointSetType , DistanceType >
EuclideanDistanceMatchingFilterType;
The following types are needed for dealing with the matched points.
typedef PointSetType::PointType PointType ;
typedef std::pair <PointType , PointType > MatchType ;
typedef std::vector <MatchType > MatchVectorType;
typedef EuclideanDistanceMatchingFilterType:: LandmarkListType
LandmarkListType;
typedef PointSetType::PointsContainer PointsContainerType;
typedef PointsContainerType::Iterator PointsIteratorType;
typedef PointSetType::PointDataContainer PointDataContainerType;
typedef PointDataContainerType::Iterator PointDataIteratorType;
We define the type for the image reader.
typedef otb::ImageFileReader <ImageType > ReaderType ;
Two readers are instantiated: one for the fixed image and one for the moving image.
ReaderType ::Pointer fixedReader = ReaderType ::New ();
ReaderType ::Pointer movingReader = ReaderType ::New ();
fixedReader ->SetFileName (fixedfname );
movingReader ->SetFileName (movingfname );
fixedReader ->UpdateOutputInformation();
movingReader ->UpdateOutputInformation();
We will now instantiate the 2 SIFT filters and the filter used for the matching of the points.
ImageToSIFTKeyPointSetFilterType::Pointer filter1 =
ImageToSIFTKeyPointSetFilterType::New();
ImageToSIFTKeyPointSetFilterType::Pointer filter2 =
ImageToSIFTKeyPointSetFilterType::New();
EuclideanDistanceMatchingFilterType::Pointer euclideanMatcher =
EuclideanDistanceMatchingFilterType::New ();
We plug the pipeline and set the parameters.
filter1 ->SetInput(fixedReader ->GetOutput ());
filter2 ->SetInput(movingReader ->GetOutput ());
filter1 ->SetOctavesNumber(octaves );
filter1 ->SetScalesNumber(scales);
filter1 ->SetDoGThreshold(threshold );
filter1 ->SetEdgeThreshold(ratio);
filter2 ->SetOctavesNumber(octaves );
filter2 ->SetScalesNumber(scales);
filter2 ->SetDoGThreshold(threshold );
filter2 ->SetEdgeThreshold(ratio);
We use a simple Euclidean distance to select matched points.
euclideanMatcher ->SetInput1 (filter1 ->GetOutput ());
euclideanMatcher ->SetInput2 (filter2 ->GetOutput ());
euclideanMatcher ->SetDistanceThreshold(secondOrderThreshold);
euclideanMatcher ->SetUseBackMatching(useBackMatching);
euclideanMatcher ->Update ();
The matched points will be stored into a landmark list.
LandmarkListType::Pointer landmarkList;
landmarkList = euclideanMatcher ->GetOutput ();
We now apply the least squares estimation algorithm to the matched points:
typedef itk::Point <double, 2> MyPointType ;
typedef otb::LeastSquareAffineTransformEstimator<MyPointType > EstimatorType;
// instantiation of the estimator of the affine transformation
EstimatorType::Pointer estimator = EstimatorType::New();
std::cout << "landmark list size " << landmarkList ->Size() << std::endl;
for (LandmarkListType::Iterator it = landmarkList ->Begin();
it != landmarkList ->End (); ++it)
{
estimator ->AddTiePoints(it.Get()->GetPoint1 (), it.Get()->GetPoint2 ());
}
// Trigger computation
estimator ->Compute ();
We now resample the image using the estimated transformation.
It is common, as the last step of a registration task, to use the resulting transform to map the moving
image into the fixed image space. This is easily done with the itk::ResampleImageFilter. First,
a ResampleImageFilter type is instantiated using the image types.
typedef itk::ResampleImageFilter <
ImageType ,
OutputImageType > ResampleFilterType;
A resampling filter is created and the moving image is connected as its input.
ResampleFilterType::Pointer resampler = ResampleFilterType::New ();
resampler ->SetInput (movingReader ->GetOutput ());
The transform produced as output does not need to be inverted, because we apply the resampling
algorithm to the "moving" image in order to produce an image in the geometry of the fixed image.
ImageType ::Pointer fixedImage = fixedReader ->GetOutput ();
resampler ->SetTransform(estimator ->GetAffineTransform());
resampler ->SetSize(fixedImage ->GetLargestPossibleRegion(). GetSize ());
resampler ->SetOutputOrigin(fixedImage ->GetOrigin ());
resampler ->SetOutputSpacing(fixedImage ->GetSpacing ());
resampler ->SetDefaultPixelValue(100);
Figure 9.18: From left to right and top to bottom: fixed input image, moving image, resampled moving image.
We write the resampled image to a PNG file:
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(resampler ->GetOutput ());
writer ->SetFileName (outputImageFilename);
writer ->Update();
Figure 9.18 shows the resampled image obtained with the transformation estimated from the SIFT
points.
CHAPTER
TEN
DISPARITY MAP ESTIMATION
This chapter introduces the tools available in OTB for the estimation of geometric disparities be-
tween images.
10.1 Disparity Maps
The problem we want to deal with is the one of the automatic disparity map estimation of images
acquired with different sensors. By different sensors, we mean sensors which produce images
with different radiometric properties, that is, sensors which measure different physical magnitudes:
optical sensors operating in different spectral bands, radar and optical sensors, etc.
For this kind of image pair, the classical approach of fine correlation [82, 44] cannot always be
used to provide the required accuracy, since this similarity measure (the correlation coefficient) can
only measure similarities up to an affine transformation of the radiometries.
There are two main questions which can be asked about what we want to do:
1. Can we define what the similarity is between, for instance, a radar and an optical image?
2. What does fine registration mean in the case where the geometric distortions are so big and
the source of information can be located in different places (for instance, the same edge can
be produced by the edge of the roof of a building in an optical image and by the wall-ground
bounce in a radar image)?
We can answer by saying that the images of the same object obtained by different sensors are two
different representations of the same reality. For the same spatial location, we have two different
measures. Both measurements come from the same source and thus share a lot of common
information. This relationship may not be perfect, but it can be evaluated in a relative way: different
geometrical distortions are compared and the one leading to the strongest link between the two
measures is kept.
When working with images acquired with the same (type of) sensor, one can use a very effective
approach. Since the correlation coefficient is a robust and fast similarity measure for similar images,
one can afford to apply it to every pixel of one image in order to search for the corresponding
homologous point (HP) in the other image. One can thus build a deformation grid (a sampling of
the deformation map). If the sampling step of this grid is short enough, interpolation using an
analytical model is not needed and high frequency deformations can be estimated. The obtained grid
can be used as a re-sampling grid in order to obtain the registered images.
This approach, combined with image interpolation techniques (in order to estimate sub-pixel
deformations) and multi-resolution strategies, allows for obtaining the best performances in terms of
deformation estimation, and hence for automatic image registration.

Unfortunately, in the multi-sensor case, the correlation coefficient cannot be used. We will thus try
to find similarity measures which can be applied in the multi-sensor case with the same approach as
the correlation coefficient.
We start by giving several definitions which allow for the formalization of the image registration
problem. First of all, we define the master image and the slave image:
Definition 1 Master image: image to which other images will be registered; its geometry is consid-
ered as the reference.
Definition 2 Slave image: image to be geometrically transformed in order to be registered to the
master image.
Two main concepts are the one of similarity measure and the one of geometric transformation:
Definition 3 Let I and J be two images and let c be a similarity criterion; we call similarity measure
any scalar, strictly positive function
Sc(I,J) = f(I,J,c). (10.1)
Sc has an absolute maximum when the two images I and J are identical in the sense of the criterion c.
Definition 4 A geometric transformation T is an operator which, applied to the coordinates (x,y)
of a point in the slave image, gives the coordinates (u,v) of its HP in the master image:
\begin{pmatrix} u \\ v \end{pmatrix} = T \begin{pmatrix} x \\ y \end{pmatrix}   (10.2)
Finally we introduce a definition for the image registration problem:
Definition 5 Registration problem:
1. determine a geometric transformation T which maximizes the similarity between a master
image I and the result of the transformation T ◦ J:
\arg\max_{T} \left( S_c(I, T \circ J) \right);   (10.3)
2. re-sampling of J by applying T.
10.1.1 Geometric deformation modeling
The geometric transformation of definition 4 is used for the correction of the existing deformation
between the two images to be registered. This deformation contains information linked to the
observed scene and the acquisition conditions. It can be classified into three classes depending on
its physical source:
1. deformations linked to the mean attitude of the sensor (incidence angle, presence or absence
of yaw steering, etc.);
2. deformations linked to a stereo vision (mainly due to the topography);
3. deformations linked to attitude evolution during the acquisition (vibrations which are mainly
present in push-broom sensors).
These deformations are characterized by their spatial frequencies and intensities which are summa-
rized in table 10.1.
                       Intensity   Spatial Frequency
  Mean attitude        Strong      Low
  Stereo               Medium      High and Medium
  Attitude evolution   Low         Low to Medium

Table 10.1: Characterization of the geometric deformation sources

Depending on the type of deformation to be corrected, its model will be different. For example, if
the only deformation to be corrected is the one introduced by the mean attitude, a physical model
for the acquisition geometry (independent of the image contents) will be enough. If the sensor is
not well known, this deformation can be approximated by a simple analytical model. When the
deformations to be modeled are of high frequency, analytical (parametric) models are not suitable for
a fine registration. In this case, one has to use a fine sampling of the deformation, that is, deformation
grids. These grids give, for a set of pixels of the master image, their location in the slave image.
The following points summarize the problem of the deformation modeling:
1. An analytical model is just an approximation of the deformation. It is often obtained as
follows:
(a) Directly from a physical model without using any image content information.
(b) By estimation of the parameters of an a priori model (polynomial, affine, etc.). These
parameters can be estimated:
i. Either by solving the equations obtained by taking HP. The HP can be manually or
automatically extracted.
ii. Or by maximization of a global similarity measure.
2. A deformation grid is a sampling of the deformation map.
The last point implies that the sampling period of the grid must be short enough in order to account
for high frequency deformations (Shannon sampling theorem). Of course, if the deformations are
non-stationary (as is usually the case for topographic deformations), the sampling can be irregular.
As a conclusion, we can say that definition 5 poses the registration problem as an optimization
problem. This optimization can be either global or local with a similarity measure which can also
be either local or global. All this is synthesized in table 10.2.
  Geometric model                         Similarity measure   Optimization of the deformation
  Physical model                          None                 Global
  Analytical model with a priori HP       Local                Global
  Analytical model without a priori HP    Global               Global
  Grid                                    Local                Local

Table 10.2: Approaches to image registration
The ideal approach would consist in a registration which is locally optimized, both in similarity and
deformation, in order to have the best registration quality. This is the case when deformation grids
with dense sampling are used. Unfortunately, this case is the most computationally heavy and one
often uses either a low sampling rate of the grid, or the evaluation of the similarity in a small set
of pixels for the estimation of an analytical model. Both of these choices lead to local registration
errors which, depending on the topography, can amount to several pixels.

Even if this registration accuracy can be enough for many applications (ortho-registration, import
into a GIS, etc.), it is not acceptable in the case of data fusion, multi-channel segmentation or
change detection [133]. This is why we will focus on the problem of deformation estimation using
dense grids.
10.1.2 Similarity measures
The fine modeling of the geometric deformation we are looking for requires the estimation of
the coordinates of nearly every pixel of the master image inside the slave image. In the classical
mono-sensor case where we use the correlation coefficient, we proceed as follows.
The geometric deformation is modeled by local rigid displacements. One wants to estimate the
coordinates of each pixel of the master image inside the slave image. This can be represented by
a displacement vector associated with every pixel of the master image. Each of the two components
(lines and columns) of this vector field will be called a deformation grid.
We use a small window taken in the master image and we test the similarity for every possible shift
within an exploration area inside the slave image (figure 10.1).
That means that for each position we compute the correlation coefficient. The result is a correlation
surface whose maximum gives the most likely local shift between both images:
\rho_{I,J}(\Delta x, \Delta y) = \frac{\frac{1}{N} \sum_{x,y} (I(x,y) - m_I)(J(x + \Delta x, y + \Delta y) - m_J)}{\sigma_I \sigma_J}.   (10.4)
In this expression, N is the number of pixels of the analysis window, mI and mJ are the estimated
mean values inside the analysis window of respectively image I and image J and σI and σJ are their
standard deviations.
Quality criteria can be applied to the estimated maximum in order to give a confidence factor to
the estimated shift: width of the peak, maximum value, etc. Sub-pixel shifts can be measured by
applying fractional shifts to the sliding window. This can be done by image interpolation.

Figure 10.1: Estimation of the correlation surface. For each candidate point of the reference image, an
estimation window is compared against a search window in the secondary image; similarity estimation and
optimization yield the optimal shift (∆x, ∆y).
The interesting parameters of the procedure are:
• The size of the exploration area: it determines the computational load of the algorithm (we
want to reduce it), but it has to be large enough in order to cope with large deformations.
• The size of the sliding window: the robustness of the correlation coefficient estimation in-
creases with the window size, but the hypothesis of local rigid shifts may not be valid for
large windows.
The correlation coefficient cannot be used with original grey-level images in the multi-sensor
case. It could be used on extracted features (edges, etc.), but the feature extraction can introduce
localization errors. Also, when the images come from sensors using very different modalities, it
can be difficult to find similar features in both images. In this case, one can try to find the similarity
at the pixel level, but with other similarity measures and apply the same approach as we have just
described.
The concept of similarity measure has been presented in definition 3. The difficulty of the procedure
lies in finding the function f which properly represents the criterion c. We also need f to be easily
and robustly estimated using small windows. We extend here what we proposed in [68].
10.1.3 The correlation coefficient
We recall here the computation of the correlation coefficient between two image windows I and J.
The coordinates of the pixels inside the windows are represented by (x,y):
\rho(I,J) = \frac{\frac{1}{N} \sum_{x,y} (I(x,y) - m_I)(J(x,y) - m_J)}{\sigma_I \sigma_J}.   (10.5)
In order to qualitatively characterize the different similarity measures we propose the following
experiment. We take two images which are perfectly registered and we extract a small window
of size N × M from each of the images (this size is set to 101 × 101 for this experiment). For
the master image, the window will be centered on coordinates (x0,y0) (the center of the image)
and for the slave image, it will be centered on coordinates (x0 + ∆x,y0). With different values
of ∆x (from -10 pixels to 10 pixels in our experiments), we obtain an estimate of ρ(I,J) as a
function of ∆x, which we write as ρ(∆x) for short. The obtained curve should have a maximum for
∆x = 0, since the images are perfectly registered. We would also like to have an absolute maxi-
mum with a high value and with a sharp peak, in order to have a good precision for the shift estimate.
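A possible implementation of this experiment is sketched below; the WindowCorrelation helper is
hypothetical (it is not part of OTB), ImageType is assumed to be the scalar otb::Image used in the
examples of this chapter, and the window regions are assumed to lie entirely inside both images.

// Hedged sketch of the experiment: correlation between a master window centered
// on (x0, y0) and a slave window centered on (x0 + dx, y0).
#include "itkImageRegionConstIterator.h"
#include <cmath>
#include <iostream>

double WindowCorrelation(ImageType::Pointer master, ImageType::Pointer slave,
                         ImageType::IndexType center, int dx, int radius)
{
  ImageType::IndexType mStart = center, sStart = center;
  mStart[0] -= radius;      mStart[1] -= radius;
  sStart[0] += dx - radius; sStart[1] -= radius;
  ImageType::SizeType size;
  size.Fill(2 * radius + 1);
  ImageType::RegionType masterRegion(mStart, size), slaveRegion(sStart, size);

  itk::ImageRegionConstIterator<ImageType> itI(master, masterRegion);
  itk::ImageRegionConstIterator<ImageType> itJ(slave, slaveRegion);
  double sumI = 0., sumJ = 0., sumII = 0., sumJJ = 0., sumIJ = 0.;
  const double n = static_cast<double>(size[0] * size[1]);
  for (itI.GoToBegin(), itJ.GoToBegin(); !itI.IsAtEnd(); ++itI, ++itJ)
    {
    const double i = itI.Get(), j = itJ.Get();
    sumI += i; sumJ += j; sumII += i * i; sumJJ += j * j; sumIJ += i * j;
    }
  const double mI = sumI / n, mJ = sumJ / n;
  const double sI = std::sqrt(sumII / n - mI * mI);
  const double sJ = std::sqrt(sumJJ / n - mJ * mJ);
  return (sumIJ / n - mI * mJ) / (sI * sJ);
}

// Example use (inside main()): rho(dx) for dx in [-10, 10] with a 101 x 101
// window (radius 50); the center coordinates below are purely illustrative.
ImageType::IndexType x0;
x0[0] = 500; x0[1] = 500;
for (int dx = -10; dx <= 10; ++dx)
  {
  std::cout << dx << "\t" << WindowCorrelation(master, slave, x0, dx, 50) << std::endl;
  }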
10.2 Regular grid disparity map estimation
The source code for this example can be found in the file
Examples/DisparityMap/FineRegistrationImageFilterExample.cxx.
This example demonstrates the use of the otb::FineRegistrationImageFilter. This filter per-
forms deformation estimation by looking for the extrema of an image-to-image metric within a
search window.
The first step toward the use of these filters is to include the proper header files.
#include "otbFineRegistrationImageFilter.h"
Several types of otb::Image are required to represent the input image, the metric field, and the
deformation field.
typedef otb::Image <PixelType , ImageDimension > InputImageType;
typedef otb::Image <PixelType , ImageDimension > MetricImageType;
typedef otb::Image <DeformationPixelType ,
ImageDimension > DeformationFieldType;
To make the metric estimation more robust, the first required step is to blur the input images. This
is done using the itk::RecursiveGaussianImageFilter:
typedef itk::RecursiveGaussianImageFilter<InputImageType ,
InputImageType > InputBlurType;
InputBlurType::Pointer fBlur = InputBlurType::New ();
fBlur ->SetInput(fReader->GetOutput ());
fBlur ->SetSigma(atof(argv[7]));
InputBlurType::Pointer mBlur = InputBlurType::New ();
mBlur ->SetInput(mReader->GetOutput ());
mBlur ->SetSigma(atof(argv[7]));
Now, we declare and instantiate the otb::FineRegistrationImageFilter which is going to per-
form the registration:
typedef otb::FineRegistrationImageFilter<InputImageType ,
MetricImageType ,
DeformationFieldType >
RegistrationFilterType;
RegistrationFilterType::Pointer registrator = RegistrationFilterType::New();
registrator ->SetMovingInput(mBlur ->GetOutput ());
registrator ->SetFixedInput(fBlur ->GetOutput ());
Some parameters need to be specified to the filter:
• The area where the search is performed. This area is defined by its radius:
typedef RegistrationFilterType::SizeType RadiusType ;
RadiusType searchRadius;
searchRadius[0] = atoi(argv[8]);
searchRadius[1] = atoi(argv[8]);
registrator ->SetSearchRadius(searchRadius);
• The window used to compute the local metric. This window is also defined by its radius:
RadiusType metricRadius;
metricRadius[0] = atoi(argv[9]);
metricRadius[1] = atoi(argv[9]);
registrator ->SetRadius (metricRadius);
We also need to set the sub-pixel accuracy we want to obtain.
The default matching metric used by the otb::FineRegistrationImageFilter is standard
correlation. However, we may also use any other image-to-image metric provided by ITK. For
instance, here is how we would use the itk::MeanReciprocalSquareDifferenceImageToImageMetric
(do not forget to include the proper header).
typedef itk:: MeanReciprocalSquareDifferenceImageToImageMetric
<InputImageType, InputImageType> MRSDMetricType;
MRSDMetricType::Pointer mrsdMetric = MRSDMetricType::New ();
registrator ->SetMetric (mrsdMetric );
The itk::MeanReciprocalSquareDifferenceImageToImageMetric produces low values for poor
matches; therefore, the filter has to maximize the metric:
registrator ->MinimizeOff ();
The execution of the otb::FineRegistrationImageFilter will be triggered by the Update()
call on the writer at the end of the pipeline. Make sure to use an otb::ImageFileWriter if you
want to benefit from the streaming features.
Figure 10.2 shows the result of applying the otb::FineRegistrationImageFilter. From left to
right and top to bottom: fixed input image, moving image with a low stereo angle, local correlation
field, local mean reciprocal square difference field, resampled image based on correlation, resampled
image based on mean reciprocal square difference, estimated epipolar deformation using correlation,
and estimated epipolar deformation using mean reciprocal square difference.
10.3 Irregular grid disparity map estimation
Taking figure 10.1 as a starting point, we can generalize the approach by letting the user choose:
• the similarity measure;
• the geometric transform to be estimated (see definition 4);
In order to do this, we will use the ITK registration framework locally on a set of nodes. Once the
disparity is estimated on a set of nodes, we will use it to generate a deformation field: the dense,
regular vector field which gives the translation to be applied to a pixel of the secondary image to be
positioned on its homologous point of the master image.
The source code for this example can be found in the file
Examples/DisparityMap/SimpleDisparityMapEstimationExample.cxx.
This example demonstrates the use of the otb::DisparityMapEstimationMethod, along with the
otb::NearestPointDeformationFieldGenerator. The first filter performs deformation estima-
tion according to a given transform, using the embedded ITK registration framework. It takes as
input a possibly non-regular point set and produces a point set with associated point data representing
the deformation.
The second filter generates a deformation field by using nearest neighbor interpolation on the de-
formation values from the point set. More advanced methods for deformation field interpolation are
also available.
The first step toward the use of these filters is to include the proper header files.
#include "otbDisparityMapEstimationMethod.h"
#include "itkTranslationTransform.h"
#include "itkNormalizedCorrelationImageToImageMetric.h"
#include "itkWindowedSincInterpolateImageFunction.h"
#include "itkZeroFluxNeumannBoundaryCondition.h"
#include "itkGradientDescentOptimizer.h"
#include "otbBSplinesInterpolateDeformationFieldGenerator.h"
#include "otbWarpImageFilter.h"
Then we must decide what pixel type to use for the image. We choose to do all the computation in
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef double PixelType ;
typedef unsigned char OutputPixelType;
The images are defined using the pixel type and the dimension. Please note that the
otb::NearestPointDeformationFieldGenerator generates an otb::VectorImage to represent
the deformation field in both image directions.
typedef otb::Image <PixelType , Dimension > ImageType ;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
The next step is to define the transform we have chosen to model the deformation. In this example
the deformation is modeled as an itk::TranslationTransform.
typedef itk::TranslationTransform <double, Dimension > TransformType;
typedef TransformType::ParametersType ParametersType;
Then we define the metric we will use to evaluate the local registration be-
tween the fixed and the moving image. In this example we chose the
itk::NormalizedCorrelationImageToImageMetric.
typedef itk::NormalizedCorrelationImageToImageMetric<ImageType ,
ImageType > MetricType ;
Disparity map estimation implies evaluating the moving image at non-grid po-
sitions. Therefore, an interpolator is needed. In this example we chose the
itk::WindowedSincInterpolateImageFunction.
typedef itk::Function ::HammingWindowFunction <3> WindowFunctionType;
typedef itk::ZeroFluxNeumannBoundaryCondition<ImageType > ConditionType;
typedef itk::WindowedSincInterpolateImageFunction<ImageType , 3,
WindowFunctionType ,
ConditionType ,
double> InterpolatorType;
To perform local registration, an optimizer is needed. In this example we chose the
itk::GradientDescentOptimizer.
typedef itk:: GradientDescentOptimizer OptimizerType;
Now we define the point set representing the points where the local disparity will be computed.
typedef itk::PointSet <ParametersType , Dimension > PointSetType;
Now we define the disparity map estimation filter.
typedef otb::DisparityMapEstimationMethod<ImageType ,
ImageType ,
PointSetType > DMEstimationType;
typedef DMEstimationType::SizeType SizeType;
The input image reader also has to be defined.
typedef otb::ImageFileReader <ImageType > ReaderType ;
Two readers are instantiated: one for the fixed image and one for the moving image.
ReaderType ::Pointer fixedReader = ReaderType ::New ();
ReaderType ::Pointer movingReader = ReaderType ::New ();
fixedReader ->SetFileName (argv[1]);
movingReader ->SetFileName (argv[2]);
fixedReader ->UpdateOutputInformation();
movingReader ->UpdateOutputInformation();
We then create a regular point set of nodes on which the local disparity will be computed.
SizeType fixedSize =
fixedReader ->GetOutput ()->GetLargestPossibleRegion(). GetSize ();
unsigned int NumberOfXNodes = (fixedSize [0] - 2 * atoi(argv[7]) - 1)
/ atoi(argv[5]);
unsigned int NumberOfYNodes = (fixedSize [1] - 2 * atoi(argv[7]) - 1)
/ atoi(argv[6]);
ImageType ::IndexType firstNodeIndex;
firstNodeIndex[0] = atoi(argv[7]);
firstNodeIndex[1] = atoi(argv[7]);
PointSetType::Pointer nodes = PointSetType::New();
unsigned int nodeCounter = 0;
for (unsigned int x = 0; x < NumberOfXNodes; x++)
{
for (unsigned int y = 0; y < NumberOfYNodes; y++)
{
PointType p;
p[0] = firstNodeIndex[0] + x*atoi(argv[5]);
p[1] = firstNodeIndex[1] + y*atoi(argv[6]);
nodes ->SetPoint (nodeCounter ++, p);
}
}
We build the transform, interpolator, metric and optimizer for the disparity map estimation filter.
TransformType::Pointer transform = TransformType::New();
OptimizerType::Pointer optimizer = OptimizerType::New();
optimizer ->MinimizeOn ();
optimizer ->SetLearningRate(atof(argv[9]));
optimizer ->SetNumberOfIterations(atoi(argv[10]));
InterpolatorType::Pointer interpolator = InterpolatorType::New ();
MetricType ::Pointer metric = MetricType ::New ();
metric ->SetSubtractMean(true);
We then set up the disparity map estimation filter. This filter will perform a local registration at each
point of the given point set using the ITK registration framework. It will produce a point set whose
point data reflects the disparity locally around the associated point.
The point data will contain the following:
1. The final metric value found in the registration process,
2. the deformation value in the first image direction,
3. the deformation value in the second image direction,
4. the final parameters of the transform.
Please note that in the case of a itk::TranslationTransform, the deformation values and the
transform parameters are the same.
DMEstimationType::Pointer dmestimator = DMEstimationType::New();
dmestimator ->SetTransform(transform );
dmestimator ->SetOptimizer(optimizer );
dmestimator ->SetInterpolator(interpolator);
dmestimator ->SetMetric (metric);
SizeType windowSize , explorationSize;
explorationSize.Fill(atoi(argv[7]));
windowSize .Fill(atoi(argv[8]));
dmestimator ->SetWinSize (windowSize );
dmestimator ->SetExploSize(explorationSize);
The initial transform parameters can be set via the SetInitialTransformParameters() method.
In our case, we simply fill the parameter array with null values.
DMEstimationType:: ParametersType
initialParameters(transform ->GetNumberOfParameters());
initialParameters[0] = 0.0;
initialParameters[1] = 0.0;
dmestimator ->SetInitialTransformParameters(initialParameters);
Now we can set the inputs of the disparity map estimation filter. The fixed image is set using
the SetFixedImage() method, the moving image using SetMovingImage(), and the input point
set using SetPointSet().
dmestimator ->SetFixedImage(fixedReader ->GetOutput ());
dmestimator ->SetMovingImage(movingReader ->GetOutput ());
dmestimator ->SetPointSet (nodes);
Once the estimation has been performed by the otb::DisparityMapEstimationMethod, one can
generate the associated deformation field (that is, the translation in the first and second image
directions). It will be represented as an otb::VectorImage.
typedef otb::VectorImage <PixelType , Dimension > DeformationFieldType;
For the deformation field estimation, we will use the otb::BSplinesInterpolateDeformationFieldGenerator.
This filter interpolates the deformation values given in the point set data over the whole output
field using B-splines.
typedef otb:: BSplinesInterpolateDeformationFieldGenerator<PointSetType ,
DeformationFieldType >
GeneratorType;
The deformation field generator is instantiated.
GeneratorType::Pointer generator = GeneratorType::New();
We must then specify the input point set using the SetPointSet() method.
generator ->SetPointSet (dmestimator ->GetOutput ());
One must also specify the origin, size and spacing of the output deformation field.
generator ->SetOutputOrigin(fixedReader ->GetOutput ()->GetOrigin ());
generator ->SetOutputSpacing(fixedReader ->GetOutput ()->GetSpacing ());
generator ->SetOutputSize(fixedReader ->GetOutput ()
->GetLargestPossibleRegion(). GetSize ());
The local registration process can lead to wrong deformation values and transform parameters. To
select only the points of the point set for which the registration process was successful, one can set
a threshold on the final metric value: points for which the absolute final metric value is below this
threshold will be discarded. This threshold can be set with the SetMetricThreshold() method.
generator ->SetMetricThreshold(atof(argv[11]));
The following classes provide similar functionality:
• otb::NNearestPointsLinearInterpolateDeformationFieldGenerator
• otb::BSplinesInterpolateDeformationFieldGenerator
• otb::NearestTransformDeformationFieldGenerator
• otb::NNearestTransformsLinearInterpolateDeformationFieldGenerator
• otb::BSplinesInterpolateTransformDeformationFieldGenerator
Now we can warp the moving image according to the estimated deformation field. This will be per-
formed by the otb::WarpImageFilter. First, we define this filter.
typedef otb::WarpImageFilter <ImageType , ImageType ,
DeformationFieldType > ImageWarperType;
Then we instantiate it.
ImageWarperType::Pointer warper = ImageWarperType::New();
We set the input image to warp using the SetInput() method, and the deformation field using the
SetDeformationField() method.
warper ->SetInput(movingReader ->GetOutput ());
warper ->SetDeformationField(generator ->GetOutput ());
warper ->SetOutputOrigin(fixedReader ->GetOutput ()->GetOrigin ());
warper ->SetOutputSpacing(fixedReader ->GetOutput ()->GetSpacing ());
In order to write the result to a PNG file, we will rescale it to a proper range.
typedef itk::RescaleIntensityImageFilter<ImageType ,
OutputImageType > RescalerType;
RescalerType::Pointer outputRescaler = RescalerType::New ();
outputRescaler ->SetInput(warper ->GetOutput ());
outputRescaler ->SetOutputMaximum(255);
outputRescaler ->SetOutputMinimum(0);
We can now write the image to a file. The filters are executed by invoking the Update() method.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
WriterType ::Pointer outputWriter = WriterType ::New ();
outputWriter ->SetInput(outputRescaler ->GetOutput ());
outputWriter ->SetFileName (argv[4]);
outputWriter ->Update();
We also want to write the deformation field along the first direction to a file. To achieve this we will
use the otb::MultiToMonoChannelExtractROI filter.
typedef otb::MultiToMonoChannelExtractROI<PixelType ,
PixelType >
ChannelExtractionFilterType;
ChannelExtractionFilterType::Pointer channelExtractor
= ChannelExtractionFilterType::New ();
channelExtractor ->SetInput (generator ->GetOutput ());
channelExtractor ->SetChannel (1);
RescalerType::Pointer fieldRescaler = RescalerType::New ();
fieldRescaler ->SetInput(channelExtractor ->GetOutput ());
fieldRescaler ->SetOutputMaximum(255);
fieldRescaler ->SetOutputMinimum(0);
WriterType ::Pointer fieldWriter = WriterType ::New ();
fieldWriter ->SetInput (fieldRescaler ->GetOutput ());
fieldWriter ->SetFileName (argv[3]);
fieldWriter ->Update();
Figure 10.3 shows the result of applying disparity map estimation on a stereo pair using a regular
point set, followed by deformation field estimation using B-splines and image resampling.
10.4 Stereo reconstruction
The source code for this example can be found in the file
Examples/DisparityMap/StereoReconstructionExample.cxx.
Figure 10.3: From left to right and top to bottom: fixed input image, moving image with a sinusoid deformation,
estimated deformation field in the horizontal direction, resampled moving image.
This example demonstrates the use of the stereo reconstruction chain from an image pair. The images
are assumed to come from the same sensor but with different positions. The approach presented here
has the following steps:
• Epipolar resampling of the image pair
• Dense disparity map estimation
• Projection of the disparities on an existing Digital Elevation Model (DEM)
It is important to note that this method requires the sensor models with a pose estimate for each
image.
#include "otbStereorectificationDeformationFieldSource.h"
#include "otbStreamingWarpImageFilter.h"
#include "otbPixelWiseBlockMatchingImageFilter.h"
#include "otbBandMathImageFilter.h"
#include "otbSubPixelDisparityImageFilter.h"
#include "otbDisparityMapMedianFilter.h"
#include "otbDisparityMapToDEMFilter.h"
#include "otbImage.h"
#include "otbVectorImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbBCOInterpolateImageFunction.h"
#include "itkVectorCastImageFilter.h"
#include "otbImageList.h"
#include "otbImageListToVectorImageFilter.h"
#include "itkRescaleIntensityImageFilter.h"
#include "otbDEMHandler.h"
This example demonstrates the use of the following filters :
• otb::StereorectificationDeformationFieldSource
• otb::StreamingWarpImageFilter
• otb::PixelWiseBlockMatchingImageFilter
• otb::SubPixelDisparityImageFilter
• otb::DisparityMapMedianFilter
• otb::DisparityMapToDEMFilter
typedef otb:: StereorectificationDeformationFieldSource
<FloatImageType ,FloatVectorImageType > DeformationFieldSourceType;
typedef itk::Vector <double,2> DeformationType;
typedef otb::Image <DeformationType > DeformationFieldType;
typedef itk:: VectorCastImageFilter
<FloatVectorImageType ,
DeformationFieldType > DeformationFieldCastFilterType;
typedef otb:: StreamingWarpImageFilter
<FloatImageType ,
FloatImageType ,
DeformationFieldType > WarpFilterType;
typedef otb:: BCOInterpolateImageFunction
<FloatImageType > BCOInterpolationType;
typedef otb::Functor ::NCCBlockMatching
<FloatImageType ,FloatImageType > NCCBlockMatchingFunctorType;
typedef otb:: PixelWiseBlockMatchingImageFilter
<FloatImageType ,
FloatImageType ,
FloatImageType ,
FloatImageType ,
NCCBlockMatchingFunctorType> NCCBlockMatchingFilterType;
typedef otb::BandMathImageFilter
<FloatImageType > BandMathFilterType;
typedef otb:: SubPixelDisparityImageFilter
<FloatImageType ,
FloatImageType ,
FloatImageType ,
FloatImageType ,
NCCBlockMatchingFunctorType> NCCSubPixelDisparityFilterType;
typedef otb:: DisparityMapMedianFilter
<FloatImageType ,
FloatImageType ,
FloatImageType > MedianFilterType;
typedef otb:: DisparityMapToDEMFilter
<FloatImageType ,
FloatImageType ,
FloatImageType ,
FloatVectorImageType ,
FloatImageType > DisparityToElevationFilterType;
The image pair is supposed to be in sensor geometry. From two images covering nearly
the same area, one can estimate a common epipolar geometry. In this geometry, an al-
titude variation corresponds to a horizontal shift between the two images. The filter
otb::StereorectificationDeformationFieldSource computes the deformation grids for each
image.
These grids are sampled in epipolar geometry. They have two bands, containing the position offset
(in physical space units) between the current epipolar point and the corresponding sensor point in
horizontal and vertical direction. They can be computed at a lower resolution than sensor resolution.
The application StereoRectificationGridGenerator also provides a simple tool to generate the
epipolar grids for your image pair.
DeformationFieldSourceType::Pointer m_DeformationFieldSource = DeformationFieldSourceType::New();
m_DeformationFieldSource->SetLeftImage(leftReader ->GetOutput ());
m_DeformationFieldSource->SetRightImage(rightReader ->GetOutput ());
m_DeformationFieldSource->SetGridStep (4);
m_DeformationFieldSource->SetScale (1.0);
//m_DeformationFieldSource->SetAverageElevation(avgElevation);
m_DeformationFieldSource->Update();
Then, the sensor images can be resampled in epipolar geometry, using the
otb::StreamingWarpImageFilter. The application GridBasedImageResampling also gives
an easy access to this filter. The user can choose the epipolar region to resample, as well as the
resampling step and the interpolator.
Note that the epipolar image size can be retrieved from the stereo rectification grid filter.
FloatImageType::SpacingType epipolarSpacing;
epipolarSpacing[0] = 1.0;
epipolarSpacing[1] = 1.0;
FloatImageType::SizeType epipolarSize;
epipolarSize = m_DeformationFieldSource->GetRectifiedImageSize();
FloatImageType::PointType epipolarOrigin;
epipolarOrigin[0] = 0.0;
epipolarOrigin[1] = 0.0;
FloatImageType::PixelType defaultValue = 0;
The deformation grids are cast into deformation fields, then the left and right sensor images are
resampled.
DeformationFieldCastFilterType::Pointer m_LeftDeformationFieldCaster = DeformationFieldCastFilterType::New();
m_LeftDeformationFieldCaster->SetInput(m_DeformationFieldSource->GetLeftDeformationFieldOutput());
m_LeftDeformationFieldCaster->GetOutput ()->UpdateOutputInformation();
BCOInterpolationType::Pointer leftInterpolator = BCOInterpolationType::New();
leftInterpolator ->SetRadius (2);
WarpFilterType::Pointer m_LeftWarpImageFilter = WarpFilterType::New();
m_LeftWarpImageFilter ->SetInput(leftReader ->GetOutput ());
m_LeftWarpImageFilter ->SetDeformationField(m_LeftDeformationFieldCaster->GetOutput ());
m_LeftWarpImageFilter ->SetInterpolator(leftInterpolator);
m_LeftWarpImageFilter ->SetOutputSize(epipolarSize);
m_LeftWarpImageFilter ->SetOutputSpacing(epipolarSpacing);
m_LeftWarpImageFilter ->SetOutputOrigin(epipolarOrigin);
m_LeftWarpImageFilter ->SetEdgePaddingValue(defaultValue);
DeformationFieldCastFilterType::Pointer m_RightDeformationFieldCaster = DeformationFieldCastFilterType::New();
m_RightDeformationFieldCaster->SetInput(m_DeformationFieldSource->GetRightDeformationFieldOutput());
m_RightDeformationFieldCaster->GetOutput ()->UpdateOutputInformation();
BCOInterpolationType::Pointer rightInterpolator = BCOInterpolationType::New ();
rightInterpolator ->SetRadius (2);
WarpFilterType::Pointer m_RightWarpImageFilter = WarpFilterType::New();
m_RightWarpImageFilter ->SetInput (rightReader ->GetOutput ());
m_RightWarpImageFilter ->SetDeformationField(m_RightDeformationFieldCaster->GetOutput ());
m_RightWarpImageFilter ->SetInterpolator(rightInterpolator);
m_RightWarpImageFilter ->SetOutputSize(epipolarSize);
m_RightWarpImageFilter ->SetOutputSpacing(epipolarSpacing);
m_RightWarpImageFilter ->SetOutputOrigin(epipolarOrigin);
m_RightWarpImageFilter ->SetEdgePaddingValue(defaultValue);
Since the resampling produces black regions around the image, it is useless to estimate disparities
on these no-data regions. We use an otb::BandMathImageFilter to produce a mask on the left and
right epipolar images.
BandMathFilterType::Pointer m_LBandMathFilter = BandMathFilterType::New ();
m_LBandMathFilter ->SetNthInput (0,m_LeftWarpImageFilter ->GetOutput (),"inleft");
std::string leftExpr = "if(inleft != 0,255,0)";
m_LBandMathFilter ->SetExpression(leftExpr );
BandMathFilterType::Pointer m_RBandMathFilter = BandMathFilterType::New ();
m_RBandMathFilter ->SetNthInput (0,m_RightWarpImageFilter ->GetOutput (),"inright");
std::string rightExpr = "if(inright != 0,255,0)";
m_RBandMathFilter ->SetExpression(rightExpr );
Once the two sensor images have been resampled in epipolar geometry, the disparity map can be
computed. The approach presented here is a 2D matching based on a pixel-wise metric optimiza-
tion. This approach doesn’t give the best results compared to global optimization methods, but it is
suitable for streaming and threading on large images.
The major filter used for this step is otb::PixelWiseBlockMatchingImageFilter. The metric
is computed on a window centered around the tested epipolar position. It performs a pixel-to-pixel
matching between the two epipolar images. The output disparities are given as index offset from left
to right position. The following features are available in this filter:
• Available metrics: SSD, NCC and Lp pseudo-norm (computed on a square window)
• Rectangular disparity exploration area.
• Input masks for left and right images (optional).
• Output metric values (optional).
• Possibility to use input disparity estimate (as a uniform value or a full map) and an exploration
radius around these values to reduce the size of the exploration area (optional).
NCCBlockMatchingFilterType::Pointer m_NCCBlockMatcher = NCCBlockMatchingFilterType::New();
m_NCCBlockMatcher ->SetLeftInput(m_LeftWarpImageFilter ->GetOutput ());
m_NCCBlockMatcher ->SetRightInput(m_RightWarpImageFilter ->GetOutput ());
m_NCCBlockMatcher ->SetRadius (3);
m_NCCBlockMatcher ->SetMinimumHorizontalDisparity(-24);
m_NCCBlockMatcher ->SetMaximumHorizontalDisparity(0);
m_NCCBlockMatcher ->SetMinimumVerticalDisparity(0);
m_NCCBlockMatcher ->SetMaximumVerticalDisparity(0);
m_NCCBlockMatcher ->MinimizeOff ();
m_NCCBlockMatcher ->SetLeftMaskInput(m_LBandMathFilter ->GetOutput ());
m_NCCBlockMatcher ->SetRightMaskInput(m_RBandMathFilter ->GetOutput ());
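Switching the matching metric is simply a matter of selecting another functor among those provided
with the block matching filter. Below is a minimal sketch assuming an SSD functor named
otb::Functor::SSDBlockMatching in the same header (check the exact functor names available in
your OTB version); contrary to NCC, SSD is a dissimilarity measure and has to be minimized:
typedef otb::Functor::SSDBlockMatching
  <FloatImageType, FloatImageType> SSDBlockMatchingFunctorType;
typedef otb::PixelWiseBlockMatchingImageFilter
  <FloatImageType, FloatImageType, FloatImageType, FloatImageType,
   SSDBlockMatchingFunctorType> SSDBlockMatchingFilterType;
SSDBlockMatchingFilterType::Pointer ssdBlockMatcher = SSDBlockMatchingFilterType::New();
ssdBlockMatcher->SetLeftInput(m_LeftWarpImageFilter->GetOutput());
ssdBlockMatcher->SetRightInput(m_RightWarpImageFilter->GetOutput());
ssdBlockMatcher->SetRadius(3);
ssdBlockMatcher->MinimizeOn(); // SSD must be minimized, unlike NCC in this example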
Some other filters have been added to enhance these pixel-to-pixel disparities. The filter
otb::SubPixelDisparityImageFilter can estimate the disparities with sub-pixel precision. Sev-
eral interpolation methods can be used : parabolic fit, triangular fit, and dichotomy search.
NCCSubPixelDisparityFilterType::Pointer m_NCCSubPixFilter = NCCSubPixelDisparityFilterType::New ();
m_NCCSubPixFilter ->SetInputsFromBlockMatchingFilter(m_NCCBlockMatcher);
m_NCCSubPixFilter ->SetRefineMethod(NCCSubPixelDisparityFilterType::DICHOTOMY );
The filter otb::DisparityMapMedianFilter can be used to remove outliers. It has two parame-
ters:
• The radius of the local neighborhood to compute the median
• An incoherence threshold to reject disparities whose distance from the local median exceeds the
threshold.
MedianFilterType::Pointer m_HMedianFilter = MedianFilterType::New();
m_HMedianFilter ->SetInput (m_NCCSubPixFilter ->GetHorizontalDisparityOutput());
m_HMedianFilter ->SetRadius (2);
m_HMedianFilter ->SetIncoherenceThreshold(2.0);
m_HMedianFilter ->SetMaskInput(m_LBandMathFilter ->GetOutput ());
MedianFilterType::Pointer m_VMedianFilter = MedianFilterType::New();
m_VMedianFilter ->SetInput (m_NCCSubPixFilter ->GetVerticalDisparityOutput());
m_VMedianFilter ->SetRadius (2);
m_VMedianFilter ->SetIncoherenceThreshold(2.0);
m_VMedianFilter ->SetMaskInput(m_LBandMathFilter ->GetOutput ());
The application PixelWiseBlockMatching contains all these filters and provides a single interface
to compute your disparity maps.
The disparity map obtained with the previous step usually gives a good idea of the altitude profile.
However, it is more useful to study altitude with a DEM (Digital Elevation Model) representation.
The filter otb::DisparityMapToDEMFilter performs this last step. The behavior of this filter is
to :
• Compute the DEM extent from the left sensor image envelope (spacing is set by the user)
• Compute the left and right rays corresponding to each valid disparity
• Compute the intersection with the mid-point method
• If the 3D point falls inside a DEM cell and has a greater elevation than the current height, the
cell height is updated
The rule of keeping the highest elevation makes sense for buildings seen from the side, since the
elevation of the roof edges has to be preserved. However, this rule is not suited to noisy disparities.
The application DisparityMapToElevationMap also gives an example of use.
DisparityToElevationFilterType::Pointer m_DispToElev = DisparityToElevationFilterType::New();
m_DispToElev ->SetHorizontalDisparityMapInput(m_HMedianFilter ->GetOutput ());
m_DispToElev ->SetVerticalDisparityMapInput(m_VMedianFilter ->GetOutput ());
m_DispToElev ->SetLeftInput(leftReader ->GetOutput ());
m_DispToElev ->SetRightInput(rightReader ->GetOutput ());
m_DispToElev ->SetLeftEpipolarGridInput(m_DeformationFieldSource->GetLeftDeformationFieldOutput());
m_DispToElev ->SetRightEpipolarGridInput(m_DeformationFieldSource->GetRightDeformationFieldOutput());
m_DispToElev ->SetElevationMin(avgElevation -10.0);
m_DispToElev ->SetElevationMax(avgElevation+80.0);
m_DispToElev ->SetDEMGridStep(2.5);
m_DispToElev ->SetDisparityMaskInput(m_LBandMathFilter ->GetOutput ());
//m_DispToElev->SetAverageElevation(avgElevation);
WriterType ::Pointer m_DEMWriter = WriterType ::New ();
m_DEMWriter ->SetInput (m_DispToElev ->GetOutput ());
m_DEMWriter ->SetFileName (argv[3]);
m_DEMWriter ->Update();
RescalerType::Pointer fieldRescaler = RescalerType::New ();
fieldRescaler ->SetInput(m_DispToElev ->GetOutput ());
fieldRescaler ->SetOutputMaximum(255);
fieldRescaler ->SetOutputMinimum(0);
OutputWriterType::Pointer fieldWriter = OutputWriterType::New();
fieldWriter ->SetInput (fieldRescaler ->GetOutput ());
fieldWriter ->SetFileName (argv[4]);
fieldWriter ->Update();
Figure 10.4: DEM image estimated from the disparity.
Figure 10.4 shows the result of applying terrain reconstruction based on pixel-wise block matching,
sub-pixel interpolation and DEM estimation to a pair of Pleiades images over the stadium in
Toulouse, France.
CHAPTER ELEVEN
ORTHORECTIFICATION AND MAP PROJECTION
Figure 11.1: Image Ortho-registration Procedure (diagram elements: input series, sensor model, DEM, homologous points, bundle-block adjustment, geographic geometry, map projections, cartographic geometry).
This chapter introduces the functionalities available in OTB for image ortho-registration. We
define ortho-registration as the procedure that transforms an image in sensor geometry into a
geographic or cartographic projection.
Figure 11.1 shows a synoptic view of the different steps involved in a classical ortho-registration
processing chain able to deal with image series. These steps are the following:
• Sensor modelling: the geometric sensor model converts image coordinates (line, column) into
geographic coordinates (latitude, longitude); rigorous modelling needs a digital elevation model
(DEM) in order to take the terrain topography into account.
• Bundle-block adjustment: in the case of image series, the geometric models and their param-
eters can be refined by using homologous points between the images. This is an optional step
and not currently implemented in OTB.
• Map projection: this step converts geographic coordinates into a specific cartographic projection
such as Lambert, Mercator or UTM.
11.1 Sensor Models
A sensor model is a set of equations giving the relationship between image pixel (l,c) coordinates
and ground (X,Y) coordinates for every pixel in the image. Typically, the ground coordinates are
given in a geographic projection (latitude, longitude). The sensor model can be expressed either
from image to ground – forward model – or from ground to image – inverse model. This can be
written as follows:
Forward: X = f_x(l, c, h, θ)    Y = f_y(l, c, h, θ)
Inverse: l = g_l(X, Y, h, θ)    c = g_c(X, Y, h, θ)
where θ is the set of parameters which describe the sensor and the acquisition geometry (platform
altitude, viewing angle, focal length for optical sensors, Doppler centroid for SAR images, etc.).
In OTB, sensor models are implemented as itk::Transforms (see section 9.6 for details),
which is the appropriate way to express coordinate changes. The base class for sensor mod-
els is otb::SensorModelBase from which the classes otb::InverseSensorModel and
otb::ForwardSensorModel inherit.
As one may note from the model equations, the height of the ground, h, must be known. Usually, it
means that a Digital Elevation Model, DEM, will be used.
11.1.1 Types of Sensor Models
There exist two main types of sensor models. On one hand, we have the so-called physical models,
which are rigorous, complex, possibly highly non-linear equations of the sensor geometry. As
such, they are difficult to invert (obtaining the inverse model from the forward one and vice-versa).
They have the significant advantage of having parameters with physical meaning (angles, distances,
etc.). They are specific to each sensor, which means that a library of models is required in the
software, a library which has to be updated every time a new sensor becomes available.
On the other hand, we have general analytical models, which approximate the physical models.
These models can take the form of polynomials or ratios of polynomials, the so-called rational
polynomial functions or Rational Polynomial Coefficients, RPC, also known as Rapid Positioning
Capability. Since they are approximations, they are less accurate than the physical models.
However, the achieved accuracy is usually high: in the case of Pléiades, RPC models have errors
lower than 0.02 pixels with respect to the physical model. Since these models have a standard
form they are easier to use and implement. However, they have the drawback of having parameters
(coefficients, actually) without physical meaning.
OTB, through the use of the OSSIM library – http://www.ossim.org – offers models for most
current sensors, either through a physical or an analytical approach. This is transparent for the
user, since the geometrical model for a given image is instantiated using the information stored in
its meta-data.
11.1.2 Using Sensor Models
The transformation of an image in sensor geometry to geographic geometry can be done using the
following steps.
1. Read image meta-data and instantiate the model with the given parameters.
2. Define the ROI in ground coordinates (this is your output pixel array)
3. Iterate through the pixels of coordinates (X,Y):
(a) Get h from the DEM
(b) Compute (c,l) = G(X,Y,h,θ)
(c) Interpolate pixel values if (c,l) are not grid coordinates.
Actually, in OTB, you don’t have to manually instantiate the sensor model which is appropriate to
your image. That is, you don’t have to manually choose a SPOT5 or a Quickbird sensor model.
This task is automatically performed by the otb::ImageFileReader class in a similar way as the
image format recognition is done. The appropriate sensor model will then be included in the image
meta-data, so you can access it when needed.
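As a short sketch of this access (using the reader declared in the example below), the sensor model parameters travel with the image as a keyword list and can be retrieved once the header has been read:
reader->GenerateOutputInformation(); // read the header only, not the pixel data
otb::ImageKeywordlist kwl = reader->GetOutput()->GetImageKeywordlist();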
The source code for this example can be found in the file
Examples/Projections/SensorModelExample.cxx.
This example illustrates how to use the sensor model read from image meta-data in order to perform
ortho-rectification. This is a very basic, step-by-step example, so you understand the different com-
ponents involved in the process. When performing real ortho-rectifications, you can use the example
presented in section 11.3.
We will start by including the header file for the inverse sensor model.
#include "otbInverseSensorModel.h"
As explained before, the first thing to do is to create the sensor model in order to transform the ground
coordinates into sensor geometry coordinates. The geometric model will automatically be created by
the image file reader. So we begin by declaring the types for the input image and the image reader.
typedef otb::Image <unsigned int, 2> ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (argv[1]);
ImageType ::Pointer inputImage = reader ->GetOutput ();
We have just instantiated the reader and set the file name, but the image data and meta-data have not
yet been accessed by it. Since we need the creation of the sensor model and all the image information
(size, spacing, etc.), but we do not want to read the pixel data – it could be huge – we just ask the
reader to generate the output information needed.
reader ->GenerateOutputInformation();
std::cout << "Original input imagine spacing: " <<
reader ->GetOutput ()->GetSpacing () << std::endl;
We can now instantiate the sensor model – an inverse one, since we want to convert ground coordi-
nates to sensor geometry. Note that the choice of the specific model (SPOT5, Ikonos, etc.) is done
by the reader and we only need to instantiate a generic model.
typedef otb::InverseSensorModel <double> ModelType ;
ModelType ::Pointer model = ModelType ::New ();
The model is parameterized by passing to it the keyword list containing the needed information.
model ->SetImageGeometry(reader ->GetOutput ()->GetImageKeywordlist());
Since we can not be sure that the image we read actually has sensor model information, we must
check for the model validity.
if (model ->IsValidSensorModel() == false)
{
std::cerr << "Unable to create a model" << std::endl;
return 1;
}
The types for the input and output coordinate points can be now declared. The same is done for the
index types.
ModelType :: OutputPointType inputPoint ;
typedef itk::Point <double, 2> PointType ;
PointType outputPoint ;
ImageType ::IndexType currentIndex;
ImageType ::IndexType currentIndexBis;
ImageType ::IndexType pixelIndexBis;
We will now create the output image over which we will iterate in order to transform ground coor-
dinates to sensor coordinates and get the corresponding pixel values.
ImageType ::Pointer outputImage = ImageType ::New();
ImageType ::PixelType pixelValue ;
ImageType ::IndexType start;
start[0] = 0;
start[1] = 0;
ImageType ::SizeType size;
size[0] = atoi(argv[5]);
size[1] = atoi(argv[6]);
The spacing in y direction is negative since origin is the upper left corner.
ImageType :: SpacingType spacing;
spacing [0] = 0.00001;
spacing [1] = -0.00001;
ImageType ::PointType origin;
origin[0] = strtod(argv[3], NULL); //longitude
origin[1] = strtod(argv[4], NULL); //latitude
ImageType :: RegionType region;
region.SetSize(size);
region.SetIndex(start);
outputImage ->SetOrigin (origin);
outputImage ->SetRegions (region);
outputImage ->SetSpacing (spacing );
outputImage ->Allocate ();
We will now instantiate an extractor filter in order to get input regions by manual tiling.
typedef itk::ExtractImageFilter <ImageType , ImageType > ExtractType ;
ExtractType ::Pointer extract = ExtractType ::New();
Since the transformed coordinates in sensor geometry may not be integer ones, we will need an
interpolator to retrieve the pixel values (note that this assumes that the input image was correctly
sampled by the acquisition system).
typedef itk::LinearInterpolateImageFunction<ImageType , double>
InterpolatorType;
InterpolatorType::Pointer interpolator = InterpolatorType::New ();
We proceed now to create the image writer. We will also use a writer plugged to the output of the
extractor filter which will write the temporary extracted regions. This is just for monitoring the
process.
typedef otb::Image <unsigned char, 2> CharImageType;
typedef otb::ImageFileWriter <CharImageType > CharWriterType;
typedef otb::ImageFileWriter <ImageType > WriterType ;
WriterType ::Pointer extractorWriter = WriterType ::New ();
CharWriterType::Pointer writer = CharWriterType::New();
extractorWriter ->SetFileName ("image_temp.jpeg");
extractorWriter ->SetInput (extract ->GetOutput ());
Since the output pixel type and the input pixel type are different, we will need to rescale the intensity
values before writing them to a file.
typedef itk::RescaleIntensityImageFilter
<ImageType , CharImageType > RescalerType;
RescalerType::Pointer rescaler = RescalerType::New ();
rescaler ->SetOutputMinimum(10);
rescaler ->SetOutputMaximum(255);
The tricky part starts here. Note that this example is only intended for educational purposes
and that you do not need to proceed like this. See the example in section 11.3 in order to code
ortho-rectification chains in a very simple way.
You want to go on? OK. You have been warned.
We will start by declaring an image region iterator and some convenience variables.
typedef itk::ImageRegionIteratorWithIndex<ImageType > IteratorType;
unsigned int NumberOfStreamDivisions;
if (atoi(argv[7]) == 0)
{
NumberOfStreamDivisions = 10;
}
else
{
NumberOfStreamDivisions = atoi(argv[7]);
}
unsigned int count = 0;
unsigned int It, j, k;
int max_x , max_y , min_x , min_y;
ImageType ::IndexType iterationRegionStart;
ImageType ::SizeType iteratorRegionSize;
ImageType :: RegionType iteratorRegion;
The loop starts here.
for (count = 0; count < NumberOfStreamDivisions; count++)
{
iteratorRegionSize[0] = atoi(argv[5]);
if (count == NumberOfStreamDivisions - 1)
{
iteratorRegionSize[1] = (atoi(argv[6])) - ((int) ((( atoi(argv[6])) /
NumberOfStreamDivisions)
+ 0.5)) * (count);
iterationRegionStart[1] = (atoi(argv[5])) - (iteratorRegionSize[1]);
}
else
{
iteratorRegionSize[1] = (int) ((( atoi(argv[6])) /
NumberOfStreamDivisions) + 0.5);
iterationRegionStart[1] = count * iteratorRegionSize[1];
}
iterationRegionStart[0] = 0;
iteratorRegion.SetSize(iteratorRegionSize);
iteratorRegion.SetIndex(iterationRegionStart);
We create an array for storing the pixel indexes.
unsigned int pixelIndexArrayDimension = iteratorRegionSize[0] *
iteratorRegionSize[1] * 2;
int *pixelIndexArray = new int[pixelIndexArrayDimension];
int *currentIndexArray = new int[pixelIndexArrayDimension];
We create an iterator for each piece of the image, and we iterate over them.
IteratorType outputIt(outputImage , iteratorRegion);
It = 0;
for (outputIt.GoToBegin (); !outputIt.IsAtEnd (); ++outputIt )
{
We get the current index.
currentIndex = outputIt .GetIndex ();
We transform the index to physical coordinates.
outputImage ->TransformIndexToPhysicalPoint(currentIndex , outputPoint );
We use the sensor model to get the pixel coordinates in the input image and we transform these
coordinates to an index. Then we store the index in the array. Note that the TransformPoint()
method of the model has been overloaded so that it can be used with a 3D point when the height of
the ground point is known (DEM availability).
inputPoint = model ->TransformPoint(outputPoint );
pixelIndexArray[It] = static_cast<int>(inputPoint [0]);
pixelIndexArray[It + 1] = static_cast<int>(inputPoint [1]);
currentIndexArray[It] = static_cast<int>(currentIndex[0]);
currentIndexArray[It + 1] = static_cast<int>(currentIndex[1]);
It = It + 2;
}
At this point, we have stored all the indexes we need for the piece of image we are processing. We
can now compute the bounds of the area in the input image we need to extract.
max_x = pixelIndexArray[0];
min_x = pixelIndexArray[0];
max_y = pixelIndexArray[1];
min_y = pixelIndexArray[1];
for (j = 0; j < It; ++j)
{
if (j % 2 == 0 && pixelIndexArray[j] > max_x)
{
max_x = pixelIndexArray[j];
}
if (j % 2 == 0 && pixelIndexArray[j] < min_x)
{
min_x = pixelIndexArray[j];
}
if (j % 2 != 0 && pixelIndexArray[j] > max_y)
{
max_y = pixelIndexArray[j];
}
if (j % 2 != 0 && pixelIndexArray[j] < min_y)
{
min_y = pixelIndexArray[j];
}
}
We can now set the parameters for the extractor using a little bit of margin in order to cope with
irregular geometric distortions which could be due to topography, for instance.
ImageType ::RegionType extractRegion;
ImageType ::IndexType extractStart;
if (min_x < 10 && min_y < 10)
{
extractStart[0] = 0;
extractStart[1] = 0;
}
else
{
extractStart[0] = min_x - 10;
extractStart[1] = min_y - 10;
}
ImageType ::SizeType extractSize ;
extractSize [0] = (max_x - min_x) + 20;
extractSize [1] = (max_y - min_y) + 20;
extractRegion.SetSize(extractSize );
extractRegion.SetIndex(extractStart);
extract->SetExtractionRegion(extractRegion);
extract->SetInput (reader ->GetOutput ());
extractorWriter ->Update ();
We give the input image to the interpolator and we loop through the index array in order to get the
corresponding pixel values. Note that for every point we check whether it is inside the extracted
region.
interpolator ->SetInputImage(extract->GetOutput ());
for (k = 0; k < It / 2; ++k)
{
pixelIndexBis[0] = pixelIndexArray[2 * k];
pixelIndexBis[1] = pixelIndexArray[2 * k + 1];
currentIndexBis[0] = currentIndexArray[2 * k];
currentIndexBis[1] = currentIndexArray[2 * k + 1];
if (interpolator ->IsInsideBuffer(pixelIndexBis))
{
pixelValue = int(interpolator ->EvaluateAtIndex(pixelIndexBis));
}
else
{
pixelValue = 0;
}
outputImage ->SetPixel(currentIndexBis , pixelValue );
}
delete[] pixelIndexArray;
delete[] currentIndexArray;
}
So we are done. We can now write the output image to a file after performing the intensity rescaling.
writer ->SetFileName (argv[2]);
rescaler ->SetInput (outputImage );
writer ->SetInput(rescaler ->GetOutput ());
writer ->Update();
11.1.3 Evaluating Sensor Model
If no appropriate sensor model is available in the image meta-data, OTB offers the possibility to
estimate a sensor model from the image.
The source code for this example can be found in the file
Examples/Projections/EstimateRPCSensorModelExample.cxx.
The following example illustrates the estimation of a sensor model from an image (limited to
RPC sensor models for now).
The otb::GCPsToRPCSensorModelImageFilter estimates an RPC sensor model from a list of user-
defined GCPs. Internally, it uses an ossimRpcSolver, which performs the estimation using the well-
known least-squares method.
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the otb::GCPsToRPCSensorModelImageFilter class must be included.
#include <ios>
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbGCPsToRPCSensorModelImageFilter.h"
We declare the image type based on a particular pixel type and dimension. In this case the float
type is used for the pixels.
typedef otb::Image <float, 2> ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::GCPsToRPCSensorModelImageFilter<ImageType >
GCPsToSensorModelFilterType;
typedef GCPsToSensorModelFilterType:: Point2DType Point2DType ;
typedef GCPsToSensorModelFilterType:: Point3DType Point3DType ;
The otb::GCPsToRPCSensorModelImageFilter is instantiated.
GCPsToSensorModelFilterType::Pointer rpcEstimator =
GCPsToSensorModelFilterType::New();
rpcEstimator ->SetInput(reader ->GetOutput ());
We retrieve the command line parameters and put them in the correct variables. First, we determine
the number of GCPs set from the command line parameters; each GCP is then stored in:
• Point3DType: the geographic (ground) point, in 3D
• Point2DType: the associated pixel in the image (2D physical coordinates)
Here we do not use DEM or MeanElevation. It is also possible to give a 2D ground point and
use the DEM or MeanElevation to get the corresponding elevation.
for (unsigned int gcpId = 0; gcpId < nbGCPs; ++gcpId)
{
Point2DType sensorPoint ;
sensorPoint [0] = atof(argv[3 + gcpId * 5]);
sensorPoint [1] = atof(argv[4 + gcpId * 5]);
Point3DType geoPoint;
geoPoint [0] = atof(argv[5 + 5 * gcpId]);
geoPoint [1] = atof(argv[6 + 5 * gcpId]);
geoPoint [2] = atof(argv[7 + 5 * gcpId]);
std::cout << "Adding GCP sensor: " << sensorPoint << " <-> geo: " <<
geoPoint << std::endl;
rpcEstimator ->AddGCP(sensorPoint , geoPoint );
}
Note that the otb::GCPsToRPCSensorModelImageFilter needs at least 16 GCPs to estimate
a proper RPC sensor model, although no warning will be reported to the user if the num-
ber of GCPs is lower than 16. Actual estimation of the sensor model takes place in the
GenerateOutputInformation() method.
rpcEstimator ->GetOutput ()->UpdateOutputInformation();
The result of the RPC model estimation and the residual ground error are then saved in a text file. Note
that this filter does not modify the image buffer, but only the metadata.
std::ofstream ofs;
ofs.open(outfname );
// Set floatfield to format properly
ofs.setf(std::ios::fixed , std::ios::floatfield );
ofs.precision (10);
ofs << (ImageType ::Pointer) rpcEstimator ->GetOutput () << std::endl;
ofs << "Residual ground error: " << rpcEstimator ->GetRMSGroundError() <<
std::endl;
ofs.close();
The output image can now be given to the otb::OrthoRectificationFilter. Note that this filter
can also import GCPs from the image metadata, if any.
11.1.4 Limits of the Approach
As you may understand by now, accurate geo-referencing needs an accurate DEM as well as accurate
sensor models and parameters. In the case where we have several images acquired over the same
area by different sensors or different geometric configurations, geo-referencing (geographical
coordinates) or ortho-rectification (cartographic coordinates) is usually not enough. Indeed, when
working with image series we usually want to compare them (fusion, change detection, etc.) at the
pixel level.
Since common DEM and sensor parameters do not allow for such an accuracy, we have to use clever
strategies to improve the co-registration of the images. The classical one consists in refining the
sensor parameters by taking homologous points between the images to co-register. This is called
bundle block adjustment and will be implemented in coming versions of OTB.
Even if the model parameters are refined, errors due to DEM accuracy can not be eliminated. In this
case, image to image registration can be applied. These approaches are presented in chapters 9 and
10.
11.2 Map Projections
Map projections describe the link between geographic coordinates and cartographic ones. Map
projections thus allow representing a 2-dimensional manifold of a 3-dimensional space (the Earth's
surface) in a 2-dimensional space (a map, which used to be a sheet of paper!). This geometrical
transformation doesn't have a unique solution, so over the history of cartography, every country or
region in the world has been able to express its belief of being the center of the universe. In other
words, every cartographic projection tries to minimize the distortions of the 3D to 2D transformation
for a given point of the Earth's surface1.
In OTB the otb::MapProjection class is derived from the itk::Transform class, so
the coordinate transformation methods are overloaded with map projection equations. The
otb::MapProjection class is templated over the type of cartographic projection, which is pro-
vided by the OSSIM library. In order to hide the complexity of the approach, some type definitions
for the more common projections are given in the file otbMapProjections.h.
Sometimes, you don't know at compile time what map projection you will need in your application.
In this case, the otb::GenericMapProjection allows you to set the map projection at run-time by
passing the WKT identification for the projection.
The source code for this example can be found in the file
Examples/Projections/MapProjectionExample.cxx.
Map projection is an important issue when working with satellite images. In the orthorectification
1We proposed to optimize an OTB map projection for Toulouse, but we didn’t get any help from OTB users.
process, converting between geographic and cartographic coordinates is a key step. In this process,
everything is integrated and you don’t need to know the details.
However, sometimes, you need to go hands-on and find out the nitty-gritty details. This example
shows you how to play with map projections in OTB and how to convert coordinates. In most cases,
the underlying work is done by OSSIM.
First, we start by including the otbMapProjections header. In this file, over 30 projections are defined
and ready to use. It is easy to add new ones.
The otbGenericMapProjection enables you to instantiate a map projection from a WKT (Well
Known Text) string, which is popular with OGR for example.
#include "otbMapProjections.h"
#include "otbGenericMapProjection.h"
We retrieve the command line parameters and put them in the correct variables. The transforms are
going to work with an itk::Point.
const char * outFileName = argv[1];
itk::Point <double, 2> point;
point[0] = 1.4835345;
point[1] = 43.55968261;
The output of this program will be saved in a text file. We also want to make sure that the precision
of the digits will be enough.
std::ofstream file;
file.open(outFileName );
file << std::setprecision(15);
We can now instantiate our first map projection. Here, it is a UTM projection. We also need to
provide the information about the zone and the hemisphere for the projection. These are specific to
the UTM projection.
otb:: UtmForwardProjection::Pointer utmProjection
= otb::UtmForwardProjection::New();
utmProjection ->SetZone (31);
utmProjection ->SetHemisphere('N');
The TransformPoint() method returns the coordinates of the point in the new projection.
file << "Forward UTM projection : " << std::endl;
file << point << " -> ";
file << utmProjection ->TransformPoint(point);
file << std::endl << std::endl;
We follow the same path for the Lambert93 projection:
otb:: Lambert93ForwardProjection::Pointer lambertProjection
= otb::Lambert93ForwardProjection::New();
file << "Forward Lambert93 projection : " << std::endl;
file << point << " -> ";
file << lambertProjection ->TransformPoint(point);
file << std::endl << std::endl;
If you followed carefully the previous examples, you’ve noticed that the target projections have been
directly coded, which means that they can’t be changed at run-time. What happens if you don’t know
the target projection when you’re writing the program? It can depend on some input provided by the
user (image, shapefile).
In this situation, you can use the otb::GenericMapProjection. It will accept a string to set the
projection. This string should be in the WKT format.
For example:
std::string projectionRefWkt = "PROJCS[\"UTM Zone 31, Northern Hemisphere\","
  "GEOGCS[\"WGS 84\", DATUM[\"WGS_1984\", SPHEROID[\"WGS 84\", 6378137, 298.257223563,"
  "AUTHORITY[\"EPSG\",\"7030\"]], TOWGS84[0, 0, 0, 0, 0, 0, 0],"
  "AUTHORITY[\"EPSG\",\"6326\"]], PRIMEM[\"Greenwich\", 0, AUTHORITY[\"EPSG\",\"8901\"]],"
  "UNIT[\"degree\", 0.0174532925199433, AUTHORITY[\"EPSG\",\"9108\"]],"
  "AXIS[\"Lat\", NORTH], AXIS[\"Long\", EAST],"
  "AUTHORITY[\"EPSG\",\"4326\"]], PROJECTION[\"Transverse_Mercator\"],"
  "PARAMETER[\"latitude_of_origin\", 0], PARAMETER[\"central_meridian\", 3],"
  "PARAMETER[\"scale_factor\", 0.9996], PARAMETER[\"false_easting\", 500000],"
  "PARAMETER[\"false_northing\", 0], UNIT[\"Meter\", 1]]";
This string is then passed to the projection using the SetWkt() method.
typedef otb::GenericMapProjection <otb:: TransformDirection::FORWARD> GenericMapProjection;
GenericMapProjection::Pointer genericMapProjection =
GenericMapProjection::New();
genericMapProjection ->SetWkt(projectionRefWkt);
file << "Forward generic projection : " << std::endl;
file << point << " -> ";
file << genericMapProjection ->TransformPoint(point);
file << std::endl << std::endl;
And of course, we don’t forget to close the file:
file.close();
The final output of the program should be:
Forward UTM projection:
[1.4835345, 43.55968261] -> [377522.448427013, 4824086.71129131]
Forward Lambert93 projection:
[1.4835345, 43.55968261] -> [577437.889798954, 6274578.791561]
Forward generic projection:
[1.4835345, 43.55968261] -> [377522.448427013, 4824086.71129131]
You will seldom use a map projection by itself, but rather in an ortho-rectification framework. An
example is given in the next section.
11.3 Orthorectification with OTB
The source code for this example can be found in the file
Examples/Projections/OrthoRectificationExample.cxx.
This example demonstrates the use of the otb::OrthoRectificationFilter. This filter is in-
tended to orthorectify images which are in a distributor format with the appropriate meta-data de-
scribing the sensor model. In this example, we will choose to use an UTM projection for the output
image.
The first step toward the use of these filters is to include the proper header files: the one for the
ortho-rectification filter and the one defining the different projections available in OTB.
#include "otbOrthoRectificationFilter.h"
#include "otbMapProjections.h"
We will start by defining the types for the images, the image file reader and the image file writer. The
writer will be a otb::ImageFileWriter which will allow us to set the number of stream divisions
we want to apply when writing the output image, which can be very large.
typedef otb::Image <int, 2> ImageType ;
typedef otb::VectorImage <int, 2> VectorImageType;
typedef otb::ImageFileReader <VectorImageType > ReaderType ;
typedef otb::ImageFileWriter <VectorImageType > WriterType ;
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
We can now proceed to declare the type for the ortho-rectification filter. The class
otb::OrthoRectificationFilter is templated over the input and the output image types as well
as over the cartographic projection. We define therefore the type of the projection we want, which
is an UTM projection for this case.
typedef otb::UtmInverseProjection utmMapProjectionType;
typedef otb::OrthoRectificationFilter<VectorImageType , VectorImageType ,
utmMapProjectionType >
OrthoRectifFilterType;
OrthoRectifFilterType::Pointer orthoRectifFilter =
OrthoRectifFilterType::New ();
Now we need to instantiate the map projection, set the zone and hemisphere parameters and pass
this projection to the orthorectification filter.
utmMapProjectionType::Pointer utmMapProjection =
utmMapProjectionType::New();
utmMapProjection ->SetZone(atoi(argv[3]));
utmMapProjection ->SetHemisphere(*(argv[4]));
orthoRectifFilter ->SetMapProjection(utmMapProjection);
We then wire the input image to the orthorectification filter.
orthoRectifFilter ->SetInput (reader ->GetOutput ());
Using the user-provided information, we define the output region for the image generated by the
orthorectification filter. We also define the spacing of the deformation grid where actual deformation
values are estimated. Choosing a bigger deformation field spacing will speed up computation.
ImageType ::IndexType start;
start[0] = 0;
start[1] = 0;
orthoRectifFilter ->SetOutputStartIndex(start);
ImageType ::SizeType size;
size[0] = atoi(argv[7]);
size[1] = atoi(argv[8]);
orthoRectifFilter ->SetOutputSize(size);
ImageType ::SpacingType spacing;
spacing [0] = atof(argv[9]);
spacing [1] = atof(argv[10]);
orthoRectifFilter ->SetOutputSpacing(spacing);
ImageType ::SpacingType gridSpacing ;
gridSpacing [0] = 2.*atof(argv[9]);
gridSpacing [1] = 2.*atof(argv[10]);
orthoRectifFilter ->SetDeformationFieldSpacing(gridSpacing );
ImageType ::PointType origin;
origin[0] = strtod(argv[5], NULL);
origin[1] = strtod(argv[6], NULL);
orthoRectifFilter ->SetOutputOrigin(origin);
We can now plug the ortho-rectification filter into the writer and set the number of tiles we want to
split the output image into for the writing step.
writer ->SetInput(orthoRectifFilter ->GetOutput ());
writer ->SetAutomaticTiledStreaming();
Finally, we trigger the pipeline execution by calling the Update() method on the writer. Please note
that the ortho-rectification filter is derived from the otb::StreamingResampleImageFilter in or-
der to be able to compute the input image regions which are needed to build the output image. Since
the resampler applies a geometric transformation (scale, rotation, etc.), this region computation is
not trivial.
writer ->Update();
11.4 Vector data projection manipulation
The source code for this example can be found in the file
Examples/Projections/VectorDataProjectionExample.cxx.
Let’s assume that you have a KML file (hence in geographical coordinates) that you would like
to superpose to some image with a specific map projection. Of course, you could use the handy
ogr2ogr tool to do that, but it won’t integrate so seamlessly into your OTB application.
You can also suppose that the image on which you want to superpose the data is not in a specific
map projection but a raw image from a particular sensor. Thanks to OTB, the same code below will
be able to do the appropriate conversion.
This example demonstrates the use of the otb::VectorDataProjectionFilter.
Declare the vector data type that you would like to use in your application.
typedef otb::VectorData <double> InputVectorDataType;
typedef otb::VectorData <double> OutputVectorDataType;
Declare and instantiate the vector data reader: otb::VectorDataFileReader. The call to the
UpdateOutputInformation() method fills in the header information.
typedef otb::VectorDataFileReader <InputVectorDataType >
VectorDataFileReaderType;
VectorDataFileReaderType::Pointer reader = VectorDataFileReaderType::New();
reader ->SetFileName (argv[1]);
reader ->UpdateOutputInformation();
We need the image only to retrieve its projection information, i.e. map projection or sensor
model parameters. Hence, the image pixels won’t be read, only the header information using the
UpdateOutputInformation() method.
typedef otb::Image <unsigned short int, 2> ImageType ;
typedef otb::ImageFileReader <ImageType > ImageReaderType;
ImageReaderType::Pointer imageReader = ImageReaderType::New();
imageReader ->SetFileName (argv[2]);
imageReader ->UpdateOutputInformation();
The otb::VectorDataProjectionFilter will do the work of converting the vector data coor-
dinates. It is usually a good idea to use it when you design applications reading or saving vector
data.
typedef otb::VectorDataProjectionFilter<InputVectorDataType ,
OutputVectorDataType >
VectorDataFilterType;
VectorDataFilterType::Pointer vectorDataProjection =
VectorDataFilterType::New();
Information concerning the original projection of the vector data will be automatically retrieved
from the metadata. Nothing else is needed from you:
vectorDataProjection ->SetInput(reader ->GetOutput ());
Information about the target projection is retrieved directly from the image:
vectorDataProjection ->SetOutputKeywordList(
imageReader ->GetOutput ()->GetImageKeywordlist());
vectorDataProjection ->SetOutputOrigin(
imageReader ->GetOutput ()->GetOrigin ());
vectorDataProjection ->SetOutputSpacing(
imageReader ->GetOutput ()->GetSpacing ());
vectorDataProjection ->SetOutputProjectionRef(
imageReader ->GetOutput ()->GetProjectionRef());
Finally, the result is saved into a new vector file.
typedef otb::VectorDataFileWriter <OutputVectorDataType >
VectorDataFileWriterType;
VectorDataFileWriterType::Pointer writer = VectorDataFileWriterType::New();
writer ->SetFileName (argv[3]);
writer ->SetInput(vectorDataProjection ->GetOutput ());
writer ->Update();
It is worth noting that none of this code is specific to the vector data format. Whether you pass a
shapefile, or a KML file, the correct driver will be automatically instantiated.
11.5 Geometries projection manipulation
The source code for this example can be found in the file
Examples/Projections/GeometriesProjectionExample.cxx.
Instead of using otb::VectorData to apply projections as explained in 11.4, we can also directly
work on OGR data types thanks to otb::GeometriesProjectionFilter.
This example demonstrates how to proceed with this alternative set of vector data types.
Declare the geometries type that you would like to use in your application. Unlike
otb::VectorData, otb::GeometriesSet is a single type for any kind of geometries set (OGR
data source, or OGR layer).
typedef otb::GeometriesSet InputGeometriesType;
typedef otb::GeometriesSet OutputGeometriesType;
First, declare and instantiate the data source otb::ogr::DataSource. Then, encapsulate this data
source into a otb::GeometriesSet.
otb::ogr ::DataSource ::Pointer input = otb::ogr::DataSource ::New(
argv[1], otb::ogr::DataSource ::Modes::Read);
InputGeometriesType::Pointer in_set = InputGeometriesType::New(input);
We need the image only to retrieve its projection information, i.e. map projection or sensor
model parameters. Hence, the image pixels won’t be read, only the header information using the
UpdateOutputInformation() method.
typedef otb::Image <unsigned short int, 2> ImageType ;
typedef otb::ImageFileReader <ImageType > ImageReaderType;
ImageReaderType::Pointer imageReader = ImageReaderType::New();
imageReader ->SetFileName (argv[2]);
imageReader ->UpdateOutputInformation();
The otb::GeometriesProjectionFilter will do the work of converting the geometries coor-
dinates. It is usually a good idea to use it when you design applications reading or saving vector
data.
typedef otb:: GeometriesProjectionFilter GeometriesFilterType;
GeometriesFilterType::Pointer filter = GeometriesFilterType::New ();
Information concerning the original projection of the vector data will be automatically retrieved
from the metadata. Nothing else is needed from you:
filter ->SetInput(in_set);
Information about the target projection is retrieved directly from the image:
// necessary for sensors
filter ->SetOutputKeywordList(imageReader ->GetOutput ()->GetImageKeywordlist());
// necessary for sensors
filter ->SetOutputOrigin(imageReader ->GetOutput ()->GetOrigin ());
// necessary for sensors
filter ->SetOutputSpacing(imageReader ->GetOutput ()->GetSpacing ());
// ˜ wkt
filter ->SetOutputProjectionRef( imageReader ->GetOutput ()->GetProjectionRef());
Finally, the result is saved into a new vector file. Unlike other OTB filters,
otb::GeometriesProjectionFilter expects to be given a valid output geometries set in which to
store the result of its processing; otherwise the result will be an in-memory data source, not stored
in a file nor in a database.
Then, the processing is started by calling Update(). The actual serialization of the results is guaran-
teed to be completed when the output geometries set object goes out of scope, or when SyncToDisk()
is called.
otb::ogr:: DataSource ::Pointer output = otb::ogr::DataSource ::New(
argv[3], otb::ogr::DataSource ::Modes::Update_LayerCreateOnly);
OutputGeometriesType::Pointer out_set = OutputGeometriesType::New(output);
filter ->SetOutput (out_set );
filter ->Update();
Once again, it is worth noting that none of this code is specific to the vector data format. Whether
you pass a shapefile, or a KML file, the correct driver will be automatically instantiated.
11.6 Elevation management with OTB
The source code for this example can be found in the file
Examples/IO/DEMHandlerExample.cxx.
OTB relies on OSSIM for elevation handling. Since release 3.16, there is a single configuration
class otb::DEMHandler to manage elevation (in image projections or localization functions, for
example). This configuration is managed by a proper instantiation and parameter setting of
this class. This instantiation must be done before any call to geometric filters or functionalities.
OSSIM internal accesses to elevation are also configured by this class, which ensures consistency
throughout the library.
This class is a singleton; the New() method is deprecated and will be removed in a future release. We
need to use the Instance() method instead.
otb::DEMHandler ::Pointer demHandler = otb ::DEMHandler ::Instance ();
It allows configuring a directory containing DEM tiles (DTED or SRTM supported) using the
OpenDEMDirectory() method. The OpenGeoidFile() method allows loading a geoid file as well.
Lastly, a default height above ellipsoid can be set using the SetDefaultHeightAboveEllipsoid()
method.
demHandler ->SetDefaultHeightAboveEllipsoid(defaultHeight);
if(!demHandler ->IsValidDEMDirectory(demdir.c_str()))
{
std::cerr <<"IsValidDEMDirectory("<<demdir <<") = false"<<std ::endl;
fail = true;
}
demHandler ->OpenDEMDirectory(demdir);
demHandler ->OpenGeoidFile(geoid);
We can now retrieve height above ellipsoid or height above Mean Sea Level (MSL) using the meth-
ods GetHeightAboveEllipsoid() and GetHeightAboveMSL(). Outputs of these methods depend
on the configuration of the class otb::DEMHandler and the different cases are:
For GetHeightAboveEllipsoid():
• DEM and geoid both available: DEM value + geoid offset
• No DEM but geoid available: geoid offset
• DEM available, but no geoid: DEM value
• No DEM and no geoid available: default height above ellipsoid
For GetHeightAboveMSL():
• DEM and geoid both available: DEM value
• No DEM but geoid available: 0
• DEM available, but no geoid: DEM value
• No DEM and no geoid available: 0
otb::DEMHandler :: PointType point;
point[0] = longitude ;
point[1] = latitude ;
double height = -32768;
height = demHandler ->GetHeightAboveMSL(point);
std::cout <<"height above MSL ("<<longitude <<","
<<latitude <<") = "<<height <<" meters"<<std::endl;
height = demHandler ->GetHeightAboveEllipsoid(point);
std::cout <<"height above ellipsoid ("<<longitude
<<", "<<latitude <<") = "<<height <<" meters"<<std::endl;
Note that OSSIM internal calls for sensor modelling use the height above ellipsoid, and follow the
same logic as the GetHeightAboveEllipsoid() method.
11.7 Vector data area extraction
The source code for this example can be found in the file
Examples/Projections/VectorDataExtractROIExample.cxx.
There are some vector data sets widely available on the Internet. These data sets can be huge, covering
an entire country, with hundreds of thousands of objects.
Most of the time, you won't be interested in the whole area and would like to focus only on the area
corresponding to your satellite image.
The otb::VectorDataExtractROI is able to extract the area corresponding to your satellite image,
even if the image is still in sensor geometry (provided the sensor model is supported by OTB). Let’s
see how we can do that.
This example demonstrates the use of the otb::VectorDataExtractROI.
After the usual declaration (you can check the source file for the details), we can declare the
otb::VectorDataExtractROI:
typedef otb::VectorDataExtractROI <VectorDataType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
Then, we need to specify the region to extract. This region is a bit special as it also contains in-
formation related to its reference system (cartographic projection or sensor model projection). We
retrieve all this information from the image given as input.
TypedRegion region;
TypedRegion ::SizeType size;
TypedRegion ::IndexType index;
size[0] = imageReader ->GetOutput ()->GetLargestPossibleRegion(). GetSize ()[0]
* imageReader ->GetOutput ()->GetSpacing ()[0];
size[1] = imageReader ->GetOutput ()->GetLargestPossibleRegion(). GetSize ()[1]
* imageReader ->GetOutput ()->GetSpacing ()[1];
index[0] = imageReader ->GetOutput ()->GetOrigin ()[0]
- 0.5 * imageReader ->GetOutput ()->GetSpacing ()[0];
index[1] = imageReader ->GetOutput ()->GetOrigin ()[1]
- 0.5 * imageReader ->GetOutput ()->GetSpacing ()[1];
region.SetSize(size);
region.SetOrigin (index);
otb:: ImageMetadataInterfaceBase::Pointer imageMetadataInterface
= otb::ImageMetadataInterfaceFactory::CreateIMI (
imageReader ->GetOutput ()->GetMetaDataDictionary());
region.SetRegionProjection(
imageMetadataInterface ->GetProjectionRef());
region.SetKeywordList(imageReader ->GetOutput ()->GetImageKeywordlist());
filter ->SetRegion (region);
And finally, we can plug the filter in the pipeline:
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
CHAPTER TWELVE
RADIOMETRY
Remote sensing is not just a matter of taking pictures, but also – mostly – a matter of measuring
physical values. In order to properly deal with physical magnitudes, the numerical values provided
by the sensors have to be calibrated. After that, several indices with physical meaning can be com-
puted.
12.1 Radiometric Indices
12.1.1 Introduction
With multispectral sensors, several indices can be computed, combining several spectral bands to
show features that are not obvious using only one band. Indices can show:
• Vegetation (Tab 12.1)
• Soil (Tab 12.2)
• Water (Tab 12.3)
• Built up areas (Tab 12.4)
A vegetation index is a quantitative indicator of biomass or vegetative vigor, usually formed from
combinations of several spectral bands, whose values are added, divided, or multiplied in order to
yield a single value that indicates the amount or vigor of vegetation.
Numerous indices are available in OTB and are listed in tables 12.1 to 12.4 with their references.
The use of the different indices is very similar, and only a few examples are given in the next sections.
NDVI Normalized Difference Vegetation Index [122]
RVI Ratio Vegetation Index [106]
PVI Perpendicular Vegetation Index [119, 149]
SAVI Soil Adjusted Vegetation Index [65]
TSAVI Transformed Soil Adjusted Vegetation Index [10, 9]
MSAVI Modified Soil Adjusted Vegetation Index [115]
MSAVI2 Modified Soil Adjusted Vegetation Index [115]
GEMI Global Environment Monitoring Index [111]
WDVI Weighted Difference Vegetation Index [27, 28]
AVI Angular Vegetation Index [113]
ARVI Atmospherically Resistant Vegetation Index [79]
TSARVI Transformed Soil Adjusted Vegetation Index [79]
EVI Enhanced Vegetation Index [66, 76]
IPVI Infrared Percentage Vegetation Index [32]
TNDVI Transformed NDVI [36]
Table 12.1: Vegetation indices
IR Redness Index [46]
IC Color Index [46]
IB Brilliance Index [101]
IB2 Brilliance Index [101]
Table 12.2: Soil indices
12.1.2 NDVI
NDVI was one of the most successful of many attempts to simply and quickly identify vegetated
areas and their condition, and it remains the most well-known and used index to detect live green
plant canopies in multispectral remote sensing data. Once the feasibility to detect vegetation had
been demonstrated, users tended to also use the NDVI to quantify the photosynthetic capacity of
plant canopies. This, however, can be a rather more complex undertaking if not done properly.
The source code for this example can be found in the file
Examples/Radiometry/NDVIRAndNIRVegetationIndexImageFilter.cxx.
The following example illustrates the use of the otb::RAndNIRIndexImageFilter with the Normalized Difference Vegetation Index (NDVI). NDVI computes the normalized difference between the radiance of the NIR channel, noted L_NIR, and the radiance of the red channel, noted L_r, reflected from the surface and transmitted through the atmosphere:

NDVI = (L_NIR − L_r) / (L_NIR + L_r)   (12.1)
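As a quick sanity check of equation (12.1), here is a minimal standalone sketch in plain C++ (not part of the OTB example; the radiance values are purely illustrative):

#include <iostream>

// Direct evaluation of equation (12.1); the zero-sum guard mirrors what a
// robust implementation would do for dark pixels.
double ndvi(double lNIR, double lR)
{
  const double sum = lNIR + lR;
  return (sum == 0.0) ? 0.0 : (lNIR - lR) / sum;
}

int main()
{
  std::cout << ndvi(0.45, 0.05) << std::endl; // dense vegetation: 0.8
  std::cout << ndvi(0.10, 0.12) << std::endl; // bare soil or water: slightly negative
  return 0;
}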
SRWI Simple Ratio Water Index [155]
NDWI Normalized Difference Water Index [21]
NDWI2 Normalized Difference Water Index [97]
MNDWI Modified Normalized Difference Water Index [151]
NDPI Normalized Difference Pond Index [83]
NDTI Normalized Difference Turbidity Index [83]
SA Spectral Angle
Table 12.3: Water indices
NDBI Normalized Difference Built Up Index [100]
ISU Index Surfaces Built [2]
Table 12.4: Built-up indices
The following classes provide similar functionality:
• otb::Functor::RVI
• otb::Functor::PVI
• otb::Functor::SAVI
• otb::Functor::TSAVI
• otb::Functor::MSAVI
• otb::Functor::GEMI
• otb::Functor::WDVI
• otb::Functor::IPVI
• otb::Functor::TNDVI
With the otb::RAndNIRIndexImageFilter class, the filter inputs are one-channel images: one image represents the NIR channel, the other the red channel.
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the otb::RAndNIRIndexImageFilter class must be included.
#include "otbRAndNIRIndexImageFilter.h"
The image types are now defined using the pixel types and the dimension. Input and output images are defined as otb::Image.
const unsigned int Dimension = 2;
typedef double InputPixelType;
typedef float  OutputPixelType;
typedef otb::Image<InputPixelType, Dimension>  InputRImageType;
typedef otb::Image<InputPixelType, Dimension>  InputNIRImageType;
typedef otb::Image<OutputPixelType, Dimension> OutputImageType;
The NDVI (Normalized Difference Vegetation Index) functor is instantiated using the image pixel types as template parameters. It is implemented as a functor class which will be passed as a parameter to an otb::RAndNIRIndexImageFilter.
typedef otb::Functor::NDVI<InputPixelType,
                           InputPixelType,
                           OutputPixelType> FunctorType;
The otb::RAndNIRIndexImageFilter type is instantiated using the image types and the NDVI functor as template parameters.
typedef otb::RAndNIRIndexImageFilter<InputRImageType,
                                     InputNIRImageType,
                                     OutputImageType,
                                     FunctorType> RAndNIRIndexImageFilterType;
Now the input images are set and a name is given to the output image.
readerR ->SetFileName (argv[1]);
readerNIR ->SetFileName (argv[2]);
writer ->SetFileName (argv[3]);
We set the processing pipeline: filter inputs are linked to the reader output and the filter output is
linked to the writer input.
filter ->SetInputR (readerR->GetOutput ());
filter ->SetInputNIR (readerNIR ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
Invocation of the Update() method on the writer triggers the execution of the pipeline. It is rec-
ommended to place update() calls in a try/catch block in case errors occur and exceptions are
thrown.
try
{
writer ->Update ();
}
catch (itk::ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Let's now run this example using as input the images NDVI_3.hdr and NDVI_4.hdr (images kindly and free of charge given by SISA and CNES) provided in the directory Examples/Data.
Figure 12.1: NDVI input images on the left (Red channel and NIR channel), on the right the result of the
algorithm.
12.1.3 ARVI
The source code for this example can be found in the file
Examples/Radiometry/ARVIMultiChannelRAndBAndNIRVegetationIndexImageFilter.cxx.
The following example illustrates the use of the otb::MultiChannelRAndBAndNIRIndexImageFilter
with the use of the Atmospherically Resistant Vegetation Index (ARVI) otb::Functor::ARVI.
ARVI is an improved version of the NDVI that is more robust to the atmospheric effect. In addition
to the red and NIR channels (used in the NDVI), the ARVI takes advantage of the presence of the
blue channel to accomplish a self-correction process for the atmospheric effect on the red channel.
For this, it uses the difference in the radiance between the blue and the red channels to correct the
radiance in the red channel. Let's define ρ*_NIR, ρ*_r and ρ*_b as the normalized radiances (that is to say the radiances normalized to reflectance units) of the NIR, red and blue channels respectively. ρ*_rb is defined as

ρ*_rb = ρ*_r − γ (ρ*_b − ρ*_r)   (12.2)

The ARVI expression is

ARVI = (ρ*_NIR − ρ*_rb) / (ρ*_NIR + ρ*_rb)   (12.3)

This formula can be simplified to:

ARVI = (L_NIR − L_rb) / (L_NIR + L_rb)   (12.4)
For more details, refer to the work of Kaufman and Tanré [79].
The following classes provide similar functionality:
• otb::Functor::TSARVI
• otb::Functor::EVI
With the otb::MultiChannelRAndBAndNIRIndexImageFilter class, the input has to be a multi-channel image and the user has to specify the channel index of the red, blue and NIR channels.
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the otb::MultiChannelRAndBAndNIRIndexImageFilter class must be included.
#include "otbMultiChannelRAndBAndNIRIndexImageFilter.h"
The image types are now defined using pixel types and dimension. The input image is defined as an
otb::VectorImage, the output is a otb::Image.
const unsigned int Dimension = 2;
typedef double InputPixelType;
typedef float OutputPixelType;
typedef otb::VectorImage <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
The ARVI (Atmospherically Resistant Vegetation Index) functor is instantiated using the image pixel types as template parameters. Note that other functors operating on the red, blue and NIR channels, such as EVI and TSARVI, can be used in the same way.
typedef otb::Functor ::ARVI <InputPixelType ,
InputPixelType ,
InputPixelType ,
OutputPixelType > FunctorType ;
The otb::MultiChannelRAndBAndNIRIndexImageFilter type is defined using the image types
and the ARVI functor as template parameters. We then instantiate the filter itself.
typedef otb:: MultiChannelRAndBAndNIRIndexImageFilter
<InputImageType ,
OutputImageType ,
FunctorType >
MultiChannelRAndBAndNIRIndexImageFilterType;
MultiChannelRAndBAndNIRIndexImageFilterType::Pointer
filter = MultiChannelRAndBAndNIRIndexImageFilterType::New();
Now the input image is set and a name is given to the output image.
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
The three used index bands (red, blue and NIR) are declared.
filter ->SetRedIndex (:: atoi(argv[5]));
filter ->SetBlueIndex(:: atoi(argv[6]));
filter ->SetNIRIndex (:: atoi(argv[7]));
Figure 12.2: ARVI result on the right, with the input image on the left.
The γ parameter is set. The otb::MultiChannelRAndBAndNIRIndexImageFilter class sets the
default value of γ to 0.5. This parameter is used to reduce the atmospheric effect on a global scale.
filter ->GetFunctor (). SetGamma (:: atof(argv[8]));
The filter input is linked to the reader output and the filter output is linked to the writer input.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
recommended to place update calls in a try/catch block in case errors occur and exceptions are
thrown.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Let's now run this example using as input the image IndexVegetation.hd provided in the directory Examples/Data (image kindly and free of charge given by SISA and CNES), with γ = 0.6.
12.1.4 AVI
The source code for this example can be found in the file
Examples/Radiometry/AVIMultiChannelRAndGAndNIRVegetationIndexImageFilter.cxx.
The following example illustrates the use of the otb::MultiChannelRAndGAndNIRIndexImageFilter with the Angular Vegetation Index (AVI). The equation for the Angular Vegetation Index involves the green, red and near infra-red bands. λ1, λ2 and λ3 are the mid-band wavelengths for the green, red and NIR bands and tan^−1 is the arctangent function.
The AVI expression is

A1 = (λ3 − λ2) / λ2   (12.5)

A2 = (λ2 − λ1) / λ2   (12.6)

AVI = tan^−1(A1 / (NIR − R)) + tan^−1(A2 / (G − R))   (12.7)

For more details, refer to Plummer's work [113].
With the otb::MultiChannelRAndGAndNIRIndexImageFilter class the input has to be a multi
channel image and the user has to specify the channel index of the red, green and NIR channel.
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the otb::MultiChannelRAndGAndNIRIndexImageFilter class must be included.
#include "otbMultiChannelRAndGAndNIRIndexImageFilter.h"
The image types are now defined using pixel types and dimension. The input image is defined as an
otb::VectorImage, the output is a otb::Image.
const unsigned int Dimension = 2;
typedef double InputPixelType;
typedef float OutputPixelType;
typedef otb::VectorImage <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
The AVI (Angular Vegetation Index) is instantiated using the image pixel types as template param-
eters.
typedef otb::Functor ::AVI<InputPixelType , InputPixelType ,
InputPixelType , OutputPixelType > FunctorType ;
The otb::MultiChannelRAndGAndNIRIndexImageFilter type is defined using the image types
and the AVI functor as template parameters. We then instantiate the filter itself.
typedef otb:: MultiChannelRAndGAndNIRIndexImageFilter
<InputImageType , OutputImageType , FunctorType >
MultiChannelRAndGAndNIRIndexImageFilterType;
MultiChannelRAndGAndNIRIndexImageFilterType::Pointer
filter = MultiChannelRAndGAndNIRIndexImageFilterType::New();
Now the input image is set and a name is given to the output image.
reader ->SetFileName (argv[1]);
writer ->SetFileName (argv[2]);
The three used index bands (red, green and NIR) are declared.
filter ->SetRedIndex (:: atoi(argv[5]));
filter ->SetGreenIndex(:: atoi(argv[6]));
filter ->SetNIRIndex (:: atoi(argv[7]));
The λ parameters for the red, green and NIR bands are set. The otb::MultiChannelRAndGAndNIRIndexImageFilter class sets their default values to 660, 560 and 830.
filter->GetFunctor().SetLambdaR(::atof(argv[8]));
filter->GetFunctor().SetLambdaG(::atof(argv[9]));
filter->GetFunctor().SetLambdaNir(::atof(argv[10]));
The filter input is linked to the reader output and the filter output is linked to the writer input.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
recommended to place update calls in a try/catch block in case errors occur and exceptions are
thrown.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Let's now run this example using as input the image verySmallFSATSW.tif provided in the directory Examples/Data. Figure 12.3 shows the result.
Figure 12.3: AVI result on the right, with the input image on the left.
12.2 Atmospheric Corrections
The source code for this example can be found in the file
Examples/Radiometry/AtmosphericCorrectionSequencement.cxx.
The following example illustrates the application of atmospheric corrections to an optical multispectral image similar to Pleiades. These corrections are made in four steps:
• digital number to luminance correction;
• luminance to reflectance image conversion;
• atmospheric correction for TOA (top of atmosphere) to TOC (top of canopy) reflectance estimation;
• correction of the adjacency effects taking into account the neighborhood contribution.
The manipulation of each class used for the different steps and the link with the 6S radiometry
library will be explained.
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the otb::AtmosphericCorrectionSequencement class must be included. For the digital number to luminance conversion, the luminance to reflectance conversion, the reflectance to surface reflectance correction and the neighborhood (adjacency) correction, four header files are required.
#include "otbImageToLuminanceImageFilter.h"
#include "otbLuminanceToReflectanceImageFilter.h"
#include "otbReflectanceToSurfaceReflectanceImageFilter.h"
#include "otbSurfaceAdjacencyEffect6SCorrectionSchemeFilter.h"
This chain uses the 6S radiative transfer code to compute radiometric parameters. To manipulate 6S data, three classes are needed: the first one stores the metadata, the second one calls 6S and generates the information, and the last one stores that information.
#include "otbAtmosphericCorrectionParameters.h"
#include "otbAtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms.h"
#include "otbAtmosphericRadiativeTerms.h"
Image types are now defined using pixel types and dimension. The input image is defined as an
otb::VectorImage, the output image is a otb::VectorImage. To simplify, input and output
image types are the same one.
const unsigned int Dimension = 2;
typedef double PixelType ;
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
The GenerateOutputInformation() reader method is called to know the number of components per pixel of the image. It is recommended to place GenerateOutputInformation calls in a try/catch block in case errors occur and exceptions are thrown.
reader ->SetFileName (argv[1]);
try
{
reader ->GenerateOutputInformation();
}
catch (itk:: ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
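Once the output information has been generated, the number of components per pixel can be read directly from the reader output with the standard image accessor, for instance:

// Number of spectral bands, available after GenerateOutputInformation().
const unsigned int nbOfComponents =
  reader->GetOutput()->GetNumberOfComponentsPerPixel();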
The otb::ImageToLuminanceImageFilter type is defined and instantiated. This class uses a functor applied to each component of each pixel (X^k) whose formula is:

L_TOA^k = X^k / α^k + β^k   (12.8)

Where:
• L_TOA^k is the incident luminance (in W.m−2.sr−1.µm−1);
• X^k is the measured digital number (i.e. the input image pixel component);
• α^k is the absolute calibration gain for the channel k;
• β^k is the absolute calibration bias for the channel k.
typedef otb::ImageToLuminanceImageFilter<ImageType, ImageType>
  ImageToLuminanceImageFilterType;
ImageToLuminanceImageFilterType::Pointer filterImageToLuminance =
  ImageToLuminanceImageFilterType::New();
Here, α and β are read from an ASCII file given in input, stored in a vector and passed to the class.
filterImageToLuminance ->SetAlpha (alpha);
filterImageToLuminance ->SetBeta(beta);
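The reading step itself is not shown in the guide. A hedged sketch of how the two vectors might be filled from a plain ASCII file is given below; the file name and the one-gain-one-bias-pair-per-line layout are assumptions for illustration, and itk::VariableLengthVector is assumed to be compatible with the vector type expected by the filter:

#include <fstream>
#include "itkVariableLengthVector.h"

// Hypothetical layout: one "alpha beta" pair per line, one line per band.
const unsigned int nbBands =
  reader->GetOutput()->GetNumberOfComponentsPerPixel();
itk::VariableLengthVector<double> alpha(nbBands);
itk::VariableLengthVector<double> beta(nbBands);
std::ifstream coefficientsFile("calibration_coefficients.txt"); // hypothetical file name
for (unsigned int band = 0; band < nbBands; ++band)
  {
  coefficientsFile >> alpha[band] >> beta[band];
  }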
The otb::LuminanceToReflectanceImageFilter type is defined and instantiated. This class uses a functor applied to each component of each pixel of the luminance filter output (L_TOA^k):

ρ_TOA^k = (π · L_TOA^k) / (E_S^k · cos(θ_S) · d/d0)   (12.9)
Where:
• ρ_TOA^k is the reflectance measured by the sensor;
• θ_S is the zenithal solar angle in degrees;
• E_S^k is the solar illumination out of the atmosphere measured at a distance d0 from the Earth;
• d/d0 is the ratio between the Earth-Sun distance at the acquisition date and the mean Earth-Sun distance. The ratio can be directly given to the class or computed using a 6S routine. In the latter case (the one used in this example), the user has to provide the month and the day of the acquisition.
typedef otb::LuminanceToReflectanceImageFilter<ImageType , ImageType >
LuminanceToReflectanceImageFilterType;
LuminanceToReflectanceImageFilterType::Pointer filterLuminanceToReflectance
= LuminanceToReflectanceImageFilterType::New();
The solar illumination is read from an ASCII file given in input, stored in a vector and given to the class. Day, month and zenithal solar angle are inputs and can be directly given to the class.
filterLuminanceToReflectance->SetZenithalSolarAngle(
static_cast<double>(atof(argv[6])));
filterLuminanceToReflectance->SetDay(atoi(argv[7]));
filterLuminanceToReflectance->SetMonth(atoi(argv[8]));
filterLuminanceToReflectance->SetSolarIllumination(solarIllumination);
At this step of the chain, radiometric information is needed. This information will be computed from different parameters stored in an otb::AtmosphericCorrectionParameters class instance. This container will be given to an otb::AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms class instance, which will call a 6S routine that computes the needed radiometric information and stores it in an otb::AtmosphericRadiativeTerms class instance. For this, the otb::AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms, otb::AtmosphericCorrectionParameters and otb::AtmosphericRadiativeTerms types are defined and instantiated.
typedef otb:: AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms
AtmosphericCorrectionParametersTo6SRadiativeTermsType;
typedef otb:: AtmosphericCorrectionParameters
AtmosphericCorrectionParametersType;
typedef otb::AtmosphericRadiativeTerms
AtmosphericRadiativeTermsType;
The otb::AtmosphericCorrectionParameters class needs several parameters:
• The zenithal and azimuthal solar angles that describe the solar incidence configuration (in degrees);
• The zenithal and azimuthal viewing angles that describe the viewing direction (in degrees);
• The month and the day of the acquisition;
• The atmospheric pressure;
• The water vapor amount, that is, the total water vapor content over the vertical atmospheric column;
• The ozone amount, that is, the stratospheric ozone layer content;
• The aerosol model, that is, the kind of particles (no aerosol, continental, maritime, urban, desertic);
• The aerosol optical thickness at 550 nm, that is, the radiative impact of aerosol for the reference wavelength 550 nm;
• The filter function, that is, the values of the filter function for one spectral band, from λinf to λsup by steps of 2.5 nm. One filter function per channel is required. This last parameter is read from text files, the other ones are directly given to the class.
dataAtmosphericCorrectionParameters->SetSolarZenithalAngle(
static_cast<double>(atof(argv[6])));
dataAtmosphericCorrectionParameters->SetSolarAzimutalAngle(
static_cast<double>(atof(argv[9])));
dataAtmosphericCorrectionParameters->SetViewingZenithalAngle(
static_cast<double>(atof(argv[10])));
dataAtmosphericCorrectionParameters->SetViewingAzimutalAngle(
static_cast<double>(atof(argv[11])));
dataAtmosphericCorrectionParameters->SetMonth(atoi(argv[8]));
dataAtmosphericCorrectionParameters->SetDay(atoi(argv[7]));
dataAtmosphericCorrectionParameters->SetAtmosphericPressure(
static_cast<double>(atof(argv[12])));
dataAtmosphericCorrectionParameters->SetWaterVaporAmount(
static_cast<double>(atof(argv[13])));
dataAtmosphericCorrectionParameters->SetOzoneAmount(
static_cast<double>(atof(argv[14])));
AerosolModelType aerosolModel =
static_cast<AerosolModelType >(::atoi(argv[15]));
dataAtmosphericCorrectionParameters->SetAerosolModel(aerosolModel);
dataAtmosphericCorrectionParameters->SetAerosolOptical(
static_cast<double>(atof(argv[16])));
Once those parameters are loaded and stored in the AtmosphericCorrectionParameters class instance, it is given as input to an instance of AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms that will compute the needed radiometric information.
AtmosphericCorrectionParametersTo6SRadiativeTermsType::Pointer
filterAtmosphericCorrectionParametersTo6SRadiativeTerms =
AtmosphericCorrectionParametersTo6SRadiativeTermsType::New();
filterAtmosphericCorrectionParametersTo6SRadiativeTerms->SetInput (
dataAtmosphericCorrectionParameters);
filterAtmosphericCorrectionParametersTo6SRadiativeTerms->Update ();
The output of this class will be an instance of the AtmosphericRadiativeTerms class. This class contains (for each channel of the image):
• The intrinsic atmospheric reflectance, which takes into account the molecular scattering and the aerosol scattering attenuated by water vapor absorption;
• The spherical albedo of the atmosphere;
• The total gaseous transmission (for all species);
• The total transmittance of the atmosphere from sun to ground (downward transmittance) and from ground to space sensor (upward transmittance).
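For debugging purposes, these values can be inspected once the 6S computation has run. The sketch below only relies on the standard ITK Print() self-description mechanism; the exact listing layout is not documented here:

// Dump the radiative terms computed by 6S to standard output for inspection.
otb::AtmosphericRadiativeTerms::Pointer radiativeTerms =
  filterAtmosphericCorrectionParametersTo6SRadiativeTerms->GetOutput();
radiativeTerms->Print(std::cout);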
Atmospheric corrections can now start. First, an instance of
otb::ReflectanceToSurfaceReflectanceImageFilter is created.
typedef otb::ReflectanceToSurfaceReflectanceImageFilter<ImageType ,
ImageType >
ReflectanceToSurfaceReflectanceImageFilterType;
ReflectanceToSurfaceReflectanceImageFilterType::Pointer
filterReflectanceToSurfaceReflectanceImageFilter
= ReflectanceToSurfaceReflectanceImageFilterType::New ();
The aim of the atmospheric correction is to invert the surface reflectance (for each pixel of the
input image) from the TOA reflectance and from simulations of the atmospheric radiative functions
corresponding to the geometrical conditions of the observation and to the atmospheric components.
The process has to be applied to each pixel of the image, band by band, with the following formula:

ρ_S^unif = A / (1 + S · A)   (12.10)

where

A = (ρ_TOA − ρ_atm) / (T(µ_S) · T(µ_V) · t_g^allgas)   (12.11)
With:
• ρ_TOA is the reflectance at the top of the atmosphere;
• ρ_S^unif is the ground reflectance under the assumption of a lambertian surface and a uniform environment;
• ρ_atm is the intrinsic atmospheric reflectance;
• S is the spherical albedo of the atmosphere;
• t_g^allgas is the total gaseous transmission (for all species);
• T(µ_S) is the downward transmittance;
• T(µ_V) is the upward transmittance.
All those parameters are contained in the AtmosphericCorrectionParametersTo6SRadiativeTerms
output.
filterReflectanceToSurfaceReflectanceImageFilter->
SetAtmosphericRadiativeTerms(
filterAtmosphericCorrectionParametersTo6SRadiativeTerms->GetOutput ());
The next (and last) step is the neighborhood correction. For this, the SurfaceAdjacencyEffect6SCorrectionSchemeFilter class is used. The previous surface reflectance inversion is performed under the assumption of a homogeneous ground environment. The following step allows correcting the adjacency effect on the radiometry of pixels. The method is based on the decomposition of the observed signal into the sum of the contribution of the target pixel itself and the contribution of neighboring pixels, weighted by their distance to the target pixel. A simplified relation may be:

ρ_S = (ρ_S^unif · T(µ_V) − <ρ_S> · t_d(µ_V)) / exp(−δ/µ_V)   (12.12)
With:
• ρ_S^unif the ground reflectance under the assumption of a homogeneous environment;
• T(µ_V) the upward transmittance;
• t_d(µ_V) the upward diffuse transmittance;
• exp(−δ/µ_V) the upward direct transmittance;
• <ρ_S> the environment contribution to the pixel target reflectance in the total observed signal:

<ρ_S> = Σ_j Σ_i f(r(i, j)) · ρ_S^unif(i, j)   (12.13)

where
– r(i, j) is the distance between the pixel (i, j) and the central pixel of the window, in km;
– f(r) is the global environment function:

f(r) = (t_d^R(µ_V) · f_R(r) + t_d^A(µ_V) · f_A(r)) / t_d(µ_V)   (12.14)
The neighborhood consideration window size is given by the window radius. An instance of
otb::SurfaceAdjacencyEffect6SCorrectionSchemeFilter is created.
typedef otb::SurfaceAdjacencyEffect6SCorrectionSchemeFilter<ImageType ,
ImageType >
SurfaceAdjacencyEffect6SCorrectionSchemeFilterType;
SurfaceAdjacencyEffect6SCorrectionSchemeFilterType::Pointer
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter
= SurfaceAdjacencyEffect6SCorrectionSchemeFilterType::New();
The filter needs four inputs (a hedged sketch of how they might be set is given after this list):
• Radiometric information (the output of the AtmosphericCorrectionParametersTo6SRadiativeTerms filter);
• The zenithal viewing angle;
• The neighborhood window radius;
• The pixel spacing in kilometers.
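A hedged sketch of how these four inputs might be set follows. The setter names mirror the naming pattern used elsewhere in the chain but should be checked against the example source; viewingZenithalAngle, windowRadius and pixelSpacingInKilometers are hypothetical variables assumed to have been read from the command line or the image metadata:

// Radiometric information computed by 6S (see above).
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->SetAtmosphericRadiativeTerms(
  filterAtmosphericCorrectionParametersTo6SRadiativeTerms->GetOutput());
// Viewing geometry, neighborhood window size and ground sampling distance.
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->SetZenithalViewingAngle(
  viewingZenithalAngle); // in degrees
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->SetWindowRadius(windowRadius);
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->SetPixelSpacingInKilometers(
  pixelSpacingInKilometers);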
At this point, each filter of the chain is instantiated and has its input parameters set. A name can be given to the output image and the filters can be linked to each other to create the final processing chain.
writer ->SetFileName (argv[2]);
filterImageToLuminance ->SetInput (reader ->GetOutput ());
filterLuminanceToReflectance->SetInput(filterImageToLuminance ->GetOutput ());
filterReflectanceToSurfaceReflectanceImageFilter->SetInput(
filterLuminanceToReflectance->GetOutput ());
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->SetInput(
filterReflectanceToSurfaceReflectanceImageFilter->GetOutput ());
writer ->SetInput(
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->GetOutput ());
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
recommended to place this call in a try/catch block in case errors occur and exceptions are thrown.
try
  {
  writer->Update();
  }
catch (itk::ExceptionObject& excep)
  {
  std::cerr << "Exception caught !" << std::endl;
  std::cerr << excep << std::endl;
  }
CHAPTER THIRTEEN
IMAGE FUSION
Satellite sensors present an important diversity in terms of characteristics. Some provide a high spatial resolution while others focus on providing several spectral bands. The fusion process brings the information from different sensors with different characteristics together to get the best of both worlds.
Most of the fusion methods in the remote sensing community deal with the pansharpening tech-
nique. This fusion combines the image from the PANchromatic sensor of one satellite (high spatial
resolution data) with the multispectral (XS) data (lower resolution in several spectral bands) to gener-
ate images with a high resolution and several spectral bands. Several advantages make this situation
easier:
• PAN and XS images are taken simultaneously from the same satellite (or with a very short
delay);
• the imaged area is common to both scenes;
• many satellites provide these data (SPOT 1-5, Quickbird, Pleiades)
This case is well-studied in the literature and many methods exist. Only very few are available in
OTB now but this should evolve soon.
13.1 Simple Pan Sharpening
A simple way to view the pan-sharpening of data is to consider that, at the same resolution, the panchromatic channel is the sum of the XS channels. After putting the two images in the same geometry, after orthorectification (see chapter 11) with an oversampling of the XS image, we can proceed to the data fusion.
The idea is to apply a low pass filter to the panchromatic band to give it a spectral content (in the Fourier domain) equivalent to the XS data. Then we normalize the XS data with this low-pass panchromatic band and multiply the result with the original panchromatic band.
The process is described on figure 13.1.
Figure 13.1: Simple pan-sharpening procedure.
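Before looking at the dedicated OTB filter used in the example, the scheme of figure 13.1 can be sketched band per band with three basic ITK filters. This is only an illustration under simplifying assumptions (registered single-band inputs, hypothetical panReader and xsBandReader readers, an arbitrary low-pass radius), not the implementation used by OTB:

#include "otbImage.h"
#include "itkMeanImageFilter.h"
#include "itkDivideImageFilter.h"
#include "itkMultiplyImageFilter.h"

typedef otb::Image<double, 2>                                  BandType;
typedef itk::MeanImageFilter<BandType, BandType>               LowPassFilterType;
typedef itk::DivideImageFilter<BandType, BandType, BandType>   DivideFilterType;
typedef itk::MultiplyImageFilter<BandType, BandType, BandType> MultiplyFilterType;

LowPassFilterType::Pointer  lowPass  = LowPassFilterType::New();
DivideFilterType::Pointer   divide   = DivideFilterType::New();
MultiplyFilterType::Pointer multiply = MultiplyFilterType::New();

// Low-pass the PAN band so that its frequency content matches the XS band.
BandType::SizeType radius;
radius.Fill(3); // hypothetical radius, to be tuned to the resolution ratio
lowPass->SetInput(panReader->GetOutput());
lowPass->SetRadius(radius);

// Normalize one XS band by the low-passed PAN, then re-inject the PAN details.
divide->SetInput1(xsBandReader->GetOutput());
divide->SetInput2(lowPass->GetOutput());
multiply->SetInput1(divide->GetOutput());
multiply->SetInput2(panReader->GetOutput());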
The source code for this example can be found in the file
Examples/Fusion/PanSharpeningExample.cxx.
Here we are illustrating the use of the otb::SimpleRcsPanSharpeningFusionImageFilter for
PAN-sharpening. This example takes a PAN and the corresponding XS images as input. These
images are supposed to be registered.
The fusion operation is defined as

(XS / Filtered(PAN)) · PAN   (13.1)
Figure 13.2 shows the result of applying this PAN sharpening filter to a Quickbird image.
We start by including the required header and declaring the main function:
#include "otbImage.h"
#include "otbVectorImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbSimpleRcsPanSharpeningFusionImageFilter.h"
Figure 13.2: Result of applying the otb::SimpleRcsPanSharpeningFusionImageFilter to an orthorectified Quickbird image. From left to right: original PAN image, original XS image and the result of the PAN sharpening.
int main(int argc , char* argv[])
{
We declare the different image types used here as well as the image readers. Note that the reader for the PAN image is templated by an otb::Image while the XS reader uses an otb::VectorImage.
typedef otb::Image <double, 2> ImageType ;
typedef otb::VectorImage <double, 2> VectorImageType;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileReader <VectorImageType > ReaderVectorType;
typedef otb::VectorImage <unsigned short int, 2> VectorIntImageType;
ReaderVectorType::Pointer readerXS = ReaderVectorType::New();
ReaderType ::Pointer readerPAN = ReaderType ::New();
We pass the filenames to the readers
readerPAN ->SetFileName (argv[1]);
readerXS ->SetFileName (argv[2]);
We declare the fusion filter and set its inputs using the readers:
typedef otb:: SimpleRcsPanSharpeningFusionImageFilter
<ImageType , VectorImageType , VectorIntImageType > FusionFilterType;
FusionFilterType::Pointer fusion = FusionFilterType::New ();
fusion ->SetPanInput (readerPAN ->GetOutput ());
fusion ->SetXsInput (readerXS ->GetOutput ());
And finally, we declare the writer and call its Update() method to trigger the full pipeline execution.
typedef otb::ImageFileWriter <VectorIntImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (argv[3]);
writer ->SetInput(fusion ->GetOutput ());
writer ->Update();
13.2 Bayesian Data Fusion
The source code for this example can be found in the file
Examples/Fusion/BayesianFusionImageFilter.cxx.
The following example illustrates the use of the otb::BayesianFusionFilter. The Bayesian data
fusion relies on the idea that variables of interest, denoted as vector Z, cannot be directly observed.
They are linked to the observable variables Y through the following error-like model.
Y = g(Z)+ E (13.2)
where g(Z) is a set of functionals and E is a vector of random errors that are stochastically inde-
pendent from Z. This algorithm uses elementary probability calculus, and several assumptions to
compute the data fusion. For more details, see the publication by Fasbender, Radoux and Bogaert [42]. Three images and one parameter are used:
• a panchromatic image;
• a multispectral image resampled at the panchromatic image spatial resolution;
• a multispectral image resampled at the panchromatic image spatial resolution, using, e.g., a cubic interpolator;
• a float λ, the weight to be given to the panchromatic image compared to the multispectral one.
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the otb::BayesianFusionFilter class must be included.
#include "otbBayesianFusionFilter.h"
The image types are now defined using pixel types and particular dimension. The panchromatic
image is defined as an otb::Image and the multispectral one as otb::VectorImage.
typedef double InternalPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InternalPixelType , Dimension > PanchroImageType;
typedef otb::VectorImage <InternalPixelType , Dimension > MultiSpecImageType;
The Bayesian data fusion filter type is instantiated using the image types as template parameters.
typedef otb::BayesianFusionFilter <MultiSpecImageType ,
MultiSpecImageType ,
PanchroImageType ,
OutputImageType >
BayesianFusionFilterType;
Next the filter is created by invoking the New() method and assigning the result to a
itk::SmartPointer.
BayesianFusionFilterType::Pointer bayesianFilter =
BayesianFusionFilterType::New ();
Now the multi spectral image, the interpolated multi spectral image and the panchromatic image are
given as inputs to the filter.
bayesianFilter ->SetMultiSpect(multiSpectReader ->GetOutput ());
bayesianFilter ->SetMultiSpectInterp(multiSpectInterpReader ->GetOutput ());
bayesianFilter ->SetPanchro (panchroReader ->GetOutput ());
writer ->SetInput(bayesianFilter ->GetOutput ());
Figure 13.3: Input images used for this example (© European Space Imaging).
The BayesianFusionFilter requires defining one parameter: λ. The λ parameter can be used to tune the fusion toward either a high color consistency or sharp details. Typical λ values range in [0.5, 1[, where higher values yield sharper details. By default, λ is set to 0.9999.
bayesianFilter ->SetLambda (atof(argv[9]));
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
recommended to place update calls in a try/catch block in case errors occur and exceptions are
thrown.
try
{
writer ->Update ();
}
catch (itk::ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Let's now run this example using as input the images multiSpect.tif, multiSpectInterp.tif and panchro.tif provided in the directory Examples/Data. The results obtained for two different values of λ are shown in figure 13.4.
Figure 13.4: Fusion results for the Bayesian Data Fusion filter for λ = 0.5 on the left and λ = 0.9999 on the
right.
CHAPTER FOURTEEN
FEATURE EXTRACTION
Under the term Feature Extraction we include several techniques aiming to detect or extract information of a low level of abstraction from images. These features can be objects: points, lines, etc. They can also be measures: moments, textures, etc.
14.1 Textures
14.1.1 Haralick Descriptors
This example illustrates the use of the otb::ScalarImageToTexturesFilter, which computes the standard Haralick textural features [57] presented in table 14.1, where µ_t and σ_t are the mean and standard deviation of the row (or column, due to symmetry) sums, µ (the weighted pixel average) = Σ_{i,j} i · g(i, j) = Σ_{i,j} j · g(i, j) due to matrix symmetry, and σ (the weighted pixel variance) = Σ_{i,j} (i − µ)^2 · g(i, j) = Σ_{i,j} (j − µ)^2 · g(i, j) due to matrix symmetry.
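As a toy numerical illustration (standalone code, not part of the OTB example), the Energy feature f1 of table 14.1 can be computed by hand on a small normalized co-occurrence matrix:

#include <iostream>

int main()
{
  // Toy 2x2 normalized co-occurrence matrix (symmetric, entries sum to 1).
  const double g[2][2] = { { 0.4, 0.1 },
                           { 0.1, 0.4 } };
  double energy = 0.0;
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      energy += g[i][j] * g[i][j]; // f1 = sum of squared entries
  std::cout << "Energy f1 = " << energy << std::endl; // prints 0.34
  return 0;
}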
More features are available in otb::ScalarImageToAdvancedTexturesFilter. The following
classes provide similar functionality:
• otb::ScalarImageToAdvancedTexturesFilter
• otb::ScalarImageToPanTexTextureFilter
• otb::MaskedScalarImageToGreyLevelCooccurrenceMatrixGenerator
• itk::GreyLevelCooccurrenceMatrixTextureCoefficientsCalculator
• otb::GreyLevelCooccurrenceMatrixAdvancedTextureCoefficientsCalculator
Energy: f1 = Σ_{i,j} g(i, j)^2
Entropy: f2 = −Σ_{i,j} g(i, j) log2 g(i, j), or 0 if g(i, j) = 0
Correlation: f3 = Σ_{i,j} (i − µ)(j − µ) g(i, j) / σ^2
Difference Moment: f4 = Σ_{i,j} g(i, j) / (1 + (i − j)^2)
Inertia (a.k.a. Contrast): f5 = Σ_{i,j} (i − j)^2 g(i, j)
Cluster Shade: f6 = Σ_{i,j} ((i − µ) + (j − µ))^3 g(i, j)
Cluster Prominence: f7 = Σ_{i,j} ((i − µ) + (j − µ))^4 g(i, j)
Haralick's Correlation: f8 = (Σ_{i,j} i · j · g(i, j) − µ_t^2) / σ_t^2
Table 14.1: Haralick features [57] available in otb::ScalarImageToTexturesFilter
The source code for this example can be found in the file
Examples/FeatureExtraction/TextureExample.cxx.
The first step required to use the filter is to include the header file.
#include "otbScalarImageToTexturesFilter.h"
After defining the types for the pixels and the images used in the example, we define the types for
the textures filter. It is templated by the input and output image types.
typedef otb:: ScalarImageToTexturesFilter
<ImageType , ImageType > TexturesFilterType;
We can now instantiate the filters.
TexturesFilterType::Pointer texturesFilter
= TexturesFilterType::New();
The textures filter takes at least two parameters: the radius of the neighborhood on which the texture will be computed, and the offset used. Texture features are bivariate statistics, that is, they are computed using pairs of pixels. Each texture feature is defined for an offset defining the pixel pair.
The radius parameter can be passed to the filter as a scalar parameter if the neighborhood is square, or as a SizeType in any case.
The offset is always an array of N values, where N is the number of dimensions of the image.
typedef ImageType ::SizeType SizeType;
SizeType sradius;
sradius.Fill(radius);
texturesFilter ->SetRadius (sradius );
typedef ImageType ::OffsetType OffsetType ;
OffsetType offset;
offset[0] = xOffset;
offset[1] = yOffset;
texturesFilter ->SetOffset (offset);
The textures filter will automatically derive the optimal bin size for the co-occurrence histogram, but it needs to know the input image minimum and maximum. These values can be set like this:
texturesFilter ->SetInputImageMinimum(0);
texturesFilter ->SetInputImageMaximum(255);
To tune the co-occurrence histogram resolution, you can use the SetNumberOfBinsPerAxis() method.
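For instance (the value 16 below is only an illustration of the call, not a recommended setting):

// Use a 16 x 16 co-occurrence histogram.
texturesFilter->SetNumberOfBinsPerAxis(16);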
We can now plug the pipeline.
texturesFilter ->SetInput(reader ->GetOutput ());
writer ->SetInput(texturesFilter ->GetInertiaOutput());
writer ->Update();
Figure 14.1 shows the result of applying the contrast texture computation.
Figure 14.1: Result of applying the otb::ScalarImageToTexturesFilter to an image. From left to right :
original image, contrast.
14.1.2 PanTex
The source code for this example can be found in the file
Examples/FeatureExtraction/PanTexExample.cxx.
This example illustrates the use of the otb::ScalarImageToPanTexTextureFilter. This texture
parameter was first introduced in [110] and is very useful for urban area detection. The following
classes provide similar functionality:
• otb::ScalarImageToTexturesFilter
• otb::ScalarImageToAdvancedTexturesFilter
The first step required to use this filter is to include its header file.
#include "otbScalarImageToPanTexTextureFilter.h"
After defining the types for the pixels and the images used in the example, we define the type for the
PanTex filter. It is templated by the input and output image types.
typedef otb:: ScalarImageToPanTexTextureFilter
<ImageType , ImageType > PanTexTextureFilterType;
We can now instantiate the filter.
PanTexTextureFilterType::Pointer textureFilter = PanTexTextureFilterType::New ();
Figure 14.2: Result of applying the otb::ScalarImageToPanTexTextureFilter to an image. From left to
right: original image, PanTex feature.
Then, we set the parameters of the filter: the radius of the neighborhood on which the texture is computed, and the number of bins per axis for histogram generation (that is, the size of the co-occurrence matrix). Moreover, we have to specify the Min/Max of the input image. In the example, the image Min/Max is set by the user to 0 and 255. Alternatively, you can use the class itk::MinimumMaximumImageCalculator to compute these values (a sketch is given after the parameter code below).
PanTexTextureFilterType::SizeType sradius;
sradius.Fill(4);
textureFilter ->SetNumberOfBinsPerAxis(8);
textureFilter ->SetRadius (sradius );
textureFilter ->SetInputImageMinimum(0);
textureFilter ->SetInputImageMaximum(255);
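As mentioned above, the dynamic range can also be derived automatically with itk::MinimumMaximumImageCalculator. A hedged sketch (note that the reader must have been updated so that the pixel data is available before the statistics are computed):

#include "itkMinimumMaximumImageCalculator.h"

typedef itk::MinimumMaximumImageCalculator<ImageType> MinMaxCalculatorType;
MinMaxCalculatorType::Pointer minMaxCalculator = MinMaxCalculatorType::New();
reader->Update(); // make the pixel data available
minMaxCalculator->SetImage(reader->GetOutput());
minMaxCalculator->Compute();
textureFilter->SetInputImageMinimum(minMaxCalculator->GetMinimum());
textureFilter->SetInputImageMaximum(minMaxCalculator->GetMaximum());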
We can now plug the pipeline and trigger the execution by calling the Update method of the writer.
textureFilter ->SetInput(reader ->GetOutput ());
writer ->SetInput(textureFilter ->GetOutput ());
writer ->Update();
Figure 14.2 shows the result of applying the PanTex computation.
14.1.3 Structural Feature Set
The source code for this example can be found in the file
Examples/FeatureExtraction/SFSExample.cxx.
This example illustrates the use of the otb::SFSTexturesImageFilter. This filter computes the Structural Feature Set as described in [62]. These features are textural parameters which give information about the structure of lines passing through each pixel of the image.
The first step required to use this filter is to include its header file.
#include "otbSFSTexturesImageFilter.h"
As with every OTB program, we start by defining the types for the images, the readers and the
writers.
typedef otb::Image <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
Then we can instantiate the type for the SFS filter, which is templated over the input and output pixel types.
typedef otb::SFSTexturesImageFilter <ImageType , ImageType > SFSFilterType;
After that, we can instantiate the filter. We will also instantiate the reader and one writer for each
output image, since the SFS filter generates 6 different features.
SFSFilterType::Pointer filter = SFSFilterType::New();
ReaderType ::Pointer reader = ReaderType ::New();
WriterType ::Pointer writerLength = WriterType ::New();
WriterType ::Pointer writerWidth = WriterType ::New();
WriterType ::Pointer writerWMean = WriterType ::New();
WriterType ::Pointer writerRatio = WriterType ::New();
WriterType ::Pointer writerSD = WriterType ::New();
WriterType ::Pointer writerPsi = WriterType ::New();
The SFS filter has several parameters which have to be selected. They are:
1. a spectral threshold to decide if 2 neighboring pixels are connected;
2. a spatial threshold defining the maximum length for an extracted line;
3. the number of directions which will be analyzed (the first one is to the right and they are equally distributed between 0 and 2π);
4. the α parameter for the ω-mean feature;
5. the RatioMax parameter for the ω-mean feature.
filter ->SetSpectralThreshold(spectThresh );
filter ->SetSpatialThreshold(spatialThresh);
filter ->SetNumberOfDirections(dirNb);
filter ->SetRatioMaxConsiderationNumber(maxConsideration);
filter ->SetAlpha(alpha);
In order to disable the computation of a feature, the SetFeatureStatus() method can be used. The true value enables the feature (default behavior) and the false value disables the computation. Therefore, the following line is useless, but is given here as an example.
filter ->SetFeatureStatus(SFSFilterType::PSI, true);
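Conversely, an unwanted output can be switched off. For instance, assuming the LENGTH key is declared alongside PSI in the filter:

// Hedged sketch: disable the computation of the length feature.
filter->SetFeatureStatus(SFSFilterType::LENGTH, false);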
Now, we plug the pipeline using all the writers.
filter ->SetInput(reader ->GetOutput ());
writerLength ->SetFileName (outNameLength);
writerLength ->SetInput(filter ->GetLengthOutput());
writerLength ->Update();
writerWidth ->SetFileName (outNameWidth);
writerWidth ->SetInput (filter ->GetWidthOutput());
writerWidth ->Update();
writerWMean ->SetFileName (outNameWMean);
writerWMean ->SetInput (filter ->GetWMeanOutput());
writerWMean ->Update();
writerRatio ->SetFileName (outNameRatio);
writerRatio ->SetInput (filter ->GetRatioOutput());
writerRatio ->Update();
writerSD ->SetFileName (outNameSD );
writerSD ->SetInput(filter ->GetSDOutput ());
writerSD ->Update ();
writerPsi ->SetFileName (outNamePsi );
writerPsi ->SetInput (filter ->GetPSIOutput());
writerPsi ->Update ();
Figure 14.3 shows the result of applying the SFS computation to an image.
14.2 Interest Points
14.2.1 Harris detector
The source code for this example can be found in the file
Examples/FeatureExtraction/HarrisExample.cxx.
This example illustrates the use of the otb::HarrisImageFilter.
The first step required to use this filter is to include its header file.
#include "otbHarrisImageFilter.h"
The otb::HarrisImageFilter is templated over the input and output image types, so we start by
defining:
typedef otb::HarrisImageFilter <InputImageType ,
InputImageType > HarrisFilterType;
Figure 14.3: Result of applying the otb::SFSTexturesImageFilter to an image. From left to right and top to bottom: original image, length, width, ω-mean, ratio, SD and Psi structural features.
Figure 14.4: Result of applying the otb::HarrisImageFilter to a Spot 5 image.
The otb::HarrisImageFilter needs some parameters to operate. The derivative computation is
performed by a convolution with the derivative of a Gaussian kernel of variance σD (derivation scale)
and the smoothing of the image is performed by convolving with a Gaussian kernel of variance σI
(integration scale). This allows the computation of the following matrix:
µ(x, σ_I, σ_D) = σ_D^2 g(σ_I) ⋆ [ L_x^2(x, σ_D)      L_x L_y(x, σ_D)
                                   L_x L_y(x, σ_D)    L_y^2(x, σ_D) ]   (14.1)

The output of the detector is

det(µ) − α trace^2(µ).
harris ->SetSigmaD (SigmaD);
harris ->SetSigmaI (SigmaI);
harris ->SetAlpha(Alpha);
Figure 14.4 shows the result of applying the interest point detector to a small patch extracted from a
Spot 5 image.
The output of the otb::HarrisImageFilter is an image where, for each pixel, we obtain the
intensity of the detection. Often, the user may want to get access to the set of points for which
the output of the detector is higher than a given threshold. This can be obtained by using the
otb::HarrisImageToPointSetFilter. This filter is only templated over the input image type,
the output being a itk::PointSet with pixel type equal to the image pixel type.
typedef otb::HarrisImageToPointSetFilter<InputImageType > FunctionType;
We declare now the filter and a pointer to the output point set.
typedef FunctionType::OutputPointSetType OutputPointSetType;
FunctionType::Pointer harrisPoints = FunctionType::New ();
OutputPointSetType::Pointer pointSet = OutputPointSetType::New ();
The otb::HarrisImageToPointSetFilter takes the same parameters as the otb::HarrisImageFilter and an additional parameter: the threshold for the point selection.
harrisPoints ->SetInput (0, reader ->GetOutput ());
harrisPoints ->SetSigmaD (SigmaD);
harrisPoints ->SetSigmaI (SigmaI);
harrisPoints ->SetAlpha(Alpha);
harrisPoints ->SetLowerThreshold(10);
pointSet = harrisPoints ->GetOutput ();
We can now iterate through the obtained pointset and access the coordinates of the points. We start
by accessing the container of the points which is encapsulated into the point set (see section 5.2 for
more information on using itk::PointSets) and declaring an iterator to it.
typedef OutputPointSetType::PointsContainer ContainerType;
ContainerType* pointsContainer = pointSet ->GetPoints ();
typedef ContainerType::Iterator IteratorType;
IteratorType itList = pointsContainer ->Begin();
And we get the coordinates of the points:
while (itList != pointsContainer ->End ())
{
typedef OutputPointSetType:: PointType OutputPointType;
OutputPointType pCoordinate = (itList.Value());
std::cout << pCoordinate << std::endl;
++itList;
}
14.2.2 SIFT detector
The source code for this example can be found in the file
Examples/Patented/SIFTFastExample.cxx.
This example illustrates the use of the otb::SiftFastImageFilter. The Scale-Invariant Feature
Transform (or SIFT) is an algorithm in computer vision to detect and describe local features in
images. The algorithm was published by David Lowe [91]. The detection and description of local
image features can help in object recognition and image registration. The SIFT features are local and
based on the appearance of the object at particular interest points, and are invariant to image scale
and rotation. They are also robust to changes in illumination, noise, occlusion and minor changes in
viewpoint.
The first step required to use this filter is to include its header file.
#include "otbSiftFastImageFilter.h"
We will start by defining the required types. We will work with a scalar image of float pixels. We
also define the corresponding image reader.
typedef float RealType;
typedef otb::Image <RealType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
The SIFT descriptors will be stored in a point set containing the vector of features.
typedef itk::VariableLengthVector <RealType > RealVectorType;
typedef itk::PointSet <RealVectorType , Dimension > PointSetType;
The SIFT filter itself is templated over the input image and the generated point set.
typedef otb::SiftFastImageFilter <ImageType , PointSetType >
ImageToFastSIFTKeyPointSetFilterType;
We instantiate the reader.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (infname );
We instantiate the filter.
ImageToFastSIFTKeyPointSetFilterType::Pointer filter =
ImageToFastSIFTKeyPointSetFilterType::New ();
We plug the filter and set the number of scales for the SIFT computation. We can afterwards run the
processing with the Update() method.
filter ->SetInput(reader ->GetOutput ());
filter ->SetScalesNumber(scales);
filter ->Update();
Once the SIFT are computed, we may want to draw them on top of the input image. In order to do
this, we will create the following RGB image and the corresponding writer:
typedef unsigned char PixelType ;
typedef itk::RGBPixel <PixelType > RGBPixelType;
typedef otb::Image <RGBPixelType , 2> OutputImageType;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
OutputImageType::Pointer outputImage = OutputImageType::New();
We set the regions of the image by copying the information from the input image and we allocate
the memory for the output image.
outputImage ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion());
outputImage ->Allocate ();
We can now proceed to copy the input image into the output one using region iterators. The input
image is a grey level one. The output image will be made of color crosses for each SIFT on top of
the grey level input image. So we start by copying the grey level values on each of the 3 channels of
the color image.
itk::ImageRegionIterator<OutputImageType> iterOutput(
  outputImage, outputImage->GetLargestPossibleRegion());
itk::ImageRegionIterator<ImageType> iterInput(
  reader->GetOutput(), reader->GetOutput()->GetLargestPossibleRegion());
for (iterOutput .GoToBegin (), iterInput .GoToBegin ();
!iterOutput .IsAtEnd ();
++iterOutput , ++iterInput )
{
OutputImageType::PixelType rgbPixel;
rgbPixel.SetRed(static_cast<PixelType >(iterInput .Get ()));
rgbPixel.SetGreen (static_cast<PixelType >(iterInput .Get ()));
rgbPixel.SetBlue(static_cast<PixelType >(iterInput .Get ()));
iterOutput .Set(rgbPixel );
}
We are now going to plot color crosses on the output image. We will need to define offsets (top,
bottom, left and right) with respect to the SIFT position in order to draw the cross segments.
ImageType ::OffsetType t = {{ 0, 1}};
ImageType ::OffsetType b = {{ 0, -1}};
ImageType ::OffsetType l = {{ 1, 0}};
ImageType ::OffsetType r = {{-1, 0}};
Now, we are going to access the point set generated by the SIFT filter. The points are stored into a
points container that we are going to walk through using an iterator. These are the types needed for
this task:
typedef PointSetType::PointsContainer PointsContainerType;
typedef PointsContainerType::Iterator PointsIteratorType;
We set the iterator to the beginning of the point set.
PointsIteratorType pIt = filter ->GetOutput ()->GetPoints ()->Begin();
We get the information about image size and spacing before drawing the crosses.
ImageType ::SpacingType spacing = reader ->GetOutput ()->GetSpacing ();
ImageType ::PointType origin = reader ->GetOutput ()->GetOrigin ();
And we iterate through the SIFT set:
while (pIt != filter ->GetOutput ()->GetPoints ()->End ())
{
We get the pixel coordinates for each SIFT by using the Value() method on the point set iterator.
We use the information about size and spacing in order to convert the physical coordinates of the
point into pixel coordinates.
ImageType::IndexType index;
index[0] = static_cast<unsigned int>(vcl_floor(
             static_cast<double>((pIt.Value()[0] - origin[0]) / spacing[0] + 0.5)));
index[1] = static_cast<unsigned int>(vcl_floor(
             static_cast<double>((pIt.Value()[1] - origin[1]) / spacing[1] + 0.5)));
We create a green pixel.
OutputImageType::PixelType keyPixel;
keyPixel.SetRed (0);
keyPixel.SetGreen (255);
keyPixel.SetBlue (0);
We draw the crosses using the offsets and checking that we are inside the image, since SIFTs on the
image borders would cause an out of bounds pixel access.
if (outputImage->GetLargestPossibleRegion().IsInside(index))
  {
  outputImage->SetPixel(index, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + t))
    outputImage->SetPixel(index + t, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + b))
    outputImage->SetPixel(index + b, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + l))
    outputImage->SetPixel(index + l, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + r))
    outputImage->SetPixel(index + r, keyPixel);
  }
++pIt;
}
Figure 14.5: Result of applying the otb::SiftFastImageFilter to a Spot 5 image.
Finally, we write the image.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputImageFilename);
writer ->SetInput(outputImage );
writer ->Update();
Figure 14.5 shows the result of applying the SIFT point detector to a small patch extracted from a
Spot 5 image.
14.2.3 SURF detector
The source code for this example can be found in the file
Examples/FeatureExtraction/SURFExample.cxx.
This example illustrates the use of the otb::ImageToSURFKeyPointSetFilter. The Speeded-Up Robust Features (or SURF) is an algorithm in computer vision to detect and describe local features in images. The algorithm is detailed in [11]. The applications of SURF are the same as those for SIFT.
The first step required to use this filter is to include its header file.
#include "otbImageToSURFKeyPointSetFilter.h"
We will start by defining the required types. We will work with a scalar image of float pixels. We
also define the corresponding image reader.
typedef float RealType ;
typedef otb::Image <RealType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
The SURF descriptors will be stored in a point set containing the vector of features.
typedef itk::VariableLengthVector <RealType > RealVectorType;
typedef itk::PointSet <RealVectorType , Dimension > PointSetType;
The SURF filter itself is templated over the input image and the generated point set.
typedef otb::ImageToSURFKeyPointSetFilter<ImageType , PointSetType >
ImageToFastSURFKeyPointSetFilterType;
We instantiate the reader.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (infname );
We instantiate the filter.
ImageToFastSURFKeyPointSetFilterType::Pointer filter =
ImageToFastSURFKeyPointSetFilterType::New ();
We plug the filter and set the number of scales for the SURF computation. We can afterwards run
the processing with the Update() method.
filter ->SetInput(reader ->GetOutput ());
filter ->SetOctavesNumber(octaves );
filter ->SetScalesNumber(scales);
filter ->Update();
Once the SURF are computed, we may want to draw them on top of the input image. In order to do
this, we will create the following RGB image and the corresponding writer:
typedef unsigned char PixelType ;
typedef itk::RGBPixel <PixelType > RGBPixelType;
typedef otb::Image <RGBPixelType , 2> OutputImageType;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
OutputImageType::Pointer outputImage = OutputImageType::New();
We set the regions of the image by copying the information from the input image and we allocate
the memory for the output image.
outputImage ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion());
outputImage ->Allocate ();
We can now proceed to copy the input image into the output one using region iterators. The input
image is a grey level one. The output image will be made of color crosses for each SURF on top of
the grey level input image. So we start by copying the grey level values on each of the 3 channels of
the color image.
itk::ImageRegionIterator<OutputImageType> iterOutput(
  outputImage, outputImage->GetLargestPossibleRegion());
itk::ImageRegionIterator<ImageType> iterInput(
  reader->GetOutput(), reader->GetOutput()->GetLargestPossibleRegion());
for (iterOutput .GoToBegin (), iterInput .GoToBegin ();
!iterOutput .IsAtEnd ();
++iterOutput , ++iterInput )
{
OutputImageType::PixelType rgbPixel;
rgbPixel.SetRed(static_cast<PixelType >(iterInput .Get ()));
rgbPixel.SetGreen (static_cast<PixelType >(iterInput .Get ()));
rgbPixel.SetBlue(static_cast<PixelType >(iterInput .Get ()));
iterOutput .Set(rgbPixel );
}
We are now going to plot color crosses on the output image. We will need to define offsets (top,
bottom, left and right) with respect to the SURF position in order to draw the cross segments.
ImageType ::OffsetType t = {{ 0, 1}};
ImageType ::OffsetType b = {{ 0, -1}};
ImageType ::OffsetType l = {{ 1, 0}};
ImageType ::OffsetType r = {{-1, 0}};
Now, we are going to access the point set generated by the SURF filter. The points are stored into a
points container that we are going to walk through using an iterator. These are the types needed for
this task:
typedef PointSetType::PointsContainer PointsContainerType;
typedef PointsContainerType::Iterator PointsIteratorType;
We set the iterator to the beginning of the point set.
PointsIteratorType pIt = filter ->GetOutput ()->GetPoints ()->Begin();
We get the information about image size and spacing before drawing the crosses.
ImageType :: SpacingType spacing = reader ->GetOutput ()->GetSpacing ();
ImageType :: PointType origin = reader ->GetOutput ()->GetOrigin ();
//OutputImageType::SizeType size = outputImage->GetLargestPossibleRegion().GetSize();
And we iterate through the SURF set:
while (pIt != filter ->GetOutput ()->GetPoints ()->End ())
{
We get the pixel coordinates for each SURF point by using the Value() method on the point set iterator.
We use the information about size and spacing in order to convert the physical coordinates of the
point into pixel coordinates.
ImageType::IndexType index;
index[0] = static_cast<unsigned int>(vcl_floor(
    static_cast<double>((pIt.Value()[0] - origin[0]) / spacing[0] + 0.5)));
index[1] = static_cast<unsigned int>(vcl_floor(
    static_cast<double>((pIt.Value()[1] - origin[1]) / spacing[1] + 0.5)));
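Note that, since otb::Image derives from itk::Image, the same conversion can also be delegated to
the image itself, which additionally takes the image direction into account. This is only a side note,
not part of the original example; a minimal sketch would be:
ImageType::PointType physicalPoint;
physicalPoint[0] = pIt.Value()[0];
physicalPoint[1] = pIt.Value()[1];
// TransformPhysicalPointToIndex() returns false if the point falls outside the image
reader->GetOutput()->TransformPhysicalPointToIndex(physicalPoint, index);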
We create a green pixel.
OutputImageType::PixelType keyPixel;
keyPixel.SetRed (0);
keyPixel.SetGreen (255);
keyPixel.SetBlue (0);
We draw the crosses using the offsets and checking that we are inside the image, since SURF points
on the image borders would cause an out of bounds pixel access.
if (outputImage->GetLargestPossibleRegion().IsInside(index))
  {
  outputImage->SetPixel(index, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + t))
    outputImage->SetPixel(index + t, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + b))
    outputImage->SetPixel(index + b, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + l))
    outputImage->SetPixel(index + l, keyPixel);
  if (outputImage->GetLargestPossibleRegion().IsInside(index + r))
    outputImage->SetPixel(index + r, keyPixel);
  }
++pIt;
}
Figure 14.6: Result of applying the otb::ImageToSURFKeyPointSetFilter to a Spot 5 image.
Finally, we write the image.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputImageFilename);
writer ->SetInput(outputImage );
writer ->Update();
Figure 14.6 shows the result of applying the SURF point detector to a small patch extracted from a
Spot 5 image.
14.3 Alignments
The source code for this example can be found in the file
Examples/FeatureExtraction/AlignmentsExample.cxx.
This example illustrates the use of the otb::ImageToPathListAlignFilter. This filter allows
extracting meaningful alignments. Alignments (that is, edges and lines) are detected using the Gestalt
approach proposed by Desolneux et al. [40]. In this context, an event is considered meaningful if
the expectation of its occurrence would be very small in a random image. One can thus consider
that in a random image the direction of the gradient of a given point is uniformly distributed, and
that neighbouring pixels have a very low probability of having the same gradient direction. This
algorithm gives a set of straight line segments defined by the coordinates of their two extremities, in
the form of a std::list of itk::PolyLineParametricPath.
The first step required to use this filter is to include its header.
#include "otbImageToPathListAlignFilter.h"
In order to visualize the detected alignments, we will use the facility class otb::DrawPathFilter,
which draws an itk::PolyLineParametricPath on top of a given image.
#include "itkPolyLineParametricPath.h"
#include "otbDrawPathFilter.h"
The otb::ImageToPathListAlignFilter is templated over the input image type and the output
path type, so we start by defining:
typedef itk::PolyLineParametricPath <Dimension > PathType ;
typedef otb::ImageToPathListAlignFilter<InputImageType , PathType >
ListAlignFilterType;
Next, we build the pipeline.
ListAlignFilterType::Pointer alignFilter = ListAlignFilterType::New();
alignFilter ->SetInput (reader ->GetOutput ());
We can choose the number of accepted false alarms in the detection with the method SetEps() for
which the parameter is of the form −log10(max. number of false alarms).
alignFilter ->SetEps(atoi(argv[3]));
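Since the parameter is −log10 of the maximum number of false alarms, the following illustrative
values (assumptions, not taken from the example's command line) correspond to tolerating at most 1
and at most 0.01 false alarms respectively:
alignFilter->SetEps(0);  // -log10(1) = 0: at most one meaningful false alarm
// alignFilter->SetEps(2);  // -log10(0.01) = 2: a much stricter detection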
As stated above, the otb::DrawPathFilter is useful for drawing the detected alignments. This
class is templated over the input image and path types and also over the output image type.
typedef otb::DrawPathFilter <InputImageType , PathType ,
OutputImageType > DrawPathFilterType;
We will now go through the list of detected paths and feed them to the otb::DrawPathFilter
inside a loop. We will use a list iterator inside a while statement.
typedef ListAlignFilterType::OutputPathListType ListType;
ListType * pathList = alignFilter ->GetOutput ();
ListType ::Iterator listIt = pathList ->Begin();
We define a dummy image that will be iteratively fed to the otb::DrawPathFilter after the
drawing of each alignment.
InputImageType::Pointer backgroundImage = reader ->GetOutput ();
Figure 14.7: Result of applying the otb::ImageToPathListAlignFilter to a VHR image of a suburb.
We iterate through the list and write the result to a file.
while (listIt != pathList ->End ())
{
DrawPathFilterType::Pointer drawPathFilter = DrawPathFilterType::New ();
drawPathFilter ->SetImageInput(backgroundImage);
drawPathFilter ->SetInputPath(listIt.Get ());
drawPathFilter ->SetValue (itk::NumericTraits <OutputPixelType >:: max());
drawPathFilter ->Update ();
backgroundImage = drawPathFilter ->GetOutput ();
++listIt;
}
writer ->SetInput(backgroundImage);
Figure 14.7 shows the result of applying the alignment detection to a small patch extracted from a
VHR image.
14.4 Lines
14.4.1 Line Detection
The source code for this example can be found in the file
Examples/FeatureExtraction/RatioLineDetectorExample.cxx.
This example illustrates the use of the otb::RatioLineDetectorImageFilter. This filter is used
for line detection in SAR images. Its principle is described in [134]: a line is detected if two parallel
edges are present in the images. These edges are detected with the ratio of means detector.
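To make the principle concrete, the ratio of means response for a candidate edge can be sketched as
follows. This is an illustrative helper, not taken from the filter's implementation; mu1 and mu2 are
assumed to be the mean amplitudes computed on the two lateral regions of the candidate line:
#include <algorithm>

// Ratio-of-means edge response: close to 1 when one region is much brighter
// than the other, close to 0 over homogeneous areas.
double RatioOfMeansResponse(double mu1, double mu2)
{
  if (mu1 == 0.0 || mu2 == 0.0)
    return 1.0;
  return 1.0 - std::min(mu1 / mu2, mu2 / mu1);
}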
The first step required to use this filter is to include its header file.
#include "otbLineRatioDetectorImageFilter.h"
Then we must decide what pixel type to use for the image. We choose to make all computations with
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef float InternalPixelType;
typedef unsigned char OutputPixelType;
The images are defined using the pixel type and the dimension.
typedef otb::Image <InternalPixelType , 2> InternalImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter can be instantiated using the image types defined above.
typedef otb:: LineRatioDetectorImageFilter
<InternalImageType , InternalImageType > FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file.
typedef otb::ImageFileReader <InternalImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The intensity rescaling of the results will be carried out by the
itk::RescaleIntensityImageFilter which is templated by the input and output image
types.
typedef itk::RescaleIntensityImageFilter<InternalImageType ,
OutputImageType > RescalerType;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The same is done for the rescaler and the writer.
RescalerType::Pointer rescaler = RescalerType::New ();
WriterType ::Pointer writer = WriterType ::New();
The itk::RescaleIntensityImageFilter needs to know the minimum and maximum values
of the generated output image. Those can be chosen in a generic way by using the
NumericTraits functions, since they are templated over the pixel type.
Figure 14.8: Result of applying the otb::LineRatioDetectorImageFilter to a SAR image. From left to
right: original image, line intensity and edge orientation.
rescaler ->SetOutputMinimum(itk::NumericTraits <OutputPixelType >:: min());
rescaler ->SetOutputMaximum(itk::NumericTraits <OutputPixelType >:: max());
The image obtained with the reader is passed as input to the
otb::LineRatioDetectorImageFilter. The pipeline is built as follows.
filter ->SetInput(reader ->GetOutput ());
rescaler ->SetInput (filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
The methods SetLengthLine() and SetWidthLine() allow setting the minimum length and the
typical width of the lines to be detected.
filter ->SetLengthLine(atoi(argv[4]));
filter ->SetWidthLine(atoi(argv[5]));
The filter is executed by invoking the Update() method. If the filter is part of a larger image
processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.
filter ->Update();
We can also obtain the direction of the lines by invoking the GetOutputDirection() method.
rescaler ->SetInput (filter ->GetOutputDirection());
writer ->SetInput(rescaler ->GetOutput ());
writer ->Update();
Figure 14.8 shows the result of applying the LineRatio edge detector filter to a SAR image.
The following classes provide similar functionality:
• otb::LineCorrelationDetectorImageFilter
The source code for this example can be found in the file
Examples/FeatureExtraction/CorrelationLineDetectorExample.cxx.
This example illustrates the use of the otb::LineCorrelationDetectorImageFilter. This filter
is used for line detection in SAR images. Its principle is described in [134]: a line is detected if
two parallel edges are present in the images. These edges are detected with the correlation of means
detector.
The first step required to use this filter is to include its header file.
#include "otbLineCorrelationDetectorImageFilter.h"
Then we must decide what pixel type to use for the image. We choose to make all computations with
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef float InternalPixelType;
typedef unsigned char OutputPixelType;
The images are defined using the pixel type and the dimension.
typedef otb::Image <InternalPixelType , 2> InternalImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter can be instantiated using the image types defined above.
typedef otb::LineCorrelationDetectorImageFilter<InternalImageType ,
InternalImageType >
FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file.
typedef otb::ImageFileReader <InternalImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The intensity rescaling of the results will be carried out by the
itk::RescaleIntensityImageFilter which is templated by the input and output image
types.
typedef itk::RescaleIntensityImageFilter<InternalImageType ,
OutputImageType > RescalerType;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The same is done for the rescaler and the writer.
RescalerType::Pointer rescaler = RescalerType::New ();
WriterType ::Pointer writer = WriterType ::New();
The itk::RescaleIntensityImageFilter needs to know the minimum and maximum values
of the generated output image. Those can be chosen in a generic way by using the
NumericTraits functions, since they are templated over the pixel type.
rescaler ->SetOutputMinimum(itk::NumericTraits <OutputPixelType >:: min());
rescaler ->SetOutputMaximum(itk::NumericTraits <OutputPixelType >:: max());
The image obtained with the reader is passed as input to the
otb::LineCorrelationDetectorImageFilter. The pipeline is built as follows.
filter ->SetInput(reader ->GetOutput ());
rescaler ->SetInput (filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
The methods SetLengthLine() and SetWidthLine() allow setting the minimum length and the
typical width of the lines to be detected.
filter ->SetLengthLine(atoi(argv[4]));
filter ->SetWidthLine(atoi(argv[5]));
The filter is executed by invoking the Update() method. If the filter is part of a larger image
processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.
filter ->Update();
We can also obtain the direction of the lines by invoking the GetOutputDirection() method.
rescaler ->SetInput (filter ->GetOutputDirection());
writer ->SetInput(rescaler ->GetOutput ());
writer ->Update();
Figure 14.9 shows the result of applying the LineCorrelation edge detector filter to a SAR image.
The following classes provide similar functionality:
• otb::LineRatioDetectorImageFilter
The source code for this example can be found in the file
Examples/FeatureExtraction/AssymmetricFusionOfLineDetectorExample.cxx.
This example illustrates the use of the otb::AssymmetricFusionOfLineDetectorImageFilter.
The first step required to use this filter is to include its header file.
#include "otbAssymmetricFusionOfLineDetectorImageFilter.h"
Figure 14.9: Result of applying the otb::LineCorrelationDetectorImageFilter to a SAR image. From
left to right: original image, line intensity and edge orientation.
Then we must decide what pixel type to use for the image. We choose to make all computations with
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef float InternalPixelType;
typedef unsigned char OutputPixelType;
The images are defined using the pixel type and the dimension.
typedef otb::Image <InternalPixelType , 2> InternalImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter can be instantiated using the image types defined above.
typedef otb:: AssymmetricFusionOfLineDetectorImageFilter<InternalImageType ,
InternalImageType >
FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file.
typedef otb::ImageFileReader <InternalImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The intensity rescaling of the results will be carried out by the
itk::RescaleIntensityImageFilter which is templated by the input and output image
types.
typedef itk::RescaleIntensityImageFilter<InternalImageType ,
OutputImageType > RescalerType;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The same is done for the rescaler and the writer.
RescalerType::Pointer rescaler = RescalerType::New ();
WriterType ::Pointer writer = WriterType ::New();
The itk::RescaleIntensityImageFilter needs to know the minimum and maximum values
of the generated output image. Those can be chosen in a generic way by using the
NumericTraits functions, since they are templated over the pixel type.
rescaler ->SetOutputMinimum(itk::NumericTraits <OutputPixelType >:: min());
rescaler ->SetOutputMaximum(itk::NumericTraits <OutputPixelType >:: max());
The image obtained with the reader is passed as input to the
otb::AssymmetricFusionOfLineDetectorImageFilter. The pipeline is built as follows.
filter ->SetInput(reader ->GetOutput ());
rescaler ->SetInput (filter ->GetOutput ());
writer ->SetInput(rescaler ->GetOutput ());
The methods SetLengthLine() and SetWidthLine() allow setting the minimum length and the
typical width of the lines to be detected.
filter ->SetLengthLine(atoi(argv[3]));
filter ->SetWidthLine(atoi(argv[4]));
The filter is executed by invoking the Update() method. If the filter is part of a larger image
processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.
filter ->Update();
Figure 14.10 shows the result of applying the otb::AssymmetricFusionOfLineDetectorImageFilter
to a SAR image.
The source code for this example can be found in the file
Examples/FeatureExtraction/ParallelLineDetectionExample.cxx.
This example illustrates the details of the otb::ParallelLinePathListFilter.
14.4.2 Segment Extraction
Local Hough Transform
The source code for this example can be found in the file
Examples/FeatureExtraction/LocalHoughExample.cxx.
Figure 14.10: Result of applying the otb::AssymmetricFusionOfLineDetectorImageFilter to a SAR image.
From left to right: original image, line intensity.
This example illustrates the use of the otb::LocalHoughFilter.
The first step required to use this filter is to include its header file.
#include "otbLocalHoughFilter.h"
#include "otbDrawLineSpatialObjectListFilter.h"
#include "otbLineSpatialObjectList.h"
Then we must decide what pixel type to use for the image. We choose to make all computations with
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef float InternalPixelType;
typedef unsigned char OutputPixelType;
The images are defined using the pixel type and the dimension.
typedef otb::Image <InternalPixelType , 2> InternalImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter can be instantiated using the image types defined above.
typedef otb::LocalHoughFilter <InternalImageType > LocalHoughType;
typedef otb::DrawLineSpatialObjectListFilter<InternalImageType ,
OutputImageType >
DrawLineListType;
An otb::ImageFileReader class is also instantiated in order to read image data from a file.
typedef otb::ImageFileReader <InternalImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to SmartPointers.
Figure 14.11: Result of applying the otb::LocalHoughFilter. From left to right: original image,
extracted segments.
ReaderType ::Pointer reader = ReaderType ::New ();
LocalHoughType::Pointer localHough = LocalHoughType::New ();
DrawLineListType::Pointer drawLineList = DrawLineListType::New ();
The same is done for the writer.
WriterType ::Pointer writer = WriterType ::New ();
The image obtained with the reader is passed as input to the otb::LocalHoughFilter.
The pipeline is built as follows.
localHough ->SetInput (reader ->GetOutput ());
drawLineList ->SetInput(reader ->GetOutput ());
drawLineList ->SetInputLineSpatialObjectList(localHough ->GetOutput ());
writer ->SetFileName (argv[2]);
writer ->SetInput(drawLineList ->GetOutput ());
writer ->Update();
Figure 14.11 shows the result of applying the otb::LocalHoughFilter.
Line Segment Detector
The source code for this example can be found in the file
Examples/FeatureExtraction/LineSegmentDetectorExample.cxx.
This example illustrates the use of the otb::LineSegmentDetector[139], also known as Lucy in
the Sky with Diamonds. This filter is designed to extract segments in mono channel images.
The first step required to use this filter is to include its header file.
#include "otbLineSegmentDetector.h"
As usual, we start by defining the types for the input image and the image file reader.
typedef otb::Image <InputPixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
We instantiate the reader and set the file name for the input image.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (infname );
reader ->GenerateOutputInformation();
We now define the type for the segment detector filter. It is templated over the input image type and
the precision with which the coordinates of the detected segments will be given. It is recommended
to set this precision to a real type. The output of the filter will be an otb::VectorData.
typedef otb::LineSegmentDetector <ImageType ,
PrecisionType > LsdFilterType;
LsdFilterType::Pointer lsdFilter = LsdFilterType::New();
In order to be able to display the results, we will draw the detected segments on top of the
input image. For this matter, we will use an otb::VectorDataToMapFilter, which is templated
over the input vector data type and the output image type, and a combination of an
itk::BinaryFunctorImageFilter and the otb::Functor::AlphaBlendingFunctor.
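The role of the blending functor is to overlay the rendered segments on the original image with a
given transparency. The following sketch shows the weighting we assume it applies (the actual functor
implementation may differ):
// Assumed alpha blending rule: with alpha = 0.25 the rendered segments are
// overlaid on the original image with 25% opacity.
InputPixelType BlendPixels(InputPixelType imagePixel,
                           InputPixelType segmentPixel,
                           double alpha)
{
  return static_cast<InputPixelType>((1.0 - alpha) * imagePixel
                                     + alpha * segmentPixel);
}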
typedef otb::VectorData <PrecisionType > VectorDataType;
typedef otb::VectorDataToMapFilter <VectorDataType ,
ImageType > VectorDataRendererType;
VectorDataRendererType::Pointer vectorDataRenderer = VectorDataRendererType::New();
typedef otb::Functor ::AlphaBlendingFunctor <InputPixelType ,
InputPixelType , InputPixelType > FunctorType ;
typedef itk::BinaryFunctorImageFilter<ImageType , ImageType ,
ImageType , FunctorType > BlendingFilterType;
BlendingFilterType::Pointer blendingFilter = BlendingFilterType::New();
We can now define the type for the writer, instantiate it and set the file name for the output image.
typedef otb::ImageFileWriter <ImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outfname );
We plug the pipeline.
lsdFilter ->SetInput (reader ->GetOutput ());
vectorDataRenderer ->SetInput (lsdFilter ->GetOutput ());
vectorDataRenderer ->SetSize(reader ->GetOutput ()->GetLargestPossibleRegion(). GetSize ());
vectorDataRenderer ->SetRenderingStyleType(VectorDataRendererType::Binary);
blendingFilter ->SetInput1 (reader ->GetOutput ());
blendingFilter ->SetInput2 (vectorDataRenderer ->GetOutput ());
blendingFilter->GetFunctor().SetAlpha(0.25);
writer->SetInput(blendingFilter->GetOutput());
Figure 14.12: Result of applying the otb::LineSegmentDetector to an image.
Before calling the Update() method of the writer in order to trigger the pipeline execution, we call
the GenerateOutputInformation() method of the reader, so that the LSD filter gets the information
about image size and spacing.
reader ->GenerateOutputInformation();
writer ->Update();
Figure 14.12 shows the result of applying the line segment detection to an image.
14.4.3 Right Angle Detector
The source code for this example can be found in the file
Examples/FeatureExtraction/RightAngleDetectionExample.cxx.
This example illustrates the use of the otb::VectorDataToRightAngleVectorDataFilter. This
filter detects the right angles in an image by exploiting the output of a line detection algorithm.
Typically the otb::LineSegmentDetector class will be used. The right angle detection algorithm
is described in detail in [1].
The first step required to use this filter is to include its header file.
#include "otbVectorDataToRightAngleVectorDataFilter.h"
After defining, as usual, the types for the input image and the image reader, we define the specific
types needed for this example. First of all, we will use a vector data to store the detected lines which
will be provided by the line segment detector.
typedef otb::VectorData <PrecisionType > VectorDataType;
The right angle detector's output is a vector data where each point gives the coordinates of the
detected angle.
Next, we define the type for the line segment detector. A detailed example for this detector can be
found in section 14.4.2.
typedef otb::LineSegmentDetector <ImageType , PrecisionType > LsdFilterType;
We can finally define the type for the right angle detection filter. This filter is templated over the
input vector data type provided by the line segment detector.
typedef otb::VectorDataToRightAngleVectorDataFilter<VectorDataType >
RightAngleFilterType;
We instantiate the line segment detector and the right angle detector.
LsdFilterType::Pointer lsdFilter = LsdFilterType::New();
RightAngleFilterType::Pointer rightAngleFilter = RightAngleFilterType::New ();
We plug the pipeline.
lsdFilter ->SetInput (reader ->GetOutput ());
rightAngleFilter ->SetInput (lsdFilter ->GetOutput ());
You can choose how far apart the segments of a right angle are allowed to be, and the tolerance for
considering the angle between two segments as a right one.
rightAngleFilter ->SetAngleThreshold(angleThreshold);
rightAngleFilter ->SetDistanceThreshold(distanceThreshold);
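As an illustration only (these values are assumptions, not taken from the example's command line),
one could accept angles within about 0.1 radian of π/2 and segment pairs closer than 20 units:
rightAngleFilter->SetAngleThreshold(0.1);    // tolerance around pi/2, in radians
rightAngleFilter->SetDistanceThreshold(20.); // maximum distance between the two segments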
We will now draw the detected right angles on top of the input image. For this matter, we will use an
otb::VectorDataToMapFilter, which is templated over the input vector data type and the output
image type, and a combination of an itk::BinaryFunctorImageFilter and the
otb::Functor::AlphaBlendingFunctor.
typedef otb::VectorDataToMapFilter <VectorDataType ,
ImageType > VectorDataRendererType;
VectorDataRendererType::Pointer vectorDataRenderer = VectorDataRendererType::New();
typedef otb::Functor ::AlphaBlendingFunctor <PixelType ,
PixelType , PixelType > FunctorType ;
typedef itk::BinaryFunctorImageFilter<ImageType , ImageType ,
ImageType , FunctorType > BlendingFilterType;
BlendingFilterType::Pointer blendingFilter = BlendingFilterType::New();
vectorDataRenderer ->SetInput (1, lsdFilter ->GetOutput ());
vectorDataRenderer ->SetInput (rightAngleFilter ->GetOutput ());
vectorDataRenderer->SetSize(reader->GetOutput()->GetLargestPossibleRegion().GetSize());
vectorDataRenderer ->SetOrigin (reader ->GetOutput ()->GetOrigin ());
vectorDataRenderer ->SetSpacing (reader ->GetOutput ()->GetSpacing ());
vectorDataRenderer ->SetRenderingStyleType(VectorDataRendererType::Binary);
blendingFilter ->SetInput1 (reader ->GetOutput ());
blendingFilter ->SetInput2 (vectorDataRenderer ->GetOutput ());
blendingFilter ->GetFunctor (). SetAlpha (0.25);
writer ->SetInput(blendingFilter ->GetOutput ());
writer ->SetFileName (outfname );
Before calling the Update() method of the writer in order to trigger the pipeline execution, we call
the GenerateOutputInformation() method of the reader, so that the filter gets the information about
image size and spacing.
reader ->GenerateOutputInformation();
writer ->Update();
Figure 14.13: Result of applying the otb::LineSegmentDetector and the
otb::VectorDataToRightAngleVectorDataFilter to an image. From left to right: original image,
detected right angles.
Figure 14.13 shows the result of applying the right angle detection filter to an image.
14.5 Density Features
An interesting approach to feature extraction consists in computing the density of previously de-
tected features as simple edges or interest points.
14.5.1 Edge Density
The source code for this example can be found in the file
Examples/FeatureExtraction/EdgeDensityExample.cxx.
This example illustrates the use of the otb::EdgeDensityImageFilter. This filter computes a
local density of edges on an image and can be useful to detect man-made objects or urban areas.
The filter has been implemented in a generic way, so that the way the edges are detected
and the way their density is computed can be chosen by the user.
The first step required to use this filter is to include its header file.
#include "otbEdgeDensityImageFilter.h"
We will also include the header files for the edge detector (a Canny filter) and the density estimation
(a simple count on a binary image).
#include "itkCannyEdgeDetectionImageFilter.h"
#include "otbBinaryImageDensityFunction.h"
As usual, we start by defining the types for the images, the reader and the writer.
typedef otb::Image <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
We now define the type for the function which will be used by the edge density filter to estimate
this density. Here we choose a function which counts the number of non-null pixels per unit area.
The function takes as template parameter the type of the image to be processed.
typedef otb::BinaryImageDensityFunction<ImageType > CountFunctionType;
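Conceptually, and only as a sketch of what we assume the density function computes (the actual code
may differ), the density over a square neighborhood of radius r is the number of non-null pixels
divided by the neighborhood area:
// Assumed density estimation over a (2r+1) x (2r+1) neighborhood.
double BinaryDensity(unsigned int nonNullCount, unsigned int radius)
{
  const double side = 2.0 * radius + 1.0;
  return nonNullCount / (side * side);
}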
These non-null pixels will be the result of an edge detector. We use here the classical Canny edge
detector, which is templated over the input and output image types.
typedef itk::CannyEdgeDetectionImageFilter<ImageType , ImageType >
CannyDetectorType;
Finally, we can define the type for the edge density filter, which takes as template parameters the
input and output image types, the edge detector type, and the count function type.
typedef otb::EdgeDensityImageFilter <ImageType , ImageType , CannyDetectorType ,
CountFunctionType > EdgeDensityFilterType;
We can now instantiate the different processing objects of the pipeline using the New() method.
ReaderType ::Pointer reader = ReaderType ::New ();
Figure 14.14: Result of applying the otb::EdgeDensityImageFilter to an image. From left to right:
original image, edge density.
EdgeDensityFilterType::Pointer filter = EdgeDensityFilterType::New();
CannyDetectorType::Pointer cannyFilter = CannyDetectorType::New();
WriterType ::Pointer writer = WriterType ::New ();
The edge detection filter needs to be instantiated because we need to set its parameters. This is what
we do here for the Canny filter.
cannyFilter ->SetUpperThreshold(upperThreshold);
cannyFilter ->SetLowerThreshold(lowerThreshold);
cannyFilter ->SetVariance (variance );
cannyFilter ->SetMaximumError(maximumError);
After that, we can pass the edge detector to the filter, which will use it internally.
filter ->SetDetector (cannyFilter );
filter ->SetNeighborhoodRadius(radius);
Finally, we set the file names for the input and the output images and we plug the pipeline. The
Update() method of the writer will trigger the processing.
reader ->SetFileName (infname );
writer ->SetFileName (outfname );
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
writer ->Update();
Figure 14.14 shows the result of applying the edge density filter to an image.
14.5.2 SIFT Density
The source code for this example can be found in the file
Examples/Patented/SIFTDensityExample.cxx.
This example illustrates the use of the otb::KeyPointDensityImageFilter. This filter computes
a local density of keypoints (SIFT or SURF, for instance) on an image and can be useful to detect
man-made objects or urban areas. The filter has been implemented in a generic way, so
that the way the keypoints are detected can be chosen by the user.
The first step required to use this filter is to include its header file.
#include "otbKeyPointDensityImageFilter.h"
As usual, we start by defining the types for the images, the reader and the writer.
typedef otb::Image <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
We define now the type for the keypoint detection. The keypoints will be stored in vector form (they
may contain many descriptors) into a point set. The filter for detecting the SIFT is templated over
the input image type and the output pointset type.
typedef itk::VariableLengthVector <PixelType > RealVectorType;
typedef itk::PointSet <RealVectorType , Dimension > PointSetType;
typedef otb::ImageToSIFTKeyPointSetFilter<ImageType , PointSetType >
DetectorType;
We can now define the filter which will compute the SIFT density. It will be templated over the
input and output image types and the SIFT detector.
typedef otb::KeyPointDensityImageFilter<ImageType , ImageType , DetectorType >
FilterType ;
We can instantiate the reader and the writer as well as the filter and the detector. The detector needs
to be instantiated in order to set its parameters.
ReaderType ::Pointer reader = ReaderType ::New();
WriterType ::Pointer writer = WriterType ::New();
FilterType ::Pointer filter = FilterType ::New ();
DetectorType::Pointer detector = DetectorType::New ();
We set the file names for the input and the output images.
reader ->SetFileName (infname );
writer ->SetFileName (outfname );
We set the parameters for the SIFT detector (the number of octaves and the number of scales per
octave).
detector ->SetOctavesNumber(octaves );
detector ->SetScalesNumber(scales);
And we pass the detector to the filter and we set the radius for the density estimation.
filter ->SetDetector (detector );
filter ->SetNeighborhoodRadius(radius);
Figure 14.15: Result of applying the otb::KeyPointDensityImageFilter to an image. From left to right:
original image, SIFT density.
We plug the pipeline.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
We trigger the execution by calling the Update() method on the writer, but before that we run the
GenerateOutputInformation() method on the reader so that the filter gets the information about the
image size (needed for the SIFT computation).
reader ->GenerateOutputInformation();
writer ->Update();
Figure 14.15 shows the result of applying the key point density filter to an image using the SIFT
detector. The main difference with respect to figure 14.14 is that for SIFT, individual trees contribute
to the density.
14.6 Geometric Moments
14.6.1 Complex Moments
The complex geometric moments are defined as:

c_{pq} = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x+iy)^{p} \, (x-iy)^{q} \, f(x,y) \, dx \, dy, \qquad (14.2)

where x and y are the coordinates of the image f(x,y), i is the imaginary unit and p + q is the order
of c_{pq}. The geometric moments are particularly useful in the case of scale changes.
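For a digital image, the integral reduces to a sum over the pixels of the region of interest. The
following is only an illustrative sketch of this discrete computation, not the code of the
otb::ComplexMomentsImageFunction; the pixel values and their coordinates relative to the chosen
center are assumed to be given as flat arrays:
#include <complex>

std::complex<double> ComplexMoment(const double* values,
                                   const double* xs, const double* ys,
                                   unsigned int nbPixels,
                                   unsigned int p, unsigned int q)
{
  std::complex<double> cpq(0.0, 0.0);
  for (unsigned int k = 0; k < nbPixels; ++k)
    {
    const std::complex<double> z(xs[k], ys[k]);     // x + iy
    const std::complex<double> zbar(xs[k], -ys[k]); // x - iy
    cpq += std::pow(z, static_cast<double>(p))
           * std::pow(zbar, static_cast<double>(q)) * values[k];
    }
  return cpq;
}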
Complex Moments for Images
The source code for this example can be found in the file
Examples/FeatureExtraction/ComplexMomentsImageFunctionExample.cxx.
This example illustrates the use of the otb::ComplexMomentsImageFunction.
The first step required to use this filter is to include its header file.
#include "otbComplexMomentsImageFunction.h"
The otb::ComplexMomentsImageFunction is templated over the input image type and the output
complex type, so we start by defining:
typedef otb::ComplexMomentsImageFunction<InputImageType > CMType;
typedef CMType::OutputType OutputType ;
CMType::Pointer cmFunction = CMType::New();
Next, we plug the input image into the complex moments function and we set its parameters.
reader ->Update();
cmFunction ->SetInputImage(reader ->GetOutput ());
cmFunction ->SetQmax(Q);
cmFunction ->SetPmax(P);
We can choose the pixel of the image which will be used as the center for the moment computation:
InputImageType::IndexType center;
center[0] = 50;
center[1] = 50;
We can also choose the size of the neighborhood around the center pixel for the moment computa-
tion.
In order to get the value of the moment, we call the EvaluateAtIndex method.
OutputType Result = cmFunction ->EvaluateAtIndex(center);
std::cout << "The moment of order (" << P << "," << Q <<
") is equal to " << Result.at(P).at(Q) << std::endl;
Complex Moments for Paths
The source code for this example can be found in the file
Examples/FeatureExtraction/ComplexMomentPathExample.cxx.
The complex moments can be computed on images, but sometimes we are interested in computing
them on shapes extracted from images by segmentation algorithms. These shapes can be represented
by itk::Paths. This example illustrates the use of the otb::ComplexMomentPathFunction for
the computation of complex geometric moments on ITK paths.
The first step required to use this filter is to include its header file.
#include "otbComplexMomentPathFunction.h"
The otb::ComplexMomentPathFunction is templated over the input path type and the output
complex type value, so we start by defining:
const unsigned int Dimension = 2;
typedef itk::PolyLineParametricPath <Dimension > PathType;
typedef std::complex <double> ComplexType ;
typedef otb::ComplexMomentPathFunction<PathType , ComplexType > CMType;
CMType::Pointer cmFunction = CMType::New();
Next, we plug the input path into the complex moment function and we set its parameters.
cmFunction ->SetInputPath(path);
cmFunction ->SetQ(Q);
cmFunction ->SetP(P);
Since the paths are defined in physical coordinates, we do not need to set the center for the
moment computation as we did with the otb::ComplexMomentsImageFunction. The same applies
for the size of the neighborhood around the center pixel for the moment computation. The moment
computation is triggered by calling the Evaluate method.
ComplexType Result = cmFunction ->Evaluate ();
std::cout << "The moment of order (" << P << "," << Q <<
") is equal to " << Result << std::endl;
14.6.2 Hu Moments
Using algebraic moment theory, Hu Ming-Kuei obtained a family of 7 invariants with respect to
planar transformations, called Hu invariants [61]. Those invariants can be seen as nonlinear
combinations of the complex moments. Hu invariants have been widely used in object recognition
during the last 30 years, since they are invariant to rotation, scaling and translation. [47] gives their
expressions:
\phi_1 = c_{11}; \quad \phi_2 = c_{20}c_{02}; \quad \phi_3 = c_{30}c_{03}; \quad \phi_4 = c_{21}c_{12};
\phi_5 = \mathrm{Re}(c_{30}c_{12}^{3}); \quad \phi_6 = \mathrm{Re}(c_{21}c_{12}^{2}); \quad \phi_7 = \mathrm{Im}(c_{30}c_{12}^{3}). \qquad (14.3)
[43] have used these invariants for the recognition of aircraft silhouettes. Flusser and Suk have used
them for image registration, [73].
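As an illustration of how the invariants relate to the complex moments, and using the convention of
Equation (14.3) rather than the actual code of otb::HuMomentsImageFunction, the first and fifth
invariants could be computed as:
#include <complex>

// phi_1 = c11 (real by construction) and phi_5 = Re(c30 * c12^3)
double HuPhi1(const std::complex<double>& c11)
{
  return c11.real();
}

double HuPhi5(const std::complex<double>& c30, const std::complex<double>& c12)
{
  return (c30 * c12 * c12 * c12).real();
}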
Hu Moments for Images
The source code for this example can be found in the file
Examples/FeatureExtraction/HuMomentsImageFunctionExample.cxx.
This example illustrates the use of the otb::HuMomentsImageFunction.
The first step required to use this filter is to include its header file.
#include "otbHuMomentsImageFunction.h"
The otb::HuMomentsImageFunction is templated over the input image type and the output (real)
type, so we start by defining:
typedef otb::HuMomentsImageFunction <InputImageType > HuType;
typedef HuType::OutputType MomentType ;
HuType::Pointer hmFunction = HuType::New();
We can choose the region and the pixel of the image which will be used as the coordinate origin for
the moment computation:
InputImageType::RegionType region;
InputImageType::SizeType size;
InputImageType::IndexType start;
start[0] = 0;
start[1] = 0;
size[0] = 50;
size[1] = 50;
reader ->Update();
InputImageType::Pointer image = reader ->GetOutput ();
region.SetIndex(start);
region.SetSize(size);
image ->SetRegions (region);
image ->Update();
InputImageType::IndexType center;
center[0] = start[0] + size[0] / 2;
center[1] = start[1] + size[1] / 2;
Next, we plug the input image into the Hu moments function and we set its parameters.
hmFunction ->SetInputImage(image);
hmFunction ->SetNeighborhoodRadius(radius);
In order to get the value of the moment, we call the EvaluateAtIndex method.
MomentType Result = hmFunction ->EvaluateAtIndex(center);
for (unsigned int j=0; j<7; ++j)
{
std::cout << "The moment of order " << j+1 <<
" is equal to " << Result[j] << std::endl;
}
The following classes provide similar functionality:
• otb::HuPathFunction
14.6.3 Flusser Moments
The Hu invariants have been modified and improved by several authors. Flusser used these moments
in order to produce a new family of descriptors of order higher than 3, [47]. These descriptors are
invariant to scale and rotation. They have the following expressions:
\psi_1 = c_{11} = \phi_1; \quad \psi_2 = c_{21}c_{12} = \phi_4; \quad \psi_3 = \mathrm{Re}(c_{20}c_{12}^{2}) = \phi_6;
\psi_4 = \mathrm{Im}(c_{20}c_{12}^{2}); \quad \psi_5 = \mathrm{Re}(c_{30}c_{12}^{3}) = \phi_5; \quad \psi_6 = \mathrm{Im}(c_{30}c_{12}^{3}) = \phi_7;
\psi_7 = c_{22}; \quad \psi_8 = \mathrm{Re}(c_{31}c_{12}^{2}); \quad \psi_9 = \mathrm{Im}(c_{31}c_{12}^{2});
\psi_{10} = \mathrm{Re}(c_{40}c_{12}^{4}); \quad \psi_{11} = \mathrm{Im}(c_{40}c_{12}^{4}). \qquad (14.4)
Examples
Flusser Moments for Images
The source code for this example can be found in the file
Examples/FeatureExtraction/FlusserMomentsImageFunctionExample.cxx.
This example illustrates the use of the otb::FlusserMomentsImageFunction.
The first step required to use this filter is to include its header file.
#include "otbFlusserMomentsImageFunction.h"
The otb::FlusserMomentsImageFunction is templated over the input image type and the output
(real) type value, so we start by defining:
typedef otb::FlusserMomentsImageFunction<InputImageType > FlusserType ;
typedef FlusserType ::OutputType MomentType ;
FlusserType ::Pointer fmFunction = FlusserType ::New ();
We can choose the region and the pixel of the image which will be used as the coordinate origin for
the moment computation:
InputImageType::RegionType region;
InputImageType::SizeType size;
InputImageType::IndexType start;
start[0] = 0;
start[1] = 0;
size[0] = 50;
size[1] = 50;
reader ->Update();
InputImageType::Pointer image = reader ->GetOutput ();
region.SetIndex(start);
region.SetSize(size);
image ->SetRegions (region);
image ->Update();
InputImageType::IndexType center;
center[0] = start[0] + size[0] / 2;
center[1] = start[1] + size[1] / 2;
Next, we plug the input image into the Flusser moments function and we set its parameters.
fmFunction ->SetInputImage(image);
fmFunction ->SetNeighborhoodRadius(radius);
In order to get the value of the moment, we call the EvaluateAtIndex method.
MomentType Result = fmFunction ->EvaluateAtIndex(center);
for (unsigned int j=0; j<11; ++j)
{
std::cout << "The moment of order " << j+1 <<
" is equal to " << Result[j] << std::endl;
}
The following classes provide similar functionality:
• otb::FlusserPathFunction
14.7 Road extraction
Road extraction is a critical feature for an efficient use of high resolution satellite images. There are
many applications of road extraction: updating GIS databases, providing a reference for image
registration, helping identification algorithms and rapid mapping, for example. A road network can be
used to register an optical image with a map, or an optical image with a radar image. Road network
extraction can also help other algorithms, such as isolated building detection or bridge detection. In
these cases, a rough extraction can be sufficient. In the context of crisis response, fast mapping is
necessary: within 6 hours, infrastructures for the designated area are required. Within this timeframe,
manual extraction is inconceivable and automatic help is necessary.
14.7.1 Road extraction filter
The source code for this example can be found in the file
Examples/FeatureExtraction/ExtractRoadExample.cxx.
The easiest way to use the road extraction filter provided by OTB is to use the composite filter. If a
modification in the pipeline is required to adapt to a particular situation, the step by step example
described in the next section can be adapted.
This example demonstrates the use of the otb::RoadExtractionFilter. This filter is a composite
filter achieving road extraction according to the algorithm adapted by E. Christophe and J. Inglada
[25] from an original method proposed in [84].
The first step toward the use of this filter is the inclusion of the proper header files.
#include "otbPolyLineParametricPathWithValue.h"
#include "otbRoadExtractionFilter.h"
#include "otbDrawPathListFilter.h"
Then we must decide what pixel type to use for the image. We choose to do all the computation in
floating point precision and rescale the results between 0 and 255 in order to export PNG images.
typedef double InputPixelType;
typedef unsigned char OutputPixelType;
The images are defined using the pixel type and the dimension. Please note that the
otb::RoadExtractionFilter needs an otb::VectorImage as input to handle multispectral im-
ages.
typedef otb::VectorImage <InputPixelType , Dimension > InputVectorImageType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
We define the type of the polyline that the filter produces. We use the
otb::PolyLineParametricPathWithValue, which allows the filter to produce a likelihood
value along with each polyline. The filter is able to produce itk::PolyLineParametricPath as
well.
typedef otb::PolyLineParametricPathWithValue<InputPixelType ,
Dimension > PathType;
Now we can define the otb::RoadExtractionFilter that takes a multi-spectral image as input
and produces a list of polylines.
typedef otb::RoadExtractionFilter <InputVectorImageType ,
PathType > RoadExtractionFilterType;
We also define an otb::DrawPathListFilter to draw the output polylines on an image, taking
their likelihood values into account.
typedef otb::DrawPathListFilter <InputImageType , PathType ,
InputImageType > DrawPathFilterType;
The intensity rescaling of the results will be carried out by the
itk::RescaleIntensityImageFilter which is templated by the input and output image
types.
typedef itk::RescaleIntensityImageFilter<InputImageType ,
OutputImageType > RescalerType;
An otb::ImageFileReader class is also instantiated in order to read image data from a file. Then,
an otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileReader <InputVectorImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The different filters composing our pipeline are created by invoking their New() methods, assigning
the results to smart pointers.
ReaderType ::Pointer reader = ReaderType ::New ();
RoadExtractionFilterType::Pointer roadExtractionFilter
= RoadExtractionFilterType::New();
DrawPathFilterType::Pointer drawingFilter = DrawPathFilterType::New();
RescalerType::Pointer rescaleFilter = RescalerType::New();
WriterType ::Pointer writer = WriterType ::New();
The otb::RoadExtractionFilter needs to have a reference pixel corresponding to the spectral
content likely to represent a road. This is done by passing a pixel to the filter. Here we suppose that
the input image has four spectral bands.
InputVectorImageType::PixelType ReferencePixel;
ReferencePixel.SetSize (4);
ReferencePixel.SetElement (0, ::atof(argv[3]));
ReferencePixel.SetElement (1, ::atof(argv[4]));
ReferencePixel.SetElement (2, ::atof(argv[5]));
ReferencePixel.SetElement (3, ::atof(argv[6]));
roadExtractionFilter ->SetReferencePixel(ReferencePixel);
We must also set the alpha parameter of the filter, which allows us to tune the width of the roads we
want to extract. A typical value is 1.0, which should work in most situations.
roadExtractionFilter ->SetAlpha(atof(argv[7]));
All other parameters should not influence the results too much in most situations and can be kept at
their default values.
The amplitude threshold parameter tunes the sensitivity of the vectorization step. A typical value is
5 · 10^{-5}.
roadExtractionFilter ->SetAmplitudeThreshold(atof(argv[8]));
The tolerance threshold tunes the sensitivity of the path simplification step. Typical value is 1.0.
roadExtractionFilter ->SetTolerance(atof(argv[9]));
Roads are not likely to have sharp turns. Therefore we set the max angle parameter, as well as the
link angular threshold. The value is typically π/8.
roadExtractionFilter ->SetMaxAngle (atof(argv[10]));
roadExtractionFilter ->SetAngularThreshold(atof(argv[10]));
The otb::RoadExtractionFilter performs two odd path removal operations at different stages
of its execution. The first mean distance threshold and the second mean distance threshold set the
criterion for removal. Paths are removed if the mean distance between their nodes is too small, since
such paths coming from the previous filters are likely to be tortuous. The first removal operation has a
typical mean distance threshold parameter of 1.0, and the second of 10.0.
roadExtractionFilter ->SetFirstMeanDistanceThreshold(atof(argv[11]));
roadExtractionFilter ->SetSecondMeanDistanceThreshold(atof(argv[12]));
The otb::RoadExtractionFilter is able to link paths whose ends are near, according to a
Euclidean distance criterion. The threshold on this distance for linking two paths is the distance
threshold parameter. A typical value is 25.
roadExtractionFilter ->SetDistanceThreshold(atof(argv[13]));
We will now create a black background image to draw the resulting polylines on. To
achieve this we need to know the size of our input image. Therefore we trigger the
GenerateOutputInformation() method of the reader.
reader ->GenerateOutputInformation();
InputImageType::Pointer blackBackground = InputImageType::New ();
blackBackground ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion());
blackBackground ->Allocate ();
blackBackground ->FillBuffer (0);
We tell the otb::DrawPathListFilter to try to use the likelihood value embedded within each
polyline as the value for drawing this polyline, if possible.
drawingFilter ->UseInternalPathValueOn();
Figure 14.16: Result of applying the otb::RoadExtractionFilter to a fused Quickbird image. From
left to right: original image, extracted roads with their likelihood values (colors are inverted for display).
The itk::RescaleIntensityImageFilter needs to know the minimum and maximum values of
the generated output image. Those can be chosen in a generic way by using the
NumericTraits functions, since they are templated over the pixel type.
rescaleFilter ->SetOutputMinimum(itk ::NumericTraits <OutputPixelType >:: min ());
rescaleFilter ->SetOutputMaximum(itk ::NumericTraits <OutputPixelType >:: max ());
Now it is time for some pipeline wiring.
roadExtractionFilter ->SetInput(reader ->GetOutput ());
drawingFilter ->SetInput(blackBackground);
drawingFilter ->SetInputPath(roadExtractionFilter ->GetOutput ());
rescaleFilter ->SetInput(drawingFilter ->GetOutput ());
The update of the pipeline is triggered by the Update() method of the rescale intensity filter.
rescaleFilter ->Update();
Figure 14.16 shows the result of applying the road extraction filter to a fused Quickbird image.
14.7.2 Step by step road extraction
The source code for this example can be found in the file
Examples/FeatureExtraction/ExtractRoadByStepsExample.cxx.
This example illustrates the details of the otb::RoadExtractionFilter. This filter, described in
the previous section, is a composite filter that includes all the steps below. Individual filters can be
replaced to design a road detector targeted at SAR images for example.
Figure 14.17: Illustration of the spectral angle for one pixel of a three-band image. One of the vectors is the
reference pixel and the other is the current pixel.
The spectral angle is used to compute a grayscale image from the original multispectral image using
the otb::SpectralAngleDistanceImageFilter. The spectral angle is illustrated in Figure 14.17.
Pixels corresponding to roads appear in darker color.
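For reference, the spectral angle between the current pixel and the reference pixel is the angle between
their vectors of band values. The following is an illustrative sketch of this computation, not the code
of the OTB filter:
#include <cmath>
#include <vector>

// Spectral angle between a pixel and a reference pixel, both given as vectors
// of band values; small angles mean similar spectral signatures.
double SpectralAngle(const std::vector<double>& pixel,
                     const std::vector<double>& reference)
{
  double dot = 0.0, normP = 0.0, normR = 0.0;
  for (unsigned int b = 0; b < pixel.size(); ++b)
    {
    dot   += pixel[b] * reference[b];
    normP += pixel[b] * pixel[b];
    normR += reference[b] * reference[b];
    }
  return std::acos(dot / (std::sqrt(normP) * std::sqrt(normR)));
}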
typedef otb::SpectralAngleDistanceImageFilter<MultiSpectralImageType ,
InternalImageType > SAFilterType;
SAFilterType::Pointer saFilter = SAFilterType::New ();
saFilter ->SetReferencePixel(pixelRef );
saFilter ->SetInput (multispectralReader ->GetOutput ());
A square root is applied to the spectral angle image in order to enhance the contrast of the darker
pixels (which are the pixels of interest) with the itk::SqrtImageFilter.
typedef itk::SqrtImageFilter <InternalImageType ,
InternalImageType > SqrtFilterType;
SqrtFilterType::Pointer sqrtFilter = SqrtFilterType::New ();
sqrtFilter ->SetInput (saFilter ->GetOutput ());
We use the Gaussian gradient filter ( itk::GradientRecursiveGaussianImageFilter) to compute
the gradient in the x and y directions.
double sigma = alpha * (1.2 / resolution + 1);
typedef itk::GradientRecursiveGaussianImageFilter<InternalImageType ,
VectorImageType >
GradientFilterType;
GradientFilterType::Pointer gradientFilter = GradientFilterType::New();
gradientFilter ->SetSigma(sigma);
gradientFilter ->SetInput(sqrtFilter ->GetOutput ());
We compute the scalar product of the gradients of neighboring pixels and keep the minimum value
and the corresponding direction with otb::NeighborhoodScalarProductFilter. This is the line
detector described in [84].
typedef otb::NeighborhoodScalarProductFilter<VectorImageType ,
InternalImageType ,
InternalImageType >
NeighborhoodScalarProductType;
NeighborhoodScalarProductType::Pointer scalarFilter
= NeighborhoodScalarProductType::New ();
scalarFilter ->SetInput(gradientFilter ->GetOutput ());
The resulting image is passed to the otb::RemoveIsolatedByDirectionFilter filter to remove
pixels with no neighbor having the same direction.
typedef otb::RemoveIsolatedByDirectionFilter<InternalImageType ,
InternalImageType ,
InternalImageType >
RemoveIsolatedByDirectionType;
RemoveIsolatedByDirectionType::Pointer removeIsolatedByDirectionFilter
= RemoveIsolatedByDirectionType::New ();
removeIsolatedByDirectionFilter->SetInput (scalarFilter ->GetOutput ());
removeIsolatedByDirectionFilter
->SetInputDirection(scalarFilter ->GetOutputDirection());
We remove pixels having a direction corresponding to bright lines, since we know that after the spectral
angle computation roads appear in darker color, with the otb::RemoveWrongDirectionFilter filter.
typedef otb::RemoveWrongDirectionFilter<InternalImageType ,
InternalImageType ,
InternalImageType >
RemoveWrongDirectionType;
RemoveWrongDirectionType::Pointer removeWrongDirectionFilter
= RemoveWrongDirectionType::New();
removeWrongDirectionFilter->SetInput(
removeIsolatedByDirectionFilter->GetOutput ());
removeWrongDirectionFilter->SetInputDirection(
scalarFilter ->GetOutputDirection());
We remove pixels which are not maxima along the direction perpendicular to the road direction with
the otb::NonMaxRemovalByDirectionFilter.
typedef otb::NonMaxRemovalByDirectionFilter<InternalImageType ,
InternalImageType ,
InternalImageType >
NonMaxRemovalByDirectionType;
NonMaxRemovalByDirectionType::Pointer nonMaxRemovalByDirectionFilter
= NonMaxRemovalByDirectionType::New ();
nonMaxRemovalByDirectionFilter->SetInput (
removeWrongDirectionFilter->GetOutput ());
nonMaxRemovalByDirectionFilter
->SetInputDirection(scalarFilter ->GetOutputDirection());
Extracted roads are vectorized into polylines with otb::VectorizationPathListFilter.
typedef otb::VectorizationPathListFilter<InternalImageType ,
InternalImageType ,
PathType > VectorizationFilterType;
VectorizationFilterType::Pointer vectorizationFilter
= VectorizationFilterType::New();
vectorizationFilter ->SetInput(nonMaxRemovalByDirectionFilter->GetOutput ());
vectorizationFilter ->SetInputDirection(scalarFilter ->GetOutputDirection());
vectorizationFilter ->SetAmplitudeThreshold(atof(argv[8]));
However, this vectorization is too simple and needs to be refined to be usable. First,
we merge all aligned points into a single segment with otb::SimplifyPathListFilter.
Then we break the polylines which have sharp angles, as they are probably not roads, with
otb::BreakAngularPathListFilter. Finally we remove paths which are too short with
otb::RemoveTortuousPathListFilter.
typedef otb::SimplifyPathListFilter <PathType > SimplifyPathType;
SimplifyPathType::Pointer simplifyPathListFilter = SimplifyPathType::New();
simplifyPathListFilter ->GetFunctor (). SetTolerance(1.0);
simplifyPathListFilter ->SetInput (vectorizationFilter ->GetOutput ());
typedef otb::BreakAngularPathListFilter<PathType > BreakAngularPathType;
BreakAngularPathType::Pointer breakAngularPathListFilter
= BreakAngularPathType::New ();
breakAngularPathListFilter->SetMaxAngle (otb::CONST_PI / 8.);
breakAngularPathListFilter->SetInput(simplifyPathListFilter ->GetOutput ());
typedef otb::RemoveTortuousPathListFilter<PathType > RemoveTortuousPathType;
RemoveTortuousPathType::Pointer removeTortuousPathListFilter
= RemoveTortuousPathType::New ();
removeTortuousPathListFilter->GetFunctor (). SetThreshold(1.0);
removeTortuousPathListFilter->SetInput(breakAngularPathListFilter->GetOutput ());
Polylines within a certain range are linked (otb::LinkPathListFilter) to try to fill gaps due to
occlusions by vehicles, trees, etc., before simplifying the polylines (otb::SimplifyPathListFilter)
and removing the shortest ones with otb::RemoveTortuousPathListFilter.
typedef otb::LinkPathListFilter <PathType > LinkPathType;
LinkPathType::Pointer linkPathListFilter = LinkPathType::New();
linkPathListFilter ->SetDistanceThreshold(25.0 / resolution );
linkPathListFilter ->SetAngularThreshold(otb::CONST_PI / 8);
linkPathListFilter ->SetInput(removeTortuousPathListFilter->GetOutput ());
SimplifyPathType::Pointer simplifyPathListFilter2 = SimplifyPathType::New();
simplifyPathListFilter2 ->GetFunctor (). SetTolerance(1.0);
simplifyPathListFilter2 ->SetInput (linkPathListFilter ->GetOutput ());
RemoveTortuousPathType::Pointer removeTortuousPathListFilter2
= RemoveTortuousPathType::New ();
removeTortuousPathListFilter2->GetFunctor (). SetThreshold(10.0);
removeTortuousPathListFilter2->SetInput (simplifyPathListFilter2 ->GetOutput ());
A value can be associated with each polyline according to the pixel values under the polyline with the
otb::LikelihoodPathListFilter. A higher value means a higher likelihood of being a road.
typedef otb::LikelihoodPathListFilter<PathType ,
InternalImageType >
PathListToPathListWithValueType;
PathListToPathListWithValueType::Pointer pathListConverter
= PathListToPathListWithValueType::New();
pathListConverter ->SetInput (removeTortuousPathListFilter2->GetOutput ());
pathListConverter ->SetInputImage(nonMaxRemovalByDirectionFilter->GetOutput ());
A black background image is built to draw the paths on.
InternalImageType::Pointer output = InternalImageType::New ();
output ->SetRegions (multispectralReader ->GetOutput ()
->GetLargestPossibleRegion());
output ->Allocate ();
output ->FillBuffer (0.0);
Polylines are drawn on the black background image with the otb::DrawPathListFilter. The
SetUseInternalPathValue() method tells the drawing filter to draw each path with its likelihood value.
typedef otb::DrawPathListFilter <InternalImageType , PathType ,
InternalImageType > DrawPathType;
DrawPathType::Pointer drawPathListFilter = DrawPathType::New();
drawPathListFilter ->SetInput (output);
drawPathListFilter ->SetInputPath(pathListConverter ->GetOutput ());
drawPathListFilter ->SetUseInternalPathValue(true);
The output from the drawing filter contains very small values (the likelihood values). Therefore the
image has to be rescaled to be viewed. The whole pipeline is executed by invoking the Update()
method on this last filter.
typedef itk::RescaleIntensityImageFilter<InternalImageType ,
InternalImageType > RescalerType;
RescalerType::Pointer rescaler = RescalerType::New ();
rescaler ->SetOutputMaximum(255);
rescaler ->SetOutputMinimum(0);
rescaler ->SetInput(drawPathListFilter ->GetOutput ());
rescaler ->Update ();
Figures 14.18 and 14.19 show the result of applying the road extraction by steps to a fused
Quickbird image. The result image is an RGB composition showing the extracted paths in red. The full
processing took about 3 seconds for each image.
Figure 14.18: Result of applying the road extraction by steps pipeline to a fused Quickbird image. From
left to right: original image, extracted roads with their likelihood values.
Figure 14.19: Result of applying the road extraction by steps pipeline to a fused Quickbird image. From
left to right: original image, extracted roads with their likelihood values.
14.8 Cloud Detection
The source code for this example can be found in the file
Examples/FeatureExtraction/CloudDetectionExample.cxx.
The cloud detection functor is a processing chain composed of the computation of a spectral angle
(with the SpectralAngleFunctor), the multiplication of the result by a Gaussian factor (with the
CloudEstimatorFunctor) and finally a thresholding to obtain a binary image (with the CloudDetectionFilter).
However, modifications can be added to the pipeline to adapt it to a particular situation.
This example demonstrates the use of the otb::CloudDetectionFilter. This filter uses the
spectral angle principle to measure the radiometric gap between a reference pixel and the other
pixels of the image.
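For reference, the spectral angle between a pixel p and the reference pixel r is commonly defined as
(the exact expression used by the SpectralAngleFunctor may differ slightly):

    θ(p, r) = arccos( ⟨p, r⟩ / (‖p‖ ‖r‖) )

where ⟨p, r⟩ is the scalar product of the two spectral vectors. Small angles correspond to pixels that are
spectrally close to the reference; the Gaussian weighting and the thresholding of the following steps turn
this similarity measure into a binary cloud mask.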
The first step toward the use of this filter is the inclusion of the proper header files.
#include "otbCloudDetectionFunctor.h"
#include "otbCloudDetectionFilter.h"
Then we must decide what pixel type to use for the images. We choose to do all the computations
in double precision.
typedef double InputPixelType;
typedef double OutputPixelType;
The images are defined using the pixel type and the dimension. Please note that the
otb::CloudDetectionFilter needs an otb::VectorImage as input in order to handle multispectral images.
typedef otb::VectorImage <InputPixelType , Dimension > VectorImageType;
typedef VectorImageType::PixelType VectorPixelType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
We define the functor type that the filter will use. We use the otb::CloudDetectionFunctor.
typedef otb::Functor ::CloudDetectionFunctor <VectorPixelType ,
OutputPixelType > FunctorType ;
Now we can define the otb::CloudDetectionFilter that takes a multi-spectral image as input
and produces a binary image.
typedef otb::CloudDetectionFilter <VectorImageType , OutputImageType ,
FunctorType > CloudDetectionFilterType;
An otb::ImageFileReader class is also instantiated in order to read image data from a file. Then,
an otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileReader <VectorImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The different filters composing our pipeline are created by invoking their New() methods, assigning
the results to smart pointers.
ReaderType ::Pointer reader = ReaderType ::New();
CloudDetectionFilterType::Pointer cloudDetection =
CloudDetectionFilterType::New ();
WriterType ::Pointer writer = WriterType ::New ();
The otb::CloudDetectionFilter needs to have a reference pixel corresponding to the spectral
content likely to represent a cloud. This is done by passing a pixel to the filter. Here we suppose that
the input image has four spectral bands.
VectorPixelType referencePixel;
referencePixel.SetSize (4);
referencePixel.Fill(0.);
referencePixel[0] = (atof(argv[5]));
referencePixel[1] = (atof(argv[6]));
referencePixel[2] = (atof(argv[7]));
referencePixel[3] = (atof(argv[8]));
cloudDetection ->SetReferencePixel(referencePixel);
We must also set the variance parameter of the filter, which is the parameter of the Gaussian functor. The
bigger the value, the more tolerant the detector will be.
cloudDetection ->SetVariance (atof(argv[9]));
The minimum and maximum thresholds are set to binarise the final result. These values have to be
between 0 and 1.
cloudDetection ->SetMinThreshold(atof(argv[10]));
cloudDetection ->SetMaxThreshold(atof(argv[11]));
writer ->SetFileName (argv[2]);
writer ->SetInput(cloudDetection ->GetOutput ());
writer ->Update();
Figure 14.20 shows the result of applying the cloud detection filter to a cloudy image.
Figure 14.20: From left to right : original image, cloud mask resulting from processing.
CHAPTER
FIFTEEN
MULTI-SCALE ANALYSIS
15.1 Introduction
In this chapter, the tools for multi-scale and multi-resolution processing (analysis, synthesis and
fusion) will be presented. Most of the algorithms are based on pyramidal approaches. These approaches
were first used for image compression and they are based on the fact that, once an image
has been low-pass filtered, it does not have any details beyond the cut-off frequency of the low-pass
filter. Therefore, the image can be subsampled – decimated – without any loss of information.
A pyramidal decomposition is thus performed by applying the following 3 steps in an iterative way:
1. Low-pass filter the image In in order to produce F(In);
2. Compute the difference Dn = In − F(In), which corresponds to the details at level n;
3. Subsample F(In) in order to obtain In+1.
The result is a series of decreasing resolution images Ik and a series of decreasing resolution details
Dk.
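As an illustration, and independently of the OTB classes presented in the rest of this chapter, one analysis
step can be sketched in a few lines of C++ on a 1-D signal (the 3-tap average used here as the low-pass
filter F is an arbitrary choice):

#include <cstddef>
#include <vector>

// Simple 3-tap average used as the low-pass filter F (borders replicated).
std::vector<double> LowPass(const std::vector<double>& in)
{
  std::vector<double> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    const double left = in[i == 0 ? 0 : i - 1];
    const double right = in[i + 1 < in.size() ? i + 1 : in.size() - 1];
    out[i] = (left + in[i] + right) / 3.0;
  }
  return out;
}

// One analysis step: computes the details Dn and the decimated image In+1.
void AnalyseLevel(const std::vector<double>& In,
                  std::vector<double>& Dn,
                  std::vector<double>& InPlus1)
{
  const std::vector<double> F = LowPass(In);    // step 1: F(In)
  Dn.resize(In.size());
  for (std::size_t i = 0; i < In.size(); ++i)   // step 2: Dn = In - F(In)
    Dn[i] = In[i] - F[i];
  InPlus1.clear();
  for (std::size_t i = 0; i < F.size(); i += 2) // step 3: subsample F(In)
    InPlus1.push_back(F[i]);
}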
15.2 Morphological Pyramid
If the smoothing filter used in the pyramidal analysis is a morphological filter, one cannot safely
subsample the filtered image without loss of information. However, by keeping the details possibly
lost in the down-sampling operation, such a decomposition can be used.
The Morphological Pyramid is an approach to such a decomposition. Its computation process is an
iterative analysis involving smoothing by the morphological filter, computing the details lost in the
smoothing, down-sampling the current image, and computing the details lost in the down-sampling.
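Schematically, writing F for the morphological filter, ↓ for the decimation and ↑ for the corresponding
over-sampling back to the original resolution, one analysis level produces (the sign and splitting
conventions of the actual OTB implementation may differ):

    In+1 = ↓ F(In)
    details lost in the smoothing:      In − F(In)
    details lost in the down-sampling:  F(In) − ↑ In+1

The synthesis step reverses the process by over-sampling In+1 and adding the detail terms back.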
The source code for this example can be found in the file
Examples/MultiScale/MorphologicalPyramidAnalysisFilterExample.cxx.
This example illustrates the use of the otb::MorphologicalPyramidAnalysisFilter.
The first step required to use this filter is to include its header file.
#include "otbMorphologicalPyramidAnalysisFilter.h"
The mathematical morphology filters to be used have also to be included here.
#include "otbOpeningClosingMorphologicalFilter.h"
#include "itkBinaryBallStructuringElement.h"
As usual, we start by defining the types needed for the pixels, the images, the image reader and the
image writer.
const unsigned int Dimension = 2;
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
Now, we define the types needed for the morphological filters which will be used to build the mor-
phological pyramid. The first thing to do is define the structuring element, which in our case, will
be a itk::BinaryBallStructuringElement which is templated over the pixel type and the di-
mension of the image.
typedef itk::BinaryBallStructuringElement<InputPixelType ,
Dimension > StructuringElementType;
We can now define the type of the filter to be used by the morphological pyramid. In this case, we
choose to use an otb::OpeningClosingMorphologicalFilter which is just the concatenation
of an opening and a closing. This filter is templated over the input and output image types and the
structuring element type that we just defined above.
typedef otb::OpeningClosingMorphologicalFilter<InputImageType ,
InputImageType ,
StructuringElementType >
OpeningClosingFilterType;
We can finally define the type of the morphological pyramid filter. The filter is templated over the
input and output image types and the low-pass morphological filter to be used.
typedef otb::MorphologicalPyramidAnalysisFilter<InputImageType ,
OutputImageType ,
OpeningClosingFilterType>
PyramidFilterType;
Since the otb::MorphologicalPyramidAnalysisFilter generates a list of images as output, it is
useful to have an iterator to access the images. This is done as follows:
typedef PyramidFilterType::OutputImageListType::Iterator
ImageListIterator;
We can now instantiate the reader in order to access the input image which has to be analysed.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFilename);
We instantiate the morphological pyramid analysis filter and set its parameters which are:
• the number of iterations or levels of the pyramid;
• the subsample scale or decimation factor between two successive pyramid levels.
After that, we plug the pipeline and run it by calling the Update() method.
PyramidFilterType::Pointer pyramid = PyramidFilterType::New();
pyramid ->SetNumberOfLevels(numberOfLevels);
pyramid ->SetDecimationRatio(decimationRatio);
pyramid ->SetInput(reader ->GetOutput ());
pyramid ->Update();
The morphological pyramid has 5 types of output:
• the analysed image at each level of the pyramid, through the GetOutput() method;
• the brighter details extracted from the filtering operation, through the GetSupFilter() method;
• the darker details extracted from the filtering operation, through the GetInfFilter() method;
• the brighter details extracted from the resampling (decimation) operation, through the GetSupDeci() method;
• the darker details extracted from the resampling (decimation) operation, through the GetInfDeci() method.
Each one of these methods provides a list of images (one for each level of analysis), so we can iterate
through the image lists by using iterators.
ImageListIterator itAnalyse = pyramid->GetOutput ()->Begin();
ImageListIterator itSupFilter = pyramid->GetSupFilter()->Begin();
ImageListIterator itInfFilter = pyramid->GetInfFilter()->Begin();
ImageListIterator itSupDeci = pyramid->GetSupDeci ()->Begin();
ImageListIterator itInfDeci = pyramid->GetInfDeci ()->Begin();
We can now instantiate a writer and use it to write all the images to files.
WriterType ::Pointer writer = WriterType ::New();
int i = 1;
// Writing the results images
std::cout << (itAnalyse != (pyramid->GetOutput ()->End ())) << std::endl;
while (itAnalyse != pyramid ->GetOutput ()->End())
{
writer ->SetInput (itAnalyse .Get ());
writer ->SetFileName (argv[0 * 4 + i + 1]);
writer ->Update ();
writer ->SetInput (itSupFilter .Get());
writer ->SetFileName (argv[1 * 4 + i + 1]);
writer ->Update ();
writer ->SetInput (itInfFilter .Get());
writer ->SetFileName (argv[2 * 4 + i + 1]);
writer ->Update ();
writer ->SetInput (itInfDeci .Get ());
writer ->SetFileName (argv[3 * 4 + i + 1]);
writer ->Update ();
writer ->SetInput (itSupDeci .Get ());
writer ->SetFileName (argv[4 * 4 + i + 1]);
writer ->Update ();
++itAnalyse ;
++ itSupFilter ;
++ itInfFilter ;
++itInfDeci ;
++itSupDeci ;
++i;
}
Figure 15.1 shows the test image to be processed by the morphological pyramid.
Figure 15.2 shows the 4 levels of analysis of the image.
Figure 15.3 shows the 4 levels of bright details.
Figure 15.4 shows the 4 levels of dark details.
Figure 15.5 shows the 4 levels of bright decimation details.
Figure 15.6 shows the 4 levels of dark decimation details.
The source code for this example can be found in the file
Examples/MultiScale/MorphologicalPyramidSynthesisFilterExample.cxx.
This example illustrates the use of the otb::MorphologicalPyramidSynthesisFilter.
Figure 15.1: Test image for the morphological pyramid.
Figure 15.2: Result of the analysis for 4 levels of the pyramid.
Figure 15.3: Bright details for 4 levels of the pyramid.
Figure 15.4: Dark details for 4 levels of the pyramid.
Figure 15.5: Bright decimation details for 4 levels of the pyramid.
Figure 15.6: Dark decimation details for 4 levels of the pyramid.
The first step required to use this filter is to include its header file.
#include "otbMorphologicalPyramidSynthesisFilter.h"
The mathematical morphology filters to be used have also to be included here, as well as the
otb::MorphologicalPyramidAnalysisFilter in order to perform the analysis step.
#include "otbMorphologicalPyramidAnalysisFilter.h"
#include "otbOpeningClosingMorphologicalFilter.h"
#include "itkBinaryBallStructuringElement.h"
As usual, we start by defining the types needed for the pixels, the images, the image reader and the
image writer.
const unsigned int Dimension = 2;
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
Now, we define the types needed for the morphological filters which will be used to build the mor-
phological pyramid. The first thing to do is define the structuring element, which in our case, will
be a itk::BinaryBallStructuringElement which is templated over the pixel type and the di-
mension of the image.
typedef itk::BinaryBallStructuringElement<InputPixelType , Dimension >
StructuringElementType;
We can now define the type of the filter to be used by the morphological pyramid. In this case, we
choose to use an otb::OpeningClosingMorphologicalFilter which is just the concatenation
of an opening and a closing. This filter is templated over the input and output image types and the
structuring element type that we just defined above.
typedef otb::OpeningClosingMorphologicalFilter<InputImageType ,
InputImageType ,
StructuringElementType >
OpeningClosingFilterType;
We can now define the type of the morphological pyramid filter. The filter is templated over the input
and output image types and the low-pass morphological filter to be used.
typedef otb::MorphologicalPyramidAnalysisFilter<InputImageType ,
OutputImageType ,
OpeningClosingFilterType>
PyramidAnalysisFilterType;
We can finally define the type of the morphological pyramid synthesis filter. The filter is templated
over the input and output image types.
typedef otb::MorphologicalPyramidSynthesisFilter<InputImageType ,
OutputImageType >
PyramidSynthesisFilterType;
We can now instantiate the reader in order to access the input image which has to be analysed.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFilename);
We instantiate the morphological pyramid analysis filter and set its parameters which are:
• the number of iterations or levels of the pyramid;
• the subsample scale or decimation factor between two successive pyramid levels.
After that, we plug the pipeline and run it by calling the Update() method.
PyramidAnalysisFilterType::Pointer pyramidAnalysis =
PyramidAnalysisFilterType::New();
pyramidAnalysis ->SetNumberOfLevels(numberOfLevels);
pyramidAnalysis ->SetDecimationRatio(decimationRatio);
pyramidAnalysis ->SetInput(reader ->GetOutput ());
pyramidAnalysis ->Update ();
Once the analysis step is finished we can proceed to the synthesis of the image from its different
levels of decomposition. The morphological pyramid has 5 types of output:
• the analysed image at each level of the pyramid, through the GetOutput() method;
• the brighter details extracted from the filtering operation, through the GetSupFilter() method;
• the darker details extracted from the filtering operation, through the GetInfFilter() method;
• the brighter details extracted from the resampling (decimation) operation, through the GetSupDeci() method;
• the darker details extracted from the resampling (decimation) operation, through the GetInfDeci() method.
These outputs can be used as inputs of the synthesis filter by using the appropriate methods.
PyramidSynthesisFilterType::Pointer pyramidSynthesis =
PyramidSynthesisFilterType::New();
pyramidSynthesis ->SetInput (pyramidAnalysis ->GetOutput ()->Back());
pyramidSynthesis ->SetSupFilter(pyramidAnalysis ->GetSupFilter());
pyramidSynthesis ->SetSupDeci (pyramidAnalysis ->GetSupDeci ());
pyramidSynthesis ->SetInfFilter(pyramidAnalysis ->GetInfFilter());
pyramidSynthesis ->SetInfDeci (pyramidAnalysis ->GetInfDeci ());
After that, we plug the pipeline and run it by calling the Update() method.
pyramidSynthesis ->Update ();
We finally instantiate the writer in order to save the result image to a file.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputFilename);
writer ->SetInput(pyramidSynthesis ->GetOutput ()->Back());
writer ->Update();
Since the synthesis operation is applied on the result of the analysis, the input and the output images
should be identical. This is the case as shown in figure 15.7.
Of course, in a real application, a specific processing step would be applied after the analysis and before
the synthesis, for instance to denoise the image by removing pixels at the finer scales.
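As a purely illustrative sketch (the assumption that the first element of the detail list corresponds to the
finest scale, and the choice of which details to discard, are ours and not part of the example), the finest
bright details could be zeroed out between the analysis and the synthesis:

#include "itkImageRegionIterator.h"

// Zero out the finest level of bright details before running the synthesis.
typedef PyramidAnalysisFilterType::OutputImageListType::Iterator DetailIteratorType;
DetailIteratorType itSup = pyramidAnalysis->GetSupFilter()->Begin();
OutputImageType::Pointer finestBrightDetails = itSup.Get();

itk::ImageRegionIterator<OutputImageType> pixelIt(
    finestBrightDetails, finestBrightDetails->GetLargestPossibleRegion());
for (pixelIt.GoToBegin(); !pixelIt.IsAtEnd(); ++pixelIt)
{
  pixelIt.Set(0); // discard the finest bright structures
}
// pyramidSynthesis->Update() is then invoked as before.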
15.2.1 Morphological Pyramid Exploitation
One of the possible uses of the morphological pyramid is the segmentation of objects – regions – of
a particular scale.
The source code for this example can be found in the file
Examples/MultiScale/MorphologicalPyramidSegmenterExample.cxx.
Figure 15.7: Result of the morphological pyramid analysis and synthesis. Left: original image. Right: result of
applying the analysis and the synthesis steps.
This example illustrates the use of the otb::MorphologicalPyramid::Segmenter. This class
performs the segmentation of a detail image extracted from a morphological pyramid analysis. The
segmentation is performed using the itk::ConnectedThresholdImageFilter. The seeds are
extracted from the image using the otb::ImageToPointSetFilter. The thresholds are set by
using quantiles computed with the HistogramGenerator.
The first step required to use this filter is to include its header file.
#include "otbMorphologicalPyramidSegmenter.h"
As usual, we start by defining the types needed for the pixels, the images, the image reader and the
image writer. Note that, for this example, an RGB image will be created to store the results of the
segmentation.
const unsigned int Dimension = 2;
typedef double InputPixelType;
typedef unsigned short LabelPixelType;
typedef itk::RGBPixel <unsigned char> RGBPixelType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <LabelPixelType , Dimension > LabelImageType;
typedef otb::Image <RGBPixelType , 2> RGBImageType;
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <RGBImageType > WriterType ;
We now define the segmenter. Please pay attention to the fact that this class belongs to the
MorphologicalPyramid namespace.
typedef otb:: MorphologicalPyramid::Segmenter <InputImageType ,
LabelImageType >
SegmenterType;
We instantiate the readers which will give us access to the image of details produced by the morpho-
logical pyramid analysis and the original image (before analysis) which is used in order to produce
segmented regions which are sharper than what would have been obtained with the detail image
only.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFilename);
ReaderType ::Pointer reader2 = ReaderType ::New();
reader2 ->SetFileName (originalFilename);
We instantiate the segmenter and set its parameters as follows. We plug the outputs of the readers
for the details image and the original image; we set the boolean variable which controls whether the
segmented details are bright or dark; we set the quantile used to threshold the details image in order
to obtain the seed points for the segmentation; we set the quantile for setting the threshold for the
region growing segmentation; and finally, we set the minimum size for a segmented region to be
kept in the final result.
SegmenterType::Pointer segmenter = SegmenterType::New();
segmenter ->SetDetailsImage(reader ->GetOutput ());
segmenter ->SetOriginalImage(reader2->GetOutput ());
segmenter ->SetSegmentDarkDetailsBool(segmentDark );
segmenter ->SetSeedsQuantile(seedsQuantile);
segmenter ->SetConnectedThresholdQuantile(segmentationQuantile);
segmenter ->SetMinimumObjectSize(minObjectSize);
The output of the segmenter is an image of integer labels, where a label denotes membership of
a pixel in a particular segmented region. This value is usually coded using 16 bits. This format
is not practical for visualization, so for the purposes of this example, we will convert it to RGB
pixels. RGB images have the advantage that they can be saved as a simple png file and viewed using
any standard image viewer software. The itk::Functor::ScalarToRGBPixelFunctor class is
a special function object designed to hash a scalar value into an itk::RGBPixel. Plugging this
functor into the itk::UnaryFunctorImageFilter creates an image filter that converts scalar
images to RGB images.
typedef itk::Functor ::ScalarToRGBPixelFunctor <LabelPixelType >
ColorMapFunctorType;
typedef itk::UnaryFunctorImageFilter <LabelImageType ,
RGBImageType ,
ColorMapFunctorType > ColorMapFilterType;
ColorMapFilterType::Pointer colormapper = ColorMapFilterType::New();
We can now plug the final segment of the pipeline by using the color mapper and the image file
writer.
colormapper ->SetInput (segmenter ->GetOutput ());
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(colormapper ->GetOutput ());
writer ->SetFileName (outputFilename1);
writer ->Update();
Figure 15.8: Morphological pyramid segmentation. From left to right: original image, image of bright details
and result of the segmentation.
Figure 15.8 shows the results of the segmentation of the image of bright details obtained with the
morphological pyramid analysis.
This same approach can be applied to all the levels of the morphological pyramid analysis.
The source code for this example can be found in the file
Examples/MultiScale/MorphologicalPyramidSegmentationExample.cxx.
This example illustrates the use of the otb::MorphologicalPyramidSegmentationFilter. This filter
performs a segmentation of the details supFilter and infFilter extracted with the morphological
pyramid. The segmentation algorithm used is based on seeds extraction using
the otb::ImageToPointSetFilter, followed by a connected threshold segmentation using the
itk::ConnectedThresholdImageFilter. The thresholds for seed extraction and segmentation
are computed using quantiles. A pre-processing step is applied by multiplying the full
resolution brighter details (resp. darker details) with the original image (resp. the inverted
original image). This enhances the precision of the region contours. The details
from the pyramid are set via the SetBrighterDetails() and SetDarkerDetails() methods.
The brighter and darker details depend on the filter used in the pyramid analysis. If the
otb::OpeningClosingMorphologicalFilter filter is used, then the brighter details are those
from the supFilter image list, whereas if the otb::ClosingOpeningMorphologicalFilter
filter is used, the brighter details are those from the infFilter list. The output of the segmentation
filter is a single list of segmentation images, containing first the brighter details segmentation from
higher scale to lower, and then the darker details in the same order. The attention of the user is
drawn to the fact that, since the label filter used internally will deal with a large number of labels, the
OutputPixelType is required to be sufficiently precise. Unsigned short or unsigned long would
be a good choice, unless the user has a very good reason to think that a less precise type will be
sufficient. The first step to use this filter is to include its header file.
#include "otbMorphologicalPyramidSegmentationFilter.h"
The mathematical morphology filters to be used have also to be included here, as well as the
morphological pyramid analysis filter.
#include "otbOpeningClosingMorphologicalFilter.h"
#include "itkBinaryBallStructuringElement.h"
#include "otbMorphologicalPyramidAnalysisFilter.h"
As usual, we start by defining the types for the pixels, the images, the reader and the writer. We also
define the types needed for the morphological pyramid analysis.
const unsigned int Dimension = 2;
typedef unsigned char InputPixelType;
typedef unsigned short OutputPixelType;
typedef otb::Image <InputPixelType , Dimension > InputImageType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
typedef itk::BinaryBallStructuringElement<InputPixelType , Dimension >
StructuringElementType;
typedef otb::OpeningClosingMorphologicalFilter<InputImageType ,
InputImageType ,
StructuringElementType >
OpeningClosingFilterType;
typedef otb::MorphologicalPyramidAnalysisFilter<InputImageType ,
InputImageType ,
OpeningClosingFilterType>
PyramidFilterType;
We can now define the type for the otb::MorphologicalPyramidSegmentationFilter which is
templated over the input and output image types.
typedef otb::MorphologicalPyramidSegmentationFilter<InputImageType ,
OutputImageType >
SegmentationFilterType;
Since the output of the segmentation filter is a list of images, we define an iterator type which will
be used to access the segmented images.
typedef SegmentationFilterType:: OutputImageListIteratorType
OutputListIteratorType;
The following code snippet shows how to read the input image and perform the morphological
pyramid analysis.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFilename);
PyramidFilterType::Pointer pyramid = PyramidFilterType::New();
pyramid ->SetNumberOfLevels(numberOfLevels);
pyramid ->SetDecimationRatio(decimationRatio);
pyramid ->SetInput(reader ->GetOutput ());
We can now instantiate the segmentation filter and set its parameters. As one can see, the
SetReferenceImage() method is used to pass the original image in order to obtain sharp region boundaries.
Using the SetBrighterDetails() and SetDarkerDetails() methods, the output of the analysis is passed
to the filter. Finally, the parameters for the segmentation are set by using the SetSeedsQuantile(),
SetConnectedThresholdQuantile() and SetMinimumObjectSize() methods.
SegmentationFilterType::Pointer segmentation = SegmentationFilterType::New();
segmentation ->SetReferenceImage(reader ->GetOutput ());
segmentation ->SetBrighterDetails(pyramid ->GetSupFilter());
segmentation ->SetDarkerDetails(pyramid->GetInfFilter());
segmentation ->SetSeedsQuantile(seedsQuantile);
segmentation ->SetConnectedThresholdQuantile(segmentationQuantile);
segmentation ->SetMinimumObjectSize(minObjectSize);
The pipeline is executed by calling the Update() method.
segmentation ->Update();
Finally, we get an iterator to the list generated as output for the segmentation and we use it to iterate
through the list and write the images to files.
OutputListIteratorType it = segmentation ->GetOutput ()->Begin();
WriterType ::Pointer writer;
int index = 1;
std:: stringstream oss;
while (it != segmentation ->GetOutput ()->End ())
{
oss << outputFilenamePrefix << index << "." << outputFilenameSuffix;
writer = WriterType ::New();
writer ->SetInput(it.Get ());
writer ->SetFileName (oss.str (). c_str());
writer ->Update ();
std::cout << oss.str() << " file written." << std::endl;
oss.str("");
++index;
++it;
}
Note that the list contains first the brighter details segmentations, from higher scale to lower, and then
the darker details in the same order.
CHAPTER
SIXTEEN
IMAGE SEGMENTATION
Segmentation of remote sensing images is a challenging task. A myriad of different methods have
been proposed and implemented in recent years. In spite of the huge effort invested in this problem,
there is no single approach that can generally solve the problem of segmentation for the large variety
of image modalities existing today.
The most effective segmentation algorithms are obtained by carefully customizing combinations of
components. The parameters of these components are tuned for the characteristics of the image
modality used as input and the features of the objects to be segmented.
The Insight Toolkit provides a basic set of algorithms that can be used to develop and customize a
full segmentation application. They are therefore available in the Orfeo Toolbox. Some of the most
commonly used segmentation components are described in the following sections.
16.1 Region Growing
Region growing algorithms have proven to be an effective approach for image segmentation. The
basic approach of a region growing algorithm is to start from a seed region (typically one or more
pixels) that are considered to be inside the object to be segmented. The pixels neighboring this
region are evaluated to determine if they should also be considered part of the object. If so, they are
added to the region and the process continues as long as new pixels are added to the region. Region
growing algorithms vary depending on the criteria used to decide whether a pixel should be included
in the region or not, the type of connectivity used to determine neighbors, and the strategy used to visit
neighboring pixels.
Several implementations of region growing are available in ITK. This section describes some of the
most commonly used.
16.1.1 Connected Threshold
A simple criterion for including pixels in a growing region is to evaluate whether their intensity values lie
inside a specific interval.
The source code for this example can be found in the file
Examples/Segmentation/ConnectedThresholdImageFilter.cxx.
The following example illustrates the use of the itk::ConnectedThresholdImageFilter. This
filter uses the flood fill iterator. Most of the algorithmic complexity of a region growing method
comes from visiting neighboring pixels. The flood fill iterator assumes this responsibility and greatly
simplifies the implementation of the region growing algorithm. Thus the algorithm is left to establish
a criterion to decide whether a particular pixel should be included in the current region or not.
The criterion used by the ConnectedThresholdImageFilter is based on an interval of intensity values
provided by the user. Values of lower and upper threshold should be provided. The region growing
algorithm includes those pixels whose intensities are inside the interval.
I(X) ∈ [lower,upper] (16.1)
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the ConnectedThresholdImageFilter class must be included.
#include "itkConnectedThresholdImageFilter.h"
Noise present in the image can reduce the capacity of this filter to grow large regions. When faced
with noisy images, it is usually convenient to pre-process the image by using an edge-preserving
smoothing filter. In this particular example we use the itk::CurvatureFlowImageFilter, hence
we need to include its header file.
#include "itkCurvatureFlowImageFilter.h"
We declare the image type based on a particular pixel type and dimension. In this case the float
type is used for the pixels due to the requirements of the smoothing filter.
typedef float InternalPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InternalPixelType , Dimension > InternalImageType;
The smoothing filter is instantiated using the image type as a template parameter.
typedef itk::CurvatureFlowImageFilter<InternalImageType , InternalImageType >
CurvatureFlowImageFilterType;
Then the filter is created by invoking the New() method and assigning the result to a
itk::SmartPointer.
CurvatureFlowImageFilterType::Pointer smoothing =
CurvatureFlowImageFilterType::New ();
We now declare the type of the region growing filter. In this case it is the ConnectedThresholdIm-
ageFilter.
typedef itk::ConnectedThresholdImageFilter<InternalImageType ,
InternalImageType >
ConnectedFilterType;
Then we construct one filter of this class using the New() method.
ConnectedFilterType::Pointer connectedThreshold = ConnectedFilterType::New();
Now it is time to connect a simple, linear pipeline. A file reader is added at the beginning of the
pipeline and a cast filter and writer are added at the end. The cast filter is required to convert float
pixel types to integer types since only a few image file formats support float types.
smoothing ->SetInput (reader ->GetOutput ());
connectedThreshold ->SetInput (smoothing ->GetOutput ());
caster ->SetInput(connectedThreshold ->GetOutput ());
writer ->SetInput(caster ->GetOutput ());
The CurvatureFlowImageFilter requires a couple of parameters to be defined. The following are
typical values, however they may have to be adjusted depending on the amount of noise present in
the input image.
smoothing ->SetNumberOfIterations(5);
smoothing ->SetTimeStep (0.125);
The ConnectedThresholdImageFilter has two main parameters to be defined. They are the lower and
upper thresholds of the interval in which intensity values should fall in order to be included in the
region. Setting these two values too close will not allow enough flexibility for the region to grow.
Setting them too far apart will result in a region that engulfs the image.
connectedThreshold ->SetLower (lowerThreshold);
connectedThreshold ->SetUpper (upperThreshold);
The output of this filter is a binary image with zero-value pixels everywhere except on the extracted
region. The intensity value set inside the region is selected with the method SetReplaceValue()
connectedThreshold ->SetReplaceValue(
itk::NumericTraits <OutputPixelType >::max ());
The initialization of the algorithm requires the user to provide a seed point. It is convenient to select
this point to be placed in a typical region of the structure to be segmented. The seed is passed in the
form of an itk::Index to the SetSeed() method.
Structure   Seed Index   Lower   Upper   Output Image
Road        (110,38)      50      100    Second from left in Figure 16.1
Shadow      (118,100)      0       10    Third from left in Figure 16.1
Building    (169,146)     220     255    Fourth from left in Figure 16.1
Table 16.1: Parameters used for segmenting some structures shown in Figure 16.1 with the filter
itk::ConnectedThresholdImageFilter.
Figure 16.1: Segmentation results for the ConnectedThreshold filter for various seed points.
connectedThreshold ->SetSeed(index);
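The index variable used above is not declared in the snippets shown here; a minimal sketch of how it could
be filled before being passed to SetSeed() is given below (reading the coordinates from the command line,
and the argument positions themselves, are our assumptions):

InternalImageType::IndexType index;
index[0] = atoi(argv[3]); // X coordinate of the seed (hypothetical argument position)
index[1] = atoi(argv[4]); // Y coordinate of the seed (hypothetical argument position)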
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
usually wise to put update calls in a try/catch block in case errors occur and exceptions are thrown.
try
{
writer ->Update ();
}
catch (itk::ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Let’s run this example using as input the image QB_Suburb.png provided in the directory
Examples/Data. We can easily segment the major structures by providing seeds in the appropriate
locations and defining values for the lower and upper thresholds. Figure 16.1 illustrates several
examples of segmentation. The parameters used are presented in Table 16.1.
Notice that some objects are not being completely segmented. This illustrates the vulnerability of the
region growing methods when the structures to be segmented do not have a homogeneous statistical
distribution over the image space. You may want to experiment with different values of the lower
and upper thresholds to verify how the accepted region will extend.
Another option for segmenting regions is to take advantage of the functionality provided by the
ConnectedThresholdImageFilter for managing multiple seeds. The seeds can be passed one by one
to the filter using the AddSeed() method. You could imagine a user interface in which an operator
clicks on multiple points of the object to be segmented and each selected point is passed as a seed to
this filter.
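A minimal sketch of such a multi-seed configuration follows (the seed coordinates are hypothetical):

InternalImageType::IndexType seed1, seed2;
seed1[0] = 110; seed1[1] = 38;  // first click, e.g. on the road
seed2[0] = 115; seed2[1] = 45;  // second click on the same structure (hypothetical)
connectedThreshold->AddSeed(seed1);
connectedThreshold->AddSeed(seed2);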
16.1.2 Otsu Segmentation
Another criterion for classifying pixels is to minimize the error of misclassification. The goal is to
find a threshold that classifies the image into two clusters such that we minimize the area under the
histogram for one cluster that lies on the other cluster’s side of the threshold. This is equivalent to
minimizing the within class variance or equivalently maximizing the between class variance.
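As a reminder (this is the standard formulation and not specific to the ITK implementation), for a candidate
threshold t with class probabilities ω0(t), ω1(t) and class means µ0(t), µ1(t) computed from the histogram,
the between-class variance

    σB²(t) = ω0(t) ω1(t) [µ0(t) − µ1(t)]²

is maximized; the Otsu threshold is the value of t achieving this maximum.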
The source code for this example can be found in the file
Examples/Segmentation/OtsuThresholdImageFilter.cxx.
This example illustrates how to use the itk::OtsuThresholdImageFilter.
#include "itkOtsuThresholdImageFilter.h"
The next step is to decide which pixel types to use for the input and output images.
typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;
The input and output image types are now defined using their respective pixel types and dimensions.
typedef otb::Image <InputPixelType , 2> InputImageType;
typedef otb::Image <OutputPixelType , 2> OutputImageType;
The filter type can be instantiated using the input and output image types defined above.
typedef itk::OtsuThresholdImageFilter<
InputImageType , OutputImageType > FilterType ;
An otb::ImageFileReader class is also instantiated in order to read image data from a file. (See
Section 6 on page 99 for more information about reading and writing data.)
typedef otb::ImageFileReader <InputImageType > ReaderType ;
An otb::ImageFileWriter is instantiated in order to write the output image to a file.
typedef otb::ImageFileWriter <InputImageType > WriterType ;
Both the filter and the reader are created by invoking their New() methods and assigning the result
to itk::SmartPointers.
ReaderType ::Pointer reader = ReaderType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
The image obtained with the reader is passed as input to the OtsuThresholdImageFilter.
Figure 16.2: Effect of the OtsuThresholdImageFilter.
filter ->SetInput(reader ->GetOutput ());
The method SetOutsideValue() defines the intensity value to be assigned to those pixels
whose intensities are outside the range defined by the lower and upper thresholds. The method
SetInsideValue() defines the intensity value to be assigned to pixels with intensities falling in-
side the threshold range.
filter ->SetOutsideValue(outsideValue);
filter ->SetInsideValue(insideValue );
The method SetNumberOfHistogramBins() defines the number of bins to be used for computing
the histogram. This histogram will be used internally in order to compute the Otsu threshold.
filter ->SetNumberOfHistogramBins(128);
The execution of the filter is triggered by invoking the Update() method. If the filter’s output has
been passed as input to subsequent filters, the Update() call on any posterior filters in the pipeline
will indirectly trigger the update of this filter.
filter ->Update();
We print out here the threshold value that was computed internally by the filter. For this, we invoke
the GetThreshold() method.
int threshold = filter ->GetThreshold();
std::cout << "Threshold = " << threshold << std::endl;
Figure 16.2 illustrates the effect of this filter. This figure shows the limitations of this filter for
performing segmentation by itself. These limitations are particularly noticeable in noisy images and
in images lacking spatial uniformity.
The following classes provide similar functionality:
• itk::ThresholdImageFilter
The source code for this example can be found in the file
Examples/Segmentation/OtsuMultipleThresholdImageFilter.cxx.
This example illustrates how to use the itk::OtsuMultipleThresholdsCalculator.
#include "itkOtsuMultipleThresholdsCalculator.h"
The OtsuMultipleThresholdsCalculator calculates thresholds for a given histogram so as to maximize the
between-class variance. We use the ScalarImageToHistogramGenerator to generate the histograms.
typedef itk::Statistics :: ScalarImageToHistogramGenerator<InputImageType >
ScalarImageToHistogramGeneratorType;
typedef itk::OtsuMultipleThresholdsCalculator<
ScalarImageToHistogramGeneratorType::HistogramType > CalculatorType;
Once the thresholds are computed, we use the BinaryThresholdImageFilter to split the input image
into segments.
typedef itk::BinaryThresholdImageFilter<InputImageType , OutputImageType >
FilterType ;
ScalarImageToHistogramGeneratorType::Pointer scalarImageToHistogramGenerator =
ScalarImageToHistogramGeneratorType::New();
CalculatorType::Pointer calculator = CalculatorType::New ();
FilterType ::Pointer filter = FilterType ::New();
scalarImageToHistogramGenerator->SetNumberOfBins(128);
int nbThresholds = argc - 2;
calculator ->SetNumberOfThresholds(nbThresholds);
The pipeline will look as follows:
scalarImageToHistogramGenerator->SetInput (reader ->GetOutput ());
calculator ->SetInputHistogram(scalarImageToHistogramGenerator->GetOutput ());
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
The thresholds are obtained using the GetOutput() method.
const CalculatorType::OutputType & thresholdVector =
calculator ->GetOutput ();
CalculatorType::OutputType ::const_iterator itNum = thresholdVector.begin();
Figure 16.3: Effect of the OtsuMultipleThresholdImageFilter.
for (; itNum < thresholdVector.end(); itNum++)
{
std::cout << "OtsuThreshold["
<< (int) (itNum - thresholdVector.begin())
<< "] = " <<
static_cast<itk::NumericTraits <CalculatorType::MeasurementType >:: PrintType >
(*itNum) << std::endl;
upperThreshold = (*itNum);
filter ->SetLowerThreshold(static_cast<OutputPixelType > (lowerThreshold));
filter ->SetUpperThreshold(static_cast<OutputPixelType > (upperThreshold));
lowerThreshold = upperThreshold;
writer ->SetFileName (argv[2 + counter ]);
++counter;
}
Figure 16.3 illustrates the effect of this filter.
The following classes provide similar functionality:
• itk::ThresholdImageFilter
16.1.3 Neighborhood Connected
The source code for this example can be found in the file
Examples/Segmentation/NeighborhoodConnectedImageFilter.cxx.
The following example illustrates the use of the itk::NeighborhoodConnectedImageFilter.
This filter is a close variant of the itk::ConnectedThresholdImageFilter. On one hand, the
ConnectedThresholdImageFilter accepts a pixel in the region if its intensity is in the interval defined
by two user-provided threshold values. The NeighborhoodConnectedImageFilter, on the other hand,
will only accept a pixel if all its neighbors have intensities that fit in the interval. The size of the
neighborhood to be considered around each pixel is defined by a user-provided integer radius.
The reason for considering the neighborhood intensities instead of only the current pixel intensity
is that small structures are less likely to be accepted in the region. The operation of this filter is
equivalent to applying the ConnectedThresholdImageFilter followed by mathematical morphology
erosion using a structuring element of the same shape as the neighborhood provided to the Neigh-
borhoodConnectedImageFilter.
#include "itkNeighborhoodConnectedImageFilter.h"
The itk::CurvatureFlowImageFilter is used here to smooth the image while preserving edges.
#include "itkCurvatureFlowImageFilter.h"
We now define the image type using a particular pixel type and image dimension. In this case the
float type is used for the pixels due to the requirements of the smoothing filter.
typedef float InternalPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InternalPixelType , Dimension > InternalImageType;
The smoothing filter type is instantiated using the image type as a template parameter.
typedef itk::CurvatureFlowImageFilter<InternalImageType , InternalImageType >
CurvatureFlowImageFilterType;
Then, the filter is created by invoking the New() method and assigning the result to a
itk::SmartPointer.
CurvatureFlowImageFilterType::Pointer smoothing =
CurvatureFlowImageFilterType::New ();
We now declare the type of the region growing filter. In this case it is the NeighborhoodConnected-
ImageFilter.
typedef itk::NeighborhoodConnectedImageFilter<InternalImageType ,
InternalImageType >
ConnectedFilterType;
One filter of this class is constructed using the New() method.
ConnectedFilterType::Pointer neighborhoodConnected = ConnectedFilterType::New ();
Now it is time to create a simple, linear data processing pipeline. A file reader is added at the
beginning of the pipeline and a cast filter and writer are added at the end. The cast filter is required
to convert float pixel types to integer types since only a few image file formats support float
types.
smoothing ->SetInput (reader ->GetOutput ());
neighborhoodConnected ->SetInput(smoothing ->GetOutput ());
caster ->SetInput(neighborhoodConnected ->GetOutput ());
writer ->SetInput(caster ->GetOutput ());
The CurvatureFlowImageFilter requires a couple of parameters to be defined. The following are
typical values for 2D images. However they may have to be adjusted depending on the amount of
noise present in the input image.
smoothing ->SetNumberOfIterations(5);
smoothing ->SetTimeStep (0.125);
The NeighborhoodConnectedImageFilter requires that two main parameters are specified. They are
the lower and upper thresholds of the interval in which intensity values must fall to be included in
the region. Setting these two values too close will not allow enough flexibility for the region to grow.
Setting them too far apart will result in a region that engulfs the image.
neighborhoodConnected ->SetLower(lowerThreshold);
neighborhoodConnected ->SetUpper(upperThreshold);
Here, we add the crucial parameter that defines the neighborhood size used to determine whether a
pixel lies in the region. The larger the neighborhood, the more stable this filter will be against noise
in the input image, but also the longer the computing time will be. Here we select a filter of radius
2 along each dimension. This results in a neighborhood of 5 × 5 pixels.
InternalImageType::SizeType radius;
radius[0] = 2; // two pixels along X
radius[1] = 2; // two pixels along Y
neighborhoodConnected ->SetRadius (radius);
As in the ConnectedThresholdImageFilter we must now provide the intensity value to be used for
the output pixels accepted in the region and at least one seed point to define the initial region.
Structure   Seed Index   Lower   Upper   Output Image
Road        (110,38)      50      100    Second from left in Figure 16.4
Shadow      (118,100)      0       10    Third from left in Figure 16.4
Building    (169,146)     220     255    Fourth from left in Figure 16.4
Table 16.2: Parameters used for segmenting some structures shown in Figure 16.4 with the filter
itk::NeighborhoodConnectedImageFilter.
Figure 16.4: Segmentation results for the NeighborhoodConnectedThreshold filter for various seed points.
neighborhoodConnected ->SetSeed(index);
neighborhoodConnected ->SetReplaceValue(255);
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
usually wise to put update calls in a try/catch block in case errors occur and exceptions are thrown.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Let’s run this example using as input the image QB_Suburb.png provided in the directory
Examples/Data. We can easily segment the major structures by providing seeds in the appropriate
locations and defining values for the lower and upper thresholds. Figure 16.4 illustrates several
examples of segmentation. The parameters used are presented in Table 16.2.
As with the ConnectedThresholdImageFilter, several seeds could be provided to the filter by using
the AddSeed() method. Compare the output of Figure 16.4 with those of Figure 16.1 produced by
the ConnectedThresholdImageFilter. You may want to play with the value of the neighborhood radius
and see how it affects the smoothness of the segmented object borders, the size of the segmented
region and how much that costs in computing time.
16.1.4 Confidence Connected
The source code for this example can be found in the file
Examples/Segmentation/ConfidenceConnected.cxx.
The following example illustrates the use of the itk::ConfidenceConnectedImageFilter. The
criterion used by the ConfidenceConnectedImageFilter is based on simple statistics of the current
region. First, the algorithm computes the mean and standard deviation of intensity values for all
the pixels currently included in the region. A user-provided factor is used to multiply the standard
deviation and define a range around the mean. Neighbor pixels whose intensity values fall inside the
range are accepted and included in the region. When no more neighbor pixels are found that satisfy
the criterion, the algorithm is considered to have finished its first iteration. At that point, the mean
and standard deviation of the intensity levels are recomputed using all the pixels currently included
in the region. This mean and standard deviation defines a new intensity range that is used to visit
current region neighbors and evaluate whether their intensity falls inside the range. This iterative
process is repeated until no more pixels are added or the maximum number of iterations is reached.
The following equation illustrates the inclusion criterion used by this filter,
I(X) ∈ [m − fσ, m + fσ] (16.2)
where m and σ are the mean and standard deviation of the region intensities, f is a factor defined by
the user, I() is the image and X is the position of the particular neighbor pixel being considered for
inclusion in the region.
Let’s look at the minimal code required to use this algorithm. First, the following header defining
the itk::ConfidenceConnectedImageFilter class must be included.
#include "itkConfidenceConnectedImageFilter.h"
Noise present in the image can reduce the capacity of this filter to grow large regions. When faced
with noisy images, it is usually convenient to pre-process the image by using an edge-preserving
smoothing filter. In this particular example we use the itk::CurvatureFlowImageFilter, hence
we need to include its header file.
#include "itkCurvatureFlowImageFilter.h"
We now define the image type using a pixel type and a particular dimension. In this case the float
type is used for the pixels due to the requirements of the smoothing filter.
typedef float InternalPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InternalPixelType , Dimension > InternalImageType;
The smoothing filter type is instantiated using the image type as a template parameter.
typedef itk::CurvatureFlowImageFilter<InternalImageType , InternalImageType >
CurvatureFlowImageFilterType;
Next the filter is created by invoking the New() method and assigning the result to a
itk::SmartPointer.
CurvatureFlowImageFilterType::Pointer smoothing =
CurvatureFlowImageFilterType::New ();
We now declare the type of the region growing filter. In this case it is the ConfidenceConnectedIm-
ageFilter.
typedef itk::ConfidenceConnectedImageFilter<InternalImageType ,
InternalImageType >
ConnectedFilterType;
Then, we construct one filter of this class using the New() method.
ConnectedFilterType::Pointer confidenceConnected = ConnectedFilterType::New ();
Now it is time to create a simple, linear pipeline. A file reader is added at the beginning of the
pipeline and a cast filter and writer are added at the end. The cast filter is required here to convert
float pixel types to integer types since only a few image file formats support float types.
smoothing ->SetInput (reader ->GetOutput ());
confidenceConnected ->SetInput(smoothing ->GetOutput ());
caster ->SetInput(confidenceConnected ->GetOutput ());
writer ->SetInput(caster ->GetOutput ());
The CurvatureFlowImageFilter requires defining two parameters. The following are typical values.
However they may have to be adjusted depending on the amount of noise present in the input image.
smoothing ->SetNumberOfIterations(5);
smoothing ->SetTimeStep (0.125);
The ConfidenceConnectedImageFilter requires defining two parameters. First, the factor f
defines how large the range of intensities will be. Small values of the multiplier will restrict the
inclusion of pixels to those having very similar intensities to those in the current region. Larger
values of the multiplier will relax the accepting condition and will result in more generous growth
of the region. Values that are too large will cause the region to grow into neighboring regions that
may actually belong to separate structures.
confidenceConnected ->SetMultiplier(2.5);
The number of iterations is specified based on the homogeneity of the intensities of the object to
be segmented. Highly homogeneous regions may only require a couple of iterations. Regions with a
ramp effect may require more iterations. In practice, it seems to be more important to carefully
select the multiplier factor than the number of iterations. However, keep in mind that there is no
reason to assume that this algorithm should converge to a stable region. It is possible that by letting
the algorithm run for more iterations the region will end up engulfing the entire image.
confidenceConnected ->SetNumberOfIterations(5);
The output of this filter is a binary image with zero-value pixels everywhere except on the ex-
tracted region. The intensity value to be set inside the region is selected with the method
SetReplaceValue()
confidenceConnected ->SetReplaceValue(255);
The initialization of the algorithm requires the user to provide a seed point. It is convenient to
select this point to be placed in a typical region of the structure to be segmented. A small neighbor-
hood around the seed point will be used to compute the initial mean and standard deviation for the
inclusion criterion. The seed is passed in the form of an itk::Index to the SetSeed() method.
confidenceConnected ->SetSeed(index);
The size of the initial neighborhood around the seed is defined with the method
SetInitialNeighborhoodRadius(). The neighborhood will be defined as an N-dimensional rect-
angular region with 2r + 1 pixels on the side, where r is the value passed as initial neighborhood
radius.
confidenceConnected ->SetInitialNeighborhoodRadius(2);
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is
recommended to place update calls in a try/catch block in case errors occur and exceptions are
thrown.
try
{
writer ->Update ();
}
catch (itk::ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Let’s now run this example using as input the image QB Suburb.png provided in the directory
Examples/Data. We can easily segment structures by providing seeds in the appropriate locations.
For example:
Structure | Seed Index | Lower | Upper | Output Image
Road | (110,38) | 50 | 100 | Second from left in Figure 16.1
Shadow | (118,100) | 0 | 10 | Third from left in Figure 16.1
Building | (169,146) | 220 | 255 | Fourth from left in Figure 16.1
Table 16.3: Parameters used for segmenting some structures shown in Figure 16.1 with the filter
itk::ConnectedThresholdImageFilter.
Figure 16.5: Segmentation results for the ConfidenceConnected filter for various seed points.
16.2 Segmentation Based on Watersheds
16.2.1 Overview
Watershed segmentation classifies pixels into regions using gradient descent on image features and
analysis of weak points along region boundaries. Imagine water raining onto a landscape topology
and flowing with gravity to collect in low basins. The size of those basins will grow with increasing
amounts of precipitation until they spill into one another, causing small basins to merge together into
larger basins. Regions (catchment basins) are formed by using local geometric structure to associate
points in the image domain with local extrema in some feature measurement such as curvature or
gradient magnitude. This technique is less sensitive to user-defined thresholds than classic region-
growing methods, and may be better suited for fusing different types of features from different data
sets. The watersheds technique is also more flexible in that it does not produce a single image
segmentation, but rather a hierarchy of segmentations from which a single region or set of regions
can be extracted a-priori, using a threshold, or interactively, with the help of a graphical user interface
[152, 153].
The strategy of watershed segmentation is to treat an image f as a height function, i.e., the surface
formed by graphing f as a function of its independent parameters, x ∈ U. The image f is often not
the original input data, but is derived from that data through some filtering, graded (or fuzzy) feature
extraction, or fusion of feature maps from different sources. The assumption is that higher values
of f (or − f) indicate the presence of boundaries in the original data. Watersheds may therefore
be considered as a final or intermediate step in a hybrid segmentation method, where the initial
segmentation is the generation of the edge feature map.
Figure 16.6: A fuzzy-valued boundary map, from an image or set of images, is segmented using local minima and catchment basins (panels from left to right: intensity profile of input image, intensity profile of filtered image, watershed segmentation).
Gradient descent associates regions with local minima of f (clearly interior points) using the water-
sheds of the graph of f, as in Figure 16.6. That is, a segment consists of all points in U whose paths
of steepest descent on the graph of f terminate at the same minimum in f. Thus, there are as many
segments in an image as there are minima in f. The segment boundaries are “ridges” [80, 81, 45]
in the graph of f. In the 1D case (U ⊂ ℜ), the watershed boundaries are the local maxima of f,
and the result of the watershed segmentation is trivial. For higher-dimensional image domains,
the watershed boundaries are not simply local phenomena; they depend on the shape of the entire
watershed.
The drawback of watershed segmentation is that it produces a region for each local minimum—in
practice too many regions—and an oversegmentation results. To alleviate this, we can establish a
minimum watershed depth. The watershed depth is the difference in height between the watershed
minimum and the lowest boundary point. In other words, it is the maximum depth of water a region
could hold without flowing into any of its neighbors. Thus, a watershed segmentation algorithm can
sequentially combine watersheds whose depths fall below the minimum until all of the watersheds
are of sufficient depth. This depth measurement can be combined with other saliency measurements,
such as size. The result is a segmentation containing regions whose boundaries and size are signif-
icant. Because the merging process is sequential, it produces a hierarchy of regions, as shown in
Figure 16.7. Previous work has shown the benefit of a user-assisted approach that provides a graph-
ical interface to this hierarchy, so that a technician can quickly move from the small regions that lie
within an area of interest to the union of regions that correspond to the anatomical structure [153].
There are two different algorithms commonly used to implement watersheds: top-down and bottom-
up. The top-down, gradient descent strategy was chosen for ITK because we want to consider
the output of multi-scale differential operators, and the f in question will therefore have floating
point values. The bottom-up strategy starts with seeds at the local minima in the image and grows
regions outward and upward at discrete intensity levels (equivalent to a sequence of morphological
operations and sometimes called morphological watersheds [126]). This limits the accuracy by
enforcing a set of discrete gray levels on the image.
Figure 16.8 shows how the ITK image-to-image watersheds filter is constructed. The filter is actually
a collection of smaller filters that modularize the several steps of the algorithm in a mini-pipeline.
The segmenter object creates the initial segmentation via steepest descent from each pixel to local
minima. Shallow background regions are removed (flattened) before segmentation using a simple
minimum value threshold (this helps to minimize oversegmentation of the image).
Figure 16.7: A watershed segmentation combined with a saliency measure (watershed depth) produces a hierarchy of regions. Structures can be derived from images by either thresholding the saliency measure or combining subtrees within the hierarchy.
Figure 16.8: The construction of the Insight watersheds filter (sub-filters: Segmenter, Tree Generator, Relabeler; parameters: Threshold, Maximum Flood Level, Output Flood Level).
The initial segmentation is passed to a second sub-filter that generates a hierarchy of basins to a user-specified
maximum watershed depth. The relabeler object at the end of the mini-pipeline uses the hierarchy
and the initial segmentation to produce an output image at any scale below the user-specified maxi-
mum. Data objects are cached in the mini-pipeline so that changing watershed depths only requires
a (fast) relabeling of the basic segmentation. The three parameters that control the filter are shown
in Figure 16.8 connected to their relevant processing stages.
16.2.2 Using the ITK Watershed Filter
The source code for this example can be found in the file
Examples/Segmentation/WatershedSegmentation.cxx.
The following example illustrates how to preprocess and segment images using the
itk::WatershedImageFilter. Note that the care with which the data is preprocessed will greatly
affect the quality of your result. Typically, the best results are obtained by preprocessing the origi-
nal image with an edge-preserving diffusion filter, such as one of the anisotropic diffusion filters, or
with the bilateral image filter. As noted in Section 16.2.1, the height function used as input should be
created such that higher positive values correspond to object boundaries. A suitable height function
for many applications can be generated as the gradient magnitude of the image to be segmented.
The itk::VectorGradientAnisotropicDiffusionImageFilter class is used to
smooth the image and the itk::VectorGradientMagnitudeImageFilter is used to generate
the height function. We begin by including all preprocessing filter header files and the header file
for the WatershedImageFilter. We use the vector versions of these filters because the input data is a
color image.
#include "itkVectorGradientAnisotropicDiffusionImageFilter.h"
#include "itkVectorGradientMagnitudeImageFilter.h"
#include "itkWatershedImageFilter.h"
We now declare the image and pixel types to use for instantiation of the filters. All of
these filters expect real-valued pixel types in order to work properly. The preprocessing
stages are done directly on the vector-valued data and the segmentation is done using floating
point scalar data. Images are converted from RGB pixel type to numerical vector type using
itk::VectorCastImageFilter. Please pay attention to the fact that we are using itk::Images
since the itk::VectorGradientMagnitudeImageFilter has some internal typedefs which make
polymorphism impossible.
typedef itk::RGBPixel <unsigned char> RGBPixelType;
typedef otb::Image <RGBPixelType , 2> RGBImageType;
typedef itk::Vector <float, 3> VectorPixelType;
typedef itk::Image <VectorPixelType , 2> VectorImageType;
typedef itk::Image <unsigned long, 2> LabeledImageType;
typedef itk::Image <float, 2> ScalarImageType;
The various image processing filters are declared using the types created above and eventually used
in the pipeline.
typedef otb::ImageFileReader <RGBImageType > FileReaderType;
typedef itk::VectorCastImageFilter <RGBImageType , VectorImageType >
CastFilterType;
typedef itk:: VectorGradientAnisotropicDiffusionImageFilter<VectorImageType ,
VectorImageType >
DiffusionFilterType;
typedef itk::VectorGradientMagnitudeImageFilter<VectorImageType , float,
ScalarImageType >
GradientMagnitudeFilterType;
typedef itk::WatershedImageFilter <ScalarImageType > WatershedFilterType;
Next we instantiate the filters and set their parameters. The first step in the image processing pipeline
is diffusion of the color input image using an anisotropic diffusion filter. For this class of filters, the
CFL condition requires that the time step be no more than 0.25 for two-dimensional images, and no
more than 0.125 for three-dimensional images. The number of iterations and the conductance term
will be taken from the command line. See Section 8.7.2 for more information on the ITK anisotropic
diffusion filters.
DiffusionFilterType::Pointer diffusion = DiffusionFilterType::New ();
diffusion ->SetNumberOfIterations(atoi(argv[4]));
diffusion ->SetConductanceParameter(atof(argv[3]));
diffusion ->SetTimeStep (0.125);
The ITK gradient magnitude filter for vector-valued images can optionally take several parameters.
Here we allow only enabling or disabling of principal component analysis.
GradientMagnitudeFilterType::Pointer
gradient = GradientMagnitudeFilterType::New ();
gradient ->SetUsePrincipleComponents(atoi(argv[7]));
Finally we set up the watershed filter. There are two parameters. Level controls watershed depth,
and Threshold controls the lower thresholding of the input. Both parameters are set as a percentage
(0.0 - 1.0) of the maximum depth in the input image.
WatershedFilterType::Pointer watershed = WatershedFilterType::New ();
watershed ->SetLevel (atof(argv[6]));
watershed ->SetThreshold(atof(argv[5]));
The output of WatershedImageFilter is an image of unsigned long integer labels, where a label
denotes membership of a pixel in a particular segmented region. This format is not practical for
visualization, so for the purposes of this example, we will convert it to RGB pixels. RGB images
have the advantage that they can be saved as a simple png file and viewed using any standard image
viewer software. The itk::Functor::ScalarToRGBPixelFunctor class is a special function
object designed to hash a scalar value into an itk::RGBPixel. Plugging this functor into the
itk::UnaryFunctorImageFilter creates an image filter that converts scalar images to RGB
images.
typedef itk::Functor::ScalarToRGBPixelFunctor<unsigned long>
ColorMapFunctorType;
typedef itk::UnaryFunctorImageFilter<LabeledImageType,
RGBImageType,
ColorMapFunctorType> ColorMapFilterType;
ColorMapFilterType::Pointer colormapper = ColorMapFilterType::New();
Figure 16.9: Segmented RGB image. At left is the original image. The image in the middle was generated with parameters: conductance = 2.0, iterations = 10, threshold = 0.0, level = 0.05, principal components = on. The image on the right was generated with parameters: conductance = 2.0, iterations = 10, threshold = 0.001, level = 0.15, principal components = off.
The filters are connected into a single pipeline, with readers and writers at each end.
caster ->SetInput(reader ->GetOutput ());
diffusion ->SetInput (caster ->GetOutput ());
gradient ->SetInput (diffusion ->GetOutput ());
watershed ->SetInput (gradient ->GetOutput ());
colormapper ->SetInput (watershed ->GetOutput ());
writer ->SetInput(colormapper ->GetOutput ());
Tuning the filter parameters for any particular application is a process of trial and error. The thresh-
old parameter can be used to great effect in controlling oversegmentation of the image. Raising the
threshold will generally reduce computation time and produce output with fewer and larger regions.
The trick in tuning parameters is to consider the scale level of the objects that you are trying to
segment in the image. The best time/quality trade-off will be achieved when the image is smoothed
and thresholded to eliminate features just below the desired scale.
Figure 16.9 shows output from the example code. Note that a critical difference between the two
segmentations is the mode of the gradient magnitude calculation.
A note on the computational complexity of the watershed algorithm is warranted. Most of the
complexity of the ITK implementation lies in generating the hierarchy. Processing times for this
stage are non-linear with respect to the number of catchment basins in the initial segmentation. This
means that the amount of information contained in an image is more significant than the number of
pixels in the image. A very large but very flat input takes less time to segment than a very small but
very detailed input.
16.3 Level Set Segmentation
Figure 16.10: Concept of the zero set in a level set (zero set: f(x,y) = 0; interior: f(x,y) > 0; exterior: f(x,y) < 0).
The paradigm of the level set is that it is a numerical method for tracking the evolution of contours and surfaces. Instead of manipulating the contour directly, the contour is embedded as the zero level set of a higher dimensional function called the level-set function, ψ(X,t). The level-set function is then evolved under the control of a differential equation. At any time, the evolving contour can be obtained by extracting the zero level-set Γ(X,t) = {ψ(X,t) = 0} from the output. The main advantages of using level sets are that arbitrarily complex shapes can be modeled and topological changes such as merging and splitting are handled implicitly.
Level sets can be used for image segmentation by using image-based features such as mean inten-
sity, gradient and edges in the governing differential equation. In a typical approach, a contour is
initialized by a user and is then evolved until it fits the form of an object in the image. Many dif-
ferent implementations and variants of this basic concept have been published in the literature. An
overview of the field is given by Sethian [127].
The following sections introduce practical examples of some of the level set segmentation methods
available in ITK. The remainder of this section describes features common to all of these filters
except the itk::FastMarchingImageFilter, which is derived from a different code framework.
Understanding these features will aid in using the filters more effectively.
Each filter makes use of a generic level-set equation to compute the update to the solution ψ of the
partial differential equation.
\[ \frac{d}{dt}\psi = -\alpha\, A(\mathbf{x})\cdot\nabla\psi \;-\; \beta\, P(\mathbf{x})\,\lvert\nabla\psi\rvert \;+\; \gamma\, Z(\mathbf{x})\,\kappa\,\lvert\nabla\psi\rvert \tag{16.3} \]
where A is an advection term, P is a propagation (expansion) term, and Z is a spatial modifier term
for the mean curvature κ. The scalar constants α, β, and γ weight the relative influence of each of
the terms on the movement of the interface. A segmentation filter may use all of these terms in its
calculations, or it may omit one or more terms. If a term is left out of the equation, then setting the
corresponding scalar constant weighting will have no effect.
Figure 16.11: The implicit level set surface Γ is the black line superimposed over the image grid. The location of the surface is interpolated by the image pixel values. The grid pixels closest to the implicit surface are shown in gray.
All of the level-set based segmentation filters must operate with floating point precision to produce valid results. The third, optional template parameter is the numerical type used for calculations and as the output image pixel type. The numerical type is float by default, but can be changed to double for extra precision. A user-defined, signed floating point type that defines all of the necessary arithmetic operators and has sufficient precision is also a valid choice. You should not use types such as int or unsigned char for the numerical parameter. If the input image pixel types do not match the numerical type, those inputs will be cast to an image of appropriate type when the filter is executed.
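As an illustrative sketch (not part of the guide's example code), a level-set segmentation filter could be instantiated with double precision by supplying the optional third template argument; InternalImageType is assumed to be the floating point image type used elsewhere in this chapter.
#include "itkGeodesicActiveContourLevelSetImageFilter.h"

// Sketch only: double-precision instantiation of a level-set filter via the
// optional output pixel type template parameter.
typedef itk::GeodesicActiveContourLevelSetImageFilter<
InternalImageType, InternalImageType, double> DoublePrecisionLevelSetType;
DoublePrecisionLevelSetType::Pointer levelSetFilter =
DoublePrecisionLevelSetType::New();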
Most filters require two images as input, an initial model ψ(X,t = 0), and a feature image, which is
either the image you wish to segment or some preprocessed version. You must specify the isovalue
that represents the surface Γ in your initial model. The single image output of each filter is the
function ψ at the final time step. It is important to note that the contour representing the surface Γ
is the zero level-set of the output image, and not the isovalue you specified for the initial model. To
represent Γ using the original isovalue, simply add that value back to the output.
The solution Γ is calculated to subpixel precision. The best discrete approximation of the surface is
therefore the set of grid positions closest to the zero-crossings in the image, as shown in Figure 16.11.
The itk::ZeroCrossingImageFilter operates by finding exactly those grid positions and can be
used to extract the surface.
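A minimal sketch of this extraction step, assuming the InternalImageType and OutputImageType typedefs used in the examples of this chapter and the levelSetFilter object sketched above:
#include "itkZeroCrossingImageFilter.h"

typedef itk::ZeroCrossingImageFilter<InternalImageType, OutputImageType>
ZeroCrossingFilterType;
ZeroCrossingFilterType::Pointer zeroCrossing = ZeroCrossingFilterType::New();
// Pixels adjacent to a sign change of the level-set function are marked as
// foreground; all other pixels receive the background value.
zeroCrossing->SetInput(levelSetFilter->GetOutput());
zeroCrossing->SetBackgroundValue(0);
zeroCrossing->SetForegroundValue(255);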
There are two important considerations when analyzing the processing time for any particular level-
set segmentation task: the surface area of the evolving interface and the total distance that the surface
must travel. Because the level-set equations are usually solved only at pixels near the surface (fast
marching methods are an exception), the time taken at each iteration depends on the number of
points on the surface. This means that as the surface grows, the solver will slow down proportionally.
Because the surface must evolve slowly to prevent numerical instabilities in the solution, the distance the surface must travel in the image dictates the total number of iterations required.
Figure 16.12: Collaboration diagram of the FastMarchingImageFilter applied to a segmentation task (stages: anisotropic diffusion, gradient magnitude, sigmoid filter, fast marching, binary threshold; parameters: Iterations, Sigma, Alpha, Beta, Seeds, Threshold).
Some level-set techniques are relatively insensitive to initial conditions and are
therefore suitable for region-growing segmentation. Other techniques, such as the
itk::LaplacianSegmentationLevelSetImageFilter, can easily become “stuck” on image
features close to their initialization and should be used only when a reasonable prior segmentation
is available as the initialization. For best efficiency, your initial model of the surface should be the
best guess possible for the solution.
16.3.1 Fast Marching Segmentation
The source code for this example can be found in the file
Examples/Segmentation/FastMarchingImageFilter.cxx.
When the differential equation governing the level set evolution has a very simple form, a fast
evolution algorithm called fast marching can be used.
The following example illustrates the use of the itk::FastMarchingImageFilter. This filter
implements a fast marching solution to a simple level set evolution problem. In this example, the
speed term used in the differential equation is expected to be provided by the user in the form of an
image. This image is typically computed as a function of the gradient magnitude. Several mappings
are popular in the literature, for example the negative exponential exp(−x) and the reciprocal 1/(1+x).
In the current example we decided to use a sigmoid function since it offers several control
parameters that can be tuned to shape a suitable speed image.
The mapping should be done in such a way that the propagation speed of the front will be very low
close to high image gradients while it will move rather fast in low gradient areas. This arrangement
will make the contour propagate until it reaches the edges of anatomical structures in the image
and then slow down in front of those edges. The output of the FastMarchingImageFilter is a time-
crossing map that indicates, for each pixel, how much time it would take for the front to arrive at the
pixel location.
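Formally, the filter computes the arrival-time map T(x) by solving the Eikonal equation, where F(x) is the speed image described above:
\[ F(\mathbf{x})\,\lvert \nabla T(\mathbf{x}) \rvert = 1 \]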
The application of a threshold in the output image is then equivalent to taking a snapshot of the
contour at a particular time during its evolution. It is expected that the contour will take a longer
time to cross over the edges of a particular structure. This should result in large changes on the
time-crossing map values close to the structure edges. Segmentation is performed with this filter by
locating a time range in which the contour was contained for a long time in a region of the image
space.
Figure 16.12 shows the major components involved in the application of the FastMarchingIm-
ageFilter to a segmentation task. It involves an initial stage of smoothing using the
itk::CurvatureAnisotropicDiffusionImageFilter. The smoothed image is passed as
the input to the itk::GradientMagnitudeRecursiveGaussianImageFilter and then to the
itk::SigmoidImageFilter. Finally, the output of the FastMarchingImageFilter is passed to an
itk::BinaryThresholdImageFilter in order to produce a binary mask representing the segmented
object.
The code in the following example illustrates the typical setup of a pipeline for performing segmen-
tation with fast marching. First, the input image is smoothed using an edge-preserving filter. Then
the magnitude of its gradient is computed and passed to a sigmoid filter. The result of the sigmoid
filter is the image potential that will be used to affect the speed term of the differential equation.
Let’s start by including the following headers. First we include the header of the Curvature-
AnisotropicDiffusionImageFilter that will be used for removing noise from the input image.
#include "itkCurvatureAnisotropicDiffusionImageFilter.h"
The headers of the GradientMagnitudeRecursiveGaussianImageFilter and SigmoidImageFilter are
included below. Together, these two filters will produce the image potential for regulating the speed
term in the differential equation describing the evolution of the level set.
#include "itkGradientMagnitudeRecursiveGaussianImageFilter.h"
#include "itkSigmoidImageFilter.h"
Of course, we will need the otb::Image class and the FastMarchingImageFilter class. Hence we
include their headers.
#include "otbImage.h"
#include "itkFastMarchingImageFilter.h"
The time-crossing map resulting from the FastMarchingImageFilter will be thresholded using the
BinaryThresholdImageFilter. We include its header here.
#include "itkBinaryThresholdImageFilter.h"
Reading and writing images will be done with the otb::ImageFileReader and
otb::ImageFileWriter.
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
We now define the image type using a pixel type and a particular dimension. In this case the float
type is used for the pixels due to the requirements of the smoothing filter.
typedef float InternalPixelType;
const unsigned int Dimension = 2;
typedef otb::Image <InternalPixelType , Dimension > InternalImageType;
The output image, on the other hand, is declared to be binary.
typedef unsigned char OutputPixelType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
The type of the BinaryThresholdImageFilter is instantiated below using the internal image type
and the output image type.
typedef itk::BinaryThresholdImageFilter<InternalImageType ,
OutputImageType >
ThresholdingFilterType;
ThresholdingFilterType::Pointer thresholder = ThresholdingFilterType::New();
The upper threshold passed to the BinaryThresholdImageFilter will define the time snapshot that we
are taking from the time-crossing map.
thresholder ->SetLowerThreshold(0.0);
thresholder ->SetUpperThreshold(timeThreshold);
thresholder ->SetOutsideValue(0);
thresholder ->SetInsideValue(255);
We instantiate reader and writer types in the following lines.
typedef otb::ImageFileReader <InternalImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The CurvatureAnisotropicDiffusionImageFilter type is instantiated using the internal image type.
typedef itk::CurvatureAnisotropicDiffusionImageFilter<
InternalImageType ,
InternalImageType > SmoothingFilterType;
Then, the filter is created by invoking the New() method and assigning the result to an
itk::SmartPointer.
SmoothingFilterType::Pointer smoothing = SmoothingFilterType::New ();
The types of the GradientMagnitudeRecursiveGaussianImageFilter and SigmoidImageFilter are in-
stantiated using the internal image type.
typedef itk::GradientMagnitudeRecursiveGaussianImageFilter<
InternalImageType ,
InternalImageType > GradientFilterType;
typedef itk::SigmoidImageFilter <
InternalImageType ,
InternalImageType > SigmoidFilterType;
The corresponding filter objects are instantiated with the New() method.
GradientFilterType::Pointer gradientMagnitude = GradientFilterType::New ();
SigmoidFilterType::Pointer sigmoid = SigmoidFilterType::New();
The minimum and maximum values of the SigmoidImageFilter output are defined with the methods
SetOutputMinimum() and SetOutputMaximum(). In our case, we want these two values to be 0.0
and 1.0 respectively in order to get a nice speed image to feed to the FastMarchingImageFilter.
sigmoid ->SetOutputMinimum(0.0);
sigmoid ->SetOutputMaximum(1.0);
We now declare the type of the FastMarchingImageFilter.
typedef itk::FastMarchingImageFilter <InternalImageType ,
InternalImageType >
FastMarchingFilterType;
Then, we construct one filter of this class using the New() method.
FastMarchingFilterType::Pointer fastMarching = FastMarchingFilterType::New();
The filters are now connected in a pipeline shown in Figure 16.12 using the following lines.
smoothing ->SetInput (reader ->GetOutput ());
gradientMagnitude ->SetInput (smoothing ->GetOutput ());
sigmoid ->SetInput(gradientMagnitude ->GetOutput ());
fastMarching ->SetInput(sigmoid->GetOutput ());
thresholder ->SetInput (fastMarching ->GetOutput ());
writer ->SetInput(thresholder ->GetOutput ());
The CurvatureAnisotropicDiffusionImageFilter class requires a couple of parameters to be defined.
The following are typical values. However they may have to be adjusted depending on the amount
of noise present in the input image.
smoothing ->SetTimeStep (0.125);
smoothing ->SetNumberOfIterations(10);
smoothing ->SetConductanceParameter(2.0);
The GradientMagnitudeRecursiveGaussianImageFilter performs the equivalent of a convolution
with a Gaussian kernel followed by a derivative operator. The sigma of this Gaussian can be used to
control the range of influence of the image edges.
gradientMagnitude ->SetSigma (sigma);
The SigmoidImageFilter class requires two parameters to define the linear transformation to be ap-
plied to the sigmoid argument. These parameters are passed using the SetAlpha() and SetBeta()
methods. In the context of this example, the parameters are used to intensify the differences between
regions of low and high values in the speed image. In an ideal case, the speed value should be 1.0 in
the homogeneous regions and the value should decay rapidly to 0.0 around the edges of structures.
The heuristic for finding the values is the following. From the gradient magnitude image, let’s call
K1 the minimum value along the contour of the structure to be segmented. Then, let’s call K2 an
average value of the gradient magnitude in the middle of the structure. These two values indicate the
dynamic range that we want to map to the interval [0 : 1] in the speed image. We want the sigmoid
to map K1 to 0.0 and K2 to 1.0. Given that K1 is expected to be higher than K2 and we want to
map those values to 0.0 and 1.0 respectively, we want to select a negative value for alpha so that
the sigmoid function will also do an inverse intensity mapping. This mapping will produce a speed
image such that the level set will march rapidly on the homogeneous region and will definitely stop
on the contour. The suggested value for beta is (K1 + K2)/2 while the suggested value for alpha
is (K2 − K1)/6, which must be a negative number. In our simple example the values are provided
by the user from the command line arguments. The user can estimate these values by observing the
gradient magnitude image.
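For reference, the intensity mapping applied by the SigmoidImageFilter is
\[ I' = (\mathrm{Max}-\mathrm{Min})\cdot\frac{1}{1+e^{-\frac{I-\beta}{\alpha}}}+\mathrm{Min} \]
where Min and Max are the output bounds set earlier with SetOutputMinimum() and SetOutputMaximum().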
sigmoid ->SetAlpha(alpha);
sigmoid ->SetBeta(beta);
The FastMarchingImageFilter requires the user to provide a seed point from which the contour will
expand. The user can actually pass not only one seed point but a set of them. A good set of seed
points increases the chances of segmenting a complex object without missing parts. The use of
multiple seeds also helps to reduce the amount of time needed by the front to visit a whole object
and hence reduces the risk of leaks on the edges of regions visited earlier. For example, when
segmenting an elongated object, it is undesirable to place a single seed at one extreme of the object
since the front will need a long time to propagate to the other end of the object. Placing several
seeds along the axis of the object will probably be the best strategy to ensure that the entire object
is captured early in the expansion of the front. One of the important properties of level sets is their
natural ability to fuse several fronts implicitly without any extra bookkeeping. The use of multiple
seeds takes good advantage of this property.
The seeds are stored in a container. The type of this container is defined as NodeContainer
among the FastMarchingImageFilter traits.
typedef FastMarchingFilterType:: NodeContainer NodeContainer;
typedef FastMarchingFilterType::NodeType NodeType ;
NodeContainer::Pointer seeds = NodeContainer::New ();
Nodes are created as stack variables and initialized with a value and an itk::Index position.
NodeType node;
const double seedValue = 0.0;
node.SetValue (seedValue );
node.SetIndex (seedPosition);
The list of nodes is initialized and then every node is inserted using the InsertElement() method.
seeds ->Initialize ();
seeds ->InsertElement(0, node);
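As mentioned above, additional seeds can be inserted in the same way. A minimal sketch, assuming a second (hypothetical) index seedPosition2 has been defined:
NodeType node2;
node2.SetValue(seedValue);
node2.SetIndex(seedPosition2); // hypothetical second seed location
seeds->InsertElement(1, node2);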
Structure | Seed Index | σ | α | β | Threshold | Output Image (from left)
Road | (91,176) | 0.5 | -0.5 | 3.0 | 100 | First
Shadow | (118,100) | 1.0 | -0.5 | 3.0 | 100 | Second
Building | (145,21) | 0.5 | -0.5 | 3.0 | 100 | Third
Table 16.4: Parameters used for segmenting some structures shown in Figure 16.14 using the filter Fast-
MarchingImageFilter. All of them used a stopping value of 100.
The set of seed nodes is now passed to the FastMarchingImageFilter with the method
SetTrialPoints().
fastMarching ->SetTrialPoints(seeds);
The FastMarchingImageFilter requires the user to specify the size of the image to be produced as
output. This is done using the SetOutputSize() method. Note that the size is obtained here from the output
image of the smoothing filter. The size of this image is valid only after the Update() method of
this filter has been called directly or indirectly.
fastMarching ->SetOutputSize(
reader ->GetOutput ()->GetBufferedRegion(). GetSize ());
Since the front representing the contour will propagate continuously over time, it is desirable to stop
the process once a certain time has been reached. This allows us to save computation time under
the assumption that the region of interest has already been computed. The value for stopping the
process is defined with the method SetStoppingValue(). In principle, the stopping value should
be a little bit higher than the threshold value.
fastMarching ->SetStoppingValue(stoppingTime);
The invocation of the Update() method on the writer triggers the execution of the pipeline. As
usual, the call is placed in a try/catch block should any errors occur or exceptions be thrown.
try
{
writer ->Update ();
}
catch (itk::ExceptionObject& excep)
{
std::cerr << "Exception caught !" << std::endl;
std::cerr << excep << std::endl;
}
Now let’s run this example using the input image QB Suburb.png provided in the directory
Examples/Data. We can easily segment structures by providing seeds in the appropriate locations.
The following table presents the parameters used for some structures.
Figure 16.13: Images generated by the segmentation process based on the FastMarchingImageFilter. From
left to right and top to bottom: input image to be segmented, image smoothed with an edge-preserving smoothing
filter, gradient magnitude of the smoothed image, sigmoid of the gradient magnitude. This last image, the
sigmoid, is used to compute the speed term for the front propagation.
Figure 16.13 presents the intermediate outputs of the pipeline illustrated in Figure 16.12. They
are from left to right: the output of the anisotropic diffusion filter, the gradient magnitude of the
smoothed image and the sigmoid of the gradient magnitude which is finally used as the speed image
for the FastMarchingImageFilter.
The following classes provide similar functionality:
• itk::ShapeDetectionLevelSetImageFilter
• itk::GeodesicActiveContourLevelSetImageFilter
• itk::ThresholdSegmentationLevelSetImageFilter
• itk::CannySegmentationLevelSetImageFilter
• itk::LaplacianSegmentationLevelSetImageFilter
See the ITK Software Guide for examples of the use of these classes.
Figure 16.14: Images generated by the segmentation process based on the FastMarchingImageFilter. From
left to right: segmentation of the road, shadow, building.
CHAPTER
SEVENTEEN
IMAGE SIMULATION
This chapter deals with image simulation algorithms. Using object transmittance and reflectance together with
sensor characteristics, it is possible to generate realistic synthetic hyperspectral data sets. This
chapter covers the PROSPECT (leaf optical properties) and SAIL (canopy bidirectional reflectance)
models. Vegetation optical properties are modeled using the PROSPECT model [72].
17.1 PROSAIL model
The PROSAIL model [72] is the combination of the PROSPECT leaf optical properties model and the SAIL
canopy bidirectional reflectance model. PROSAIL has also been used to develop new methods for
retrieval of vegetation biophysical properties. It links the spectral variation of canopy reflectance,
which is mainly related to leaf biochemical contents, with its directional variation, which is primarily
related to canopy architecture and soil/vegetation contrast. This link is key to simultaneous estima-
tion of canopy biophysical/structural variables for applications in agriculture, plant physiology, or
ecology, at different scales. PROSAIL has become one of the most popular radiative transfer tools
due to its ease of use, general robustness, and consistent validation by lab/field/space experiments
over the years. Here we present a first example, which returns hemispherical and viewing reflectance
for wavelengths sampled from 400 to 2500 nm. Inputs are leaf and sensor (intrinsic and extrinsic)
characteristics.
The source code for this example can be found in the file
Examples/Simulation/ProsailModel.cxx.
This example shows how to use the PROSAIL (PROSPECT + SAIL) model to generate viewing reflectance
from leaf, vegetation, and viewing parameters. The output can be used, for example, to simulate
images.
Let’s look at the minimal code required to use this algorithm. First, the following headers must be
included.
#include "otbLeafParameters.h"
#include "otbSailModel.h"
#include "otbProspectModel.h"
We now define leaf parameters, which characterize vegetation composition.
typedef otb::LeafParameters LeafParametersType;
Next, the parameters variable is created by invoking the New() method and assigning the result to an
itk::SmartPointer.
LeafParametersType::Pointer leafParams = LeafParametersType::New ();
Leaf characteristics are then set. The input parameters are:
• Chlorophyll concentration (Cab) in µg/cm2.
• Carotenoid concentration (Car) in µg/cm2.
• Brown pigment content (CBrown) in arbitrary unit.
• Water thickness EWT (Cw) in cm.
• Dry matter content LMA (Cm) in g/cm2.
• Leaf structure parameter (N).
double Cab = static_cast<double> (atof(argv[1]));
double Car = static_cast<double> (atof(argv[2]));
double CBrown = static_cast<double> (atof(argv[3]));
double Cw = static_cast<double> (atof(argv[4]));
double Cm = static_cast<double> (atof(argv[5]));
double N = static_cast<double> (atof(argv[6]));
leafParams ->SetCab(Cab);
leafParams ->SetCar(Car);
leafParams ->SetCBrown (CBrown);
leafParams ->SetCw(Cw);
leafParams ->SetCm(Cm);
leafParams ->SetN(N);
Leaf parameters are used as the PROSPECT input.
typedef otb::ProspectModel ProspectType;
ProspectType::Pointer prospect = ProspectType::New ();
prospect ->SetInput (leafParams );
Now we use the SAIL model to generate the transmittance and reflectance spectra. The SAIL model is created
by invoking the New() method and assigning the result to an itk::SmartPointer.
The SAIL input parameters are:
• leaf area index (LAI).
• average leaf angle (Angl) in deg.
• soil coefficient (PSoil).
• diffuse/direct radiation (Skyl).
• hot spot (HSpot).
• solar zenith angle (TTS) in deg.
• observer zenith angle (TTO) in deg.
• azimuth (PSI) in deg.
double LAI = static_cast<double> (atof(argv[7]));
double Angl = static_cast<double> (atof(argv[8]));
double PSoil = static_cast<double> (atof(argv[9]));
double Skyl = static_cast<double> (atof(argv[10]));
double HSpot = static_cast<double> (atof(argv[11]));
double TTS = static_cast<double> (atof(argv[12]));
double TTO = static_cast<double> (atof(argv[13]));
double PSI = static_cast<double> (atof(argv[14]));
typedef otb::SailModel SailType;
SailType ::Pointer sail = SailType ::New ();
sail ->SetLAI(LAI);
sail ->SetAngl(Angl);
sail ->SetPSoil (PSoil);
sail ->SetSkyl(Skyl);
sail ->SetHSpot (HSpot);
sail ->SetTTS(TTS);
sail ->SetTTO(TTO);
sail ->SetPSI(PSI);
Reflectance and transmittance are set from the PROSPECT output.
sail ->SetReflectance(prospect ->GetReflectance());
sail ->SetTransmittance(prospect ->GetTransmittance());
The invocation of the Update() method triggers the execution of the pipeline.
sail ->Update();
The GetViewingReflectance() method provides the viewing reflectance vector (size N×2, where N is the
number of sampled wavelength values; the columns correspond respectively to wavelength and viewing
reflectance) through GetResponse(). The GetHemisphericalReflectance() method provides the hemispherical
reflectance vector (same layout, with hemispherical reflectance in the second column) through
GetResponse().
Note that PROSAIL simulations are done for 2100 samples from 400 nm up to 2500 nm.
for (unsigned int i = 0; i < sail ->GetViewingReflectance()->Size(); ++i)
{
std::cout << "wavelength : ";
std::cout << sail ->GetViewingReflectance()->GetResponse ()[i].first;
std::cout << ". Viewing reflectance ";
std::cout << sail ->GetViewingReflectance()->GetResponse ()[i].second;
std::cout << ". Hemispherical reflectance ";
std::cout << sail ->GetHemisphericalReflectance()->GetResponse ()[i].second;
std::cout << std::endl;
}
Here are example parameter values:
• Cab 30.0
• Car 10.0
• CBrown 0.0
• Cw 0.015
• Cm 0.009
• N 1.2
• LAI 2
• Angl 50
• PSoil 1
• Skyl 70
• HSpot 0.2
• TTS 30
• TTO 0
• PSI 0
More information and data about leaf properties can be found on Stéphane Jacquemoud's
OPTICLEAF website.
17.2 Image Simulation
Here we present a complete pipeline to simulate an image using sensor characteristics and object
reflectance and transmittance properties. This example uses:
• an input image,
• a label image: describes image object properties,
• label properties: describes the characteristics of each label,
• a mask: vegetation image mask,
• a cloud mask (optional),
• an acquisition parameter file: file containing the parameters for the acquisition,
• an RSR file: file name for the relative spectral response to be used,
• a sensor FTM file: file name for the sensor spatial interpolation.
The algorithm is divided into the following steps:
1. LAI (Leaf Area Index) image estimation using the NDVI formula.
2. Sensor Reduced Spectral Response (RSR) computation using the PROSAIL reflectance output interpolated at the sensor spectral bands.
3. Image simulation using the sensor RSR and the sensor FTM.
17.2.1 LAI image estimation
The source code for this example can be found in the file
Examples/Simulation/LAIFromNDVIImageTransform.cxx.
This example presents a way to generate an LAI (Leaf Area Index) image using a formula dedicated to
Formosat-2. The LAI image is used as an input of the image simulation process.
Let’s look at the minimal code required to use this algorithm. First, the following headers must be
included.
#include "otbMultiChannelRAndNIRIndexImageFilter.h"
#include "otbVegetationIndicesFunctor.h"
The filter type is a generic otb::MultiChannelRAndNIRIndexImageFilter using the Formosat-2 specific
LAI functor otb::LAIFromNDVIFormosat2Functor.
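For reference, the NDVI computed from the red (R) and near-infrared (NIR) channels is
\[ \mathrm{NDVI} = \frac{NIR - R}{NIR + R} \]
and the functor then converts NDVI to LAI through a Formosat-2 specific empirical relation.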
Figure 17.1: LAI generation (right) from NDVI applied to a Formosat-2 image (left).
typedef otb::Functor ::LAIFromNDVIFormosat2Functor
<InputImageType::InternalPixelType ,
InputImageType::InternalPixelType , OutputImageType::PixelType > FunctorType ;
typedef otb:: MultiChannelRAndNIRIndexImageFilter
<InputImageType , OutputImageType , FunctorType >
MultiChannelRAndNIRIndexImageFilterType;
Next, the filter is created by invoking the New() method and assigning the result to an
itk::SmartPointer.
MultiChannelRAndNIRIndexImageFilterType::Pointer filter
= MultiChannelRAndNIRIndexImageFilterType::New();
The filter input is set to the input image.
filter ->SetInput(reader ->GetOutput ());
Then the red and NIR channel indices are set using SetRedIndex() and SetNIRIndex().
unsigned int redChannel = static_cast<unsigned int> (atoi(argv[5]));
unsigned int nirChannel = static_cast<unsigned int> (atoi(argv[6]));
filter ->SetRedIndex (redChannel );
filter ->SetNIRIndex (nirChannel );
The invocation of the Update() method triggers the execution of the pipeline.
filter ->Update();
Figure 17.1 illustrates the LAI generation using Formosat 2 data.
17.2.2 Sensor RSR Image Simulation
The source code for this example can be found in the file
Examples/Simulation/LAIAndPROSAILToSensorResponse.cxx.
The following code is an example of a sensor spectral response image generated using a labeled object
image, object properties (vegetation classes are handled using the PROSAIL model,
non-vegetation classes are characterized using Aster database characteristics provided by a text file),
acquisition parameters, sensor characteristics, and an LAI (Leaf Area Index) image.
The sensor RSR is modeled by the 6S (Second Simulation of a Satellite Signal in the Solar Spectrum) model
[137]. Detailed information about 6S can be found on the 6S website.
Let’s look at the minimal code required to use this algorithm. First, the following headers must be
included.
#include "otbProspectModel.h"
#include "otbSailModel.h"
#include "otbLeafParameters.h"
#include "otbSatelliteRSR.h"
#include "otbReduceSpectralResponse.h"
#include "otbImageSimulationMethod.h"
#include "otbSpatialisationFilter.h"
#include "otbAttributesMapLabelObject.h"
#include "otbSpectralResponse.h"
#include "itkTernaryFunctorImageFilter.h"
#include "otbVegetationIndicesFunctor.h"
#include "otbRAndNIRIndexImageFilter.h"
#include "otbVectorDataToLabelMapWithAttributesFilter.h"
The ImageUniqueValuesCalculator class is defined here. Its GetUniqueValues() method returns an
array with all values contained in an image. This class is implemented and used to test whether all labels
of the labeled image are present in the label parameter file.
template < class TImage >
class ITK_EXPORT ImageUniqueValuesCalculator : public itk ::Object
{
public:
typedef ImageUniqueValuesCalculator<TImage > Self;
typedef itk::Object Superclass ;
typedef itk::SmartPointer <Self > Pointer;
typedef itk::SmartPointer <const Self > ConstPointer;
itkNewMacro (Self);
itkTypeMacro(ImageUniqueValuesCalculator, itk::Object);
itkStaticConstMacro(ImageDimension , unsigned int,
TImage::ImageDimension);
typedef typename TImage::PixelType PixelType ;
typedef TImage ImageType ;
typedef std::vector <PixelType > ArrayType ;
typedef typename ImageType ::Pointer ImagePointer;
typedef typename ImageType ::ConstPointer ImageConstPointer;
virtual void SetImage ( const ImageType * image )
{
if ( m_Image != image )
{
m_Image = image;
this->Modified ();
}
}
ArrayType GetUniqueValues() const
{
typedef typename ImageType :: IndexType IndexType ;
if( !m_Image )
{
itkExceptionMacro(<<"GetUniqueValues(): Null input image pointer.");
}
itk::ImageRegionConstIterator< ImageType > it( m_Image,
m_Image->GetRequestedRegion() );
ArrayType uniqueValues;
uniqueValues.push_back (it.Get ());
++it;
while( !it.IsAtEnd () )
{
if( std::find(uniqueValues.begin(),
uniqueValues.end(), it.Get ()) == uniqueValues.end())
{
uniqueValues.push_back (it.Get ());
}
++it;
}
return uniqueValues;
}
protected:
ImageUniqueValuesCalculator()
{
m_Image = NULL;
}
virtual ~ImageUniqueValuesCalculator()
{
}
void PrintSelf (std::ostream& os, itk::Indent indent) const
{
Superclass ::PrintSelf (os, indent);
os << indent << "Image: " << m_Image.GetPointer () << std::endl;
}
private:
ImageUniqueValuesCalculator(const Self&); //purposely not implemented
void operator=(const Self&); //purposely not implemented
ImageConstPointer m_Image;
}; // class ImageUniqueValuesCalculator
The ProsailSimulatorFunctor functor is defined here.
template<class TLAI , class TLabel , class TMask , class TOutput,
class TLabelSpectra , class TLabelParameter ,
class TAcquistionParameter , class TSatRSR>
class ProsailSimulatorFunctor
The typedefs used inside the functor are declared below.
typedef TLAI LAIPixelType;
typedef TLabel LabelPixelType;
typedef TMask MaskPixelType;
typedef TOutput OutputPixelType;
typedef TLabelSpectra LabelSpectraType;
typedef TLabelParameter LabelParameterType;
typedef TAcquistionParameter AcquistionParameterType;
typedef TSatRSR SatRSRType ;
typedef typename SatRSRType ::Pointer SatRSRPointerType;
typedef typename otb::ProspectModel ProspectType;
typedef typename otb::SailModel SailType ;
typedef double PrecisionType;
typedef std::pair <PrecisionType , PrecisionType > PairType;
typedef typename std::vector <PairType > VectorPairType;
typedef otb::SpectralResponse <PrecisionType , PrecisionType > ResponseType;
typedef ResponseType::Pointer ResponsePointerType;
typedef otb:: ReduceSpectralResponse
<ResponseType , SatRSRType > ReduceResponseType;
typedef typename ReduceResponseType::Pointer
ReduceResponseTypePointerType;
In this example, spectra are generated from 400 to 2400 nm. The number of simulated bands is set by
the SimNbBands value.
static const unsigned int SimNbBands = 2000;
The mask value is read to know whether the pixel has to be computed; otherwise the output pixel is set to 0.
OutputPixelType pix;
pix.SetSize(m_SatRSR ->GetNbBands ());
if ((! mask && !m_InvertedMask) || (mask && m_InvertedMask))
{
for (unsigned int i = 0; i < m_SatRSR ->GetNbBands (); i++)
pix[i] = static_cast<typename OutputPixelType::ValueType > (0);
return pix;
}
The object reflectance hxSpectrum is calculated. If the object label corresponds to a vegetation label,
the PROSAIL code is used; otherwise the Aster database is used.
VectorPairType hxSpectrum ;
for (unsigned int i = 0; i < SimNbBands ; i++)
{
PairType resp;
resp.first = static_cast<PrecisionType > ((400.0 + i) / 1000);
hxSpectrum .push_back (resp);
}
// either the spectrum has to be simulated by Prospect+Sail
if (m_LabelParameters.find(label) != m_LabelParameters.end())
{
ProspectType::Pointer prospect = ProspectType::New();
prospect ->SetInput(m_LabelParameters[label]);
SailType ::Pointer sail = SailType ::New ();
sail ->SetLAI(lai);
sail ->SetAngl(m_AcquisitionParameters[std::string("Angl")]);
sail ->SetPSoil (m_AcquisitionParameters[std::string("PSoil")]);
sail ->SetSkyl(m_AcquisitionParameters[std::string("Skyl")]);
sail ->SetHSpot (m_AcquisitionParameters[std::string("HSpot")]);
sail ->SetTTS(m_AcquisitionParameters[std::string("TTS")]);
sail ->SetTTO(m_AcquisitionParameters[std::string("TTO")]);
sail ->SetPSI(m_AcquisitionParameters[std::string("PSI")]);
sail ->SetReflectance(prospect ->GetReflectance());
sail ->SetTransmittance(prospect ->GetTransmittance());
sail ->Update ();
for (unsigned int i = 0; i < SimNbBands ; i++)
{
hxSpectrum [i].second = static_cast<typename OutputPixelType::ValueType >
(sail ->GetHemisphericalReflectance()->GetResponse ()[i].second);
}
}
// or the spectra has been set from outside the functor (ex. bare soil, etc.)
else
if (m_LabelSpectra.find(label) != m_LabelSpectra.end ())
{
for (unsigned int i = 0; i < SimNbBands ; i++)
hxSpectrum [i].second =
static_cast<typename OutputPixelType::ValueType >
(m_LabelSpectra[label][i]);
}
// or the class does not exist
else
{
for (unsigned int i = 0; i < SimNbBands ; i++)
hxSpectrum [i].second =
static_cast<typename OutputPixelType::ValueType > (0);
}
The spectral response aResponse is set using hxSpectrum.
ResponseType::Pointer aResponse = ResponseType::New();
aResponse ->SetResponse (hxSpectrum );
The reduce-response filter is initialized with the satellite RSR and aResponse, and the reduced response is computed.
ReduceResponseTypePointerType reduceResponse = ReduceResponseType::New();
reduceResponse ->SetInputSatRSR(m_SatRSR );
reduceResponse ->SetInputSpectralResponse(aResponse );
reduceResponse ->CalculateResponse();
VectorPairType reducedResponse =
reduceResponse ->GetReduceResponse()->GetResponse ();
The pix value is returned for the desired satellite bands.
for (unsigned int i = 0; i < m_SatRSR ->GetNbBands (); i++)
pix[i] =
static_cast<typename OutputPixelType::ValueType >
(reducedResponse[i].second);
return pix;
The TernaryFunctorImageFilterWithNBands class is defined here. This class inherits from
itk::TernaryFunctorImageFilter with an additional number-of-bands parameter. It is implemented
to process the label, LAI, and mask images with the simulation functor.
template <class TInputImage1 , class TInputImage2 , class TInputImage3 ,
class TOutputImage , class TFunctor >
class ITK_EXPORT TernaryFunctorImageFilterWithNBands :
public itk::TernaryFunctorImageFilter< TInputImage1 , TInputImage2 ,
TInputImage3 , TOutputImage , TFunctor >
{
public:
typedef TernaryFunctorImageFilterWithNBands Self;
typedef itk::TernaryFunctorImageFilter< TInputImage1 , TInputImage2 ,
TInputImage3 , TOutputImage , TFunctor > Superclass ;
typedef itk::SmartPointer <Self > Pointer;
typedef itk::SmartPointer <const Self > ConstPointer;
/** Method for creation through the object factory. */
itkNewMacro (Self);
/** Macro defining the type*/
itkTypeMacro(TernaryFunctorImageFilterWithNBands, Superclass);
/** Accessors for the number of bands*/
itkSetMacro (NumberOfOutputBands , unsigned int);
itkGetConstMacro(NumberOfOutputBands , unsigned int);
protected:
TernaryFunctorImageFilterWithNBands() {}
virtual ~TernaryFunctorImageFilterWithNBands() {}
void GenerateOutputInformation()
{
Superclass ::GenerateOutputInformation();
this->GetOutput ()->SetNumberOfComponentsPerPixel( m_NumberOfOutputBands );
}
private:
TernaryFunctorImageFilterWithNBands(const Self &); //purposely not implemented
void operator =(const Self&); //purposely not implemented
unsigned int m_NumberOfOutputBands;
};
The input image typedefs are presented below. This example uses a double LAI image, binary mask and
cloud mask images, and an integer label image.
typedef double LAIPixelType;
typedef unsigned short LabelType ;
typedef unsigned short MaskPixelType;
typedef float OutputPixelType;
// Image typedef
typedef otb::Image <LAIPixelType , 2> LAIImageType;
typedef otb::Image <LabelType , 2> LabelImageType;
typedef otb::Image <MaskPixelType , 2> MaskImageType;
typedef otb::VectorImage <OutputPixelType , 2> SimulatedImageType;
The leaf parameters typedef is defined.
typedef otb::LeafParameters LeafParametersType;
typedef LeafParametersType::Pointer LeafParametersPointerType;
typedef std::map<LabelType , LeafParametersPointerType> LabelParameterMapType;
The sensor spectral response typedef is defined.
typedef double PrecisionType;
typedef std::vector <PrecisionType > SpectraType ;
typedef std::map<LabelType , SpectraType > SpectraParameterType;
The acquisition parameters typedef is defined.
typedef std::map<std::string , double> AcquistionParsType;
The satellite RSR typedef is defined.
typedef otb::SatelliteRSR <PrecisionType , PrecisionType > SatRSRType ;
The filter type is the TernaryFunctorImageFilterWithNBands defined above, used with the specific
simulation functor.
typedef otb::Functor ::ProsailSimulatorFunctor
<LAIPixelType , LabelType , MaskPixelType , SimulatedImageType::PixelType ,
SpectraParameterType , LabelParameterMapType , AcquistionParsType , SatRSRType >
SimuFunctorType;
typedef otb:: TernaryFunctorImageFilterWithNBands
<LAIImageType , LabelImageType , MaskImageType , SimulatedImageType ,
SimuFunctorType > SimulatorType;
Acquisition parameters are loaded from a text file. A detailed definition of the acquisition parameters
can be found in the otb::SailModel class.
AcquistionParsType acquistionPars;
acquistionPars[std::string("Angl")] = 0.0;
acquistionPars[std::string("PSoil")] = 0.0;
acquistionPars[std::string("Skyl")] = 0.0;
acquistionPars[std::string("HSpot")] = 0.0;
acquistionPars[std::string("TTS")] = 0.0;
acquistionPars[std::string("TTO")] = 0.0;
acquistionPars[std::string("PSI")] = 0.0;
std::ifstream acquistionParsFile;
try
{
acquistionParsFile.open(apfname );
}
catch (...)
{
std::cerr << "Could not open file " << apfname << std ::endl;
return EXIT_FAILURE;
}
//unsigned int acPar = 0;
while (acquistionParsFile.good())
{
std::string line;
std::getline(acquistionParsFile , line);
std::stringstream ss(line);
std::string parName;
ss >> parName;
double parValue;
ss >> parValue;
acquistionPars[parName] = parValue;
}
acquistionParsFile.close();
Label parameters are loaded from a text file. Two types of object characteristics can be found. If the
label corresponds to a vegetation class, leaf parameters are loaded; a detailed definition of the leaf
parameters can be found in the otb::LeafParameters class. Otherwise the object reflectance is
generated from 400 to 2400 nm using the Aster database.
LabelParameterMapType labelParameters;
std::ifstream labelParsFile;
SpectraParameterType spectraParameters;
try
{
labelParsFile.open(lpfname, std::ifstream ::in);
}
catch (...)
{
std::cerr << "Could not open file " << lpfname << std::endl;
return EXIT_FAILURE;
}
while (labelParsFile.good())
{
char fileLine [256];
labelParsFile.getline(fileLine , 256);
if (fileLine [0] != '#')
{
std::stringstream ss(fileLine );
unsigned short label;
ss >> label;
unsigned short paramsOrSpectra;
ss >> paramsOrSpectra;
if (paramsOrSpectra == 1)
{
double Cab;
ss >> Cab;
double Car;
ss >> Car;
double CBrown;
ss >> CBrown;
double Cw;
ss >> Cw;
double Cm;
ss >> Cm;
double N;
ss >> N;
LeafParametersType::Pointer leafParams = LeafParametersType::New ();
leafParams ->SetCab(Cab);
leafParams ->SetCar(Car);
leafParams ->SetCBrown (CBrown);
leafParams ->SetCw(Cw);
leafParams ->SetCm(Cm);
leafParams ->SetN(N);
labelParameters[label] = leafParams ;
}
else
{
std::string spectraFilename = rootPath;
ss >> spectraFilename;
spectraFilename = rootPath + spectraFilename;
typedef otb::SpectralResponse <PrecisionType , PrecisionType > ResponseType;
ResponseType::Pointer resp = ResponseType::New();
// Coefficient 100 since Aster database is given in % reflectance
resp ->Load(spectraFilename , 100.0);
SpectraType spec;
for (unsigned int i = 0; i < SimuFunctorType::SimNbBands ; i++)
//Prosail starts at 400 and lambda in Aster DB is in micrometers
spec.push_back
(static_cast<PrecisionType > ((*resp)((i + 400.0) / 1000.0)));
spectraParameters[label] = spec;
}
}
}
labelParsFile.close();
The LAI image is read.
LAIReaderType::Pointer laiReader = LAIReaderType::New();
laiReader ->SetFileName (laiifname );
laiReader ->Update ();
LAIImageType::Pointer laiImage = laiReader ->GetOutput ();
The label image is then read. It is processed using the otb::ImageUniqueValuesCalculator in order to check that all the labels are present in the label parameters file.
LabelReaderType::Pointer labelReader = LabelReaderType::New();
labelReader ->SetFileName (lifname );
labelReader ->Update();
LabelImageType::Pointer labelImage = labelReader ->GetOutput ();
typedef otb::ImageUniqueValuesCalculator<LabelImageType > UniqueCalculatorType;
UniqueCalculatorType::Pointer uniqueCalculator = UniqueCalculatorType::New();
uniqueCalculator ->SetImage (labelImage );
UniqueCalculatorType::ArrayType uniqueVals =
uniqueCalculator ->GetUniqueValues();
std::cout << "Labels are " << std::endl;
UniqueCalculatorType::ArrayType ::const_iterator uvIt = uniqueVals .begin();
while (uvIt != uniqueVals .end ())
{
std::cout << (*uvIt) << ", ";
++uvIt;
}
std::cout << std::endl;
uvIt = uniqueVals .begin();
while (uvIt != uniqueVals .end ())
{
if (labelParameters.find(static_cast<LabelType >
(*uvIt)) == labelParameters.end() &&
spectraParameters.find(static_cast<LabelType > (*uvIt)) ==
spectraParameters.end() &&
static_cast<LabelType > (*uvIt) != 0)
{
std::cout << "label " << (*uvIt) << " not found in " <<
lpfname << std::endl;
return EXIT_FAILURE;
}
++uvIt;
}
The mask image is read. If a cloud mask filename is given, a new mask image is generated as the union of the two masks.
MaskReaderType::Pointer miReader = MaskReaderType::New();
miReader ->SetFileName (mifname);
miReader ->UpdateOutputInformation();
MaskImageType::Pointer maskImage = miReader ->GetOutput ();
if (cmifname != NULL)
{
MaskReaderType::Pointer cmiReader = MaskReaderType::New ();
cmiReader ->SetFileName (cmifname );
cmiReader ->UpdateOutputInformation();
typedef itk::OrImageFilter
<MaskImageType , MaskImageType , MaskImageType > OrType;
OrType::Pointer orfilter = OrType::New();
orfilter ->SetInput1 (miReader ->GetOutput ());
orfilter ->SetInput2 (cmiReader ->GetOutput ());
orfilter ->Update();
maskImage = orfilter ->GetOutput ();
}
A consistency check is then performed: all input images must have the same size.
if (laiImage ->GetLargestPossibleRegion(). GetSize ()[0] !=
labelImage ->GetLargestPossibleRegion(). GetSize ()[0] ||
laiImage ->GetLargestPossibleRegion(). GetSize ()[1] !=
labelImage ->GetLargestPossibleRegion(). GetSize ()[1] ||
laiImage ->GetLargestPossibleRegion(). GetSize ()[0] !=
maskImage ->GetLargestPossibleRegion(). GetSize ()[0] ||
laiImage ->GetLargestPossibleRegion(). GetSize ()[1] !=
maskImage ->GetLargestPossibleRegion(). GetSize ()[1])
{
std::cerr << "Image of labels , mask and LAI image must have the same size"
<< std::endl;
return EXIT_FAILURE;
}
The satellite RSR (Reduced Spectral Response) is defined using the filename and the number of bands given as command-line arguments.
SatRSRType ::Pointer satRSR = SatRSRType ::New ();
satRSR ->SetNbBands (nbBands );
satRSR ->SetSortBands(false);
satRSR ->Load(rsfname );
for (unsigned int i = 0; i < nbBands; ++i)
std::cout << i << " " << (satRSR ->GetRSR())[i]->GetInterval (). first << " "
<< (satRSR ->GetRSR ())[i]->GetInterval (). second << std::endl;
At this step, all the initializations have been done. The next step is to instantiate and initialize the simulation functor ProsailSimulatorFunctor.
SimuFunctorType simuFunctor ;
simuFunctor .SetLabelParameters(labelParameters);
simuFunctor .SetLabelSpectra(spectraParameters);
simuFunctor .SetAcquisitionParameters(acquistionPars);
simuFunctor .SetRSR(satRSR);
simuFunctor .SetInvertedMask(true);
The inputs and the functor are plugged into the simulator filter.
SimulatorType::Pointer simulator = SimulatorType::New();
simulator ->SetInput1 (laiImage );
simulator ->SetInput2 (labelImage );
simulator ->SetInput3 (maskImage );
simulator ->SetFunctor (simuFunctor );
simulator ->SetNumberOfOutputBands(nbBands );
The invocation of the Update() method triggers the execution of the pipeline.
simulator ->Update ();
CHAPTER
EIGHTEEN
DIMENSION REDUCTION
Dimension reduction is a statistical process which concentrates the amount of information in multivariate data into a smaller number of variables (or dimensions). An interesting review of the domain has been done by Fodor [48].
Though there are plenty of non-linear methods in the literature, OTB provides only linear dimension reduction techniques applied to images for now.
Usually, linear dimension-reduction algorithms try to find a set of linear combinations of the input
image bands that maximise a given criterion, often chosen so that image information concentrates
on the first components. Algorithms differ by the criterion to optimise and also by their handling
of the signal or image noise.
In remote-sensing image processing, dimension reduction algorithms are of great interest for denoising, or as a preliminary processing step for the classification of feature images or the unmixing of hyperspectral images. In addition to the denoising effect, the advantage of dimension reduction for the two latter applications is that it lowers the size of the data to be analysed and, as such, speeds up the processing time without too much loss of accuracy.
18.1 Principal Component Analysis
The source code for this example can be found in the file
Examples/DimensionReduction/PCAExample.cxx.
This example illustrates the use of the otb::PCAImageFilter. This filter computes a Principal
Component Analysis using an efficient method based on the inner product in order to compute the
covariance matrix.
The first step required to use this filter is to include its header file.
#include "otbPCAImageFilter.h"
We start by defining the types for the images, the reader and the writer. We choose to work with an
otb::VectorImage, since we will produce a multi-channel image (the principal components) from
a multi-channel input image.
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
We instantiate now the image reader and we set the image file name.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFileName);
We define the type for the filter. It is templated over the input and the output image types and also
the transformation direction. The internal structure of this filter is a filter-to-filter like structure. We
can now instantiate the filter.
typedef otb::PCAImageFilter <ImageType , ImageType ,
otb::Transform ::FORWARD> PCAFilterType;
PCAFilterType::Pointer pcafilter = PCAFilterType::New ();
The only parameter needed for the PCA is the number of principal components required as output.
Principal components are linear combinations of the input components (here the input image bands), which are selected using Singular Value Decomposition eigenvectors sorted by decreasing eigenvalue. We can choose to get fewer Principal Components than the number of input bands.
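As background, and not as a description of the exact internals of the filter, the forward PCA transform can be written as:

\[
\Sigma = \frac{1}{N}\sum_{k=1}^{N}(x_k - \mu)(x_k - \mu)^T = V \Lambda V^T, \qquad y_k = V^T (x_k - \mu),
\]

where the $x_k$ are the pixel vectors (one component per band), $\mu$ is their mean, and the columns of $V$ are the eigenvectors of the covariance matrix $\Sigma$ sorted by decreasing eigenvalue (the diagonal entries of $\Lambda$). Keeping only the first rows of $V^T$ yields the requested number of principal components.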
pcafilter ->SetNumberOfPrincipalComponentsRequired(
numberOfPrincipalComponentsRequired);
We now instantiate the writer and set the file name for the output image.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputFilename);
We finally plug the pipeline and trigger the PCA computation with the method Update() of the
writer.
pcafilter ->SetInput (reader ->GetOutput ());
writer ->SetInput(pcafilter ->GetOutput ());
writer ->Update();
The otb::PCAImageFilter also allows computing the inverse transformation from the PCA coefficients. In
reverse mode, the covariance matrix or the transformation matrix (which may not be square) has to
be given.
typedef otb::PCAImageFilter < ImageType , ImageType ,
otb::Transform ::INVERSE > InvPCAFilterType;
InvPCAFilterType::Pointer invFilter = InvPCAFilterType::New();
Figure 18.1: Result of applying the otb::PCAImageFilter to an image. From left to right: original image,
color composition with first three principal components and output of the inverse mode (the input RGB image).
invFilter ->SetInput (pcafilter ->GetOutput ());
invFilter ->SetTransformationMatrix(pcafilter ->GetTransformationMatrix());
WriterType ::Pointer invWriter = WriterType ::New();
invWriter ->SetFileName (outputInverseFilename );
invWriter ->SetInput (invFilter ->GetOutput () );
invWriter ->Update ();
Figure 18.1 shows the result of applying the forward and reverse PCA transformation to an 8-band
Worldview2 image.
18.2 Noise-Adjusted Principal Components Analysis
The source code for this example can be found in the file
Examples/DimensionReduction/NAPCAExample.cxx.
This example illustrates the use of the otb::NAPCAImageFilter. This filter computes a Noise-
Adjusted Principal Component Analysis transform [86] using an efficient method based on the inner
product in order to compute the covariance matrix.
The Noise-Adjusted Principal Component Analysis transform is a sequence of two Principal Com-
ponent Analysis transforms. The first transform is based on an estimated covariance matrix of the
noise, and intends to whiten the input image (noise with unit variance and no correlation between
bands).
The second Principal Component Analysis is then applied to the noise-whitened image, giving the
Maximum Noise Fraction transform. Applying PCA to the noise-whitened image amounts to ranking the Principal Components according to their signal-to-noise ratio.
It is basically a reformulation of the Maximum Noise Fraction algorithm.
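As background (a standard formulation, not necessarily the exact computation performed by the filter), if $\Sigma_N$ denotes the estimated noise covariance matrix, the noise-whitening step can be written as:

\[
\Sigma_N = V_N \Lambda_N V_N^T, \qquad W = \Lambda_N^{-1/2} V_N^T, \qquad \tilde{x} = W (x - \mu),
\]

and the NA-PCA components are then obtained by a plain PCA of the whitened vectors $\tilde{x}$.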
The first step required to use this filter is to include its header file.
#include "otbNAPCAImageFilter.h"
We also need to include the header of the noise filter.
#include "otbLocalActivityVectorImageFilter.h"
We start by defining the types for the images, the reader and the writer. We choose to work with an
otb::VectorImage, since we will produce a multi-channel image (the principal components) from
a multi-channel input image.
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
We instantiate now the image reader and we set the image file name.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFileName);
In contrast with standard Principal Component Analysis, NA-PCA needs an estimation of the noise
correlation matrix in the dataset prior to transformation.
A classical approach is to use spatial gradient images and infer the noise correlation matrix from
them. The method of noise estimation can be customized by templating the otb::NAPCAImageFilter
with the desired noise estimation method.
In this implementation, noise is estimated from a local window. We define the type of the noise
filter.
typedef otb::LocalActivityVectorImageFilter<ImageType ,ImageType > NoiseFilterType;
We define the type for the filter. It is templated over the input and the output image types, the noise
estimation filter type, and also the transformation direction. The internal structure of this filter is a
filter-to-filter like structure. We can now instantiate the filter.
typedef otb::NAPCAImageFilter <ImageType , ImageType ,
NoiseFilterType ,
otb::Transform ::FORWARD > NAPCAFilterType;
NAPCAFilterType::Pointer napcafilter = NAPCAFilterType::New ();
We then set the number of principal components required as output. We can choose to get fewer PCs
than the number of input bands.
napcafilter ->SetNumberOfPrincipalComponentsRequired(
numberOfPrincipalComponentsRequired);
We set the radius of the sliding window for noise estimation.
NoiseFilterType:: RadiusType radius = {{ vradius , vradius }};
napcafilter ->GetNoiseImageFilter()->SetRadius (radius);
Last, we can activate normalisation.
napcafilter ->SetUseNormalization( normalization );
We now instantiate the writer and set the file name for the output image.
WriterType ::Pointer writer = WriterType ::New();
writer ->SetFileName (outputFilename);
We finally plug the pipeline and trigger the NA-PCA computation with the method Update() of the
writer.
napcafilter ->SetInput (reader ->GetOutput ());
writer ->SetInput(napcafilter ->GetOutput ());
writer ->Update();
The otb::NAPCAImageFilter also allows computing the inverse transformation from the NA-PCA coefficients. In reverse mode, the covariance matrix or the transformation matrix (which may not be
square) has to be given.
typedef otb::NAPCAImageFilter < ImageType , ImageType ,
NoiseFilterType ,
otb::Transform ::INVERSE > InvNAPCAFilterType;
InvNAPCAFilterType::Pointer invFilter = InvNAPCAFilterType::New ();
invFilter ->SetMeanValues( napcafilter ->GetMeanValues() );
if ( normalization )
invFilter ->SetStdDevValues( napcafilter ->GetStdDevValues() );
invFilter ->SetTransformationMatrix( napcafilter ->GetTransformationMatrix() );
invFilter ->SetInput (napcafilter ->GetOutput ());
WriterType ::Pointer invWriter = WriterType ::New();
invWriter ->SetFileName (outputInverseFilename );
invWriter ->SetInput (invFilter ->GetOutput () );
invWriter ->Update ();
Figure 18.2 shows the result of applying the forward and reverse NA-PCA transformation to an 8-band
Worldview2 image.
18.3 Maximum Noise Fraction
The source code for this example can be found in the file
Examples/DimensionReduction/MNFExample.cxx.
Figure 18.2: Result of applying the otb::NAPCAImageFilter to an image. From left to right: original image,
color composition with first three principal components and output of the inverse mode (the input RGB image).
This example illustrates the use of the otb::MNFImageFilter. This filter computes a Maximum
Noise Fraction transform [53] using an efficient method based on the inner product in order to
compute the covariance matrix.
The Maximum Noise Fraction transform is a sequence of two Principal Component Analysis trans-
forms. The first transform is based on an estimated covariance matrix of the noise, and intends to
whiten the input image (noise with unit variance and no correlation between bands).
The second Principal Component Analysis is then applied to the noise-whitened image, giving the
Maximum Noise Fraction transform.
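Equivalently, in the standard formulation given here as background only, the MNF transform vectors $a$ can be obtained as solutions of a generalized eigenvalue problem involving the noise covariance matrix $\Sigma_N$ and the total covariance matrix $\Sigma$ of the image:

\[
\Sigma_N \, a = \lambda \, \Sigma \, a,
\]

the components being sorted by increasing noise fraction $\lambda$, that is, by decreasing signal-to-noise ratio.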
In this implementation, noise is estimated from a local window.
The first step required to use this filter is to include its header file.
#include "otbMNFImageFilter.h"
We also need to include the header of the noise filter.
#include "otbLocalActivityVectorImageFilter.h"
We start by defining the types for the images, the reader, and the writer. We choose to work with an
otb::VectorImage, since we will produce a multi-channel image (the principal components) from
a multi-channel input image.
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
We instantiate now the image reader and we set the image file name.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFileName);
In contrast with standard Principal Component Analysis, MNF needs an estimation of the noise
correlation matrix in the dataset prior to transformation.
A classical approach is to use spatial gradient images and infer the noise correlation matrix from them.
The method of noise estimation can be customized by templating the otb::MNFImageFilter with
the desired noise estimation method.
In this implementation, noise is estimated from a local window. We define the type of the noise
filter.
typedef otb::LocalActivityVectorImageFilter<ImageType ,ImageType > NoiseFilterType;
We define the type for the filter. It is templated over the input and the output image types and also
the transformation direction. The internal structure of this filter is a filter-to-filter like structure. We
can now instantiate the filter.
typedef otb::MNFImageFilter <ImageType , ImageType ,
NoiseFilterType ,
otb::Transform ::FORWARD> MNFFilterType;
MNFFilterType::Pointer MNFfilter = MNFFilterType::New ();
We then set the number of principal components required as output. We can choose to get fewer PCs
than the number of input bands.
MNFfilter ->SetNumberOfPrincipalComponentsRequired(
numberOfPrincipalComponentsRequired);
We set the radius of the sliding window for noise estimation.
NoiseFilterType:: RadiusType radius = {{ vradius , vradius }};
MNFfilter ->GetNoiseImageFilter()->SetRadius (radius);
Last, we can activate normalisation.
MNFfilter ->SetUseNormalization( normalization );
We now instantiate the writer and set the file name for the output image.
WriterType ::Pointer writer = WriterType ::New();
writer ->SetFileName (outputFilename);
We finally plug the pipeline and trigger the MNF computation with the method Update() of the
writer.
MNFfilter ->SetInput (reader ->GetOutput ());
writer ->SetInput(MNFfilter ->GetOutput ());
writer ->Update();
Figure 18.3: Result of applying the otb::MNFImageFilter to an image. From left to right: original image,
color composition with first three principal components and output of the inverse mode (the input RGB image).
The otb::MNFImageFilter also allows computing the inverse transformation from the MNF coefficients. In
reverse mode, the covariance matrix or the transformation matrix (which may not be square) has to
be given.
typedef otb::MNFImageFilter < ImageType , ImageType ,
NoiseFilterType ,
otb::Transform ::INVERSE > InvMNFFilterType;
InvMNFFilterType::Pointer invFilter = InvMNFFilterType::New();
invFilter ->SetMeanValues( MNFfilter ->GetMeanValues() );
if ( normalization )
invFilter ->SetStdDevValues( MNFfilter ->GetStdDevValues() );
invFilter ->SetTransformationMatrix( MNFfilter ->GetTransformationMatrix() );
invFilter ->SetInput (MNFfilter ->GetOutput ());
WriterType ::Pointer invWriter = WriterType ::New();
invWriter ->SetFileName (outputInverseFilename );
invWriter ->SetInput (invFilter ->GetOutput () );
invWriter ->Update ();
Figure 18.3 shows the result of applying the forward and reverse MNF transformation to an 8-band
Worldview2 image.
18.4 Fast Independent Component Analysis
The source code for this example can be found in the file
Examples/DimensionReduction/ICAExample.cxx.
This example illustrates the use of the otb::FastICAImageFilter. This filter computes a Fast Independent Component Analysis transform.
Like Principal Components Analysis, Independent Component Analysis [77] computes a set of or-
thogonal linear combinations, but the criterion of Fast ICA is different: instead of maximizing vari-
ance, it tries to maximize statistical independence between components.
In the Fast ICA algorithm [67], statistical independence is measured by evaluating non-Gaussianity
of the components, and the maximization is done in an iterative way.
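For reference, the classical FastICA fixed-point update for one un-mixing vector $w$ reads as follows; this is the textbook form of the algorithm, and since the filter exposes an additional $\mu$ parameter (set below) the exact internal update may differ:

\[
w^{+} = \mathrm{E}\{\tilde{x}\, g(w^T \tilde{x})\} - \mathrm{E}\{g'(w^T \tilde{x})\}\, w, \qquad w \leftarrow \frac{w^{+}}{\|w^{+}\|},
\]

where $\tilde{x}$ denotes the whitened data and $g$ is a non-quadratic non-linearity used as a proxy for non-Gaussianity.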
The first step required to use this filter is to include its header file.
#include "otbFastICAImageFilter.h"
We start by defining the types for the images, the reader, and the writer. We choose to work with
an otb::VectorImage, since we will produce a multi-channel image (the independent components)
from a multi-channel input image.
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <ImageType > WriterType ;
We instantiate now the image reader and we set the image file name.
ReaderType ::Pointer reader = ReaderType ::New();
reader ->SetFileName (inputFileName);
We define the type for the filter. It is templated over the input and the output image types and also
the transformation direction. The internal structure of this filter is a filter-to-filter like structure. We
can now instantiate the filter.
typedef otb::FastICAImageFilter <ImageType , ImageType ,
otb::Transform ::FORWARD> FastICAFilterType;
FastICAFilterType::Pointer FastICAfilter = FastICAFilterType::New ();
We then set the number of independent components required as output. We can choose to get fewer
ICs than the number of input bands.
FastICAfilter ->SetNumberOfPrincipalComponentsRequired(
numberOfPrincipalComponentsRequired);
We set the number of iterations of the ICA algorithm.
FastICAfilter ->SetNumberOfIterations(numIterations);
We also set the µ parameter.
FastICAfilter ->SetMu( mu );
We now instantiate the writer and set the file name for the output image.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputFilename);
We finally plug the pipeline and trigger the ICA computation with the method Update() of the
writer.
FastICAfilter ->SetInput(reader ->GetOutput ());
writer ->SetInput(FastICAfilter ->GetOutput ());
writer ->Update();
The otb::FastICAImageFilter also allows computing the inverse transformation from the ICA coefficients.
In reverse mode, the covariance matrix or the transformation matrix (which may not be square) has
to be given.
typedef otb::FastICAImageFilter < ImageType , ImageType ,
otb::Transform ::INVERSE > InvFastICAFilterType;
InvFastICAFilterType::Pointer invFilter = InvFastICAFilterType::New();
invFilter ->SetMeanValues( FastICAfilter ->GetMeanValues() );
invFilter ->SetStdDevValues( FastICAfilter ->GetStdDevValues() );
invFilter ->SetTransformationMatrix( FastICAfilter ->GetTransformationMatrix() );
invFilter ->SetPCATransformationMatrix(
FastICAfilter ->GetPCATransformationMatrix() );
invFilter ->SetInput (FastICAfilter ->GetOutput ());
WriterType ::Pointer invWriter = WriterType ::New();
invWriter ->SetFileName (outputInverseFilename );
invWriter ->SetInput (invFilter ->GetOutput () );
invWriter ->Update ();
Figure 18.4 shows the result of applying the forward and reverse FastICA transformation to an 8-band
Worldview2 image.
18.5 Maximum Autocorrelation Factor
The source code for this example can be found in the file
Examples/DimensionReduction/MaximumAutocorrelationFactor.cxx.
This example illustrates the class otb::MaximumAutocorrelationFactorImageFilter, which
performs a Maximum Autocorrelation Factor transform [103]. Like PCA, MAF tries to find a set of
orthogonal linear combinations, but the criterion to maximize is the spatial auto-correlation rather than
the variance.
Auto-correlation is the correlation between the component and a unitary shifted version of the com-
ponent.
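In the usual formulation, given here as background rather than as a description of the filter internals, for a spatial shift $\Delta$ the MAF vectors $a$ maximise:

\[
\rho_{\Delta}(a) = \mathrm{Corr}\left(a^T x(r),\, a^T x(r+\Delta)\right) = 1 - \frac{a^T \Sigma_{\Delta}\, a}{2\, a^T \Sigma\, a},
\]

where $\Sigma$ is the covariance matrix of the image and $\Sigma_{\Delta}$ the covariance matrix of the difference image $x(r) - x(r+\Delta)$; the factors are obtained by solving the corresponding generalized eigenvalue problem.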
Figure 18.4: Result of applying the otb::FastICAImageFilter to an image. From left to right: original
image, color composition with first three principal components and output of the inverse mode (the input RGB
image).
Please note that the inverse transform is not implemented yet.
We start by including the corresponding header file.
#include "otbMaximumAutocorrelationFactorImageFilter.h"
We then define the types for the input image and the output image.
typedef otb::VectorImage <unsigned short, 2> InputImageType;
typedef otb::VectorImage <double, 2> OutputImageType;
We can now declare the types for the reader and the writer. Since the images can be very large, we will force the
pipeline to use streaming. For this purpose, the file writer will be streamed. This is achieved by
using the otb::ImageFileWriter class.
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
The otb::MaximumAutocorrelationFactorImageFilter is templated over the type of the input image and the type of the generated output image.
typedef otb::MaximumAutocorrelationFactorImageFilter<InputImageType ,
OutputImageType > FilterType ;
The different elements of the pipeline can now be instantiated.
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
FilterType ::Pointer filter = FilterType ::New ();
We set the parameters of the different elements of the pipeline.
Figure 18.5: Results of the Maximum Autocorrelation Factor algorithm applied to an 8-band Worldview2 image (first 3 components).
reader ->SetFileName (infname );
writer ->SetFileName (outfname );
We build the pipeline by plugging all the elements together.
filter ->SetInput(reader ->GetOutput ());
writer ->SetInput(filter ->GetOutput ());
And then we can trigger the pipeline update, as usual.
writer ->Update();
Figure 18.5 shows the results of the Maximum Autocorrelation Factor applied to an 8-band Worldview2
image.
CHAPTER
NINETEEN
CLASSIFICATION
19.1 Introduction
Image classification consists in extracting added-value information from images. Such processing methods classify pixels within images into geographically connected zones with similar properties, identified by a common class label. The classification can be either unsupervised or supervised.
Unsupervised classification does not require any additional information about the properties of the input image. On the contrary, supervised methods need a preliminary learning stage, computed over training datasets with properties similar to those of the image to classify, in order to build a classification model.
19.2 Unsupervised classification
19.2.1 K-Means Classification
Simple version
The source code for this example can be found in the file
Examples/Classification/ScalarImageKmeansClassifier.cxx.
This example shows how to use the KMeans model for classifying the pixels of a scalar image.
The itk::Statistics::ScalarImageKmeansImageFilter is used for taking a scalar image and
applying the K-Means algorithm in order to define classes that represent statistical distributions of
intensity values in the pixels. The classes are then used in this filter for generating a labeled image
where every pixel is assigned to one of the classes.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkScalarImageKmeansImageFilter.h"
First we define the pixel type and dimension of the image that we intend to classify. With this image
type we can also declare the otb::ImageFileReader needed for reading the input image, create
one and set its input filename.
typedef signed short PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputImageFileName);
With the ImageType we instantiate the type of the itk::ScalarImageKmeansImageFilter that
will compute the K-Means model and then classify the image pixels.
typedef itk::ScalarImageKmeansImageFilter<ImageType > KMeansFilterType;
KMeansFilterType::Pointer kmeansFilter = KMeansFilterType::New ();
kmeansFilter ->SetInput(reader ->GetOutput ());
const unsigned int numberOfInitialClasses = atoi(argv[4]);
In general the classification will produce as output an image whose pixel values are integers associated with the labels of the classes. Since these integers will typically be generated in order (0, 1, 2, ...N), the output image will tend to look very dark when displayed with naive viewers. It is therefore convenient to have the option of spreading the label values over the dynamic range of the output image pixel type. When this is done, the dynamic range of the pixels is divided by the number of classes in order to define the increment between labels. For example, an 8-bit output image has a dynamic range of [0:255], and when it is used for holding four classes, the non-contiguous labels will be (0, 64, 128, 192). The selection of the mode to use is done with the method SetUseNonContiguousLabels().
const unsigned int useNonContiguousLabels = atoi(argv[3]);
kmeansFilter ->SetUseNonContiguousLabels(useNonContiguousLabels);
For each one of the classes we must provide a tentative initial value for the mean of the class. Given
that this is a scalar image, each one of the means is simply a scalar value. Note however that in a
general case of K-Means, the input image would be a vector image and therefore the means will be
vectors of the same dimension as the image pixels.
for (unsigned k = 0; k < numberOfInitialClasses; ++k)
{
const double userProvidedInitialMean = atof(argv[k + argoffset ]);
kmeansFilter ->AddClassWithInitialMean(userProvidedInitialMean);
}
The itk::ScalarImageKmeansImageFilter is predefined for producing an 8-bit scalar image as output. This output image contains the labels associated with each one of the classes of the K-Means algorithm. In the following lines we use the OutputImageType in order to instantiate the type of an otb::ImageFileWriter. We then create one and connect it to the output of the classification filter.
typedef KMeansFilterType::OutputImageType OutputImageType;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(kmeansFilter ->GetOutput ());
writer ->SetFileName (outputImageFileName);
We are now ready for triggering the execution of the pipeline. This is done by simply invoking the
Update() method in the writer. This call will propagate the update request to the reader and then to
the classifier.
try
{
writer ->Update ();
}
catch (itk:: ExceptionObject& excp)
{
std::cerr << "Problem encountered while writing ";
std::cerr << " image file : " << argv[2] << std::endl;
std::cerr << excp << std::endl;
return EXIT_FAILURE;
}
At this point the classification is done, the labeled image is saved in a file, and we can take a look at
the means that were found as a result of the model estimation performed inside the classifier filter.
KMeansFilterType:: ParametersType estimatedMeans =
kmeansFilter ->GetFinalMeans();
const unsigned int numberOfClasses = estimatedMeans.Size();
for (unsigned int i = 0; i < numberOfClasses; ++i)
{
std::cout << "cluster[" << i << "] ";
std::cout << " estimated mean : " << estimatedMeans[i] << std::endl;
}
Figure 19.1 illustrates the effect of this filter with three classes. The means can be estimated by
ScalarImageKmeansModelEstimator.cxx.
The source code for this example can be found in the file
Examples/Classification/ScalarImageKmeansModelEstimator.cxx.
This example shows how to compute the KMeans model of a scalar image.
Figure 19.1: Effect of the KMeans classifier. Left: original image. Right: image of classes.
The itk::Statistics::KdTreeBasedKmeansEstimator is used for taking a scalar image and
applying the K-Means algorithm in order to define classes that represent statistical distributions of
intensity values in the pixels. One of the drawbacks of this technique is that the spatial distribution
of the pixels is not considered at all. It is therefore common to combine the classification resulting
from K-Means with other segmentation techniques that will use the classification as a prior and add
spatial information to it in order to produce a better segmentation.
// Create a List from the scalar image
typedef itk::Statistics ::ScalarImageToListAdaptor<ImageType > AdaptorType ;
AdaptorType ::Pointer adaptor = AdaptorType ::New();
adaptor ->SetImage(reader ->GetOutput ());
// Define the Measurement vector type from the AdaptorType
typedef AdaptorType ::MeasurementVectorType MeasurementVectorType;
// Create the K-d tree structure
typedef itk::Statistics :: WeightedCentroidKdTreeGenerator<
AdaptorType >
TreeGeneratorType;
TreeGeneratorType::Pointer treeGenerator = TreeGeneratorType::New();
treeGenerator ->SetSample (adaptor );
treeGenerator ->SetBucketSize(16);
treeGenerator ->Update();
typedef TreeGeneratorType::KdTreeType TreeType;
typedef itk::Statistics ::KdTreeBasedKmeansEstimator<TreeType > EstimatorType;
EstimatorType::Pointer estimator = EstimatorType::New();
const unsigned int numberOfClasses = 3;
EstimatorType::ParametersType initialMeans(numberOfClasses);
initialMeans[0] = 25.0;
initialMeans[1] = 125.0;
initialMeans[2] = 250.0;
estimator ->SetParameters(initialMeans);
estimator ->SetKdTree (treeGenerator ->GetOutput ());
estimator ->SetMaximumIteration(200);
estimator ->SetCentroidPositionChangesThreshold(0.0);
estimator ->StartOptimization();
EstimatorType::ParametersType estimatedMeans = estimator ->GetParameters();
for (unsigned int i = 0; i < numberOfClasses; ++i)
{
std::cout << "cluster[" << i << "] " << std::endl;
std::cout << " estimated mean : " << estimatedMeans[i] << std::endl;
}
General approach
The source code for this example can be found in the file
Examples/Classification/KMeansImageClassificationExample.cxx.
The K-Means classification proposed by ITK for images is limited to scalar im-
ages and is not streamed. In this example, we show how the use of the
otb::KMeansImageClassificationFilter allows for a simple implementation of a K-Means
classification application. We will start by including the appropriate header file.
#include "otbKMeansImageClassificationFilter.h"
We will assume double precision input images and will also define the type for the labeled pixels.
const unsigned int Dimension = 2;
typedef double PixelType ;
typedef unsigned short LabeledPixelType;
Our classifier will be generic enough to be able to process images with any number of bands. We
read the images as otb::VectorImages. The labeled image will be a scalar image.
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::Image <LabeledPixelType , Dimension > LabeledImageType;
We can now define the type for the classifier filter, which is templated over its input and output
image types.
typedef otb::KMeansImageClassificationFilter<ImageType , LabeledImageType >
ClassificationFilterType;
typedef ClassificationFilterType::KMeansParametersType KMeansParametersType;
And finally, we define the reader and the writer. Since the images to classify can be very big, we
will use a streamed writer which will trigger the streaming ability of the classifier.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <LabeledImageType > WriterType ;
We instantiate the classifier and the reader objects and we set their parameters. Please note the
call of the GenerateOutputInformation() method on the reader in order to have available the
information about the input image (size, number of bands, etc.) without needing to actually read the
image.
ClassificationFilterType::Pointer filter = ClassificationFilterType::New();
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (infname );
reader ->GenerateOutputInformation();
The classifier needs as input the centroids of the classes. We declare the parameter vector, and we
read the centroids from the arguments of the program.
const unsigned int sampleSize =
ClassificationFilterType::MaxSampleDimension;
const unsigned int parameterSize = nbClasses * sampleSize ;
KMeansParametersType parameters ;
parameters .SetSize(parameterSize);
parameters .Fill(0);
for (unsigned int i = 0; i < nbClasses ; ++i)
{
for (unsigned int j = 0; j <
reader ->GetOutput ()->GetNumberOfComponentsPerPixel(); ++j)
{
parameters [i * sampleSize + j] =
atof(argv[4 + i *
reader ->GetOutput ()->GetNumberOfComponentsPerPixel()
+ j]);
}
}
std::cout << "Parameters : " << parameters << std::endl;
We set the parameters for the classifier, we plug the pipeline and trigger its execution by updating
the output of the writer.
filter ->SetCentroids(parameters );
filter ->SetInput(reader ->GetOutput ());
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(filter ->GetOutput ());
writer ->SetFileName (outfname );
writer ->Update();
k-d Tree Based k-Means Clustering
The source code for this example can be found in the file
Examples/Classification/KdTreeBasedKMeansClustering.cxx.
K-means clustering is a popular clustering algorithm because it is simple and usually converges to a
reasonable solution. The k-means algorithm works as follows:
1. Obtains the initial k means as input from the user.
2. Assigns each measurement vector in a sample container to its closest mean among the k means (i.e., updates the membership of each measurement vector to the nearest of the k clusters).
3. Calculates each cluster's mean from the newly assigned measurement vectors (updates the centroid (mean) of the k clusters).
4. Repeats steps 2 and 3 until the termination criterion is met.
The most common termination criterion is that the algorithm stops when no measurement vector changes its cluster membership from the previous iteration.
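As a side note, and as standard background rather than anything specific to the ITK implementation, this procedure is a descent on the usual k-means objective:

\[
J(\mu_1, \dots, \mu_k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \| x_i - \mu_j \|^2,
\]

where $C_j$ is the set of measurement vectors currently assigned to mean $\mu_j$. Steps 2 and 3 each decrease $J$ (or leave it unchanged), which is why the algorithm converges to a local minimum of this objective.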
The itk::Statistics::KdTreeBasedKmeansEstimator is a variation of this logic. The k-means
clustering algorithm is computationally very expensive because it has to recalculate the mean at each
iteration. To update the mean values, we have to calculate the distance between k means and each
and every measurement vector. To reduce the computational burden, the KdTreeBasedKmeansEs-
timator uses a special data structure: the k-d tree ( itk::Statistics::KdTree) with additional
information. The additional information includes the number and the vector sum of the measurement vectors under each node of the tree.
With such additional information and the k-d tree data structure, we can reduce the computational cost of the distance calculations and mean updates. Instead of comparing each measurement vector with the k means, we can simply compare each node of the k-d tree with the k means. This idea of utilizing a k-d tree can be found in multiple articles [6] [107] [78]. Our implementation of this scheme follows the article by Kanungo et al. [78].
We use the itk::Statistics::ListSample as the input sample, the itk::Vector as the mea-
surement vector. The following code snippet includes their header files.
#include "itkVector .h"
#include "itkListSample.h"
Since this k-means algorithm requires an itk::Statistics::KdTree object as an in-
put, we include the KdTree class header file. As mentioned above, we need a
k-d tree with the vector sum and the number of measurement vectors. There-
fore we use the itk::Statistics::WeightedCentroidKdTreeGenerator instead of the
itk::Statistics::KdTreeGenerator that generate a k-d tree without such additional informa-
tion.
#include "itkKdTree .h"
#include "itkWeightedCentroidKdTreeGenerator.h"
The KdTreeBasedKmeansEstimator class is the implementation of the k-means algorithm. It does
not create k clusters. Instead, it returns the mean estimates for the k clusters.
#include "itkKdTreeBasedKmeansEstimator.h"
To generate the clusters, we must create k instances of itk::Statistics::EuclideanDistance
function as the membership functions for each cluster and plug that—along with a sample—into an
itk::Statistics::SampleClassifier object to get a itk::Statistics::MembershipSample
that stores pairs of measurement vectors and their associated class labels (k labels).
#include "itkMinimumDecisionRule.h"
#include "itkEuclideanDistance.h"
#include "itkSampleClassifier.h"
We will fill the sample with random variables from two normal distributions using the
itk::Statistics::NormalVariateGenerator.
#include "itkNormalVariateGenerator.h"
Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector
type as a one-component vector. We then create a ListSample object for data inputs. Each measure-
ment vector is of length 1. We set this using the SetMeasurementVectorSize() method.
typedef itk::Vector <double,
1>
MeasurementVectorType;
typedef itk::Statistics ::ListSample <MeasurementVectorType > SampleType ;
SampleType ::Pointer sample = SampleType ::New ();
sample ->SetMeasurementVectorSize(1);
The following code snippet creates a NormalVariateGenerator object. Since the random variable generator returns values according to the standard normal distribution (the mean is zero and the standard deviation is one), we change the mean and standard deviation before pushing random values into the sample. We want data from two normal (Gaussian) distributions, so we use two for loops, each one with a different mean and standard deviation. Before we fill the sample with the second distribution's data, we call the Initialize(random seed) method to recreate the pool of random variables in the normalGenerator.
To see the probability density plots from the two distribution, refer to the Figure 19.2.
typedef itk::Statistics :: NormalVariateGenerator NormalGeneratorType;
NormalGeneratorType::Pointer normalGenerator = NormalGeneratorType::New ();
normalGenerator ->Initialize (101);
MeasurementVectorType mv;
Figure 19.2: Two normal distributions' probability density plots (the means are 100 and 200, and the standard deviation is 30).
double mean = 100;
double standardDeviation = 30;
for (unsigned int i = 0; i < 100; ++i)
{
mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean;
sample ->PushBack(mv);
}
normalGenerator ->Initialize (3024);
mean = 200;
standardDeviation = 30;
for (unsigned int i = 0; i < 100; ++i)
{
mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean;
sample ->PushBack(mv);
}
We create a k-d tree.
typedef itk::Statistics :: WeightedCentroidKdTreeGenerator<SampleType >
TreeGeneratorType;
TreeGeneratorType::Pointer treeGenerator = TreeGeneratorType::New ();
treeGenerator ->SetSample (sample);
treeGenerator ->SetBucketSize(16);
treeGenerator ->Update();
Once we have the k-d tree, it is a simple procedure to produce k mean estimates.
We create the KdTreeBasedKmeansEstimator. Then, we provide the initial mean values using the SetParameters() method. Since we are dealing with two normal distributions in a 1-D space, the size of the mean value array is two. The first element is the first mean value, and the second is the second mean value. If we used two normal distributions in a 2-D space, the size of the array would be four, and the first two elements would be the two components of the first normal distribution's mean vector. We plug in the k-d tree using the SetKdTree() method.
The remaining two methods specify the termination condition. The estimation process stops when
the number of iterations reaches the maximum iteration value set by the SetMaximumIteration(),
or the distances between the newly calculated mean (centroid) values and previous ones are within
the threshold set by the SetCentroidPositionChangesThreshold(). The final step is to call the
StartOptimization() method.
The for loop will print out the mean estimates from the estimation process.
typedef TreeGeneratorType::KdTreeType TreeType;
typedef itk::Statistics ::KdTreeBasedKmeansEstimator<TreeType > EstimatorType;
EstimatorType::Pointer estimator = EstimatorType::New();
EstimatorType::ParametersType initialMeans(2);
initialMeans[0] = 0.0;
initialMeans[1] = 0.0;
estimator ->SetParameters(initialMeans);
estimator ->SetKdTree (treeGenerator ->GetOutput ());
estimator ->SetMaximumIteration(200);
estimator ->SetCentroidPositionChangesThreshold(0.0);
estimator ->StartOptimization();
EstimatorType::ParametersType estimatedMeans = estimator ->GetParameters();
for (unsigned int i = 0; i < 2; ++i)
{
std::cout << "cluster[" << i << "] " << std::endl;
std::cout << " estimated mean : " << estimatedMeans[i] << std::endl;
}
If we were only interested in finding the mean estimates, we could stop here. However, to illustrate how a classifier can be formed using the statistical classification framework, we go a little bit further in this example.
Since the k-means algorithm is a minimum-distance classifier using the estimated k means and the measurement vectors, we use the EuclideanDistance class as the membership functions. Our choice for the decision rule is the itk::Statistics::MinimumDecisionRule, which returns the index of the membership function that has the smallest value for a measurement vector.
After creating a SampleClassifier object and a MinimumDecisionRule object, we plug the decisionRule and the sample into the classifier. Then, we must specify the number of classes that
will be considered using the SetNumberOfClasses() method.
The remainder of the following code snippet shows how to use user-specified class labels. The
classification result will be stored in a MembershipSample object, and for each measurement vector,
its class label will be one of the two class labels, 100 and 200 (unsigned int).
typedef itk::Statistics :: EuclideanDistance
<MeasurementVectorType > MembershipFunctionType;
typedef itk::MinimumDecisionRule DecisionRuleType;
DecisionRuleType::Pointer decisionRule = DecisionRuleType::New ();
typedef itk::Statistics ::SampleClassifier <SampleType > ClassifierType;
ClassifierType::Pointer classifier = ClassifierType::New ();
classifier ->SetDecisionRule((itk::DecisionRuleBase::Pointer) decisionRule);
classifier ->SetSample (sample);
classifier ->SetNumberOfClasses(2);
std::vector <unsigned int> classLabels ;
classLabels .resize (2);
classLabels [0] = 100;
classLabels [1] = 200;
classifier ->SetMembershipFunctionClassLabels(classLabels );
The classifier is almost ready to do the classification process except that it needs two membership
functions that represent the two clusters respectively.
In this example, the two clusters are modeled by two Euclidean distance functions. The distance
function (model) has only one parameter, its mean (centroid) set by the SetOrigin() method. To
plug-in two distance functions, we call the AddMembershipFunction() method. Then invocation
of the Update() method will perform the classification.
std::vector <MembershipFunctionType::Pointer> membershipFunctions;
MembershipFunctionType:: OriginType origin(
sample ->GetMeasurementVectorSize());
int index = 0;
for (unsigned int i = 0; i < 2; ++i)
{
membershipFunctions.push_back (MembershipFunctionType::New ());
for (unsigned int j = 0; j < sample ->GetMeasurementVectorSize(); ++j)
{
origin[j] = estimatedMeans[index++];
}
membershipFunctions[i]->SetOrigin (origin);
classifier ->AddMembershipFunction(membershipFunctions[i].GetPointer ());
}
classifier ->Update ();
The following code snippet prints out the measurement vectors and their class labels in the sample.
Figure 19.3: Kohonen’s Self Organizing Map
ClassifierType::OutputType * membershipSample =
classifier ->GetOutput ();
ClassifierType::OutputType ::ConstIterator iter = membershipSample ->Begin();
while (iter != membershipSample ->End ())
{
std::cout << "measurement vector = " << iter.GetMeasurementVector()
<< "class label = " << iter.GetClassLabel()
<< std::endl;
++iter;
}
19.2.2 Kohonen’s Self Organizing Map
The Self Organizing Map, SOM, introduced by Kohonen is a non-supervised neural learning algo-
rithm. The map is composed of neighboring cells which are in competition by means of mutual
interactions and which adapt in order to match characteristic patterns of the examples given during the learning step. The SOM is usually defined on a plane (2D).
The algorithm implements a nonlinear projection from a high dimensional feature space to a lower
dimension space, usually 2D. It is able to find the correspondence between a set of structured data
and a network of much lower dimension while keeping the topological relationships existing in the
feature space. Thanks to this topological organization, the final map presents clusters and their
relationships.
Kohonen's SOM is usually represented as an array of cells where each cell $i$ is associated to a feature (or weight) vector $m_i = [m_{i1}, m_{i2}, \cdots, m_{in}]^T \in \mathbb{R}^n$ (figure 19.3).
A cell (or neuron) in the map is a good detector for a given input vector $x = [x_1, x_2, \cdots, x_n]^T \in \mathbb{R}^n$ if the latter is close to the former. This distance between vectors can be represented by the scalar product $x^T \cdot m_i$, but in most cases other distances can be used, as for instance the Euclidean one. The cell having the weight vector closest to the input vector is called the winner.
The goal of the learning step is to get a map which is representative of an input example set. It is an
iterative procedure which consists in passing each input example to the map, testing the response of
each neuron and modifying the map to get it closer to the examples.
Algorithm 1 SOM learning:
1. $t = 0$.
2. Initialize the weight vectors of the map (randomly, for instance).
3. While $t <$ number of iterations, do:
   (a) $k = 0$.
   (b) While $k <$ number of examples, do:
       i. Find the vector $m_i(t)$ which minimizes the distance $d(x_k, m_i(t))$.
       ii. For a neighborhood $N_c(t)$ around the winner cell, apply the transformation:
           \[
           m_i(t+1) = m_i(t) + \beta(t)\left[x_k(t) - m_i(t)\right] \qquad (19.1)
           \]
       iii. $k = k + 1$.
   (c) $t = t + 1$.
In 19.1, $\beta(t)$ is a decreasing function with the geometrical distance to the winner cell. For instance:
\[
\beta(t) = \beta_0(t)\, e^{-\frac{\| r_i - r_c \|^2}{\sigma^2(t)}}, \qquad (19.2)
\]
with $\beta_0(t)$ and $\sigma(t)$ decreasing functions with time and $r$ the cell coordinates in the output (map) space.
Therefore the algorithm consists in getting the map closer to the learning set. The use of a neigh-
borhood around the winner cell allows the organization of the map into areas which specialize in the
recognition of different patterns. This neighborhood also ensures that cells which are topologically
close are also close in terms of the distance defined in the feature space.
Building a color table
The source code for this example can be found in the file
Examples/Learning/SOMExample.cxx.
This example illustrates the use of the otb::SOM class for building Kohonen’s Self Organizing
Maps.
We will use the SOM in order to build a color table from an input image. Our input image is coded
with 3 × 8 bits and we would like to code it with only 16 levels. We will use the SOM in order to
learn which are the 16 most representative RGB values of the input image and we will assume that
this is the optimal color table for the image.
The first thing to do is include the header file for the class. We will also need the header files for the
map itself and the activation map builder whose utility will be explained at the end of the example.
#include "otbSOMMap .h"
#include "otbSOM.h"
#include "otbSOMActivationBuilder.h"
Since the otb::SOM class uses a distance, we will need to include the header file for the one we
want to use
#include "itkEuclideanDistance.h"
The Self Organizing Map itself is actually an N-dimensional image where each pixel contains a
neuron. In our case, we decide to build a 2-dimensional SOM, where the neurons store RGB values
with floating point precision.
const unsigned int Dimension = 2;
typedef double PixelType ;
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef ImageType ::PixelType VectorType ;
The distance that we want to apply between the RGB values is the Euclidean one. Of course we
could choose to use other type of distance, as for instance, a distance defined in any other color
space.
typedef itk::Statistics ::EuclideanDistance <VectorType > DistanceType;
We can now define the type for the map. The otb::SOMMap class is templated over the neuron type (PixelType here), the distance type and the number of dimensions. Note that the number of dimensions of the map could be different from the one of the images to be processed.
typedef otb::SOMMap <VectorType , DistanceType , Dimension > MapType;
We are going to perform the learning directly on the pixels of the input image. Therefore, the image
type is defined using the same pixel type as we used for the map. We also define the type for the
image file reader.
typedef otb::ImageFileReader <ImageType > ReaderType ;
Since the otb::SOM class works on lists of samples, it will need to access the input image through
an adaptor. Its type is defined as follows:
typedef itk::Statistics ::ListSample <VectorType > SampleListType;
We can now define the type for the SOM, which is templated over the input sample list, the type
of the map to be produced and the two functors that hold the training behavior.
typedef otb::Functor ::CzihoSOMLearningBehaviorFunctor
LearningBehaviorFunctorType;
typedef otb::Functor ::CzihoSOMNeighborhoodBehaviorFunctor
NeighborhoodBehaviorFunctorType;
typedef otb::SOM<SampleListType , MapType ,
LearningBehaviorFunctorType, NeighborhoodBehaviorFunctorType>
SOMType;
As an alternative to the standard SOMType, one can decide to use an otb::PeriodicSOM, which behaves like otb::SOM but is to be considered as a torus instead of a simple map. Hence, the neighborhood behavior of the winning neuron does not depend on its location on the map. otb::PeriodicSOM is defined in otbPeriodicSOM.h.
We can now start building the pipeline. The first step is to instantiate the reader and pass its output
to the adaptor.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFileName);
reader ->Update();
SampleListType::Pointer sampleList = SampleListType::New ();
sampleList ->SetMeasurementVectorSize(reader ->GetOutput ()->GetVectorLength());
itk::ImageRegionIterator <ImageType > imgIter (reader ->GetOutput (),
reader ->GetOutput ()->
GetBufferedRegion());
imgIter.GoToBegin ();
itk::ImageRegionIterator <ImageType > imgIterEnd (reader ->GetOutput (),
reader ->GetOutput ()->
GetBufferedRegion());
imgIterEnd .GoToEnd ();
do
{
sampleList ->PushBack(imgIter.Get());
++imgIter;
}
while (imgIter != imgIterEnd );
We can now instantiate the SOM algorithm and set the sample list as input.
SOMType ::Pointer som = SOMType::New ();
som->SetListSample(sampleList );
We use a SOMType::SizeType array in order to set the sizes of the map.
SOMType ::SizeType size;
size[0] = sizeX;
size[1] = sizeY;
som->SetMapSize (size);
The initial size of the neighborhood of each neuron is set in the same way.
SOMType ::SizeType radius;
radius[0] = neighInitX ;
radius[1] = neighInitY ;
som->SetNeighborhoodSizeInit(radius);
The other parameters are the number of iterations, the initial and the final values for the learning rate
– β – and the maximum initial value for the neurons (the map will be randomly initialized).
som->SetNumberOfIterations(nbIterations);
som->SetBetaInit (betaInit );
som->SetBetaEnd (betaEnd);
som->SetMaxWeight(static_cast<PixelType >(initValue ));
Now comes the initialization of the functors.
LearningBehaviorFunctorType learningFunctor;
learningFunctor.SetIterationThreshold(radius , nbIterations);
som->SetBetaFunctor(learningFunctor);
NeighborhoodBehaviorFunctorType neighborFunctor;
som->SetNeighborhoodSizeFunctor(neighborFunctor);
som->Update ();
Finally, we set up the last part of the pipeline, where we plug the output of the SOM into the writer.
The learning procedure is triggered by calling the Update() method on the writer. Since the map is
itself an image, we can write it to disk with an otb::ImageFileWriter.
Figure 19.4 shows the result of the SOM learning. Since we have performed a learning on RGB
pixel values, the produced SOM can be interpreted as an optimal color table for the input image. It
can be observed that the obtained colors are topologically organised, so similar colors are also close
in the map. This topological organisation can be exploited to further reduce the number of coding
levels of the pixels without performing a new learning: we can subsample the map to get a new
color table. Also, a bilinear interpolation between the neurons can be used to increase the number
of coding levels.
We can now compute the activation map for the input image. The activation map tells us how many
times a given neuron is activated for the set of examples given to the map. The activation map is
stored as a scalar image and an integer pixel type is usually enough.
typedef unsigned char OutputPixelType;
typedef otb::Image <OutputPixelType , Dimension > OutputImageType;
typedef otb::ImageFileWriter <OutputImageType > ActivationWriterType;
Figure 19.4: Result of the SOM learning. Left: RGB image. Center: SOM. Right: Activation map
In a similar way to the otb::SOM class the otb::SOMActivationBuilder is templated over the
sample list given as input, the SOM map type and the activation map to be built as output.
typedef otb::SOMActivationBuilder <SampleListType , MapType,
OutputImageType > SOMActivationBuilderType;
We instantiate the activation map builder and set as inputs the SOM map built before and the sample list.
SOMActivationBuilderType::Pointer somAct
= SOMActivationBuilderType::New();
somAct ->SetInput(som->GetOutput ());
somAct ->SetListSample(sampleList );
somAct ->Update();
The final step is to write the activation map to a file.
if (actMapFileName != NULL)
  {
  ActivationWriterType::Pointer actWriter = ActivationWriterType::New();
  actWriter->SetFileName(actMapFileName);
  actWriter->SetInput(somAct->GetOutput());
  actWriter->Update();
  }
The right-hand side of figure 19.4 shows the activation map obtained.
SOM Classification
The source code for this example can be found in the file
Examples/Learning/SOMClassifierExample.cxx.
This example illustrates the use of the otb::SOMClassifier class for performing a classification
using an existing Kohonen Self-Organizing Map. The SOM classification simply consists in assigning
the index of the winning neuron to a given feature vector.
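In other words, for a feature vector $x$, the label assigned by the classifier is the index of the neuron whose
weight vector $w_k$ is closest according to the chosen distance (here the Euclidean distance):

$$ k^{*} = \arg\min_{k}\; d(x, w_k) $$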
We will use the SOM created in section 19.2.2 and we will assume that each neuron represents a
class in the image.
The first thing to do is include the header file for the class.
#include "otbSOMClassifier.h"
As for the SOM learning step, we must define the types for the otb::SOMMap, and therefore, also
for the distance to be used. We will also define the type for the SOM reader, which is actually an
otb::ImageFileReader templated over the appropriate image type.
typedef itk::Statistics ::EuclideanDistance <PixelType > DistanceType;
typedef otb::SOMMap <PixelType , DistanceType , Dimension > SOMMapType ;
typedef otb::ImageFileReader <SOMMapType > SOMReaderType;
The classification will be performed by the otb::SOMClassifier, which, like most classifiers,
works on itk::Statistics::ListSample objects. In order to be able to perform an image
classification, we will need to use the itk::Statistics::ImageToListAdaptor, which is templated
over the type of image to be adapted. The SOMClassifier is templated over the sample type, the
SOMMap type and the pixel type for the labels.
typedef itk::Statistics ::ListSample <PixelType > SampleType ;
typedef otb::SOMClassifier <SampleType , SOMMapType , LabelPixelType >
ClassifierType;
The result of the classification will be stored in an image and saved to a file. Therefore, we define
the types needed for this step.
typedef otb::Image <LabelPixelType , Dimension > OutputImageType;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
We can now start reading the input image and the SOM given as inputs to the program. We instantiate
the readers as usual.
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (imageFilename);
reader ->Update();
SOMReaderType::Pointer somreader = SOMReaderType::New();
somreader ->SetFileName (mapFilename );
somreader ->Update ();
The conversion of the input data from image to list sample is easily done using the adaptor.
SampleType ::Pointer sample = SampleType ::New ();
itk::ImageRegionIterator <InputImageType > it(reader ->GetOutput (),
reader ->GetOutput ()->
GetLargestPossibleRegion());
it.GoToBegin ();
while (!it.IsAtEnd ())
{
sample ->PushBack(it.Get ());
++it;
}
The classifier can now be instantiated. The input data is set by using the SetSample() method and
the SOM is set using the SetMap() method. The classification is triggered by using the Update()
method.
ClassifierType::Pointer classifier = ClassifierType::New ();
classifier ->SetSample (sample.GetPointer ());
classifier ->SetMap(somreader ->GetOutput ());
classifier ->Update ();
Once the classification has been performed, the sample list obtained at the output of the classifier
must be converted into an image. We create the image as follows:
OutputImageType::Pointer outputImage = OutputImageType::New();
outputImage ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion());
outputImage ->Allocate ();
We can now get a pointer to the classification result.
ClassifierType::OutputType * membershipSample = classifier ->GetOutput ();
And we can declare the iterators pointing to the front and the back of the sample list.
ClassifierType::OutputType ::ConstIterator m_iter = membershipSample ->Begin();
ClassifierType::OutputType ::ConstIterator m_last = membershipSample ->End();
We also declare an itk::ImageRegionIterator in order to fill the output image with the class
labels.
typedef itk::ImageRegionIterator <OutputImageType > OutputIteratorType;
OutputIteratorType outIt(outputImage , outputImage ->GetLargestPossibleRegion());
We iterate through the sample list and the output image and assign the label values to the image
pixels.
outIt.GoToBegin ();
while (m_iter != m_last && !outIt.IsAtEnd ())
{
outIt.Set(m_iter.GetClassLabel());
++m_iter;
++outIt;
}
Figure 19.5: Result of the SOM classification. Left: RGB image. Center: SOM. Right: Classified image
Finally, we write the classified image to a file.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputFilename);
writer ->SetInput(outputImage );
writer ->Update();
Figure 19.5 shows the result of the SOM classification.
Multi-band, streamed classification
The source code for this example can be found in the file
Examples/Classification/SOMImageClassificationExample.cxx.
In previous examples, we have used the otb::SOMClassifier, which uses the ITK classification
framework. This is good for compatibility with the ITK framework, but it introduces the limitations
of not being able to use streaming and of having to know at compile time the number of bands of the
image to be classified. In OTB we have avoided these limitations by developing the
otb::SOMImageClassificationFilter. In this example we will illustrate its use. We start by
including the appropriate header file.
#include "otbSOMImageClassificationFilter.h"
We will assume double precision input images and will also define the type for the labeled pixels.
const unsigned int Dimension = 2;
typedef double PixelType ;
typedef unsigned short LabeledPixelType;
Our classifier will be generic enough to be able to process images with any number of bands. We
read the images as otb::VectorImages. The labeled image will be a scalar image.
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::Image <LabeledPixelType , Dimension > LabeledImageType;
We can now define the type for the classifier filter, which is templated over its input and output
image types and the SOM type.
typedef otb::SOMMap <ImageType ::PixelType > SOMMapType ;
typedef otb::SOMImageClassificationFilter<ImageType ,
LabeledImageType ,
SOMMapType >
ClassificationFilterType;
And finally, we define the readers (for the input image and the SOM) and the writer. Since the
images to classify can be very big, we will use a streamed writer which will trigger the streaming
ability of the classifier.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileReader <SOMMapType > SOMReaderType;
typedef otb::ImageFileWriter <LabeledImageType > WriterType ;
We instantiate the classifier and the reader objects and we set the existing SOM obtained in a previ-
ous training step.
ClassificationFilterType::Pointer filter = ClassificationFilterType::New();
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (infname );
SOMReaderType::Pointer somreader = SOMReaderType::New();
somreader ->SetFileName (somfname );
somreader ->Update ();
filter ->SetMap(somreader ->GetOutput ());
We plug the pipeline and trigger its execution by updating the output of the writer.
filter ->SetInput(reader ->GetOutput ());
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(filter ->GetOutput ());
writer ->SetFileName (outfname );
writer ->Update();
19.2.3 Bayesian Plug-In Classifier
The source code for this example can be found in the file
Examples/Classification/BayesianPluginClassifier.cxx.
In this example, we present a system that places measurement vectors into two Gaussian classes.
Figure 19.6 shows all the components of the classifier system and the data flow. This system
differs from the previous k-means clustering algorithm in several ways. The biggest difference
is that this classifier uses the itk::Statistics::GaussianDensityFunction as membership
functions instead of the itk::Statistics::EuclideanDistance. Since the membership function
is different, it requires a different set of parameters: mean vectors and covariance matrices. We
choose the itk::Statistics::MeanCalculator (sample mean) and the
itk::Statistics::CovarianceCalculator (sample covariance) as the estimation algorithms for
these two parameters. If we want more robust estimation algorithms, we can replace them with
alternatives without changing the other components of the classifier system.

Figure 19.6: Bayesian plug-in classifier for two Gaussian classes.
It is a bad idea to use the same sample for testing and for training (parameter estimation). However,
for simplicity, in this example we use the same sample for both.
We use the itk::Statistics::ListSample as the sample (test and training). The itk::Vector
is our measurement vector class. To store measurement vectors into two separate sample containers,
we use the itk::Statistics::Subsample objects.
#include "itkVector .h"
#include "itkListSample.h"
#include "itkSubsample.h"
The following two files provide us with the parameter estimation algorithms.
#include "itkMeanCalculator.h"
#include "itkCovarianceCalculator.h"
The following files define the components required by ITK statistical classification framework: the
decision rule, the membership function, and the classifier.
#include "itkMaximumRatioDecisionRule.h"
#include "itkGaussianDensityFunction.h"
#include "itkSampleClassifier.h"
We will fill the sample with random variables from two normal distributions using the
itk::Statistics::NormalVariateGenerator.
#include "itkNormalVariateGenerator.h"
Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector type
as a one-component vector. We then create a ListSample object for the input data.
We also create two Subsample objects that will store the measurement vectors of the sample into two
separate sample containers. Each Subsample object stores only the measurement vectors belonging
to a single class. These class samples will be used by the parameter estimation algorithms.
typedef itk::Vector <double,
1>
MeasurementVectorType;
typedef itk::Statistics ::ListSample <MeasurementVectorType > SampleType ;
SampleType ::Pointer sample = SampleType ::New ();
sample ->SetMeasurementVectorSize(1); // length of measurement vectors
// in the sample.
typedef itk::Statistics ::Subsample <SampleType > ClassSampleType;
std::vector <ClassSampleType::Pointer> classSamples;
for (unsigned int i = 0; i < 2; ++i)
{
classSamples.push_back (ClassSampleType::New ());
classSamples[i]->SetSample (sample);
}
The following code snippet creates a NormalVariateGenerator object. Since the random variable
generator returns values according to the standard normal distribution (zero mean, unit standard
deviation), we change the mean and standard deviation before pushing random values into the sample.
We want data from two normal (Gaussian) distributions, so we use two for loops, each with a different
mean and standard deviation. Before we fill the sample with the second distribution data, we call the
Initialize(random seed) method to recreate the pool of random variables in the normalGenerator.
In each for loop, we also add the identifiers of the newly pushed measurement vectors to the
corresponding class sample using the AddInstance() method.
To see the probability density plots from the two distributions, refer to Figure 19.2.
typedef itk::Statistics :: NormalVariateGenerator NormalGeneratorType;
NormalGeneratorType::Pointer normalGenerator = NormalGeneratorType::New ();
normalGenerator ->Initialize (101);
MeasurementVectorType mv;
double mean = 100;
double standardDeviation = 30;
SampleType :: InstanceIdentifier id = 0UL;
for (unsigned int i = 0; i < 100; ++i)
{
mv.Fill((normalGenerator ->GetVariate () * standardDeviation) + mean);
sample ->PushBack (mv);
classSamples[0]->AddInstance (id);
++id;
}
normalGenerator ->Initialize (3024);
mean = 200;
standardDeviation = 30;
for (unsigned int i = 0; i < 100; ++i)
{
mv.Fill((normalGenerator ->GetVariate () * standardDeviation) + mean);
sample ->PushBack (mv);
classSamples[1]->AddInstance (id);
++id;
}
In the following code snippet, notice that the template argument for the MeanCalculator and Covari-
anceCalculator is ClassSampleType (i.e., type of Subsample) instead of SampleType (i.e., type of
ListSample). This is because the parameter estimation algorithms are applied to the class sample.
typedef itk::Statistics ::MeanCalculator <ClassSampleType > MeanEstimatorType;
typedef itk::Statistics ::CovarianceCalculator <ClassSampleType >
CovarianceEstimatorType;
std::vector <MeanEstimatorType::Pointer> meanEstimators;
std::vector <CovarianceEstimatorType::Pointer> covarianceEstimators;
for (unsigned int i = 0; i < 2; ++i)
{
meanEstimators.push_back (MeanEstimatorType::New ());
meanEstimators[i]->SetInputSample(classSamples[i]);
meanEstimators[i]->Update();
covarianceEstimators.push_back (CovarianceEstimatorType::New ());
covarianceEstimators[i]->SetInputSample(classSamples[i]);
covarianceEstimators[i]->SetMean(meanEstimators[i]->GetOutput ());
covarianceEstimators[i]->Update();
}
We print out the estimated parameters.
for (unsigned int i = 0; i < 2; ++i)
{
std::cout << "class[" << i << "] " << std::endl;
std::cout << " estimated mean : "
<< *( meanEstimators[i]->GetOutput ())
<< " covariance matrix : "
<< *( covarianceEstimators[i]->GetOutput ()) << std::endl;
}
After creating a SampleClassifier object and a MaximumRatioDecisionRule object, we plug in the
decisionRule and the sample to the classifier. Then, we specify the number of classes that will be
considered using the SetNumberOfClasses() method.
The MaximumRatioDecisionRule requires a vector of a priori probability values. Such a priori
probability will be the $P(\omega_i)$ of the following variation of the Bayes decision rule:

Decide $\omega_i$ if $\dfrac{p(\vec{x}\,|\,\omega_i)}{p(\vec{x}\,|\,\omega_j)} > \dfrac{P(\omega_j)}{P(\omega_i)}$ for all $j \neq i$    (19.3)
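As a purely illustrative numerical case (not taken from the example), if the a priori probabilities estimated
from the class sample sizes are $P(\omega_1) = 0.25$ and $P(\omega_2) = 0.75$, a measurement vector
$\vec{x}$ is assigned to $\omega_1$ only if the likelihood ratio satisfies

$$ \frac{p(\vec{x}\,|\,\omega_1)}{p(\vec{x}\,|\,\omega_2)} > \frac{P(\omega_2)}{P(\omega_1)} = 3 $$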
The remainder of the code snippet shows how to use user-specified class labels. The classification
result will be stored in a MembershipSample object, and for each measurement vector, its class label
will be one of the two class labels, 100 and 200 (unsigned int).
typedef itk::Statistics :: GaussianDensityFunction
<MeasurementVectorType > MembershipFunctionType;
typedef itk:: MaximumRatioDecisionRule DecisionRuleType;
DecisionRuleType::Pointer decisionRule = DecisionRuleType::New ();
DecisionRuleType:: APrioriVectorType aPrioris;
aPrioris .push_back (classSamples[0]->GetTotalFrequency()
/ sample ->GetTotalFrequency());
aPrioris .push_back (classSamples[1]->GetTotalFrequency()
/ sample ->GetTotalFrequency());
decisionRule ->SetAPriori (aPrioris );
typedef itk::Statistics ::SampleClassifier <SampleType > ClassifierType;
ClassifierType::Pointer classifier = ClassifierType::New ();
classifier ->SetDecisionRule((itk::DecisionRuleBase::Pointer) decisionRule);
classifier ->SetSample (sample);
classifier ->SetNumberOfClasses(2);
std::vector <unsigned int> classLabels ;
classLabels .resize (2);
classLabels [0] = 100;
classLabels [1] = 200;
classifier ->SetMembershipFunctionClassLabels(classLabels );
The classifier is almost ready to perform the classification except that it needs two membership
functions that represent the two clusters.
In this example, the two clusters are modeled by two Gaussian density functions. Each density
function receives the estimated mean through the SetMean() method and the estimated covariance
through the SetCovariance() method. To plug in the two membership functions, we call the
AddMembershipFunction() method. Then invocation of the Update() method will perform the
classification.
std::vector <MembershipFunctionType::Pointer> membershipFunctions;
for (unsigned int i = 0; i < 2; ++i)
{
membershipFunctions.push_back (MembershipFunctionType::New ());
membershipFunctions[i]->SetMean(meanEstimators[i]->GetOutput ());
membershipFunctions[i]->
SetCovariance(covarianceEstimators[i]->GetOutput ());
classifier ->AddMembershipFunction(membershipFunctions[i].GetPointer ());
}
classifier ->Update();
The following code snippet prints out pairs of a measurement vector and its class label in the sample.
ClassifierType::OutputType * membershipSample =
classifier ->GetOutput ();
ClassifierType::OutputType ::ConstIterator iter = membershipSample ->Begin();
while (iter != membershipSample ->End ())
{
std::cout << "measurement vector = " << iter.GetMeasurementVector()
<< "class label = " << iter.GetClassLabel() << std::endl;
++iter;
}
19.2.4 Expectation Maximization Mixture Model Estimation
The source code for this example can be found in the file
Examples/Classification/ExpectationMaximizationMixtureModelEstimator.cxx.
In this example, we present ITK’s implementation of the expectation maximization (EM) process to
generate parameter estimates for a two Gaussian component mixture model.
The Bayesian plug-in classifier example (see Section 19.2.3) used two Gaussian probability density
functions (PDF) to model two Gaussian distribution classes (two models for two classes). However, in
some cases, we want to model a distribution as a mixture of several different distributions. Therefore,
the probability density function $p(x)$ of a mixture model can be stated as follows:

$$ p(x) = \sum_{i=0}^{c} \alpha_i f_i(x) \qquad (19.4) $$
where $i$ is the index of the component, $c$ is the number of components, $\alpha_i$ is the proportion of the
component, and $f_i$ is the probability density function of the component.
Now the task is to find the parameters (the component PDFs' parameters and the proportion values)
that maximize the likelihood. If we knew which component each measurement vector belongs to, the
problem would be easy to solve. However, we don't know the membership of each measurement vector.
Therefore, we use the expectation of membership instead of the exact membership. The EM process
splits into two steps:
1. E step: calculate the expected membership values of each measurement vector for each class.
2. M step: find the next parameter set that maximizes the likelihood with the expected membership
values and the current set of parameters.
The E step essentially computes the a posteriori probability of each measurement vector.
The M step depends on the type of each PDF. Most distributions belonging to the exponential
family, such as the Poisson, Binomial, Exponential, and Normal distributions, have analytical
solutions for updating the parameter set. The
itk::Statistics::ExpectationMaximizationMixtureModelEstimator class assumes components
of this type.
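As an illustration of these two steps, the following minimal sketch (not part of the ITK/OTB example; every
name is hypothetical) performs one EM iteration for a 1-D Gaussian mixture:

// One EM iteration for a 1-D Gaussian mixture (illustrative sketch only).
#include <cmath>
#include <vector>

double Gaussian(double x, double mean, double variance)
{
  const double pi = 3.14159265358979;
  return std::exp(-(x - mean) * (x - mean) / (2.0 * variance))
         / std::sqrt(2.0 * pi * variance);
}

void EMIteration(const std::vector<double>& samples,
                 std::vector<double>& proportions, // the alpha_i
                 std::vector<double>& means,
                 std::vector<double>& variances)
{
  const unsigned int c = proportions.size();
  const unsigned int n = samples.size();
  std::vector<std::vector<double> > membership(n, std::vector<double>(c));

  // E step: expected membership of each sample to each component.
  for (unsigned int k = 0; k < n; ++k)
  {
    double total = 0.0;
    for (unsigned int i = 0; i < c; ++i)
    {
      membership[k][i] = proportions[i] * Gaussian(samples[k], means[i], variances[i]);
      total += membership[k][i];
    }
    for (unsigned int i = 0; i < c; ++i) membership[k][i] /= total;
  }

  // M step: closed-form update of the proportions, means and variances.
  for (unsigned int i = 0; i < c; ++i)
  {
    double weight = 0.0, mean = 0.0, variance = 0.0;
    for (unsigned int k = 0; k < n; ++k)
    {
      weight += membership[k][i];
      mean += membership[k][i] * samples[k];
    }
    mean /= weight;
    for (unsigned int k = 0; k < n; ++k)
    {
      variance += membership[k][i] * (samples[k] - mean) * (samples[k] - mean);
    }
    proportions[i] = weight / n;
    means[i] = mean;
    variances[i] = variance / weight;
  }
}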
In the following example we use the itk::Statistics::ListSample as the sample (test and
training). The itk::Vector is our measurement vector class. To store measurement vectors into
two separate sample containers, we use the itk::Statistics::Subsample objects.
#include "itkVector .h"
#include "itkListSample.h"
The following two files provide us the parameter estimation algorithms.
#include "itkGaussianMixtureModelComponent.h"
#include "itkExpectationMaximizationMixtureModelEstimator.h"
We will fill the sample with random variables from two normal distributions using the
itk::Statistics::NormalVariateGenerator.
#include "itkNormalVariateGenerator.h"
Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector type
as a one-component vector. We then create a ListSample object for the input data.
We also create two Subsample objects that will store the measurement vectors in the sample into two
separate sample containers. Each Subsample object stores only the measurement vectors belonging
to a single class. This class sample will be used by the parameter estimation algorithms.
unsigned int numberOfClasses = 2;
typedef itk::Vector <double,
1>
MeasurementVectorType;
typedef itk::Statistics ::ListSample <MeasurementVectorType > SampleType ;
SampleType ::Pointer sample = SampleType ::New ();
sample ->SetMeasurementVectorSize(1); // length of measurement vectors
// in the sample.
The following code snippet creates a NormalVariateGenerator object. Since the random variable
generator returns values according to the standard normal distribution (zero mean, unit standard
deviation), we change the mean and standard deviation before pushing random values into the sample.
We want data from two normal (Gaussian) distributions, so we use two for loops, each with a different
mean and standard deviation. Before we fill the sample with the second distribution data, we call the
Initialize() method to recreate the pool of random variables in the normalGenerator.
To see the probability density plots of the two distributions, refer to Figure 19.2.
typedef itk::Statistics :: NormalVariateGenerator NormalGeneratorType;
NormalGeneratorType::Pointer normalGenerator = NormalGeneratorType::New ();
normalGenerator ->Initialize (101);
MeasurementVectorType mv;
double mean = 100;
double standardDeviation = 30;
for (unsigned int i = 0; i < 100; ++i)
{
mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean;
sample ->PushBack (mv);
}
normalGenerator ->Initialize (3024);
mean = 200;
standardDeviation = 30;
for (unsigned int i = 0; i < 100; ++i)
{
mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean;
sample ->PushBack (mv);
}
In the following code snippet, we define the initial parameter sets for the two components. For each
Gaussian component the parameters are the mean followed by the variance. We then create the
GaussianMixtureModelComponent objects and plug each of them to the sample.
typedef itk::Array <double> ParametersType;
ParametersType params(2);
std::vector <ParametersType > initialParameters(numberOfClasses);
params[0] = 110.0;
params[1] = 800.0;
initialParameters[0] = params;
params[0] = 210.0;
params[1] = 850.0;
initialParameters[1] = params;
typedef itk::Statistics :: GaussianMixtureModelComponent<SampleType >
ComponentType;
std::vector <ComponentType::Pointer > components ;
for (unsigned int i = 0; i < numberOfClasses; ++i)
{
components .push_back (ComponentType::New());
(components [i])->SetSample (sample);
(components [i])->SetParameters(initialParameters[i]);
}
We run the estimator.
typedef itk::Statistics :: ExpectationMaximizationMixtureModelEstimator<
SampleType > EstimatorType;
EstimatorType::Pointer estimator = EstimatorType::New();
estimator ->SetSample (sample);
estimator ->SetMaximumIteration(200);
itk::Array <double> initialProportions(numberOfClasses);
initialProportions[0] = 0.5;
initialProportions[1] = 0.5;
estimator ->SetInitialProportions(initialProportions);
for (unsigned int i = 0; i < numberOfClasses; ++i)
{
estimator ->AddComponent((ComponentType::Superclass *)
(components [i]). GetPointer ());
}
estimator ->Update ();
We then print out the estimated parameters.
for (unsigned int i = 0; i < numberOfClasses; ++i)
{
std::cout << "Cluster[" << i << "]" << std::endl;
std::cout << " Parameters :" << std::endl;
std::cout << " " << (components [i])->GetFullParameters()
<< std::endl;
std::cout << " Proportion : ";
std::cout << " " << (*estimator ->GetProportions())[i] << std::endl;
}
19.2.5 Statistical Segmentations
Stochastic Expectation Maximization
The Stochastic Expectation Maximization (SEM) approach is a stochastic version of the EM mixture
estimation seen in section 19.2.4. It was introduced in [23] to prevent the EM approach from converging
to local minima. It avoids the analytical maximization step by integrating a stochastic sampling procedure
in the estimation process, and induces an almost sure (a.s.) convergence of the algorithm.
From the initial two step formulation of the EM mixture estimation, the SEM may be decomposed
into 3 steps:
1. E-step, which calculates the expected membership values of each measurement vector for each
class.
2. S-step, which performs a stochastic sampling of the class membership of each measurement vector,
according to the membership values computed in the E-step (a sketch of this sampling step is given
after the list).
3. M-step, which updates the parameters of the membership probabilities (parameters to be defined
through the class itk::Statistics::ModelComponentBase and its inherited classes).
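The S-step simply amounts to drawing, for each measurement vector, a class label from the discrete
distribution defined by its membership values. A minimal sketch of such a draw (purely illustrative, not OTB
code, with hypothetical names) could be:

// Draw a class label according to the membership probabilities of one sample.
#include <cstdlib>
#include <vector>

unsigned int SampleClassLabel(const std::vector<double>& membership)
{
  double u = static_cast<double>(std::rand()) / RAND_MAX; // uniform draw in [0, 1]
  double cumulated = 0.0;
  for (unsigned int i = 0; i < membership.size(); ++i)
  {
    cumulated += membership[i];
    if (u <= cumulated) return i;
  }
  return membership.size() - 1; // guard against rounding errors
}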
The implementation of SEM has been turned into a contextual SEM, in the sense that the evaluation of the
membership parameters is conditioned on the membership values of the spatial neighborhood of each pixel.
The source code for this example can be found in the file
Examples/Learning/SEMModelEstimatorExample.cxx.
In this example, we present OTB's implementation of SEM, through the class
otb::SEMClassifier. This class performs a stochastic version of the EM algorithm, but
instead of inheriting from itk::ExpectationMaximizationMixtureModelEstimator, we
chose to inherit from itk::Statistics::ListSample< TSample >, in the same way as
otb::SVMClassifier.
The program reads an otb::VectorImage and outputs an otb::Image. The appropriate header
files therefore have to be included:
#include "otbImage.h"
#include "otbVectorImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
otb::SEMClassifier performs the estimation of a mixture fitting the initial histogram. Currently,
mixtures of Gaussian pdfs can be estimated. Those generic pdfs are handled through
otb::Statistics::ModelComponentBase; the Gaussian model is taken in charge by the class
otb::Statistics::GaussianModelComponent.
#include "otbGaussianModelComponent.h"
#include "otbSEMClassifier.h"
Input/output image types are defined in the classical way. In fact, an itk::VariableLengthVector is
to be considered for the templated MeasurementVectorType, which will be used in the ListSample
interface.
typedef double PixelType ;
typedef otb::VectorImage <PixelType , 2> ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::Image <unsigned char, 2> OutputImageType;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
Once the input image is opened, the classifier may be initialized through its SmartPointer.
typedef otb::SEMClassifier <ImageType , OutputImageType > ClassifType ;
ClassifType ::Pointer classifier = ClassifType ::New();
Then follow the classical initializations of the pipeline.
classifier ->SetNumberOfClasses(numberOfClasses);
classifier ->SetMaximumIteration(numberOfIteration);
classifier ->SetNeighborhood(neighborhood);
classifier ->SetTerminationThreshold(terminationThreshold);
classifier ->SetSample (reader ->GetOutput ());
When an initial segmentation is available, the classifier may use it as an image
(of type OutputImageType) or as an itk::SampleClassifier result (of type
itk::Statistics::MembershipSample< SampleType >).
if (fileNameImgInit != NULL)
{
typedef otb::ImageFileReader <OutputImageType > ImgInitReaderType;
ImgInitReaderType::Pointer segReader = ImgInitReaderType::New ();
segReader ->SetFileName (fileNameImgInit);
segReader ->Update();
classifier ->SetClassLabels(segReader ->GetOutput ());
}
By default, otb::SEMClassifier performs the initialization of the ModelComponentBase by
instantiating as many otb::Statistics::GaussianModelComponent objects as the number of classes
to estimate in the mixture. Nevertheless, the user may add a specific distribution to the mixture
estimation. This is permitted by the use of AddComponent with the given class number and the specific
distribution.
typedef ClassifType ::ClassSampleType ClassSampleType;
typedef otb::Statistics ::GaussianModelComponent <ClassSampleType >
GaussianType;
for (int i = 0; i < numberOfClasses; ++i)
classifier ->AddComponent(i, GaussianType::New ());
Once the pipeline is instantiated, the segmentation itself may be launched by calling the Update()
method.
try
{
classifier ->Update ();
}
The segmentation may output a result of type itk::Statistics::MembershipSample< SampleType >,
as is the case for the otb::SVMClassifier. But when using GetOutputImage() the output is
directly an image.
Only for visualization purposes, we choose to rescale the image of classes before saving it to a file.
We will use the itk::RescaleIntensityImageFilter for this purpose.
typedef itk::RescaleIntensityImageFilter<OutputImageType ,
OutputImageType > RescalerType;
RescalerType::Pointer rescaler = RescalerType::New();
rescaler ->SetOutputMinimum(itk::NumericTraits <unsigned char>:: min());
rescaler ->SetOutputMaximum(itk::NumericTraits <unsigned char>:: max());
rescaler ->SetInput (classifier ->GetOutputImage());
WriterType ::Pointer writer = WriterType ::New();
writer ->SetFileName (fileNameOut );
writer ->SetInput (rescaler ->GetOutput ());
writer ->Update ();
Figure 19.7 shows the result of the SEM segmentation with 4 different classes and a contextual
neighborhood of 3 pixels.
Since the segmentation is performed by an iterative stochastic process, it is worth checking the
output status: did the segmentation end because it converged, or simply because the maximum number
of iterations was reached?
std::cerr << "Program terminated with a ";
if (classifier ->GetTerminationCode() ==
ClassifType ::CONVERGED ) std::cerr << "converged ";
else std::cerr << "not-converged ";
std::cerr << "code...n";
The text output gives for each class the parameters of the pdf (e.g. the mean of each component of the
class and their covariance matrix, in the case of a Gaussian mixture model).
classifier ->Print(std::cerr);
Figure 19.7: SEM Classification results.
19.2.6 Classification using Markov Random Fields
Markov Random Fields are probabilistic models that use the statistical dependency between pixels
in a neighborhood to infer the value of a given pixel.
ITK framework
The itk::Statistics::MRFImageFilter uses the maximum a posteriori (MAP) estimates for
modeling the MRF. The object traverses the data set and uses the model generated by the Mahalanobis
distance classifier to get the distance between each pixel in the data set and a set of known
classes, updates the distances by evaluating the influence of its neighboring pixels (based on a MRF
model) and finally assigns each pixel to the class which has the minimum distance to that pixel
(taking the neighborhood influence into consideration). The energy function minimization is done
using the iterated conditional modes (ICM) algorithm [13].
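In other words, for each pixel $p$, each ICM iteration greedily picks the label minimizing the sum of a
fidelity term and a regularization term; a generic form of this local update, written with a distance-based
fidelity $d$, a pairwise potential $V$ and a smoothing weight $\beta$, is

$$ \hat{l}_p = \arg\min_{l}\Big[ d(x_p, l) + \beta \sum_{q \in N(p)} V(l, l_q) \Big] $$

where $N(p)$ is the neighborhood of $p$; the precise weighting used by the filter is configured below through
the smoothing factor and the neighborhood weights.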
The source code for this example can be found in the file
Examples/Classification/ScalarImageMarkovRandomField1.cxx.
This example shows how to use the Markov Random Field approach for classifying the pixels of a
scalar image.
The itk::Statistics::MRFImageFilter is used for refining an initial classification by intro-
ducing the spatial coherence of the labels. The user should provide two images as input. The first
image is the one to be classified while the second image is an image of labels representing an initial
classification.
The following headers are related to reading input images, writing the output image, and making the
necessary conversions between scalar and vector images.
#include "otbImage.h"
#include "itkFixedArray.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkScalarToArrayCastImageFilter.h"
The following headers are related to the statistical classification classes.
#include "itkMRFImageFilter.h"
#include "itkDistanceToCentroidMembershipFunction.h"
#include "itkMinimumDecisionRule.h"
#include "itkImageClassifierBase.h"
First we define the pixel type and dimension of the image that we intend to classify. With this image
type we can also declare the otb::ImageFileReader needed for reading the input image, create
one and set its input filename.
typedef unsigned char PixelType ;
const unsigned int Dimension = 2;
typedef otb::Image <PixelType , Dimension > ImageType ;
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputImageFileName);
As a second step we define the pixel type and dimension of the image of labels that provides the
initial classification of the pixels from the first image. This initial labeled image can be the output
of a K-Means method like the one illustrated in section 19.2.1.
typedef unsigned char LabelPixelType;
typedef otb::Image <LabelPixelType , Dimension > LabelImageType;
typedef otb::ImageFileReader <LabelImageType > LabelReaderType;
LabelReaderType::Pointer labelReader = LabelReaderType::New();
labelReader ->SetFileName (inputLabelImageFileName);
Since the Markov Random Field algorithm is defined in general for images whose pixels
have multiple components, that is, images of vector type, we must adapt our scalar image
in order to satisfy the interface expected by the MRFImageFilter. We do this by using the
itk::ScalarToArrayCastImageFilter. With this filter we will present our scalar image as a
vector image whose vector pixels contain a single component.
typedef itk::FixedArray <LabelPixelType , 1> ArrayPixelType;
typedef otb::Image <ArrayPixelType , Dimension > ArrayImageType;
typedef itk::ScalarToArrayCastImageFilter<
ImageType , ArrayImageType > ScalarToArrayFilterType;
ScalarToArrayFilterType::Pointer
scalarToArrayFilter = ScalarToArrayFilterType::New();
scalarToArrayFilter ->SetInput(reader ->GetOutput ());
With the input image type ImageType and labeled image type LabelImageType we instantiate the
type of the itk::MRFImageFilter that will apply the Markov Random Field algorithm in order to
refine the pixel classification.
typedef itk::MRFImageFilter <ArrayImageType , LabelImageType > MRFFilterType;
MRFFilterType::Pointer mrfFilter = MRFFilterType::New();
mrfFilter ->SetInput (scalarToArrayFilter ->GetOutput ());
We set now some of the parameters for the MRF filter. In particular, the number of classes to be
used during the classification, the maximum number of iterations to be run in this filter and the error
tolerance that will be used as a criterion for convergence.
mrfFilter ->SetNumberOfClasses(numberOfClasses);
mrfFilter ->SetMaximumNumberOfIterations(numberOfIterations);
mrfFilter ->SetErrorTolerance(1e-7);
The smoothing factor represents the tradeoff between fidelity to the observed image and the smoothness
of the segmented image. Typical smoothing factors have values between 1 and 5. This factor will
multiply the weights that define the influence of neighbors on the classification of a given pixel. The
higher the value, the more uniform the regions resulting from the classification refinement will be.
mrfFilter ->SetSmoothingFactor(smoothingFactor);
Given that the MRF filter needs to continually relabel the pixels, it needs access to a set of mem-
bership functions that will measure to what degree every pixel belongs to a particular class. The
classification is performed by the itk::ImageClassifierBase class, that is instantiated using the
type of the input vector image and the type of the labeled image.
typedef itk::ImageClassifierBase <
ArrayImageType ,
LabelImageType > SupervisedClassifierType;
SupervisedClassifierType::Pointer classifier =
SupervisedClassifierType::New ();
The classifier needs a decision rule to be set by the user. Note that we must use GetPointer() in the
call to the SetDecisionRule() method because we are passing a SmartPointer, and smart pointers
cannot perform polymorphism; we must therefore extract the raw pointer associated with the smart
pointer. This extraction is done with the GetPointer() method.
typedef itk::MinimumDecisionRule DecisionRuleType;
DecisionRuleType::Pointer classifierDecisionRule = DecisionRuleType::New();
classifier ->SetDecisionRule(classifierDecisionRule.GetPointer ());
We now instantiate the membership functions. In this case we use the
itk::Statistics::DistanceToCentroidMembershipFunction class templated over the
pixel type of the vector image, which in our example happens to be a vector of dimension 1.
typedef itk::Statistics :: DistanceToCentroidMembershipFunction<
ArrayPixelType >
MembershipFunctionType;
typedef MembershipFunctionType::Pointer MembershipFunctionPointer;
double meanDistance = 0;
vnl_vector <double> centroid (1);
for (unsigned int i = 0; i < numberOfClasses; ++i)
{
MembershipFunctionPointer membershipFunction =
MembershipFunctionType::New ();
centroid [0] = atof(argv[i + numberOfArgumentsBeforeMeans]);
membershipFunction ->SetCentroid (centroid );
classifier ->AddMembershipFunction(membershipFunction);
meanDistance += static_cast<double> (centroid [0]);
}
meanDistance /= numberOfClasses;
We now set the neighborhood radius that defines the size of the clique used in the computation of the
neighbors' influence on the classification of any given pixel. Note that despite the fact that we call
this a radius, it is actually the half size of a hypercube. That is, the actual region of influence is
not circular but rather an N-dimensional box. For example, a neighborhood radius of 2 in a 3D image
will result in a clique of size 5x5x5 pixels, and a radius of 1 will result in a clique of size 3x3x3
pixels.
mrfFilter ->SetNeighborhoodRadius(1);
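More generally, for a radius $r$ in an $N$-dimensional image the clique contains $(2r+1)^N$ pixels; in this
2D example with $r = 1$ it therefore contains $3 \times 3 = 9$ pixels, which is why nine weights are provided
below.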
We should now set the weights used for the neighbors. This is done by passing an array of values
that contains the linear sequence of weights for the neighbors. For example, in a 3x3 neighborhood
we should provide a linear array of 9 weight values. The values are packaged in a std::vector and
are supposed to be double. The following lines illustrate a typical set of values for a 3x3
neighborhood. The array is arranged and then passed to the filter by using the method
SetMRFNeighborhoodWeight().
std::vector <double> weights;
weights.push_back (1.5);
weights.push_back (2.0);
weights.push_back (1.5);
weights.push_back (2.0);
weights.push_back (0.0); // This is the central pixel
weights.push_back (2.0);
weights.push_back (1.5);
weights.push_back (2.0);
weights.push_back (1.5);
We now scale the weights so that the smoothing function and the image fidelity function have
comparable values. This is necessary since the label image and the input image can have different
dynamic ranges. The fidelity function is usually computed using a distance function, such as
the itk::DistanceToCentroidMembershipFunction or one of the other membership functions.
They tend to have values in the order of the means specified.
double totalWeight = 0;
for (std ::vector <double>:: const_iterator wcIt = weights.begin();
wcIt != weights.end (); ++wcIt)
{
totalWeight += *wcIt;
}
for (std ::vector <double>:: iterator wIt = weights.begin();
wIt != weights.end (); wIt ++)
{
*wIt = static_cast<double> ((*wIt) * meanDistance / (2 * totalWeight ));
}
mrfFilter ->SetMRFNeighborhoodWeight(weights);
Finally, the classifier class is connected to the Markov Random Fields filter.
mrfFilter ->SetClassifier(classifier );
The output image produced by the itk::MRFImageFilter has the same pixel type as the labeled
input image. In the following lines we use the OutputImageType in order to instantiate the type of
an otb::ImageFileWriter. We then create one and connect it to the output of the classification
filter, after passing the result through an intensity rescaler that maps it to an 8-bit dynamic range.
typedef MRFFilterType::OutputImageType OutputImageType;
typedef otb::ImageFileWriter <OutputImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(intensityRescaler ->GetOutput ());
writer ->SetFileName (outputImageFileName);
We are now ready for triggering the execution of the pipeline. This is done by simply invoking the
Update() method in the writer. This call will propagate the update request to the reader and then to
the MRF filter.
Figure 19.8: Effect of the MRF filter.

try
{
writer ->Update ();
}
catch (itk::ExceptionObject& excp)
{
std::cerr << "Problem encountered while writing ";
std::cerr << " image file : " << argv[2] << std::endl;
std::cerr << excp << std::endl;
return EXIT_FAILURE;
}
Figure 19.8 illustrates the effect of this filter with four classes. In this example the filter was run with
a smoothing factor of 3. The labeled image was produced by ScalarImageKmeansClassifier.cxx and
the means were estimated by ScalarImageKmeansModelEstimator.cxx described in section 19.2.1.
The obtained result can be compared with the one in figure 19.1 to appreciate the benefit of using the
MRF approach to regularize the classified image.
OTB framework
The ITK approach was considered not to be flexible enough for some remote sensing applications.
Therefore, we decided to implement our own framework.
The source code for this example can be found in the file
Examples/Markov/MarkovClassification1Example.cxx.
This example illustrates the details of the otb::MarkovRandomFieldFilter. This filter is an
application of the Markov Random Fields for classification, segmentation or restoration.
This example applies the otb::MarkovRandomFieldFilter to classify an image into four classes
defined by their mean and variance. The optimization is done using a Metropolis algorithm with a
random sampler. The regularization energy is defined by a Potts model and the fidelity by a Gaussian
model.

Figure 19.9: OTB Markov Framework.
The first step toward the use of this filter is the inclusion of the proper header files.
#include "otbMRFEnergyPotts.h"
#include "otbMRFEnergyGaussianClassification.h"
#include "otbMRFOptimizerMetropolis.h"
#include "otbMRFSamplerRandom.h"
Then we must decide what pixel type to use for the image. We choose to make all computations
with double precision. The labelled image is of type unsigned char which allows up to 256 different
classes.
const unsigned int Dimension = 2;
typedef double InternalPixelType;
typedef unsigned char LabelledPixelType;
typedef otb::Image <InternalPixelType , Dimension > InputImageType;
typedef otb::Image <LabelledPixelType , Dimension > LabelledImageType;
We define a reader for the image to be classified, an initialization for the classification (which could
be random) and a writer for the final classification.
typedef otb::ImageFileReader <InputImageType > ReaderType ;
typedef otb::ImageFileWriter <LabelledImageType > WriterType ;
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
const char * inputFilename = argv[1];
const char * outputFilename = argv[2];
reader ->SetFileName (inputFilename);
writer ->SetFileName (outputFilename);
Finally, we define the different classes necessary for the Markov classification. An
otb::MarkovRandomFieldFilter is instantiated; this is the main class, which connects the others
to perform the Markov classification.
typedef otb::MarkovRandomFieldFilter
<InputImageType , LabelledImageType > MarkovRandomFieldFilterType;
An otb::MRFSamplerRandom, which derives from the otb::MRFSampler, is instantiated. The
sampler is in charge of proposing a modification for a given site. The otb::MRFSamplerRandom
randomly picks one of the possible values.
typedef otb::MRFSamplerRandom <InputImageType , LabelledImageType > SamplerType ;
An otb::MRFOptimizerMetropolis, which derives from the otb::MRFOptimizer, is instantiated.
The optimizer is in charge of accepting or rejecting the value proposed by the sampler. The
otb::MRFOptimizerMetropolis accepts the proposal according to the variation of energy it causes
and a temperature parameter.
typedef otb::MRFOptimizerMetropolis OptimizerType;
Two energies, deriving from the otb::MRFEnergy class, need to be instantiated. One energy is
required for the regularization, taking into account the relationship between neighboring pixels
in the classified image. Here it is done with the otb::MRFEnergyPotts, which implements a Potts
model.
The second energy is for the fidelity to the original data. Here it is done with an
otb::MRFEnergyGaussianClassification class, which defines a Gaussian model for the data.
typedef otb::MRFEnergyPotts
<LabelledImageType , LabelledImageType > EnergyRegularizationType;
typedef otb:: MRFEnergyGaussianClassification
<InputImageType , LabelledImageType > EnergyFidelityType;
The different filters composing our pipeline are created by invoking their New() methods, assigning
the results to smart pointers.
MarkovRandomFieldFilterType::Pointer markovFilter =
MarkovRandomFieldFilterType::New();
EnergyRegularizationType::Pointer energyRegularization =
EnergyRegularizationType::New ();
EnergyFidelityType::Pointer energyFidelity = EnergyFidelityType::New ();
OptimizerType::Pointer optimizer = OptimizerType::New ();
SamplerType ::Pointer sampler = SamplerType ::New();
Parameters for the otb::MRFEnergyGaussianClassification class, the means and standard
deviations of the classes, are now defined.
unsigned int nClass = 4;
energyFidelity ->SetNumberOfParameters(2 * nClass);
EnergyFidelityType::ParametersType parameters ;
parameters .SetSize(energyFidelity ->GetNumberOfParameters());
parameters [0] = 10.0; //Class 0 mean
parameters [1] = 10.0; //Class 0 stdev
parameters [2] = 80.0; //Class 1 mean
parameters [3] = 10.0; //Class 1 stdev
parameters [4] = 150.0; //Class 2 mean
parameters [5] = 10.0; //Class 2 stdev
parameters [6] = 220.0; //Class 3 mean
parameters [7] = 10.0; //Class 3 stdev
energyFidelity ->SetParameters(parameters );
Parameters are given to the different classes, and the sampler, optimizer and energies are connected
to the Markov filter.
OptimizerType::ParametersType param(1);
param.Fill(atof(argv[5]));
optimizer ->SetParameters(param);
markovFilter ->SetNumberOfClasses(nClass);
markovFilter ->SetMaximumNumberOfIterations(atoi(argv[4]));
markovFilter ->SetErrorTolerance(0.0);
markovFilter ->SetLambda (atof(argv[3]));
markovFilter ->SetNeighborhoodRadius(1);
markovFilter ->SetEnergyRegularization(energyRegularization);
markovFilter ->SetEnergyFidelity(energyFidelity);
markovFilter ->SetOptimizer(optimizer );
markovFilter ->SetSampler (sampler );
The pipeline is connected. An itk::RescaleIntensityImageFilter rescales the classified image
before saving it.
markovFilter ->SetInput(reader ->GetOutput ());
typedef itk:: RescaleIntensityImageFilter
<LabelledImageType , LabelledImageType > RescaleType ;
RescaleType ::Pointer rescaleFilter = RescaleType ::New();
rescaleFilter ->SetOutputMinimum(0);
rescaleFilter ->SetOutputMaximum(255);
rescaleFilter ->SetInput(markovFilter ->GetOutput ());
writer ->SetInput(rescaleFilter ->GetOutput ());
Figure 19.10: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird
image for classification. The result is obtained after 20 iterations with a random sampler and a Metropolis
optimizer. From left to right : original image, classification.
Finally, the pipeline execution is triggered.
writer ->Update();
Figure 19.10 shows the output of the Markov Random Field classification after 20 iterations with a
random sampler and a Metropolis optimizer.
The source code for this example can be found in the file
Examples/Markov/MarkovClassification2Example.cxx.
Using a similar structure to the previous program and the same energy functions, we are now going to
slightly alter the program to use a different sampler and optimizer. The sample is now proposed
randomly according to the MAP probability, and the optimizer is the ICM, which accepts the proposed
sample if it enables a reduction of the energy.
First, we need to include the headers specific to these classes:
#include "otbMRFSamplerRandomMAP.h"
#include "otbMRFOptimizerICM.h"
And to declare these new types:
typedef otb::MRFSamplerRandomMAP <InputImageType ,
LabelledImageType > SamplerType ;
// typedef otb::MRFSamplerRandom< InputImageType, LabelledImageType> SamplerType;
typedef otb::MRFOptimizerICM OptimizerType;
As the otb::MRFOptimizerICM does not have any parameters, the call to
optimizer->SetParameters() must be removed.
Figure 19.11: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird
image for classification. The result is obtained after 5 iterations with a MAP random sampler and an ICM
optimizer. From left to right : original image, classification.
Apart from these, no further modification is required.
Figure 19.11 shows the output of the Markov Random Field classification after 5 iterations with a
MAP random sampler and an ICM optimizer.
The source code for this example can be found in the file
Examples/Markov/MarkovClassification3Example.cxx.
This example illustrates the details of the MarkovRandomFieldFilter by using the Fisher distribu-
tion to model the likelihood energy. This filter is an application of the Markov Random Fields for
classification.
This example applies the MarkovRandomFieldFilter to classify an image into four classes defined
by their Fisher distribution parameters L, M and mu. The optimization is done using a Metropolis
algorithm with a random sampler. The regularization energy is defined by a Potts model and
the fidelity or likelihood energy is modelled by a Fisher distribution. The parameters of the Fisher
distribution were determined for each class in a supervised step (see the file
OtbParameterEstimatioOfFisherDistribution). This example is a contribution from Jan Wegner.
Then we must decide what pixel type to use for the image. We choose to make all computations
with double precision. The labeled image is of type unsigned char which allows up to 256 different
classes.
const unsigned int Dimension = 2;
typedef double InternalPixelType;
typedef unsigned char LabelledPixelType;
typedef otb::Image <InternalPixelType , Dimension > InputImageType;
typedef otb::Image <LabelledPixelType , Dimension > LabelledImageType;
We define a reader for the image to be classified, an initialization for the classification (which could
be random) and a writer for the final classification.
typedef otb::ImageFileReader < InputImageType > ReaderType ;
typedef otb::ImageFileWriter < LabelledImageType > WriterType ;
ReaderType ::Pointer reader = ReaderType ::New ();
WriterType ::Pointer writer = WriterType ::New ();
Finally, we define the different classes necessary for the Markov classification. A
MarkovRandomFieldFilter is instantiated; this is the main class, which connects the others to
perform the Markov classification.
typedef otb::MarkovRandomFieldFilter
<InputImageType , LabelledImageType > MarkovRandomFieldFilterType;
An MRFSamplerRandom, which derives from the MRFSampler, is instantiated. The sampler
is in charge of proposing a modification for a given site. The MRFSamplerRandom randomly
picks one of the possible values.
typedef otb::MRFSamplerRandom < InputImageType , LabelledImageType > SamplerType ;
An MRFOptimizerMetropolis, which derives from the MRFOptimizer, is instantiated. The optimizer
is in charge of accepting or rejecting the value proposed by the sampler. The MRFOptimizerMetropolis
accepts the proposal according to the variation of energy it causes and a temperature parameter.
typedef otb::MRFOptimizerMetropolis OptimizerType;
Two energies, deriving from the MRFEnergy class, need to be instantiated. One energy is required for
the regularization, taking into account the relationship between neighboring pixels in the classified
image. Here it is done with the MRFEnergyPotts, which implements a Potts model.
The second energy is used for the fidelity to the original data. Here it is done with an
MRFEnergyFisherClassification class, which defines a Fisher distribution to model the data.
typedef otb::MRFEnergyPotts
<LabelledImageType , LabelledImageType > EnergyRegularizationType;
typedef otb::MRFEnergyFisherClassification
<InputImageType , LabelledImageType > EnergyFidelityType;
The different filters composing our pipeline are created by invoking their New() methods, assigning
the results to smart pointers.
MarkovRandomFieldFilterType::Pointer markovFilter = MarkovRandomFieldFilterType::New ();
EnergyRegularizationType::Pointer energyRegularization = EnergyRegularizationType::New();
EnergyFidelityType::Pointer energyFidelity = EnergyFidelityType::New ();
OptimizerType::Pointer optimizer = OptimizerType::New();
SamplerType ::Pointer sampler = SamplerType ::New();
Parameters for the MRFEnergyFisherClassification class are created. The shape parameters M, L
and the weighting parameter mu were computed in a supervised step.
unsigned int nClass =4;
energyFidelity ->SetNumberOfParameters(3* nClass);
EnergyFidelityType::ParametersType parameters ;
parameters .SetSize(energyFidelity ->GetNumberOfParameters());
//Class 0
parameters [0] = 12.353042; //Class 0 mu
parameters [1] = 2.156422; //Class 0 L
parameters [2] = 4.920403; //Class 0 M
//Class 1
parameters [3] = 72.068291; //Class 1 mu
parameters [4] = 11.000000; //Class 1 L
parameters [5] = 50.950001; //Class 1 M
//Class 2
parameters [6] = 146.665985; //Class 2 mu
parameters [7] = 11.000000; //Class 2 L
parameters [8] = 50.900002; //Class 2 M
//Class 3
parameters [9] = 200.010132; //Class 3 mu
parameters [10] = 11.000000; //Class 3 L
parameters [11] = 50.950001; //Class 3 M
energyFidelity ->SetParameters(parameters );
Parameters are given to the different classes and the sampler, optimizer and energies are connected
with the Markov filter.
OptimizerType::ParametersType param(1);
param.Fill(atof(argv[6]));
optimizer ->SetParameters(param);
markovFilter ->SetNumberOfClasses(nClass);
markovFilter ->SetMaximumNumberOfIterations(atoi(argv[5]));
markovFilter ->SetErrorTolerance(0.0);
markovFilter ->SetLambda (atof(argv[4]));
markovFilter ->SetNeighborhoodRadius(1);
markovFilter ->SetEnergyRegularization(energyRegularization);
markovFilter ->SetEnergyFidelity(energyFidelity);
markovFilter ->SetOptimizer(optimizer );
markovFilter ->SetSampler (sampler );
The pipeline is connected. An itk::RescaleIntensityImageFilter rescales the classified image
before saving it.
markovFilter ->SetInput(reader ->GetOutput ());
typedef itk:: RescaleIntensityImageFilter
< LabelledImageType , LabelledImageType > RescaleType ;
RescaleType ::Pointer rescaleFilter = RescaleType ::New();
rescaleFilter ->SetOutputMinimum(0);
rescaleFilter ->SetOutputMaximum(255);
rescaleFilter ->SetInput( markovFilter ->GetOutput () );
writer ->SetInput( rescaleFilter ->GetOutput () );
writer ->Update();

Figure 19.12: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird
image for classification into four classes using the Fisher distribution as likelihood term. From left to right:
original image, classification.
We can now create an image file writer and save the image.
typedef otb::ImageFileWriter <RGBImageType > WriterRescaledType;
WriterRescaledType::Pointer writerRescaled = WriterRescaledType::New ();
writerRescaled ->SetFileName ( outputRescaledImageFileName );
writerRescaled ->SetInput( colormapper ->GetOutput () );
writerRescaled ->Update ();
Figure 19.12 shows the output of the Markov Random Field classification into four classes using the
Fisher-distribution as likelihood term.
The source code for this example can be found in the file
Examples/Markov/MarkovRegularizationExample.cxx.
This example illustrates the use of the otb::MarkovRandomFieldFilter to regularize a
classification obtained previously by another classifier. Here we will apply the regularization to the
output of an SVM classifier presented in section 19.3.4.
The reference image and the starting image are both going to be the original classification. Both
the regularization and the fidelity energies are defined by a Potts model.
The convergence of the Markov Random Field is obtained with a random sampler and a Metropolis
model, as in the first example. As you should now be used to the general program structure of the MRF
framework, we are not going to repeat the entire example. However, remember that you can find the full
source code for this example in your OTB source directory.
Figure 19.13: Result of applying the otb::MarkovRandomFieldFilter to regularize the result of another
classification. From left to right: original classification, regularized classification.
To find the number of classes available in the original image we use the
itk::LabelStatisticsImageFilter and more particularly the method GetNumberOfLabels().
typedef itk:: LabelStatisticsImageFilter
<LabelledImageType , LabelledImageType > LabelledStatType;
LabelledStatType::Pointer labelledStat = LabelledStatType::New ();
labelledStat ->SetInput(reader ->GetOutput ());
labelledStat ->SetLabelInput(reader ->GetOutput ());
labelledStat ->Update();
unsigned int nClass = labelledStat ->GetNumberOfLabels();
Figure 19.13 shows the output of the Markov Random Field regularization on the classification
output of another method.
19.3 Supervised classification
19.3.1 Generic machine learning framework
The OTB supervised classification is implemented as a generic Machine Learning framework, supporting
several possible machine learning libraries as backends. As of now, both libSVM (the machine
learning library historically integrated in OTB) and the machine learning methods of the OpenCV
library ([15]) are available.
The current list of classifiers available through the same generic interface within the OTB is:
• LibSVM: Support Vector Machines classifier based on libSVM.
• SVM: Support Vector Machines classifier based on OpenCV, itself based on libSVM.
• Bayes: Normal Bayes classifier based on OpenCV.
• Boost: Boost classifier based on OpenCV.
• DT: Decision Tree classifier based on OpenCV.
• RF: Random Forests classifier based on the Random Trees in OpenCV.
• GBT: Gradient Boosted Tree classifier based on OpenCV.
• KNN: K-Nearest Neighbors classifier based on OpenCV.
• ANN: Artificial Neural Network classifier based on OpenCV.
19.3.2 An example of supervised classification method: Support Vector Machines
SVM general description
Kernel based learning methods in general, and Support Vector Machines (SVM) in particular,
have been introduced in recent years in learning theory for classification and regression tasks
[136]. SVM have been successfully applied to text categorization [74] and face recognition [105].
Recently, they have been successfully used for the classification of hyperspectral remote-sensing
images [16].
Simply stated, the approach consists in searching for the separating surface between two classes through
the determination of the subset of training samples which best describes the boundary between the two
classes. These samples are called support vectors and completely define the classification system. When
the two classes are not linearly separable, the method uses a kernel expansion in order
to project the feature space onto higher dimensional spaces where the separation of
the classes becomes linear.
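As a reminder of how this kernel expansion enters the decision rule (a standard textbook result, recalled
here for clarity in LaTeX notation; it is not OTB-specific), the classifier only needs kernel evaluations
between the input vector and the support vectors:

f(x) = \operatorname{sgn}\left( \sum_{i \in SV} \alpha_i \, y_i \, k(x_i, x) + b \right),

so the higher dimensional space never has to be constructed explicitly.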
SVM mathematical formulation
This subsection reviews the basic principles of SVM learning and classification. A good tutorial on
SVM can be found in [18].
We have N samples represented by the couple (yi,xi),i = 1...N where yi ∈ {−1,+1} is the class
label and xi ∈ Rn is the feature vector of dimension n. A classifier is a function
f(x,α) : x → y
where α are the classifier parameters. The SVM finds the optimal separating hyperplane which
fulfills the following constraints :
• The samples with labels +1 and −1 are on different sides of the hyperplane.
• The distance of the closest vectors to the hyperplane is maximised. These are the support
vectors (SV) and this distance is called the margin.
The separating hyperplane has the equation
w·x + b = 0,
with w being its normal vector and x being any point of the hyperplane. The orthogonal distance of
the hyperplane to the origin is given by |b|/‖w‖. Vectors located outside the hyperplane have either
w·x + b > 0 or w·x + b < 0.
Therefore, the classifier function can be written as
f(x,w,b) = sgn(w·x + b).
The SVs are placed on two hyperplanes which are parallel to the optimal separating one. In order to
find the optimal hyperplane, one sets w and b so that
w·x + b = ±1.
Since there must not be any vector inside the margin, the following constraints can be used:
w·xi + b ≥ +1 if yi = +1;
w·xi + b ≤ −1 if yi = −1;
which can be rewritten as
yi(w·xi + b) − 1 ≥ 0 ∀i.
The orthogonal distances of the two parallel hyperplanes to the origin are |1−b|/‖w‖ and |−1−b|/‖w‖.
Therefore the width of the margin is equal to 2/‖w‖ and it has to be maximised.
Thus, the problem to be solved is:
• Find w and b which minimise (1/2)‖w‖²
• under the constraint: yi(w·xi + b) ≥ 1, i = 1...N.
This problem can be solved by using the Lagrange multipliers with one multiplier per sample. It can
be shown that only the support vectors will have a positive Lagrange multiplier.
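For completeness (a standard textbook result, recalled here in LaTeX notation and not detailed in the
original text), the corresponding Lagrangian dual problem can be written as

\max_{\alpha} \; \sum_{i=1}^{N} \alpha_i
 - \frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_i \alpha_j \, y_i y_j \, x_i \cdot x_j
\quad \text{subject to} \quad \alpha_i \ge 0, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0,

the optimal normal vector being recovered as w = \sum_i \alpha_i y_i x_i, with \alpha_i > 0 only for
the support vectors.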
In the case where the two classes are not exactly linearly separable, one can modify the constraints
above by using
w·xi + b ≥ +1 − ξi if yi = +1;
w·xi + b ≤ −1 + ξi if yi = −1;
ξi ≥ 0 ∀i.
If ξi > 1, one considers that the sample is wrongly classified. The function which then has to be minimised
is (1/2)‖w‖² + C ∑i ξi, where C is a tolerance parameter. The optimisation problem is the same as in
the linear case, but one multiplier has to be added for each new constraint ξi ≥ 0.
If the decision surface needs to be non-linear, this solution cannot be applied and the kernel approach
has to be adopted.
One drawback of the SVM is that, in their basic version, they can only solve two-class problems.
Some works exist in the field of multi-class SVM (see [5, 141], and the comparison made by [60]),
but they are not used in our system.
Be aware that, in order to achieve better convergence of the algorithm, it is strongly advised to
normalize the feature vector components to the [−1, 1] interval.
For problems with N > 2 classes, one can choose either to train N SVM (one class against all the
others), or to train N × (N − 1)/2 SVM (one class against each of the others). In the second approach,
which is the one used here, the final decision is taken by choosing the class which is most often
selected by the whole set of SVM.
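As a minimal illustration of the normalization step mentioned above (this helper is not part of the
original example; its name and structure are ours), each component can be rescaled to [−1, 1] using
its minimum and maximum over the training set:

// Hypothetical helper (not in the OTB example): rescale each feature
// component to [-1, 1] using its minimum and maximum over the training set.
#include <algorithm>
#include <cstddef>
#include <vector>

void NormalizeSamples(std::vector<std::vector<double> >& samples)
{
  if (samples.empty()) return;
  const std::size_t nbComponents = samples[0].size();
  for (std::size_t c = 0; c < nbComponents; ++c)
  {
    double minVal = samples[0][c];
    double maxVal = samples[0][c];
    for (std::size_t i = 1; i < samples.size(); ++i)
    {
      minVal = std::min(minVal, samples[i][c]);
      maxVal = std::max(maxVal, samples[i][c]);
    }
    const double range = maxVal - minVal;
    if (range == 0.0) continue; // constant component: leave it unchanged
    for (std::size_t i = 0; i < samples.size(); ++i)
    {
      // Linear mapping of [minVal, maxVal] onto [-1, 1]
      samples[i][c] = 2.0 * (samples[i][c] - minVal) / range - 1.0;
    }
  }
}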
19.3.3 Learning from samples
The source code for this example can be found in the file
Examples/Learning/TrainMachineLearningModelFromSamplesExample.cxx.
This example illustrates the use of the otb::SVMMachineLearningModel class, which inherits
from the otb::MachineLearningModel class. This class allows the estimation of a classification
model (supervised learning) from samples. In this example, we will train an SVM model with 4
output classes, from 1000 randomly generated training samples, each of them having 7 components.
We start by including the appropriate header files.
// List sample generator
#include "otbListSampleGenerator.h"
// Random number generator
#include "itkMersenneTwisterRandomVariateGenerator.h"
// SVM model Estimator
#include "otbSVMMachineLearningModel.h"
The input parameters of the sample generator and of the SVM classifier are initialized.
int nbSamples = 1000;
int nbSampleComponents = 7;
int nbClasses = 4;
Two lists are generated as itk::Statistics::ListSample structures, which is the container used
to handle both the lists of samples and the lists of labels for the machine learning classes derived from
otb::MachineLearningModel. The first list is composed of feature vectors representing multi-component
samples, and the second one is filled with their corresponding class labels. The list of labels is
composed of scalar values.
// Input related typedefs
typedef float InputValueType;
typedef itk::VariableLengthVector <InputValueType > InputSampleType;
typedef itk::Statistics ::ListSample <InputSampleType > InputListSampleType;
// Target related typedefs
typedef int TargetValueType;
typedef itk::FixedArray <TargetValueType , 1> TargetSampleType;
typedef itk::Statistics ::ListSample <TargetSampleType > TargetListSampleType;
InputListSampleType::Pointer InputListSample = InputListSampleType::New ();
TargetListSampleType::Pointer TargetListSample = TargetListSampleType::New();
In this example, the list of multi-component training samples is randomly filled with a random num-
ber generator based on the itk::Statistics::MersenneTwisterRandomVariateGenerator
class. Each component’s value is generated from a normal law centered around the correspond-
ing class label of each sample multiplied by 100, with a standard deviation of 10.
itk::Statistics::MersenneTwisterRandomVariateGenerator::Pointer randGen;
randGen = itk::Statistics::MersenneTwisterRandomVariateGenerator::GetInstance();
// Filling the two input training lists
for (int i = 0; i < nbSamples ; ++i)
{
InputSampleType sample;
TargetValueType label = (i % nbClasses ) + 1;
// Multi-component sample randomly filled from a normal law for each component
sample.SetSize(nbSampleComponents);
for (int itComp = 0; itComp < nbSampleComponents; ++itComp)
{
sample[itComp] = randGen ->GetNormalVariate(100 * label , 10);
}
InputListSample ->PushBack (sample);
TargetListSample ->PushBack (label);
}
Once both sample and label lists are generated, the second step consists in declaring the
machine learning classifier. In our case we use an SVM model with the help of the
otb::SVMMachineLearningModel class, which is derived from the otb::MachineLearningModel
class. The latter is a pure virtual class based on the machine learning framework of the OpenCV
library ([15]), which handles classifiers other than the SVM.
typedef otb::SVMMachineLearningModel <InputValueType , TargetValueType > SVMType;
SVMType ::Pointer SVMClassifier = SVMType ::New ();
SVMClassifier ->SetInputListSample(InputListSample);
SVMClassifier ->SetTargetListSample(TargetListSample);
SVMClassifier ->SetKernelType(CvSVM::LINEAR);
The classifier is parametrized with both input lists; all parameters are left to their default values,
except for the kernel type in this SVM model estimation example. The model training is then computed
with the Train method. Finally, the Save method exports the model to a text file. All the available
classifiers based on OpenCV are implemented with these interfaces. As for the SVM model training, the
other classifiers can be parametrized with specific settings.
SVMClassifier ->Train();
SVMClassifier ->Save(outputModelFileName);
19.3.4 Learning from images
The source code for this example can be found in the file
Examples/Learning/TrainMachineLearningModelFromImagesExample.cxx.
This example illustrates the use of the otb::MachineLearningModel class. This class allows the
estimation of a classification model (supervised learning) from images. In this example, we will
train an SVM with 4 classes. We start by including the appropriate header files.
// List sample generator
#include "otbListSampleGenerator.h"
// Extract a ROI of the vectordata
#include "otbVectorDataIntoImageProjectionFilter.h"
// SVM model Estimator
#include "otbSVMMachineLearningModel.h"
In this framework, we must transform the input samples stored in a vector data into an
itk::Statistics::ListSample, which is the structure compatible with the machine learning
classes. On the one hand, we are using feature vectors for the characterization of the classes,
and on the other hand, the class labels are scalar values. We first re-project the input vector data
over the input image, using the otb::VectorDataIntoImageProjectionFilter class. To convert
the input samples stored in a vector data into an itk::Statistics::ListSample, we use the
otb::ListSampleGenerator class.
// VectorData projection filter
typedef otb::VectorDataIntoImageProjectionFilter<VectorDataType , InputImageType >
VectorDataReprojectionType;
InputReaderType::Pointer inputReader = InputReaderType::New();
inputReader ->SetFileName (inputImageFileName);
InputImageType::Pointer image = inputReader ->GetOutput ();
image ->UpdateOutputInformation();
// Read the Vectordata
VectorDataReaderType::Pointer vectorReader = VectorDataReaderType::New ();
vectorReader ->SetFileName (trainingShpFileName);
vectorReader ->Update();
VectorDataType::Pointer vectorData = vectorReader ->GetOutput ();
vectorData ->Update ();
VectorDataReprojectionType::Pointer vdreproj = VectorDataReprojectionType::New ();
vdreproj ->SetInputImage(image);
vdreproj ->SetInput(vectorData );
vdreproj ->SetUseOutputSpacingAndOriginFromImage(false);
vdreproj ->Update ();
typedef otb::ListSampleGenerator <InputImageType , VectorDataType >
ListSampleGeneratorType;
ListSampleGeneratorType::Pointer sampleGenerator;
sampleGenerator = ListSampleGeneratorType::New();
sampleGenerator ->SetInput(image);
sampleGenerator ->SetInputVectorData(vdreproj ->GetOutput ());
sampleGenerator ->SetClassKey ("Class");
sampleGenerator ->Update ();
Now, we need to declare the machine learning model which will be used by the classifier. In this
example, we train an SVM model. The otb::SVMMachineLearningModel class inherits from the
pure virtual class otb::MachineLearningModel, which is templated over the type of values used
for the measures and the type of pixels used for the labels. Most of the classification and regression
algorithms available through this interface in OTB are based on the OpenCV library [15]. Specific
methods can be used to set classifier parameters. In the case of the SVM, we set here the type of the
kernel; other parameters are left to their default values.
typedef otb:: SVMMachineLearningModel
<InputImageType::InternalPixelType ,
ListSampleGeneratorType::ClassLabelType > SVMType;
SVMType ::Pointer SVMClassifier = SVMType ::New ();
SVMClassifier ->SetInputListSample(sampleGenerator ->GetTrainingListSample());
SVMClassifier ->SetTargetListSample(sampleGenerator ->GetTrainingListLabel());
SVMClassifier ->SetKernelType(CvSVM::LINEAR);
The machine learning interface is generic and gives access to other classifiers. We now train the
SVM model using the Train method and save the model to a text file using the Save method.
SVMClassifier ->Train();
SVMClassifier ->Save(outputModelFileName);
You can now use the Predict method, which takes an itk::Statistics::ListSample
as input and estimates the label of each input sample using the model. Finally, the
otb::ImageClassificationModel inherits from the itk::ImageToImageFilter and makes it possible to
classify pixels in the input image by predicting their labels using a model.
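As a usage sketch only (this snippet is not part of the original example, and the exact Predict
overloads may differ between OTB versions, so check the otb::MachineLearningModel API of your
installation), the label of a single sample could be predicted with the freshly trained model as follows:

// Hypothetical sketch: predict the label of the first training sample.
// We assume a Predict() overload taking a single sample and returning a
// TargetSampleType; verify the signature for your OTB version.
SVMType::InputSampleType firstSample =
  sampleGenerator->GetTrainingListSample()->GetMeasurementVector(0);
SVMType::TargetSampleType predictedLabel = SVMClassifier->Predict(firstSample);
std::cout << "Predicted label of the first training sample: "
          << predictedLabel[0] << std::endl;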
19.3.5 Multi-band, streamed classification
The source code for this example can be found in the file
Examples/Classification/SupervisedImageClassificationExample.cxx.
In OTB, a generic streamed filter called otb::ImageClassificationFilter is available to classify
any input multi-channel image according to an input classification model file. This filter is
generic because it works with any classification model type (SVM, KNN, Artificial Neural
Network, ...) generated within the OTB generic Machine Learning framework based on OpenCV ([15]).
The input model file is parsed according to its content in order to identify which learning
method was used to generate it. Once the classification method and model are known, the input
image can be classified. Subsections 19.3.3 and 19.3.4 give more details on how to generate a
classification model either from samples or from images. In this example we will illustrate its use. We
start by including the appropriate header files.
#include "otbMacro.h"
#include "otbMachineLearningModelFactory.h"
#include "otbImageClassificationFilter.h"
We will assume double precision input images and will also define the type for the labeled pixels.
const unsigned int Dimension = 2;
typedef double PixelType ;
typedef unsigned short LabeledPixelType;
Our classifier is generic enough to be able to process images with any number of bands. We read
the input image as an otb::VectorImage. The labeled image will be a scalar image.
typedef otb::VectorImage <PixelType , Dimension > ImageType ;
typedef otb::Image <LabeledPixelType , Dimension > LabeledImageType;
We can now define the type for the classifier filter, which is templated over its input and output
image types.
typedef otb::ImageClassificationFilter<ImageType , LabeledImageType >
ClassificationFilterType;
typedef ClassificationFilterType::ModelType ModelType ;
Moreover, it is necessary to define an otb::MachineLearningModelFactory, which is templated
over its input and output pixel types. This factory is used to parse the input model file and to determine
which classification method to use.
typedef otb::MachineLearningModelFactory<PixelType , LabeledPixelType >
MachineLearningModelFactoryType;
And finally, we define the reader and the writer. Since the images to classify can be very big, we
will use a streamed writer which will trigger the streaming ability of the classifier.
typedef otb::ImageFileReader <ImageType > ReaderType ;
typedef otb::ImageFileWriter <LabeledImageType > WriterType ;
We instantiate the classifier and the reader objects and we set the existing model obtained in a
previous training step.
ClassificationFilterType::Pointer filter = ClassificationFilterType::New();
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (infname );
The input model file is parsed according to its content and the generated model is then loaded within
the otb::ImageClassificationFilter.
ModelType ::Pointer model;
model = MachineLearningModelFactoryType::CreateMachineLearningModel(
modelfname ,
MachineLearningModelFactoryType::ReadMode );
model ->Load(modelfname );
filter ->SetModel(model);
We plug the pipeline and trigger its execution by updating the output of the writer.
filter ->SetInput(reader ->GetOutput ());
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(filter ->GetOutput ());
writer ->SetFileName (outfname );
writer ->Update();
19.3.6 Generic Kernel SVM
OTB provides a specific interface for user-defined kernels. Note, however, that the following functions
use a deprecated OTB interface. A function k(·,·) is considered to be a kernel when:
∀ g(·) ∈ L²(ℝⁿ) such that ∫ g(x)² dx is finite, (19.5)
then ∫∫ k(x,y) g(x) g(y) dx dy ≥ 0,
which is known as the Mercer condition.
When defined through the OTB, a kernel is a class that inherits from GenericKernelFunctorBase.
Several virtual functions have to be overloaded:
• The Evaluate function, which implements the behavior of the kernel itself. For instance, the
classical linear kernel could be re-implemented with:
double
MyOwnNewKernel
::Evaluate ( const svm_node * x, const svm_node * y,
const svm_parameter & param ) const
{
return this->dot(x,y);
}
This simple example shows that the classical dot product is already implemented in
otb::GenericKernelFunctorBase::dot() as a protected function.
• The Update() function, which synchronizes local variables and integrates them into the initial
SVM procedure. The following examples will show the way to use it. A more complete kernel sketch
is given below.
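As an additional illustration of this (deprecated) interface, a Gaussian RBF kernel could be sketched
as follows. Treat it as a sketch only: the base-class usage beyond the Evaluate signature shown above,
the libSVM convention that svm_node arrays are terminated by index == -1, and the use of param.gamma
as the kernel width are assumptions to be checked against your OTB version.

// Sketch of a user-defined Gaussian RBF kernel (assumptions noted above).
// k(x, y) = exp(-gamma * ||x - y||^2), computed on sparse libSVM vectors.
#include <cmath>

class MyRBFKernel : public GenericKernelFunctorBase
{
public:
  double Evaluate(const svm_node* x, const svm_node* y,
                  const svm_parameter& param) const
  {
    double squaredDistance = 0.0;
    // Walk both sparse vectors; a missing index means a zero component.
    while (x->index != -1 && y->index != -1)
    {
      if (x->index == y->index)
      {
        const double diff = x->value - y->value;
        squaredDistance += diff * diff;
        ++x;
        ++y;
      }
      else if (x->index < y->index)
      {
        squaredDistance += x->value * x->value;
        ++x;
      }
      else
      {
        squaredDistance += y->value * y->value;
        ++y;
      }
    }
    // Remaining components of the longer vector.
    for (; x->index != -1; ++x) squaredDistance += x->value * x->value;
    for (; y->index != -1; ++y) squaredDistance += y->value * y->value;

    return std::exp(-param.gamma * squaredDistance);
  }
};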
Some pre-defined generic kernels have already been implemented in OTB:
• otb::MixturePolyRBFKernelFunctor, which implements a linear mixture of a polynomial
and an RBF kernel;
• otb::NonGaussianRBFKernelFunctor, which implements a non-Gaussian RBF kernel;
• otb::SpectralAngleKernelFunctor, a kernel that integrates the Spectral Angle, instead of
the Euclidean distance, into an inverse multiquadric kernel. This kernel may be appropriate
when using multispectral data.
• otb::ChangeProfileKernelFunctor, a kernel which is dedicated to the supervised classification
of the multiscale change profile presented in section 21.5.1.
Learning with User Defined Kernels
The source code for this example can be found in the file
Examples/Learning/SVMGenericKernelImageModelEstimatorExample.cxx.
This example illustrates the modifications to be made to the use of
otb::SVMImageModelEstimator in order to add a user-defined kernel. The initial program
has been explained in section 19.3.4.
The first thing to do is to include the header file for the new kernel.
#include "otbSVMImageModelEstimator.h"
#include "otbMixturePolyRBFKernelFunctor.h"
Once the otb::SVMImageModelEstimator is instantiated, it is possible to add the new kernel and
its parameters.
Then, in addition to the initial code:
EstimatorType::Pointer svmEstimator = EstimatorType::New ();
svmEstimator ->SetSVMType (C_SVC);
svmEstimator ->SetInputImage(inputReader ->GetOutput ());
svmEstimator ->SetTrainingImage(trainingReader ->GetOutput ());
The instantiation of the kernel has to be implemented. The kernel used here is a linear
combination of a polynomial kernel and an RBF one. It is written as
µ k1(x,y) + (1 − µ) k2(x,y)
with k1(x,y) = (γ1 x·y + c0)^d and k2(x,y) = exp(−γ2 ‖x − y‖²). The specific parameters of this
kernel are:
• Mixture (µ),
• GammaPoly (γ1),
• CoefPoly (c0),
• DegreePoly (d),
• GammaRBF (γ2).
Their values are set through the use of the SetValue function.
otb::MixturePolyRBFKernelFunctor myKernel;
myKernel.SetValue("Mixture", 0.5);
myKernel.SetValue("GammaPoly", 1.0);
myKernel.SetValue("CoefPoly", 0.0);
myKernel.SetValue("DegreePoly", 1);
myKernel.SetValue("GammaRBF", 1.5);
myKernel.Update();
Once the kernel's parameters are set and the kernel updated, the kernel is connected to the
otb::SVMImageModelEstimator.
svmEstimator ->SetKernelFunctor(&myKernel );
svmEstimator ->SetKernelType(GENERIC );
The model estimation procedure is triggered by calling the estimator’s Update method.
svmEstimator ->Update();
The rest of the code remains unchanged...
svmEstimator ->SaveModel (outputModelFileName);
When using a generic kernel, a specific line will appear in the file outputModelFileName; it gives
the name of the kernel and the names and values of its parameters.
Classification with user defined kernel
The source code for this example can be found in the file
Examples/Learning/SVMGenericKernelImageClassificationExample.cxx.
This example illustrates the modifications to be made in order to use the otb::SVMClassifier class for
performing SVM classification on images with a user-defined kernel. In this example, we will use
an SVM model estimated in the previous section to separate water from non-water pixels by
using the RGB values only. The first thing to do is to include the header file for the class as well as the
header of the appropriate kernel to be used.
#include "otbSVMClassifier.h"
#include "otbMixturePolyRBFKernelFunctor.h"
We need to declare the SVM model which is to be used by the classifier. The SVM model is
templated over the type of value used for the measures and the type of pixel used for the labels.
typedef otb::SVMModel <PixelType , LabelPixelType > ModelType ;
ModelType ::Pointer model = ModelType ::New ();
After instantiation, we can load a model saved to a file (see section 19.3.4 for an example of model
estimation and storage to a file).
When using a user-defined kernel, an explicit instantiation has to be performed.
otb::MixturePolyRBFKernelFunctor myKernel;
model->SetKernelFunctor(&myKernel);
Then, the rest of the classification program remains unchanged.
model ->LoadModel (modelFilename);
19.4 Fusion of Classification maps
19.4.1 General approach of image fusion
In order to obtain a relevant image classification, it is sometimes necessary to fuse several classification
maps coming from different classification methods (SVM, KNN, Random Forest, Artificial
Neural Networks, ...). The fusion combines these classification maps into a more robust and precise
one. Two methods are available in OTB: majority voting and the Dempster Shafer framework.
19.4.2 Majority voting
General description
For each input pixel, the Majority Voting method consists in choosing the most frequent class label
among all the classification maps to fuse. For example, if a pixel is labeled {2, 2, 5} in three input maps,
the fused label is 2. If the most frequent class label is not unique, the undecided value is set for such
pixels in the fused output image.
An example of majority voting fusion
The source code for this example can be found in the file
Examples/Classification/MajorityVotingFusionOfClassificationMapsExample.cxx.
The Majority Voting fusion filter itk::LabelVotingImageFilter used here comes from ITK. For each
pixel, it chooses the most frequent class label among the input classification maps. If the most frequent
class label is not unique, the output pixel is set to the undecidedLabel value. We start by
including the appropriate header file.
#include "itkLabelVotingImageFilter.h"
We will assume unsigned short type input labeled images.
const unsigned int Dimension = 2;
typedef unsigned short LabelPixelType;
The input labeled images to be fused are expected to be scalar images.
typedef otb::Image <LabelPixelType , Dimension > LabelImageType;
The Majority Voting fusion filter itk::LabelVotingImageFilter based on ITK is templated over
the input and output labeled image type.
// Majority Voting
typedef itk::LabelVotingImageFilter <LabelImageType , LabelImageType >
LabelVotingFilterType;
Both reader and writer are defined. Since the images to classify can be very big, we will use a
streamed writer which will trigger the streaming ability of the fusion filter.
typedef otb::ImageFileReader <LabelImageType > ReaderType ;
typedef otb::ImageFileWriter <LabelImageType > WriterType ;
The input classification maps to be fused are pushed into the itk::LabelVotingImageFilter.
Moreover, the label value for the undecided pixels (in case the majority vote is not unique) is set too.
ReaderType ::Pointer reader;
LabelVotingFilterType::Pointer labelVotingFilter = LabelVotingFilterType::New ();
for (unsigned int itCM = 0; itCM < nbClassificationMaps; ++itCM)
{
std::string fileNameClassifiedImage = argv[itCM + 1];
reader = ReaderType ::New();
reader ->SetFileName (fileNameClassifiedImage);
reader ->Update ();
labelVotingFilter ->SetInput(itCM , reader ->GetOutput ());
}
labelVotingFilter ->SetLabelForUndecidedPixels(undecidedLabel);
Once the pipeline is plugged, its execution is triggered by updating the output of the writer.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(labelVotingFilter ->GetOutput ());
writer ->SetFileName (outfname );
writer ->Update();
19.4.3 Dempster Shafer
General description
A more adaptive fusion method using the Dempster Shafer theory
(http://en.wikipedia.org/wiki/Dempster-Shafer theory) is available within the OTB. This method
is adaptive because it is based on the so-called belief function of each class label for each classification
map. Thus, each classified pixel is associated with a degree of confidence depending on the classifier
used. In the Dempster Shafer framework, the expert's point of view (i.e. with a high belief function)
is considered as the truth. In order to estimate the belief function of each class label, we use the
is considered as the truth. In order to estimate the belief function of each class label, we use the
Dempster Shafer combination of masses of belief for each class label and for each classification
map. In this framework, the output fused label of each pixel is the one with the maximal belief
function.
As with the majority voting method, the Dempster Shafer fusion handles the case where the class label
with the maximal belief function is not unique; such output fused pixels are set to the undecided value.
The confidence levels of all the class labels are estimated from a comparison of the classification
maps to fuse with a ground truth, which results in a confusion matrix for each of them. These confusion
matrices are then used to estimate the mass of belief of each class label.
Mathematical formulation of the combination algorithm
A description of the mathematical formulation of the Dempster Shafer
combination algorithm is available in the following OTB Wiki page:
http://wiki.orfeo-toolbox.org/index.php/Information fusion framework.
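For reference, and independently of the OTB implementation details described on that page, the standard
normalized Dempster combination of two masses of belief m_1 and m_2 can be written in LaTeX notation as

(m_1 \oplus m_2)(A) = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B) \, m_2(C),
\qquad
K = \sum_{B \cap C = \emptyset} m_1(B) \, m_2(C),

where K measures the conflict between the two sources and the rule is applied to every non-empty set of
labels A; the belief of a label is then the sum of the combined masses of the subsets it contains.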
An example of Dempster Shafer fusion
The source code for this example can be found in the file
Examples/Classification/DempsterShaferFusionOfClassificationMapsExample.cxx.
The fusion filter otb::DSFusionOfClassifiersImageFilter is based on the Dempster Shafer
(DS) fusion framework. For each pixel, it chooses the class label Ai for which the belief function
bel(Ai) is maximal after the DS combination of all the available masses of belief of all the class
labels. The masses of belief (MOBs) of all the labels present in each classification map are read
from input *.CSV confusion matrix files. Moreover, the pixels in the input classification maps to
be fused which are equal to the nodataLabel value are ignored by the fusion process. If the class
label with the maximal belief function is not unique, the output pixels are set to the undecidedLabel
value. We start by including the appropriate header files.
#include "otbImageListToVectorImageFilter.h"
#include "otbConfusionMatrixToMassOfBelief.h"
#include "otbDSFusionOfClassifiersImageFilter.h"
#include <fstream>
We will assume unsigned short type input labeled images. We define a type for confu-
sion matrices as itk::VariableSizeMatrix which will be used to estimate the masses
of belief of all the class labels for each input classification map. For this purpose, the
otb::ConfusionMatrixToMassOfBelief will be used to convert each input confusion matrix into
masses of belief for each class label.
typedef unsigned short LabelPixelType;
typedef unsigned long ConfusionMatrixEltType;
typedef itk::VariableSizeMatrix <ConfusionMatrixEltType > ConfusionMatrixType;
typedef otb:: ConfusionMatrixToMassOfBelief
<ConfusionMatrixType , LabelPixelType > ConfusionMatrixToMassOfBeliefType;
typedef ConfusionMatrixToMassOfBeliefType::MapOfClassesType MapOfClassesType;
The input labeled images to be fused are expected to be scalar images.
const unsigned int Dimension = 2;
typedef otb::Image <LabelPixelType , Dimension > LabelImageType;
typedef otb::VectorImage <LabelPixelType , Dimension > VectorImageType;
We declare an otb::ImageListToVectorImageFilter which will stack all the input clas-
sification maps to be fused as a single VectorImage for which each band is a classifica-
tion map. This VectorImage will then be the input of the Dempster Shafer fusion filter
otb::DSFusionOfClassifiersImageFilter.
typedef otb::ImageList <LabelImageType > LabelImageListType;
typedef otb::ImageListToVectorImageFilter
<LabelImageListType , VectorImageType > ImageListToVectorImageFilterType;
The Dempster Shafer fusion filter otb::DSFusionOfClassifiersImageFilter is declared.
// Dempster Shafer
typedef otb:: DSFusionOfClassifiersImageFilter
<VectorImageType , LabelImageType > DSFusionOfClassifiersImageFilterType;
Both reader and writer are defined. Since the images to classify can be very big, we will use a
streamed writer which will trigger the streaming ability of the fusion filter.
typedef otb::ImageFileReader <LabelImageType > ReaderType ;
typedef otb::ImageFileWriter <LabelImageType > WriterType ;
The image list of input classification maps is filled. Moreover, the input confusion matrix files are
converted into masses of belief.
ReaderType ::Pointer reader;
LabelImageListType::Pointer imageList = LabelImageListType::New ();
ConfusionMatrixToMassOfBeliefType::Pointer confusionMatrixToMassOfBeliefFilter;
confusionMatrixToMassOfBeliefFilter = ConfusionMatrixToMassOfBeliefType::New ();
MassOfBeliefDefinitionMethod massOfBeliefDef;
// Several parameters are available to estimate the masses of belief
// from the confusion matrices: PRECISION, RECALL, ACCURACY and KAPPA
massOfBeliefDef = ConfusionMatrixToMassOfBeliefType::PRECISION ;
VectorOfMapOfMassesOfBeliefType vectorOfMapOfMassesOfBelief;
for (unsigned int itCM = 0; itCM < nbClassificationMaps; ++itCM)
{
std::string fileNameClassifiedImage = argv[itCM + 1];
std::string fileNameConfMat = argv[itCM + 1 + nbClassificationMaps];
reader = ReaderType ::New();
reader ->SetFileName (fileNameClassifiedImage);
reader ->Update ();
imageList ->PushBack (reader ->GetOutput ());
MapOfClassesType mapOfClassesClk;
ConfusionMatrixType confusionMatrixClk;
// The data (class labels and confusion matrix values) are read and
// extracted from the *.CSV file with an ad-hoc file parser
CSVConfusionMatrixFileReader(
fileNameConfMat , mapOfClassesClk , confusionMatrixClk);
// The parameters of the ConfusionMatrixToMassOfBelief filter are set
confusionMatrixToMassOfBeliefFilter->SetMapOfClasses(mapOfClassesClk);
confusionMatrixToMassOfBeliefFilter->SetConfusionMatrix(confusionMatrixClk);
confusionMatrixToMassOfBeliefFilter->SetDefinitionMethod(massOfBeliefDef);
confusionMatrixToMassOfBeliefFilter->Update ();
// Vector containing ALL the K (= nbClassificationMaps) std::map<Label, MOB>
// of Masses of Belief
vectorOfMapOfMassesOfBelief.push_back (
confusionMatrixToMassOfBeliefFilter->GetMapMassOfBelief());
}
The image list of input classification maps is converted into a VectorImage to be used as input of the
otb::DSFusionOfClassifiersImageFilter.
// Image List To VectorImage
ImageListToVectorImageFilterType::Pointer imageListToVectorImageFilter;
imageListToVectorImageFilter = ImageListToVectorImageFilterType::New();
imageListToVectorImageFilter->SetInput(imageList );
DSFusionOfClassifiersImageFilterType::Pointer dsFusionFilter;
dsFusionFilter = DSFusionOfClassifiersImageFilterType::New ();
// The parameters of the DSFusionOfClassifiersImageFilter are set
dsFusionFilter ->SetInput(imageListToVectorImageFilter->GetOutput ());
dsFusionFilter ->SetInputMapsOfMassesOfBelief(& vectorOfMapOfMassesOfBelief);
dsFusionFilter ->SetLabelForNoDataPixels(nodataLabel );
dsFusionFilter ->SetLabelForUndecidedPixels(undecidedLabel);
Once the pipeline is plugged, its execution is triggered by updating the output of the writer.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(dsFusionFilter ->GetOutput ());
writer ->SetFileName (outfname );
writer ->Update();
19.5 Classification map regularization
The source code for this example can be found in the file
Examples/Classification/ClassificationMapRegularizationExample.cxx.
After having generated a classification map, it is possible to regularize such a labeled image in
order to obtain more homogeneous areas, which makes the interpretation of its classes easier. For
this purpose, the otb::NeighborhoodMajorityVotingImageFilter was implemented. Like a
morphological filter, this filter uses majority voting in a ball-shaped neighborhood in order to set
each pixel of the classification map to the most representative label value in its neighborhood.
In this example we will illustrate its use. We start by including the appropriate header file.
#include "otbNeighborhoodMajorityVotingImageFilter.h"
Since the input image is a classification map, we will assume a single band input image for which
each pixel value is a label coded on 8 bits as an integer between 0 and 255.
typedef unsigned char IOLabelPixelType; // 8 bits
const unsigned int Dimension = 2;
Thus, both input and output images are single band labeled images, which are composed of the same
type of pixels in this example (unsigned char).
typedef otb::Image <IOLabelPixelType , Dimension > IOLabelImageType;
We can now define the type for the neighborhood majority voting filter, which is templated over its
input and output image types and over its structuring element type. Choosing only the input image
type in the template of this filter implies that both input and output image types are the same and
that the structuring element is a ball ( itk::BinaryBallStructuringElement).
// Neighborhood majority voting filter type
typedef otb::NeighborhoodMajorityVotingImageFilter<IOLabelImageType >
NeighborhoodMajorityVotingFilterType;
Since the otb::NeighborhoodMajorityVotingImageFilter is a neighborhood based image fil-
ter, it is necessary to set the structuring element which will be used for the majority voting process.
By default, the structuring element is a ball ( itk::BinaryBallStructuringElement) with a ra-
dius defined by two sizes (respectively along X and Y). Thus, it is possible to handle anisotropic
structuring elements such as ovals.
// Binary ball Structuring Element type
typedef NeighborhoodMajorityVotingFilterType::KernelType StructuringType;
typedef StructuringType::RadiusType RadiusType ;
Finally, we define the reader and the writer.
typedef otb::ImageFileReader <IOLabelImageType > ReaderType ;
typedef otb::ImageFileWriter <IOLabelImageType > WriterType ;
We instantiate the otb::NeighborhoodMajorityVotingImageFilter and the reader objects.
// Neighborhood majority voting filter
NeighborhoodMajorityVotingFilterType::Pointer NeighMajVotingFilter;
NeighMajVotingFilter = NeighborhoodMajorityVotingFilterType::New ();
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFileName);
The ball shaped structuring element seBall is instantiated and its two radii along X and Y are initial-
ized.
StructuringType seBall;
RadiusType rad;
rad [0] = radiusX;
rad [1] = radiusY;
seBall.SetRadius (rad);
seBall.CreateStructuringElement();
Then, this ball shaped neighborhood is used as the kernel structuring element for the
otb::NeighborhoodMajorityVotingImageFilter.
NeighMajVotingFilter ->SetKernel (seBall);
Unclassified input pixels are assumed to have the noDataValue label and will keep this label in the
output image.
NeighMajVotingFilter ->SetLabelForNoDataPixels(noDataValue );
Moreover, since the majority voting regularization may lead to a non-unique majority label in the
neighborhood, it is important to define the behaviour of the filter in this case. For this
purpose, a Boolean parameter is used in the filter to choose whether pixels with more than one
majority class are set to undecidedValue (true), or keep their original labels (false = default value) in
the output image.
NeighMajVotingFilter ->SetLabelForUndecidedPixels(undecidedValue);
if (KeepOriginalLabelBoolStr.compare("true") == 0)
{
NeighMajVotingFilter ->SetKeepOriginalLabelBool(true);
}
else
{
NeighMajVotingFilter ->SetKeepOriginalLabelBool(false);
}
We plug the pipeline and trigger its execution by updating the output of the writer.
NeighMajVotingFilter ->SetInput(reader ->GetOutput ());
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outputFileName);
writer ->SetInput(NeighMajVotingFilter ->GetOutput ());
writer ->Update();
CHAPTER
TWENTY
OBJECT-BASED IMAGE ANALYSIS
Object-Based Image Analysis (OBIA) focuses on analyzing images at the object level instead of
working at the pixel level. This approach is particularly well adapted to high resolution images and
leads to more robust and less noisy results.
OTB makes it possible to implement OBIA by using ITK's Label Object framework
(http://www.insight-journal.org/browse/publication/176). This allows representing a
segmented image as a set of regions rather than as a set of pixels. In addition to the
compression achieved by this kind of description, the main advantage of this approach is the
possibility to operate at the segment (or object) level.
A classical OBIA pipeline will use the following steps:
1. Image segmentation (of the whole image or only parts of it);
2. Image to LabelObjectMap (a kind of std::map<LabelObject>) transformation;
3. Optional relabeling;
4. Attribute computation for the regions using the image before segmentation:
(a) Shape attributes;
(b) Statistics attributes;
(c) Attributes for radiometry, textures, etc.
5. Object filtering
(a) Remove/select objects under a condition (area less than X, NDVI higher than X, etc.)
(b) Keep N objects;
(c) etc.
6. LabelObjectMap to image transformation.
20.1 From Images to Objects
The source code for this example can be found in the file
Examples/OBIA/ImageToLabelToImage.cxx.
This example shows the basic approach for the transformation of a segmented (labeled) image into a
LabelObjectMap and then back to an image. For this purpose we will need the following header files,
which contain the basic classes.
#include "itkLabelObject.h"
#include "itkLabelMap .h"
#include "itkBinaryImageToLabelMapFilter.h"
#include "itkLabelMapToLabelImageFilter.h"
The image types are defined using pixel types and dimension. The input image is defined as an
otb::Image.
const int dim = 2;
typedef unsigned short PixelType ;
typedef otb::Image <PixelType , dim> ImageType ;
typedef itk::LabelObject <PixelType , dim> LabelObjectType;
typedef itk::LabelMap <LabelObjectType > LabelMapType;
As usual, the reader is instantiated and the input image is set.
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (argv[1]);
Then the binary image is transformed to a collection of label objects. Arguments are:
• FullyConnected: Set whether the connected components are defined strictly by face connec-
tivity or by face+edge+vertex connectivity. Default is FullyConnectedOff.
• InputForegroundValue/OutputBackgroundValue: Specify the foreground pixel value of the
input and the background pixel value of the output.
typedef itk::BinaryImageToLabelMapFilter<ImageType , LabelMapType > I2LType;
I2LType ::Pointer i2l = I2LType ::New ();
i2l->SetInput (reader ->GetOutput ());
i2l->SetFullyConnected(atoi(argv[5]));
i2l->SetInputForegroundValue(atoi(argv[6]));
i2l->SetOutputBackgroundValue(atoi(argv[7]));
Then the inverse process is used to recreate an image of labels. The
itk::LabelMapToLabelImageFilter converts a LabelMap to a labeled image.
Figure 20.1: Transforming an image (left) into a label object map and back to an image (right).
typedef itk::LabelMapToLabelImageFilter<LabelMapType , ImageType > L2IType;
L2IType ::Pointer l2i = L2IType::New ();
l2i->SetInput (i2l->GetOutput ());
The output can be passed to a writer. The invocation of the Update() method on the writer triggers
the execution of the pipeline.
typedef otb::ImageFileWriter <ImageType > WriterType ;
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetInput(l2i->GetOutput ());
writer ->SetFileName (argv[2]);
writer ->Update();
Figure 20.1 shows the effect of transforming an image into a label object map and back to an image.
20.2 Object Attributes
The source code for this example can be found in the file
Examples/OBIA/ShapeAttributeComputation.cxx.
This basic example shows how to compute shape attributes at the object level. The input image is
first converted into a set of regions ( itk::ShapeLabelObject), some attribute values of each
object are computed and then saved to an ASCII file.
#include "itkShapeLabelObject.h"
#include "itkLabelMap .h"
#include "itkLabelImageToLabelMapFilter.h"
#include "itkShapeLabelMapFilter.h"
The image types are defined using pixel types and dimensions. The input image is defined as an
otb::Image.
const int dim = 2;
typedef unsigned long PixelType ;
typedef otb::Image <PixelType , dim> ImageType ;
typedef unsigned long LabelType ;
typedef itk::ShapeLabelObject <LabelType , dim> LabelObjectType;
typedef itk::LabelMap <LabelObjectType > LabelMapType;
typedef itk::LabelImageToLabelMapFilter
<ImageType , LabelMapType > ConverterType;
Firstly, the image reader is instantiated.
typedef otb::ImageFileReader <ImageType > ReaderType ;
ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (argv[1]);
Here the itk::ShapeLabelObject type is chosen in order to read some attributes related to the shape
of the objects, as opposed to attributes related to their content, which would require the
itk::StatisticsLabelObject.
typedef itk::ShapeLabelMapFilter <LabelMapType > ShapeFilterType;
The input image is converted into a collection of objects.
ConverterType::Pointer converter = ConverterType::New();
converter ->SetInput (reader ->GetOutput ());
converter ->SetBackgroundValue(itk::NumericTraits <LabelType >:: min ());
ShapeFilterType::Pointer shape = ShapeFilterType::New();
shape ->SetInput(converter ->GetOutput ());
Update the shape filter, so its output will be up to date.
shape ->Update();
Then, we can read the attribute values we're interested in. The
itk::BinaryImageToShapeLabelMapFilter produces consecutive labels, so we can use a
for loop and the GetLabelObject() method to retrieve the label objects. If the labels are not
consecutive, the GetNthLabelObject() method must be used instead of GetLabelObject(), or an
iterator on the label object container of the label map. In this example, we write two shape attributes
of each object to a text file (the size and the centroid coordinates).
std::ofstream outfile(argv[2]);
LabelMapType::Pointer labelMap = shape ->GetOutput ();
for (unsigned long label = 1;
label <= labelMap ->GetNumberOfLabelObjects();
label++)
{
// We don’t need a SmartPointer of the label object here,
// because the reference is kept in the label map.
const LabelObjectType * labelObject = labelMap ->GetLabelObject(label);
outfile << label << "\t" << labelObject->GetPhysicalSize() << "\t"
<< labelObject ->GetCentroid () << std::endl;
}
outfile.close();
20.3 Object Filtering based on radiometric and statistics attributes
The source code for this example can be found in the file
Examples/OBIA/RadiometricAttributesLabelMapFilterExample.cxx.
This example shows the basic approach to perform object based analysis on an image.
The input image is first segmented using the otb::MeanShiftImageFilter.
Then each segmented region is converted to a map of labeled objects. Afterwards the
otb::MultiChannelRAndNIRIndexImageFilter computes radiometric attributes for each
object. In this example the NDVI is computed. The computed feature is passed to the
otb::BandsStatisticsAttributesLabelMapFilter which computes statistics over the resulting
band. Therefore, a region's statistics over each band can be accessed by concatenating STATS,
the band number and the statistical attribute, separated by double colons. In this example the mean
of the first band (which contains the NDVI) is accessed over all the regions with the attribute:
'STATS::Band1::Mean'.
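For instance (a usage sketch, not part of the original example; the filter and label object type names
below refer to the typedefs used further down and are assumptions as far as this snippet is concerned),
the mean NDVI of one object could be read back once the pipeline has been updated:

// Hypothetical sketch: read the 'STATS::Band1::Mean' attribute (the mean NDVI)
// of the first label object. We assume the label map is made of
// otb::AttributesMapLabelObject instances, which expose GetAttribute(name).
const LabelObjectType* firstObject =
  radiometricLabelMapFilter->GetOutput()->GetNthLabelObject(0);
double meanNDVI = firstObject->GetAttribute("STATS::Band1::Mean");
std::cout << "Mean NDVI of the first object: " << meanNDVI << std::endl;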
Firstly, segment the input image by using the Mean Shift algorithm (see 8.7.2 for deeper explana-
tions).
typedef otb:: MeanShiftVectorImageFilter
<VectorImageType , VectorImageType , LabeledImageType > FilterType ;
FilterType ::Pointer filter = FilterType ::New ();
filter ->SetSpatialRadius(spatialRadius);
filter ->SetRangeRadius(rangeRadius );
filter ->SetMinimumRegionSize(minRegionSize);
filter ->SetScale(scale);
The otb::MeanShiftImageFilter type is instantiated using the image types.
filter ->SetInput(vreader->GetOutput ());
The itk::LabelImageToLabelMapFilter type is instantiated using the output of the
otb::MeanShiftImageFilter. This filter produces a labeled image where each segmented region
has a unique label.
LabelMapFilterType::Pointer labelMapFilter = LabelMapFilterType::New();
labelMapFilter ->SetInput(filter ->GetLabeledClusteredOutput());
labelMapFilter ->SetBackgroundValue(itk::NumericTraits <LabelType >::min ());
ShapeLabelMapFilterType::Pointer shapeLabelMapFilter =
ShapeLabelMapFilterType::New ();
shapeLabelMapFilter ->SetInput(labelMapFilter ->GetOutput ());
Instantiate the otb::BandsStatisticsAttributesLabelMapFilter to compute statistics of the
feature image on each label object.
RadiometricLabelMapFilterType::Pointer radiometricLabelMapFilter
= RadiometricLabelMapFilterType::New();
The feature image could be one of the following:
• GEMI
• NDVI
• IR
• IC
• IB
• NDWI2
• Intensity
The input image must be converted to the desired coefficient. In our case, statistics are computed on the
NDVI coefficient on each label object.
NDVIImageFilterType:: Pointer ndviImageFilter = NDVIImageFilterType::New();
ndviImageFilter ->SetRedIndex (3);
ndviImageFilter ->SetNIRIndex (4);
ndviImageFilter ->SetInput (vreader ->GetOutput ());
ImageToVectorImageCastFilterType::Pointer ndviVectorImageFilter =
ImageToVectorImageCastFilterType::New ();
ndviVectorImageFilter ->SetInput(ndviImageFilter ->GetOutput ());
radiometricLabelMapFilter->SetInput(shapeLabelMapFilter ->GetOutput ());
radiometricLabelMapFilter->SetFeatureImage(ndviVectorImageFilter ->GetOutput ());
The otb::AttributesMapOpeningLabelMapFilter will perform the selection. There are three
parameters: AttributeName specifies the radiometric attribute, Lambda controls the thresholding of
the input, and ReverseOrdering makes this filter remove the objects with an attribute value greater
than Lambda instead.
OpeningLabelMapFilterType::Pointer opening = OpeningLabelMapFilterType::New ();
opening ->SetInput(radiometricLabelMapFilter->GetOutput ());
opening ->SetAttributeName(attr);
opening ->SetLambda (thresh);
opening ->SetReverseOrdering(lowerThan );
opening ->Update();
Then, the selected label objects are transformed into a label image using the
itk::LabelMapToLabelImageFilter.
LabelMapToBinaryImageFilterType::Pointer labelMap2LabeledImage
= LabelMapToBinaryImageFilterType::New();
labelMap2LabeledImage ->SetInput(opening->GetOutput ());
And finally, we declare the writer and call its Update() method to trigger the full pipeline execution.
WriterType ::Pointer writer = WriterType ::New ();
writer ->SetFileName (outfname );
writer ->SetInput(labelMap2LabeledImage ->GetOutput ());
writer ->Update();
Figure 20.2 shows the result of applying the object selection based on radiometric attributes.
Figure 20.2: Vegetation mask resulting from processing.
20.4 Hoover metrics to compare segmentations
The source code for this example can be found in the file
Examples/OBIA/HooverMetricsEstimation.cxx.
The following example shows how to compare two segmentations using the Hoover metrics. For
instance, this can be used to compare a segmentation produced by your algorithm against a partial ground
truth segmentation. In this example, the ground truth segmentation will be referred to by the letters GT
whereas the machine segmentation will be referred to by MS.
The estimation of the Hoover metrics is done with two filters: otb::HooverMatrixFilter and
otb::HooverInstanceFilter. The first one produces a matrix containing the number of overlapping
pixels between MS regions and GT regions. The second one classifies each region among
four types (called Hoover instances):
• Correct detection: a region is matched with another one in the opposite segmentation, because
they cover nearly the same area.
• Over-segmentation: a GT region is matched with a group of MS regions because they cover
nearly the same area.
• Under-segmentation: an MS region is matched with a group of GT regions because they cover
nearly the same area.
• Missed detection (for GT regions) or Noise (for MS regions): unmatched regions.
Note that a region can be tagged with two types. Once the Hoover instances have been found, the
instance filter computes overall scores for each category: these are the Hoover metrics 1.
#include "otbHooverMatrixFilter.h"
#include "otbHooverInstanceFilter.h"
#include "otbLabelMapToAttributeImageFilter.h"
The otb::HooverMatrixFilter and otb::HooverInstanceFilter filters are designed to handle
itk::LabelMap images, made of otb::AttributesMapLabelObject. This type of label
object makes it possible to store generic attributes. Each region can store a set of attributes: in this case,
Hoover instances and metrics will be stored.
typedef otb::AttributesMapLabelObject<unsigned int, 2, float> LabelObjectType;
typedef itk::LabelMap <LabelObjectType > LabelMapType;
typedef otb::HooverMatrixFilter <LabelMapType > HooverMatrixFilterType;
typedef otb::HooverInstanceFilter <LabelMapType > InstanceFilterType;
typedef otb::Image <unsigned int, 2> ImageType ;
typedef itk::LabelImageToLabelMapFilter
<ImageType , LabelMapType > ImageToLabelMapFilterType;
1see http://www.trop.mips.uha.fr/pdf/ORASIS-2009.pdf
typedef otb::VectorImage <float, 2> VectorImageType;
typedef otb:: LabelMapToAttributeImageFilter
<LabelMapType , VectorImageType > AttributeImageFilterType;
The first step is to convert the images to label maps: we use the
itk::LabelImageToLabelMapFilter. The background value sets the label value of the regions
considered as background: there is no label object for the background region.
ImageToLabelMapFilterType::Pointer gt_filter = ImageToLabelMapFilterType::New ();
gt_filter ->SetInput (gt_reader ->GetOutput ());
gt_filter ->SetBackgroundValue(0);
ImageToLabelMapFilterType::Pointer ms_filter = ImageToLabelMapFilterType::New ();
ms_filter ->SetInput (ms_reader ->GetOutput ());
ms_filter ->SetBackgroundValue(0);
The Hoover matrix filter has to be updated here. This matrix must be computed before being given
to the instance filter.
HooverMatrixFilterType::Pointer hooverFilter = HooverMatrixFilterType::New();
hooverFilter ->SetGroundTruthLabelMap(gt_filter ->GetOutput ());
hooverFilter ->SetMachineSegmentationLabelMap(ms_filter ->GetOutput ());
hooverFilter ->Update();
The instance filter computes the Hoover metrics for each region. These metrics are stored as
attributes in each label object. The threshold parameter corresponds to the overlapping ratio above
which two regions can be matched. The extended attributes can be used if the user wants to keep
a trace of the associations between MS and GT regions: i.e. if a GT region has been matched as
a correct detection, it will carry an attribute containing the label value of the associated MS region
(the same principle applies to the other types of instance).
InstanceFilterType::Pointer instances = InstanceFilterType::New ();
instances ->SetGroundTruthLabelMap(gt_filter ->GetOutput ());
instances ->SetMachineSegmentationLabelMap(ms_filter ->GetOutput ());
instances ->SetThreshold(0.75);
instances ->SetHooverMatrix(hooverFilter ->GetHooverConfusionMatrix());
instances ->SetUseExtendedAttributes(false);
The otb::LabelMapToAttributeImageFilter is designed to extract attribute values from a label
map and output them in the channels of a vector image. We set the attribute to plot in each
channel.
AttributeImageFilterType::Pointer attributeImageGT = AttributeImageFilterType::New();
attributeImageGT->SetInput(instances->GetOutputGroundTruthLabelMap());
attributeImageGT->SetAttributeForNthChannel(0,
  InstanceFilterType::GetNameFromAttribute(InstanceFilterType::ATTRIBUTE_RC));
attributeImageGT->SetAttributeForNthChannel(1,
  InstanceFilterType::GetNameFromAttribute(InstanceFilterType::ATTRIBUTE_RF));
attributeImageGT->SetAttributeForNthChannel(2,
  InstanceFilterType::GetNameFromAttribute(InstanceFilterType::ATTRIBUTE_RA));
attributeImageGT->SetAttributeForNthChannel(3,
  InstanceFilterType::GetNameFromAttribute(InstanceFilterType::ATTRIBUTE_RM));
WriterType::Pointer writer = WriterType::New();
writer->SetInput(attributeImageGT->GetOutput());
writer->SetFileName(argv[3]);
writer ->Update();
The output image contains, for each GT region, its correct detection score ("RC", band 1), its over-
segmentation score ("RF", band 2), its under-segmentation score ("RA", band 3) and its missed
detection score ("RM", band 4).
std::cout << "Mean RC ="<< instances ->GetMeanRC () << std::endl;
std::cout << "Mean RF ="<< instances ->GetMeanRF () << std::endl;
std::cout << "Mean RA ="<< instances ->GetMeanRA () << std::endl;
std::cout << "Mean RM ="<< instances ->GetMeanRM () << std::endl;
std::cout << "Mean RN ="<< instances ->GetMeanRN () << std::endl;
The Hoover scores are also computed for the whole segmentations. Here is some explanation about
the score names: C = correct, F = fragmentation, A = aggregation, M = missed, N = noise.
CHAPTER
TWENTYONE
CHANGE DETECTION
21.1 Introduction
Change detection techniques try to detect and locate areas which have changed between two or more
observations of the same scene. These changes can be of different types, with different origins and
of different temporal lengths. This makes it possible to distinguish different kinds of applications:
• land use monitoring, which corresponds to the characterization of the evolution of the vegetation, or its seasonal changes;
• natural resources management, which corresponds mainly to the characterization of the evolution of urban areas, of deforestation, etc.;
• damage mapping, which corresponds to the location of damages caused by natural or industrial disasters.
From the point of view of the observed phenomena, one can distinguish two types of changes of
rather different nature: abrupt changes and progressive changes, which may possibly be periodic.
From the data point of view, one can have:
• Image pairs before and after the event. The applications mainly concern the abrupt changes.
• Multi-temporal image series on which two types of changes may appear:
– The slow changes, like for instance erosion, vegetation evolution, etc. The knowledge
of the studied phenomena and of their consequences on the geometric and radiometric
evolution at the different dates is very important information for this kind of
analysis.
– The abrupt changes may pose different kinds of problems depending on whether the date
of the change is known in the image series or not. The detection of areas affected by a
change that occurred at a known date may exploit this a priori information in order to split
the image series into two sub-series (before and after) and use the temporal redundancy
in order to improve the detection results. On the other hand, when the date of the change
is not known, the problem is much more difficult.
From this classification of the different types of problems, one can infer four cases for which one can
look for algorithms as a function of the available data:
1. Abrupt changes in an image pair. This is no doubt the field for which the most work has been
done. One can find tools at the three classical levels of image processing: data level (differences,
ratios, with or without pre-filtering, etc.), feature level (edges, targets, etc.), and interpretation
level (post-classification comparison).
2. Abrupt changes within an image series and a known date. One can rely on bi-date techniques,
either by fusing the images into two stacks (before and after), or by fusing the results obtained
for different image couples (one after and one before the event). One can also use specific
discontinuity detection techniques applied along the temporal axis.
3. Abrupt changes within an image series and an unknown date. This case can be seen either as
a generalization of the preceding one (testing the N-1 positions for N dates) or as a particular
case of the following one.
4. Progressive changes within an image series. One can work in two steps:
(a) detect the change areas using stability criteria along the temporal dimension;
(b) identify the changes using prior information about the type of changes of interest.
21.1.1 Surface-based approaches
In this section we discuss about the damage assessment techniques which can be applied when only
two images (before/after) are available.
As it has been shown in recent review works [31, 92, 116, 118], a relatively high number of methods
exist, but most of them have been developed for optical and infrared sensors. Only a few recent
works on change detection with radar images exist [129, 17, 104, 69, 39, 12, 71]. However, the
intrinsic limits of passive sensors, mainly related to their dependence on meteorological and
illumination conditions, impose severe constraints for operational applications. The principal
difficulties related to change detection are of four types:
1. In the case of radar images, the speckle noise makes the image exploitation difficult.
2. The geometric configuration of the image acquisition can produce images which are difficult
to compare.
3. Also, the temporal gap between the two acquisitions, and thus the sensor aging and the
inter-calibration, are sources of variability which are difficult to deal with.
4. Finally, the normal evolution of the observed scenes must not be confused with the changes
of interest.
The problem of detecting abrupt changes between a pair of images is the following: Let I1,I2 be two
images acquired at different dates t1,t2; we aim at producing a thematic map which shows the areas
where changes have taken place.
Three main categories of methods exist:
• Strategy 1: Post Classification Comparison
The principle of this approach [35] is to obtain two land-use maps independently for each
date and to compare them.
• Strategy 2: Joint classification
This method consists in producing the change map directly from a joint classification of both
images.
• Strategy 3: Simple detectors
The last approach consists in producing an image of change likelihood (by differences, ratios
or any other approach) and thresholding it in order to produce the change map.
Because of its simplicity and its low computational overhead, the third strategy is the one which has
been chosen for the processing presented here.
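To fix ideas about this third strategy, the change likelihood image produced by any of the detectors
described below can be turned into a binary change map with a simple thresholding filter. The
following lines are only an illustrative sketch: the image types are those defined in the examples of
this chapter, the changeDetector object and the threshold value 0.5 are assumptions made for the
illustration and do not appear in the original examples.

#include "itkBinaryThresholdImageFilter.h"

// Hedged sketch: threshold a change likelihood image into a binary change map.
typedef itk::BinaryThresholdImageFilter<ChangeImageType, OutputImageType> ThresholdFilterType;
ThresholdFilterType::Pointer thresholder = ThresholdFilterType::New();
thresholder->SetInput(changeDetector->GetOutput());  // changeDetector: any detector of this chapter
thresholder->SetLowerThreshold(0.5);                  // arbitrary illustration value
thresholder->SetUpperThreshold(itk::NumericTraits<InternalPixelType>::max());
thresholder->SetInsideValue(255);                     // changed pixels
thresholder->SetOutsideValue(0);                      // unchanged pixels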
21.2 Change Detection Framework
The source code for this example can be found in the file
Examples/ChangeDetection/ChangeDetectionFrameworkExample.cxx.
This example illustrates the Change Detector framework implemented in OTB. This
framework uses the generic programming approach. All change detection filters are
otb::BinaryFunctorNeighborhoodImageFilters, that is, they are filters taking two images as
input and providing one image as output. The change detection computation itself is performed on
the neighborhood of each pixel of the input images.
The first step required to build a change detection filter is to include the header of the parent class.
#include "otbBinaryFunctorNeighborhoodImageFilter.h"
The change detection operation itself is one of the template parameters of the change detection filters
and takes the form of a function, that is, something accepting the syntax foo(). This can be implemented
using classical C/C++ functions, but it is preferable to implement it using C++ functors. These are
classical C++ classes which overload the () operator, which allows them to be used with the same syntax
as C/C++ functions.
Since change detectors operate on neighborhoods, the functor call will take two arguments which are
itk::ConstNeighborhoodIterators.
The change detector functor is templated over the types of the input iterators and the output result
type. The core of the change detection is implemented in the operator() method.
template <class TInput1, class TInput2, class TOutput>
class MyChangeDetector
{
public:
  // The constructor and destructor.
  MyChangeDetector() {}
  ~MyChangeDetector() {}
  // Change detection operation
  inline TOutput operator ()(const TInput1& itA,
                             const TInput2& itB)
  {
    TOutput result = 0.0;
    for (unsigned long pos = 0; pos < itA.Size(); ++pos)
      {
      result += static_cast<TOutput>(itA.GetPixel(pos) - itB.GetPixel(pos));
      }
    return static_cast<TOutput>(result / itA.Size());
  }
};
The interest of using functors is that complex operations can be performed using internal protected
class methods and that class variables can be used to store information, so that different pixel locations
can access the results of previous computations.
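For instance, the following sketch (an illustration only, not part of the OTB example) shows a functor
holding a weighting factor as a class variable; the same stored value is reused by the change detection
operation at every pixel location.

// Hedged sketch: a functor storing state (a weighting factor) in a class variable.
template <class TInput1, class TInput2, class TOutput>
class MyWeightedChangeDetector
{
public:
  MyWeightedChangeDetector() : m_Weight(1.0) {}
  ~MyWeightedChangeDetector() {}
  // The stored weight is shared by all calls to operator().
  void SetWeight(double w) { m_Weight = w; }
  inline TOutput operator ()(const TInput1& itA, const TInput2& itB)
  {
    TOutput result = 0.0;
    for (unsigned long pos = 0; pos < itA.Size(); ++pos)
      {
      result += static_cast<TOutput>(itA.GetPixel(pos) - itB.GetPixel(pos));
      }
    return static_cast<TOutput>(m_Weight * result / itA.Size());
  }
private:
  double m_Weight;
};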
The next step is the definition of the change detector filter. As stated above, this filter will inherit
from otb::BinaryFunctorNeighborhoodImageFilter, which is templated over the two input image
types, the output image type and the functor used to perform the change detection operation.
Inside the class, only a few typedefs and the constructors and destructors have to be declared.
template <class TInputImage1, class TInputImage2, class TOutputImage>
class ITK_EXPORT MyChangeDetectorImageFilter :
  public otb::BinaryFunctorNeighborhoodImageFilter<
    TInputImage1, TInputImage2, TOutputImage,
    MyChangeDetector<
      typename itk::ConstNeighborhoodIterator<TInputImage1>,
      typename itk::ConstNeighborhoodIterator<TInputImage2>,
      typename TOutputImage::PixelType> >
{
public:
  /** Standard class typedefs. */
  typedef MyChangeDetectorImageFilter Self;
  typedef typename otb::BinaryFunctorNeighborhoodImageFilter<
    TInputImage1, TInputImage2, TOutputImage,
    MyChangeDetector<
      typename itk::ConstNeighborhoodIterator<TInputImage1>,
      typename itk::ConstNeighborhoodIterator<TInputImage2>,
      typename TOutputImage::PixelType>
    > Superclass;
  typedef itk::SmartPointer<Self>       Pointer;
  typedef itk::SmartPointer<const Self> ConstPointer;

  /** Method for creation through the object factory. */
  itkNewMacro(Self);

protected:
  MyChangeDetectorImageFilter() {}
  virtual ~MyChangeDetectorImageFilter() {}

private:
  MyChangeDetectorImageFilter(const Self &); //purposely not implemented
  void operator =(const Self&);              //purposely not implemented
};
Pay attention to the fact that no .txx file is needed, since the filtering operation is implemented in the
otb::BinaryFunctorNeighborhoodImageFilter class. All the algorithmic work is therefore inside the
functor.
We can now write a program using the change detector.
As usual, we start by defining the image types. The internal computations will be performed with
floating point precision, while the output image will be stored using one byte per pixel.
typedef float         InternalPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image<InternalPixelType, Dimension> InputImageType1;
typedef otb::Image<InternalPixelType, Dimension> InputImageType2;
typedef otb::Image<InternalPixelType, Dimension> ChangeImageType;
typedef otb::Image<OutputPixelType, Dimension>   OutputImageType;
We declare the readers, the writer, but also the itk::RescaleIntensityImageFilter which will
be used to rescale the result before writing it to a file.
typedef otb::ImageFileReader<InputImageType1> ReaderType1;
typedef otb::ImageFileReader<InputImageType2> ReaderType2;
typedef otb::ImageFileWriter<OutputImageType> WriterType;
typedef itk::RescaleIntensityImageFilter<ChangeImageType,
                                         OutputImageType> RescalerType;
The next step is declaring the filter for the change detection.
typedef MyChangeDetectorImageFilter<InputImageType1, InputImageType2,
                                    ChangeImageType> FilterType;
We connect the pipeline.
reader1->SetFileName(inputFilename1);
reader2->SetFileName(inputFilename2);
writer->SetFileName(outputFilename);
rescaler->SetOutputMinimum(itk::NumericTraits<OutputPixelType>::min());
rescaler->SetOutputMaximum(itk::NumericTraits<OutputPixelType>::max());
filter->SetInput1(reader1->GetOutput());
filter->SetInput2(reader2->GetOutput());
filter->SetRadius(atoi(argv[3]));
rescaler->SetInput(filter->GetOutput());
writer->SetInput(rescaler->GetOutput());
And that is all.
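As for any OTB pipeline, nothing is actually computed until the last filter is updated. A call to the
writer's Update() method, possibly wrapped in a try/catch block, therefore triggers the whole
processing:

try
  {
  writer->Update();
  }
catch (itk::ExceptionObject& err)
  {
  std::cerr << "Exception caught: " << err << std::endl;
  return EXIT_FAILURE;
  }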
21.3 Simple Detectors
21.3.1 Mean Difference
The simplest change detector is based on the pixel-wise differencing of image values:
ID(i, j) = I2(i, j) − I1(i, j).    (21.1)
In order to make the algorithm robust to noise, one actually uses local means instead of pixel values.
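In terms of the framework of section 21.2, this simply means that the functor first replaces each pixel
value by the mean over its neighborhood before taking the difference of equation (21.1). A minimal
sketch of such a functor could look as follows; this is only an illustration, and the actual
implementation of otb::MeanDifferenceImageFilter may differ in its details.

// Hedged sketch: difference of local means over the two neighborhoods.
template <class TInput1, class TInput2, class TOutput>
class LocalMeanDifference
{
public:
  inline TOutput operator ()(const TInput1& itA, const TInput2& itB)
  {
    double meanA = 0.0;
    double meanB = 0.0;
    for (unsigned long pos = 0; pos < itA.Size(); ++pos)
      {
      meanA += static_cast<double>(itA.GetPixel(pos));
      meanB += static_cast<double>(itB.GetPixel(pos));
      }
    meanA /= itA.Size();
    meanB /= itB.Size();
    return static_cast<TOutput>(meanB - meanA);  // I2 - I1, as in (21.1)
  }
};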
The source code for this example can be found in the file
Examples/ChangeDetection/DiffChDet.cxx.
This example illustrates the class otb::MeanDifferenceImageFilter for detecting changes between
pairs of images. This filter computes the mean intensity in the neighborhood of each pixel
of the pair of images to be compared and uses the difference of means as a change indicator. This
example will use the images shown in figure 21.1. These correspond to the near infrared band of
two Spot acquisitions before and during a flood.
We start by including the corresponding header file.
#include "otbMeanDifferenceImageFilter.h"
Figure 21.1: Images used for the change detection. Left: before the flood. Right: during the flood.
We start by declaring the types for the two input images, the change image and the image to be
stored in a file for visualization.
typedef float         InternalPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image<InternalPixelType, Dimension> InputImageType1;
typedef otb::Image<InternalPixelType, Dimension> InputImageType2;
typedef otb::Image<InternalPixelType, Dimension> ChangeImageType;
typedef otb::Image<OutputPixelType, Dimension>   OutputImageType;
We can now declare the types for the readers and the writer.
typedef otb::ImageFileReader<InputImageType1> ReaderType1;
typedef otb::ImageFileReader<InputImageType2> ReaderType2;
typedef otb::ImageFileWriter<OutputImageType> WriterType;
The change detector will give positive and negative values depending on the sign of the difference.
We are usually interested only in the absolute value of the difference. For this purpose, we will use
the itk::AbsImageFilter. Also, before saving the image to a file in, for instance, PNG format,
we will rescale the results of the change detection in order to use the full range of values of the
output pixel type.
typedef itk::AbsImageFilter<ChangeImageType,
                            ChangeImageType> AbsType;
typedef itk::RescaleIntensityImageFilter<ChangeImageType,
                                         OutputImageType> RescalerType;
The otb::MeanDifferenceImageFilter is templated over the types of the two input images and
the type of the generated change image.
typedef otb::MeanDifferenceImageFilter<
    InputImageType1,
    InputImageType2,
    ChangeImageType> FilterType;
The different elements of the pipeline can now be instantiated.
ReaderType1::Pointer  reader1   = ReaderType1::New();
ReaderType2::Pointer  reader2   = ReaderType2::New();
WriterType::Pointer   writer    = WriterType::New();
FilterType::Pointer   filter    = FilterType::New();
AbsType::Pointer      absFilter = AbsType::New();
RescalerType::Pointer rescaler  = RescalerType::New();
We set the parameters of the different elements of the pipeline.
reader1->SetFileName(inputFilename1);
reader2->SetFileName(inputFilename2);
writer->SetFileName(outputFilename);
rescaler->SetOutputMinimum(itk::NumericTraits<OutputPixelType>::min());
rescaler->SetOutputMaximum(itk::NumericTraits<OutputPixelType>::max());
The only parameter for this change detector is the radius of the window used for computing the
mean of the intensities.
filter->SetRadius(atoi(argv[4]));
We build the pipeline by plugging all the elements together.
filter->SetInput1(reader1->GetOutput());
filter->SetInput2(reader2->GetOutput());
absFilter->SetInput(filter->GetOutput());
rescaler->SetInput(absFilter->GetOutput());
writer->SetInput(rescaler->GetOutput());
Since the processing time for large images can be long, it is useful to monitor the evolution of the
computation. In order to do so, the change detectors can use the command/observer design pattern.
This is easily done by attaching an observer to the filter.
typedef otb::CommandProgressUpdate<FilterType> CommandType;
CommandType::Pointer observer = CommandType::New();
filter->AddObserver(itk::ProgressEvent(), observer);
Figure 21.2 shows the result of the change detection by difference of local means.
Figure 21.2: Result of the mean difference change detector
21.3.2 Ratio Of Means
This detector is similar to the previous one except that it uses a ratio instead of the difference:
IR(i, j) = I2(i, j) / I1(i, j).    (21.2)
The use of the ratio makes this detector robust to multiplicative noise, which is a good model for the
speckle phenomenon present in radar images.
In order to have a bounded and normalized detector the following expression is actually used:
IR(i, j) = 1 − min( I2(i, j) / I1(i, j), I1(i, j) / I2(i, j) ).    (21.3)
The source code for this example can be found in the file
Examples/ChangeDetection/RatioChDet.cxx.
This example illustrates the class otb::MeanRatioImageFilter for detecting changes between
pairs of images. This filter computes the mean intensity in the neighborhood of each pixel of the
pair of images to be compared and uses the ratio of means as a change indicator. This change
indicator is then normalized between 0 and 1 by using the classical
r = 1 − min{ µA / µB, µB / µA },    (21.4)
Figure 21.3: Images used for the change detection. Left: before the eruption. Right: after the eruption.
where µA and µB are the local means. This example will use the images shown in figure 21.3.
These correspond to two Radarsat fine mode acquisitions before and after a lava flow resulting from a
volcanic eruption.
We start by including the corresponding header file.
#include "otbMeanRatioImageFilter.h"
We start by declaring the types for the two input images, the change image and the image to be
stored in a file for visualization.
typedef float         InternalPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image<InternalPixelType, Dimension> InputImageType1;
typedef otb::Image<InternalPixelType, Dimension> InputImageType2;
typedef otb::Image<InternalPixelType, Dimension> ChangeImageType;
typedef otb::Image<OutputPixelType, Dimension>   OutputImageType;
We can now declare the types for the readers. Since the images can be very large, we will force the
pipeline to use streaming. For this purpose, the file writer will be streamed. This is achieved by
using the otb::ImageFileWriter class.
typedef otb::ImageFileReader<InputImageType1> ReaderType1;
typedef otb::ImageFileReader<InputImageType2> ReaderType2;
typedef otb::ImageFileWriter<OutputImageType> WriterType;
The change detector will give a normalized result between 0 and 1. In order to store the result in
PNG format we will rescale the results of the change detection in order to use the full range of
values of the output pixel type.
typedef itk::ShiftScaleImageFilter<ChangeImageType,
                                   OutputImageType> RescalerType;
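Since the detector output lies in [0, 1], the rescaling amounts to a multiplication by the maximum
value of the output pixel type. As a sketch (the actual parameter setting is not shown in this
extract), once the rescaler has been instantiated this could be written:

// Hedged sketch: map the [0, 1] change indicator to the full output range.
rescaler->SetShift(0.0);
rescaler->SetScale(itk::NumericTraits<OutputPixelType>::max());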
The otb::MeanRatioImageFilter is templated over the types of the two input images and the
type of the generated change image.
typedef otb::MeanRatioImageFilter<
    InputImageType1,
    InputImageType2,
    ChangeImageType> FilterType;

Figure 21.4: Result of the ratio of means change detector
The different elements of the pipeline can now be instantiated.
ReaderType1::Pointer reader1 = ReaderType1::New();
ReaderType2::Pointer reader2 = ReaderType2::New();
WriterType::Pointer  writer  = WriterType::New();
FilterType::Pointer  filter  = FilterType::New();
  • 18. xvi CONTENTS 21.3.2 Ratio Of Means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539 21.4 Statistical Detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542 21.4.1 Distance between local distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 542 21.4.2 Local Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544 21.5 Multi-Scale Detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547 21.5.1 Kullback-Leibler Distance between distributions . . . . . . . . . . . . . . . . . . . 547 21.6 Multi-components detectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 21.6.1 Multivariate Alteration Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 22 Geospatial analysis 553 22.1 Reading from and Writing to Geospatial DBs . . . . . . . . . . . . . . . . . . . . . . . . . . 553 23 Hyperspectral 555 23.1 Unmixing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556 23.1.1 Linear mixing model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556 23.1.2 Simplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558 23.1.3 State of the art unmixing algorithms selection . . . . . . . . . . . . . . . . . . . . . 559 Family 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559 Family 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559 Family 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561 Further remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561 Basic hyperspectral unmixing example . . . . . . . . . . . . . . . . . . . . . . . . . 562 23.2 Dimensionality reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563 23.3 Anomaly detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564 24 Image Visualization and output 569 24.1 Viewer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569 24.2 Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570 24.2.1 Grey Level Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570 24.2.2 Multiband Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572 24.2.3 Indexed Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573 24.2.4 Altitude Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
  • 19. CONTENTS xvii 25 Online data 579 25.1 Name to Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579 25.2 Open Street Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580 IV Developer’s guide 585 26 Iterators 587 26.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587 26.2 Programming Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588 26.2.1 Creating Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588 26.2.2 Moving Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588 26.2.3 Accessing Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590 26.2.4 Iteration Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591 26.3 Image Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592 26.3.1 ImageRegionIterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592 26.3.2 ImageRegionIteratorWithIndex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594 26.3.3 ImageLinearIteratorWithIndex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596 26.4 Neighborhood Iterators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598 26.4.1 NeighborhoodIterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604 Basic neighborhood techniques: edge detection . . . . . . . . . . . . . . . . . . . . . 604 Convolution filtering: Sobel operator . . . . . . . . . . . . . . . . . . . . . . . . . . 607 Optimizing iteration speed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608 Separable convolution: Gaussian filtering . . . . . . . . . . . . . . . . . . . . . . . . 610 Random access iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611 26.4.2 ShapedNeighborhoodIterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613 Shaped neighborhoods: morphological operations . . . . . . . . . . . . . . . . . . . . 615 27 Image Adaptors 619 27.1 Image Casting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620 27.2 Adapting RGB Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622 27.3 Adapting Vector Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624 27.4 Adaptors for Simple Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
  • 20. xviii CONTENTS 27.5 Adaptors and Writers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628 28 Streaming and Threading 629 28.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629 28.2 Streaming and threading in OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629 28.3 Division strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630 29 How To Write A Filter 631 29.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631 29.2 Overview of Filter Creation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632 29.3 Streaming Large Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633 29.3.1 Overview of Pipeline Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634 29.3.2 Details of Pipeline Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636 UpdateOutputInformation() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 636 PropagateRequestedRegion() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 637 UpdateOutputData() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638 29.4 Threaded Filter Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638 29.5 Filter Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 639 29.5.1 Optional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640 29.5.2 Useful Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640 29.6 How To Write A Composite Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641 29.6.1 Implementing a Composite Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641 29.6.2 A Simple Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642 30 Persistent filters 647 30.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647 30.2 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647 30.2.1 The persistent filter class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648 30.2.2 The streaming decorator class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648 30.3 An end-to-end example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649 30.3.1 First step: writing a persistent filter . . . . . . . . . . . . . . . . . . . . . . . . . . . 649 30.3.2 Second step: Decorating the filter and using it . . . . . . . . . . . . . . . . . . . . . 651 30.3.3 Third step: one class to rule them all . . . . . . . . . . . . . . . . . . . . . . . . . . 651
  • 21. CONTENTS xix 31 How to write an application 653 31.1 Application design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653 31.2 Architecture of the class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654 31.2.1 DoInit() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654 31.2.2 DoUpdateParameters() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654 31.2.3 DoExecute() . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654 31.2.4 Parameters selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655 31.2.5 Parameters description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656 31.3 Compile your application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 656 31.4 Execute your application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657 31.5 Testing your application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657 31.6 Application Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657 V Appendix 661 32 Frequently Asked Questions 663 32.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663 32.1.1 What is OTB? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663 32.1.2 What is ORFEO? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 664 Where can I get more information about ORFEO? . . . . . . . . . . . . . . . . . . . 664 32.1.3 What is the ORFEO Accompaniment Program? . . . . . . . . . . . . . . . . . . . . 664 Where can I get more information about the ORFEO Accompaniment Program? . . . 665 32.1.4 Who is responsible for the OTB development? . . . . . . . . . . . . . . . . . . . . . 665 32.2 Licence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665 32.2.1 Which is the OTB licence? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 665 32.2.2 If I write an application using OTB am I forced to distribute that application? . . . . 666 32.2.3 If I write an application using OTB am I forced to contribute the code into the offical repositories?666 32.2.4 If I wanted to distribute an application using OTB what license would I need to use? 666 32.2.5 I am a commercial user. Is there any restriction on the use of OTB? . . . . . . . . . . 666 32.3 Getting OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666 32.3.1 Who can download the OTB? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666 32.3.2 Where can I download the OTB? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
  • 22. xx CONTENTS 32.3.3 How to get the latest bleeding-edge version? . . . . . . . . . . . . . . . . . . . . . . 667 32.4 Compiling and installing OTB from source . . . . . . . . . . . . . . . . . . . . . . . . . . . 667 32.4.1 Which platforms are supported? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667 32.4.2 Which libraries/packages are needed before compiling and installing OTB? . . . . . 667 32.4.3 Main steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669 Unix/Linux Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669 Microsoft Visual Studio (Express 2008/Express 2010) . . . . . . . . . . . . . . . . . 671 32.4.4 Specific platform issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671 Visual Studio 2010/Express 2010 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671 MacOSX 10.6 Snow Leopard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671 Debian Linux / Ubuntu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671 32.5 Using OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672 32.5.1 Where to start ? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 672 32.5.2 What is the image size limitation of OTB ? . . . . . . . . . . . . . . . . . . . . . . 672 32.6 Getting help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673 32.6.1 Is there any mailing list? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673 32.6.2 Which is the main source of documentation? . . . . . . . . . . . . . . . . . . . . . . 673 32.7 Contributing to OTB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673 32.7.1 I want to contribute to OTB, where to begin? . . . . . . . . . . . . . . . . . . . . . 673 32.7.2 What are the benefits of contributing to OTB? . . . . . . . . . . . . . . . . . . . . . 674 32.7.3 What functionality can I contribute? . . . . . . . . . . . . . . . . . . . . . . . . . . 674 32.8 Running the tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674 32.8.1 What are the tests? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 674 32.8.2 How to run the tests? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675 32.8.3 How to get the test data? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675 32.8.4 How to submit the results? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676 32.9 OTB’s Roadmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676 32.9.1 Which will be the next version of OTB? . . . . . . . . . . . . . . . . . . . . . . . . 676 What is a major version? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676 What is a minor version? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676 What is a bugfix version? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 677
  • 23. CONTENTS xxi 32.9.2 When will the next version of OTB be available? . . . . . . . . . . . . . . . . . . . 677 32.9.3 What features will the OTB include and when? . . . . . . . . . . . . . . . . . . . . 677 33 Release Notes 679 34 Wrappings to other languages 733 34.1 OTB-Wrapping: bindings to Java language . . . . . . . . . . . . . . . . . . . . . . . . . . . 733 34.1.1 Mangling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733 34.1.2 How to Use OTB-Wrapping in Java . . . . . . . . . . . . . . . . . . . . . . . . . . 734 Import OTB classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734 Compile Java programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734 Java programs execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 735 34.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736 34.1.4 Use OTB-Wrapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737 Download sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737 Required Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737 Compile OTB-Wrapping sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737 Download binaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738 Use OTB-Wrapping with Eclipse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738 34.2 Java tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738 34.2.1 Hello World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738 34.2.2 Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739 34.2.3 Filtering pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739 34.2.4 Smarter Filtering Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 740 34.2.5 Scaling Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 742 34.2.6 MultiSpectral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743 34.2.7 OrthoFusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 745 34.3 Python tutorials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748 34.3.1 Hello World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748 34.3.2 Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 749 34.3.3 Filtering pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 749 34.3.4 Smarter Filtering Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750
  • 24. xxii CONTENTS 34.3.5 Scaling Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 752 34.3.6 MultiSpectral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 753 34.4 Developer Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756 34.4.1 Add a new Library : Creating a new CMakeList.txt file . . . . . . . . . . . . . . . . 756 34.4.2 Add a new cmake file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756 A simple Example : otb::StreamingShrinkImageFilter . . . . . . . . . . . . . . . . . 757 34.4.3 Predefined Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757 OTB-Wrapping predefined variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 758 OTB-Wrapping predefined lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 758 Macro MANGLE NAME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760 34.4.4 HTML JavaDoc generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760 Generate JavaDoc documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760 Generate JavaDoc while compilation . . . . . . . . . . . . . . . . . . . . . . . . . . 761 35 Contributors 763 Index 777
  • 25. LIST OF FIGURES 2.1 Cmake user interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 5.1 OTB Image Geometrical Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 5.2 PointSet with Vectors as PixelType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 6.1 Collaboration diagram of the ImageIO classes . . . . . . . . . . . . . . . . . . . . . . . . . . 101 6.2 Use cases of ImageIO factories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 6.3 Class diagram of ImageIO factories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 6.4 Initial SPOT 5 image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 6.5 ROI of a SPOT5 image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 7.1 DEM To Image generator Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 7.2 Image from lidar data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 8.1 BinaryThresholdImageFilter transfer function . . . . . . . . . . . . . . . . . . . . . . . . . . 140 8.2 BinaryThresholdImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 8.3 ThresholdImageFilter using the threshold-below mode. . . . . . . . . . . . . . . . . . . . . . 142 8.4 ThresholdImageFilter using the threshold-above mode . . . . . . . . . . . . . . . . . . . . . 143 8.5 ThresholdImageFilter using the threshold-outside mode . . . . . . . . . . . . . . . . . . . . . 143 8.6 Band Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149 8.7 GradientMagnitudeImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
  • 26. xxiv List of Figures 8.8 GradientMagnitudeRecursiveGaussianImageFilter output . . . . . . . . . . . . . . . . . . . . 153 8.9 Effect of the Derivative filter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 8.10 Output of the RecursiveGaussianImageFilter. . . . . . . . . . . . . . . . . . . . . . . . . . . 159 8.11 Output of the LaplacianRecursiveGaussianImageFilter. . . . . . . . . . . . . . . . . . . . . . 160 8.12 CannyEdgeDetectorImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 8.13 Touzi Edge Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 8.14 Effect of the MedianImageFilter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 8.15 Effect of the Median filter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 8.16 Effect of erosion and dilation in a binary image. . . . . . . . . . . . . . . . . . . . . . . . . . 170 8.17 Effect of erosion and dilation in a grayscale image. . . . . . . . . . . . . . . . . . . . . . . . 172 8.18 DiscreteGaussianImageFilter Gaussian diagram. . . . . . . . . . . . . . . . . . . . . . . . . . 173 8.19 DiscreteGaussianImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 8.20 GradientAnisotropicDiffusionImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . 178 8.21 Mean Shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 8.22 Lee Filter Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 8.23 Frost Filter Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 8.24 MRF restauration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 8.25 DanielssonDistanceMapImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 8.26 Rasterized SRTM water bodies near Arcachon, France. . . . . . . . . . . . . . . . . . . . . . 190 9.1 Image Registration Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 9.2 Registration Framework Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 9.3 Fixed and Moving images in registration framework . . . . . . . . . . . . . . . . . . . . . . . 197 9.4 HelloWorld registration output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 9.5 Pipeline structure of the registration example . . . . . . . . . . . . . . . . . . . . . . . . . . 199 9.6 Registration Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 9.7 Multi-Modality Registration Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207 9.8 Multi-Modality Registration outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 9.9 Rigid2D Registration input images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 9.10 Rigid2D Registration output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212 9.11 Rigid2D Registration input images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 9.12 Rigid2D Registration output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
  • 27. List of Figures xxv 9.13 AffineTransform registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218 9.14 AffineTransform output images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 9.15 Geometrical representation objects in ITK . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 9.16 Parzen Windowing in Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 9.17 Class diagram of the Optimizer hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 9.18 Estimation of affine transformation using least square optimisation from SIFT points . . . . . 253 10.1 Estimation of the correlation surface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260 10.2 Deformation field and resampling from fine registration . . . . . . . . . . . . . . . . . . . . . 263 10.3 Deformation field and resampling from disparity map estimation . . . . . . . . . . . . . . . . 270 10.4 From stereo pair to elevation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276 11.1 Image Ortho-registration Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277 12.1 ARVI Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305 12.2 ARVI Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307 12.3 AVI Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310 13.1 Simple pan-sharpening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320 13.2 Pan sharpening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321 13.3 Bayesian Data Fusion Example inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324 13.4 Bayesian Data Fusion results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325 14.1 Results of applying Haralick contrast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330 14.2 PanTex Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331 14.3 Right Angle Detection Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334 14.4 Harris Filter Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 14.5 SIFT Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340 14.6 SURF Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344 14.7 Alignment Detection Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346 14.8 Line Ratio Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348 14.9 Line Correlation Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351 14.10Line Correlation Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
  • 28. xxvi List of Figures 14.11Line Correlation Detector Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354 14.12LSD Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356 14.13Right Angle Detection Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358 14.14Edge Density Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360 14.15SIFT Density Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362 14.16Road extraction filter application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 14.17Spectral Angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372 14.18Road extraction filter application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376 14.19Road extraction filter application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376 14.20Cloud Detection Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379 15.1 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385 15.2 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385 15.3 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385 15.4 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385 15.5 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386 15.6 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386 15.7 Morphological pyramid analysis and synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . 389 15.8 Morphological pyramid analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391 16.1 ConnectedThreshold segmentation results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398 16.2 OtsuThresholdImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400 16.3 OtsuThresholdImageFilter output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402 16.4 NeighborhoodConnectedThreshold segmentation results . . . . . . . . . . . . . . . . . . . . 405 16.5 ConfidenceConnected segmentation results . . . . . . . . . . . . . . . . . . . . . . . . . . . 409 16.6 Watershed Catchment Basins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410 16.7 Watersheds Hierarchy of Regions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411 16.8 Watersheds filter composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411 16.9 Watershed segmentation output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414 16.10Zero Set Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415 16.11Grid position of the embedded level-set surface. . . . . . . . . . . . . . . . . . . . . . . . . . 416 16.12FastMarchingImageFilter collaboration diagram . . . . . . . . . . . . . . . . . . . . . . . . . 417
  • 29. List of Figures xxvii 16.13FastMarchingImageFilter intermediate output . . . . . . . . . . . . . . . . . . . . . . . . . . 423 16.14FastMarchingImageFilter segmentations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424 17.1 LAIFromNDVIImageTransform Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430 18.1 PCA Filter (forward trasnformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445 18.2 PCA Filter (forward trasnformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448 18.3 PCA Filter (forward trasnformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450 18.4 PCA Filter (forward trasnformation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453 18.5 Maximum Autocorrelation Factor results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454 19.1 Output of the KMeans classifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458 19.2 Two normal distributions plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463 19.3 Kohonen’s Self Organizing Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466 19.4 SOM Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471 19.5 SOM Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474 19.6 Bayesian plug-in classifier for two Gaussian classes . . . . . . . . . . . . . . . . . . . . . . . 476 19.7 SEM Classification results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487 19.8 Output of the ScalarImageMarkovRandomField . . . . . . . . . . . . . . . . . . . . . . . . . 492 19.9 OTB Markov Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493 19.10MRF restauration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496 19.11MRF restauration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497 19.12MRF restauration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500 19.13MRF restauration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501 20.1 Image to Label Object Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523 20.2 Object based extraction based on . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527 21.1 Spot Images for Change Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537 21.2 Difference Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539 21.3 Radarsat Images for Change Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540 21.4 Ratio Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 21.5 Kullback-Leibler Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
  • 30. xxviii List of Figures 21.6 ERS Images for Change Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545 21.7 Correlation Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547 21.8 Kullback-Leibler profile Change Detection Results . . . . . . . . . . . . . . . . . . . . . . . 549 21.9 Multivariate Alteration Detection Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551 23.1 Hyperspectral cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555 23.2 Linear mixing model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556 23.3 Decomposition of the LMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556 23.4 Hyperspectral cube vectorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557 23.5 Simplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558 23.6 Unmixing Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564 23.7 Concept of detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565 23.8 Anomaly detection block diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 566 23.9 Sliding window and parameters definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567 24.1 Image visualization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571 24.2 Scaling images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572 24.3 Scaling images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574 24.4 Scaling images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575 24.5 Grayscale to color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576 24.6 Hill shading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577 25.1 Open street map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583 26.1 ITK image iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589 26.2 Copying an image subregion using ImageRegionIterator . . . . . . . . . . . . . . . . . . . . 595 26.3 Using the ImageRegionIteratorWithIndex . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596 26.4 Neighborhood iterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 599 26.5 Some possible neighborhood iterator shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . 600 26.6 Sobel edge detection results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607 26.7 Gaussian blurring by convolution filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 612 26.8 Finding local minima . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613 26.9 Binary image morphology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618
  • 31. List of Figures xxix 27.1 ImageAdaptor concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620 27.2 Image Adaptor for performing computations . . . . . . . . . . . . . . . . . . . . . . . . . . . 627 29.1 Relationship between DataObjects and ProcessObjects . . . . . . . . . . . . . . . . . . . . . 632 29.2 The Data Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634 29.3 Sequence of the Data Pipeline updating mechanism . . . . . . . . . . . . . . . . . . . . . . . 635 29.4 Composite Filter Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641 29.5 Composite Filter Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
  • 33. LIST OF TABLES 2.1 Available installation procedures with respect to system configuration and target usage . . . . 12 9.1 Geometrical Elementary Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 9.2 Identity Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 9.3 Translation Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 9.4 Scale Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 9.5 Scale Logarithmic Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . 226 9.6 Euler2D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 9.7 CenteredRigid2D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . 228 9.8 Similarity2D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 9.9 QuaternionRigid Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 9.10 Versor Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 9.11 Versor Rigid3D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232 9.12 Euler3D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 9.13 Similarity3D Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234 9.14 Rigid3DPerspective Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . 235 9.15 Affine Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236 9.16 BSpline Deformable Transform Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . 237 10.1 Characterization of the geometric deformation sources . . . . . . . . . . . . . . . . . . . . . 257 10.2 Approaches to image registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
  • 34. xxxii List of Tables 12.1 Vegetation indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302 12.2 Soil indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302 12.3 Water indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303 12.4 Built-up indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303 14.1 Haralick features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328 16.1 ConnectedThreshold example parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398 16.2 NeighborhoodConnectedThreshold example parameters . . . . . . . . . . . . . . . . . . . . . 405 16.3 ConnectedThreshold example parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409 16.4 FastMarching segmentation example parameters . . . . . . . . . . . . . . . . . . . . . . . . . 422 32.1 Libraries used in the OTB. In the column Where, the default behavior during the configuration of OTB is indicated by when the 34.1 Mangling of the basic types used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734 34.2 Mangling of the usually types used. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734 34.3 C++ code translation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 736
  • 37. CHAPTER ONE WELCOME Welcome to the ORFEO ToolBox (OTB) Software Guide. This document presents the essential concepts used in OTB. It will guide you through the road of learning and using OTB. The Doxygen documentation for the OTB application programming interface is available on line at http://orfeo-toolbox.sourceforge.net/Doxygen/html. 1.1 Organization This software guide is divided into several parts, each of which is further divided into several chap- ters. Part I is a general introduction to OTB, with—in the next chapter—a description of how to install the ORFEO Toolbox on your computer. Part I also introduces basic system concepts such as an overview of the system architecture, and how to build applications in the C++ programming language. Part II is a short guide with gradual difficulty to get you start programming with OTB. Part III describes the system from the user point of view. Dozens of examples are used to illustrate important system features. Part IV is for the OTB developer. It explains how to create your own classes and extend the system. 1.2 How to Learn OTB There are two broad categories of users of OTB. First are class developers, those who create classes in C++. The second, users, employ existing C++ classes to build applications. Class developers must be proficient in C++, and if they are extending or modifying OTB, they must also be familiar with OTB’s internal structures and design (material covered in Part IV). The key to learning how to use OTB is to become familiar with its palette of objects and the ways of combining them. We recommend that you learn the system by studying the examples and then, if you are a class developer, study the source code. Start by the first few tutorials in Part II to get
  • 38. 4 Chapter 1. Welcome familiar with the build process and the general program organization, follow by reading Chapter 3, which provides an overview of some of the key concepts in the system, and then review the exam- ples in Part III. You may also wish to compile and run the dozens of examples distributed with the source code found in the directory OTB/Examples. (Please see the file OTB/Examples/README.txt for a description of the examples contained in the various subdirectories.) There are also several hundreds of tests found in the source distribution in OTB/Testing/Code, most of which are mini- mally documented testing code. However, they may be useful to see how classes are used together in OTB, especially since they are designed to exercise as much of the functionality of each class as possible. 1.3 Software Organization The following sections describe the directory contents, summarize the software functionality in each directory, and locate the documentation and data. 1.3.1 Obtaining the Software Periodic releases of the software are available on the OTB Web site. These official releases are available a few times a year and announced on the ORFEO Web pages and mailing lists. This software guide assumes that you are working with the latest official OTB release (available on the OTB Web site). 1.4 Downloading OTB OTB can be downloaded without cost from the following web site: http://www.orfeo-toolbox.org/ In order to track the kind of applications for which OTB is being used, you will be asked to complete a form prior to downloading the software. The information you provide in this form will help developers to get a better idea of the interests and skills of the toolkit users. Once you fill out this form you will have access to the download page. This page can be book marked to facilitate subsequent visits to the download site without having to complete any form again. Then choose the tarball that better fits your system. The options are .zip and .tgz files. The first type is better suited for MS-Windows while the second one is the preferred format for UNIX systems.
  • 39. 1.4. Downloading OTB 5 Once you unzip or untar the file, a directory called OTB will be created in your disk and you will be ready for starting the configuration process described in Section 2.2.3 on page 18. You can also get the current version following instructions in Section 32.3.3, on page 667. 1.4.1 Join the Mailing List It is strongly recommended that you join the users mailing list. This is one of the primary resources for guidance and help regarding the use of the toolkit. You can subscribe to the users list online at http://groups.google.com/group/otb-users The otb-users mailing list is also the best mechanism for expressing your opinions about the toolbox and to let developers know about features that you find useful, desirable or even unnecessary. OTB developers are committed to creating a self-sustaining open-source OTB community. Feedback from users is fundamental to achieving this goal. 1.4.2 Directory Structure To begin your OTB odyssey, you will first need to know something about OTB’s software organiza- tion and directory structure. It is helpful to know enough to navigate through the code base to find examples, code, and documentation. OTB is organized into several different modules. There are three: the OTB, OTB-Documents and OTB-Wrapping modules. The source code, examples and applications are found in the OTB module; documentation, tutorials, and material related to the design and marketing of OTB are found in OTB-Documents; and resources to wrap OTB classes into other languages (such as Java or Python) are available from OTB-Wrapping. Usually you will work with the OTB module unless you are a developer, are teaching a course, or are looking at the details of various design documents. The OTB module contains the following subdirectories: • OTB/Code—the heart of the software; the location of the majority of the source code. • OTB/Applications—a set of applications modules that can be launched in different ways (command-line, graphical interface, Python/Java), refer to the OTB Cookbook for more infor- mation • OTB/CMake—internal files used during the configuration process • OTB/Copyright—the copyright information of OTB and all the dependencies included in the OTB source tree • OTB/Examples—a suite of simple, well-documented examples used by this guide and to il- lustrate important OTB concepts.
  • 40. 6 Chapter 1. Welcome • OTB/Testing—a large number of small programs used to test OTB. These examples tend to be minimally documented but may be useful to demonstrate various system concepts. • OTB/Utilities—supporting software for the OTB source code. For example, libraries such as ITK.GDAL. The source code directory structure—found in OTB/Code—is important to understand since other directory structures (such as the Testing directory) shadow the structure of the OTB/Code directory. • OTB/Code/ApplicationEngine—the core library for building applications based on OTB • OTB/Code/Common—core classes, macro definitions, typedefs, and other software constructs central to OTB. • OTB/Code/BasicFilters—basic image processing filters. • OTB/Code/Fusion—image fusion algorithms, as for instance, pansharpening. • OTB/Code/FeatureExtraction—the location of many feature extraction algorithms. • OTB/Code/ChangeDetection—a set of remote sensing image change detection algorithms. • OTB/Code/DisparityMap—tools for estimating disparities – deformations – between im- ages. • OTB/Code/Fuzzy—fuzzy logic based algorithms, with Dempster-Shaffer theory related classes • OTB/Code/GeospatialAnalysis—classes allowing to connect to geospatial database like PostGIS. • OTB/Code/Gui—very basic widgets for building graphical user interfaces, such as progress bars for filters, etc. • OTB/Code/Hyperspectral—hyperspectral images analysis • OTB/Code/IO—classes that support the reading and writing of data. • OTB/Code/Learning—several functionnalities for supervised learning and classification. • OTB/Code/Markov—implementation of Markov Random Fields regularization and segmen- tation. • OTB/Code/MultiScale—a set of functionalities for multiscale image analysis and synthesis. • OTB/Code/MultiTemporal—time series interpolation related algorithms • OTB/Code/OBIA—Object Based Image Analysis filters and data structures • OTB/Code/ObjectDetection—Object detection chain based on local feature extraction
  • 41. 1.4. Downloading OTB 7 • OTB/Code/Projections—classes allowing to deal with sensor models and cartographic pro- jections. • OTB/Code/Radiometry—classes allowing to compute vegetation indices and radiometric corrections. • OTB/Code/SARPolarimetry—some add-ons for SAR polarimetry synthesis and analysis. • OTB/Code/Segmentation—several functionnalities for image segmentation. • OTB/Code/Simulation—Sensor simulator • OTB/Code/SpatialReasoning—several functionnalities high level image analysis using spatial reasoning techniques. • OTB/Code/Testing—internal classes used in the used in the testing framework • OTB/Code/Visualization—utilities for simple image visualization. • OTB/Code/Wrappers—wrappers of applications in several access points (command-line, QT Gui, SWIG...). The OTB-Documents module contains the following subdirectories: • OTB-Documents/CourseWare—material related to teaching OTB. • OTB-Documents/Latex—LATEX styles to produce this work as well as other documents. • OTB-Documents/SoftwareGuide—LATEX files used to create this guide. (Note that the code found in OTB/Examples is used in conjunction with these LATEX files.) 1.4.3 Documentation Besides this text, there are other documentation resources that you should be aware of. Doxygen Documentation. The Doxygen documentation is an essential resource when working with OTB. These extensive Web pages describe in detail every class and method in the system. The documentation also contains inheritance and collaboration diagrams, listing of event invocations, and data members. The documentation is heavily hyper-linked to other classes and to the source code. The Doxygen documentation is available on-line at http://www.orfeo-toolbox.org/doxygen/. Header Files. Each OTB class is implemented with a .h and .cxx/.txx file (.txx file for templated classes). All methods found in the .h header files are documented and provide a quick way to find documentation for a particular method. (Indeed, Doxygen uses the header documentation to produces its output.)
  • 42. 8 Chapter 1. Welcome 1.4.4 Data The OTB Toolkit was designed to support the ORFEO Acompaniment Program and its associated data. This data is available at http://smsc.cnes.fr/PLEIADES/index.htm. 1.5 The OTB Community and Support OTB was created from its inception as a collaborative, community effort. Research, teaching, and commercial uses of the toolkit are expected. If you would like to participate in the community, there are a number of possibilities. • Users may actively report bugs, defects in the system API, and/or submit feature requests. Currently the best way to do this is through the OTB users mailing list. • Developers may contribute classes or improve existing classes. If you are a developer, you may request permission to join the OTB developers mailing list. Please do so by sending email to otb “at” cnes.fr. To become a developer you need to demonstrate both a level of competence as well as trustworthiness. You may wish to begin by submitting fixes to the OTB users mailing list. • Research partnerships with members of the ORFEO Acompaniment Program are encouraged. CNES will encourage the use of OTB in proposed work and research projects. • Educators may wish to use OTB in courses. Materials are being developed for this purpose, e.g., a one-day, conference course and semester-long graduate courses. Watch the OTB web pages or check in the OTB-Documents/CourseWare directory for more information. 1.6 A Brief History of OTB Beside the Pleiades (PHR) and Cosmo-Skymed (CSK) systems developments forming ORFEO, the dual and bilateral system (France - Italy) for Earth Observation, the ORFEO Accompaniment Program was set up, to prepare, accompany and promote the use and the exploitation of the images derived from these sensors. The creation of a preparatory program1 is needed because of : • the new capabilities and performances of the ORFEO systems (optical and radar high resolu- tion, access capability, data quality, possibility to acquire simultaneously in optic and radar), • the implied need of new methodological developments : new processing methods, or adapta- tion of existing methods, 1http://smsc.cnes.fr/PLEIADES/A prog accomp.htm
• the need to realise those new developments in very close cooperation with the final users for better integration of new products in their systems.

This program was initiated by CNES mid-2003 and will last until 2010 at least. It consists of two parts, between which it is necessary to keep a strong interaction:

• A Thematic part,
• A Methodological part.

The Thematic part covers a large range of applications (civil and defence ones), and aims at specifying and validating value added products and services required by end users. This part includes consideration about products integration in the operational systems or processing lines. It also includes a careful thought on intermediary structures to be developed to help non-autonomous users. Lastly, this part aims at raising future users awareness, through practical demonstrations and validations.

The Methodological part objective is the definition and the development of tools for the operational exploitation of the future submetric optic and radar images (tridimensional aspects, change detection, texture analysis, pattern matching, optic radar complementarities). It is mainly based on R&D studies and doctorate and post-doctorate research.

In this context, CNES2 decided to develop the ORFEO ToolBox (OTB), a set of algorithms encapsulated in a software library. The goal of the OTB is to capitalise a methodological savoir-faire in order to adopt an incremental development approach aiming to efficiently exploit the results obtained in the frame of methodological R&D studies.

All the developments are based on FLOSS (Free/Libre Open Source Software) or existing CNES developments. OTB is implemented in C++ and is mainly based on ITK3 (Insight Toolkit):

• ITK is used as the core element of OTB,
• OTB classes inherit from ITK classes,
• The software development procedure of OTB is strongly inspired from ITK's (Extreme Programming, test-based coding, Generic Programming, etc.),
• The documentation production procedure is the same as for ITK,
• Several chapters of the Software Guide are literally copied from ITK's Software Guide (with permission),
• Many examples are taken from ITK.

2http://www.cnes.fr
3http://www.itk.org
1.6.1 ITK's history

In 1999 the US National Library of Medicine of the National Institutes of Health awarded six three-year contracts to develop an open-source registration and segmentation toolkit, which eventually came to be known as the Insight Toolkit (ITK) and formed the basis of the Insight Software Consortium. ITK's NIH/NLM Project Manager was Dr. Terry Yoo, who coordinated the six prime contractors composing the Insight consortium. These consortium members included three commercial partners—GE Corporate R&D, Kitware, Inc., and MathSoft (the company name is now Insightful)—and three academic partners—University of North Carolina (UNC), University of Tennessee (UT) (Ross Whitaker subsequently moved to University of Utah), and University of Pennsylvania (UPenn). The Principal Investigators for these partners were, respectively, Bill Lorensen at GE CRD, Will Schroeder at Kitware, Vikram Chalana at Insightful, Stephen Aylward with Luis Ibáñez at UNC (Luis is now at Kitware), Ross Whitaker with Josh Cates at UT (both now at Utah), and Dimitri Metaxas at UPenn (now at Rutgers). In addition, several subcontractors rounded out the consortium, including Peter Ratiu at Brigham & Women's Hospital, Celina Imielinska and Pat Molholt at Columbia University, Jim Gee at UPenn's Grasp Lab, and George Stetten at the University of Pittsburgh.

In 2002 the first official public release of ITK was made available.
  • 45. CHAPTER TWO INSTALLATION This section describes the process for installing OTB on your system. OTB is a toolbox, and as such, once it is installed in your computer, provides by default a set of useful libraries. You can use these libraries to build your own applications based on it. What OTB does provide, besides the toolbox, is a large set of test files and examples that will introduce you to OTB concepts and will show you how to use OTB in your own projects. Since the release 3.12, OTB embeds a specific framework to generate applications in a more user- friendly way. If you activate the specific option BUILD APPLICATIONS, OTB builds for each appli- cation one shared library (also known as plugin). This plugin can be auto-loaded into appropriate tools without recompiling, and is able to fully describe its own parameters, behaviour and documen- tation. The tools to use these plugins can be extended, but OTB is shipped with the following: • A command-line launcher, • A graphical launcher, with an auto-generated Qt interface, providing ergonomic parameters setting, display of documentation, and progress reporting, • A C and SWIG interface, which means that any application can be loaded, set up and executed into a high-level language such as Python for instance. The classical OTB powered applications are still available and will still be maintained: Monteverdi An integrated application giving graphical access to a lot of OTB functionalities. OTB-Wrapping Calling OTB from Python or Java (see chapter 34). There are two ways to install OTB on your system: installing from a binary distribution, or compiling from sources (Table 2.1). What you need depends on your system, and on what you intend to do.
                                            | OTB library and internal applications        | Monteverdi                                      | Wrapping (Java and Python)
Any platform                                | Build from source (see section 2.2)          | Build from source (see section 2.2)             | Build from source (see section 2.2)
Windows 2000/XP/Vista/Seven                 | OSGeo4W installer for internal applications  | Windows or OSGeo4W installer (see section 2.1.1)| OSGeo4W installer (see section 2.1.1)
MacOS X 10.6 and higher                     | MacPorts (see section 2.1.2)                 | App's DMG file (see section 2.1.2)              |
Ubuntu Linux 10.04 and higher (32/64 bits)  | APT repository (see section 2.1.3)           | APT repository (see section 2.1.3)              |
OpenSuse 11.X and higher (32/64 bits)       | RPM package (see section 2.1.3)              | RPM package (see section 2.1.3)                 |
CentOS 6.X and higher (32/64 bits)          | RPM package (see section 2.1.3)              | RPM package (see section 2.1.3)                 | RPM package (see section 2.1.3)

Table 2.1: Available installation procedures with respect to system configuration and target usage

2.1 Installing binary packages

2.1.1 Windows 2000/XP/Vista/Seven

For Windows 2000/XP/Vista/Seven users, there is a classical standalone installation program for Monteverdi, available from the OTB download page. There is another way to install Monteverdi, directly through OSGeo4W; with it you can also install the other OTB packages described in the following paragraphs.

Since version 3.12, we provide OTB packages through OSGeo4W for Windows 2000/XP/Vista/Seven users. Packages for the OTB applications, Monteverdi and OTB-Wrapping are available directly in the OSGeo4W installer under the following names:

• otb-bin for command line and Qt applications
• otb-python for Python applications
• otb-monteverdi for the Monteverdi application
• otb-wrapping for low level Python/Java bindings.

Follow the instructions in the installer and select the packages you want to add. The installer will proceed with the installation of the selected packages and all their dependencies. In the case of otb-monteverdi, Monteverdi will be directly installed in the OSGeo4W repository and a shortcut will
be added to your desktop and to the start menu (in the OSGeo4W folder). You can now launch Monteverdi directly from your desktop, from the start menu or from an OSGeo4W shell with the command monteverdi.

The otb-bin packages are available directly in the OSGeo4W shell; for example, run otbgui_BandMath. We recommend this way of installing the internal applications of OTB on Windows. If you want to use the otb-python package, for example in QGIS, please follow the install notes found in the OTB wiki. You can simply check the list of available applications from an OSGeo4W shell:

  python
  import otbApplication
  print str( otbApplication.Registry.GetAvailableApplications() )

For more information you can read the OTB Cookbook, which provides useful information about these OTB applications.

2.1.2 MacOS X

Orfeo Toolbox is now available on MacPorts. The port is named orfeotoolbox. You can follow the MacPorts documentation to install MacPorts first, then install the orfeotoolbox port.

A standard DMG package is available for Monteverdi for MacOS X 10.8. Please go to the OTB download page. Click on the file to launch Monteverdi.

2.1.3 Linux

Ubuntu 10.04 and higher

For Ubuntu 10.04 and higher, OTB, OTB applications, Monteverdi and OTB-Wrapping packages may be available as Debian packages through APT repositories.

Since release 3.14.1, OTB packages are available in the ubuntugis-unstable repository. You can add it by using these command-lines:

  sudo aptitude install add-apt-repository
  sudo apt-add-repository ppa:ubuntugis/ubuntugis-unstable

Now run:

  sudo aptitude update
If you want only the OTB libraries, please use:

  sudo aptitude install libotb

If you want the OTB applications, please use:

  sudo aptitude install otb-bin otb-bin-qt python-otb

If you want the Monteverdi application, please use:

  sudo aptitude install monteverdi

If you want the Python/Java wrapping, please use:

  sudo aptitude install otb-wrapping-python otb-wrapping-java

If you are using Synaptic, you can add the repositories, update and install the packages through the graphical interface.

For further information about Ubuntu packages go to the ubuntugis-unstable launchpad page and click on Read about installing.

Moreover, nightly builds for all OTB projects are also available through APT repositories. However, you should use them carefully. If you are using the apt command-line tools, use these command-lines to add the otb repository to the apt sources:

  sudo aptitude install add-apt-repository
  sudo add-apt-repository ppa:otb/orfeotoolbox-nightly

Now run:

  sudo aptitude update

As with the previous cases, select which packages you will install:

  sudo aptitude install libotb otb-bin otb-bin-qt python-otb monteverdi

Be careful not to add both repositories, since they may cause incompatibilities.

apt-add-repository will try to retrieve the GPG keys of the repositories to certify the origin of the packages. If you are behind an HTTP proxy, this step won't work and apt-add-repository will stall and eventually quit. You can temporarily ignore this error and proceed with the update step. Following this, aptitude update will issue a warning about a signature problem. This warning won't prevent you from installing the packages.
OpenSuse 11.X and higher

For OpenSuse 11.X and higher, OTB, Monteverdi and OTB-Wrapping packages are available through zypper.

First, you need to add the appropriate repositories with these command-lines (please replace 11.4 by your OpenSuse version):

  sudo zypper ar http://download.opensuse.org/repositories/games/openSUSE_11.4/ Games
  sudo zypper ar http://download.opensuse.org/repositories/Application:/Geo/openSUSE_11.4/ GEO
  sudo zypper ar http://download.opensuse.org/repositories/home:/tzotsos/openSUSE_11.4/ tzotsos

Now run:

  sudo zypper refresh
  sudo zypper install OrfeoToolbox
  sudo zypper install OrfeoToolbox-devel
  sudo zypper install Monteverdi
  sudo zypper install OrfeoToolbox-python

Alternatively you can use the One-Click Installer from the openSUSE Download page, or add the above repositories and install through Yast Package Management.

In case you wish to test the latest version of the packages, you can run:

  sudo zypper refresh
  sudo zypper install OrfeoToolbox-beta
  sudo zypper install OrfeoToolbox-beta-devel
  sudo zypper install Monteverdi-beta

There is also support for the recently introduced 'rolling' openSUSE distribution named 'Tumbleweed'. For Tumbleweed you need to add the following repositories with these command-lines:

  sudo zypper ar http://download.opensuse.org/repositories/games/openSUSE_Tumbleweed/ Games
  sudo zypper ar http://download.opensuse.org/repositories/Application:/Geo/openSUSE_Tumbleweed/ GEO
  sudo zypper ar http://download.opensuse.org/repositories/home:/tzotsos/openSUSE_Tumbleweed/ tzotsos

and then add the OTB packages as shown above.
CentOS 5.5

Since version 3.12, OTB packages (OTB, Monteverdi) are available for the CentOS 5.5 distribution. In the following lines, we describe the way to install the OTB and Monteverdi packages on this distribution. If you want to use OTB-Wrapping, please follow the instructions found in this file.

Preliminary steps (common to all packages):

1. Configure your system to pass through your HTTP(S) proxy,
2. Add the required extra repositories: EPEL, ELGIS and centosplus, which can be found in the previous file.

To install the OTB and Monteverdi packages, you first need to install a GDAL RPM package which is available on the OTB website:

  wget http://www.orfeo-toolbox.org/packages/centos/5.5/i386/3.11.0/gdal-1.8.0-5.i386.rpm
  yum --nogpgcheck install gdal-1.8.0-5.i386.rpm

We can now get and install (replace the version number by the current one):

1. the OTB libraries and applications package:

  wget http://www.orfeo-toolbox.org/packages/centos/5.5/i386/3.11.0/OrfeoToolbox-3.11.0-1.i386.rpm
  yum --nogpgcheck install OrfeoToolbox-3.11.0-1.i386.rpm

2. the Monteverdi package:

  wget http://www.orfeo-toolbox.org/packages/centos/5.5/i386/3.11.0/Monteverdi-1.9.0-0.i386.rpm
  yum --nogpgcheck install Monteverdi-1.9.0-0.i386.rpm

2.2 Building from sources

OTB has been developed and tested across different combinations of operating systems, compilers, and hardware platforms including MS-Windows, Linux on Intel-compatible hardware and Mac OSX. It is known to work with the following compilers:
• Visual Studio C++ 9.0 .NET 2008, Visual Studio Express 2008 and 2010 on MS-Windows
• GCC (4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7) on Unix/Linux systems
• GCC on MacOSX (10.6 and higher) systems

Given the advanced usage of C++ features in the toolbox, some compilers may have difficulties processing the code. If you are currently using an outdated compiler this may be an excellent excuse for upgrading this old piece of software!

Please note that though this section only describes how to compile OTB from sources, Monteverdi and OTB-Wrapping can be compiled in a similar way.

2.2.1 Getting the OTB source code

There are three ways of getting the OTB source code:

• Download the latest current release from the OTB download page,
• Clone the current release with Mercurial from the OTB mercurial server,
• Clone the latest revision with Mercurial from the OTB mercurial server.

These last two options need a proper Mercurial installation. To get the source code from Mercurial, do:

  hg clone http://hg.orfeo-toolbox.org/OTB

Using Mercurial, you can easily navigate through the different versions. For instance, this brings you to the 3.16.0 source code version:

  hg update -r 3.16.0

And this brings you to the latest development version:

  hg update

2.2.2 External Libraries

The OTB depends on several libraries:

• ITK: The medical image processing library is embedded in the OTB; it is considered an internal library.
• GDAL: The support of remote sensing imagery formats is ensured through the use of the GDAL library. Please see http://www.remotesensing.org/gdal/ for information on how to install it.
• OSSIM: Image geometry library; it handles for example the sensor geometric models. Please see http://www.ossim.org for information on how to download and install this library on your system. OTB requires at least version 1.6.
• Fltk: this library is used for the visualization functionalities. See http://www.fltk.org/ for details about downloading and installing Fltk. OTB has been tested with versions 1.1.9, 1.1.10 and 1.3. You also have the choice to use OTB's internal version (which is 1.1.9) if you don't have FLTK already installed.
• gettext (optional): Since version 3.2, OTB includes experimental support for internationalization. Gettext is required to activate this functionality.

See section 32.4 for quick installation guidelines for GDAL or Fltk if you don't have access to standard packages.

2.2.3 Configuring OTB

The challenge of supporting OTB across platforms has been solved through the use of CMake, a cross-platform, open-source build system. CMake is used to control the software compilation process using simple platform and compiler independent configuration files. CMake generates native makefiles and workspaces that can be used in the compiler environment of your choice. CMake is quite sophisticated—it supports complex environments requiring system configuration, compiler feature testing, and code generation.

CMake generates Makefiles under UNIX systems and generates Visual Studio workspaces under Windows (and appropriate build files for other compilers like Borland). The information used by CMake is provided by CMakeLists.txt files that are present in every directory of the OTB source tree. These files contain information that the user provides to CMake at configuration time. Typical information includes paths to utilities in the system and the selection of software options specified by the user.

Preparing CMake

CMake can be downloaded at no cost from http://www.cmake.org. OTB requires at least CMake version 2.8. You can download binary versions for most of the popular platforms including Windows, Solaris, IRIX, HP, Mac and Linux. Alternatively you can download
  • 53. 2.2. Building from sources 19 the source code and build CMake on your system. Follow the instructions in the CMake Web page for downloading and installing the software. For Unix/Linux systems, an additional help can be found, to compile Cmake from source, in the 32.4 section, subsection Unix/Linux Platforms. Running CMake initially requires that you provide two pieces of information: • Where the source code directory is located (ex.: OTB SOURCE DIR), • Where the object code is to be produced: (ex.: OTB BINARY DIR). These are referred to as the source directory and the binary directory. We recommend setting the binary directory to be different than the source directory (an out-of-source build), but OTB will still build if they are set to a directory inside the source directory (an in-source build). Compiling OTB CMake runs in an interactive mode in that you iteratively select options and configure according to these options. The iteration proceeds until no more options remain to be selected. At this point, a generation step produces the appropriate build files for your configuration. This interactive configuration process can be better understood if you imagine that you are walking through a decision tree. Every option that you select introduces the possibility that new, dependent options may become relevant. These new options are presented by CMake at the top of the options list in its interface. Only when no new options appear after a configuration iteration can you be sure that the necessary decisions have all been made. At this point build files are generated for the current configuration. As the following figures display it, CMake has a different interface according to your systems. On Windows: The OTB needs some external libraries to work, as described in the table 32.1 of section 32.4. On Windows, we recommended to use the OSGeo4W tool to get those libraries. However, as several of these libraries are optional or internal to OTB, the minimum requirement for compiling OTB is to install Gdal. 1. Download and start the OSGeo4w setup in advanced mode 2. Select the gdal package located in Libs folder 3. Optionally select the other libraries you want to use as external 4. Proceed with the installation. On windows, you will first create a directory to hold the binaries to be built. Your directory structure may look as follows:
  * OTB
    |- OTB_SOURCE_DIR
    |- OTB_BINARY_DIR

where OTB_SOURCE_DIR is the base directory of the OTB source code, and OTB_BINARY_DIR is empty for now.

Launch the CMake executable. On the top of the GUI (Figure 2.1), select the source and binary directories, then press Configure. Variables appear at the center of the CMake GUI. You must set some of these variables to tell CMake about your system. The first thing to do is to enable the Advanced checkbox and change the value of the OSGEO4W_ROOT variable to the directory where you installed OSGeo4W. Then press Configure again.

The Generate button will be accessible only when all required CMake variables are correctly set. At this time, push Generate, wait for the end of the generation process, then close the CMake window.

Now, you can go into your binary directory (here OTB_BINARY_DIR). CMake has generated Makefiles or Visual Studio project files. You can open the solution and build the ALL_BUILD project. If you plan to run executables from within the Visual Studio environment, you should open the Visual Studio solution file from the OSGeo4W shell. To do this, simply drag-and-drop the ".sln" file to an OSGeo4W shell and press Enter.

Building with shared libraries is not well supported on Windows platforms. If you experience any problem with it, you can set BUILD_SHARED_LIBS to OFF, but the build size might reach 1 GB.

That should be all! Otherwise, subscribe to otb-users@googlegroups.com and you will get some help.

On Linux/Unix: On Unix, the binary directory is created by the user and CMake is invoked with the path to the source directory. For example:

  mkdir OTB_BINARY_DIR
  cd OTB_BINARY_DIR
  ccmake ../OTB_SOURCE_DIR

Once the interface opens, you can push c (for Configure) and/or t for Toggle (to get access to the advanced variables). If you have followed section 32.4, every needed library is in /usr and will be found automatically by CMake. Set the following recommended CMake variables:

• OTB_USE_EXTERNAL_BOOST:BOOL=ON
• OTB_USE_CURL:BOOL=ON
• OTB_USE_EXTERNAL_EXPAT:BOOL=ON
• OTB_USE_LIBLAS:BOOL=ON
• OTB_USE_EXTERNAL_LIBLAS:BOOL=ON
• OTB_USE_GETTEXT:BOOL=OFF
• OTB_USE_JPEG2000:BOOL=ON

Push c to update CMake until the option Generate is accessible in the GUI, then push g for Generate. The CMake interface will close itself. Then, go into your build directory (here OTB_BINARY_DIR) and run make to launch the compilation. If you want to put OTB in a standard location, you can proceed with:

  make install

Compilation awareness. The build process will typically take anywhere from 15 to 30 minutes depending on the performance of your system. If you decide to enable testing as part of the normal build process, about 2500 small test programs will be compiled. This will verify that the basic components of OTB have been correctly built on your system.

Setting the CMake variables BUILD_TESTING and BUILD_EXAMPLES to ON activates the compilation of the examples and the tests, and slows down the build process. The examples distributed with the toolbox are a helpful resource for learning how to use OTB components but are not essential for the use of the toolbox itself. The testing section includes a large number of small programs that exercise the capabilities of OTB classes. Due to the large number of tests, enabling the testing option will considerably increase the build time. It is not desirable to enable this option for a first build of the toolbox.

Setting the CMake variable BUILD_APPLICATIONS to ON activates the compilation of the applications and of the launchers needed to use them. A complete list of these applications, described in the OTB Cookbook, can be found on the OTB website.

2.3 Getting Started With OTB

The simplest way to create a new project with OTB is to create a new directory somewhere on your disk and create two files in it. The first one is a CMakeLists.txt file that will be used by CMake to generate a Makefile (if you are using UNIX) or a Visual Studio workspace (if you are using MS-Windows). The second file is an actual C++ program that will exercise some of the large number of classes available in OTB. The details of these files are described in the following section.

Once both files are in your directory you can run CMake in order to configure your project. Under UNIX, you can cd to your newly created directory and type "ccmake .". Note the "." in the command line for indicating that the CMakeLists.txt file is in the current directory. The curses interface will require you to provide the directory where OTB was built. This is the same path that you indicated for the OTB_BINARY_DIR variable at the time of configuring OTB. Under Windows you can run CMakeSetup and provide your newly created directory as being both the source directory and the binary directory for your new project (i.e., an in-source build). Then CMake will require
you to provide the path to the binary directory where OTB was built. The OTB binary directory will contain a file named OTBConfig.cmake generated during the configuration process at the time OTB was built. From this file, CMake will recover all the information required to configure your new OTB project.

2.3.1 Hello World !

Here is the content of the two files to write in your new project. These two files can be found in the OTB/Examples/Installation directory.

The CMakeLists.txt file contains the following lines:

  PROJECT(HelloWorld)

  FIND_PACKAGE(OTB REQUIRED)
  INCLUDE(${OTB_USE_FILE})

  ADD_EXECUTABLE(HelloWorld HelloWorld.cxx)

  TARGET_LINK_LIBRARIES(HelloWorld OTBIO)

The first line defines the name of your project as it appears in Visual Studio (it will have no effect under UNIX). The second line loads a CMake file with a predefined strategy for finding OTB1. If the strategy for finding OTB fails, CMake will prompt you for the directory where OTB is installed in your system. In that case you will write this information in the OTB_DIR variable. The line INCLUDE(${OTB_USE_FILE}) loads the UseOTB.cmake file to set all the configuration information from OTB.

The line ADD_EXECUTABLE defines as its first argument the name of the executable that will be produced as result of this project. The remaining arguments of ADD_EXECUTABLE are the names of the source files to be compiled and linked. Finally, the TARGET_LINK_LIBRARIES line specifies which OTB libraries will be linked against this project.

The source code for this example can be found in the file Examples/Installation/HelloWorld.cxx.

The following code is an implementation of a small OTB program. It tests including header files and linking with OTB libraries.

  #include "otbImage.h"
  #include <iostream>

  int main()
  {
    typedef otb::Image<unsigned short, 2> ImageType;

    ImageType::Pointer image = ImageType::New();

1Similar files are provided in CMake for other commonly used libraries, all of them named Find*.cmake
    std::cout << "OTB Hello World !" << std::endl;

    return EXIT_SUCCESS;
  }

This code instantiates an image whose pixels are represented with type unsigned short. The image is then constructed and assigned to an itk::SmartPointer. Although later in the text we will discuss SmartPointers in detail, for now think of it as a handle on an instance of an object (see section 3.2.4 for more information). The itk::Image class will be described in Section 5.1.

At this point you have successfully installed and compiled OTB, and created your first simple program. If you have difficulties, please join the otb-users mailing list (Section 1.4.1 on page 5) and post questions there.
  • 58. 24 Chapter 2. Installation Figure 2.1: CMake interface. Top) ccmake, the UNIX version based on curses. Bottom) CMakeSetup, the MS-Windows version based on MFC.
CHAPTER THREE

SYSTEM OVERVIEW

The purpose of this chapter is to provide you with an overview of the ORFEO Toolbox system. We recommend that you read this chapter to gain an appreciation for the breadth and area of application of OTB. In this chapter, we will make reference either to OTB features or ITK features without distinction. Bear in mind that OTB uses ITK as its core element, so all the fundamental elements of OTB come from ITK. OTB extends the functionalities of ITK for the remote sensing image processing community. We benefit from the Open Source development approach chosen for ITK, which allows us to provide an impressive set of functionalities with much less effort than would have been needed in a closed source universe!

3.1 System Organization

The ORFEO Toolbox consists of several subsystems. A brief description of these subsystems follows. Later sections in this chapter—and in some cases additional chapters—cover these concepts in more detail. (Note: the previous chapter briefly described two other modules, OTB-Documents and OTB-Wrapping.)

Essential System Concepts. Like any software system, OTB is built around some core design concepts. OTB uses those of ITK. Some of the more important concepts include generic programming, smart pointers for memory management, object factories for adaptable object instantiation, event management using the command/observer design paradigm, and multithreading support.

Numerics. OTB, as ITK, uses VXL's VNL numerics libraries. These are easy-to-use C++ wrappers around the Netlib Fortran numerical analysis routines (http://www.netlib.org).

Data Representation and Access. Two principal classes are used to represent data: the otb::Image and itk::Mesh classes. In addition, various types of iterators and containers are used in ITK to hold and traverse the data. Other important but less popular classes are also used to represent data such as histograms.
  • 60. 26 Chapter 3. System Overview ITK’s Data Processing Pipeline. The data representation classes (known as data objects) are op- erated on by filters that in turn may be organized into data flow pipelines. These pipelines maintain state and therefore execute only when necessary. They also support multi-threading, and are streaming capable (i.e., can operate on pieces of data to minimize the memory foot- print). IO Framework. Associated with the data processing pipeline are sources, filters that initiate the pipeline, and mappers, filters that terminate the pipeline. The standard examples of sources and mappers are readers and writers respectively. Readers input data (typically from a file), and writers output data from the pipeline. Viewers are another example of mappers. Spatial Objects. Geometric shapes are represented in OTB using the ITK spatial object hierarchy. These classes are intended to support modeling of anatomical structures in ITK. OTB uses them in order to model cartographic elements. Using a common basic interface, the spatial objects are capable of representing regions of space in a variety of different ways. For ex- ample: mesh structures, image masks, and implicit equations may be used as the underlying representation scheme. Spatial objects are a natural data structure for communicating the re- sults of segmentation methods and for introducing geometrical priors in both segmentation and registration methods. ITK’s Registration Framework. A flexible framework for registration supports four different types of registration: image registration, multiresolution registration, PDE-based registration, and FEM (finite element method) registration. FEM Framework. ITK includes a subsystem for solving general FEM problems, in particular non- rigid registration. The FEM package includes mesh definition (nodes and elements), loads, and boundary conditions. Level Set Framework. The level set framework is a set of classes for creating filters to solve partial differential equations on images using an iterative, finite difference update scheme. The level set framework consists of finite difference solvers including a sparse level set solver, a generic level set segmentation filter, and several specific subclasses including threshold, Canny, and Laplacian based methods. Wrapping. ITK uses a unique, powerful system for producing interfaces (i.e., “wrappers”) to inter- preted languages such as Tcl and Python. The GCC XML tool is used to produce an XML description of arbitrarily complex C++ code; CSWIG is then used to transform the XML description into wrappers using the SWIG package. OTB does not use this system at present. 3.2 Essential System Concepts This section describes some of the core concepts and implementation features found in ITK and therefore also in OTB.
  • 61. 3.2. Essential System Concepts 27 3.2.1 Generic Programming Generic programming is a method of organizing libraries consisting of generic—or reusable— software components [98]. The idea is to make software that is capable of “plugging together” in an efficient, adaptable manner. The essential ideas of generic programming are containers to hold data, iterators to access the data, and generic algorithms that use containers and iterators to create efficient, fundamental algorithms such as sorting. Generic programming is implemented in C++ with the template programming mechanism and the use of the STL Standard Template Library [8]. C++ templating is a programming technique allowing users to write software in terms of one or more unknown types T. To create executable code, the user of the software must specify all types T (known as template instantiation) and successfully process the code with the compiler. The T may be a native type such as float or int, or T may be a user-defined type (e.g., class). At compile- time, the compiler makes sure that the templated types are compatible with the instantiated code and that the types are supported by the necessary methods and operators. ITK uses the techniques of generic programming in its implementation. The advantage of this approach is that an almost unlimited variety of data types are supported simply by defining the appropriate template types. For example, in OTB it is possible to create images consisting of almost any type of pixel. In addition, the type resolution is performed at compile-time, so the compiler can optimize the code to deliver maximal performance. The disadvantage of generic programming is that many compilers still do not support these advanced concepts and cannot compile OTB. And even if they do, they may produce completely undecipherable error messages due to even the simplest syntax errors. If you are not familiar with templated code and generic programming, we recommend the two books cited above. 3.2.2 Include Files and Class Definitions In ITK and OTB classes are defined by a maximum of two files: a header .h file and an implemen- tation file—.cxx if a non-templated class, and a .txx if a templated class. The header files contain class declarations and formatted comments that are used by the Doxygen documentation system to automatically produce HTML manual pages. In addition to class headers, there are a few other important header files. itkMacro.h is found in the Utilities/ITK/Code/Common directory and defines standard system-wide macros (such as Set/Get, constants, and other parameters). itkNumericTraits.h is found in the Utilities/ITK/Code/Common directory and defines numeric characteristics for native types such as its maximum and minimum possible values. itkWin32Header.h is found in the Utilities/ITK/Code/Common and is used to define oper- ating system parameters to control the compilation process.
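To tie the template instantiation of Section 3.2.1 to the header files just described, here is a small illustrative sketch (not taken from the OTB examples): two image types are obtained from the same templated class simply by choosing different pixel types at compile-time.

  #include "otbImage.h"
  #include "itkRGBPixel.h"

  int main()
  {
    // Each typedef fixes the template parameters (pixel type, dimension);
    // the compiler generates specialized code for each combination that
    // is actually used.
    typedef otb::Image<float, 2>                        FloatImageType;
    typedef otb::Image<itk::RGBPixel<unsigned char>, 2> RGBImageType;

    // The same generic interface (New(), SetRegions(), ...) is available
    // for both instantiations.
    FloatImageType::Pointer scalarImage = FloatImageType::New();
    RGBImageType::Pointer   colorImage  = RGBImageType::New();

    return 0;
  }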
  • 62. 28 Chapter 3. System Overview 3.2.3 Object Factories Most classes in OTB are instantiated through an object factory mechanism. That is, rather than using the standard C++ class constructor and destructor, instances of an OTB class are created with the static class New() method. In fact, the constructor and destructor are protected: so it is generally not possible to construct an OTB instance on the heap. (Note: this behavior pertains to classes that are derived from itk::LightObject. In some cases the need for speed or reduced memory footprint dictates that a class not be derived from LightObject and in this case instances may be created on the heap. An example of such a class is itk::EventObject.) The object factory enables users to control run-time instantiation of classes by registering one or more factories with itk::ObjectFactoryBase. These registered factories support the method CreateInstance(classname) which takes as input the name of a class to create. The factory can choose to create the class based on a number of factors including the computer system configura- tion and environment variables. For example, in a particular application an OTB user may wish to deploy their own class implemented using specialized image processing hardware (i.e., to realize a performance gain). By using the object factory mechanism, it is possible at run-time to replace the creation of a particular OTB filter with such a custom class. (Of course, the class must provide the exact same API as the one it is replacing.) To do this, the user compiles her class (using the same compiler, build options, etc.) and inserts the object code into a shared library or DLL. The library is then placed in a directory referred to by the OTB AUTOLOAD PATH environment variable. On instan- tiation, the object factory will locate the library, determine that it can create a class of a particular name with the factory, and use the factory to create the instance. (Note: if the CreateInstance() method cannot find a factory that can create the named class, then the instantiation of the class falls back to the usual constructor.) In practice object factories are used mainly (and generally transparently) by the OTB input/output (IO) classes. For most users the greatest impact is on the use of the New() method to create a class. Generally the New() method is declared and implemented via the macro itkNewMacro() found in Utilities/ITK/Common/itkMacro.h. 3.2.4 Smart Pointers and Memory Management By their nature object-oriented systems represent and operate on data through a variety of object types, or classes. When a particular class is instantiated to produce an instance of that class, mem- ory allocation occurs so that the instance can store data attribute values and method pointers (i.e., the vtable). This object may then be referenced by other classes or data structures during normal operation of the program. Typically during program execution all references to the instance may disappear at which point the instance must be deleted to recover memory resources. Knowing when to delete an instance, however, is difficult. Deleting the instance too soon results in program crashes; deleting it too late and memory leaks (or excessive memory consumption) will occur. This process of allocating and releasing memory is known as memory management. In ITK, memory management is implemented through reference counting. This compares to another
popular approach—garbage collection—used by many systems including Java. In reference counting, a count of the number of references to each instance is kept. When the reference count goes to zero, the object destroys itself. In garbage collection, a background process sweeps the system identifying instances no longer referenced in the system and deletes them. The problem with garbage collection is that the actual point in time at which memory is deleted is variable. This is unacceptable when an object size may be gigantic (think of a large 3D volume gigabytes in size). Reference counting deletes memory immediately (once all references to an object disappear).

Reference counting is implemented through a Register()/Delete() member function interface. All instances of an OTB object have a Register() method invoked on them by any other object that references them. The Register() method increments the instance's reference count. When the reference to the instance disappears, a Delete() method is invoked on the instance that decrements the reference count—this is equivalent to an UnRegister() method. When the reference count returns to zero, the instance is destroyed.

This protocol is greatly simplified by using a helper class called an itk::SmartPointer. The smart pointer acts like a regular pointer (e.g. supports operators -> and *) but automagically performs a Register() when referring to an instance, and an UnRegister() when it no longer points to the instance. Unlike most other instances in OTB, SmartPointers can be allocated on the program stack, and are automatically deleted when the scope in which the SmartPointer was created is closed. As a result, you should rarely if ever call Register() or Delete() in OTB. For example:

  MyRegistrationFunction()
  { // <----- Start of scope
    // here an interpolator is created and associated to the
    // SmartPointer "interp".
    InterpolatorType::Pointer interp = InterpolatorType::New();
  } // <------ End of scope

In this example, reference counted objects are created (with the New() method) with a reference count of one. Assignment to the SmartPointer interp does not change the reference count. At the end of scope, interp is destroyed, the reference count of the actual interpolator object (referred to by interp) is decremented, and if it reaches zero, then the interpolator is also destroyed.

Note that in ITK SmartPointers are always used to refer to instances of classes derived from itk::LightObject. Method invocations and function calls often return "real" pointers to instances, but they are immediately assigned to a SmartPointer. Raw pointers are used for non-LightObject classes when the need for speed and/or memory demands a smaller, faster class.
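To see the reference count at work, here is a minimal sketch (not taken from the guide's examples) that prints the value reported by GetReferenceCount(), a method inherited from itk::LightObject, while a second SmartPointer to the same image comes in and out of scope:

  #include "otbImage.h"
  #include <iostream>

  int main()
  {
    typedef otb::Image<unsigned char, 2> ImageType;

    // New() returns an object whose reference count is one.
    ImageType::Pointer image = ImageType::New();
    std::cout << image->GetReferenceCount() << std::endl; // prints 1

    {
      // Copying the SmartPointer implicitly calls Register().
      ImageType::Pointer alias = image;
      std::cout << image->GetReferenceCount() << std::endl; // prints 2
    } // alias goes out of scope: UnRegister() is called automatically.

    std::cout << image->GetReferenceCount() << std::endl;   // prints 1
    return 0;
  }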
3.2.5 Error Handling and Exceptions

In general, OTB uses exception handling to manage errors during program execution. Exception handling is a standard part of the C++ language and generally takes the form illustrated below:

  try
  {
    //...try executing some code here...
  }
  catch ( itk::ExceptionObject & exp )
  {
    //...if an exception is thrown, catch it here
  }

where a particular class may throw an exception as demonstrated below (this code snippet is taken from itk::ByteSwapper):

  switch ( sizeof(T) )
  {
    //non-error cases go here followed by error case
    default:
      ByteSwapperError e(__FILE__, __LINE__);
      e.SetLocation("SwapBE");
      e.SetDescription("Cannot swap number of bytes requested");
      throw e;
  }

Note that itk::ByteSwapperError is a subclass of itk::ExceptionObject. (In fact in OTB all exceptions should be derived from itk::ExceptionObject.) In this example a special constructor and the C++ preprocessor variables __FILE__ and __LINE__ are used to instantiate the exception object and provide additional information to the user. You can choose to catch a particular exception and hence a specific OTB error, or you can trap any OTB exception by catching ExceptionObject.

3.2.6 Event Handling

Event handling in OTB is implemented using the Subject/Observer design pattern [49] (sometimes referred to as the Command/Observer design pattern). In this approach, objects indicate that they are watching for a particular event—invoked by a particular instance—by registering with the instance that they are watching. For example, filters in OTB periodically invoke the itk::ProgressEvent. Objects that have registered their interest in this event are notified when the event occurs. The notification occurs via an invocation of a command (i.e., function callback, method invocation, etc.) that is specified during the registration process. (Note that events in OTB are subclasses of EventObject; look in itkEventObject.h to determine which events are available.)

To recap via example: various objects in OTB will invoke specific events as they execute (from ProcessObject):

  this->InvokeEvent( ProgressEvent() );

To watch for such an event, registration is required that associates a command (e.g., callback function) with the event, using the Object::AddObserver() method:
  unsigned long progressTag = filter->AddObserver(ProgressEvent(), itk::Command*);

When the event occurs, all registered observers are notified via invocation of the associated Command::Execute() method. Note that several subclasses of Command are available supporting const and non-const member functions as well as C-style functions. (Look in Common/Command.h to find pre-defined subclasses of Command. If nothing suitable is found, derivation is another possibility.)

3.2.7 Multi-Threading

Multithreading is handled in OTB through ITK's high-level design abstraction. This approach provides portable multithreading and hides the complexity of differing thread implementations on the many systems supported by OTB. For example, the class itk::MultiThreader provides support for multithreaded execution using sproc() on an SGI, or pthread_create() on any platform supporting POSIX threads.

Multithreading is typically employed by an algorithm during its execution phase. MultiThreader can be used to execute a single method on multiple threads, or to specify a method per thread. For example, in the class itk::ImageSource (a superclass for most image processing filters) the GenerateData() method uses the following methods:

  multiThreader->SetNumberOfThreads(int);
  multiThreader->SetSingleMethod(ThreadFunctionType, void* data);
  multiThreader->SingleMethodExecute();

In this example each thread invokes the same method. The multithreaded filter takes care to divide the image into different regions that do not overlap for write operations.

The general philosophy in ITK regarding thread safety is that accessing different instances of a class (and its methods) is a thread-safe operation. Invoking methods on the same instance in different threads is to be avoided.

3.3 Numerics

OTB, as ITK, uses the VNL numerics library to provide resources for numerical programming combining the ease of use of packages like Mathematica and Matlab with the speed of C and the elegance of C++. It provides a C++ interface to the high-quality Fortran routines made available in the public domain by numerical analysis researchers. ITK extends the functionality of VNL by including interface classes between VNL and ITK proper.

The VNL numerics library includes classes for
Matrices and vectors. Standard matrix and vector support and operations on these types.

Specialized matrix and vector classes. Several special matrix and vector classes with special numerical properties are available. Class vnl_diagonal_matrix provides a fast and convenient diagonal matrix, while fixed size matrices and vectors allow "fast-as-C" computations (see vnl_matrix_fixed<T,n,m> and example subclasses vnl_double_3x3 and vnl_double_3).

Matrix decompositions. Classes vnl_svd<T>, vnl_symmetric_eigensystem<T>, and vnl_generalized_eigensystem.

Real polynomials. Class vnl_real_polynomial stores the coefficients of a real polynomial, and provides methods of evaluation of the polynomial at any x, while class vnl_rpoly_roots provides a root finder.

Optimization. Classes vnl_levenberg_marquardt, vnl_amoeba, vnl_conjugate_gradient, vnl_lbfgs allow optimization of user-supplied functions either with or without user-supplied derivatives.

Standardized functions and constants. Class vnl_math defines constants (pi, e, eps...) and simple functions (sqr, abs, rnd...). Class numeric_limits is from the ISO standard document, and provides a way to access basic limits of a type. For example numeric_limits<short>::max() returns the maximum value of a short.

Most VNL routines are implemented as wrappers around the high-quality Fortran routines that have been developed by the numerical analysis community over the last forty years and placed in the public domain. The central repository for these programs is the "netlib" server http://www.netlib.org/. The National Institute of Standards and Technology (NIST) provides an excellent search interface to this repository in its Guide to Available Mathematical Software (GAMS) at http://gams.nist.gov, both as a decision tree and a text search.

ITK also provides additional numerics functionality. A suite of optimizers that use VNL under the hood and integrate with the registration framework is available. A large collection of statistics functions—not available from VNL—are also provided in the Insight/Numerics/Statistics directory. In addition, a complete finite element (FEM) package is available, primarily to support the deformable registration in ITK.

3.4 Data Representation

There are two principal types of data represented in OTB: images and meshes. This functionality is implemented in the classes Image and Mesh, both of which are subclasses of itk::DataObject. In OTB, data objects are classes that are meant to be passed around the system and may participate in data flow pipelines (see Section 3.5 on page 34 for more information).
  • 67. 3.4. Data Representation 33 otb::Image represents an n-dimensional, regular sampling of data. The sampling direction is paral- lel to each of the coordinate axes, and the origin of the sampling, inter-pixel spacing, and the number of samples in each direction (i.e., image dimension) can be specified. The sample, or pixel, type in OTB is arbitrary—a template parameter TPixel specifies the type upon template instantiation. (The dimensionality of the image must also be specified when the image class is instantiated.) The key is that the pixel type must support certain operations (for example, addition or difference) if the code is to compile in all cases (for example, to be processed by a particular filter that uses these operations). In practice the OTB user will use a C++ simple type (e.g., int, float) or a pre-defined pixel type and will rarely create a new type of pixel class. One of the important ITK concepts regarding images is that rectangular, continuous pieces of the image are known as regions. Regions are used to specify which part of an image to process, for example in multithreading, or which part to hold in memory. In ITK there are three common types of regions: 1. LargestPossibleRegion—the image in its entirety. 2. BufferedRegion—the portion of the image retained in memory. 3. RequestedRegion—the portion of the region requested by a filter or other class when oper- ating on the image. The otb::Image class extends the functionalities of the itk::Image in order to take into account particular remote sensing features as geographical projections, etc. The Mesh class represents an n-dimensional, unstructured grid. The topology of the mesh is rep- resented by a set of cells defined by a type and connectivity list; the connectivity list in turn refers to points. The geometry of the mesh is defined by the n-dimensional points in combination with associated cell interpolation functions. Mesh is designed as an adaptive representational structure that changes depending on the operations performed on it. At a minimum, points and cells are re- quired in order to represent a mesh; but it is possible to add additional topological information. For example, links from the points to the cells that use each point can be added; this provides implicit neighborhood information assuming the implied topology is the desired one. It is also possible to specify boundary cells explicitly, to indicate different connectivity from the implied neighborhood relationships, or to store information on the boundaries of cells. The mesh is defined in terms of three template parameters: 1) a pixel type associated with the points, cells, and cell boundaries; 2) the dimension of the points (which in turn limits the maximum dimension of the cells); and 3) a “mesh traits” template parameter that specifies the types of the containers and identifiers used to access the points, cells, and/or boundaries. By using the mesh traits carefully, it is possible to create meshes better suited for editing, or those better suited for “read-only” operations, allowing a trade-off between representation flexibility, memory, and speed. Mesh is a subclass of itk::PointSet. The PointSet class can be used to represent point clouds or randomly distributed landmarks, etc. The PointSet class has no associated topology.
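As a minimal sketch of how the image regions described above are set up in practice (this snippet is illustrative and not taken from the OTB examples), the following code creates a small 2D image, defines its LargestPossibleRegion from an index and a size, and allocates the pixel buffer:

  #include "otbImage.h"

  int main()
  {
    typedef otb::Image<unsigned char, 2> ImageType;

    ImageType::Pointer image = ImageType::New();

    // A region is described by an index (the coordinates of its first
    // pixel) and a size (the number of pixels along each dimension).
    ImageType::IndexType start;
    start.Fill(0);

    ImageType::SizeType size;
    size[0] = 200; // pixels along X
    size[1] = 100; // pixels along Y

    ImageType::RegionType region;
    region.SetIndex(start);
    region.SetSize(size);

    // SetRegions() sets the LargestPossibleRegion, BufferedRegion and
    // RequestedRegion to the same extent; Allocate() reserves the buffer.
    image->SetRegions(region);
    image->Allocate();
    image->FillBuffer(0);

    return 0;
  }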
3.5 Data Processing Pipeline

While data objects (e.g., images and meshes) are used to represent data, process objects are classes that operate on data objects and may produce new data objects. Process objects are classed as sources, filter objects, or mappers. Sources (such as readers) produce data, filter objects take in data and process it to produce new data, and mappers accept data for output either to a file or some other system. Sometimes the term filter is used broadly to refer to all three types.

The data processing pipeline ties together data objects (e.g., images and meshes) and process objects. The pipeline supports an automatic updating mechanism that causes a filter to execute if and only if its input or its internal state changes. Further, the data pipeline supports streaming, the ability to automatically break data into smaller pieces, process the pieces one by one, and reassemble the processed data into a final result.

Typically data objects and process objects are connected together using the SetInput() and GetOutput() methods as follows:

  typedef otb::Image<float,2> FloatImage2DType;

  itk::RandomImageSource<FloatImage2DType>::Pointer random;
  random = itk::RandomImageSource<FloatImage2DType>::New();
  random->SetMin(0.0);
  random->SetMax(1.0);

  itk::ShrinkImageFilter<FloatImage2DType,FloatImage2DType>::Pointer shrink;
  shrink = itk::ShrinkImageFilter<FloatImage2DType,FloatImage2DType>::New();
  shrink->SetInput(random->GetOutput());
  shrink->SetShrinkFactors(2);

  otb::ImageFileWriter<FloatImage2DType>::Pointer writer;
  writer = otb::ImageFileWriter<FloatImage2DType>::New();
  writer->SetInput(shrink->GetOutput());
  writer->SetFileName("test.raw");
  writer->Update();

In this example the source object itk::RandomImageSource is connected to the itk::ShrinkImageFilter, and the shrink filter is connected to the mapper otb::ImageFileWriter. When the Update() method is invoked on the writer, the data processing pipeline causes each of these filters to execute in order, culminating in writing the final data to a file on disk.
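To make the source/mapper vocabulary concrete with file I/O, here is a hedged sketch (not one of the guide's examples) where an otb::ImageFileReader acts as the source and an otb::ImageFileWriter as the mapper; the command-line arguments are assumed to be an input and an output file name:

  #include "otbImage.h"
  #include "otbImageFileReader.h"
  #include "otbImageFileWriter.h"

  int main(int argc, char * argv[])
  {
    if (argc < 3) return 1;

    typedef otb::Image<unsigned char, 2>    ImageType;
    typedef otb::ImageFileReader<ImageType> ReaderType;
    typedef otb::ImageFileWriter<ImageType> WriterType;

    ReaderType::Pointer reader = ReaderType::New();
    reader->SetFileName(argv[1]);            // source: initiates the pipeline

    WriterType::Pointer writer = WriterType::New();
    writer->SetInput(reader->GetOutput());   // connect the process objects
    writer->SetFileName(argv[2]);            // mapper: terminates the pipeline

    // Calling Update() on the last filter triggers the whole pipeline:
    // the reader executes first, then the writer consumes its output.
    writer->Update();

    return 0;
  }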
  • 69. 3.6. Spatial Objects 35 3.6 Spatial Objects The ITK spatial object framework supports the philosophy that the task of image segmentation and registration is actually the task of object processing. The image is but one medium for representing objects of interest, and much processing and data analysis can and should occur at the object level and not based on the medium used to represent the object. ITK spatial objects provide a common interface for accessing the physical location and geometric properties of and the relationship between objects in a scene that is independent of the form used to represent those objects. That is, the internal representation maintained by a spatial object may be a list of points internal to an object, the surface mesh of the object, a continuous or parametric representation of the object’s internal points or surfaces, and so forth. The capabilities provided by the spatial objects framework supports their use in object segmentation, registration, surface/volume rendering, and other display and analysis functions. The spatial object framework extends the concept of a “scene graph” that is common to computer rendering packages so as to support these new functions. With the spatial objects framework you can: 1. Specify a spatial object’s parent and children objects. In this way, a city may contain roads and those roads can be organized in a tree structure. 2. Query if a physical point is inside an object or (optionally) any of its children. 3. Request the value and derivatives, at a physical point, of an associated intensity function, as specified by an object or (optionally) its children. 4. Specify the coordinate transformation that maps a parent object’s coordinate system into a child object’s coordinate system. 5. Compute the bounding box of a spatial object and (optionally) its children. 6. Query the resolution at which the object was originally computed. For example, you can query the resolution (i.e., pixel spacing) of the image used to generate a particular instance of a itk::LineSpatialObject. Currently implemented types of spatial objects include: Blob, Ellipse, Group, Image, Line, Surface, and Tube. The itk::Scene object is used to hold a list of spatial objects that may in turn have children. Each spatial object can be assigned a color property. Each spatial object type has its own capabilities. For example, itk::TubeSpatialObjects indicate to what point on their parent tube they connect. There are a limited number of spatial objects and their methods in ITK, but their number is growing and their potential is huge. Using the nominal spatial object capabilities, methods such as mutual information registration, can be applied to objects regardless of their internal representation. By having a common API, the same method can be used to register a parametric representation of a building with an image or to register two different segmentations of a particular object in object- based change detection.
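As a small, hedged illustration of this common interface (the snippet below is not part of the guide's examples and simply assumes the ITK spatial object API described above), an ellipse spatial object can be queried for whether a physical point lies inside it:

  #include "itkEllipseSpatialObject.h"
  #include "itkPoint.h"
  #include <iostream>

  int main()
  {
    // A 2D ellipse with the same radius (10) along every dimension,
    // i.e. a circle centered at the origin of its object space.
    typedef itk::EllipseSpatialObject<2> EllipseType;
    EllipseType::Pointer ellipse = EllipseType::New();
    ellipse->SetRadius(10.0);

    // Query the object with a physical point; the same IsInside() call
    // is available for every spatial object type.
    itk::Point<double, 2> point;
    point[0] = 3.0;
    point[1] = 4.0;

    std::cout << "Inside: " << ellipse->IsInside(point) << std::endl;
    return 0;
  }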
  • 73. CHAPTER FOUR BUILDING SIMPLE APPLICATIONS WITH OTB Well, that’s it, you’ve just downloaded and installed OTB, lured by the promise that you will be able to do everything with it. That’s true, you will be able to do everything but - there is always a but - some effort is required. OTB uses the very powerful systems of generic programing, many classes are already available, some powerful tools are defined to help you with recurrent tasks, but it is not an easy world to enter. These tutorials are designed to help you enter this world and grasp the logic behind OTB. Each of these tutorials should not take more than 10 minutes (typing included) and each is designed to highlight a specific point. You may not be concerned by the latest tutorials but it is strongly advised to go through the first few which cover the basics you’ll use almost everywhere. 4.1 Hello world Let’s start by the typical Hello world program. We are going to compile this C++ program linking to your new OTB. First, create a new folder to put your new programs (all the examples from this tutorial) in and go into this folder. Since all programs using OTB are handled using the CMake system, we need to create a CMakeLists.txt that will be used by CMake to compile our program. An example of this file can be found in the OTB/Examples/Tutorials directory. The CMakeLists.txt will be very simi- lar between your projects. Open the CMakeLists.txt file and write in the few lines: PROJECT(Tutorials)
cmake_minimum_required(VERSION 2.6)

FIND_PACKAGE(OTB)
IF(OTB_FOUND)
  INCLUDE(${OTB_USE_FILE})
ELSE(OTB_FOUND)
  MESSAGE(FATAL_ERROR "Cannot build OTB project without OTB. Please set OTB_DIR.")
ENDIF(OTB_FOUND)

ADD_EXECUTABLE(HelloWorldOTB HelloWorldOTB.cxx)

TARGET_LINK_LIBRARIES(HelloWorldOTB OTBCommon OTBIO)

The first line defines the name of your project as it appears in Visual Studio (it will have no effect under UNIX or Linux). The FIND_PACKAGE(OTB) line loads a CMake file with a predefined strategy for finding OTB1. If the strategy for finding OTB fails, CMake will prompt you for the directory where OTB is installed on your system. In that case you will write this information in the OTB_DIR variable. The line INCLUDE(${OTB_USE_FILE}) loads the UseOTB.cmake file to set all the configuration information from OTB. The ADD_EXECUTABLE line defines as its first argument the name of the executable that will be produced as a result of this project. The remaining arguments of ADD_EXECUTABLE are the names of the source files to be compiled and linked. Finally, the TARGET_LINK_LIBRARIES line specifies which OTB libraries will be linked against this project.

1Similar files are provided in CMake for other commonly used libraries, all of them named Find*.cmake

The source code for this example can be found in the file Examples/Tutorials/HelloWorldOTB.cxx.

The following code is an implementation of a small OTB program. It tests including header files and linking with OTB libraries.

#include "otbImage.h"
#include <iostream>

int main(int argc, char * argv[])
{
  typedef otb::Image<unsigned short, 2> ImageType;

  ImageType::Pointer image = ImageType::New();

  std::cout << "OTB Hello World !" << std::endl;

  return EXIT_SUCCESS;
}

This code instantiates an image whose pixels are represented with type unsigned short. The image is then created and assigned to an itk::SmartPointer. Later in the text we will discuss
  • 75. 4.2. Pipeline basics: read and write 41 SmartPointers in detail, for now think of it as a handle on an instance of an object (see section 3.2.4 for more information). Once the file is written, run ccmake on the current directory (that is ccmake ./ under Linux/Unix). If OTB is on a non standard place, you will have to tell CMake where it is. Once your done with CMake (you shouldn’t have to do it anymore) run make. You finally have your program. When you run it, you will have the OTB Hello World ! printed. Ok, well done! You’ve just compiled and executed your first OTB program. Actually, using OTB for that is not very useful, and we doubt that you downloaded OTB only to do that. It’s time to move on to a more advanced level. 4.2 Pipeline basics: read and write OTB is designed to read images, process them and write them to disk or view the result. In this tutorial, we are going to see how to read and write images and the basics of the pipeline system. First, let’s add the following lines at the end of the CMakeLists.txt file: ADD_EXECUTABLE(Pipeline Pipeline.cxx ) TARGET_LINK_LIBRARIES(Pipeline OTBCommon OTBIO) Now, create a Pipeline.cxx file. The source code for this example can be found in the file Examples/Tutorials/Pipeline.cxx. Start by including some necessary headers and with the usual main declaration: #include "otbImage.h" #include "otbImageFileReader.h" #include "otbImageFileWriter.h" int main(int argc , char * argv[]) { if (argc != 3) { std::cerr << "Usage: " << argv[0] << " <input_filename > <output_filename >" << std::endl; } Declare the image as an otb::Image, the pixel type is declared as an unsigned char (one byte) and the image is specified as having two dimensions. typedef otb::Image <unsigned char, 2> ImageType ;
  • 76. 42 Chapter 4. Building Simple Applications with OTB To read the image, we need an otb::ImageFileReader which is templated with the image type. typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); Then, we need an otb::ImageFileWriter also templated with the image type. typedef otb::ImageFileWriter <ImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); The filenames are passed as arguments to the program. We keep it simple for now and we don’t check their validity. reader ->SetFileName (argv[1]); writer ->SetFileName (argv[2]); Now that we have all the elements, we connect the pipeline, pluging the output of the reader to the input of the writer. writer ->SetInput(reader ->GetOutput ()); And finally, we trigger the pipeline execution calling the Update() method on the last element of the pipeline. The last element will make sure to update all previous elements in the pipeline. writer ->Update(); return EXIT_SUCCESS; } Once this file is written you just have to run make. The ccmake call is not required anymore. Get one image from the OTB-Data/Examples directory from the OTB-Data repos- itory. You can get it either by cloning the OTB data repository (hg clone http://hg.orfeo-toolbox.org/OTB-Data), but that might be quite long as this also gets the data to run the tests. Alternatively, you can get it from http://www.orfeo-toolbox.org/packages/OTB-Data-Examples.tgz. Take for example get QB Suburb.png. Now, run your new program as Pipeline QB Suburb.png output.png. You obtain the file output.png which is the same image as QB Suburb.png. When you triggered the Update() method, OTB opened the original image and wrote it back under another name. Well...that’s nice but a bit complicated for a copy program! Wait a minute! We didn’t specify the file format anywhere! Let’s try Pipeline QB Suburb.png output.jpg. And voila! The output image is a jpeg file. That’s starting to be a bit more interesting: this is not just a program to copy image files, but also to convert between image formats.
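One thing this first pipeline leaves out is error handling: as mentioned above, the filenames are not checked, and if the input file does not exist or cannot be decoded, the reader will raise an exception when Update() is called. A slightly more defensive version of Pipeline.cxx is sketched below; the only addition is a try/catch block around the pipeline execution, using the same itk::ExceptionObject pattern that the command-line parsing tutorial uses later in this chapter.

#include <cstdlib>
#include <iostream>

#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"

int main(int argc, char * argv[])
{
  if (argc != 3)
    {
    std::cerr << "Usage: " << argv[0] << " <input_filename> <output_filename>" << std::endl;
    return EXIT_FAILURE;
    }

  typedef otb::Image<unsigned char, 2> ImageType;
  typedef otb::ImageFileReader<ImageType> ReaderType;
  typedef otb::ImageFileWriter<ImageType> WriterType;

  ReaderType::Pointer reader = ReaderType::New();
  WriterType::Pointer writer = WriterType::New();

  reader->SetFileName(argv[1]);
  writer->SetFileName(argv[2]);
  writer->SetInput(reader->GetOutput());

  try
    {
    // All the file access happens here: a bad input name surfaces as an exception.
    writer->Update();
    }
  catch (itk::ExceptionObject& err)
    {
    std::cerr << "Pipeline execution failed: " << std::endl << err << std::endl;
    return EXIT_FAILURE;
    }

  return EXIT_SUCCESS;
}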
  • 77. 4.3. Filtering pipeline 43 You have just experienced the pipeline structure which executes the filters only when needed and the automatic image format detection. Now it’s time to do some processing in between. 4.3 Filtering pipeline We are now going to insert a simple filter to do some processing between the reader and the writer. Let’s first add the 2 following lines to the CMakeLists.txt file: ADD_EXECUTABLE(FilteringPipeline FilteringPipeline.cxx ) TARGET_LINK_LIBRARIES(FilteringPipeline OTBCommon OTBIO) The source code for this example can be found in the file Examples/Tutorials/FilteringPipeline.cxx. We are going to use the itk::GradientMagnitudeImageFilter to compute the gradient of the image. The begining of the file is similar to the Pipeline.cxx. We include the required headers, without forgetting to add the header for the itk::GradientMagnitudeImageFilter. #include "otbImage.h" #include "otbImageFileReader.h" #include "otbImageFileWriter.h" #include "itkGradientMagnitudeImageFilter.h" int main(int argc , char * argv[]) { if (argc != 3) { std::cerr << "Usage: " << argv[0] << " <input_filename > <output_filename >" << std::endl; } We declare the image type, the reader and the writer as before: typedef otb::Image <unsigned char, 2> ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); typedef otb::ImageFileWriter <ImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); reader ->SetFileName (argv[1]); writer ->SetFileName (argv[2]);
Now we have to declare the filter. It is templated with the input image type and the output image type, like many filters in OTB. Here we are using the same type for the input and the output images:

typedef itk::GradientMagnitudeImageFilter<ImageType, ImageType> FilterType;
FilterType::Pointer filter = FilterType::New();

Let's plug the pipeline:

filter->SetInput(reader->GetOutput());
writer->SetInput(filter->GetOutput());

And finally, we trigger the pipeline execution by calling the Update() method on the writer:

writer->Update();

return EXIT_SUCCESS;
}

Compile with make and execute as FilteringPipeline QB Suburb.png output.png. You have the filtered version of your image in the output.png file.

Now, you can practice a bit and try to replace the filter with one of the 150+ filters which inherit from the otb::ImageToImageFilter class. You will definitely find some useful filters among them!

4.4 Handling types: scaling output

If you tried some other filter in the previous example, you may have noticed that in some cases it does not make sense to save the output directly as an integer. This is the case for the itk::CannyEdgeDetectionImageFilter: if you used it directly in the previous example, you got warnings about converting to unsigned char from double. The output of the Canny edge detection is a floating point number. A simple solution would be to use double as the pixel type. Unfortunately, most image formats use integer types, and you should convert the result to an integer image if you still want to visualize your images with your usual viewer (we will see in a later tutorial how you can avoid that using the built-in viewer). To realize this conversion, we will use the itk::RescaleIntensityImageFilter.

Add the two following lines to the CMakeLists.txt file:

ADD_EXECUTABLE(ScalingPipeline ScalingPipeline.cxx)
TARGET_LINK_LIBRARIES(ScalingPipeline OTBCommon OTBIO)

The source code for this example can be found in the file Examples/Tutorials/ScalingPipeline.cxx.
This example illustrates the use of the itk::RescaleIntensityImageFilter to convert the result for proper display.

We include the required headers, including the headers for the itk::CannyEdgeDetectionImageFilter and the itk::RescaleIntensityImageFilter.

#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkCannyEdgeDetectionImageFilter.h"
#include "itkRescaleIntensityImageFilter.h"

int main(int argc, char * argv[])
{
  if (argc != 3)
    {
    std::cerr << "Usage: " << argv[0] << " <input_filename> <output_filename>" << std::endl;
    }

We need to declare two different image types, one for the internal processing and one to output the results:

typedef double PixelType;
typedef otb::Image<PixelType, 2> ImageType;

typedef unsigned char OutputPixelType;
typedef otb::Image<OutputPixelType, 2> OutputImageType;

We declare the reader with the image template using the pixel type double. It is worth noticing that this instantiation does not imply anything about the type of the input image. The original image can be anything; the reader will simply convert the pixel values to double. The writer is templated with the unsigned char image to be able to save the result in one-byte images (like png for example).

typedef otb::ImageFileReader<ImageType> ReaderType;
ReaderType::Pointer reader = ReaderType::New();

typedef otb::ImageFileWriter<OutputImageType> WriterType;
WriterType::Pointer writer = WriterType::New();

reader->SetFileName(argv[1]);
writer->SetFileName(argv[2]);

Now we are declaring the edge detection filter which is going to work with double input and output.

typedef itk::CannyEdgeDetectionImageFilter<ImageType, ImageType> FilterType;
FilterType::Pointer filter = FilterType::New();
  • 80. 46 Chapter 4. Building Simple Applications with OTB Here comes the interesting part: we declare the itk::RescaleIntensityImageFilter. The input image type is the output type of the edge detection filter. The output type is the same as the input type of the writer. Desired minimum and maximum values for the output are specified by the methods SetOutputMinimum() and SetOutputMaximum(). This filter will actually rescale all the pixels of the image but also cast the type of these pixels. typedef itk::RescaleIntensityImageFilter <ImageType , OutputImageType > RescalerType; RescalerType::Pointer rescaler = RescalerType::New (); rescaler ->SetOutputMinimum(0); rescaler ->SetOutputMaximum(255); Let’s plug the pipeline: filter ->SetInput(reader ->GetOutput ()); rescaler ->SetInput (filter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); And finally, we trigger the pipeline execution calling the Update() method on the writer writer ->Update(); return EXIT_SUCCESS; } As you should be getting used to it by now, compile with make and execute as ScalingPipeline QB Suburb.png output.png. You have the filtered version of your image in the output.png file. 4.5 Working with multispectral or color images So far, as you may have noticed, we have been working with grey level images, i.e. with only one spectral band. If you tried to process a color image with some of the previous examples you have probably obtained a deceiving grey result. Often, satellite images combine several spectral band to help the identification of materials: this is called multispectral imagery. In this tutorial, we are going to explore some of the mechanisms used by OTB to process multispectral images. Add the following lines in the CMakeLists.txt file: ADD_EXECUTABLE(Multispectral Multispectral.cxx ) TARGET_LINK_LIBRARIES(Multispectral OTBCommon OTBIO)
The source code for this example can be found in the file Examples/Tutorials/Multispectral.cxx.

First, we are going to use otb::VectorImage instead of the now traditional otb::Image. So we include the required header:

#include "otbVectorImage.h"

We also include some other headers which will be useful later. Note that we are still using the otb::Image in this example for some of the output.

#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbMultiToMonoChannelExtractROI.h"
#include "itkShiftScaleImageFilter.h"
#include "otbPerBandVectorImageFilter.h"

int main(int argc, char * argv[])
{
  if (argc != 4)
    {
    std::cerr << "Usage: " << argv[0]
              << " <input_filename> <output_extract> <output_shifted_scaled>"
              << std::endl;
    }

We want to read a multispectral image, so we declare the image type and the reader. As we have done in the previous example, we get the filename from the command line.

typedef unsigned short int PixelType;
typedef otb::VectorImage<PixelType, 2> VectorImageType;
typedef otb::ImageFileReader<VectorImageType> ReaderType;
ReaderType::Pointer reader = ReaderType::New();
reader->SetFileName(argv[1]);

Sometimes you need to process only one spectral band of the image. To get only one of the spectral bands we use the otb::MultiToMonoChannelExtractROI. The declaration is as usual:

typedef otb::MultiToMonoChannelExtractROI<PixelType, PixelType> ExtractChannelType;
ExtractChannelType::Pointer extractChannel = ExtractChannelType::New();

We need to pass the parameters to the filter for the extraction. This filter also allows extracting only a spatial subset of the image. However, we will extract the whole channel in this case. To do that, we need to pass the desired region using the SetExtractionRegion() method (methods such as SetStartX() and SetSizeX() are also available). We get the region from the reader with the
  • 82. 48 Chapter 4. Building Simple Applications with OTB GetLargestPossibleRegion() method. Before doing that we need to read the metadata from the file: this is done by calling the UpdateOutputInformation() on the reader’s output. The dif- ference with the Update() is that the pixel array is not allocated (yet !) and reduce the memory usage. reader ->UpdateOutputInformation(); extractChannel ->SetExtractionRegion( reader ->GetOutput ()->GetLargestPossibleRegion()); We chose the channel number to extract (starting from 1) and we plug the pipeline. extractChannel ->SetChannel (3); extractChannel ->SetInput(reader ->GetOutput ()); To output this image, we need a writer. As the output of the otb::MultiToMonoChannelExtractROI is a otb::Image, we need to template the writer with this type. typedef otb::Image <PixelType , 2> ImageType ; typedef otb::ImageFileWriter <ImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetFileName (argv[2]); writer ->SetInput(extractChannel ->GetOutput ()); writer ->Update(); After this, we have a one band image that we can process with most OTB filters. In some situation, you may want to apply the same process to all bands of the image. You don’t have to extract each band and process them separately. There is several situations: • the filter (or the combination of filters) you want to use are doing operations that are well defined for itk::VariableLengthVector (which is the pixel type), then you don’t have to do anything special. • if this is not working, you can look for the equivalent filter specially designed for vector images. • some of the filter you need to use applies operations undefined for itk::VariableLengthVector, then you can use the otb::PerBandVectorImageFilter specially designed for this purpose. Let’s see how this filter is working. We chose to apply the itk::ShiftScaleImageFilter to each of the spectral band. We start by declaring the filter on a normal otb::Image. Note that we don’t need to specify any input for this filter.
  • 83. 4.6. Parsing command line arguments 49 typedef itk::ShiftScaleImageFilter <ImageType , ImageType > ShiftScaleType; ShiftScaleType::Pointer shiftScale = ShiftScaleType::New (); shiftScale ->SetScale (0.5); shiftScale ->SetShift (10); We declare the otb::PerBandVectorImageFilter which has three template: the input image type, the output image type and the filter type to apply to each band. The filter is selected using the SetFilter() method and the input by the usual SetInput() method. typedef otb:: PerBandVectorImageFilter <VectorImageType , VectorImageType , ShiftScaleType > VectorFilterType; VectorFilterType::Pointer vectorFilter = VectorFilterType::New (); vectorFilter ->SetFilter (shiftScale ); vectorFilter ->SetInput(reader ->GetOutput ()); Now, we just have to save the image using a writer templated over an otb::VectorImage: typedef otb::ImageFileWriter <VectorImageType > VectorWriterType; VectorWriterType::Pointer writerVector = VectorWriterType::New (); writerVector ->SetFileName (argv[3]); writerVector ->SetInput(vectorFilter ->GetOutput ()); writerVector ->Update(); return EXIT_SUCCESS; } Compile with make and execute as ./Multispectral qb RoadExtract.tif qb blue.tif qb shiftscale.tif. 4.6 Parsing command line arguments Well, if you play with some other filters in the previous example, you probably noticed that in many cases, you need to set some parameters to the filters. Ideally, you want to set some of these parameters from the command line. In OTB, there is a mechanism to help you parse the command line parameters. Let try it! Add the following lines in the CMakeLists.txt file: ADD_EXECUTABLE(SmarterFilteringPipeline SmarterFilteringPipeline.cxx ) TARGET_LINK_LIBRARIES(SmarterFilteringPipeline OTBCommon OTBIO) The source code for this example can be found in the file Examples/Tutorials/SmarterFilteringPipeline.cxx.
We are going to use the otb::HarrisImageFilter to find the points of interest in an image. The derivative computation is performed by a convolution with the derivative of a Gaussian kernel of variance σD (derivation scale) and the smoothing of the image is performed by convolving with a Gaussian kernel of variance σI (integration scale). This allows the computation of the following matrix:

\mu(x, \sigma_I, \sigma_D) = \sigma_D^2 \, g(\sigma_I) \star
\begin{pmatrix}
  L_x^2(x, \sigma_D)   & L_x L_y(x, \sigma_D) \\
  L_x L_y(x, \sigma_D) & L_y^2(x, \sigma_D)
\end{pmatrix}

The output of the detector is det(µ) − α trace²(µ).

We want to set 3 parameters of this filter through the command line: σD (SigmaD), σI (SigmaI) and α (Alpha). We are also going to do things properly and catch the exceptions.

Let's first add the two following headers:

#include "itkMacro.h"
#include "otbCommandLineArgumentParser.h"

The first one is used to handle the exceptions, the second one to help us parse the command line. We include the other required headers, without forgetting to add the header for the otb::HarrisImageFilter. Then we start the usual main function.

#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkRescaleIntensityImageFilter.h"
#include "otbHarrisImageFilter.h"

int main(int argc, char * argv[])
{

To handle the exceptions properly, we need to put all the instructions inside a try block.

try
  {

Now, we can declare the otb::CommandLineArgumentParser which is going to parse the command line, select the proper variables, handle the missing compulsory arguments and print an error message if necessary. Let's declare the parser:

typedef otb::CommandLineArgumentParser ParserType;
ParserType::Pointer parser = ParserType::New();
  • 85. 4.6. Parsing command line arguments 51 It’s now time to tell the parser what are the options we want. Special options are available for input and output images with the AddInputImage() and AddOutputImage() methods. For the other options, we need to use the AddOption() method. This method allows us to specify • the name of the option • a message to explain the meaning of this option • a shortcut for this option • the number of expected parameters for this option • whether or not this option is compulsory parser ->SetProgramDescription( "This program applies a Harris detector on the input image"); parser ->AddInputImage(); parser ->AddOutputImage(); parser ->AddOption ("--SigmaD", "Set the sigmaD parameter . Default is 1.0.", "-d", 1, false); parser ->AddOption ("--SigmaI", "Set the sigmaI parameter . Default is 1.0.", "-i", 1, false); parser ->AddOption ("--Alpha", "Set the alpha parameter . Default is 1.0.", "-a", 1, false); Now that the parser has all this information, it can actually look at the command line to parse it. We have to do this within a try - catch loop to handle exceptions nicely. typedef otb::CommandLineArgumentParseResult ParserResultType; ParserResultType::Pointer parseResult = ParserResultType::New (); try { parser ->ParseCommandLine(argc , argv , parseResult ); } catch (itk::ExceptionObject& err) { std::string descriptionException = err.GetDescription(); if (descriptionException.find("ParseCommandLine(): Help Parser") != std::string::npos) {
  • 86. 52 Chapter 4. Building Simple Applications with OTB return EXIT_SUCCESS; } if (descriptionException.find("ParseCommandLine(): Version Parser") != std::string::npos) { return EXIT_SUCCESS; } return EXIT_FAILURE; } Now, we can declare the image type, the reader and the writer as before: typedef double PixelType ; typedef otb::Image <PixelType , 2> ImageType ; typedef unsigned char OutputPixelType; typedef otb::Image <OutputPixelType , 2> OutputImageType; typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New(); typedef otb::ImageFileWriter <OutputImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New(); We are getting the filenames for the input and the output images directly from the parser: reader ->SetFileName (parseResult ->GetInputImage(). c_str()); writer ->SetFileName (parseResult ->GetOutputImage(). c_str()); Now we have to declare the filter. It is templated with the input image type and the output image type like many filters in OTB. Here we are using the same type for the input and the output images: typedef otb::HarrisImageFilter <ImageType , ImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New(); We set the filter parameters from the parser. The method IsOptionPresent() let us know if an optional option was provided in the command line. if (parseResult ->IsOptionPresent("--SigmaD")) filter ->SetSigmaD ( parseResult ->GetParameterDouble("--SigmaD")); if (parseResult ->IsOptionPresent("--SigmaI")) filter ->SetSigmaI ( parseResult ->GetParameterDouble("--SigmaI")); if (parseResult ->IsOptionPresent("--Alpha")) filter ->SetAlpha ( parseResult ->GetParameterDouble("--Alpha")); We add the rescaler filter as before
  • 87. 4.6. Parsing command line arguments 53 typedef itk::RescaleIntensityImageFilter <ImageType , OutputImageType > RescalerType; RescalerType::Pointer rescaler = RescalerType::New(); rescaler ->SetOutputMinimum(0); rescaler ->SetOutputMaximum(255); Let’s plug the pipeline: filter ->SetInput(reader ->GetOutput ()); rescaler ->SetInput (filter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); We trigger the pipeline execution calling the Update() method on the writer writer ->Update (); } Finally, we have to handle exceptions we may have raised before catch (itk:: ExceptionObject& err) { std::cout << "Following otbException catch :" << std::endl; std::cout << err << std::endl; return EXIT_FAILURE; } catch (std::bad_alloc & err) { std::cout << "Exception bad_alloc : " << (char*) err.what() << std::endl; return EXIT_FAILURE; } catch (...) { std::cout << "Unknown Exception found !" << std::endl; return EXIT_FAILURE; } return EXIT_SUCCESS; } Compile with make as usual. The execution is a bit different now as we have an automatic parsing of the command line. First, try to execute as SmarterFilteringPipeline without any argument. The usage message (automatically generated) appears: ’--InputImage’ option is obligatory !!! Usage : ./SmarterFilteringPipeline [--help|-h] : Help [--version|-v] : Version
  • 88. 54 Chapter 4. Building Simple Applications with OTB --InputImage|-in : input image file name (1 parameter) --OutputImage|-out : output image file name (1 parameter) [--SigmaD|-d] : Set the sigmaD parameter of the Harris points of interest algorithm. Default is 1.0. (1 parameter) [--SigmaI|-i] : Set the SigmaI parameter of the Harris points of interest algorithm. Default is 1.0. (1 parameter) [--Alpha|-a] : Set the alpha parameter of the Harris points of interest algorithm. Default is 1.0. (1 parameter) That looks a bit more professional: another user should be able to play with your program. As this is automatic, that’s a good way not to forget to document your programs. So now you have a better idea of the command line options that are possible. Try SmarterFilteringPipeline -in QB Suburb.png -out output.png for a basic version with the default values. If you want a result that looks a bit better, you have to adjust the parameter with SmarterFilteringPipeline -in QB Suburb.png -out output.png -d 1.5 -i 2 -a 0.1 for example. 4.7 Viewer So far, we had to save the image and use an external viewer every time we wanted to see the result of our processing. That is not very convenient, especially for some exotic formats (16 bits, floating point ...). Thankfully, OTB comes with it’s own visualization tool. This tool is accessible by the class otb::ImageViewer. We will now design a simple, minimalistic example to illustrate the use for this viewer. First you need to add the following lines in the CMakeLists.txt file: ADD_EXECUTABLE(SimpleViewer SimpleViewer.cxx ) TARGET_LINK_LIBRARIES(SimpleViewer OTBCommon OTBIO OTBGui OTBVisualization) Notice that you have to link to 2 other OTB libraries: OTBGui and OTBVisualization. In other situa- tions you might have to add other OTB libraries, for example OTBLearning, OTBFeatureExtraction, etc. To make a long story short, basically add the library when you’re using some domain specific filters and if you’re getting messages from the compiler such as undefined reference to. The source code for this example can be found in the file Examples/Tutorials/SimpleViewer.cxx. Now, we are going to illustrate the use of the otb::StandardImageViewer to display an image or the result of an algorithm without saving the image. We include the required header including the header for the itk::GradientMagnitudeImageFilter and the otb::ImageViewer.
  • 89. 4.7. Viewer 55 #include "otbImage.h" #include "otbImageFileReader.h" #include "itkGradientMagnitudeImageFilter.h" #include "otbStandardImageViewer.h" int main(int argc , char * argv[]) { We need to declare two different image types, one for the internal processing and one to output the results: typedef double PixelType ; typedef otb::Image <PixelType , 2> ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (argv[1]); Now we are declaring the edge detection filter which is going to work with double input and output. // typedef itk::GradientMagnitudeImageFilter // <ImageType, ImageType> FilterType; // FilterType::Pointer filter = FilterType::New(); The otb::StandardImageViewer is the new, highly versatile component to display images in OTB. There is a system of plugins to fully customize its behavior. But we’ll keep it simple for now. typedef otb::StandardImageViewer <ImageType > ViewerType ; ViewerType ::Pointer viewer = ViewerType ::New (); Let’s plug the pipeline: for the viewer the method is SetImage(). // filter->SetInput(reader->GetOutput()); viewer ->SetImage(reader ->GetOutput ()); When you open several reader, this is handy to be able to identify each one. This can be done using SetLabel(). viewer ->SetLabel(argv[1]); We trigger the pipeline execution and the image display with the Update() method of the viewer. viewer ->Update(); A call to Fl::run() is mandatory to ask the program to listen to mouse and keyword events until the viewer is closed.
  • 90. 56 Chapter 4. Building Simple Applications with OTB Fl::run(); return EXIT_SUCCESS; } After compiling you can execute the program with SimpleViewer QB Suburb.png. The result of the edge detection is displayed. Notice that you can call this simple program with a big im- age (let’s say 30000 × 30000 pixels). For all multithreaded filters (filters which implement a ThreadedGenerateData() method), the image is splitted into piece and only the piece on display is processed. 4.8 Going from raw satellite images to useful products Quite often, when you buy satellite images, you end up with several images. In the case of opti- cal satellite, you often have a panchromatic spectral band with the highest spatial resolution and a multispectral product of the same area with a lower resolution. The resolution ratio is likely to be around 4. To get the best of the image processing algorithms, you want to combine these data to produce a new image with the highest spatial resolution and several spectral band. This step is called fusion and you can find more details about it in 13. However, the fusion suppose that your two images represents exactly the same area. There are different solutions to process your data to reach this situation. Here we are going to use the metadata available with the images to produce an orthorectification as detailled in 11. First you need to add the following lines in the CMakeLists.txt file: ADD_EXECUTABLE(OrthoFusion OrthoFusion.cxx) TARGET_LINK_LIBRARIES(OrthoFusion OTBProjections OTBCommon OTBIO) The source code for this example can be found in the file Examples/Tutorials/OrthoFusion.cxx. Start by including some necessary headers and with the usual main declaration. Apart from the classical header related to image input and output. We need the headers related to the fusion and the orthorectification. One header is also required to be able to process vector images (the XS one) with the orthorectification. #include "otbImage.h" #include "otbVectorImage.h" #include "otbImageFileReader.h" #include "otbImageFileWriter.h" #include "otbOrthoRectificationFilter.h" #include "otbGenericMapProjection.h"
  • 91. 4.8. Going from raw satellite images to useful products 57 #include "otbSimpleRcsPanSharpeningFusionImageFilter.h" #include "otbStandardFilterWatcher.h" int main(int argc , char* argv[]) { We initialize ossim which is required for the orthorectification and we check that all parameters are provided. Basically, we need: • the name of the input PAN image; • the name of the input XS image; • the desired name for the output; • as the coordinates are given in UTM, we need the UTM zone number; • of course, we need the UTM coordinates of the final image; • the size in pixels of the final image; • and the sampling of the final image. We check that all those parameters are provided. if (argc != 12) { std::cout << argv[0] << " <input_pan_filename > <input_xs_filename > "; std::cout << "<output_filename > <utm zone > <hemisphere N/S> "; std::cout << "<x_ground_upper_left_corner> <y_ground_upper_left_corner> "; std::cout << "<x_Size > <y_Size > "; std::cout << "<x_groundSamplingDistance> "; std::cout << "<y_groundSamplingDistance " << "(negative since origin is upper left)>" << std::endl; return EXIT_FAILURE; } We declare the different images, readers and writer: typedef otb::Image <unsigned int, 2> ImageType ; typedef otb::VectorImage <unsigned int, 2> VectorImageType; typedef otb::Image <double, 2> DoubleImageType; typedef otb::VectorImage <double, 2> DoubleVectorImageType; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileReader <VectorImageType > VectorReaderType; typedef otb::ImageFileWriter <VectorImageType > WriterType ; ReaderType ::Pointer readerPAN = ReaderType ::New(); VectorReaderType::Pointer readerXS = VectorReaderType::New ();
  • 92. 58 Chapter 4. Building Simple Applications with OTB WriterType ::Pointer writer = WriterType ::New (); readerPAN ->SetFileName (argv[1]); readerXS ->SetFileName (argv[2]); writer ->SetFileName (argv[3]); We declare the projection (here we chose the UTM projection, other choices are possible) and re- trieve the paremeters from the command line: • the UTM zone • the hemisphere typedef otb::GenericMapProjection <otb:: TransformDirection::INVERSE> InverseProjectionType; InverseProjectionType::Pointer utmMapProjection = InverseProjectionType::New (); utmMapProjection ->SetWkt("Utm"); utmMapProjection ->SetParameter("Zone", argv[4]); utmMapProjection ->SetParameter("Hemisphere ", argv[5]); We will need to pass several parameters to the orthorectification concerning the desired output re- gion: ImageType ::IndexType start; start[0] = 0; start[1] = 0; ImageType ::SizeType size; size[0] = atoi(argv[8]); size[1] = atoi(argv[9]); ImageType ::SpacingType spacing; spacing [0] = atof(argv[10]); spacing [1] = atof(argv[11]); ImageType ::PointType origin; origin[0] = strtod(argv[6], NULL); origin[1] = strtod(argv[7], NULL); We declare the orthorectification filter. And provide the different parameters: typedef otb::OrthoRectificationFilter<ImageType , DoubleImageType , InverseProjectionType > OrthoRectifFilterType; OrthoRectifFilterType::Pointer orthoRectifPAN = OrthoRectifFilterType::New (); orthoRectifPAN ->SetMapProjection(utmMapProjection); orthoRectifPAN ->SetInput(readerPAN ->GetOutput ()); orthoRectifPAN ->SetOutputStartIndex(start);
  • 93. 4.8. Going from raw satellite images to useful products 59 orthoRectifPAN ->SetOutputSize(size); orthoRectifPAN ->SetOutputSpacing(spacing ); orthoRectifPAN ->SetOutputOrigin(origin); Now we are able to have the orthorectified area from the PAN image. We just have to follow a similar process for the XS image. typedef otb::OrthoRectificationFilter<VectorImageType , DoubleVectorImageType , InverseProjectionType > VectorOrthoRectifFilterType; VectorOrthoRectifFilterType::Pointer orthoRectifXS = VectorOrthoRectifFilterType::New(); orthoRectifXS ->SetMapProjection(utmMapProjection); orthoRectifXS ->SetInput(readerXS ->GetOutput ()); orthoRectifXS ->SetOutputStartIndex(start); orthoRectifXS ->SetOutputSize(size); orthoRectifXS ->SetOutputSpacing(spacing ); orthoRectifXS ->SetOutputOrigin(origin); It’s time to declare the fusion filter and set its inputs: typedef otb:: SimpleRcsPanSharpeningFusionImageFilter <DoubleImageType , DoubleVectorImageType , VectorImageType > FusionFilterType; FusionFilterType::Pointer fusion = FusionFilterType::New (); fusion ->SetPanInput (orthoRectifPAN ->GetOutput ()); fusion ->SetXsInput (orthoRectifXS ->GetOutput ()); And we can plug it to the writer. To be able to process the images by tiles, we use the SetAutomaticTiledStreaming() method of the writer. We trigger the pipeline execution with the Update() method. writer ->SetInput(fusion ->GetOutput ()); writer ->SetAutomaticTiledStreaming(); otb:: StandardFilterWatcher watcher(writer , "OrthoFusion "); writer ->Update(); return EXIT_SUCCESS; }
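To make the geometric parameters of this last example more concrete, here is a small worked sketch using purely hypothetical UTM values (they are not taken from the guide): with an upper-left ground corner at (500000, 4500000), an output of 1000 x 1000 pixels and a ground sampling distance of 2 m (hence -2 m along y, since the origin is the upper-left corner), the ground coordinates of the lower-right corner follow directly from origin + size * spacing.

#include <iostream>

int main()
{
  // Hypothetical command-line values, for illustration only.
  const double originX  = 500000.0;   // <x_ground_upper_left_corner>
  const double originY  = 4500000.0;  // <y_ground_upper_left_corner>
  const int    sizeX    = 1000;       // <x_Size>
  const int    sizeY    = 1000;       // <y_Size>
  const double spacingX = 2.0;        // <x_groundSamplingDistance>
  const double spacingY = -2.0;       // negative: rows move southwards on the ground

  // Ground extent covered by the output image.
  const double lowerRightX = originX + sizeX * spacingX;  // 502000
  const double lowerRightY = originY + sizeY * spacingY;  // 4498000

  std::cout << "Lower-right corner: (" << lowerRightX << ", " << lowerRightY << ")" << std::endl;
  return 0;
}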
  • 97. CHAPTER FIVE DATA REPRESENTATION This chapter introduces the basic classes responsible for representing data in OTB. The most com- mon classes are the otb::Image, the itk::Mesh and the itk::PointSet. 5.1 Image The otb::Image class follows the spirit of Generic Programming, where types are separated from the algorithmic behavior of the class. OTB supports images with any pixel type and any spatial dimension. 5.1.1 Creating an Image The source code for this example can be found in the file Examples/DataRepresentation/Image/Image1.cxx. This example illustrates how to manually construct an otb::Image class. The following is the minimal code needed to instantiate, declare and create the image class. First, the header file of the Image class must be included. #include "otbImage.h" Then we must decide with what type to represent the pixels and what the dimension of the image will be. With these two parameters we can instantiate the image class. Here we create a 2D image, which is what we often use in remote sensing applications, anyway, with unsigned short pixel data. typedef otb::Image <unsigned short, 2> ImageType ; The image can then be created by invoking the New() operator from the corresponding image type and assigning the result to a itk::SmartPointer.
  • 98. 64 Chapter 5. Data Representation ImageType ::Pointer image = ImageType ::New (); In OTB, images exist in combination with one or more regions. A region is a subset of the image and indicates a portion of the image that may be processed by other classes in the system. One of the most common regions is the LargestPossibleRegion, which defines the image in its entirety. Other important regions found in OTB are the BufferedRegion, which is the portion of the image actually maintained in memory, and the RequestedRegion, which is the region requested by a filter or other class when operating on the image. In OTB, manually creating an image requires that the image is instantiated as previously shown, and that regions describing the image are then associated with it. A region is defined by two classes: the itk::Index and itk::Size classes. The origin of the region within the image with which it is associated is defined by Index. The extent, or size, of the region is defined by Size. Index is represented by a n-dimensional array where each component is an integer indicating—in topological image coordinates—the initial pixel of the image. When an image is created manually, the user is responsible for defining the image size and the index at which the image grid starts. These two parameters make it possible to process selected regions. The starting point of the image is defined by an Index class that is an n-dimensional array where each component is an integer indicating the grid coordinates of the initial pixel of the image. ImageType ::IndexType start; start[0] = 0; // first index on X start[1] = 0; // first index on Y The region size is represented by an array of the same dimension of the image (using the Size class). The components of the array are unsigned integers indicating the extent in pixels of the image along every dimension. ImageType ::SizeType size; size[0] = 200; // size along X size[1] = 200; // size along Y Having defined the starting index and the image size, these two parameters are used to create an ImageRegion object which basically encapsulates both concepts. The region is initialized with the starting index and size of the image. ImageType ::RegionType region; region.SetSize(size); region.SetIndex(start); Finally, the region is passed to the Image object in order to define its extent and origin. The SetRegions method sets the LargestPossibleRegion, BufferedRegion, and RequestedRegion simul- taneously. Note that none of the operations performed to this point have allocated memory for the
  • 99. 5.1. Image 65 image pixel data. It is necessary to invoke the Allocate() method to do this. Allocate does not require any arguments since all the information needed for memory allocation has already been provided by the region. image ->SetRegions (region); image ->Allocate (); In practice it is rare to allocate and initialize an image directly. Images are typically read from a source, such a file or data acquisition hardware. The following example illustrates how an image can be read from a file. 5.1.2 Reading an Image from a File The source code for this example can be found in the file Examples/DataRepresentation/Image/Image2.cxx. The first thing required to read an image from a file is to include the header file of the otb::ImageFileReader class. #include "otbImageFileReader.h" Then, the image type should be defined by specifying the type used to represent pixels and the dimensions of the image. typedef unsigned char PixelType ; const unsigned int Dimension = 2; typedef otb::Image <PixelType , Dimension > ImageType ; Using the image type, it is now possible to instantiate the image reader class. The image type is used as a template parameter to define how the data will be represented once it is loaded into memory. This type does not have to correspond exactly to the type stored in the file. However, a conversion based on C-style type casting is used, so the type chosen to represent the data on disk must be sufficient to characterize it accurately. Readers do not apply any transformation to the pixel data other than casting from the pixel type of the file to the pixel type of the ImageFileReader. The following illustrates a typical instantiation of the ImageFileReader type. typedef otb::ImageFileReader <ImageType > ReaderType ; The reader type can now be used to create one reader object. A itk::SmartPointer (defined by the ::Pointer notation) is used to receive the reference to the newly created reader. The New() method is invoked to create an instance of the image reader. ReaderType ::Pointer reader = ReaderType ::New (); The minimum information required by the reader is the filename of the image to be loaded in mem- ory. This is provided through the SetFileName() method. The file format here is inferred from
the filename extension. The user may also specify the data format explicitly using the itk::ImageIO (see Section 6.1 for more information):

const char * filename = argv[1];
reader->SetFileName(filename);

Reader objects are referred to as pipeline source objects; they respond to pipeline update requests and initiate the data flow in the pipeline. The pipeline update mechanism ensures that the reader only executes when a data request is made to the reader and the reader has not read any data. In the current example we explicitly invoke the Update() method because the output of the reader is not connected to other filters. In normal applications the reader's output is connected to the input of an image filter and the update invocation on the filter triggers an update of the reader. The following line illustrates how an explicit update is invoked on the reader.

reader->Update();

Access to the newly read image can be gained by calling the GetOutput() method on the reader. This method can also be called before the update request is sent to the reader. The reference to the image will be valid even though the image will be empty until the reader actually executes.

ImageType::Pointer image = reader->GetOutput();

Any attempt to access image data before the reader executes will yield an image with no pixel data. It is likely that a program crash will result since the image will not have been properly initialized.

5.1.3 Accessing Pixel Data

The source code for this example can be found in the file Examples/DataRepresentation/Image/Image3.cxx.

This example illustrates the use of the SetPixel() and GetPixel() methods. These two methods provide direct access to the pixel data contained in the image. Note that these two methods are relatively slow and should not be used in situations where high-performance access is required. Image iterators are the appropriate mechanism to efficiently access image pixel data.

The individual position of a pixel inside the image is identified by a unique index. An index is an array of integers that defines the position of the pixel along each coordinate dimension of the image. The IndexType is automatically defined by the image and can be accessed using the scope operator, as in ImageType::IndexType. The length of the array will match the dimensions of the associated image.

The following code illustrates the declaration of an index variable and the assignment of values to each of its components. Please note that Index does not use SmartPointers to access it. This is because Index is a light-weight object that is not intended to be shared between objects. It is more efficient to produce multiple copies of these small objects than to share them using the SmartPointer mechanism.
  • 101. 5.1. Image 67 The following lines declare an instance of the index type and initialize its content in order to associate it with a pixel position in the image. ImageType ::IndexType pixelIndex ; pixelIndex [0] = 27; // x position pixelIndex [1] = 29; // y position Having defined a pixel position with an index, it is then possible to access the content of the pixel in the image. The GetPixel() method allows us to get the value of the pixels. ImageType ::PixelType pixelValue = image ->GetPixel(pixelIndex ); The SetPixel() method allows us to set the value of the pixel. image ->SetPixel(pixelIndex , pixelValue + 1); Please note that GetPixel() returns the pixel value using copy and not reference semantics. Hence, the method cannot be used to modify image data values. Remember that both SetPixel() and GetPixel() are inefficient and should only be used for de- bugging or for supporting interactions like querying pixel values by clicking with the mouse. 5.1.4 Defining Origin and Spacing The source code for this example can be found in the file Examples/DataRepresentation/Image/Image4.cxx. Even though OTB can be used to perform general image processing tasks, the primary purpose of the toolkit is the processing of remote sensing image data. In that respect, additional information about the images is considered mandatory. In particular the information associated with the physical spacing between pixels and the position of the image in space with respect to some world coordinate system are extremely important. Image origin and spacing are fundamental to many applications. Registration, for example, is per- formed in physical coordinates. Improperly defined spacing and origins will result in inconsistent results in such processes. Remote sensing images with no spatial information should not be used for image analysis, feature extraction, GIS input, etc. In other words, remote sensing images lacking spatial information are not only useless but also hazardous. Figure 5.1 illustrates the main geometrical concepts associated with the otb::Image. In this figure, circles are used to represent the center of pixels. The value of the pixel is assumed to exist as a Dirac Delta Function located at the pixel center. Pixel spacing is measured between the pixel centers and can be different along each dimension. The image origin is associated with the coordinates of the first pixel in the image. A pixel is considered to be the rectangular region surrounding the pixel center holding the data value. This can be viewed as the Voronoi region of the image grid, as
  • 102. 68 Chapter 5. Data Representation 0 10050 150 200 0 50 100 150 200 250 300 30.0 20.0 Size=7x6 Spacing=( 20.0, 30.0 ) Physical extent=( 140.0, 180.0 ) Origin=(60.0,70.0) Image Origin Voronoi Region Pixel Coverage Delaunay Region Linear Interpolation Region Pixel Coordinates Spacing[0] Spacing[1] Figure 5.1: Geometrical concepts associated with the OTB image. illustrated in the right side of the figure. Linear interpolation of image values is performed inside the Delaunay region whose corners are pixel centers. Image spacing is represented in a FixedArray whose size matches the dimension of the image. In order to manually set the spacing of the image, an array of the corresponding type must be created. The elements of the array should then be initialized with the spacing between the centers of adjacent pixels. The following code illustrates the methods available in the Image class for dealing with spacing and origin. ImageType ::SpacingType spacing; // Note: measurement units (e.g., meters, feet, etc.) are defined by the application. spacing [0] = 0.70; // spacing along X spacing [1] = 0.70; // spacing along Y The array can be assigned to the image using the SetSpacing() method. image ->SetSpacing (spacing ); The spacing information can be retrieved from an image by using the GetSpacing() method. This method returns a reference to a FixedArray. The returned object can then be used to read the contents of the array. Note the use of the const keyword to indicate that the array will not be modified. const ImageType :: SpacingType & sp = image ->GetSpacing ();
  • 103. 5.1. Image 69 std::cout << "Spacing = "; std::cout << sp[0] << ", " << sp[1] << std::endl; The image origin is managed in a similar way to the spacing. A Point of the appropriate dimension must first be allocated. The coordinates of the origin can then be assigned to every component. These coordinates correspond to the position of the first pixel of the image with respect to an arbitrary reference system in physical space. It is the user’s responsibility to make sure that multiple images used in the same application are using a consistent reference system. This is extremely important in image registration applications. The following code illustrates the creation and assignment of a variable suitable for initializing the image origin. ImageType ::PointType origin; origin[0] = 0.0; // coordinates of the origin[1] = 0.0; // first pixel in 2-D image ->SetOrigin (origin); The origin can also be retrieved from an image by using the GetOrigin() method. This will return a reference to a Point. The reference can be used to read the contents of the array. Note again the use of the const keyword to indicate that the array contents will not be modified. const ImageType :: PointType & orgn = image ->GetOrigin (); std::cout << "Origin = "; std::cout << orgn[0] << ", " << orgn[1] << std::endl; Once the spacing and origin of the image have been initialized, the image will correctly map pixel indices to and from physical space coordinates. The following code illustrates how a point in phys- ical space can be mapped into an image index for the purpose of reading the content of the closest pixel. First, a itk::Point type must be declared. The point type is templated over the type used to represent coordinates and over the dimension of the space. In this particular case, the dimension of the point must match the dimension of the image. typedef itk::Point <double, ImageType ::ImageDimension > PointType ; The Point class, like an itk::Index, is a relatively small and simple object. For this reason, it is not reference-counted like the large data objects in OTB. Consequently, it is also not manipulated with itk::SmartPointers. Point objects are simply declared as instances of any other C++ class. Once the point is declared, its components can be accessed using traditional array notation. In particular, the [] operator is available. For efficiency reasons, no bounds checking is performed on the index used to access a particular point component. It is the user’s responsibility to make sure that the index is in the range {0,Dimension − 1}.
  • 104. 70 Chapter 5. Data Representation PointType point; point[0] = 1.45; // x coordinate point[1] = 7.21; // y coordinate The image will map the point to an index using the values of the current spacing and origin. An index object must be provided to receive the results of the mapping. The index object can be instantiated by using the IndexType defined in the Image type. ImageType ::IndexType pixelIndex ; The TransformPhysicalPointToIndex() method of the image class will compute the pixel index closest to the point provided. The method checks for this index to be contained inside the current buffered pixel data. The method returns a boolean indicating whether the resulting index falls inside the buffered region or not. The output index should not be used when the returned value of the method is false. The following lines illustrate the point to index mapping and the subsequent use of the pixel index for accessing pixel data from the image. bool isInside = image ->TransformPhysicalPointToIndex(point , pixelIndex ); if (isInside ) { ImageType ::PixelType pixelValue = image ->GetPixel(pixelIndex ); pixelValue += 5; image ->SetPixel(pixelIndex , pixelValue ); } Remember that GetPixel() and SetPixel() are very inefficient methods for accessing pixel data. Image iterators should be used when massive access to pixel data is required. 5.1.5 Accessing Image Metadata The source code for this example can be found in the file Examples/IO/MetadataExample.cxx. This example illustrates the access to metadata image information with OTB. By metadata, we mean data which is typically stored with remote sensing images, like geographical coordinates of pixels, pixel spacing or resolution, etc. Of course, the availability of these data depends on the image format used and on the fact that the image producer must fill the available metadata fields. The image formats which typically support metadata are for example CEOS and GeoTiff. The metadata support is embedded in OTB’s IO functionnalities and is accessible through the otb::Image and otb::VectorImage classes. You should avoid using the itk::Image class if
  • 105. 5.1. Image 71 you want to have metadata support. This simple example will consist on reading an image from a file and writing the metadata to an output ASCII file. As usual we start by defining the types needed for the image to be read. typedef unsigned char InputPixelType; const unsigned int Dimension = 2; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::ImageFileReader <InputImageType > ReaderType ; We can now instantiate the reader and get a pointer to the input image. ReaderType ::Pointer reader = ReaderType ::New(); InputImageType::Pointer image = InputImageType::New (); reader ->SetFileName (inputFilename); reader ->Update(); image = reader ->GetOutput (); Once the image has been read, we can access the metadata information. We will copy this informa- tion to an ASCII file, so we create an output file stream for this purpose. std::ofstream file; file.open(outputAsciiFilename); We can now call the different available methods for accessing the metadata. Useful methods are : • GetSpacing: the sampling step; • GetOrigin: the coordinates of the origin of the image; • GetProjectionRef: the image projection reference; • GetGCPProjection: the projection for the eventual ground control points; • GetGCPCount: the number of GCPs available; file << "Spacing " << image ->GetSpacing () << std::endl; file << "Origin " << image ->GetOrigin () << std::endl; file << "Projection REF " << image ->GetProjectionRef() << std::endl; file << "GCP Projection " << image ->GetGCPProjection() << std::endl; unsigned int GCPCount = image ->GetGCPCount (); file << "GCP Count " << image ->GetGCPCount () << std::endl;
  • 106. 72 Chapter 5. Data Representation One can also get the GCPs by number, as well as their coordinates in image and geographical space. for (unsigned int GCPnum = 0; GCPnum < GCPCount ; GCPnum ++) { file << "GCP[" << GCPnum << "] Id " << image ->GetGCPId(GCPnum) << std::endl; file << "GCP[" << GCPnum << "] Info " << image ->GetGCPInfo (GCPnum) << std::endl; file << "GCP[" << GCPnum << "] Row " << image ->GetGCPRow (GCPnum) << std::endl; file << "GCP[" << GCPnum << "] Col " << image ->GetGCPCol (GCPnum) << std::endl; file << "GCP[" << GCPnum << "] X " << image ->GetGCPX(GCPnum) << std::endl; file << "GCP[" << GCPnum << "] Y " << image ->GetGCPY(GCPnum) << std::endl; file << "GCP[" << GCPnum << "] Z " << image ->GetGCPZ(GCPnum) << std::endl; file << "----------------" << std::endl; } If a geographical transformation is available, it can be recovered as follows. InputImageType::VectorType tab = image ->GetGeoTransform(); file << "Geo Transform " << std::endl; for (unsigned int i = 0; i < tab.size(); ++i) { file << " " << i << " -> " << tab[i] << std::endl; } tab.clear(); tab = image ->GetUpperLeftCorner(); file << "Corners " << std::endl; for (unsigned int i = 0; i < tab.size(); ++i) { file << " UL[" << i << "] -> " << tab[i] << std::endl; } tab.clear(); tab = image ->GetUpperRightCorner(); for (unsigned int i = 0; i < tab.size(); ++i) { file << " UR[" << i << "] -> " << tab[i] << std::endl; } tab.clear(); tab = image ->GetLowerLeftCorner(); for (unsigned int i = 0; i < tab.size(); ++i) { file << " LL[" << i << "] -> " << tab[i] << std::endl; } tab.clear(); tab = image ->GetLowerRightCorner(); for (unsigned int i = 0; i < tab.size(); ++i) { file << " LR[" << i << "] -> " << tab[i] << std::endl; }
tab.clear(); file.close(); 5.1.6 RGB Images The term RGB (Red, Green, Blue) stands for a color representation commonly used in digital imaging. RGB is a representation of the human physiological capability to analyze visual light using three spectral-selective sensors [94, 150]. The human retina possesses different types of light-sensitive cells. Three of them, known as cones, are sensitive to color [52] and their regions of sensitivity loosely match regions of the spectrum that will be perceived as red, green and blue respectively. The rods, on the other hand, provide no color discrimination and favor high resolution and high sensitivity1. A fifth type of receptors, the ganglion cells, also known as circadian2 receptors, are sensitive to the lighting conditions that differentiate day from night. These receptors evolved as a mechanism for synchronizing the physiology with the time of day. Cellular controls for circadian rhythms are present in every cell of an organism and are known to be exquisitely precise [90]. The RGB space has been constructed as a representation of a physiological response to light by the three types of cones in the human eye. RGB is not a Vector space. For example, negative numbers are not appropriate in a color space because they would be the equivalent of “negative stimulation” of the human eye. In the context of colorimetry, negative color values are used as an artificial construct for color comparison in the sense that ColorA = ColorB − ColorC (5.1) is just a way of saying that we can produce ColorB by combining ColorA and ColorC. However, we must be aware that (at least in emitted light) it is not possible to subtract light. So when we mention Equation 5.1 we actually mean ColorB = ColorA + ColorC (5.2) On the other hand, when dealing with printed color and with paint, as opposed to emitted light as in computer screens, the physical behavior of color allows for subtraction. This is because, strictly speaking, the objects that we see as red are those that absorb all light frequencies except those in the red section of the spectrum [150]. The concept of addition and subtraction of colors has to be carefully interpreted. In fact, RGB has a different definition depending on whether we are talking about the channels associated with the three color sensors of the human eye, the three phosphors found in most computer monitors, or the color inks that are used for printing reproduction. Color spaces are usually non-linear and do not even form a Group. For example, not all visible colors can be represented in RGB space [150]. 1The human eye is capable of perceiving a single isolated photon. 2The term Circadian refers to the cycle of day and night, that is, events that are repeated with 24-hour intervals.
ITK introduces the itk::RGBPixel type as a support for representing the values of an RGB color space. As such, the RGBPixel class embodies a different concept from that of an itk::Vector in space. For this reason, the RGBPixel class lacks many of the operators that may be naively expected from it. In particular, there are no defined operations for subtraction or addition. When you expect to perform the operation of “Mean” on an RGB type you are assuming that the color space provides the means of finding a color in the middle of two colors by using a linear operation between their numerical representations. This is unfortunately not the case in color spaces due to the fact that they are based on a human physiological response [94]. If you decide to interpret RGB images as simply three independent channels then you should rather use the itk::Vector type as pixel type. In this way, you will have access to the set of operations that are defined in Vector spaces. The current implementation of the RGBPixel in ITK presumes that RGB color images are intended to be used in applications where a formal interpretation of color is desired, and therefore only the operations that are valid in a color space are available in the RGBPixel class. The following example illustrates how RGB images can be represented in OTB. The source code for this example can be found in the file Examples/DataRepresentation/Image/RGBImage.cxx. Thanks to the flexibility offered by the Generic Programming style on which OTB is based, it is possible to instantiate images of arbitrary pixel type. The following example illustrates how a color image with RGB pixels can be defined. A class intended to support the RGB pixel type is available in ITK. You could also define your own pixel class and use it to instantiate a custom image type. In order to use the itk::RGBPixel class, it is necessary to include its header file. #include "itkRGBPixel .h" The RGB pixel class is templated over the type used to represent each one of the red, green and blue pixel components. A typical instantiation of the templated class is as follows. typedef itk::RGBPixel <unsigned char> PixelType ; The type is then used as the pixel template parameter of the image. typedef otb::Image <PixelType , 2> ImageType ; The image type can be used to instantiate other filters, for example, an otb::ImageFileReader object that will read the image from a file. typedef otb::ImageFileReader <ImageType > ReaderType ; Access to the color components of the pixels can now be performed using the methods provided by the RGBPixel class.
  • 109. 5.1. Image 75 PixelType onePixel = image ->GetPixel(pixelIndex ); PixelType ::ValueType red = onePixel.GetRed (); PixelType ::ValueType green = onePixel.GetGreen (); PixelType ::ValueType blue = onePixel.GetBlue (); The subindex notation can also be used since the itk::RGBPixel inherits the [] operator from the itk::FixedArray class. red = onePixel [0]; // extract Red component green = onePixel [1]; // extract Green component blue = onePixel [2]; // extract Blue component std::cout << "Pixel values:" << std ::endl; std::cout << "Red = " << itk::NumericTraits <PixelType ::ValueType >:: PrintType (red) << std::endl; std::cout << "Green = " << itk::NumericTraits <PixelType ::ValueType >:: PrintType (green) << std::endl; std::cout << "Blue = " << itk::NumericTraits <PixelType ::ValueType >:: PrintType (blue) << std::endl; 5.1.7 Vector Images The source code for this example can be found in the file Examples/DataRepresentation/Image/VectorImage.cxx. Many image processing tasks require images of non-scalar pixel type. A typical example is a multi- spectral image. The following code illustrates how to instantiate and use an image whose pixels are of vector type. We could use the itk::Vector class to define the pixel type. The Vector class is intended to represent a geometrical vector in space. It is not intended to be used as an array container like the std::vector in STL. If you are interested in containers, the itk::VectorContainer class may provide the functionality you want. However, the itk::Vector is a fixed size array and it assumes that the number of channels of the image is known at compile time. Therefore, we prefer to use the otb::VectorImage class which allows to choose the number of channels of the image at runtime. The pixels will be of type itk::VariableLengthVector. The first step is to include the header file of the VectorImage class. #include "otbVectorImage.h"
The VectorImage class is templated over the type used to represent the pixel components and over the dimension of the space. In this example, we want to represent Pléiades images which have 4 bands. typedef unsigned char PixelType ; typedef otb::VectorImage <PixelType , 2> ImageType ; Since the pixel dimensionality is chosen at runtime, one has to pass this parameter to the image before memory allocation. image ->SetNumberOfComponentsPerPixel(4); image ->Allocate (); The VariableLengthVector class overloads the operator []. This makes it possible to access the Vector’s components using index notation. The user must not forget to allocate the memory for each individual pixel by using the Reserve method. ImageType ::PixelType pixelValue ; pixelValue .Reserve (4); pixelValue [0] = 1; // Blue component pixelValue [1] = 6; // Green component pixelValue [2] = 100; // Red component pixelValue [3] = 100; // NIR component We can now store this vector in one of the image pixels by defining an index and invoking the SetPixel() method. image ->SetPixel(pixelIndex , pixelValue ); The GetPixel() method can also be used to read vector pixels from the image. ImageType ::PixelType value = image ->GetPixel(pixelIndex ); Let us repeat that both SetPixel() and GetPixel() are inefficient and should only be used for debugging purposes or for implementing interactions with a graphical user interface such as querying a pixel value by clicking with the mouse. 5.1.8 Importing Image Data from a Buffer The source code for this example can be found in the file Examples/DataRepresentation/Image/Image5.cxx. This example illustrates how to import data into the otb::Image class. This is particularly useful for interfacing with other software systems. Many systems use a contiguous block of memory as a buffer for image pixel data. The current example assumes this is the case and feeds the buffer into an otb::ImportImageFilter, thereby producing an Image as output.
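Before moving on to the import example, the VectorImage manipulations above can be gathered into one self-contained sketch. This is an illustrative assembly of the snippets already shown, not a listing from VectorImage.cxx; the region size and the pixel index are arbitrary assumptions.

#include "otbVectorImage.h"

int main()
{
  typedef unsigned char                  PixelType;
  typedef otb::VectorImage<PixelType, 2> ImageType;

  ImageType::Pointer image = ImageType::New();

  // Define a small region (size assumed for illustration) and allocate
  // 4-band pixel data; the number of components must be set before Allocate().
  ImageType::SizeType size;
  size.Fill(10);
  ImageType::IndexType start;
  start.Fill(0);
  ImageType::RegionType region;
  region.SetSize(size);
  region.SetIndex(start);
  image->SetRegions(region);
  image->SetNumberOfComponentsPerPixel(4);
  image->Allocate();

  // Build a 4-component pixel value, write it, then read it back.
  ImageType::PixelType pixelValue;
  pixelValue.Reserve(4);
  pixelValue[0] = 1;   // Blue component
  pixelValue[1] = 6;   // Green component
  pixelValue[2] = 100; // Red component
  pixelValue[3] = 100; // NIR component

  ImageType::IndexType pixelIndex;
  pixelIndex.Fill(5);
  image->SetPixel(pixelIndex, pixelValue);
  ImageType::PixelType value = image->GetPixel(pixelIndex);
  (void) value;
  return 0;
}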
  • 111. 5.1. Image 77 For fun we create a synthetic image with a centered sphere in a locally allocated buffer and pass this block of memory to the ImportImageFilter. This example is set up so that on execution, the user must provide the name of an output file as a command-line argument. First, the header file of the ImportImageFilter class must be included. #include "otbImage.h" #include "otbImportImageFilter.h" Next, we select the data type to use to represent the image pixels. We assume that the external block of memory uses the same data type to represent the pixels. typedef unsigned char PixelType ; const unsigned int Dimension = 2; typedef otb::Image <PixelType , Dimension > ImageType ; The type of the ImportImageFilter is instantiated in the following line. typedef otb::ImportImageFilter <ImageType > ImportFilterType; A filter object created using the New() method is then assigned to a SmartPointer. ImportFilterType::Pointer importFilter = ImportFilterType::New (); This filter requires the user to specify the size of the image to be produced as output. The SetRegion() method is used to this end. The image size should exactly match the number of pixels available in the locally allocated buffer. ImportFilterType::SizeType size; size[0] = 200; // size along X size[1] = 200; // size along Y ImportFilterType::IndexType start; start.Fill(0); ImportFilterType::RegionType region; region.SetIndex(start); region.SetSize(size); importFilter ->SetRegion (region); The origin of the output image is specified with the SetOrigin() method. double origin[Dimension ]; origin[0] = 0.0; // X coordinate origin[1] = 0.0; // Y coordinate importFilter ->SetOrigin (origin); The spacing of the image is passed with the SetSpacing() method.
double spacing[Dimension ]; spacing [0] = 1.0; // along X direction spacing [1] = 1.0; // along Y direction importFilter ->SetSpacing (spacing ); Next we allocate the memory block containing the pixel data to be passed to the ImportImageFilter. Note that we use exactly the same size that was specified with the SetRegion() method. In a practical application, you may get this buffer from some other library using a different data structure to represent the images. const unsigned int numberOfPixels = size[0] * size[1]; PixelType * localBuffer = new PixelType [numberOfPixels]; Here we fill up the buffer with a binary sphere (the variable radius holding the sphere radius is defined earlier in the full example file). We use simple for() loops here, similar to those found in the C or FORTRAN programming languages. Note that OTB does not use for() loops in its internal code to access pixels. All pixel access tasks are instead performed using otb::ImageIterators that support the management of n-dimensional images. const double radius2 = radius * radius; PixelType * it = localBuffer ; for (unsigned int y = 0; y < size[1]; y++) { const double dy = static_cast<double>(y) - static_cast<double>(size[1]) / 2.0; for (unsigned int x = 0; x < size[0]; x++) { const double dx = static_cast<double>(x) - static_cast<double>(size[0]) / 2.0; const double d2 = dx * dx + dy * dy; *it++ = (d2 < radius2) ? 255 : 0; } } The buffer is passed to the ImportImageFilter with the SetImportPointer() method. Note that the last argument of this method specifies who will be responsible for deleting the memory block once it is no longer in use. A false value indicates that the ImportImageFilter will not try to delete the buffer when its destructor is called. A true value, on the other hand, will allow the filter to delete the memory block upon destruction of the import filter. For the ImportImageFilter to appropriately delete the memory block, the memory must be allocated with the C++ new operator. Memory allocated with other memory allocation mechanisms, such as C malloc or calloc, will not be deleted properly by the ImportImageFilter. In other words, it is the application programmer’s responsibility to ensure that the ImportImageFilter is only given permission to delete memory that was allocated with the C++ new operator. const bool importImageFilterWillOwnTheBuffer = true; importFilter ->SetImportPointer(localBuffer , numberOfPixels ,
importImageFilterWillOwnTheBuffer); Finally, we can connect the output of this filter to a pipeline. For simplicity we just use a writer here, but it could be any other filter. writer ->SetInput(dynamic_cast<ImageType *>(importFilter ->GetOutput ())); Note that we do not call delete on the buffer since we pass true as the last argument of SetImportPointer(). Now the buffer is owned by the ImportImageFilter. 5.1.9 Image Lists The source code for this example can be found in the file Examples/DataRepresentation/Image/ImageListExample.cxx. This example illustrates the use of the otb::ImageList class. This class provides the functionality needed in order to integrate image lists as data objects into the OTB pipeline. Indeed, if a std::list< ImageType > was used, the update operations on the pipeline might not have the desired effects. In this example, we will only present the basic operations which can be applied to an otb::ImageList object. The first thing required in order to use image lists is to include the header file of the otb::ImageList class. #include "otbImageList.h" As usual, we start by defining the types for the pixel and image types, as well as those for the readers and writers. const unsigned int Dimension = 2; typedef unsigned char InputPixelType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <InputImageType > WriterType ; We can now define the type for the image list. The otb::ImageList class is templated over the type of image contained in it. This means that all images in a list must have the same type. typedef otb::ImageList <InputImageType > ImageListType; Let us assume now that we want to read an image from a file and store it in a list. The first thing to do is to instantiate the reader and set the image file name. We effectively read the image by calling Update(). ReaderType ::Pointer reader = ReaderType ::New ();
reader ->SetFileName (inputFilename); reader ->Update(); We create an image list by using the New() method. ImageListType::Pointer imageList = ImageListType::New(); In order to store the image in the list, the PushBack() method is used. imageList ->PushBack (reader ->GetOutput ()); We could repeat this operation for other readers or the outputs of filters. We will now write an image of the list to a file. We therefore instantiate a writer, set the image file name and set the input image for it. This is done by calling the Back() method of the list, which allows us to get the last element. // Getting the image from the list and writing it to file WriterType ::Pointer writer = WriterType ::New (); writer ->SetFileName (outputFilename); writer ->SetInput(imageList ->Back()); writer ->Update(); Other useful methods are: • SetNthElement() and GetNthElement() allow random access to any element of the list. • Front() to access the first element of the list. • Erase() to remove an element. Also, iterator classes are defined in order to have an efficient means of moving through the list. Finally, the otb::ImageListToImageListFilter is provided in order to implement filters which operate on image lists and produce image lists. 5.2 PointSet 5.2.1 Creating a PointSet The source code for this example can be found in the file Examples/DataRepresentation/Mesh/PointSet1.cxx. The itk::PointSet is a basic class intended to represent geometry in the form of a set of points in n-dimensional space. It is the base class for the itk::Mesh, providing the methods necessary to manipulate sets of points. Points can have values associated with them. The type of such values is defined by a template parameter of the itk::PointSet class (i.e., TPixelType). Two basic interaction styles of PointSets are available in ITK. These styles are referred to as static and dynamic.
  • 115. 5.2. PointSet 81 The first style is used when the number of points in the set is known in advance and is not expected to change as a consequence of the manipulations performed on the set. The dynamic style, on the other hand, is intended to support insertion and removal of points in an efficient manner. Distin- guishing between the two styles is meant to facilitate the fine tuning of a PointSet’s behavior while optimizing performance and memory management. In order to use the PointSet class, its header file should be included. #include "itkPointSet .h" Then we must decide what type of value to associate with the points. This is generally called the PixelType in order to make the terminology consistent with the itk::Image. The PointSet is also templated over the dimension of the space in which the points are represented. The following declaration illustrates a typical instantiation of the PointSet class. typedef itk::PointSet <unsigned short, 2> PointSetType; A PointSet object is created by invoking the New() method on its type. The resulting object must be assigned to a SmartPointer. The PointSet is then reference-counted and can be shared by multiple objects. The memory allocated for the PointSet will be released when the number of references to the object is reduced to zero. This simply means that the user does not need to be concerned with invoking the Delete() method on this class. In fact, the Delete() method should never be called directly within any of the reference-counted ITK classes. PointSetType::Pointer pointsSet = PointSetType::New (); Following the principles of Generic Programming, the PointSet class has a set of associated defined types to ensure that interacting objects can be declared with compatible types. This set of type definitions is commonly known as a set of traits. Among them we can find the PointType type, for example. This is the type used by the point set to represent points in space. The following declaration takes the point type as defined in the PointSet traits and renames it to be conveniently used in the global namespace. typedef PointSetType::PointType PointType ; The PointType can now be used to declare point objects to be inserted in the PointSet. Points are fairly small objects, so it is inconvenient to manage them with reference counting and smart pointers. They are simply instantiated as typical C++ classes. The Point class inherits the [] operator from the itk::Array class. This makes it possible to access its components using index notation. For efficiency’s sake no bounds checking is performed during index access. It is the user’s responsibility to ensure that the index used is in the range {0,Dimension−1}. Each of the components in the point is associated with space coordinates. The following code illustrates how to instantiate a point and initialize its components. PointType p0; p0[0] = -1.0; // x coordinate p0[1] = -1.0; // y coordinate
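The excerpt shows only the initialization of p0; the remaining two points used below are declared and filled in the same way. The coordinate values given here are illustrative assumptions following the same pattern, not values quoted from PointSet1.cxx:

PointType p1;
p1[0] = 1.0;  // x coordinate
p1[1] = -1.0; // y coordinate

PointType p2;
p2[0] = 1.0;  // x coordinate
p2[1] = 1.0;  // y coordinate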
  • 116. 82 Chapter 5. Data Representation Points are inserted in the PointSet by using the SetPoint() method. This method requires the user to provide a unique identifier for the point. The identifier is typically an unsigned integer that will enumerate the points as they are being inserted. The following code shows how three points are inserted into the PointSet. pointsSet ->SetPoint (0, p0); pointsSet ->SetPoint (1, p1); pointsSet ->SetPoint (2, p2); It is possible to query the PointSet in order to determine how many points have been inserted into it. This is done with the GetNumberOfPoints() method as illustrated below. const unsigned int numberOfPoints = pointsSet ->GetNumberOfPoints(); std::cout << numberOfPoints << std::endl; Points can be read from the PointSet by using the GetPoint() method and the integer identifier. The point is stored in a pointer provided by the user. If the identifier provided does not match an existing point, the method will return false and the contents of the point will be invalid. The following code illustrates point access using defensive programming. PointType pp; bool pointExists = pointsSet ->GetPoint (1, &pp); if (pointExists ) { std::cout << "Point is = " << pp << std ::endl; } GetPoint() and SetPoint() are not the most efficient methods to access points in the PointSet. It is preferable to get direct access to the internal point container defined by the traits and use iterators to walk sequentially over the list of points (as shown in the following example). 5.2.2 Getting Access to Points The source code for this example can be found in the file Examples/DataRepresentation/Mesh/PointSet2.cxx. The itk::PointSet class uses an internal container to manage the storage of itk::Points. It is more efficient, in general, to manage points by using the access methods provided directly on the points container. The following example illustrates how to interact with the point container and how to use point iterators. The type is defined by the traits of the PointSet class. The following line conveniently takes the PointsContainer type from the PointSet traits and declare it in the global namespace. typedef PointSetType::PointsContainer PointsContainer;
The actual type of the PointsContainer depends on what style of PointSet is being used. The dynamic PointSet uses the itk::MapContainer while the static PointSet uses the itk::VectorContainer. The vector and map containers are basically ITK wrappers around the STL classes std::map and std::vector. By default, the PointSet uses a static style, hence the default type of point container is a VectorContainer. Both the map and vector containers are templated over the type of the elements they contain. In this case they are templated over PointType. Containers are reference-counted objects. They are then created with the New() method and assigned to an itk::SmartPointer after creation. The following line creates a point container compatible with the type of the PointSet from which the trait has been taken. PointsContainer::Pointer points = PointsContainer::New(); Points can now be defined using the PointType trait from the PointSet. typedef PointSetType::PointType PointType ; PointType p0; PointType p1; p0[0] = -1.0; p0[1] = 0.0; // Point 0 = {-1, 0 } p1[0] = 1.0; p1[1] = 0.0; // Point 1 = { 1, 0 } The created points can be inserted in the PointsContainer using the generic method InsertElement() which requires an identifier to be provided for each point. unsigned int pointId = 0; points ->InsertElement(pointId++, p0); points ->InsertElement(pointId++, p1); Finally, the PointsContainer can be assigned to the PointSet. This will substitute any previously existing PointsContainer on the PointSet. The assignment is done using the SetPoints() method. pointSet ->SetPoints (points); The PointsContainer object can be obtained from the PointSet using the GetPoints() method. This method returns a pointer to the actual container owned by the PointSet which is then assigned to a SmartPointer. PointsContainer::Pointer points2 = pointSet ->GetPoints (); The most efficient way to sequentially visit the points is to use the iterators provided by the PointsContainer. The Iterator type belongs to the traits of the PointsContainer classes. It behaves pretty much like the STL iterators.3 The Points iterator is not a reference-counted class, so it is created directly from the traits without using SmartPointers. typedef PointsContainer::Iterator PointsIterator; 3If you dig deep enough into the code, you will discover that these iterators are actually ITK wrappers around STL iterators.
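As an aside before using the iterators, the static/dynamic choice discussed above is made through the third template argument of the PointSet, the mesh traits. A minimal sketch of how the dynamic style, and therefore a MapContainer-based point container, could be requested; this does not appear in the original example:

#include "itkPointSet.h"
#include "itkDefaultDynamicMeshTraits.h"

// Request the dynamic style explicitly: points are then stored in an
// itk::MapContainer instead of an itk::VectorContainer.
typedef itk::DefaultDynamicMeshTraits<unsigned short, 2, 2> DynamicTraits;
typedef itk::PointSet<unsigned short, 2, DynamicTraits>     DynamicPointSetType;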
  • 118. 84 Chapter 5. Data Representation The subsequent use of the iterator follows what you may expect from a STL iterator. The iterator to the first point is obtained from the container with the Begin() method and assigned to another iterator. PointsIterator pointIterator = points ->Begin(); The ++ operator on the iterator can be used to advance from one point to the next. The actual value of the Point to which the iterator is pointing can be obtained with the Value() method. The loop for walking through all the points can be controlled by comparing the current iterator with the iterator returned by the End() method of the PointsContainer. The following lines illustrate the typical loop for walking through the points. PointsIterator end = points ->End(); while (pointIterator != end) { PointType p = pointIterator.Value(); // access the point std::cout << p << std::endl; // print the point ++ pointIterator; // advance to next point } Note that as in STL, the iterator returned by the End() method is not a valid iterator. This is called a past-end iterator in order to indicate that it is the value resulting from advancing one step after visiting the last element in the container. The number of elements stored in a container can be queried with the Size() method. In the case of the PointSet, the following two lines of code are equivalent, both of them returning the number of points in the PointSet. std::cout << pointSet ->GetNumberOfPoints() << std::endl; std::cout << pointSet ->GetPoints ()->Size() << std::endl; 5.2.3 Getting Access to Data in Points The source code for this example can be found in the file Examples/DataRepresentation/Mesh/PointSet3.cxx. The itk::PointSet class was designed to interact with the Image class. For this reason it was found convenient to allow the points in the set to hold values that could be computed from images. The value associated with the point is referred as PixelType in order to make it consistent with image terminology. Users can define the type as they please thanks to the flexibility offered by the Generic Programming approach used in the toolkit. The PixelType is the first template parameter of the PointSet. The following code defines a particular type for a pixel type and instantiates a PointSet class with it. typedef unsigned short PixelType ; typedef itk::PointSet <PixelType , 2> PointSetType;
  • 119. 5.2. PointSet 85 Data can be inserted into the PointSet using the SetPointData() method. This method requires the user to provide an identifier. The data in question will be associated to the point holding the same identifier. It is the user’s responsibility to verify the appropriate matching between inserted data and inserted points. The following line illustrates the use of the SetPointData() method. unsigned int dataId = 0; PixelType value = 79; pointSet ->SetPointData(dataId++, value); Data associated with points can be read from the PointSet using the GetPointData() method. This method requires the user to provide the identifier to the point and a valid pointer to a location where the pixel data can be safely written. In case the identifier does not match any existing identifier on the PointSet the method will return false and the pixel value returned will be invalid. It is the user’s responsibility to check the returned boolean value before attempting to use it. const bool found = pointSet ->GetPointData(dataId , &value); if (found) { std::cout << "Pixel value = " << value << std::endl; } The SetPointData() and GetPointData() methods are not the most efficient way to get access to point data. It is far more efficient to use the Iterators provided by the PointDataContainer. Data associated with points is internally stored in PointDataContainers. In the same way as with points, the actual container type used depend on whether the style of the PointSet is static or dynamic. Static point sets will use an itk::VectorContainer while dynamic point sets will use an itk::MapContainer. The type of the data container is defined as one of the traits in the PointSet. The following declaration illustrates how the type can be taken from the traits and used to conveniently declare a similar type on the global namespace. typedef PointSetType::PointDataContainer PointDataContainer; Using the type it is now possible to create an instance of the data container. This is a standard reference counted object, henceforth it uses the New() method for creation and assigns the newly created object to a SmartPointer. PointDataContainer::Pointer pointData = PointDataContainer::New (); Pixel data can be inserted in the container with the method InsertElement(). This method requires an identified to be provided for each point data. unsigned int pointId = 0; PixelType value0 = 34; PixelType value1 = 67; pointData ->InsertElement(pointId ++, value0); pointData ->InsertElement(pointId ++, value1);
  • 120. 86 Chapter 5. Data Representation Finally the PointDataContainer can be assigned to the PointSet. This will substitute any previously existing PointDataContainer on the PointSet. The assignment is done using the SetPointData() method. pointSet ->SetPointData(pointData ); The PointDataContainer can be obtained from the PointSet using the GetPointData() method. This method returns a pointer (assigned to a SmartPointer) to the actual container owned by the PointSet. PointDataContainer::Pointer pointData2 = pointSet ->GetPointData(); The most efficient way to sequentially visit the data associated with points is to use the iterators provided by PointDataContainer. The Iterator type belongs to the traits of the PointsContainer classes. The iterator is not a reference counted class, so it is just created directly from the traits without using SmartPointers. typedef PointDataContainer::Iterator PointDataIterator; The subsequent use of the iterator follows what you may expect from a STL iterator. The iterator to the first point is obtained from the container with the Begin() method and assigned to another iterator. PointDataIterator pointDataIterator = pointData2 ->Begin(); The ++ operator on the iterator can be used to advance from one data point to the next. The actual value of the PixelType to which the iterator is pointing can be obtained with the Value() method. The loop for walking through all the point data can be controlled by comparing the current iterator with the iterator returned by the End() method of the PointsContainer. The following lines illustrate the typical loop for walking through the point data. PointDataIterator end = pointData2 ->End(); while (pointDataIterator != end) { PixelType p = pointDataIterator.Value(); // access the pixel data std::cout << p << std::endl; // print the pixel data ++ pointDataIterator; // advance to next pixel/point } Note that as in STL, the iterator returned by the End() method is not a valid iterator. This is called a past-end iterator in order to indicate that it is the value resulting from advancing one step after visiting the last element in the container. 5.2.4 Vectors as Pixel Type The source code for this example can be found in the file Examples/DataRepresentation/Mesh/PointSetWithVectors.cxx.
  • 121. 5.2. PointSet 87 This example illustrates how a point set can be parameterized to manage a particular pixel type. It is quite common to associate vector values with points for producing geometric representations or storing multi-band informations. The following code shows how vector values can be used as pixel type on the PointSet class. The itk::Vector class is used here as the pixel type. This class is appropriate for representing the relative position between two points. It could then be used to manage displacements in disparity map estimations, for example. In order to use the vector class it is necessary to include its header file along with the header of the point set. #include "itkVector .h" #include "itkPointSet .h" Figure 5.2: Vectors as PixelType. The Vector class is templated over the type used to rep- resent the spatial coordinates and over the space di- mension. Since the PixelType is independent of the PointType, we are free to select any dimension for the vectors to be used as pixel type. However, for the sake of producing an interesting example, we will use vec- tors that represent displacements of the points in the PointSet. Those vectors are then selected to be of the same dimension as the PointSet. const unsigned int Dimension = 2; typedef itk::Vector <float, Dimension > PixelType ; Then we use the PixelType (which are actually Vectors) to instantiate the PointSet type and subse- quently create a PointSet object. typedef itk::PointSet <PixelType , Dimension > PointSetType; PointSetType::Pointer pointSet = PointSetType::New (); The following code is generating a circle and assigning vector values to the points. The components of the vectors in this example are computed to represent the tangents to the circle as shown in Figure 5.2. PointSetType::PixelType tangent; PointSetType::PointType point; unsigned int pointId = 0; const double radius = 300.0; for (unsigned int i = 0; i < 360; ++i) { const double angle = i * atan(1.0) / 45.0; point[0] = radius * sin(angle); point[1] = radius * cos(angle); tangent [0] = cos(angle); tangent [1] = -sin(angle);
  • 122. 88 Chapter 5. Data Representation pointSet ->SetPoint (pointId, point); pointSet ->SetPointData(pointId, tangent ); pointId ++; } We can now visit all the points and use the vector on the pixel values to apply a displacement on the points. This is along the spirit of what a deformable model could do at each one of its iterations. typedef PointSetType::PointDataContainer::ConstIterator PointDataIterator; PointDataIterator pixelIterator = pointSet ->GetPointData()->Begin(); PointDataIterator pixelEnd = pointSet ->GetPointData()->End (); typedef PointSetType::PointsContainer::Iterator PointIterator; PointIterator pointIterator = pointSet ->GetPoints ()->Begin(); PointIterator pointEnd = pointSet ->GetPoints ()->End (); while (pixelIterator != pixelEnd && pointIterator != pointEnd) { pointIterator.Value() = pointIterator.Value() + pixelIterator.Value(); ++ pixelIterator; ++ pointIterator; } Note that the ConstIterator was used here instead of the normal Iterator since the pixel values are only intended to be read and not modified. ITK supports const-correctness at the API level. The itk::Vector class has overloaded the + operator with the itk::Point. In other words, vectors can be added to points in order to produce new points. This property is exploited in the center of the loop in order to update the points positions with a single statement. We can finally visit all the points and print out the new values pointIterator = pointSet ->GetPoints ()->Begin(); pointEnd = pointSet ->GetPoints ()->End (); while (pointIterator != pointEnd ) { std::cout << pointIterator.Value() << std::endl; ++ pointIterator; } Note that itk::Vector is not the appropriate class for representing normals to surfaces and gradi- ents of functions. This is due to the way in which vectors behave under affine transforms. ITK has a specific class for representing normals and function gradients. This is the itk::CovariantVector class.
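For completeness, a point set holding surface normals or function gradients is declared in exactly the same way, simply swapping the pixel class. A minimal sketch, not part of PointSetWithVectors.cxx:

#include "itkCovariantVector.h"
#include "itkPointSet.h"

// Normals and gradients should use itk::CovariantVector, which transforms
// correctly under affine maps, unlike itk::Vector.
typedef itk::CovariantVector<float, 2> GradientType;
typedef itk::PointSet<GradientType, 2> GradientPointSetType;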
  • 123. 5.3. Mesh 89 5.3 Mesh 5.3.1 Creating a Mesh The source code for this example can be found in the file Examples/DataRepresentation/Mesh/Mesh1.cxx. The itk::Mesh class is intended to represent shapes in space. It derives from the itk::PointSet class and hence inherits all the functionality related to points and access to the pixel-data associated with the points. The mesh class is also n-dimensional which allows a great flexibility in its use. In practice a Mesh class can be seen as a PointSet to which cells (also known as elements) of many different dimensions and shapes have been added. Cells in the mesh are defined in terms of the existing points using their point-identifiers. In the same way as for the PointSet, two basic styles of Meshes are available in ITK. They are referred to as static and dynamic. The first one is used when the number of points in the set can be known in advance and it is not expected to change as a consequence of the manipulations performed on the set. The dynamic style, on the other hand, is intended to support insertion and removal of points in an efficient manner. The reason for making the distinction between the two styles is to facilitate fine tuning its behavior with the aim of optimizing performance and memory management. In the case of the Mesh, the dynamic/static aspect is extended to the management of cells. In order to use the Mesh class, its header file should be included. #include "itkMesh.h" Then, the type associated with the points must be selected and used for instantiating the Mesh type. typedef float PixelType ; The Mesh type extensively uses the capabilities provided by Generic Programming. In particular the Mesh class is parameterized over the PixelType and the dimension of the space. PixelType is the type of the value associated with every point just as is done with the PointSet. The following line illustrates a typical instantiation of the Mesh. const unsigned int Dimension = 2; typedef itk::Mesh <PixelType , Dimension > MeshType ; Meshes are expected to take large amounts of memory. For this reason they are reference counted objects and are managed using SmartPointers. The following line illustrates how a mesh is cre- ated by invoking the New() method of the MeshType and the resulting object is assigned to a itk::SmartPointer. MeshType ::Pointer mesh = MeshType ::New (); The management of points in the Mesh is exactly the same as in the PointSet. The type point associated with the mesh can be obtained through the PointType trait. The following code shows
  • 124. 90 Chapter 5. Data Representation the creation of points compatible with the mesh type defined above and the assignment of values to its coordinates. MeshType :: PointType p0; MeshType :: PointType p1; MeshType :: PointType p2; MeshType :: PointType p3; p0[0] = -1.0; p0[1] = -1.0; // first point ( -1, -1 ) p1[0] = 1.0; p1[1] = -1.0; // second point ( 1, -1 ) p2[0] = 1.0; p2[1] = 1.0; // third point ( 1, 1 ) p3[0] = -1.0; p3[1] = 1.0; // fourth point ( -1, 1 ) The points can now be inserted in the Mesh using the SetPoint() method. Note that points are copied into the mesh structure. This means that the local instances of the points can now be modified without affecting the Mesh content. mesh ->SetPoint (0, p0); mesh ->SetPoint (1, p1); mesh ->SetPoint (2, p2); mesh ->SetPoint (3, p3); The current number of points in the Mesh can be queried with the GetNumberOfPoints() method. std::cout << "Points = " << mesh ->GetNumberOfPoints() << std::endl; The points can now be efficiently accessed using the Iterator to the PointsContainer as it was done in the previous section for the PointSet. First, the point iterator type is extracted through the mesh traits. typedef MeshType :: PointsContainer::Iterator PointsIterator; A point iterator is initialized to the first point with the Begin() method of the PointsContainer. PointsIterator pointIterator = mesh ->GetPoints ()->Begin(); The ++ operator on the iterator is now used to advance from one point to the next. The actual value of the Point to which the iterator is pointing can be obtained with the Value() method. The loop for walking through all the points is controlled by comparing the current iterator with the iterator returned by the End() method of the PointsContainer. The following lines illustrate the typical loop for walking through the points. PointsIterator end = mesh ->GetPoints ()->End (); while (pointIterator != end) { MeshType ::PointType p = pointIterator.Value(); // access the point
  • 125. 5.3. Mesh 91 std::cout << p << std::endl; // print the point ++ pointIterator; // advance to next point } 5.3.2 Inserting Cells The source code for this example can be found in the file Examples/DataRepresentation/Mesh/Mesh2.cxx. A itk::Mesh can contain a variety of cell types. Typical cells are the itk::LineCell, itk::TriangleCell, itk::QuadrilateralCell and itk::TetrahedronCell. The latter will not be used very often in the remote sensing context. Additional flexibility is provided for managing cells at the price of a bit more of complexity than in the case of point management. The following code creates a polygonal line in order to illustrate the simplest case of cell manage- ment in a Mesh. The only cell type used here is the LineCell. The header file of this class has to be included. #include "itkLineCell .h" In order to be consistent with the Mesh, cell types have to be configured with a number of custom types taken from the mesh traits. The set of traits relevant to cells are packaged by the Mesh class into the CellType trait. This trait needs to be passed to the actual cell types at the moment of their instantiation. The following line shows how to extract the Cell traits from the Mesh type. typedef MeshType ::CellType CellType ; The LineCell type can now be instantiated using the traits taken from the Mesh. typedef itk::LineCell <CellType > LineType ; The main difference in the way cells and points are managed by the Mesh is that points are stored by copy on the PointsContainer while cells are stored in the CellsContainer using pointers. The reason for using pointers is that cells use C++ polymorphism on the mesh. This means that the mesh is only aware of having pointers to a generic cell which is the base class of all the specific cell types. This architecture makes it possible to combine different cell types in the same mesh. Points, on the other hand, are of a single type and have a small memory footprint, which makes it efficient to copy them directly into the container. Managing cells by pointers add another level of complexity to the Mesh since it is now necessary to establish a protocol to make clear who is responsible for allocating and releasing the cells’ memory. This protocol is implemented in the form of a specific type of pointer called the CellAutoPointer. This pointer, based on the itk::AutoPointer, differs in many respects from the SmartPointer. The CellAutoPointer has an internal pointer to the actual object and a boolean flag that indicates if the CellAutoPointer is responsible for releasing the cell memory whenever the time comes for
  • 126. 92 Chapter 5. Data Representation its own destruction. It is said that a CellAutoPointer owns the cell when it is responsible for its destruction. Many CellAutoPointer can point to the same cell but at any given time, only one CellAutoPointer can own the cell. The CellAutoPointer trait is defined in the MeshType and can be extracted as illustrated in the following line. typedef CellType :: CellAutoPointer CellAutoPointer; Note that the CellAutoPointer is pointing to a generic cell type. It is not aware of the actual type of the cell, which can be for example LineCell, TriangleCell or TetrahedronCell. This fact will influence the way in which we access cells later on. At this point we can actually create a mesh and insert some points on it. MeshType ::Pointer mesh = MeshType ::New(); MeshType :: PointType p0; MeshType :: PointType p1; MeshType :: PointType p2; p0[0] = -1.0; p0[1] = 0.0; p1[0] = 1.0; p1[1] = 0.0; p2[0] = 1.0; p2[1] = 1.0; mesh ->SetPoint (0, p0); mesh ->SetPoint (1, p1); mesh ->SetPoint (2, p2); The following code creates two CellAutoPointers and initializes them with newly created cell ob- jects. The actual cell type created in this case is LineCell. Note that cells are created with the normal new C++ operator. The CellAutoPointer takes ownership of the received pointer by using the method TakeOwnership(). Even though this may seem verbose, it is necessary in order to make it explicit from the code that the responsibility of memory release is assumed by the AutoPointer. CellAutoPointer line0; CellAutoPointer line1; line0.TakeOwnership(new LineType ); line1.TakeOwnership(new LineType ); The LineCells should now be associated with points in the mesh. This is done using the identifiers assigned to points when they were inserted in the mesh. Every cell type has a specific number of points that must be associated with it.4 For example a LineCell requires two points, a TriangleCell requires three and a TetrahedronCell requires four. Cells use an internal numbering system for 4 Some cell types like polygons have a variable number of points associated with them.
  • 127. 5.3. Mesh 93 points. It is simply an index in the range {0,NumberOfPoints− 1}. The association of points and cells is done by the SetPointId() method which requires the user to provide the internal index of the point in the cell and the corresponding PointIdentifier in the Mesh. The internal cell index is the first parameter of SetPointId() while the mesh point-identifier is the second. line0 ->SetPointId (0, 0); // line between points 0 and 1 line0 ->SetPointId (1, 1); line1 ->SetPointId (0, 1); // line between points 1 and 2 line1 ->SetPointId (1, 2); Cells are inserted in the mesh using the SetCell() method. It requires an identifier and the Auto- Pointer to the cell. The Mesh will take ownership of the cell to which the AutoPointer is pointing. This is done internally by the SetCell() method. In this way, the destruction of the CellAutoPointer will not induce the destruction of the associated cell. mesh ->SetCell(0, line0); mesh ->SetCell(1, line1); After serving as an argument of the SetCell() method, a CellAutoPointer no longer holds owner- ship of the cell. It is important not to use this same CellAutoPointer again as argument to SetCell() without first securing ownership of another cell. The number of Cells currently inserted in the mesh can be queried with the GetNumberOfCells() method. std::cout << "Cells = " << mesh ->GetNumberOfCells() << std::endl; In a way analogous to points, cells can be accessed using Iterators to the CellsContainer in the mesh. The trait for the cell iterator can be extracted from the mesh and used to define a local type. typedef MeshType :: CellsContainer::Iterator CellIterator; Then the iterators to the first and past-end cell in the mesh can be obtained respectively with the Begin() and End() methods of the CellsContainer. The CellsContainer of the mesh is returned by the GetCells() method. CellIterator cellIterator = mesh ->GetCells ()->Begin(); CellIterator end = mesh ->GetCells ()->End (); Finally a standard loop is used to iterate over all the cells. Note the use of the Value() method used to get the actual pointer to the cell from the CellIterator. Note also that the values returned are pointers to the generic CellType. These pointers have to be down-casted in order to be used as actual LineCell types. Safe down-casting is performed with the dynamic cast operator which will throw an exception if the conversion cannot be safely performed. while (cellIterator != end) {
  • 128. 94 Chapter 5. Data Representation MeshType ::CellType * cellptr = cellIterator.Value(); LineType * line = dynamic_cast<LineType *>(cellptr); std::cout << line ->GetNumberOfPoints() << std::endl; ++ cellIterator; } 5.3.3 Managing Data in Cells The source code for this example can be found in the file Examples/DataRepresentation/Mesh/Mesh3.cxx. In the same way that custom data can be associated with points in the mesh, it is also possible to associate custom data with cells. The type of the data associated with the cells can be different from the data type associated with points. By default, however, these two types are the same. The following example illustrates how to access data associated with cells. The approach is analogous to the one used to access point data. Consider the example of a mesh containing lines on which values are associated with each line. The mesh and cell header files should be included first. #include "itkMesh.h" #include "itkLineCell .h" Then the PixelType is defined and the mesh type is instantiated with it. typedef float PixelType ; typedef itk::Mesh <PixelType , 2> MeshType ; The itk::LineCell type can now be instantiated using the traits taken from the Mesh. typedef MeshType ::CellType CellType ; typedef itk::LineCell <CellType > LineType ; Let’s now create a Mesh and insert some points into it. Note that the dimension of the points matches the dimension of the Mesh. Here we insert a sequence of points that look like a plot of the log() function. MeshType ::Pointer mesh = MeshType ::New(); typedef MeshType ::PointType PointType ; PointType point; const unsigned int numberOfPoints = 10; for (unsigned int id = 0; id < numberOfPoints; id++) { point[0] = static_cast<PointType ::ValueType >(id); // x point[1] = log(static_cast<double>(id)); // y mesh ->SetPoint(id, point); }
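One caveat about the synthetic data above: log(0) is not finite (it evaluates to minus infinity in IEEE arithmetic), so the very first point receives an unusable y coordinate. A possible variant, offsetting the argument by one, is shown below; this is a suggested adjustment, not the code of the original example:

point[1] = log(static_cast<double>(id) + 1.0); // y, finite for id == 0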
  • 129. 5.3. Mesh 95 A set of line cells is created and associated with the existing points by using point identifiers. In this simple case, the point identifiers can be deduced from cell identifiers since the line cells are ordered in the same way. CellType ::CellAutoPointer line; const unsigned int numberOfCells = numberOfPoints - 1; for (unsigned int cellId = 0; cellId < numberOfCells; cellId++) { line.TakeOwnership(new LineType ); line ->SetPointId (0, cellId); // first point line ->SetPointId (1, cellId + 1); // second point mesh ->SetCell(cellId , line); // insert the cell } Data associated with cells is inserted in the itk::Mesh by using the SetCellData() method. It requires the user to provide an identifier and the value to be inserted. The identifier should match one of the inserted cells. In this simple example, the square of the cell identifier is used as cell data. Note the use of static cast to PixelType in the assignment. for (unsigned int cellId = 0; cellId < numberOfCells; cellId++) { mesh ->SetCellData (cellId , static_cast<PixelType >(cellId * cellId)); } Cell data can be read from the Mesh with the GetCellData() method. It requires the user to provide the identifier of the cell for which the data is to be retrieved. The user should provide also a valid pointer to a location where the data can be copied. for (unsigned int cellId = 0; cellId < numberOfCells; cellId++) { PixelType value = itk::NumericTraits <PixelType >:: Zero; mesh ->GetCellData (cellId , &value); std::cout << "Cell " << cellId << " = " << value << std::endl; } Neither SetCellData() or GetCellData() are efficient ways to access cell data. More efficient access to cell data can be achieved by using the Iterators built into the CellDataContainer. typedef MeshType :: CellDataContainer::ConstIterator CellDataIterator; Note that the ConstIterator is used here because the data is only going to be read. This approach is exactly the same already illustrated for getting access to point data. The iterator to the first cell data item can be obtained with the Begin() method of the CellDataContainer. The past-end iterator is returned by the End() method. The cell data container itself can be obtained from the mesh with the method GetCellData(). CellDataIterator cellDataIterator = mesh ->GetCellData ()->Begin(); CellDataIterator end = mesh ->GetCellData ()->End();
  • 130. 96 Chapter 5. Data Representation Finally a standard loop is used to iterate over all the cell data entries. Note the use of the Value() method used to get the actual value of the data entry. PixelType elements are copied into the local variable cellValue. while (cellDataIterator != end) { PixelType cellValue = cellDataIterator.Value(); std::cout << cellValue << std::endl; ++ cellDataIterator; } More details about the use of itk::Mesh can be found in the ITK Software Guide. 5.4 Path 5.4.1 Creating a PolyLineParametricPath The source code for this example can be found in the file Examples/DataRepresentation/Path/PolyLineParametricPath1.cxx. This example illustrates how to use the itk::PolyLineParametricPath. This class will typically be used for representing in a concise way the output of an image segmentation algorithm in 2D. See section 14.3 for an example in the context of alignment detection. The PolyLineParametricPath however could also be used for representing any open or close curve in N-Dimensions as a linear piece-wise approximation. First, the header file of the PolyLineParametricPath class must be included. #include "itkPolyLineParametricPath.h" The path is instantiated over the dimension of the image. const unsigned int Dimension = 2; typedef otb::Image <unsigned char, Dimension > ImageType ; typedef itk::PolyLineParametricPath <Dimension > PathType; ImageType :: ConstPointer image = reader ->GetOutput (); PathType ::Pointer path = PathType ::New(); path ->Initialize (); typedef PathType :: ContinuousIndexType ContinuousIndexType; ContinuousIndexType cindex;
  • 131. 5.4. Path 97 typedef ImageType ::PointType ImagePointType; ImagePointType origin = image ->GetOrigin (); ImageType :: SpacingType spacing = image ->GetSpacing (); ImageType ::SizeType size = image ->GetBufferedRegion(). GetSize (); ImagePointType point; point[0] = origin [0] + spacing [0] * size[0]; point[1] = origin [1] + spacing [1] * size[1]; image ->TransformPhysicalPointToContinuousIndex(origin , cindex); path ->AddVertex (cindex); image ->TransformPhysicalPointToContinuousIndex(point , cindex); path ->AddVertex (cindex);
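Once built, the path can be queried either through its vertex list or by evaluating it at a parametric position. The short usage sketch below reuses the type definitions above; it is an added illustration and is not part of PolyLineParametricPath1.cxx:

// Walk the stored vertices (the continuous indices added with AddVertex()).
const PathType::VertexListType* vertices = path->GetVertexList();
for (unsigned int i = 0; i < vertices->Size(); ++i)
  {
  std::cout << "Vertex " << i << " : " << vertices->ElementAt(i) << std::endl;
  }

// Evaluate the piecewise linear path halfway along its parametric range.
ContinuousIndexType midPoint = path->Evaluate(0.5 * path->EndOfInput());
std::cout << "Mid-path continuous index: " << midPoint << std::endl;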
  • 133. CHAPTER SIX READING AND WRITING IMAGES This chapter describes the toolkit architecture supporting reading and writing of images to files. OTB does not enforce any particular file format, instead, it provides a structure inherited from ITK, supporting a variety of formats that can be easily extended by the user as new formats become available. We begin the chapter with some simple examples of file I/O. 6.1 Basic Example The source code for this example can be found in the file Examples/IO/ImageReadWrite.cxx. The classes responsible for reading and writing images are located at the beginning and end of the data processing pipeline. These classes are known as data sources (readers) and data sinks (writers). Generally speaking they are referred to as filters, although readers have no pipeline input and writers have no pipeline output. The reading of images is managed by the class otb::ImageFileReader while writing is performed by the class otb::ImageFileWriter. These two classes are independent of any particular file format. The actual low level task of reading and writing specific file formats is done behind the scenes by a family of classes of type itk::ImageIO. Actually, the OTB image Readers and Writers are very similar to those of ITK, but provide new functionnalities which are specific to remote sensing images. The first step for performing reading and writing is to include the following headers. #include "otbImageFileReader.h" #include "otbImageFileWriter.h" Then, as usual, a decision must be made about the type of pixel used to represent the image processed by the pipeline. Note that when reading and writing images, the pixel type of the image is not
necessarily the same as the pixel type stored in the file. Your choice of the pixel type (and hence template parameter) should be driven mainly by two considerations:

• It should be possible to cast the pixel type stored in the file to the pixel type you select. This casting will be performed using the standard C-language rules, so you will have to make sure that the conversion does not result in information being lost.
• The pixel type in memory should be appropriate to the type of processing you intend to apply to the images.

A typical selection for remote sensing images is illustrated in the following lines.

typedef unsigned short PixelType;
const unsigned int Dimension = 2;
typedef otb::Image<PixelType, Dimension> ImageType;

Note that the dimension of the image in memory should match the one of the image in the file. There are a couple of special cases in which this condition may be relaxed, but in general it is better to ensure that both dimensions match. This is not a real issue in remote sensing, unless you want to consider multi-band images as volumes (3D) of data.

We can now instantiate the types of the reader and writer. These two classes are parameterized over the image type.

typedef otb::ImageFileReader<ImageType> ReaderType;
typedef otb::ImageFileWriter<ImageType> WriterType;

Then, we create one object of each type using the New() method and assign the result to an itk::SmartPointer.

ReaderType::Pointer reader = ReaderType::New();
WriterType::Pointer writer = WriterType::New();

The name of the file to be read or written is passed with the SetFileName() method.

reader->SetFileName(inputFilename);
writer->SetFileName(outputFilename);

We can now connect these readers and writers to filters to create a pipeline. For example, we can create a short pipeline by passing the output of the reader directly to the input of the writer.

writer->SetInput(reader->GetOutput());

At first sight this may seem a rather useless program, but it actually implements a powerful file format conversion tool! The execution of the pipeline is triggered by the invocation of the Update() method on one of the final objects, in this case the writer. It is a wise practice of defensive programming to place any Update() call inside a try/catch block in case exceptions are thrown during the execution of the pipeline.
Figure 6.1: Collaboration diagram of the ImageIO classes.

try
{
writer->Update();
}
catch (itk::ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}

Note that exceptions should only be caught by pieces of code that know what to do with them. In a typical application this catch block should probably reside in the GUI code; the action taken in the catch block could inform the user about the failure of the IO operation.

The IO architecture of the toolkit makes it possible to avoid explicit specification of the file format used to read or write images.1 The object factory mechanism enables the ImageFileReader and ImageFileWriter to determine at run-time which file format they are working with. Typically, file formats are chosen based on the filename extension, but the architecture supports arbitrarily complex processes to determine whether a file can be read or written. Alternatively, the user can specify the data file format by explicitly instantiating and assigning the appropriate itk::ImageIO subclass. To better understand the IO architecture, please refer to Figures 6.1, 6.2, and 6.3. The following section describes the internals of the IO architecture provided in the toolbox.

6.2 Pluggable Factories

The principle behind the input/output mechanism used in ITK, and therefore in OTB, is known as pluggable factories [49]. This concept is illustrated in the UML diagram in Figure 6.1.

1In this example no file format is specified; this program can be used as a general file conversion utility.
Figure 6.2: Use cases of ImageIO factories.

Figure 6.3: Class diagram of the ImageIO factories.
From the user's point of view the objects responsible for reading and writing files are the otb::ImageFileReader and otb::ImageFileWriter classes. These two classes, however, are not aware of the details involved in reading or writing particular file formats like PNG or GeoTIFF. What they do is dispatch the user's requests to a set of specific classes that are aware of the details of image file formats. These classes are the itk::ImageIO classes. The ITK delegation mechanism enables users to extend the number of supported file formats by simply adding new classes to the ImageIO hierarchy.

Each instance of ImageFileReader and ImageFileWriter has a pointer to an ImageIO object. If this pointer is empty, it will be impossible to read or write an image, and the image file reader/writer must determine which ImageIO class to use to perform IO operations. This is done basically by passing the filename to a centralized class, the itk::ImageIOFactory, and asking it to identify any subclass of ImageIO capable of reading or writing the user-specified file. This is illustrated by the use cases on the right side of Figure 6.2. The ImageIOFactory acts here as a dispatcher that helps locate the actual IO factory classes corresponding to each file format.

Each class derived from ImageIO must provide an associated factory class capable of producing an instance of the ImageIO class. For example, for PNG files, there is an itk::PNGImageIO object that knows how to read these image files and there is an itk::PNGImageIOFactory class capable of constructing a PNGImageIO object and returning a pointer to it. Each time a new file format is added (i.e., a new ImageIO subclass is created), a factory must be implemented as a derived class of the ObjectFactoryBase class, as illustrated in Figure 6.3.

For example, in order to read PNG files, a PNGImageIOFactory is created and registered with the central ImageIOFactory singleton2 class, as illustrated on the left side of Figure 6.2. When the ImageFileReader asks the ImageIOFactory for an ImageIO capable of reading the file identified by filename, the ImageIOFactory iterates over the list of registered factories and asks each one of them whether it knows how to read the file. The factory that responds affirmatively is used to create the specific ImageIO instance that is returned to the ImageFileReader and used to perform the read operations.

With respect to the ITK formats, OTB adds most of the remote sensing image formats. In order to do so, the Geospatial Data Abstraction Library, GDAL (http://www.gdal.org/), is encapsulated in an ImageIO factory. GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license. As a library, it presents a single abstract data model to the calling application for all supported formats, which include CEOS, GeoTIFF, ENVI, and many more. See http://www.gdal.org/formats_list.html for the full format list.

Since GDAL is itself a multi-format library, the GDAL IO factory is able to choose the appropriate resource for reading and writing images. In most cases the mechanism is transparent to the user, who only interacts with the ImageFileReader and ImageFileWriter. It is possible, however, to explicitly select the type of ImageIO object to use. Please see the ITK Software Guide for more details about this.

2Singleton means that there is only one instance of this class in a particular application
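To illustrate that last possibility, the following fragment is a hedged sketch (not taken from the guide's examples) of how one would bypass the factory mechanism and force the GDAL-based IO. It assumes the ImageType/ReaderType definitions of the basic example above and that otb::ImageFileReader exposes the SetImageIO() method of its ITK counterpart.

#include "otbGDALImageIO.h"

// Explicitly create a GDAL-based ImageIO and hand it to the reader;
// the automatic ImageIOFactory lookup is then skipped.
otb::GDALImageIO::Pointer gdalIO = otb::GDALImageIO::New();

ReaderType::Pointer reader = ReaderType::New();
reader->SetFileName(inputFilename);
reader->SetImageIO(gdalIO);
reader->Update();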
6.3 IO Streaming

6.3.1 Implicit Streaming

The source code for this example can be found in the file Examples/IO/StreamingImageReadWrite.cxx.

As we have seen, the reading of images is managed by the class otb::ImageFileReader while writing is performed by the class otb::ImageFileWriter. ITK's pipeline implements streaming: a filter for which the ThreadedGenerateData method is implemented will only produce the data for the region requested by the following filter in the pipeline. Therefore, in order to use the streaming functionality one needs a filter at the end of the pipeline which requests adjacent regions of the image to be processed. In ITK, the itk::StreamingImageFilter class is used for this purpose.

However, ITK does not implement streaming from/to files. This means that even if the pipeline has a small memory footprint, the images have to be stored in memory at least after the read operation and before the write operation.

OTB implements read/write streaming. For the image file reading, this is transparent for the programmer, and if a streaming loop is used at the end of the pipeline, the read operation will be streamed. For the file writing, the otb::ImageFileWriter has to be used.

The first step for performing streamed reading and writing is to include the following headers.

#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"

Then, as usual, a decision must be made about the type of pixel used to represent the image processed by the pipeline.

typedef unsigned char PixelType;
const unsigned int Dimension = 2;
typedef otb::Image<PixelType, Dimension> ImageType;

We can now instantiate the types of the reader and writer. These two classes are parameterized over the image type. We will rescale the intensities of the image as an example of an intermediate processing step.

typedef otb::ImageFileReader<ImageType> ReaderType;
typedef itk::RescaleIntensityImageFilter<ImageType, ImageType> RescalerType;
typedef otb::ImageFileWriter<ImageType> WriterType;

Then, we create one object of each type using the New() method and assign the result to an itk::SmartPointer.

ReaderType::Pointer reader = ReaderType::New();
RescalerType::Pointer rescaler = RescalerType::New();
WriterType::Pointer writer = WriterType::New();

The name of the file to be read or written is passed with the SetFileName() method. We also choose the range of intensities for the rescaler.
  • 139. 6.3. IO Streaming 105 reader ->SetFileName (inputFilename); rescaler ->SetOutputMinimum(0); rescaler ->SetOutputMaximum(255); writer ->SetFileName (outputFilename); We can now connect these readers and writers to filters to create a pipeline. rescaler ->SetInput(reader ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); We can now trigger the pipeline execution by calling the Update method on the writer. writer ->Update (); The writer will ask its preceding filter to provide different portions of the image. Each filter in the pipeline will do the same until the request arrives to the reader. In this way, the pipeline will be executed for each requested region and the whole input image will be read, processed and written without being fully loaded in memory. 6.3.2 Explicit Streaming The source code for this example can be found in the file Examples/IO/ExplicitStreamingExample.cxx. Usually, the streaming process is hidden within the pipeline. This allows the user to get rid of the annoying task of splitting the images into tiles, and so on. However, for some kinds of processing, we do not really need a pipeline: no writer is needed, only read access to pixel values is wanted. In these cases, one has to explicitly set up the streaming procedure. Fortunately, OTB offers a high level of abstraction for this task. We will need to include the following header files: #include "itkImageRegionSplitter.h" #include "otbStreamingTraits.h" The otb::StreamingTraits class manages the streaming approaches which are possible with the image type over which it is templated. The class itk::ImageRegionSplitter is templated over the number of dimensions of the image and will perform the actual image splitting. More information on splitter can be found in section 28.3 typedef otb::StreamingTraits <ImageType > StreamingTraitsType; typedef itk::ImageRegionSplitter <2> SplitterType; Once a region of the image is available, we will use classical region iterators to get the pixels. typedef ImageType ::RegionType RegionType ; typedef itk::ImageRegionConstIterator<ImageType > IteratorType;
We instantiate the image file reader, but in order to avoid reading the whole image, we call the GenerateOutputInformation() method instead of Update(). GenerateOutputInformation() makes available the information about sizes, bands, resolutions, etc. After that, we can access the largest possible region of the input image.

ImageReaderType::Pointer reader = ImageReaderType::New();
reader->SetFileName(infname);
reader->GenerateOutputInformation();
RegionType largestRegion = reader->GetOutput()->GetLargestPossibleRegion();

We now set up the local streaming capabilities by asking the streaming traits to compute the number of regions to split the image into, given the splitter, the user-defined number of lines, and the input image information.

SplitterType::Pointer splitter = SplitterType::New();
unsigned int numberOfStreamDivisions =
  StreamingTraitsType::CalculateNumberOfStreamDivisions(
    reader->GetOutput(), largestRegion, splitter,
    otb::SET_BUFFER_NUMBER_OF_LINES, 0, 0, nbLinesForStreaming);

We can now get the split regions and iterate through them.

unsigned int piece = 0;
RegionType streamingRegion;
for (piece = 0; piece < numberOfStreamDivisions; piece++)
{

We get the region.

streamingRegion = splitter->GetSplit(piece, numberOfStreamDivisions, largestRegion);
std::cout << "Processing region: " << streamingRegion << std::endl;

We ask the reader to provide the region.

reader->GetOutput()->SetRequestedRegion(streamingRegion);
reader->GetOutput()->PropagateRequestedRegion();
reader->GetOutput()->UpdateOutputData();

We declare an iterator and walk through the region.
  • 141. 6.4. Reading and Writing RGB Images 107 IteratorType it(reader ->GetOutput (), streamingRegion); it.GoToBegin (); while (!it.IsAtEnd ()) { std::cout << it.Get() << std::endl; ++it; } 6.4 Reading and Writing RGB Images The source code for this example can be found in the file Examples/IO/RGBImageReadWrite.cxx. RGB images are commonly used for representing data acquired from multispectral sensors. This example illustrates how to read and write RGB color images to and from a file. This requires the following headers as shown. #include "itkRGBPixel .h" #include "otbImage.h" #include "otbImageFileReader.h" #include "otbImageFileWriter.h" The itk::RGBPixel class is templated over the type used to represent each one of the red, green and blue components. A typical instantiation of the RGB image class might be as follows. typedef itk::RGBPixel <unsigned char> PixelType ; typedef otb::Image <PixelType , 2> ImageType ; The image type is used as a template parameter to instantiate the reader and writer. typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <ImageType > WriterType ; ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer = WriterType ::New (); The filenames of the input and output files must be provided to the reader and writer respectively. reader ->SetFileName (inputFilename); writer ->SetFileName (outputFilename); Finally, execution of the pipeline can be triggered by invoking the Update() method in the writer. writer ->Update();
  • 142. 108 Chapter 6. Reading and Writing Images You may have noticed that apart from the declaration of the PixelType there is nothing in this code that is specific for RGB images. All the actions required to support color images are implemented internally in the itk::ImageIO objects. 6.5 Reading, Casting and Writing Images The source code for this example can be found in the file Examples/IO/ImageReadCastWrite.cxx. Given that ITK and OTB are based on the Generic Programming paradigm, most of the types are defined at compilation time. It is sometimes important to anticipate conversion between different types of images. The following example illustrates the common case of reading an image of one pixel type and writing it on a different pixel type. This process not only involves casting but also rescaling the image intensity since the dynamic range of the input and output pixel types can be quite different. The itk::RescaleIntensityImageFilter is used here to linearly rescale the image values. The first step in this example is to include the appropriate headers. #include "otbImageFileReader.h" #include "otbImageFileWriter.h" #include "itkRescaleIntensityImageFilter.h" Then, as usual, a decision should be made about the pixel type that should be used to represent the images. Note that when reading an image, this pixel type is not necessarily the pixel type of the image stored in the file. Instead, it is the type that will be used to store the image as soon as it is read into memory. typedef float InputPixelType; typedef unsigned char OutputPixelType; const unsigned int Dimension = 2; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; We can now instantiate the types of the reader and writer. These two classes are parameterized over the image type. typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; Below we instantiate the RescaleIntensityImageFilter class that will linearly scale the image inten- sities. typedef itk::RescaleIntensityImageFilter< InputImageType , OutputImageType > FilterType ;
A filter object is constructed and the minimum and maximum values of the output are selected using the SetOutputMinimum() and SetOutputMaximum() methods.

FilterType::Pointer filter = FilterType::New();
filter->SetOutputMinimum(0);
filter->SetOutputMaximum(255);

Then, we create the reader and writer and connect the pipeline.

ReaderType::Pointer reader = ReaderType::New();
WriterType::Pointer writer = WriterType::New();
filter->SetInput(reader->GetOutput());
writer->SetInput(filter->GetOutput());

The names of the files to be read and written are passed with the SetFileName() method.

reader->SetFileName(inputFilename);
writer->SetFileName(outputFilename);

Finally, we trigger the execution of the pipeline with the Update() method on the writer. The output image will then be the scaled and cast version of the input image.

try
{
writer->Update();
}
catch (itk::ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}

6.6 Extracting Regions

The source code for this example can be found in the file Examples/IO/ImageReadRegionOfInterestWrite.cxx.

This example should arguably be placed in the filtering chapter. However, its usefulness for typical IO operations makes it worth mentioning here. The purpose of this example is to read an image, extract a subregion and write this subregion to a file. This is a common task when we want to apply a computationally intensive method to the region of interest of an image.

As usual with OTB IO, we begin by including the appropriate header files.

#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
  • 144. 110 Chapter 6. Reading and Writing Images The otb::ExtractROI is the filter used to extract a region from an image. Its header is included below. #include "otbExtractROI.h" Image types are defined below. typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; const unsigned int Dimension = 2; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; The types for the otb::ImageFileReader and otb::ImageFileWriter are instantiated using the image types. typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; The ExtractROI type is instantiated using the input and output pixel types. Using the pixel types as template parameters instead of the image types allows to restrict the use of this class to otb::Images which are used with scalar pixel types. See section 6.8.1 for the extraction of ROIs on otb::VectorImages. A filter object is created with the New() method and assigned to a itk::SmartPointer. typedef otb::ExtractROI <InputImageType::PixelType , OutputImageType::PixelType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); The ExtractROI requires a region to be defined by the user. This is done by defining a rectangle with the following methods (the filter assumes that a 2D image is being processed, for N-D region extraction, you can use the itk::RegionOfInterestImageFilter class). filter ->SetStartX (atoi(argv[3])); filter ->SetStartY (atoi(argv[4])); filter ->SetSizeX(atoi(argv[5])); filter ->SetSizeY(atoi(argv[6])); Below, we create the reader and writer using the New() method and assigning the result to a Smart- Pointer. ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer = WriterType ::New (); The name of the file to be read or written is passed with the SetFileName() method. reader ->SetFileName (inputFilename); writer ->SetFileName (outputFilename);
Below we connect the reader, filter and writer to form the data processing pipeline.

filter->SetInput(reader->GetOutput());
writer->SetInput(filter->GetOutput());

Finally, we execute the pipeline by invoking Update() on the writer. The call is placed in a try/catch block in case exceptions are thrown.

try
{
writer->Update();
}
catch (itk::ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}

6.7 Reading and Writing Vector Images

Images whose pixel type is a Vector, a CovariantVector, an Array, or a Complex are quite common in image processing. One of the uses of these types of images is the processing of SLC SAR images, which are complex.

6.7.1 Reading and Writing Complex Images

The source code for this example can be found in the file Examples/IO/ComplexImageReadWrite.cxx.

This example illustrates how to read and write an image of pixel type std::complex. The complex type is defined as an integral part of the C++ language. We start by including the headers of the complex class, the image, and the reader and writer classes.

#include <complex>
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"

The image dimension and pixel type must be declared. In this case we use std::complex<> as the pixel type. Using the dimension and pixel type we proceed to instantiate the image type.

const unsigned int Dimension = 2;
typedef std::complex<float> PixelType;
typedef otb::Image<PixelType, Dimension> ImageType;
  • 146. 112 Chapter 6. Reading and Writing Images The image file reader and writer types are instantiated using the image type. We can then create objects for both of them. typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <ImageType > WriterType ; ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer = WriterType ::New (); Filenames should be provided for both the reader and the writer. In this particular example we take those filenames from the command line arguments. reader ->SetFileName (argv[1]); writer ->SetFileName (argv[2]); Here we simply connect the output of the reader as input to the writer. This simple program could be used for converting complex images from one fileformat to another. writer ->SetInput(reader ->GetOutput ()); The execution of this short pipeline is triggered by invoking the Update() method of the writer. This invocation must be placed inside a try/catch block since its execution may result in exceptions being thrown. try { writer ->Update (); } catch (itk::ExceptionObject& err) { std::cerr << "ExceptionObject caught !" << std::endl; std::cerr << err << std::endl; return EXIT_FAILURE; } For a more interesting use of this code, you may want to add a filter in between the reader and the writer and perform any complex image to complex image operation. 6.8 Reading and Writing Multiband Images The source code for this example can be found in the file Examples/IO/MultibandImageReadWrite.cxx. The otb::Image class with a vector pixel type could be used for representing multispectral images, with one band per vector component, however, this is not a practical way, since the dimensionality of the vector must be known at compile time. OTB offers the otb::VectorImage where the dimensionality of the vector stored for each pixel can be chosen at runtime. This is needed for the image file readers in order to dynamically set the number of bands of an image read from a file.
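Following up on that remark, here is a small hypothetical sketch of such a complex-to-complex operation: the complex conjugate, applied with an itk::UnaryFunctorImageFilter. The functor and its name are ours, not part of the original example, and the fragment reuses the ImageType and PixelType defined above (the functor itself should be declared at namespace scope, before main()).

#include "itkUnaryFunctorImageFilter.h"

// Functor returning the complex conjugate of each pixel.
struct ConjugateFunctor
{
PixelType operator()(const PixelType& value) const
{
return std::conj(value);
}
bool operator!=(const ConjugateFunctor&) const { return false; }
bool operator==(const ConjugateFunctor& other) const { return !(*this != other); }
};

typedef itk::UnaryFunctorImageFilter<ImageType, ImageType, ConjugateFunctor> ConjugateFilterType;
ConjugateFilterType::Pointer conjugate = ConjugateFilterType::New();

// Insert the filter between the reader and the writer of the example above.
conjugate->SetInput(reader->GetOutput());
writer->SetInput(conjugate->GetOutput());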
The OTB readers and writers are able to deal with otb::VectorImages transparently for the user. The first step for performing reading and writing is to include the following headers.

#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"

Then, as usual, a decision must be made about the type of pixel used to represent the image processed by the pipeline. The pixel type corresponds to the scalar type stored in the vector components. Therefore, for a multiband Pléiades image we will do:

typedef unsigned short PixelType;
const unsigned int Dimension = 2;
typedef otb::VectorImage<PixelType, Dimension> ImageType;

We can now instantiate the types of the reader and writer. These two classes are parameterized over the image type.

typedef otb::ImageFileReader<ImageType> ReaderType;
typedef otb::ImageFileWriter<ImageType> WriterType;

Then, we create one object of each type using the New() method and assign the result to an itk::SmartPointer.

ReaderType::Pointer reader = ReaderType::New();
WriterType::Pointer writer = WriterType::New();

The name of the file to be read or written is passed with the SetFileName() method.

reader->SetFileName(inputFilename);
writer->SetFileName(outputFilename);

We can now connect these readers and writers to filters to create a pipeline. The only thing to take care of is, when executing the program, to choose an output image file format which supports multiband images.

writer->SetInput(reader->GetOutput());
try
{
writer->Update();
}
catch (itk::ExceptionObject& err)
{
std::cerr << "ExceptionObject caught !" << std::endl;
std::cerr << err << std::endl;
return EXIT_FAILURE;
}
6.8.1 Extracting ROIs

The source code for this example can be found in the file Examples/IO/ExtractROI.cxx.

This example shows the use of the otb::MultiChannelExtractROI and otb::MultiToMonoChannelExtractROI classes, which allow the extraction of ROIs from multiband images stored in otb::VectorImages. The first one provides a vector image as output, while the second one provides a classical otb::Image with a scalar pixel type. The present example shows how to extract a ROI from a 4-band SPOT 5 image and produce a first multi-band 3-channel image and a second mono-channel one for the SWIR band.

We start by including the needed header files.

#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbMultiChannelExtractROI.h"
#include "otbMultiToMonoChannelExtractROI.h"

The program arguments define the image file names as well as the rectangular area to be extracted.

const char* inputFilename = argv[1];
const char* outputFilenameRGB = argv[2];
const char* outputFilenameMIR = argv[3];
unsigned int startX((unsigned int) ::atoi(argv[4]));
unsigned int startY((unsigned int) ::atoi(argv[5]));
unsigned int sizeX((unsigned int) ::atoi(argv[6]));
unsigned int sizeY((unsigned int) ::atoi(argv[7]));

As usual, we define the input and output pixel types.

typedef unsigned char InputPixelType;
typedef unsigned char OutputPixelType;

First of all, we extract the multiband part by using the otb::MultiChannelExtractROI class, which is templated over the input and output pixel types. This class is not templated over the image types, in order to force these images to be of otb::VectorImage type.

typedef otb::MultiChannelExtractROI<InputPixelType, OutputPixelType> ExtractROIFilterType;

We create the extractor filter by using the New method of the class and we set its parameters.

ExtractROIFilterType::Pointer extractROIFilter = ExtractROIFilterType::New();
extractROIFilter->SetStartX(startX);
extractROIFilter->SetStartY(startY);
extractROIFilter->SetSizeX(sizeX);
extractROIFilter->SetSizeY(sizeY);
We must tell the filter which channels are to be used. When selecting contiguous bands, we can use the SetFirstChannel and SetLastChannel methods. Otherwise, we select individual channels by using the SetChannel method.

extractROIFilter->SetFirstChannel(1);
extractROIFilter->SetLastChannel(3);

We will use the OTB readers and writers for file access.

typedef otb::ImageFileReader<ExtractROIFilterType::InputImageType> ReaderType;
typedef otb::ImageFileWriter<ExtractROIFilterType::InputImageType> WriterType;
ReaderType::Pointer reader = ReaderType::New();
WriterType::Pointer writer = WriterType::New();

Since the number of bands of the input image is dynamically set at runtime, the UpdateOutputInformation method of the reader must be called before using the extractor filter.

reader->SetFileName(inputFilename);
reader->UpdateOutputInformation();
writer->SetFileName(outputFilenameRGB);

We can then build the pipeline as usual.

extractROIFilter->SetInput(reader->GetOutput());
writer->SetInput(extractROIFilter->GetOutput());

And execute the pipeline by calling the Update method of the writer.

writer->Update();

The usage of the otb::MultiToMonoChannelExtractROI is similar to that of the otb::MultiChannelExtractROI described above. The goal now is to extract an ROI from a multi-band image and generate a mono-channel image as output. We could use the otb::MultiChannelExtractROI and select a single channel, but using the otb::MultiToMonoChannelExtractROI we generate an otb::Image instead of an otb::VectorImage, which is useful from a computing and memory usage point of view. This class is also templated over the pixel types.

typedef otb::MultiToMonoChannelExtractROI<InputPixelType, OutputPixelType> ExtractROIMonoFilterType;

For this filter, only one output channel has to be selected (a sketch completing this mono-channel branch is given below).

extractROIMonoFilter->SetChannel(4);
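The original example stops after selecting the channel, so the following lines are a hypothetical completion of the mono-channel branch; the MonoWriterType typedef and the reuse of the ROI coordinates are our own additions, and the fragment assumes the filter exposes an OutputImageType typedef like the other ExtractROI filters.

// Hypothetical completion of the mono-channel branch (not in the original example).
typedef otb::ImageFileWriter<ExtractROIMonoFilterType::OutputImageType> MonoWriterType;

ExtractROIMonoFilterType::Pointer extractROIMonoFilter = ExtractROIMonoFilterType::New();
extractROIMonoFilter->SetStartX(startX);
extractROIMonoFilter->SetStartY(startY);
extractROIMonoFilter->SetSizeX(sizeX);
extractROIMonoFilter->SetSizeY(sizeY);
extractROIMonoFilter->SetChannel(4);  // the SWIR band of the SPOT 5 image

MonoWriterType::Pointer monoWriter = MonoWriterType::New();
extractROIMonoFilter->SetInput(reader->GetOutput());
monoWriter->SetInput(extractROIMonoFilter->GetOutput());
monoWriter->SetFileName(outputFilenameMIR);
monoWriter->Update();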
  • 150. 116 Chapter 6. Reading and Writing Images Figure 6.4: Quicklook of the original SPOT 5 image. Figure 6.5: Result of the extraction. Left: 3-channel image. Right: mono-band image.
  • 151. 6.9. Reading Image Series 117 Figure 6.5 illustrates the result of the application of both extraction filters on the image presented in figure 6.4. 6.9 Reading Image Series The source code for this example can be found in the file Examples/IO/ImageSeriesIOExample.cxx. This example shows how to read a list of images and concatenate them into a vector image. We will write a program which is able to perform this operation taking advantage of the streaming functionnalities of the processing pipeline. We will assume that all the input images have the same size and a single band. The following header files will be needed: #include "otbImage.h" #include "otbVectorImage.h" #include "otbImageFileReader.h" #include "otbObjectList.h" #include "otbImageList.h" #include "otbImageListToVectorImageFilter.h" #include "otbImageFileWriter.h" We will start by defining the types for the input images and the associated readers. typedef unsigned short int PixelType ; const unsigned int Dimension = 2; typedef otb::Image <PixelType , Dimension > InputImageType; typedef otb::ImageFileReader <InputImageType > ImageReaderType; We will use a list of image file readers in order to open all the input images at once. For this, we use the otb::ObjectList object and we template it over the type of the readers. typedef otb::ObjectList <ImageReaderType > ReaderListType; ReaderListType::Pointer readerList = ReaderListType::New (); We will also build a list of input images in order to store the smart pointers obtained at the output of each reader. This allows us to build a pipeline without really reading the images and using lots of RAM. The otb::ImageList object will be used. typedef otb::ImageList <InputImageType > ImageListType; ImageListType::Pointer imageList = ImageListType::New(); We can now loop over the input image list in order to populate the reader list and the input image list.
  • 152. 118 Chapter 6. Reading and Writing Images for (unsigned int i = 0; i < NbImages; ++i) { ImageReaderType::Pointer imageReader = ImageReaderType::New (); imageReader ->SetFileName (argv[i + 2]); std::cout << "Adding image " << argv[i + 2] << std::endl; imageReader ->UpdateOutputInformation(); imageList ->PushBack (imageReader ->GetOutput ()); readerList ->PushBack(imageReader ); } All the input images will be concatenated into a single output vector image. For this matter, we will use the otb::ImageListToVectorImageFilter which is templated over the input image list type and the output vector image type. typedef otb::VectorImage <PixelType , Dimension > VectorImageType; typedef otb::ImageListToVectorImageFilter<ImageListType , VectorImageType > ImageListToVectorImageFilterType; ImageListToVectorImageFilterType::Pointer iL2VI = ImageListToVectorImageFilterType::New(); We plug the image list as input of the filter and use a otb::ImageFileWriter to write the result image to a file, so that the streaming capabilities of all the readers and the filter are used. iL2VI ->SetInput(imageList ); typedef otb::ImageFileWriter <VectorImageType > ImageWriterType; ImageWriterType::Pointer imageWriter = ImageWriterType::New(); imageWriter ->SetFileName (argv[1]); We can tune the size of the image tiles, so that the total memory footprint of the pipeline is constant for any execution of the program. unsigned long memoryConsumptionInMB = 10; std::cout << "Memory consumption : " << memoryConsumptionInMB << std::endl; imageWriter ->SetAutomaticTiledStreaming(memoryConsumptionInMB); imageWriter ->SetInput (iL2VI ->GetOutput ()); imageWriter ->Update();
6.10 Extended filename for reader and writer

6.10.1 Syntax

The reader and writer extended file name support is based on the same syntax; only the options differ. To benefit from the extended file name mechanism, the following syntax is to be used:

Path/Image.ext?&key1=<value1>&key2=<value2>

IMPORTANT: Note that you will probably need to "quote" the filename on the command line.

6.10.2 Reader options

Available options:

• &geom=<path/filename.geom>
  – Contains the file name of a valid geom file
  – Use the content of the specified geom file instead of image-embedded geometric information
  – Empty by default; use the image-embedded information if available
• &sdataidx=<(int)idx>
  – Select the sub-dataset to read
  – 0 by default
• &resol=<(int)resolution factor>
  – Select the JPEG2000 sub-resolution image to read
  – 0 by default
• &skipcarto=<(bool)true>
  – Skip the cartographic information
  – Clears the projectionref, sets the origin to [0,0] and the spacing to [1/max(1,resolution factor), 1/max(1,resolution factor)]
  – Keeps the keyword list
  – false by default
• &skipgeom=<(bool)true>
  • 154. 120 Chapter 6. Reading and Writing Images – Skip geometric information – Clears the keyword list – Keeps the projectionref and the origin/spacing informations – false by default. 6.10.3 Writer options Available Options: • &writegeom=<(bool)false> – To activate writing of external geom file – true by default • &gdal:co:<GDALKEY>=<VALUE> – To specify a gdal creation option – For gdal creation option information, see dedicated gdal documentation – None by default • &streaming:type=<VALUE> – Activates configuration of streaming through extended filenames – Override any previous configuration of streaming – Allows to configure the kind of streaming to perform – Available values are: ∗ auto : tiled or stripped streaming mode chosen automatically depending on TileHint read from input files ∗ tiled : tiled streaming mode ∗ stripped : stripped streaming mode ∗ none : explicitly deactivate streaming – Not set by default • &streaming:sizemode=<VALUE> – Allows to choose how the size of the streaming pieces is computed – Available values are: ∗ auto : size is estimated from the available memory setting by evaluating pipeline memory print
  • 155. 6.10. Extended filename for reader and writer 121 ∗ height : size is set by setting height of strips or tiles ∗ nbsplits : size is computed from a given number of splits – Default is auto • &streaming:sizevalue=<VALUE> – Parameter for size of streaming pieces computation – Value is : ∗ if sizemode=auto : available memory in Mb ∗ if sizemode=height : height of the strip or tile in pixels ∗ if sizemode=nbsplits : number of requested splits for streaming – If not provided, the default value is set to 0 and result in different behaviour depending on sizemode (if set to height or nbsplits, streaming is deactivated, if set to auto, value is fetched from configuration or cmake configuration file) • &box=<startx>:<starty>:<sizex>:<sizey> – User defined parameters of output image region – The region must be set with 4 unsigned integers (the separator used is the colon ’:’). Values are: ∗ startx: first index on X ∗ starty: first index on Y ∗ sizex: size along X ∗ sizey: size along Y – The definition of the region follows the same convention as itk::Region definition in C++. A region is defined by two classes: the itk::Index and itk::Size classes. The origin of the region within the image with which it is associated is defined by Index The available syntax for boolean options are: • ON, On, on, true, True, 1 are available for setting a ’true’ boolean value • OFF, Off, off, false, False, 0 are available for setting a ’false’ boolean value
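To make the syntax above concrete, here is a short hedged sketch of how such extended filenames are typically passed through SetFileName(); the file paths and option values are purely illustrative and reuse only options documented in the lists above.

// Reader: skip the cartographic information and read the second sub-dataset.
reader->SetFileName("/data/input.tif?&skipcarto=true&sdataidx=1");

// Writer: tiled streaming, one GDAL creation option, and a 500 x 500 pixel
// output box starting at index (1000, 1000).
writer->SetFileName(
  "/tmp/output.tif?&streaming:type=tiled&gdal:co:COMPRESS=DEFLATE&box=1000:1000:500:500");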
CHAPTER SEVEN

READING AND WRITING AUXILIARY DATA

As we have seen in the previous chapter, OTB has a great capability to read and process images. However, images are not the only type of data we will need to manipulate. Images are characterized by a regular sampling grid. For some data, such as Digital Elevation Models (DEM) or lidar, this is too restrictive and we need other representations.

Vector data are also used to represent cartographic objects, segmentation results, etc.: basically, everything which can be seen as points, lines or polygons. OTB provides functionalities for accessing this kind of data.

7.1 Reading DEM Files

The source code for this example can be found in the file Examples/IO/DEMToImageGenerator.cxx.

The following example illustrates the use of the otb::DEMToImageGenerator class. The aim of this class is to generate an image from SRTM data, given the latitude and longitude of the extraction start point. Each pixel is a geographic point and its intensity is the altitude of the point. If SRTM has no altitude information for a point, the altitude value is set to -32768 (the no-data value of the SRTM standard).

Let's look at the minimal code required to use this algorithm. First, the following header defining the otb::DEMToImageGenerator class must be included.

#include "otbDEMToImageGenerator.h"

The image type is now defined using pixel type and dimension. The output image is defined as an otb::Image.
const unsigned int Dimension = 2;
typedef otb::Image<double, Dimension> ImageType;

The DEMToImageGenerator is defined using the image type as a template parameter. After that, the object can be instantiated.

typedef otb::DEMToImageGenerator<ImageType> DEMToImageGeneratorType;
DEMToImageGeneratorType::Pointer object = DEMToImageGeneratorType::New();

Input parameter types are defined to set the values in the otb::DEMToImageGenerator.

typedef DEMToImageGeneratorType::SizeType SizeType;
typedef DEMToImageGeneratorType::SpacingType SpacingType;
typedef DEMToImageGeneratorType::PointType PointType;

The path to the DEM folder is given to the otb::DEMHandler.

otb::DEMHandler::Instance()->OpenDEMDirectory(folderPath);

The origin (longitude/latitude) of the output image in the DEM is given to the filter.

PointType origin;
origin[0] = ::atof(argv[3]);
origin[1] = ::atof(argv[4]);
object->SetOutputOrigin(origin);

The size (in pixels) of the output image is given to the filter.

SizeType size;
size[0] = ::atoi(argv[5]);
size[1] = ::atoi(argv[6]);
object->SetOutputSize(size);

The spacing (step between two consecutive pixels) is given to the filter. By default, this spacing is set to 0.001.

SpacingType spacing;
spacing[0] = ::atof(argv[7]);
spacing[1] = ::atof(argv[8]);
object->SetOutputSpacing(spacing);

The output image name is given to the writer and the filter output is linked to the writer input.

writer->SetFileName(outputName);
writer->SetInput(object->GetOutput());
Figure 7.1: DEMToImageGenerator image.

The invocation of the Update() method on the writer triggers the execution of the pipeline. It is recommended to place update calls in a try/catch block in case errors occur and exceptions are thrown.

try
{
writer->Update();
}
catch (itk::ExceptionObject& err)
{
std::cout << "Exception itk::ExceptionObject thrown !" << std::endl;
std::cout << err << std::endl;
return EXIT_FAILURE;
}

Let's now run this example using as input the SRTM data contained in the DEM srtm folder. Figure 7.1 shows the obtained DEM. Invalid data values – hidden areas due to SAR shadowing – are set to zero.

7.2 Elevation management with OTB

The source code for this example can be found in the file Examples/IO/DEMHandlerExample.cxx.

OTB relies on OSSIM for elevation handling. Since release 3.16, there is a single configuration class, otb::DEMHandler, to manage elevation (in image projections or localization functions, for example). This configuration is managed by a proper instantiation and parameter setting of this class. These instantiations must be done before any call to geometric filters or functionalities. OSSIM internal accesses to elevation are also configured by this class, which ensures consistency throughout the library. This class is a singleton; the New() method is deprecated and will be removed in a future release. We need to use the Instance() method instead.
otb::DEMHandler::Pointer demHandler = otb::DEMHandler::Instance();

It allows configuring a directory containing DEM tiles (DTED or SRTM supported) using the OpenDEMDirectory() method. The OpenGeoidFile() method allows providing a geoid file as well. Last, a default height above ellipsoid can be set using the SetDefaultHeightAboveEllipsoid() method.

demHandler->SetDefaultHeightAboveEllipsoid(defaultHeight);
if (!demHandler->IsValidDEMDirectory(demdir.c_str()))
{
std::cerr << "IsValidDEMDirectory(" << demdir << ") = false" << std::endl;
fail = true;
}
demHandler->OpenDEMDirectory(demdir);
demHandler->OpenGeoidFile(geoid);

We can now retrieve the height above ellipsoid or the height above Mean Sea Level (MSL) using the methods GetHeightAboveEllipsoid() and GetHeightAboveMSL(). The outputs of these methods depend on the configuration of the otb::DEMHandler class; the different cases are:

For GetHeightAboveEllipsoid():
• DEM and geoid both available: DEM value + geoid offset
• No DEM but geoid available: geoid offset
• DEM available, but no geoid: DEM value
• No DEM and no geoid available: default height above ellipsoid

For GetHeightAboveMSL():
• DEM and geoid both available: SRTM value
• No DEM but geoid available: 0
• DEM available, but no geoid: SRTM value
• No DEM and no geoid available: 0

otb::DEMHandler::PointType point;
point[0] = longitude;
point[1] = latitude;
double height = -32768;
height = demHandler->GetHeightAboveMSL(point);
std::cout << "height above MSL (" << longitude << "," << latitude << ") = " << height << " meters" << std::endl;

height = demHandler->GetHeightAboveEllipsoid(point);
std::cout << "height above ellipsoid (" << longitude << ", " << latitude << ") = " << height << " meters" << std::endl;

Note that OSSIM internal calls for sensor modelling use the height above ellipsoid, and follow the same logic as the GetHeightAboveEllipsoid() method. More examples about representing DEMs are presented in section 24.2.4.

7.3 Lidar data Files

The source code for this example can be found in the file Examples/IO/LidarReaderExample.cxx.

This example describes how to read a lidar data file and display basic information about it. Here, OTB makes use of libLAS. The first step toward the use of these filters is to include the proper header files.

#include "otbPointSetFileReader.h"
#include "itkPointSet.h"
#include <fstream>

int main(int argc, char* argv[])
{

We now need to declare the data types that we will be using and instantiate the reader (which is an otb::PointSetFileReader).

typedef itk::PointSet<double, 2> PointSetType;
typedef otb::PointSetFileReader<PointSetType> PointSetFileReaderType;
PointSetFileReaderType::Pointer reader = PointSetFileReaderType::New();
reader->SetFileName(argv[1]);
reader->Update();

We can already display some interesting information, such as the data extent and the number of points:

std::cout << "*** Data area *** " << std::endl;
std::cout << std::setprecision(15);
std::cout << " - Easting min: " << reader->GetMinX() << std::endl;
std::cout << " - Easting max: " << reader->GetMaxX() << std::endl;
std::cout << " - Northing min: " << reader->GetMinY() << std::endl;
std::cout << " - Northing max: " << reader->GetMaxY() << std::endl;
std::cout << "Data points: " << reader->GetNumberOfPoints() << std::endl;
We can also loop over the points to see each point with its coordinates and value. Be careful here: data sets can have hundreds of thousands of points:

PointSetType::Pointer data = reader->GetOutput();
unsigned long nPoints = data->GetNumberOfPoints();
for (unsigned long i = 0; i < nPoints; ++i)
{
PointSetType::PointType point;
data->GetPoint(i, &point);
std::cout << point << " : ";
PointSetType::PixelType value = itk::NumericTraits<PointSetType::PixelType>::Zero;
data->GetPointData(i, &value);
std::cout << value << std::endl;
}
return EXIT_SUCCESS;
}

The source code for this example can be found in the file Examples/IO/LidarToImageExample.cxx.

This example describes how to convert a point set obtained from lidar data to an image file. A lidar produces a point set which is irregular in terms of spatial sampling, so to be able to generate an image, an interpolation is required. The interpolation is done using the itk::BSplineScatteredDataPointSetToImageFilter, which uses B-splines. The method is fully described in [135] and [87].

The first step toward the use of these filters is to include the proper header files.

#include "itkPointSet.h"
#include "otbImage.h"
#include "otbPointSetFileReader.h"
#include "itkBSplineScatteredDataPointSetToImageFilter.h"
#include "otbImageFileWriter.h"

An important note here: the itk::BSplineScatteredDataPointSetToImageFilter is in the Review directory of ITK, which means that it won't be compiled by default. If you want to use it, you have to set ITK_USE_REVIEW to ON in the ccmake configuration of OTB.

Then, we declare the type of the data that we are going to use. As lidar describes the altitude of the point (often to centimeter accuracy), a real type is required to represent it.

typedef double RealType;
typedef itk::Vector<RealType, 1> VectorType;
typedef itk::PointSet<VectorType, 2> PointSetType;
typedef otb::Image<RealType, 2> ImageType;
typedef otb::Image<VectorType, 2> VectorImageType;
  • 163. 7.3. Lidar data Files 129 Lidar data are read into a point set using the otb::PointSetFileReader. Its usage is very similar to the otb::ImageFileReader: typedef otb::PointSetFileReader <PointSetType > PointSetFileReaderType; PointSetFileReaderType::Pointer reader = PointSetFileReaderType::New(); reader ->SetFileName (argv[1]); reader ->Update(); We can now prepare the parameters to pass to the interpolation filter: you have to be aware that the origin of the image is on the upper left corner and thus corresponds to the minimun easting but the maximum northing. double resolution = atof(argv[4]); int splineOrder = atoi(argv[5]); int level = atoi(argv[6]); ImageType ::IndexType start; start[0] = 0; start[1] = 0; ImageType ::SizeType size; size[0] = static_cast<long int>(ceil( (vcl_ceil(reader ->GetMaxX ()) - vcl_floor (reader ->GetMinX ()) + 1) / resolution )) + 1; size[1] = static_cast<long int>(ceil( (vcl_ceil(reader ->GetMaxY ()) - vcl_floor (reader ->GetMinY ()) + 1) / resolution )) + 1; ImageType ::PointType origin; origin[0] = reader ->GetMinX (); origin[1] = reader ->GetMaxY (); ImageType :: SpacingType spacing; spacing [0] = resolution ; spacing [1] = -resolution ; All these parameters are passed to the interpolation filter: typedef itk:: BSplineScatteredDataPointSetToImageFilter <PointSetType , VectorImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); filter ->SetSplineOrder(splineOrder ); FilterType ::ArrayType ncps; ncps.Fill(6); filter ->SetNumberOfControlPoints(ncps); filter ->SetNumberOfLevels(level);
  • 164. 130 Chapter 7. Reading and Writing Auxilary Data filter ->SetOrigin (origin); filter ->SetSpacing (spacing ); filter ->SetSize(size); filter ->SetInput(reader ->GetOutput ()); filter ->Update(); The result of this filter is an image in which every pixel is a vector (with only one element). For now, the otb writer does not know how to process that (hopefully soon!). So we have to manually copy the element in a standard image: typedef otb::Image <RealType , 2> RealImageType; RealImageType::Pointer image = RealImageType::New (); ImageType ::RegionType region; region.SetSize(size); region.SetIndex(start); image ->SetRegions (region); image ->Allocate (); itk::ImageRegionIteratorWithIndex<RealImageType > Itt(image , image ->GetLargestPossibleRegion()); for (Itt.GoToBegin (); !Itt.IsAtEnd (); ++Itt) { Itt.Set(filter ->GetOutput ()->GetPixel (Itt.GetIndex ())[0]); } Everything is ready so we can just write the image: typedef otb::ImageFileWriter <ImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(image); writer ->SetFileName (argv[2]); writer ->Update(); Figure 7.2 shows the output images with two sets of parameters 7.4 Reading and Writing Shapefiles and KML The source code for this example can be found in the file Examples/IO/VectorDataIOExample.cxx. Unfortunately, many vector data formats do not share the models for the data they represent. How- ever, in some cases, when simple data is stored, it can be decomposed in simple objects as for instance polylines, polygons and points. This is the case for the Shapefile and the KML (Keyhole Markup Language) formats, for instance.
  • 165. 7.4. Reading and Writing Shapefiles and KML 131 Figure 7.2: Image obtained with 4 level spline interpolation (left) and 8 levels (right) Even though specific reader/writer for Shapefile and the Google KML are available in OTB, we designed a generic approach for the IO of this kind of data. The reader/writer for VectorData in OTB is able to access a variety of vector file formats (all OGR supported formats) In section 11.4, you will find more information on how projections work for the vector data and how you can export the results obtained with OTB to the real world. This example illustrates the use of OTB’s vector data IO framework. We will start by including the header files for the classes describing the vector data and the corre- sponding reader and writer. #include "otbVectorData.h" #include "otbVectorDataFileReader.h" #include "otbVectorDataFileWriter.h" We will also need to include the header files for the classes which model the individual objects that we get from the vector data structure. #include "itkPreOrderTreeIterator.h" #include "otbObjectList.h" #include "otbPolygon .h" We define the types for the vector data structure and the corresponding file reader. typedef otb::VectorData <PixelType , 2> VectorDataType; typedef otb::VectorDataFileReader <VectorDataType > VectorDataFileReaderType;
  • 166. 132 Chapter 7. Reading and Writing Auxilary Data We can now instantiate the reader and read the data. VectorDataFileReaderType::Pointer reader = VectorDataFileReaderType::New(); reader ->SetFileName (argv[1]); reader ->Update(); The vector data obtained from the reader will provide a tree of nodes containing the actual objects of the scene. This tree will be accessed using an itk::PreOrderTreeIterator. typedef VectorDataType:: DataTreeType DataTreeType; typedef itk::PreOrderTreeIterator <DataTreeType > TreeIteratorType; In this example we will only read polygon objects from the input file before writing them to the output file. We define the type for the polygon object as well as an iterator to the vertices. The polygons obtained will be stored in an otb::ObjectList. typedef otb::Polygon <double> PolygonType ; typedef PolygonType ::VertexListConstIteratorType PolygonIteratorType; typedef otb::ObjectList <PolygonType > PolygonListType; typedef PolygonListType::Iterator PolygonListIteratorType; PolygonListType::Pointer polygonList = PolygonListType::New(); We get the data tree and instantiate an iterator to walk through it. TreeIteratorType it(reader ->GetOutput ()->GetDataTree ()); it.GoToBegin (); We check that the current object is a polygon using the IsPolygonFeature() method and get its exterior ring in order to store it into the list. while (!it.IsAtEnd ()) { if (it.Get()->IsPolygonFeature()) { polygonList ->PushBack (it.Get()->GetPolygonExteriorRing()); } ++it; } Before writing the polygons to the output file, we have to build the vector data structure. This structure will be built up of nodes. We define the types needed for that. VectorDataType::Pointer outVectorData = VectorDataType::New(); typedef VectorDataType:: DataNodeType DataNodeType; We fill the data structure with the nodes. The root node is a document which is composed of folders. A list of polygons can be seen as a multi polygon object.
  • 167. 7.5. Handling large vector data through OGR 133 DataNodeType::Pointer document = DataNodeType::New (); document ->SetNodeType (otb::DOCUMENT ); document ->SetNodeId ("polygon"); DataNodeType::Pointer folder = DataNodeType::New(); folder ->SetNodeType (otb::FOLDER); DataNodeType::Pointer multiPolygon = DataNodeType::New(); multiPolygon ->SetNodeType (otb:: FEATURE_MULTIPOLYGON); We assign these objects to the data tree stored by the vector data object. DataTreeType::Pointer tree = outVectorData ->GetDataTree (); DataNodeType::Pointer root = tree ->GetRoot()->Get (); tree ->Add(document , root); tree ->Add(folder , document ); tree ->Add(multiPolygon , folder); We can now iterate through the polygon list and fill the vector data structure. for (PolygonListType::Iterator it = polygonList ->Begin(); it != polygonList ->End(); ++it) { DataNodeType::Pointer newPolygon = DataNodeType::New(); newPolygon ->SetPolygonExteriorRing(it.Get ()); tree ->Add(newPolygon , multiPolygon); } And finally we write the vector data to a file using a generic otb::VectorDataFileWriter. typedef otb::VectorDataFileWriter <VectorDataType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(outVectorData); writer ->SetFileName (argv[2]); writer ->Update(); This example can convert an ESRI Shapefile to a MapInfo File but you can also access with the same OTB source code to a PostgreSQL datasource, using a connection string as : PG:”dbname=’databasename’ host=’addr’ port=’5432’ user=’x’ password=’y’” Starting with GDAL 1.6.0, the set of tables to be scanned can be overridden by specifying tables=schema.table. 7.5 Handling large vector data through OGR The source code for this example can be found in the file Examples/IO/OGRWrappersExample.cxx. Starting with the version 3.14.0 of the OTB library, a wrapper around OGR API is provided. The purposes of the wrapper are:
  • 168. 134 Chapter 7. Reading and Writing Auxiliary Data • to permit OTB to handle very large vector data sets; • and to offer a modern (in the RAII sense) interface to handle vector data. As OGR already provides a rich set of geometry-related data structures, as well as the algorithms to manipulate and serialize them, we decided to wrap it into a new exception-safe interface. This example illustrates the use of OTB's OGR wrapper framework. This program takes a source of polygons (a shapefile for instance) as input and produces a datasource of multi-polygons as output. We will start by including the header files for the OGR wrapper classes, plus other header files that are out of scope here. #include "otbOGRDataSourceWrapper.h" The following declarations allow us to merge the otb::ogr::Fields from each otb::ogr::Feature into list-fields. We'll get back to this point later. #include <string> #include <vector> #include <boost/variant.hpp> #include "otbJoinContainer.h" typedef std::vector<int> IntList_t; typedef std::vector<std::string> StringList_t; typedef std::vector<double> RealList_t; // TODO: handle non recognized fields typedef boost::variant<IntList_t, StringList_t, RealList_t> AnyListField_t; typedef std::vector<AnyListField_t> AnyListFieldList_t; AnyListFieldList_t prepareNewFields( OGRFeatureDefn /*const*/& defn, otb::ogr::Layer & destLayer); void printField( otb::ogr::Field const& field, AnyListField_t const& newListField); void assignField( otb::ogr::Field field, AnyListField_t const& newListFieldValue); void pushFieldsToFieldLists( otb::ogr::Feature const& inputFeature, AnyListFieldList_t & fields); We can now instantiate the input otb::ogr::DataSource. otb::ogr::DataSource::Pointer source = otb::ogr::DataSource::New( argv[1], otb::ogr::DataSource::Modes::Read); Then we can instantiate the output otb::ogr::DataSource and its unique multi-polygon otb::ogr::Layer. otb::ogr::DataSource::Pointer destination = otb::ogr::DataSource::New( argv[2], otb::ogr::DataSource::Modes::Update_LayerCreateOnly); otb::ogr::Layer destLayer = destination->CreateLayer( argv[2], 0, wkbMultiPolygon);
  • 169. 7.5. Handling large vector data through OGR 135 The data obtained from the reader mimics the interface of OGRDataSource. To access the geometric objects stored, we need first to iterate on the otb::ogr::Layers from the otb::ogr::DataSource, then on the otb::ogr::Feature from each layer. // for (auto const& inputLayer : *source) for (otb::ogr:: DataSource ::const_iterator lb=source ->begin(), le=source ->end() ; lb != le ; ++lb) { otb::ogr::Layer const& inputLayer = *lb; In this example we will only read polygon objects from the input file before writing them to the output file. As all features from a layer share the same geometric type, we can filter on the layer geometric type. if (inputLayer .GetGeomType () != wkbPolygon ) { std::cout << "Warning: Ignoring layer: "; inputLayer .PrintSelf (std::cout , 2); continue; // skip to next layer } In order to prepare the fields for the new layer, we first need to extract the fields definition from the input layer in order to deduce the new fields of the result layer. OGRFeatureDefn & sourceFeatureDefn = inputLayer .GetLayerDefn(); AnyListFieldList_t fields = prepareNewFields(sourceFeatureDefn , destLayer ); The result layer will contain only one feature, per input layer, that stores a multi-polygon shape. All geometric shapes are plain OGRGeometry objects. OGRMultiPolygon destGeometry; // todo: use UniqueGeometryPtr The transformation algorithm is as simple as aggregating all the polygons from the features from the input layer into the destination multi-polygon geometric object. Note that otb::ogr::Feature::GetGeometry() provides a direct access to a non-mutable OGRGeometry pointer and that OGRGeometryCollection::addGeometry() copies the received pointer. As a consequence, the following code is optimal regarding the geometric objects manipu- lated. This is also at this point that we fetch the field values from the input features to accumulate them into the fields list. // for (auto const& inputFeature : inputLayer) for (otb::ogr::Layer::const_iterator fb=inputLayer .begin(), fe=inputLayer .end() ; fb != fe ; ++fb) { otb::ogr::Feature const& inputFeature = *fb;
  • 170. 136 Chapter 7. Reading and Writing Auxilary Data destGeometry.addGeometry (inputFeature.GetGeometry ()); pushFieldsToFieldLists(inputFeature ,fields); } // for each feature Then the new geometric object can be added to a new feature, that will be eventually added to the destination layer. otb::ogr::Feature newFeature (destLayer .GetLayerDefn()); newFeature .SetGeometry (&destGeometry); // SetGeom -> copies We set here the fields of the new feature with the ones accumulated over the features from the input layer. for (size_t i=0, N=sourceFeatureDefn.GetFieldCount(); i!=N; ++i) { printField (newFeature [i], fields[i]); assignField (newFeature [i], fields[i]); } Finally we add (with otb::ogr::Layer::CreateFeature() the new feature to the destination layer, and we can process the next layer from the input datasource. destLayer .CreateFeature(newFeature ); // adds feature to the layer } // for each layer In order to simplify the manipulation of otb::ogr::Fields and to avoid copy-paste for each pos- sible case of field-type, this example relies on boost.Variant. As such, we have defined AnyListField t as a variant type on all possible types of field. Then, the manipulation of the variant field values is done through the templatized functions otb::ogr::Field::SetValue<>() and otb::ogr::Field::GetValue<>(), from the various variant-visitors. Before using the visitors, we need to operate a switch on the exact type of each field from the input layers. An empty field-values container is first added to the set of fields containers. Finally, the destination layer is completed with a new field of the right deduced type. AnyListFieldList_t prepareNewFields( OGRFeatureDefn /*const*/& defn , otb::ogr::Layer & destLayer ) { AnyListFieldList_t fields; for (size_t i=0, N=defn.GetFieldCount(); i!=N; ++i) { const char* name = defn.GetFieldDefn(i)->GetNameRef (); OGRFieldType type = static_cast<OGRFieldType >(-1); switch (defn.GetFieldDefn(i)->GetType ()) { case OFTInteger : fields.push_back (IntList_t ()); type = OFTIntegerList; break;
  • 171. 7.5. Handling large vector data through OGR 137 case OFTString : fields.push_back (StringList_t()); type = OFTStringList; break; case OFTReal: fields.push_back (RealList_t ()); type = OFTRealList ; break; default: std::cerr << "Unsupported field type: " << OGRFieldDefn::GetFieldTypeName(defn.GetFieldDefn(i)->GetType ()) << " for " << name << "n"; break; } OGRFieldDefn newFieldDefn(name , type); // name is duplicated here => no dangling pointer destLayer .CreateField (newFieldDefn , false); } return fields; } The first visitor, PushVisitor(), takes the value from one field and pushes it into a container of list- variant. The type of the field to fetch is deduced from the type of the values stored in the container. It is called by pushFieldsToFieldLists(), that for each field of the input feature applies the visitor on the container. struct PushVisitor : boost::static_visitor <> { PushVisitor (otb::ogr::Field const& f) : m_f(f) {} template <typename T> void operator()(T & container ) const { typedef typename T::value_type value_type ; value_type const value = m_f.GetValue <value_type >(); container .push_back (value); } private: otb::ogr ::Field const& m_f; }; void pushFieldsToFieldLists( otb::ogr ::Feature const& inputFeature , AnyListFieldList_t & fields) { // For each field for (size_t i=0, N=inputFeature.GetSize (); i!=N; ++i) { otb::ogr::Field field = inputFeature[i]; boost::apply_visitor(PushVisitor (field), fields[i]); } } A second simple visitor, PrintVisitor, is defined to trace the values of each field (which contains a list of typed data (integers, strings or reals).
  • 172. 138 Chapter 7. Reading and Writing Auxilary Data struct PrintVisitor : boost::static_visitor <> { template <typename T> void operator()(T const& container ) const { otb::Join(std::cout , container , ", "); } }; void printField (otb::ogr::Field const& field , AnyListField_t const& newListField) { std::cout << field.GetName () << " -> "; boost::apply_visitor(PrintVisitor(), newListField); } The third visitor, SetFieldVisitor, sets the field of the destination features, which have been accumulated in the list of typed values. struct SetFieldVisitor : boost::static_visitor <> { SetFieldVisitor(otb::ogr::Field f) : m_f(f) {} // operator() from visitors are expected to be const template <typename T> void operator()(T const& container ) const { m_f.SetValue (container ); } private: otb::ogr::Field mutable m_f; // this is a proxy -> a reference in a sort }; void assignField (otb::ogr::Field field , AnyListField_t const& newListFieldValue) { boost::apply_visitor(SetFieldVisitor(field), newListFieldValue); } Note that this example does not handle the case when the input layers don’t share a same fields- definition.
  • 173. CHAPTER EIGHT BASIC FILTERING This chapter introduces the most commonly used filters found in OTB. Most of these filters are intended to process images. They will accept one or more images as input and will produce one or more images as output. OTB is based on ITK's data pipeline architecture, in which the output of one filter is passed as input to another filter. (See Section 3.5 on page 34 for more information.) 8.1 Thresholding The thresholding operation is used to change or identify pixel values based on one or more specified values (called threshold values). The following sections describe how to perform thresholding operations using OTB. 8.1.1 Binary Thresholding The source code for this example can be found in the file Examples/Filtering/BinaryThresholdImageFilter.cxx.
  • 174. 140 Chapter 8. Basic Filtering Figure 8.1: Transfer function of the BinaryThresholdImageFilter (output intensity vs. input intensity: the Inside Value is produced between the Lower Threshold and the Upper Threshold, the Outside Value elsewhere). This example illustrates the use of the binary threshold image filter. This filter is used to transform an image into a binary image by changing the pixel values according to the rule illustrated in Figure 8.1. The user defines two thresholds (Upper and Lower) and two intensity values (Inside and Outside). For each pixel in the input image, the value of the pixel is compared with the lower and upper thresholds. If the pixel value is inside the range defined by [Lower, Upper] the output pixel is assigned the InsideValue. Otherwise the output pixels are assigned to the OutsideValue. Thresholding is commonly applied as the last operation of a segmentation pipeline. The first step required to use the itk::BinaryThresholdImageFilter is to include its header file. #include "itkBinaryThresholdImageFilter.h" The next step is to decide which pixel types to use for the input and output images. typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; The input and output image types are now defined using their respective pixel types and dimensions. typedef otb::Image<InputPixelType, 2> InputImageType; typedef otb::Image<OutputPixelType, 2> OutputImageType; The filter type can be instantiated using the input and output image types defined above. typedef itk::BinaryThresholdImageFilter<InputImageType, OutputImageType> FilterType; An otb::ImageFileReader class is also instantiated in order to read image data from a file. (See Section 6 on page 99 for more information about reading and writing data.) typedef otb::ImageFileReader<InputImageType> ReaderType; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter<InputImageType> WriterType; Both the filter and the reader are created by invoking their New() methods and assigning the result to itk::SmartPointers.
  • 175. 8.1. Thresholding 141 ReaderType ::Pointer reader = ReaderType ::New (); FilterType ::Pointer filter = FilterType ::New (); The image obtained with the reader is passed as input to the BinaryThresholdImageFilter. filter ->SetInput(reader ->GetOutput ()); The method SetOutsideValue() defines the intensity value to be assigned to those pixels whose intensities are outside the range defined by the lower and upper thresholds. The method SetInsideValue() defines the intensity value to be assigned to pixels with intensities falling in- side the threshold range. filter ->SetOutsideValue(outsideValue); filter ->SetInsideValue(insideValue ); The methods SetLowerThreshold() and SetUpperThreshold() define the range of the input image intensities that will be transformed into the InsideValue. Note that the lower and upper thresholds are values of the type of the input image pixels, while the inside and outside values are of the type of the output image pixels. filter ->SetLowerThreshold(lowerThreshold); filter ->SetUpperThreshold(upperThreshold); The execution of the filter is triggered by invoking the Update() method. If the filter’s output has been passed as input to subsequent filters, the Update() call on any posterior filters in the pipeline will indirectly trigger the update of this filter. filter ->Update(); Figure 8.2 illustrates the effect of this filter on a ROI of a Spot 5 image of an agricultural area. This figure shows the limitations of this filter for performing segmentation by itself. These limitations are particularly noticeable in noisy images and in images lacking spatial uniformity. The following classes provide similar functionality: • itk::ThresholdImageFilter 8.1.2 General Thresholding The source code for this example can be found in the file Examples/Filtering/ThresholdImageFilter.cxx. This example illustrates the use of the itk::ThresholdImageFilter. This filter can be used to transform the intensity levels of an image in three different ways.
  • 176. 142 Chapter 8. Basic Filtering Figure 8.2: Effect of the BinaryThresholdImageFilter on a ROI of a Spot 5 image. Figure 8.3: ThresholdImageFilter using the threshold-below mode (output intensity vs. input intensity: values below the threshold are mapped to the Outside Value, the remaining intensities are unchanged).
  • 177. 8.1. Thresholding 143 Figure 8.4: ThresholdImageFilter using the threshold-above mode (values above the threshold are mapped to the Outside Value, the remaining intensities are unchanged). Figure 8.5: ThresholdImageFilter using the threshold-outside mode (values outside [Lower Threshold, Upper Threshold] are mapped to the Outside Value, the remaining intensities are unchanged).
  • 178. 144 Chapter 8. Basic Filtering • First, the user can define a single threshold. Any pixels with values below this threshold will be replaced by a user defined value, called here the OutsideValue. Pixels with values above the threshold remain unchanged. This type of thresholding is illustrated in Figure 8.3. • Second, the user can define a particular threshold such that all the pixels with values above the threshold will be replaced by the OutsideValue. Pixels with values below the threshold remain unchanged. This is illustrated in Figure 8.4. • Third, the user can provide two thresholds. All the pixels with intensity values inside the range defined by the two thresholds will remain unchanged. Pixels with values outside this range will be assigned to the OutsideValue. This is illustrated in Figure 8.5. The following methods choose among the three operating modes of the filter. • ThresholdBelow() • ThresholdAbove() • ThresholdOutside() The first step required to use this filter is to include its header file. #include "itkThresholdImageFilter.h" Then we must decide what pixel type to use for the image. This filter is templated over a single image type because the algorithm only modifies pixel values outside the specified range, passing the rest through unchanged. typedef unsigned char PixelType ; The image is defined using the pixel type and the dimension. typedef otb::Image <PixelType , 2> ImageType ; The filter can be instantiated using the image type defined above. typedef itk::ThresholdImageFilter <ImageType > FilterType ; An otb::ImageFileReader class is also instantiated in order to read image data from a file. typedef otb::ImageFileReader <ImageType > ReaderType ; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter <ImageType > WriterType ; Both the filter and the reader are created by invoking their New() methods and assigning the result to SmartPointers.
  • 179. 8.1. Thresholding 145 ReaderType ::Pointer reader = ReaderType ::New (); FilterType ::Pointer filter = FilterType ::New (); The image obtained with the reader is passed as input to the itk::ThresholdImageFilter. filter ->SetInput(reader ->GetOutput ()); The method SetOutsideValue() defines the intensity value to be assigned to those pixels whose intensities are outside the range defined by the lower and upper thresholds. filter ->SetOutsideValue(0); The method ThresholdBelow() defines the intensity value below which pixels of the input image will be changed to the OutsideValue. filter ->ThresholdBelow(40); The filter is executed by invoking the Update() method. If the filter is part of a larger image processing pipeline, calling Update() on a downstream filter will also trigger update of this filter. filter ->Update(); The output of this example is shown in Figure 8.3. The second operating mode of the filter is now enabled by calling the method ThresholdAbove(). filter ->ThresholdAbove(100); filter ->SetOutsideValue(255); filter ->Update(); Updating the filter with this new setting produces the output shown in Figure 8.4. The third operating mode of the filter is enabled by calling ThresholdOutside(). filter ->ThresholdOutside(40, 100); filter ->SetOutsideValue(0); filter ->Update(); The output of this third, “band-pass” thresholding mode is shown in Figure 8.5. The examples in this section also illustrate the limitations of the thresholding filter for performing segmentation by itself. These limitations are particularly noticeable in noisy images and in images lacking spatial uniformity. The following classes provide similar functionality: • itk::BinaryThresholdImageFilter
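As a wrap-up, here is a minimal sketch assembling the pieces above into a complete program. The command-line handling, the hard-coded threshold of 40 and the writer wiring are assumptions made for this illustration; the original ThresholdImageFilter.cxx example may organize them differently.

#include "itkThresholdImageFilter.h"
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"

int main(int argc, char* argv[])
{
  if (argc < 3) return 1;

  typedef otb::Image<unsigned char, 2>         ImageType;
  typedef itk::ThresholdImageFilter<ImageType> FilterType;

  otb::ImageFileReader<ImageType>::Pointer reader = otb::ImageFileReader<ImageType>::New();
  otb::ImageFileWriter<ImageType>::Pointer writer = otb::ImageFileWriter<ImageType>::New();
  FilterType::Pointer filter = FilterType::New();

  reader->SetFileName(argv[1]);
  filter->SetInput(reader->GetOutput());
  filter->SetOutsideValue(0);
  filter->ThresholdBelow(40);          // threshold-below mode: values below 40 become 0
  writer->SetInput(filter->GetOutput());
  writer->SetFileName(argv[2]);
  writer->Update();                    // triggers the whole pipeline
  return 0;
}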
  • 180. 146 Chapter 8. Basic Filtering 8.1.3 Threshold to Point Set The source code for this example can be found in the file Examples/FeatureExtraction/ThresholdToPointSetExample.cxx. Sometimes, it may be more valuable not to get an image from the threshold step but rather a list of coordinates. This can be done with the otb::ThresholdImageToPointSetFilter. The following example illustrates the use of the otb::ThresholdImageToPointSetFilter which provide a list of points within given thresholds. Points set are described in section 5.2 on page 80. The first step required to use this filter is to include the header #include "otbThresholdImageToPointSetFilter.h" #include "itkPointSet .h" The next step is to decide which pixel types to use for the input image and the Point Set as well as their dimension. typedef unsigned char PixelType ; const unsigned int Dimension = 2; typedef otb::Image <PixelType , Dimension > ImageType ; typedef itk::PointSet <PixelType , Dimension > PointSetType; A reader is instanciated to read the input image typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); const char * filenamereader = argv[1]; reader ->SetFileName (filenamereader); We get the parameters from the command line for the threshold filter. The lower and upper thresholds parameters are similar to those of the itk::BinaryThresholdImageFilter (see Section 8.1.1 on page 139 for more information). int lowerThreshold = atoi(argv[2]); int upperThreshold = atoi(argv[3]); Then we create the ThresholdImageToPointSetFilter and we pass the parameters. typedef otb:: ThresholdImageToPointSetFilter <ImageType , PointSetType > FilterThresholdType; FilterThresholdType::Pointer filterThreshold = FilterThresholdType::New (); filterThreshold ->SetLowerThreshold(lowerThreshold); filterThreshold ->SetUpperThreshold(upperThreshold); filterThreshold ->SetInput (0, reader ->GetOutput ()); To manipulate and display the result of this filter, we manually instanciate a point set and we call the Update() method on the threshold filter to trigger the pipeline execution.
  • 181. 8.2. Mathematical operations on images 147 After this step, the pointSet variable contains the point set. PointSetType::Pointer pointSet = PointSetType::New(); pointSet = filterThreshold->GetOutput(); filterThreshold->Update(); To display each point, we create an iterator on the list of points, which is accessible through the method GetPoints() of the PointSet. typedef PointSetType::PointsContainer ContainerType; ContainerType* pointsContainer = pointSet->GetPoints(); typedef ContainerType::Iterator IteratorType; IteratorType itList = pointsContainer->Begin(); A while loop enables us to walk through the list and display the coordinates of each point. while (itList != pointsContainer->End()) { std::cout << itList.Value() << std::endl; ++itList; } 8.2 Mathematical operations on images OTB and ITK provide many filters for performing basic operations on image layers (thresholding, ratios, layer combinations...). They make it possible to build a processing chain in which an operation is defined at each step and combined with the others in the data pipeline. The library also offers the possibility to perform more general, complex mathematical operations on images in a single filter: the otb::BandMathImageFilter. The source code for this example can be found in the file Examples/BasicFilters/BandMathFilterExample.cxx. This filter is based on the mathematical parser library muParser. The list of built-in functions and operators is available at http://muparser.sourceforge.net/mup_features.html. In order to use this filter, at least one input image must be set. An associated variable name may optionally be specified through the corresponding SetNthInput method. For the nth input image, if no associated variable name has been specified, a default variable name is given by concatenating the letter "b" (for band) and the corresponding input index. The next step is to set the expression according to the variable names. For example, in the default case with three input images the following expression is valid: "(b1+b2)*b3". We start by including the needed header file. The aim of this example is to compute the Normalized Difference Vegetation Index (NDVI) from a multispectral image and threshold this index to extract areas containing a dense vegetation canopy.
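Before walking through the example, here is a small sketch of the input-naming mechanism just described. It assumes the filter and imageList objects that the example creates below; the variable names "red" and "nir", and the name-carrying SetNthInput overload, are used here purely for illustration.

// Hypothetical naming of two bands (indices 2 and 3 of the band list) so that
// the expression can refer to them by name instead of by b3 and b4.
filter->SetNthInput(0, imageList->GetOutput()->GetNthElement(2), "red");
filter->SetNthInput(1, imageList->GetOutput()->GetNthElement(3), "nir");
filter->SetExpression("if((nir - red) / (nir + red) > 0.4, 255, 0)");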
  • 182. 148 Chapter 8. Basic Filtering #include "otbBandMathImageFilter.h" We start by the classical typedefs needed for reading and writing the images. The otb::BandMathImageFilter class works with otb::Image as input so we need to define ad- ditional filters to extract each layer of the multispectral image typedef double PixelType ; typedef otb::VectorImage <PixelType , 2> InputImageType; typedef otb::Image <PixelType , 2> OutputImageType; typedef otb::ImageList <OutputImageType > ImageListType; typedef OutputImageType::PixelType VPixelType ; typedef otb::VectorImageToImageListFilter<InputImageType , ImageListType > VectorImageToImageListType; typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; We can now define the type for the filter: typedef otb::BandMathImageFilter <OutputImageType > FilterType ; We instantiate the filter, the reader, and the writer: ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer = WriterType ::New (); FilterType ::Pointer filter = FilterType ::New (); writer ->SetInput(filter ->GetOutput ()); reader ->SetFileName (argv[1]); writer ->SetFileName (argv[2]); We need now to extract now each band from the input otb::VectorImage, it illustrates the use of the otb::VectorImageToImageList. Each extracted layer are inputs of the otb::BandMathImageFilter: VectorImageToImageListType::Pointer imageList = VectorImageToImageListType::New(); imageList ->SetInput (reader ->GetOutput ()); imageList ->UpdateOutputInformation(); const unsigned int nbBands = reader ->GetOutput ()->GetNumberOfComponentsPerPixel(); for(unsigned int j = 0; j < nbBands; ++j) {
  • 183. 8.3. Gradients 149 Figure 8.6: From left to right: original image, thresholded NDVI index. filter->SetNthInput(j, imageList->GetOutput()->GetNthElement(j)); } Now we can define the mathematical expression to apply to the layers (b1, b2, b3, b4). The filter takes advantage of the parsing capabilities of the muParser library and allows the expression to be set just as on a digital calculator. The expression below returns 255 if the ratio (NIR − RED)/(NIR + RED) is greater than 0.4 and 0 otherwise. filter->SetExpression("if((b4-b3)/(b4+b3) > 0.4, 255, 0)"); We can now plug the pipeline together and run it. writer->Update(); The muParser library also offers the possibility to extend the set of built-in functions. For example, you can use the OTB expression "ndvi(b3, b4)" with the filter. The mathematical expression would in this case be if(ndvi(b3,b4) > 0.4, 255, 0). It returns the same result. Figure 8.6 shows the result of thresholding the NDVI index of a Quickbird image. 8.3 Gradients Computation of gradients is a fairly common operation in image processing. The term "gradient" may refer in some contexts to the gradient vectors and in others to the magnitude of the gradient
  • 184. 150 Chapter 8. Basic Filtering vectors. ITK filters attempt to reduce this ambiguity by including the magnitude term when appropriate. ITK provides filters for computing both the image of gradient vectors and the image of magnitudes. 8.3.1 Gradient Magnitude The source code for this example can be found in the file Examples/Filtering/GradientMagnitudeImageFilter.cxx. The magnitude of the image gradient is extensively used in image analysis, mainly to help in the determination of object contours and the separation of homogeneous regions. The itk::GradientMagnitudeImageFilter computes the magnitude of the image gradient at each pixel location using a simple finite differences approach. For example, in the 2D case the computation is equivalent to convolving the image with the two masks (-1 0 1) (a row mask acting along x) and (1 0 -1) transposed (a column mask acting along y), then summing the squares of the two responses and computing the square root of that sum. This filter will work on images of any dimension thanks to the internal use of itk::NeighborhoodIterator and itk::NeighborhoodOperator. The first step required to use this filter is to include its header file. #include "itkGradientMagnitudeImageFilter.h" Types should be chosen for the pixels of the input and output images. typedef float InputPixelType; typedef float OutputPixelType; The input and output image types can be defined using the pixel types. typedef otb::Image<InputPixelType, 2> InputImageType; typedef otb::Image<OutputPixelType, 2> OutputImageType; The type of the gradient magnitude filter is defined by the input image and the output image types. typedef itk::GradientMagnitudeImageFilter<InputImageType, OutputImageType> FilterType; A filter object is created by invoking the New() method and assigning the result to an itk::SmartPointer.
  • 185. 8.3. Gradients 151 Figure 8.7: Effect of the GradientMagnitudeImageFilter. FilterType ::Pointer filter = FilterType ::New (); The input image can be obtained from the output of another filter. Here, the source is an image reader. filter ->SetInput(reader ->GetOutput ()); Finally, the filter is executed by invoking the Update() method. filter ->Update(); If the output of this filter has been connected to other filters in a pipeline, updating any of the downstream filters will also trigger an update of this filter. For example, the gradient magnitude filter may be connected to an image writer. rescaler ->SetInput(filter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); writer ->Update(); Figure 8.7 illustrates the effect of the gradient magnitude. The figure shows the sensitivity of this filter to noisy data. Attention should be paid to the image type chosen to represent the output image since the dynamic range of the gradient magnitude image is usually smaller than the dynamic range of the input image. As always, there are exceptions to this rule, for example, images of man-made objects that contain high contrast objects.
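The rescaler object used in the pipeline above is not declared in this excerpt; a plausible definition, consistent with the float-to-unsigned-char conversions used elsewhere in this guide, is the following sketch (the 0-255 output range is an assumption made for PNG export).

#include "otbImage.h"
#include "itkRescaleIntensityImageFilter.h"

typedef otb::Image<float, 2>         GradientImageType;  // output of the gradient magnitude filter
typedef otb::Image<unsigned char, 2> WriteImageType;     // image actually written to disk
typedef itk::RescaleIntensityImageFilter<GradientImageType, WriteImageType> RescalerType;

RescalerType::Pointer rescaler = RescalerType::New();
rescaler->SetOutputMinimum(0);    // smallest gradient magnitude maps to 0
rescaler->SetOutputMaximum(255);  // largest gradient magnitude maps to 255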
  • 186. 152 Chapter 8. Basic Filtering This filter does not apply any smoothing to the image before computing the gradients. The results can therefore be very sensitive to noise and may not be best choice for scale space analysis. 8.3.2 Gradient Magnitude With Smoothing The source code for this example can be found in the file Examples/Filtering/GradientMagnitudeRecursiveGaussianImageFilter.cxx. Differentiation is an ill-defined operation over digital data. In practice it is convenient to define a scale in which the differentiation should be performed. This is usually done by preprocessing the data with a smoothing filter. It has been shown that a Gaussian kernel is the most convenient choice for performing such smoothing. By choosing a particular value for the standard deviation (σ) of the Gaussian, an associated scale is selected that ignores high frequency content, commonly considered image noise. The itk::GradientMagnitudeRecursiveGaussianImageFilter computes the magnitude of the image gradient at each pixel location. The computational process is equivalent to first smoothing the image by convolving it with a Gaussian kernel and then applying a differential operator. The user selects the value of σ. Internally this is done by applying an IIR 1 filter that approximates a convolution with the derivative of the Gaussian kernel. Traditional convolution will produce a more accurate result, but the IIR approach is much faster, especially using large σs [37, 38]. GradientMagnitudeRecursiveGaussianImageFilter will work on images of any dimension by taking advantage of the natural separability of the Gaussian kernel and its derivatives. The first step required to use this filter is to include its header file. #include "itkGradientMagnitudeRecursiveGaussianImageFilter.h" Types should be instantiated based on the pixels of the input and output images. typedef float InputPixelType; typedef float OutputPixelType; With them, the input and output image types can be instantiated. typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter type is now instantiated using both the input image and the output image types. typedef itk::GradientMagnitudeRecursiveGaussianImageFilter< InputImageType , OutputImageType > FilterType ; 1Infinite Impulse Response
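In equation form (our notation, not taken from the original text), the filter therefore computes

\[ \left|\nabla (G_\sigma * I)\right| \;=\; \sqrt{\left(\frac{\partial (G_\sigma * I)}{\partial x}\right)^{2} + \left(\frac{\partial (G_\sigma * I)}{\partial y}\right)^{2}}, \]

where I is the input image and G_sigma the Gaussian kernel of standard deviation sigma; each partial derivative is approximated by the recursive IIR filter mentioned above.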
  • 187. 8.3. Gradients 153 Figure 8.8: Effect of the GradientMagnitudeRecursiveGaussianImageFilter. A filter object is created by invoking the New() method and assigning the result to a itk::SmartPointer. FilterType ::Pointer filter = FilterType ::New (); The input image can be obtained from the output of another filter. Here, an image reader is used as source. filter ->SetInput(reader ->GetOutput ()); The standard deviation of the Gaussian smoothing kernel is now set. filter ->SetSigma(sigma); Finally the filter is executed by invoking the Update() method. filter ->Update(); If connected to other filters in a pipeline, this filter will automatically update when any downstream filters are updated. For example, we may connect this gradient magnitude filter to an image file writer and then update the writer. rescaler ->SetInput(filter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); writer ->Update(); Figure 8.8 illustrates the effect of this filter using σ values of 3 (left) and 5 (right). The figure shows how the sensitivity to noise can be regulated by selecting an appropriate σ. This type of scale-tunable filter is suitable for performing scale-space analysis.
  • 188. 154 Chapter 8. Basic Filtering Attention should be paid to the image type chosen to represent the output image since the dynamic range of the gradient magnitude image is usually smaller than the dynamic range of the input image. 8.3.3 Derivative Without Smoothing The source code for this example can be found in the file Examples/Filtering/DerivativeImageFilter.cxx. The itk::DerivativeImageFilter is used for computing the partial derivative of an image, the derivative of an image along a particular axial direction. The header file corresponding to this filter should be included first. #include "itkDerivativeImageFilter.h" Next, the pixel types for the input and output images must be defined and, with them, the image types can be instantiated. Note that it is important to select a signed type for the image, since the values of the derivatives will be positive as well as negative. typedef float InputPixelType; typedef float OutputPixelType; const unsigned int Dimension = 2; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; Using the image types, it is now possible to define the filter type and create the filter object. typedef itk::DerivativeImageFilter < InputImageType , OutputImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); The order of the derivative is selected with the SetOrder() method. The direction along which the derivative will be computed is selected with the SetDirection() method. filter ->SetOrder(atoi(argv[4])); filter ->SetDirection(atoi(argv[5])); The input to the filter can be taken from any other filter, for example a reader. The output can be passed down the pipeline to other filters, for example, a writer. An update call on any downstream filter will trigger the execution of the derivative filter. filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); writer ->Update(); Figure 8.9 illustrates the effect of the DerivativeImageFilter. The derivative is taken along the x direction. The sensitivity to noise in the image is evident from this result.
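As a rough illustration (a sketch in our notation; the exact kernel used by itk::DerivativeImageFilter depends on the requested order), a first-order derivative along x can be approximated by the central difference

\[ \frac{\partial I}{\partial x}(x, y) \;\approx\; \frac{I(x+1, y) - I(x-1, y)}{2}, \]

which makes it clear why a signed output type is required: the result is negative wherever the intensity decreases along x.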
  • 189. 8.4. Second Order Derivatives 155 Figure 8.9: Effect of the Derivative filter. 8.4 Second Order Derivatives 8.4.1 Laplacian Filters Laplacian Filter Recursive Gaussian The source code for this example can be found in the file Examples/Filtering/LaplacianRecursiveGaussianImageFilter1.cxx. This example illustrates how to use the itk::RecursiveGaussianImageFilter for computing the Laplacian of an image. The first step required to use this filter is to include its header file. #include "itkRecursiveGaussianImageFilter.h" Types should be selected on the desired input and output pixel types. typedef float InputPixelType; typedef float OutputPixelType; The input and output image types are instantiated using the pixel types. typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter type is now instantiated using both the input image and the output image types.
  • 190. 156 Chapter 8. Basic Filtering typedef itk::RecursiveGaussianImageFilter<InputImageType, OutputImageType> FilterType; This filter applies the approximation of the convolution along a single dimension. It is therefore necessary to concatenate several of these filters to produce filtering in all directions. In this example, since we are processing a 2D image, we create two pairs of filters: two filters for computing the X component of the Laplacian and two other filters for computing the Y component. The filters are created by invoking the New() method and assigning the result to an itk::SmartPointer. FilterType::Pointer filterX1 = FilterType::New(); FilterType::Pointer filterY1 = FilterType::New(); FilterType::Pointer filterX2 = FilterType::New(); FilterType::Pointer filterY2 = FilterType::New(); Since each one of the newly created filters has the potential to perform filtering along any dimension, we have to restrict each one to a particular direction. This is done with the SetDirection() method. filterX1->SetDirection(0); // 0 --> X direction filterY1->SetDirection(1); // 1 --> Y direction filterX2->SetDirection(0); // 0 --> X direction filterY2->SetDirection(1); // 1 --> Y direction The itk::RecursiveGaussianImageFilter can approximate the convolution with the Gaussian or with its first and second derivatives. We select one of these options by using the SetOrder() method. Note that the argument is an enum whose values can be ZeroOrder, FirstOrder and SecondOrder. For example, to compute the x partial derivative we should select FirstOrder for x and ZeroOrder for y. Here, each branch combines ZeroOrder (plain Gaussian smoothing) along one direction with SecondOrder (second derivative) along the other, which yields the two terms of the Laplacian. filterX1->SetOrder(FilterType::ZeroOrder); filterY1->SetOrder(FilterType::SecondOrder); filterX2->SetOrder(FilterType::SecondOrder); filterY2->SetOrder(FilterType::ZeroOrder); There are two typical ways of normalizing Gaussians depending on their application. For scale-space analysis it is desirable to use a normalization that will preserve the maximum value of the input. This normalization is represented by the following equation: 1/(σ√2π) (8.1) In applications that use the Gaussian as a solution of the diffusion equation it is desirable to use a
  • 191. 8.4. Second Order Derivatives 157 normalization that preserves the integral of the signal. This last approach can be seen as a conservation of mass principle. It is represented by the following equation: 1/(σ²√2π) (8.2) The itk::RecursiveGaussianImageFilter has a boolean flag that allows users to select between these two normalization options. Selection is done with the method SetNormalizeAcrossScale(). Enable this flag when analyzing an image across scale-space. In the current example, this setting has no impact because we are actually renormalizing the output to the dynamic range of the reader, so we simply disable the flag. const bool normalizeAcrossScale = false; filterX1->SetNormalizeAcrossScale(normalizeAcrossScale); filterY1->SetNormalizeAcrossScale(normalizeAcrossScale); filterX2->SetNormalizeAcrossScale(normalizeAcrossScale); filterY2->SetNormalizeAcrossScale(normalizeAcrossScale); The input image can be obtained from the output of another filter. Here, an image reader is used as the source. The image is passed to the x filter and then to the y filter. The reason for keeping these two filters separate is that it is usual in scale-space applications to compute not only the smoothing but also combinations of derivatives at different orders and smoothing. Some factorization is possible when separate filters are used to generate the intermediate results. Here this capability is less relevant, though, since we only need the two second-derivative terms of the Laplacian. filterX1->SetInput(reader->GetOutput()); filterY1->SetInput(filterX1->GetOutput()); filterY2->SetInput(reader->GetOutput()); filterX2->SetInput(filterY2->GetOutput()); It is now time to select the σ of the Gaussian used to smooth the data. Note that σ must be passed to all four filters and that sigma is considered to be in the units of the image spacing. That is, at the moment of applying the smoothing process, the filter will take into account the spacing values defined in the image. filterX1->SetSigma(sigma); filterY1->SetSigma(sigma); filterX2->SetSigma(sigma); filterY2->SetSigma(sigma); Finally the two components of the Laplacian should be added together. The itk::AddImageFilter is used for this purpose. typedef itk::AddImageFilter<OutputImageType, OutputImageType, OutputImageType> AddFilterType;
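The wiring above implements the following decomposition (our notation): the branch filterX1 -> filterY1 produces the second derivative along y of the smoothed image, the branch filterY2 -> filterX2 produces the second derivative along x, and their sum is the Laplacian

\[ \nabla^{2}(G_\sigma * I) \;=\; \frac{\partial^{2}(G_\sigma * I)}{\partial x^{2}} + \frac{\partial^{2}(G_\sigma * I)}{\partial y^{2}}. \]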
  • 192. 158 Chapter 8. Basic Filtering AddFilterType::Pointer addFilter = AddFilterType::New(); addFilter ->SetInput1 (filterY1 ->GetOutput ()); addFilter ->SetInput2 (filterX2 ->GetOutput ()); The filters are triggered by invoking Update() on the Add filter at the end of the pipeline. try { addFilter ->Update(); } catch (itk::ExceptionObject& err) { std::cout << "ExceptionObject caught !" << std::endl; std::cout << err << std::endl; return EXIT_FAILURE; } The resulting image could be saved to a file using the otb::ImageFileWriter class. typedef float WritePixelType; typedef otb::Image <WritePixelType , 2> WriteImageType; typedef otb::ImageFileWriter <WriteImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(addFilter ->GetOutput ()); writer ->SetFileName (argv[2]); writer ->Update(); Figure 8.10 illustrates the effect of this filter using σ values of 3 (left) and 5 (right). The figure shows how the attenuation of noise can be regulated by selecting the appropriate standard deviation. This type of scale-tunable filter is suitable for performing scale-space analysis. The source code for this example can be found in the file Examples/Filtering/LaplacianRecursiveGaussianImageFilter2.cxx. The previous exampled showed how to use the itk::RecursiveGaussianImageFilter for computing the equivalent of a Laplacian of an image after smoothing with a Gaus- sian. The elements used in this previous example have been packaged together in the itk::LaplacianRecursiveGaussianImageFilter in order to simplify its usage. This current example shows how to use this convenience filter for achieving the same results as the previous example. The first step required to use this filter is to include its header file. #include "itkLaplacianRecursiveGaussianImageFilter.h"
  • 193. 8.4. Second Order Derivatives 159 Figure 8.10: Effect of the RecursiveGaussianImageFilter. Types should be selected on the desired input and output pixel types. typedef float InputPixelType; typedef float OutputPixelType; The input and output image types are instantiated using the pixel types. typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter type is now instantiated using both the input image and the output image types. typedef itk::LaplacianRecursiveGaussianImageFilter< InputImageType , OutputImageType > FilterType ; This filter packages all the components illustrated in the previous example. The filter is created by invoking the New() method and assigning the result to a itk::SmartPointer. FilterType ::Pointer laplacian = FilterType ::New(); The option for normalizing across scale space can also be selected in this filter. laplacian ->SetNormalizeAcrossScale(false); The input image can be obtained from the output of another filter. Here, an image reader is used as the source. laplacian ->SetInput (reader ->GetOutput ());
  • 194. 160 Chapter 8. Basic Filtering Figure 8.11: Effect of the LaplacianRecursiveGaussianImageFilter. It is now time to select the σ of the Gaussian used to smooth the data. Note that σ must be passed to both filters and that sigma is considered to be in the units of the image spacing. That is, at the moment of applying the smoothing process, the filter will take into account the spacing values defined in the image. laplacian ->SetSigma (sigma); Finally the pipeline is executed by invoking the Update() method. try { laplacian ->Update(); } catch (itk::ExceptionObject& err) { std::cout << "ExceptionObject caught !" << std::endl; std::cout << err << std::endl; return EXIT_FAILURE; } Figure 8.11 illustrates the effect of this filter using σ values of 3 (left) and 5 (right). The figure shows how the attenuation of noise can be regulated by selecting the appropriate standard deviation. This type of scale-tunable filter is suitable for performing scale-space analysis.
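The example stops after Update(); as in the previous example, the result can be saved with an otb::ImageFileWriter templated over a float image. The following sketch assumes that argv[2] holds the output filename.

#include "otbImage.h"
#include "otbImageFileWriter.h"

typedef otb::Image<float, 2>                 WriteImageType;
typedef otb::ImageFileWriter<WriteImageType> WriterType;

WriterType::Pointer writer = WriterType::New();
writer->SetInput(laplacian->GetOutput());  // laplacian is the filter created above
writer->SetFileName(argv[2]);
writer->Update();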
  • 195. 8.5. Edge Detection 161 Figure 8.12: Effect of the CannyEdgeDetectorImageFilter on a ROI of a Spot 5 image. 8.5 Edge Detection 8.5.1 Canny Edge Detection The source code for this example can be found in the file Examples/Filtering/CannyEdgeDetectionImageFilter.cxx. This example introduces the use of the itk::CannyEdgeDetectionImageFilter. This filter is widely used for edge detection since it is the optimal solution satisfying the constraints of good sensitivity, localization and noise robustness. The first step required for using this filter is to include its header file. #include "itkCannyEdgeDetectionImageFilter.h" As the Canny filter works with real values, we instantiate the reader using an image with double pixels. This does not constrain the coding format of the image on disk, whose values will simply be cast to double when read. typedef otb::ImageFileReader<RealImageType> ReaderType; The itk::CannyEdgeDetectionImageFilter is instantiated using this real-valued image type. Figure 8.12 illustrates the effect of this filter on a ROI of a Spot 5 image of an agricultural area.
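The excerpt above does not show how the Canny filter is parameterized and written out; the following sketch completes the pipeline. The variance and hysteresis threshold values, as well as the rescaling to 8 bits, are illustrative assumptions, not values taken from the original example.

#include "otbImage.h"
#include "otbImageFileReader.h"
#include "itkCannyEdgeDetectionImageFilter.h"
#include "itkRescaleIntensityImageFilter.h"

typedef otb::Image<double, 2>        RealImageType;
typedef otb::Image<unsigned char, 2> OutputImageType;
typedef itk::CannyEdgeDetectionImageFilter<RealImageType, RealImageType> CannyFilterType;
typedef itk::RescaleIntensityImageFilter<RealImageType, OutputImageType> RescalerType;

CannyFilterType::Pointer canny = CannyFilterType::New();
canny->SetInput(reader->GetOutput());  // reader is the double-pixel reader above
canny->SetVariance(2.0);               // amount of Gaussian pre-smoothing
canny->SetLowerThreshold(0.0);         // hysteresis thresholds on the gradient magnitude
canny->SetUpperThreshold(10.0);

RescalerType::Pointer rescaler = RescalerType::New();
rescaler->SetInput(canny->GetOutput());
rescaler->SetOutputMinimum(0);
rescaler->SetOutputMaximum(255);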
  • 196. 162 Chapter 8. Basic Filtering 8.5.2 Ratio of Means Detector The source code for this example can be found in the file Examples/FeatureExtraction/TouziEdgeDetectorExample.cxx. This example illustrates the use of the otb::TouziEdgeDetectorImageFilter. This filter belongs to the family of fixed false alarm rate edge detectors, but it is appropriate for SAR images, where the speckle noise is considered as multiplicative. By analogy with the classical gradient-based edge detectors, which are suited to the additive noise case, this filter computes a ratio of local means on both sides of the edge [132]. In order to have a normalized response, the following computation is performed: r = 1 − min{µA/µB, µB/µA}, (8.3) where µA and µB are the local means computed on both sides of the edge. For instance, with µA = 100 and µB = 150 the response is r = 1 − 100/150 ≈ 0.33. In order to detect edges with any orientation, r is computed for the 4 principal directions and the maximum response is kept. The first step required to use this filter is to include its header file. #include "otbTouziEdgeDetectorImageFilter.h" Then we must decide what pixel type to use for the image. We choose to make all computations with floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef float InternalPixelType; typedef unsigned char OutputPixelType; The images are defined using the pixel type and the dimension. typedef otb::Image<InternalPixelType, 2> InternalImageType; typedef otb::Image<OutputPixelType, 2> OutputImageType; The filter can be instantiated using the image types defined above. typedef otb::TouziEdgeDetectorImageFilter<InternalImageType, InternalImageType> FilterType; An otb::ImageFileReader class is also instantiated in order to read image data from a file. typedef otb::ImageFileReader<InternalImageType> ReaderType; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter<OutputImageType> WriterType; The intensity rescaling of the results will be carried out by the itk::RescaleIntensityImageFilter, which is templated over the input and output image types.
  • 197. 8.5. Edge Detection 163 typedef itk::RescaleIntensityImageFilter<InternalImageType , OutputImageType > RescalerType; Both the filter and the reader are created by invoking their New() methods and assigning the result to SmartPointers. ReaderType ::Pointer reader = ReaderType ::New (); FilterType ::Pointer filter = FilterType ::New (); The same is done for the rescaler and the writer. RescalerType::Pointer rescaler = RescalerType::New (); WriterType ::Pointer writer = WriterType ::New(); The itk::RescaleIntensityImageFilter needs to know which is the minimu and maximum values of the output generated image. Those can be chosen in a generic way by using the NumericTraits functions, since they are templated over the pixel type. rescaler ->SetOutputMinimum(itk::NumericTraits <OutputPixelType >:: min ()); rescaler ->SetOutputMaximum(itk::NumericTraits <OutputPixelType >:: max ()); The image obtained with the reader is passed as input to the otb::TouziEdgeDetectorImageFilter. The pipeline is built as follows. filter ->SetInput(reader ->GetOutput ()); rescaler ->SetInput(filter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); The method SetRadius() defines the size of the window to be used for the computation of the local means. FilterType ::SizeType Radius; Radius[0] = atoi(argv[4]); Radius[1] = atoi(argv[4]); filter ->SetRadius (Radius); The filter is executed by invoking the Update() method. If the filter is part of a larger image processing pipeline, calling Update() on a downstream filter will also trigger update of this filter. filter ->Update(); We can also obtain the direction of the edges by invoking the GetOutputDirection() method. rescaler ->SetInput(filter ->GetOutputDirection()); writer ->SetInput(rescaler ->GetOutput ()); writer ->Update(); Figure 8.13 shows the result of applying the Touzi edge detector filter to a SAR image.
  • 198. 164 Chapter 8. Basic Filtering Figure 8.13: Result of applying the otb::TouziEdgeDetectorImageFilter to a SAR image. From left to right : original image, edge intensity and edge orientation. 8.6 Neighborhood Filters The concept of locality is frequently encountered in image processing in the form of filters that compute every output pixel using information from a small region in the neighborhood of the input pixel. The classical form of these filters are the 3 × 3 filters in 2D images. Convolution masks based on these neighborhoodscan perform diverse tasks ranging from noise reduction, to differential operations, to mathematical morphology. The Insight toolkit implements an elegant approach to neighborhood-based image filtering. The input image is processed using a special iterator called the itk::NeighborhoodIterator. This iterator is capable of moving over all the pixels in an image and, for each position, it can address the pixels in a local neighborhood. Operators are defined that apply an algorithmic operation in the neighborhood of the input pixel to produce a value for the output pixel. The following section describes some of the more commonly used filters that take advantage of this construction. (See Chapter 26 on page 587 for more information about iterators.) 8.6.1 Mean Filter The source code for this example can be found in the file Examples/Filtering/MeanImageFilter.cxx. The itk::MeanImageFilter is commonly used for noise reduction. The filter computes the value of each output pixel by finding the statistical mean of the neighborhood of the corresponding input pixel. The following figure illustrates the local effect of the MeanImageFilter. The statistical mean of the neighborhood on the left is passed as the output value associated with the pixel at the center of the neighborhood.
  • 199. 8.6. Neighborhood Filters 165 25 30 32 27 25 29 28 26 50 - 30.22 - 30 Note that this algorithm is sensitive to the presence of outliers in the neighbor- hood. This filter will work on images of any dimension thanks to the internal use of itk::SmartNeighborhoodIterator and itk::NeighborhoodOperator. The size of the neigh- borhood over which the mean is computed can be set by the user. The header file corresponding to this filter should be included first. #include "itkMeanImageFilter.h" Then the pixel types for input and output image must be defined and, with them, the image types can be instantiated. typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; Using the image types it is now possible to instantiate the filter type and create the filter object. typedef itk::MeanImageFilter < InputImageType , OutputImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); The size of the neighborhood is defined along every dimension by passing a SizeType object with the corresponding values. The value on each dimension is used as the semi-size of a rectangular box. For example, in 2D a size of 1,2 will result in a 3 × 5 neighborhood. InputImageType::SizeType indexRadius ; indexRadius [0] = 1; // radius along x indexRadius [1] = 1; // radius along y filter ->SetRadius (indexRadius ); The input to the filter can be taken from any other filter, for example a reader. The output can be passed down the pipeline to other filters, for example, a writer. An update call on any downstream filter will trigger the execution of the mean filter. filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); writer ->Update();
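The arithmetic behind the 3 × 3 neighborhood illustrated above is simply

\[ \frac{25 + 30 + 32 + 27 + 25 + 29 + 28 + 26 + 50}{9} \;=\; \frac{272}{9} \;\approx\; 30.22, \]

which is stored as 30 in the unsigned char output image; the outlier 50 pulls the mean noticeably upwards.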
  • 200. 166 Chapter 8. Basic Filtering Figure 8.14: Effect of the MeanImageFilter. Figure 8.14 illustrates the effect of this filter using neighborhood radii of 1,1 which corresponds to a 3 × 3 classical neighborhood. It can be seen from this picture that edges are rapidly degraded by the diffusion of intensity values among neighbors. 8.6.2 Median Filter The source code for this example can be found in the file Examples/Filtering/MedianImageFilter.cxx. The itk::MedianImageFilter is commonly used as a robust approach for noise reduction. This filter is particularly efficient against salt-and-peppernoise. In other words, it is robust to the presence of gray-level outliers. MedianImageFilter computes the value of each output pixel as the statistical median of the neighborhood of values around the corresponding input pixel. The following figure illustrates the local effect of this filter. The statistical median of the neighborhood on the left is passed as the output value associated with the pixel at the center of the neighborhood. 25 30 32 27 25 29 28 26 50 - 28 This filter will work on images of any dimension thanks to the internal use of itk::NeighborhoodIterator and itk::NeighborhoodOperator. The size of the neighborhood over which the median is computed can be set by the user. The header file corresponding to this filter should be included first. #include "itkMedianImageFilter.h"
  • 201. 8.6. Neighborhood Filters 167 Then the pixel and image types of the input and output must be defined. typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; Using the image types, it is now possible to define the filter type and create the filter object. typedef itk::MedianImageFilter < InputImageType , OutputImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); The size of the neighborhood is defined along every dimension by passing a SizeType object with the corresponding values. The value on each dimension is used as the semi-size of a rectangular box. For example, in 2D a size of 1,2 will result in a 3 × 5 neighborhood. InputImageType::SizeType indexRadius ; indexRadius [0] = 1; // radius along x indexRadius [1] = 1; // radius along y filter ->SetRadius (indexRadius ); The input to the filter can be taken from any other filter, for example a reader. The output can be passed down the pipeline to other filters, for example, a writer. An update call on any downstream filter will trigger the execution of the median filter. filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); writer ->Update(); Figure 8.15 illustrates the effect of the MedianImageFilter filter a neighborhood radius of 1,1, which corresponds to a 3 × 3 classical neighborhood. The filtered image demonstrates the moderate ten- dency of the median filter to preserve edges. 8.6.3 Mathematical Morphology Mathematical morphology has proved to be a powerful resource for image processing and anal- ysis [126]. ITK implements mathematical morphology filters using NeighborhoodIterators and itk::NeighborhoodOperators. The toolkit contains two types of image morphology algorithms, filters that operate on binary images and filters that operate on grayscale images.
  • 202. 168 Chapter 8. Basic Filtering Figure 8.15: Effect of the MedianImageFilter. Binary Filters The source code for this example can be found in the file Examples/Filtering/MathematicalMorphologyBinaryFilters.cxx. The following section illustrates the use of filters that perform basic mathematical morphology operations on binary images. The itk::BinaryErodeImageFilter and itk::BinaryDilateImageFilter are described here. The filter names clearly specify the type of image on which they operate. The header files required to construct a simple example of the use of the mathematical morphology filters are included below. #include "itkBinaryErodeImageFilter.h" #include "itkBinaryDilateImageFilter.h" #include "itkBinaryBallStructuringElement.h" The following code defines the input and output pixel types and their associated image types. const unsigned int Dimension = 2; typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; Mathematical morphology operations are implemented by applying an operator over the neighbor- hood of each input pixel. The combination of the rule and the neighborhood is known as structuring element. Although some rules have become de facto standards for image processing, there is a good deal of freedom as to what kind of algorithmic rule should be applied to the neighborhood. The implementation in ITK follows the typical rule of minimum for erosion and maximum for dilation. The structuring element is implemented as a NeighborhoodOperator. In particular, the default struc- turing element is the itk::BinaryBallStructuringElement class. This class is instantiated
  • 203. 8.6. Neighborhood Filters 169 using the pixel type and dimension of the input image. typedef itk::BinaryBallStructuringElement< InputPixelType , Dimension > StructuringElementType; The structuring element type is then used along with the input and output image types for instanti- ating the type of the filters. typedef itk::BinaryErodeImageFilter < InputImageType , OutputImageType , StructuringElementType > ErodeFilterType; typedef itk::BinaryDilateImageFilter < InputImageType , OutputImageType , StructuringElementType > DilateFilterType; The filters can now be created by invoking the New() method and assigning the result to itk::SmartPointers. ErodeFilterType::Pointer binaryErode = ErodeFilterType::New(); DilateFilterType::Pointer binaryDilate = DilateFilterType::New (); The structuring element is not a reference counted class. Thus it is created as a C++ stack object instead of using New() and SmartPointers. The radius of the neighborhood associated with the struc- turing element is defined with the SetRadius() method and the CreateStructuringElement() method is invoked in order to initialize the operator. The resulting structuring element is passed to the mathematical morphology filter through the SetKernel() method, as illustrated below. StructuringElementType structuringElement; structuringElement.SetRadius (1); // 3x3 structuring element structuringElement.CreateStructuringElement(); binaryErode ->SetKernel (structuringElement); binaryDilate ->SetKernel (structuringElement); A binary image is provided as input to the filters. This image might be, for example, the output of a binary threshold image filter. thresholder ->SetInput (reader ->GetOutput ()); InputPixelType background = 0; InputPixelType foreground = 255; thresholder ->SetOutsideValue(background ); thresholder ->SetInsideValue(foreground );
  • 204. 170 Chapter 8. Basic Filtering Figure 8.16: Effect of erosion and dilation in a binary image. thresholder ->SetLowerThreshold(lowerThreshold); thresholder ->SetUpperThreshold(upperThreshold); binaryErode ->SetInput (thresholder ->GetOutput ()); binaryDilate ->SetInput(thresholder ->GetOutput ()); The values that correspond to “objects” in the binary image are specified with the methods SetErodeValue() and SetDilateValue(). The value passed to these methods will be consid- ered the value over which the dilation and erosion rules will apply. binaryErode ->SetErodeValue(foreground ); binaryDilate ->SetDilateValue(foreground ); The filter is executed by invoking its Update() method, or by updating any downstream filter, like, for example, an image writer. writerDilation ->SetInput(binaryDilate ->GetOutput ()); writerDilation ->Update (); Figure 8.16 illustrates the effect of the erosion and dilation filters. The figure shows how these operations can be used to remove spurious details from segmented images. Grayscale Filters The source code for this example can be found in the file Examples/Filtering/MathematicalMorphologyGrayscaleFilters.cxx. The following section illustrates the use of filters for performing basic mathematical mor- phology operations on grayscale images. The itk::GrayscaleErodeImageFilter and
  • 205. 8.6. Neighborhood Filters 171 itk::GrayscaleDilateImageFilter are covered in this example. The filter names clearly spec- ify the type of image on which they operate. The header files required for a simple example of the use of grayscale mathematical morphology filters are presented below. #include "itkGrayscaleErodeImageFilter.h" #include "itkGrayscaleDilateImageFilter.h" #include "itkBinaryBallStructuringElement.h" The following code defines the input and output pixel types and their associated image types. const unsigned int Dimension = 2; typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; Mathematical morphology operations are based on the application of an operator over a neighbor- hood of each input pixel. The combination of the rule and the neighborhood is known as structuring element. Although some rules have become the de facto standard in image processing there is a good deal of freedom as to what kind of algorithmic rule should be applied on the neighborhood. The implementation in ITK follows the typical rule of minimum for erosion and maximum for dila- tion. The structuring element is implemented as a itk::NeighborhoodOperator. In particular, the default structuring element is the itk::BinaryBallStructuringElement class. This class is instantiated using the pixel type and dimension of the input image. typedef itk::BinaryBallStructuringElement< InputPixelType , Dimension > StructuringElementType; The structuring element type is then used along with the input and output image types for instanti- ating the type of the filters. typedef itk::GrayscaleErodeImageFilter< InputImageType , OutputImageType , StructuringElementType > ErodeFilterType; typedef itk::GrayscaleDilateImageFilter< InputImageType , OutputImageType , StructuringElementType > DilateFilterType; The filters can now be created by invoking the New() method and assigning the result to SmartPoint- ers. ErodeFilterType::Pointer grayscaleErode = ErodeFilterType::New (); DilateFilterType::Pointer grayscaleDilate = DilateFilterType::New ();
  • 206. 172 Chapter 8. Basic Filtering Figure 8.17: Effect of erosion and dilation in a grayscale image. The structuring element is not a reference counted class. Thus it is created as a C++ stack object instead of using New() and SmartPointers. The radius of the neighborhood associated with the struc- turing element is defined with the SetRadius() method and the CreateStructuringElement() method is invoked in order to initialize the operator. The resulting structuring element is passed to the mathematical morphology filter through the SetKernel() method, as illustrated below. StructuringElementType structuringElement; structuringElement.SetRadius (1); // 3x3 structuring element structuringElement.CreateStructuringElement(); grayscaleErode ->SetKernel (structuringElement); grayscaleDilate ->SetKernel (structuringElement); A grayscale image is provided as input to the filters. This image might be, for example, the output of a reader. grayscaleErode ->SetInput(reader ->GetOutput ()); grayscaleDilate ->SetInput (reader ->GetOutput ()); The filter is executed by invoking its Update() method, or by updating any downstream filter, like, for example, an image writer. writerDilation ->SetInput(grayscaleDilate ->GetOutput ()); writerDilation ->Update (); Figure 8.17 illustrates the effect of the erosion and dilation filters. The figure shows how these operations can be used to remove spurious details from segmented images. 8.7 Smoothing Filters Real image data has a level of uncertainty that is manifested in the variability of measures assigned to pixels. This uncertainty is usually interpreted as noise and considered an undesirable component
of the image data. This section describes several methods that can be applied to reduce noise on images.

8.7.1 Blurring

Blurring is the traditional approach for removing noise from images. It is usually implemented in the form of a convolution with a kernel. The effect of blurring on the image spectrum is to attenuate high spatial frequencies. Different kernels attenuate frequencies in different ways. One of the most commonly used kernels is the Gaussian. Two implementations of Gaussian smoothing are available in the toolkit. The first one is based on a traditional convolution while the other is based on the application of IIR filters that approximate the convolution with a Gaussian [37, 38].

Discrete Gaussian

The source code for this example can be found in the file Examples/Filtering/DiscreteGaussianImageFilter.cxx.

Figure 8.18: Discretized Gaussian.

The itk::DiscreteGaussianImageFilter computes the convolution of the input image with a Gaussian kernel. This is done in ND by taking advantage of the separability of the Gaussian kernel. A one-dimensional Gaussian function is discretized on a convolution kernel. The size of the kernel is extended until there are enough discrete points in the Gaussian to ensure that a user-provided maximum error is not exceeded. Since the size of the kernel is unknown a priori, it is necessary to impose a limit to its growth. The user can thus provide a value to be the maximum admissible size of the kernel. Discretization error is defined as the difference between the area under the discrete Gaussian curve (which has finite support) and the area under the continuous Gaussian. Gaussian kernels in ITK are constructed according to the theory of Tony Lindeberg [89] so that smoothing and derivative operations commute before and after discretization. In other words, finite difference derivatives on an image I that has been smoothed by convolution with the Gaussian are equivalent to finite differences computed on I by convolving with a derivative of the Gaussian. The first step required to use this filter is to include its header file. #include "itkDiscreteGaussianImageFilter.h" Types should be chosen for the pixels of the input and output images. Image types can be instantiated using the pixel type and dimension.
  • 208. 174 Chapter 8. Basic Filtering typedef float InputPixelType; typedef float OutputPixelType; typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The discrete Gaussian filter type is instantiated using the input and output image types. A corre- sponding filter object is created. typedef itk::DiscreteGaussianImageFilter< InputImageType , OutputImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); The input image can be obtained from the output of another filter. Here, an image reader is used as its input. filter ->SetInput(reader ->GetOutput ()); The filter requires the user to provide a value for the variance associated with the Gaussian kernel. The method SetVariance() is used for this purpose. The discrete Gaussian is constructed as a convolution kernel. The maximum kernel size can be set by the user. Note that the combination of variance and kernel-size values may result in a truncated Gaussian kernel. filter ->SetVariance (gaussianVariance); filter ->SetMaximumKernelWidth(maxKernelWidth); Finally, the filter is executed by invoking the Update() method. filter ->Update(); If the output of this filter has been connected to other filters down the pipeline, updating any of the downstream filters would have triggered the execution of this one. For example, a writer could have been used after the filter. rescaler ->SetInput (filter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); writer ->Update(); Figure 8.19 illustrates the effect of this filter. Note that large Gaussian variances will produce large convolution kernels and correspondingly slower computation times. Unless a high degree of accuracy is required, it may be more desirable to use the approximating itk::RecursiveGaussianImageFilter with large variances.
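As a point of comparison, here is a minimal sketch of the IIR alternative mentioned above, using the convenience filter itk::SmoothingRecursiveGaussianImageFilter, which chains itk::RecursiveGaussianImageFilter along each dimension. The reader and the gaussianVariance variable are assumed to be those of the example above; note that the recursive filter is parameterized by sigma rather than by variance.

#include "itkSmoothingRecursiveGaussianImageFilter.h"
#include <cmath>

typedef itk::SmoothingRecursiveGaussianImageFilter<
    InputImageType, OutputImageType> RecursiveFilterType;

RecursiveFilterType::Pointer recursiveFilter = RecursiveFilterType::New();
recursiveFilter->SetInput(reader->GetOutput());

// Sigma is expressed in the physical units of the image and corresponds to
// the square root of the variance used by the DiscreteGaussianImageFilter.
recursiveFilter->SetSigma(std::sqrt(gaussianVariance));
recursiveFilter->Update();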
Figure 8.19: Effect of the DiscreteGaussianImageFilter.

8.7.2 Edge Preserving Smoothing

Introduction to Anisotropic Diffusion

The drawback of image denoising (smoothing) is that it tends to blur away the sharp boundaries in the image that help to distinguish between the larger-scale anatomical structures that one is trying to characterize (which also limits the size of the smoothing kernels in most applications). Even in cases where smoothing does not obliterate boundaries, it tends to distort the fine structure of the image and thereby changes subtle aspects of the anatomical shapes in question.

Perona and Malik [109] introduced an alternative to linear filtering that they called anisotropic diffusion. Anisotropic diffusion is closely related to the earlier work of Grossberg [54], who used similar nonlinear diffusion processes to model human vision. The motivation for anisotropic diffusion (also called nonuniform or variable conductance diffusion) is that a Gaussian smoothed image is a single time slice of the solution to the heat equation, that has the original image as its initial conditions. Thus, the solution to

\frac{\partial g(x,y,t)}{\partial t} = \nabla \cdot \nabla g(x,y,t),   (8.4)

where g(x,y,0) = f(x,y) is the input image, is g(x,y,t) = G(\sqrt{2t}) \otimes f(x,y), where G(σ) is a Gaussian with standard deviation σ.

Anisotropic diffusion includes a variable conductance term that, in turn, depends on the differential structure of the image. Thus, the variable conductance can be formulated to limit the smoothing at “edges” in images, as measured by high gradient magnitude, for example.

g_t = \nabla \cdot c(|\nabla g|)\, \nabla g,   (8.5)

where, for notational convenience, we leave off the independent parameters of g and use the subscripts with respect to those parameters to indicate partial derivatives. The function c(|∇g|) is a fuzzy cutoff that reduces the conductance at areas of large |∇g|, and can be any one of a number of
functions. The literature has shown

c(|\nabla g|) = e^{-\frac{|\nabla g|^2}{2k^2}}   (8.6)

to be quite effective. Notice that the conductance term introduces a free parameter k, the conductance parameter, that controls the sensitivity of the process to edge contrast. Thus, anisotropic diffusion entails two free parameters: the conductance parameter, k, and the time parameter, t, that is analogous to σ, the effective width of the filter when using Gaussian kernels.

Equation 8.5 is a nonlinear partial differential equation that can be solved on a discrete grid using finite forward differences. Thus, the smoothed image is obtained only by an iterative process, not a convolution or non-stationary, linear filter. Typically, the number of iterations required for practical results is small, and large 2D images can be processed in several tens of seconds using carefully written code running on modern, general purpose, single-processor computers. The technique applies readily and effectively to 3D images, but requires more processing time.

In the early 1990s several research groups [50, 142] demonstrated the effectiveness of anisotropic diffusion on medical images. In a series of papers on the subject [147, 144, 146, 142, 143, 145], Whitaker described a detailed analytical and empirical analysis, introduced a smoothing term in the conductance that made the process more robust, invented a numerical scheme that virtually eliminated directional artifacts in the original algorithm, and generalized anisotropic diffusion to vector-valued images, an image processing technique that can be used on vector-valued medical data (such as the color cryosection data of the Visible Human Project). For a vector-valued input F : U → ℜ^m the process takes the form

F_t = \nabla \cdot c(DF)\, \nabla F,   (8.7)

where DF is a dissimilarity measure of F, a generalization of the gradient magnitude to vector-valued images, that can incorporate linear and nonlinear coordinate transformations on the range of F. In this way, the smoothing of the multiple images associated with vector-valued data is coupled through the conductance term, that fuses the information in the different images. Thus vector-valued, nonlinear diffusion can combine low-level image features (e.g. edges) across all “channels” of a vector-valued image in order to preserve or enhance those features in all of the image “channels”.

Vector-valued anisotropic diffusion is useful for denoising data from devices that produce multiple values such as MRI or color photography. When performing nonlinear diffusion on a color image, the color channels are diffused separately, but linked through the conductance term. Vector-valued diffusion is also useful for processing registered data from different devices or for denoising higher-order geometric or statistical features from scalar-valued images [145, 154].

The output of anisotropic diffusion is an image or set of images that demonstrates reduced noise and texture but preserves, and can also enhance, edges. Such images are useful for a variety of processes including statistical classification, visualization, and geometric feature extraction. Previous work has shown [143] that anisotropic diffusion, over a wide range of conductance parameters, offers quantifiable advantages over linear filtering for edge detection in medical images. Since the effectiveness of nonlinear diffusion was first demonstrated, numerous variations of this approach have surfaced in the literature [131].
These include alternatives for constructing dissimilarity
measures [124], directional (i.e., tensor-valued) conductance terms [140, 7] and level set interpretations [148].

Gradient Anisotropic Diffusion

The source code for this example can be found in the file Examples/Filtering/GradientAnisotropicDiffusionImageFilter.cxx. The itk::GradientAnisotropicDiffusionImageFilter implements an N-dimensional version of the classic Perona-Malik anisotropic diffusion equation for scalar-valued images [109]. The conductance term for this implementation is chosen as a function of the gradient magnitude of the image at each point, reducing the strength of diffusion at edge pixels.

C(x) = e^{-\left( \frac{\| \nabla U(x) \|}{K} \right)^2}   (8.8)

The numerical implementation of this equation is similar to that described in the Perona-Malik paper [109], but uses a more robust technique for gradient magnitude estimation and has been generalized to N-dimensions. The first step required to use this filter is to include its header file. #include "itkGradientAnisotropicDiffusionImageFilter.h" Types should be selected based on the pixel types required for the input and output images. The image types are defined using the pixel type and the dimension. typedef float InputPixelType; typedef float OutputPixelType; typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter type is now instantiated using both the input image and the output image types. The filter object is created by the New() method. typedef itk::GradientAnisotropicDiffusionImageFilter< InputImageType , OutputImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); The input image can be obtained from the output of another filter. Here, an image reader is used as source. filter ->SetInput(reader ->GetOutput ()); This filter requires three parameters: the number of iterations to be performed, the time step and the conductance parameter used in the computation of the level set evolution.
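To make the role of these three parameters explicit, the filter essentially iterates an explicit finite forward difference update of Equation 8.5. Schematically (the actual ITK discretization handles neighborhoods and stability more carefully than this sketch suggests):

g^{n+1} = g^{n} + \Delta t \; \nabla \cdot \bigl( c(\| \nabla g^{n} \|)\, \nabla g^{n} \bigr), \qquad n = 0, \ldots, N_{\mathrm{iter}} - 1

where \Delta t is the time step, N_{\mathrm{iter}} the number of iterations, and the conductance parameter K enters through the conductance function c(\cdot) of Equation 8.8.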
  • 212. 178 Chapter 8. Basic Filtering Figure 8.20: Effect of the GradientAnisotropicDiffusionImageFilter. These parameters are set using the methods SetNumberOfIterations(), SetTimeStep() and SetConductanceParameter() respectively. The filter can be executed by invoking Update(). filter ->SetNumberOfIterations(numberOfIterations); filter ->SetTimeStep (timeStep ); filter ->SetConductanceParameter(conductance ); filter ->Update(); A typical value for the time step is 0.125. The number of iterations is typically set to 5; more iterations result in further smoothing and will increase the computing time linearly. Figure 8.20 illustrates the effect of this filter. In this example the filter was run with a time step of 0.125, and 5 iterations. The figure shows how homogeneous regions are smoothed and edges are preserved. The following classes provide similar functionality: • itk::BilateralImageFilter • itk::CurvatureAnisotropicDiffusionImageFilter • itk::CurvatureFlowImageFilter Mean Shift filtering and clustering The source code for this example can be found in the file Examples/BasicFilters/MeanShiftSegmentationFilterExample.cxx. This example demonstrates the use of the otb::MeanShiftSegmentationFilter class which implements filtering and clustering using the mean shift algorithm [30]. For a given pixel, the
mean shift will build a set of neighboring pixels within a given spatial radius and a color range. The spatial and color center of this set is then computed and the algorithm iterates with this new spatial and color center. The Mean Shift can be used for edge-preserving smoothing or for clustering. We start by including the needed header file. #include "otbMeanShiftSegmentationFilter.h" We then declare the classical typedefs needed for reading and writing the images. const unsigned int Dimension = 2; typedef float PixelType ; typedef unsigned int LabelPixelType; typedef itk::RGBPixel <unsigned char> ColorPixelType; typedef otb::VectorImage <PixelType , Dimension > ImageType ; typedef otb::Image <LabelPixelType , Dimension > LabelImageType; typedef otb::Image <ColorPixelType , Dimension > RGBImageType; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <ImageType > WriterType ; typedef otb::ImageFileWriter <LabelImageType > LabelWriterType; typedef otb::MeanShiftSegmentationFilter<ImageType , LabelImageType , ImageType > FilterType ; We instantiate the filter, the reader, and two writers (for the clustered and labeled images). FilterType ::Pointer filter = FilterType ::New (); ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer1 = WriterType ::New (); LabelWriterType::Pointer writer2 = LabelWriterType::New (); We set the file names for the reader and the writers: reader ->SetFileName (infname ); writer1 ->SetFileName (clusteredfname); writer2 ->SetFileName (labeledfname); We can now set the parameters for the filter. There are three main parameters: the spatial radius used for defining the neighborhood, the range radius used for defining the interval in the color space, and the minimum size for the regions to be kept after clustering. filter ->SetSpatialBandwidth(spatialRadius); filter ->SetRangeBandwidth(rangeRadius ); filter ->SetMinRegionSize(minRegionSize); Two other parameters can be set: the maximum number of iterations and the convergence threshold. The iterative scheme stops either when the norm of the mean-shift vector falls below the threshold (convergence) or when the maximum number of iterations has been reached.
Figure 8.21: From top to bottom and left to right: Original image, image filtered by mean shift after clustering, and labeled image.

filter ->SetMaxIterationNumber(maxiter ); filter ->SetThreshold(thres); We can now plug the pipeline and run it. filter ->SetInput(reader ->GetOutput ()); writer1 ->SetInput(filter ->GetClusteredOutput()); writer2 ->SetInput(filter ->GetLabelOutput()); writer1 ->Update (); writer2 ->Update (); Figure 8.21 shows the result of applying the mean shift to a Quickbird image.

8.7.3 Edge Preserving Speckle Reduction Filters

The source code for this example can be found in the file Examples/BasicFilters/LeeImageFilter.cxx. This example illustrates the use of the otb::LeeImageFilter. This filter belongs to the family of the edge-preserving smoothing filters which are usually used for speckle reduction in radar images. The Lee filter [85] applies a linear regression which minimizes the mean-square error under a multiplicative speckle model.
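For reference, a common way of writing the Lee estimate is given below. This formulation is provided as background only and is an assumption on our part; the exact expression implemented by otb::LeeImageFilter is not reproduced in this example.

\hat{I}(s) = \bar{I}(s) + k(s)\,\bigl(I(s) - \bar{I}(s)\bigr),
\qquad
k(s) = 1 - \frac{C_u^2}{C_I^2(s)},
\qquad
C_u = \frac{1}{\sqrt{L}}

where \bar{I}(s) is the local mean computed over the analysis window, C_I(s) the local coefficient of variation, L the number of looks, and k(s) is usually clipped to the interval [0, 1]. The window size and the number of looks correspond to the SetRadius() and SetNbLooks() parameters used below.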
  • 215. 8.7. Smoothing Filters 181 The first step required to use this filter is to include its header file. #include "otbLeeImageFilter.h" Then we must decide what pixel type to use for the image. typedef unsigned char PixelType ; The images are defined using the pixel type and the dimension. typedef otb::Image <PixelType , 2> InputImageType; typedef otb::Image <PixelType , 2> OutputImageType; The filter can be instantiated using the image types defined above. typedef otb::LeeImageFilter <InputImageType , OutputImageType > FilterType ; An otb::ImageFileReader class is also instantiated in order to read image data from a file. typedef otb::ImageFileReader <InputImageType > ReaderType ; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter <OutputImageType > WriterType ; Both the filter and the reader are created by invoking their New() methods and assigning the result to SmartPointers. ReaderType ::Pointer reader = ReaderType ::New (); FilterType ::Pointer filter = FilterType ::New (); The image obtained with the reader is passed as input to the otb::LeeImageFilter. filter ->SetInput(reader ->GetOutput ()); The method SetRadius() defines the size of the window to be used for the computation of the local statistics. The method SetNbLooks() sets the number of looks of the input image. FilterType ::SizeType Radius; Radius[0] = atoi(argv[3]); Radius[1] = atoi(argv[3]); filter ->SetRadius (Radius); filter ->SetNbLooks (atoi(argv[4])); Figure 8.22 shows the result of applying the Lee filter to a SAR image. The following classes provide similar functionality: • otb::FrostImageFilter
Figure 8.22: Result of applying the otb::LeeImageFilter to a SAR image.

The source code for this example can be found in the file Examples/BasicFilters/FrostImageFilter.cxx. This example illustrates the use of the otb::FrostImageFilter. This filter belongs to the family of the edge-preserving smoothing filters which are usually used for speckle reduction in radar images. This filter uses a negative exponential convolution kernel. The output of the filter for pixel s is:

\hat{I}_s = \sum_{p \in \eta_p} m_p I_p

where:

m_p = \frac{K C_s^2 \exp(-K C_s^2 \, d_{s,p})}{\sum_{p \in \eta_p} K C_s^2 \exp(-K C_s^2 \, d_{s,p})}
\qquad \text{and} \qquad
d_{s,p} = \sqrt{(i - i_p)^2 + (j - j_p)^2}

• K: the decrease coefficient
• (i, j): the coordinates of the pixel inside the region defined by η_s
• (i_p, j_p): the coordinates of the pixels belonging to η_p ⊂ η_s
• C_s: the variation coefficient computed over η_p

Most of this example is similar to the previous one and only the differences will be highlighted. First, we need to include the header: #include "otbFrostImageFilter.h" The filter can be instantiated using the image types defined previously. typedef otb::FrostImageFilter <InputImageType , OutputImageType > FilterType ; The image obtained with the reader is passed as input to the otb::FrostImageFilter. filter ->SetInput(reader ->GetOutput ());
Figure 8.23: Result of applying the otb::FrostImageFilter to a SAR image.

The method SetRadius() defines the size of the window to be used for the computation of the local statistics. The method SetDeramp() sets the K coefficient. FilterType ::SizeType Radius; Radius[0] = atoi(argv[3]); Radius[1] = atoi(argv[3]); filter ->SetRadius (Radius); filter ->SetDeramp (atof(argv[4])); Figure 8.23 shows the result of applying the Frost filter to a SAR image. The following classes provide similar functionality: • otb::LeeImageFilter

8.7.4 Edge Preserving Markov Random Field

The Markov Random Field framework of OTB is described in more detail in section 19.2.6 (p. 492). The source code for this example can be found in the file Examples/Markov/MarkovRestaurationExample.cxx. The Markov Random Field framework can be used to apply edge-preserving filtering, and thus plays a restoration role. This example applies the otb::MarkovRandomFieldFilter for image restoration. The structure of the example is similar to the other MRF example. The original image is assumed to be coded on one byte, thus 256 states are possible for each pixel. The only other modifications reside in the energy functions chosen for the fidelity and for the regularization. For the regularization energy function, we choose an edge preserving function:
\Phi(u) = \frac{u^2}{1 + u^2}   (8.9)

and for the fidelity function, we choose a Gaussian model. The starting state of the Markov Random Field is given by the image itself, as the final state should not be too far from it. The first step toward the use of this filter is the inclusion of the proper header files: #include "otbMRFEnergyEdgeFidelity.h" #include "otbMRFEnergyGaussian.h" #include "otbMRFOptimizerMetropolis.h" #include "otbMRFSamplerRandom.h" We declare the usual types: const unsigned int Dimension = 2; typedef double InternalPixelType; typedef unsigned char LabelledPixelType; typedef otb::Image <InternalPixelType , Dimension > InputImageType; typedef otb::Image <LabelledPixelType , Dimension > LabelledImageType; We need to declare an additional reader for the initial state of the MRF. This reader has to be instantiated on the LabelledImageType. typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileReader <LabelledImageType > ReaderLabelledType; typedef otb::ImageFileWriter <LabelledImageType > WriterType ; ReaderType ::Pointer reader = ReaderType ::New(); ReaderLabelledType::Pointer reader2 = ReaderLabelledType::New (); WriterType ::Pointer writer = WriterType ::New(); const char * inputFilename = argv[1]; const char * labelledFilename = argv[2]; const char * outputFilename = argv[3]; reader ->SetFileName (inputFilename); reader2 ->SetFileName (labelledFilename); writer ->SetFileName (outputFilename); We declare all the necessary types for the MRF: typedef otb::MarkovRandomFieldFilter <InputImageType , LabelledImageType > MarkovRandomFieldFilterType; typedef otb::MRFSamplerRandom <InputImageType , LabelledImageType > SamplerType ; typedef otb::MRFOptimizerMetropolis OptimizerType;
  • 219. 8.7. Smoothing Filters 185 The regularization and the fidelity energy are declared and instanciated: typedef otb:: MRFEnergyEdgeFidelity <LabelledImageType , LabelledImageType > EnergyRegularizationType; typedef otb::MRFEnergyGaussian <InputImageType , LabelledImageType > EnergyFidelityType; MarkovRandomFieldFilterType::Pointer markovFilter = MarkovRandomFieldFilterType::New(); EnergyRegularizationType::Pointer energyRegularization = EnergyRegularizationType::New (); EnergyFidelityType::Pointer energyFidelity = EnergyFidelityType::New(); OptimizerType::Pointer optimizer = OptimizerType::New(); SamplerType ::Pointer sampler = SamplerType ::New (); The number of possible states for each pixel is 256 as the image is assumed to be coded on one byte and we pass the parameters to the markovFilter. unsigned int nClass = 256; optimizer ->SetSingleParameter(atof(argv[6])); markovFilter ->SetNumberOfClasses(nClass); markovFilter ->SetMaximumNumberOfIterations(atoi(argv[5])); markovFilter ->SetErrorTolerance(0.0); markovFilter ->SetLambda (atof(argv[4])); markovFilter ->SetNeighborhoodRadius(1); markovFilter ->SetEnergyRegularization(energyRegularization); markovFilter ->SetEnergyFidelity(energyFidelity); markovFilter ->SetOptimizer(optimizer ); markovFilter ->SetSampler (sampler ); The original state of the MRF filter is passed through the SetTrainingInput() method: markovFilter ->SetTrainingInput(reader2->GetOutput ()); And we plug the pipeline: markovFilter ->SetInput(reader ->GetOutput ()); typedef itk:: RescaleIntensityImageFilter <LabelledImageType , LabelledImageType > RescaleType ; RescaleType ::Pointer rescaleFilter = RescaleType ::New(); rescaleFilter ->SetOutputMinimum(0); rescaleFilter ->SetOutputMaximum(255); rescaleFilter ->SetInput(markovFilter ->GetOutput ()); writer ->SetInput(rescaleFilter ->GetOutput ()); writer ->Update();
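To summarize the model used in this example: up to normalization details that are not shown here (an assumption on our part, not a transcription of the OTB sources), the filter looks for the restored image x that minimizes an energy combining the Gaussian fidelity to the observed image y with the edge-preserving regularization of Equation 8.9, weighted by the value passed to SetLambda():

U(x \mid y) = \sum_{s} (x_s - y_s)^2 \; + \; \lambda \sum_{(s,t) \in \mathcal{N}} \Phi(x_s - x_t)

where the second sum runs over the pairs of neighboring pixels defined by the neighborhood radius, and the Metropolis optimizer accepts or rejects the random proposals made by the sampler according to the resulting change in energy.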
  • 220. 186 Chapter 8. Basic Filtering Figure 8.24: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird image for restauration. From left to right : original image, restaured image with edge preservation. Figure 8.24 shows the output of the Markov Random Field restauration. 8.8 Distance Map The source code for this example can be found in the file Examples/Filtering/DanielssonDistanceMapImageFilter.cxx. This example illustrates the use of the itk::DanielssonDistanceMapImageFilter. This filter generates a distance map from the input image using the algorithm developed by Danielsson [33]. As secondary outputs, a Voronoi partition of the input elements is produced, as well as a vector image with the components of the distance vector to the closest point. The input to the map is assumed to be a set of points on the input image. Each point/pixel is considered to be a separate entity even if they share the same gray level value. The first step required to use this filter is to include its header file. #include "itkDanielssonDistanceMapImageFilter.h" Then we must decide what pixel types to use for the input and output images. Since the output will contain distances measured in pixels, the pixel type should be able to represent at least the width of the image, or said in N − D terms, the maximum extension along all the dimensions. The input and output image types are now defined using their respective pixel type and dimension. typedef unsigned char InputPixelType; typedef unsigned short OutputPixelType; typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType;
  • 221. 8.8. Distance Map 187 Figure 8.25: DanielssonDistanceMapImageFilter output. Set of pixels, distance map and Voronoi partition. The filter type can be instantiated using the input and output image types defined above. A filter object is created with the New() method. typedef itk::DanielssonDistanceMapImageFilter< InputImageType , OutputImageType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); The input to the filter is taken from a reader and its output is passed to a itk::RescaleIntensityImageFilter and then to a writer. filter ->SetInput(reader ->GetOutput ()); scaler ->SetInput(filter ->GetOutput ()); writer ->SetInput(scaler ->GetOutput ()); The type of input image has to be specified. In this case, a binary image is selected. filter ->InputIsBinaryOn(); Figure 8.25 illustrates the effect of this filter on a binary image with a set of points. The input image is shown at left, the distance map at the center and the Voronoi partition at right. This filter computes distance maps in N-dimensions and is therefore capable of producing N − D Voronoi partitions. The Voronoi map is obtained with the GetVoronoiMap() method. In the lines below we connect this output to the intensity rescaler and save the result in a file. scaler ->SetInput(filter ->GetVoronoiMap()); writer ->SetFileName (voronoiMapFileName); writer ->Update(); Execution of the writer is triggered by the invocation of the Update() method. Since this method can potentially throw exceptions it must be placed in a try/catch block.
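A minimal sketch of such a try/catch block, following the same idiom as the registration example of Chapter 9, is shown below.

try
  {
  writer->Update();
  }
catch (itk::ExceptionObject& err)
  {
  std::cerr << "ExceptionObject caught !" << std::endl;
  std::cerr << err << std::endl;
  return EXIT_FAILURE;
  }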
8.9 Rasterization

Rasterization is the process of rendering vectorial data on a raster grid. This rasterization can be either binary or more complete, including different styles and labels for the vectorial features. For rasterization purposes, OTB uses the Mapnik library through the otb::VectorDataToImageFilter. Hence, rasterization will only be available to users compiling OTB with the Mapnik CMake option set to ON, or using a binary package with Mapnik activated.

The source code for this example can be found in the file Examples/Filtering/RasterizationExample.cxx. The otb::VectorDataToMapFilter makes it possible to rasterize a given vector data set as a binary mask. This example will demonstrate how to use this filter to perform rasterization of the SRTM water body masks available here: http://dds.cr.usgs.gov/srtm/version2_1/SWBD/. The first step to use this filter is to include the appropriate headers: #include "otbVectorData.h" #include "otbImage.h" #include "otbVectorDataToMapFilter.h" Then, we need to define the appropriate VectorData and Image types. typedef unsigned char PixelType ; typedef otb::Image <PixelType , 2> ImageType ; typedef otb::VectorData <> VectorDataType; Using these typedefs, we can define and instantiate the otb::VectorDataToMapFilter. typedef otb::VectorDataToMapFilter <VectorDataType , ImageType > VectorDataToMapFilterType; VectorDataToMapFilterType::Pointer vectorDataRendering = VectorDataToMapFilterType::New(); We will also define an otb::VectorDataFileReader to read the VectorData, as well as an otb::VectorDataProjectionFilter to reproject our data in a given map projection. typedef otb::VectorDataFileReader <VectorDataType > VectorDataFileReaderType; VectorDataFileReaderType::Pointer reader = VectorDataFileReaderType::New(); reader ->SetFileName (argv[1]); typedef otb::VectorDataProjectionFilter<VectorDataType , VectorDataType > ProjectionFilterType; ProjectionFilterType::Pointer projection = ProjectionFilterType::New (); projection ->SetInput (reader ->GetOutput ()); The next step is to specify the map projection in which to reproject our vector. projection ->SetOutputProjectionRef(projectionRefWkt);
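The projectionRefWkt variable holding the target map projection is not defined in the snippet above. One possible way of obtaining such a WKT string, shown here purely as an illustration (the file name is hypothetical and this is not part of the example file), is to reuse the projection reference of an existing georeferenced image covering the area of interest:

#include "otbImageFileReader.h"
#include <string>

// Hypothetical: read only the metadata of a georeferenced image and reuse
// its WKT projection reference as the output projection of the filter.
typedef otb::ImageFileReader<ImageType> ImageReaderType;
ImageReaderType::Pointer refReader = ImageReaderType::New();
refReader->SetFileName("reference_utm_scene.tif"); // hypothetical file
refReader->UpdateOutputInformation();              // reads metadata only
std::string projectionRefWkt = refReader->GetOutput()->GetProjectionRef();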
  • 223. 8.9. Rasterization 189 Since the input vector can be pretty big, we will perform an extract of the region of interest using the otb::VectorDataExtractROI. The first step is to define the region of interest. typedef otb::RemoteSensingRegion <double> RegionType ; ImageType ::SizeType size; size[0] = 500; size[1] = 500; ImageType ::PointType origin; origin[0] = 633602; //UL easting origin[1] = 4961679; //UL northing ImageType :: SpacingType spacing; spacing [0] = 56; spacing [1] = -56; RegionType region; RegionType ::SizeType sizeInUnit ; sizeInUnit [0] = size[0] * spacing [0]; sizeInUnit [1] = size[1] * spacing [1]; region.SetSize(sizeInUnit ); region.SetOrigin (origin); region.SetRegionProjection(projectionRefWkt); Then, we define and set-up the otb::VectorDataExtractROI filter using the region. typedef otb::VectorDataExtractROI <VectorDataType > ExtractROIType; ExtractROIType::Pointer extractROI = ExtractROIType::New (); extractROI ->SetRegion (region); extractROI ->SetInput (projection ->GetOutput ()); Now, we can plug the the ROI filter to the otb::VectorDataToMapFilter. vectorDataRendering ->SetInput(extractROI ->GetOutput ()); vectorDataRendering ->SetSize(size); vectorDataRendering ->SetOrigin (origin); vectorDataRendering ->SetSpacing (spacing ); Since we are interested in binary rendering, we need to set the appropriate rendering style. vectorDataRendering ->SetRenderingStyleType(VectorDataToMapFilterType::Binary); The rendering filter will return a binary image with label 0 when outside the rasterized vector features and 255 when inside. To get a fancier rendering we will substitute a blue color to the foreground value and green to the background value. typedef itk::RGBAPixel <unsigned char> RGBAPixelType;
  • 224. 190 Chapter 8. Basic Filtering Figure 8.26: Rasterized SRTM water bodies near Arcachon, France. typedef otb::Image <RGBAPixelType , 2> RGBAImageType; typedef itk::ChangeLabelImageFilter <ImageType , RGBAImageType > ChangeLabelImageFilterType; ChangeLabelImageFilterType::Pointer changeLabelFilter = ChangeLabelImageFilterType::New(); RGBAPixelType green , blue; green.SetAlpha (255); green.SetGreen (255); blue.SetAlpha (255); blue.SetBlue (255); changeLabelFilter ->SetChange (0, blue); changeLabelFilter ->SetChange (255, green); changeLabelFilter ->SetInput (vectorDataRendering ->GetOutput ()); Last step is to write the image to the disk using a otb::ImageFileWriter. typedef otb::ImageFileWriter <RGBAImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(changeLabelFilter ->GetOutput ()); writer ->SetFileName (argv[2]); writer ->Update(); Figure 8.26 illustrates the use of the rasterization filter on SRTM water bodies mask near Arcachon in France. Ocean appears in blue while land appears in green.
CHAPTER NINE

IMAGE REGISTRATION

Figure 9.1: Image registration is the task of finding a spatial transform mapping one image into another.

This chapter introduces OTB’s (actually mainly ITK’s) capabilities for performing image registration. Please note that the disparity map estimation approach presented in chapter 10 is very closely related to image registration. Image registration is the process of determining the spatial transform that maps points from one image to homologous points on an object in the second image. This concept is schematically represented in Figure 9.1. In OTB, registration is performed within a framework of pluggable components that can easily be interchanged. This flexibility means that a combinatorial variety of registration methods can be created, allowing users to pick and choose the right tools for their specific application.

9.1 Registration Framework

The components of the registration framework and their interconnections are shown in Figure 9.2. The basic input data to the registration process are two images: one is defined as the fixed image f(X) and the other as the moving image m(X), where X represents a position in N-dimensional space. Registration is treated as an optimization problem with the goal of finding the spatial mapping that will bring the moving image into alignment with the fixed image. The transform component T(X) represents the spatial mapping of points from the fixed image space to points in the moving image space. The interpolator is used to evaluate moving image intensities at non-grid positions. The metric component S(f, m ◦ T) provides a measure of how well the fixed image is matched by the transformed moving image. This measure forms the quantitative criterion to be optimized by the optimizer over the search space defined by the parameters of the transform.
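In symbols, and with the notation used above, the framework searches for the transform parameters θ that optimize the metric between the fixed image and the transformed moving image:

\hat{\theta} = \arg\min_{\theta} \; S\bigl(f, \; m \circ T_{\theta}\bigr)

(or arg max for metrics in which larger values indicate a better match). The remaining components simply make this optimization possible: the interpolator evaluates m at the non-grid positions produced by T_θ, and the optimizer explores the parameter space of the transform.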
Figure 9.2: The basic components of the registration framework are two input images, a transform, a metric, an interpolator and an optimizer.

These various OTB/ITK registration components will be described in later sections. First, we begin with some simple registration examples.

9.2 ”Hello World” Registration

The source code for this example can be found in the file Examples/Registration/ImageRegistration1.cxx. This example illustrates the use of the image registration framework in ITK/OTB. It should be read as a ”Hello World” for registration, which means that for now, you don’t ask “why?”. Instead, use the example as an introduction to the elements that are typically involved in solving an image registration problem. A registration method requires the following set of components: two input images, a transform, a metric, an interpolator and an optimizer. Some of these components are parameterized by the image type for which the registration is intended. The following header files provide declarations of common types used for these components. #include "itkImageRegistrationMethod.h" #include "itkTranslationTransform.h" #include "itkMeanSquaresImageToImageMetric.h" #include "itkLinearInterpolateImageFunction.h" #include "itkRegularStepGradientDescentOptimizer.h" #include "otbImage.h" The types of each one of the components in the registration method should be instantiated first. With that purpose, we start by selecting the image dimension and the type used for representing image pixels. const unsigned int Dimension = 2; typedef float PixelType ;
  • 227. 9.2. ”Hello World” Registration 193 The types of the input images are instantiated by the following lines. typedef otb::Image <PixelType , Dimension > FixedImageType; typedef otb::Image <PixelType , Dimension > MovingImageType; The transform that will map the fixed image space into the moving image space is defined below. typedef itk::TranslationTransform <double, Dimension > TransformType; An optimizer is required to explore the parameter space of the transform in search of optimal values of the metric. typedef itk:: RegularStepGradientDescentOptimizer OptimizerType; The metric will compare how well the two images match each other. Metric types are usually parameterized by the image types as it can be seen in the following type declaration. typedef itk::MeanSquaresImageToImageMetric< FixedImageType , MovingImageType > MetricType ; Finally, the type of the interpolator is declared. The interpolator will evaluate the intensities of the moving image at non-grid positions. typedef itk::LinearInterpolateImageFunction< MovingImageType , double> InterpolatorType; The registration method type is instantiated using the types of the fixed and moving images. This class is responsible for interconnecting all the components that we have described so far. typedef itk::ImageRegistrationMethod < FixedImageType , MovingImageType > RegistrationType; Each one of the registration components is created using its New() method and is assigned to its respective itk::SmartPointer. MetricType ::Pointer metric = MetricType ::New (); TransformType::Pointer transform = TransformType::New(); OptimizerType::Pointer optimizer = OptimizerType::New(); InterpolatorType::Pointer interpolator = InterpolatorType::New (); RegistrationType::Pointer registration = RegistrationType::New (); Each component is now connected to the instance of the registration method. registration ->SetMetric (metric); registration ->SetOptimizer(optimizer ); registration ->SetTransform(transform ); registration ->SetInterpolator(interpolator);
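At this point it is also possible to attach an observer to the optimizer, so that its progress is printed at each iteration. This is not part of the example file quoted above; the sketch below simply uses the standard itk::Command observer idiom with the RegularStepGradientDescentOptimizer declared earlier.

#include "itkCommand.h"
#include <iostream>

class CommandIterationUpdate : public itk::Command
{
public:
  typedef CommandIterationUpdate  Self;
  typedef itk::Command            Superclass;
  typedef itk::SmartPointer<Self> Pointer;
  itkNewMacro(Self);

  typedef itk::RegularStepGradientDescentOptimizer OptimizerType;
  typedef const OptimizerType*                     OptimizerPointer;

  void Execute(itk::Object* caller, const itk::EventObject& event)
  {
    Execute(static_cast<const itk::Object*>(caller), event);
  }

  void Execute(const itk::Object* object, const itk::EventObject& event)
  {
    OptimizerPointer optimizer = dynamic_cast<OptimizerPointer>(object);
    if (!itk::IterationEvent().CheckEvent(&event))
      {
      return;
      }
    // Print the iteration number, the metric value and the current parameters.
    std::cout << optimizer->GetCurrentIteration() << " : "
              << optimizer->GetValue() << " : "
              << optimizer->GetCurrentPosition() << std::endl;
  }

protected:
  CommandIterationUpdate() {}
};

// The observer is attached to the optimizer before the registration is run.
CommandIterationUpdate::Pointer observer = CommandIterationUpdate::New();
optimizer->AddObserver(itk::IterationEvent(), observer);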
Since we are working with high resolution images and the expected shifts are larger than the resolution, we will need to smooth the images in order to prevent the optimizer from getting stuck in local minima. In order to do this, we will use a simple mean filter. typedef itk::MeanImageFilter < FixedImageType , FixedImageType > FixedFilterType; typedef itk::MeanImageFilter < MovingImageType , MovingImageType > MovingFilterType; FixedFilterType::Pointer fixedFilter = FixedFilterType::New (); MovingFilterType::Pointer movingFilter = MovingFilterType::New (); FixedImageType::SizeType indexFRadius; indexFRadius[0] = 4; // radius along x indexFRadius[1] = 4; // radius along y fixedFilter ->SetRadius (indexFRadius); MovingImageType::SizeType indexMRadius; indexMRadius[0] = 4; // radius along x indexMRadius[1] = 4; // radius along y movingFilter ->SetRadius (indexMRadius); fixedFilter ->SetInput (fixedImageReader ->GetOutput ()); movingFilter ->SetInput(movingImageReader ->GetOutput ()); Now we can plug the output of the smoothing filters into the input of the registration method. registration ->SetFixedImage(fixedFilter ->GetOutput ()); registration ->SetMovingImage(movingFilter ->GetOutput ()); The registration can be restricted to consider only a particular region of the fixed image as input to the metric computation. This region is defined with the SetFixedImageRegion() method. You could use this feature to reduce the computational time of the registration or to prevent unwanted objects present in the image from affecting the registration outcome. In this example we use the full available content of the image. This region is identified by the BufferedRegion of the fixed image. Note that for this region to be valid the reader must first invoke its Update() method. fixedFilter ->Update(); registration ->SetFixedImageRegion( fixedFilter ->GetOutput ()->GetBufferedRegion()); The parameters of the transform are initialized by passing them in an array. This can be used to set up an initial known correction of the misalignment. In this particular case, a translation transform is being used for the registration. The array of parameters for this transform is simply composed of the translation values along each dimension. Setting the values of the parameters to zero initializes the
  • 229. 9.2. ”Hello World” Registration 195 transform to an Identity transform. Note that the array constructor requires the number of elements to be passed as an argument. typedef RegistrationType::ParametersType ParametersType; ParametersType initialParameters(transform ->GetNumberOfParameters()); initialParameters[0] = 0.0; // Initial offset in mm along X initialParameters[1] = 0.0; // Initial offset in mm along Y registration ->SetInitialTransformParameters(initialParameters); At this point the registration method is ready for execution. The optimizer is the component that drives the execution of the registration. However, the ImageRegistrationMethod class orchestrates the ensemble to make sure that everything is in place before control is passed to the optimizer. It is usually desirable to fine tune the parameters of the optimizer. Each optimizer has particular parameters that must be interpreted in the context of the optimization strategy it implements. The optimizer used in this example is a variant of gradient descent that attempts to prevent it from taking steps that are too large. At each iteration, this optimizer will take a step along the direction of the itk::ImageToImageMetric derivative. The initial length of the step is defined by the user. Each time the direction of the derivative abruptly changes, the optimizer assumes that a local extrema has been passed and reacts by reducing the step length by a half. After several reductions of the step length, the optimizer may be moving in a very restricted area of the transform parameter space. The user can define how small the step length should be to consider convergence to have been reached. This is equivalent to defining the precision with which the final transform should be known. The initial step length is defined with the method SetMaximumStepLength(), while the tolerance for convergence is defined with the method SetMinimumStepLength(). optimizer ->SetMaximumStepLength(3); optimizer ->SetMinimumStepLength(0.01); In case the optimizer never succeeds reaching the desired precision tolerance, it is prudent to estab- lish a limit on the number of iterations to be performed. This maximum number is defined with the method SetNumberOfIterations(). optimizer ->SetNumberOfIterations(200); The registration process is triggered by an invocation to the Update() method. If something goes wrong during the initialization or execution of the registration an exception will be thrown. We should therefore place the Update() method inside a try/catch block as illustrated in the following lines. try { registration ->Update (); } catch (itk:: ExceptionObject& err) {
  • 230. 196 Chapter 9. Image Registration std::cerr << "ExceptionObject caught !" << std::endl; std::cerr << err << std::endl; return -1; } In a real life application, you may attempt to recover from the error by taking more effective actions in the catch block. Here we are simply printing out a message and then terminating the execution of the program. The result of the registration process is an array of parameters that defines the spatial transformation in an unique way. This final result is obtained using the GetLastTransformParameters() method. ParametersType finalParameters = registration ->GetLastTransformParameters(); In the case of the itk::TranslationTransform, there is a straightforward interpretation of the parameters. Each element of the array corresponds to a translation along one spatial dimension. const double TranslationAlongX = finalParameters[0]; const double TranslationAlongY = finalParameters[1]; The optimizer can be queried for the actual number of iterations performed to reach convergence. The GetCurrentIteration() method returns this value. A large number of iterations may be an indication that the maximum step length has been set too small, which is undesirable since it results in long computational times. const unsigned int numberOfIterations = optimizer ->GetCurrentIteration(); The value of the image metric corresponding to the last set of parameters can be obtained with the GetValue() method of the optimizer. const double bestValue = optimizer ->GetValue (); Let’s execute this example over two of the images provided in Examples/Data: • QB Suburb.png • QB Suburb13x17y.png The second image is the result of intentionally translating the first image by (13,17) pixels. Both images have unit-spacing and are shown in Figure 9.3. The registration takes 18 iterations and the resulting transform parameters are: Translation X = 12.0192 Translation Y = 16.0231
  • 231. 9.2. ”Hello World” Registration 197 Figure 9.3: Fixed and Moving image provided as input to the registration method. As expected, these values match quite well the misalignment that we intentionally introduced in the moving image. It is common, as the last step of a registration task, to use the resulting transform to map the moving image into the fixed image space. This is easily done with the itk::ResampleImageFilter. First, a ResampleImageFilter type is instantiated using the image types. It is convenient to use the fixed image type as the output type since it is likely that the transformed moving image will be compared with the fixed image. typedef itk::ResampleImageFilter < MovingImageType , FixedImageType > ResampleFilterType; A resampling filter is created and the moving image is connected as its input. ResampleFilterType::Pointer resampler = ResampleFilterType::New (); resampler ->SetInput (movingImageReader ->GetOutput ()); The Transform that is produced as output of the Registration method is also passed as input to the resampling filter. Note the use of the methods GetOutput() and Get(). This combination is needed here because the registration method acts as a filter whose output is a transform decorated in the form of a itk::DataObject. For details in this construction you may want to read the documentation of the itk::DataObjectDecorator. resampler ->SetTransform(registration ->GetOutput ()->Get()); The ResampleImageFilter requires additional parameters to be specified, in particular, the spacing, origin and size of the output image. The default pixel value is also set to a distinct gray level in order to highlight the regions that are mapped outside of the moving image. FixedImageType::Pointer fixedImage = fixedImageReader ->GetOutput (); resampler ->SetSize(fixedImage ->GetLargestPossibleRegion(). GetSize ());
  • 232. 198 Chapter 9. Image Registration Figure 9.4: Mapped moving image and its difference with the fixed image before and after registration resampler ->SetOutputOrigin(fixedImage ->GetOrigin ()); resampler ->SetOutputSpacing(fixedImage ->GetSpacing ()); resampler ->SetDefaultPixelValue(100); The output of the filter is passed to a writer that will store the image in a file. An itk::CastImageFilter is used to convert the pixel type of the resampled image to the final type used by the writer. The cast and writer filters are instantiated below. typedef unsigned char OutputPixelType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; typedef itk::CastImageFilter <FixedImageType , OutputImageType > CastFilterType; typedef otb::ImageFileWriter <OutputImageType > WriterType ; The filters are created by invoking their New() method. WriterType ::Pointer writer = WriterType ::New (); CastFilterType::Pointer caster = CastFilterType::New(); The filters are connected together and the Update() method of the writer is invoked in order to trigger the execution of the pipeline. caster ->SetInput(resampler ->GetOutput ()); writer ->SetInput(caster ->GetOutput ()); writer ->Update(); The fixed image and the transformed moving image can easily be compared using the itk::SubtractImageFilter. This pixel-wise filter computes the difference between homologous pixels of its two input images. typedef itk::SubtractImageFilter < FixedImageType , FixedImageType , FixedImageType > DifferenceFilterType; DifferenceFilterType::Pointer difference = DifferenceFilterType::New ();
Figure 9.5: Pipeline structure of the registration example, showing the readers, the smoothing filters, the registration method and its components (metric, optimizer, transform, interpolator), the resample filters, the subtract filters and the writers.

difference ->SetInput1 (fixedImageReader ->GetOutput ()); difference ->SetInput2 (resampler ->GetOutput ()); Note that the use of subtraction as a method for comparing the images is appropriate here because we chose to represent the images using a pixel type float. A different filter would have been used if the pixel type of the images were any of the unsigned integer types. Since the differences between the two images may correspond to very low values of intensity, we rescale those intensities with an itk::RescaleIntensityImageFilter in order to make them more visible. This rescaling will also make it possible to visualize the negative values even if we save the difference image in a file format that only supports unsigned pixel values1. We also reduce the DefaultPixelValue to “1” in order to prevent that value from absorbing the dynamic range of the differences between the two images. typedef itk::RescaleIntensityImageFilter< FixedImageType , OutputImageType > RescalerType; RescalerType::Pointer intensityRescaler = RescalerType::New(); intensityRescaler ->SetInput (difference ->GetOutput ()); intensityRescaler ->SetOutputMinimum(0); intensityRescaler ->SetOutputMaximum(255); resampler ->SetDefaultPixelValue(1); Its output can be passed to another writer. WriterType ::Pointer writer2 = WriterType ::New (); writer2 ->SetInput(intensityRescaler ->GetOutput ()); For the purpose of comparison, the difference between the fixed image and the moving image before registration can also be computed by simply setting the transform to an identity transform. Note that the resampling is still necessary because the moving image does not necessarily have the same spacing, origin and number of pixels as the fixed image. Therefore a pixel-by-pixel operation cannot

1This is the case of PNG, BMP, JPEG and TIFF among other common file formats.
  • 234. 200 Chapter 9. Image Registration in general be performed. The resampling process with an identity transform will ensure that we have a representation of the moving image in the grid of the fixed image. TransformType::Pointer identityTransform = TransformType::New (); identityTransform ->SetIdentity (); resampler ->SetTransform(identityTransform); The complete pipeline structure of the current example is presented in Figure 9.5. The components of the registration method are depicted as well. Figure 9.4 (left) shows the result of resampling the moving image in order to map it onto the fixed image space. The top and right borders of the image appear in the gray level selected with the SetDefaultPixelValue() in the ResampleImageFilter. The center image shows the difference between the fixed image and the original moving image. That is, the difference before the registration is performed. The right image shows the difference between the fixed image and the transformed moving image. That is, after the registration has been performed. Both difference images have been rescaled in intensity in order to highlight those pixels where differences exist. Note that the final registration is still off by a fraction of a pixel, which results in bands around edges of anatomical structures to appear in the difference image. A perfect registration would have produced a null difference image. 9.3 Features of the Registration Framework This section presents a discussion on the two most common difficulties that users encounter when they start using the ITK registration framework. They are, in order of difficulty • The direction of the Transform mapping • The fact that registration is done in physical coordinates Probably the reason why these two topics tend to create confusion is that they are implemented in different ways in other systems and therefore users tend to have different expectations regarding how things should work in OTB. The situation is further complicated by the fact that most people describe image operations as if they were manually performed in a picture in paper. 9.3.1 Direction of the Transform Mapping The Transform that is optimized in the ITK registration framework is the one that maps points from the physical space of the fixed image into the physical space of the moving image. This is illustrated in Figure 9.6. This implies that the Transform will accept as input points from the fixed image and it will compute the coordinates of the analogous points in the moving image. What tends to create confusion is the fact that when the Transform shifts a point on the positive X direction, the visual effect of this mapping, once the moving image is resampled, is equivalent to manually shifting the
• 235. 9.3. Features of the Registration Framework 201
Figure 9.6: Different coordinate systems involved in the image registration process. Note that the transform being optimized is the one mapping from the physical space of the fixed image into the physical space of the moving image.
• 236. 202 Chapter 9. Image Registration
moving image along the negative X direction. In the same way, when the Transform applies a clockwise rotation to the fixed image points, the visual effect of this mapping once the moving image has been resampled is equivalent to manually rotating the moving image counter-clockwise.
The reason why this direction of mapping has been chosen for the ITK implementation of the registration framework is that this is the direction that best fits the fact that the moving image is expected to be resampled using the grid of the fixed image. The nature of the resampling process is such that an algorithm must go through every pixel of the fixed image and compute the intensity that should be assigned to this pixel from the mapping of the moving image. This computation involves taking the integral coordinates of the pixel in the image grid, usually called the “(i,j)” coordinates, mapping them into the physical space of the fixed image (transform T1 in Figure 9.6), mapping those physical coordinates into the physical space of the moving image (the Transform to be optimized), then mapping the physical coordinates of the moving image into the integral coordinates of the discrete grid of the moving image (transform T2 in the figure), where the value of the pixel intensity will be computed by interpolation.
If we had used the Transform that maps coordinates from the moving image physical space into the fixed image physical space, then the resampling process could not guarantee that every pixel in the grid of the fixed image was going to receive one and only one value. In other words, the resampling would have resulted in an image with holes and with redundant or overlapping pixel values.
As you have seen in the previous examples, and you will corroborate in the remaining examples in this chapter, the Transform computed by the registration framework is the Transform that can be used directly in the resampling filter in order to map the moving image into the discrete grid of the fixed image. There are exceptional cases in which the transform that you want is actually the inverse transform of the one computed by the ITK registration framework. Only in those cases may you need to resort to invoking the GetInverse() method that most transforms offer. Make sure that before you consider following that dark path, you interact with the examples of resampling in order to get familiar with the correct interpretation of the transforms.
9.3.2 Registration is done in physical space
The second common difficulty that users encounter with the ITK registration framework is related to the fact that ITK performs registration in the context of physical space and not in the discrete space of the image grid. Figure 9.6 shows this concept by crossing the transform that goes between the two image grids. One important consequence of this fact is that having the correct image origin and image pixel size is fundamental for the success of the registration process in ITK. Users must make sure that they provide correct values for the origin and spacing of both the fixed and moving images.
A typical case that helps to understand this issue is to consider the registration of two images where one has a pixel size different from the other. For example, a SPOT 5 image and a QuickBird image. Typically a QuickBird image will have a pixel size on the order of 0.6 m, while a SPOT 5 image will
• 237. 9.4. Multi-Modality Registration 203
have a pixel size of 2.5 m. A user performing registration between a SPOT 5 image and a QuickBird image may naively expect that because the SPOT 5 image has fewer pixels, a scaling factor is required in the Transform in order to map this image onto the QuickBird image. At that point, this person is attempting to interpret the registration process directly between the two image grids, or in pixel space. What ITK will do in this case is to take into account the pixel size that the user has provided and use it to compute a scaling factor for Transforms T1 and T2 in Figure 9.6. Since these two transforms take care of the required scaling factor, the spatial Transform to be computed during the registration process does not need to be concerned about such scaling. The transform that ITK is computing is the one that will physically map the landscape of the moving image onto the landscape of the fixed image.
In order to better understand these concepts, it is very useful to draw sketches of the fixed and moving image at scale in the same physical coordinate system. That is the geometrical configuration that the ITK registration framework uses as context. Keeping this in mind helps a lot in correctly interpreting the results of a registration process performed with ITK.
9.4 Multi-Modality Registration
Some of the most challenging cases of image registration arise when images of different modalities are involved. In such cases, metrics based on direct comparison of gray levels are not applicable. It has been extensively shown that metrics based on the evaluation of mutual information are well suited for overcoming the difficulties of multi-modality registration.
The concept of Mutual Information is derived from Information Theory and its application to image registration has been proposed in different forms by different groups [29, 93, 138]; a more detailed review can be found in [70]. The OTB, through ITK, currently provides five different implementations of Mutual Information metrics (see section 9.7 for details). The following example illustrates the practical use of some of these metrics.
9.4.1 Viola-Wells Mutual Information
The source code for this example can be found in the file Examples/Registration/ImageRegistration2.cxx.
The following simple example illustrates how multiple imaging modalities can be registered using the ITK registration framework. The first difference between this and previous examples is the use of the itk::MutualInformationImageToImageMetric as the cost-function to be optimized. The second difference is the use of the itk::GradientDescentOptimizer. Due to the stochastic nature of the metric computation, the values are too noisy to work successfully with the itk::RegularStepGradientDescentOptimizer. Therefore, we will use the simpler GradientDescentOptimizer with a user-defined learning rate. The following headers declare the basic
  • 238. 204 Chapter 9. Image Registration components of this registration method. #include "itkImageRegistrationMethod.h" #include "itkTranslationTransform.h" #include "itkMutualInformationImageToImageMetric.h" #include "itkLinearInterpolateImageFunction.h" #include "itkGradientDescentOptimizer.h" #include "otbImage.h" One way to simplify the computation of the mutual information is to normalize the statistical distri- bution of the two input images. The itk::NormalizeImageFilter is the perfect tool for this task. It rescales the intensities of the input images in order to produce an output image with zero mean and unit variance. #include "itkNormalizeImageFilter.h" Additionally, low-pass filtering of the images to be registered will also increase robustness against noise. In this example, we will use the itk::DiscreteGaussianImageFilter for that purpose. The characteristics of this filter have been discussed in Section 8.7.1. #include "itkDiscreteGaussianImageFilter.h" The moving and fixed images types should be instantiated first. const unsigned int Dimension = 2; typedef unsigned short PixelType ; typedef otb::Image <PixelType , Dimension > FixedImageType; typedef otb::Image <PixelType , Dimension > MovingImageType; It is convenient to work with an internal image type because mutual information will perform better on images with a normalized statistical distribution. The fixed and moving images will be normal- ized and converted to this internal type. typedef float InternalPixelType; typedef otb::Image <InternalPixelType , Dimension > InternalImageType; The rest of the image registration components are instantiated as illustrated in Section 9.2 with the use of the InternalImageType. typedef itk::TranslationTransform <double, Dimension > TransformType; typedef itk::GradientDescentOptimizer OptimizerType; typedef itk::LinearInterpolateImageFunction<InternalImageType , double> InterpolatorType; typedef itk::ImageRegistrationMethod <InternalImageType , InternalImageType > RegistrationType; The mutual information metric type is instantiated using the image types.
  • 239. 9.4. Multi-Modality Registration 205 typedef itk::MutualInformationImageToImageMetric< InternalImageType , InternalImageType > MetricType ; The metric is created using the New() method and then connected to the registration object. MetricType ::Pointer metric = MetricType ::New (); registration ->SetMetric (metric); The metric requires a number of parameters to be selected, including the standard deviation of the Gaussian kernel for the fixed image density estimate, the standard deviation of the kernel for the moving image density and the number of samples use to compute the densities and entropy values. Details on the concepts behind the computation of the metric can be found in Section 9.7.4. Experience has shown that a kernel standard deviation of 0.4 works well for images which have been normalized to a mean of zero and unit variance. We will follow this empirical rule in this example. metric ->SetFixedImageStandardDeviation(0.4); metric ->SetMovingImageStandardDeviation(0.4); The normalization filters are instantiated using the fixed and moving image types as input and the internal image type as output. typedef itk::NormalizeImageFilter < FixedImageType , InternalImageType > FixedNormalizeFilterType; typedef itk::NormalizeImageFilter < MovingImageType , InternalImageType > MovingNormalizeFilterType; FixedNormalizeFilterType::Pointer fixedNormalizer = FixedNormalizeFilterType::New (); MovingNormalizeFilterType::Pointer movingNormalizer = MovingNormalizeFilterType::New(); The blurring filters are declared using the internal image type as both the input and output types. In this example, we will set the variance for both blurring filters to 2.0. typedef itk::DiscreteGaussianImageFilter< InternalImageType , InternalImageType > GaussianFilterType; GaussianFilterType::Pointer fixedSmoother = GaussianFilterType::New(); GaussianFilterType::Pointer movingSmoother = GaussianFilterType::New(); fixedSmoother ->SetVariance (4.0); movingSmoother ->SetVariance (4.0);
  • 240. 206 Chapter 9. Image Registration The output of the readers becomes the input to the normalization filters. The output of the normal- ization filters is connected as input to the blurring filters. The input to the registration method is taken from the blurring filters. fixedNormalizer ->SetInput (fixedImageReader ->GetOutput ()); movingNormalizer ->SetInput (movingImageReader ->GetOutput ()); fixedSmoother ->SetInput(fixedNormalizer ->GetOutput ()); movingSmoother ->SetInput(movingNormalizer ->GetOutput ()); registration ->SetFixedImage(fixedSmoother ->GetOutput ()); registration ->SetMovingImage(movingSmoother ->GetOutput ()); We should now define the number of spatial samples to be considered in the metric computation. Note that we were forced to postpone this setting until we had done the preprocessing of the images because the number of samples is usually defined as a fraction of the total number of pixels in the fixed image. The number of spatial samples can usually be as low as 1% of the total number of pixels in the fixed image. Increasing the number of samples improves the smoothness of the metric from one iteration to another and therefore helps when this metric is used in conjunction with optimizers that rely of the continuity of the metric values. The trade-off, of course, is that a larger number of samples result in longer computation times per every evaluation of the metric. It has been demonstrated empirically that the number of samples is not a critical parameter for the registration process. When you start fine tuning your own registration process, you should start using high values of number of samples, for example in the range of 20% to 50% of the number of pixels in the fixed image. Once you have succeeded to register your images you can then reduce the number of samples progressively until you find a good compromise on the time it takes to compute one evaluation of the Metric. Note that it is not useful to have very fast evaluations of the Metric if the noise in their values results in more iterations being required by the optimizer to converge. const unsigned int numberOfPixels = fixedImageRegion.GetNumberOfPixels(); const unsigned int numberOfSamples = static_cast<unsigned int>(numberOfPixels * 0.01); metric ->SetNumberOfSpatialSamples(numberOfSamples); Since larger values of mutual information indicate better matches than smaller values, we need to maximize the cost function in this example. By default the GradientDescentOptimizer class is set to minimize the value of the cost-function. It is therefore necessary to modify its default behavior by invoking the MaximizeOn() method. Additionally, we need to define the optimizer’s step size using the SetLearningRate() method. optimizer ->SetLearningRate(150.0); optimizer ->SetNumberOfIterations(300); optimizer ->MaximizeOn ();
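Because the metric values fluctuate from one iteration to the next, it helps to watch the optimizer while tuning the learning rate and the number of spatial samples. The following is only a sketch, not part of the listing above: it attaches a minimal iteration observer to the optimizer defined earlier. The class name CommandIterationUpdate and the printed format are our own choices; only itkCommand.h needs to be included in addition to the headers already listed.
#include "itkCommand.h"
class CommandIterationUpdate : public itk::Command
{
public:
  typedef CommandIterationUpdate  Self;
  typedef itk::Command            Superclass;
  typedef itk::SmartPointer<Self> Pointer;
  itkNewMacro(Self);
  typedef const itk::GradientDescentOptimizer* OptimizerPointer;
  void Execute(itk::Object* caller, const itk::EventObject& event)
  {
    Execute(static_cast<const itk::Object*>(caller), event);
  }
  void Execute(const itk::Object* object, const itk::EventObject& event)
  {
    OptimizerPointer opt = dynamic_cast<OptimizerPointer>(object);
    if (!itk::IterationEvent().CheckEvent(&event)) return;
    // Print the iteration number and the current metric value.
    std::cout << opt->GetCurrentIteration() << " = "
              << opt->GetValue() << std::endl;
  }
protected:
  CommandIterationUpdate() {}
};
CommandIterationUpdate::Pointer observer = CommandIterationUpdate::New();
optimizer->AddObserver(itk::IterationEvent(), observer);
The observer is purely diagnostic; it does not change the result of the registration.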
  • 241. 9.4. Multi-Modality Registration 207 Figure 9.7: A SAR image (fixed image) and an aerial photograph (moving image) are provided as input to the registration method. Note that large values of the learning rate will make the optimizer unstable. Small values, on the other hand, may result in the optimizer needing too many iterations in order to walk to the extrema of the cost function. The easy way of fine tuning this parameter is to start with small values, probably in the range of {5.0,10.0}. Once the other registration parameters have been tuned for producing convergence, you may want to revisit the learning rate and start increasing its value until you observe that the optimization becomes unstable. The ideal value for this parameter is the one that results in a minimum number of iterations while still keeping a stable path on the parametric space of the optimization. Keep in mind that this parameter is a multiplicative factor applied on the gradient of the Metric. Therefore, its effect on the optimizer step length is proportional to the Metric values themselves. Metrics with large values will require you to use smaller values for the learning rate in order to maintain a similar optimizer behavior. Let’s execute this example over two of the images provided in Examples/Data: • RamsesROISmall.png • ADS40RoiSmall.png The moving image after resampling is presented on the left side of Figure 9.8. The center and right figures present a checkerboard composite of the fixed and moving images before and after registration. Since the real deformation between the 2 images is not simply a shift, some registration errors remain, but the left part of the images is correctly registered.
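The checkerboard composites shown in Figure 9.8 are not produced by the fragments listed above. A sketch of one way such a composite could be generated is given below; it assumes that a resample filter, a caster and a writer have been set up as in the “Hello World” example of Section 9.2, and it uses the itk::CheckerBoardImageFilter.
#include "itkCheckerBoardImageFilter.h"
// Sketch only: "resampler", "caster" and "writer" are assumed to exist and to
// produce/consume images of FixedImageType as in the previous example.
typedef itk::CheckerBoardImageFilter<FixedImageType> CheckerBoardFilterType;
CheckerBoardFilterType::Pointer checker = CheckerBoardFilterType::New();
checker->SetInput1(fixedImageReader->GetOutput());
checker->SetInput2(resampler->GetOutput());
CheckerBoardFilterType::PatternArrayType pattern;
pattern.Fill(4);                    // 4 x 4 checker squares
checker->SetCheckerPattern(pattern);
// Reuse the cast/write pipeline to save the composite to disk.
caster->SetInput(checker->GetOutput());
writer->Update();
Feeding the resampler with the identity transform produces the “before” composite, and feeding it with the final transform produces the “after” composite.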
  • 242. 208 Chapter 9. Image Registration Figure 9.8: Mapped moving image (left) and composition of fixed and moving images before (center) and after (right) registration. 9.5 Centered Transforms The OTB/ITK image coordinate origin is typically located in one of the image corners (see section 5.1.4 for details). This results in counter-intuitive transform behavior when rotations and scaling are involved. Users tend to assume that rotations and scaling are performed around a fixed point at the center of the image. In order to compensate for this difference in natural interpretation, the concept of centered transforms have been introduced into the toolkit. The following sections describe the main characteristics of such transforms. 9.5.1 Rigid Registration in 2D The source code for this example can be found in the file Examples/Registration/ImageRegistration5.cxx. This example illustrates the use of the itk::CenteredRigid2DTransform for performing rigid registration in 2D. The example code is for the most part identical to that presented in Sec- tion 9.2. The main difference is the use of the CenteredRigid2DTransform here instead of the itk::TranslationTransform. In addition to the headers included in previous examples, the following header must also be included. #include "itkCenteredRigid2DTransform.h" The transform type is instantiated using the code below. The only template parameter for this class is the representation type of the space coordinates. typedef itk::CenteredRigid2DTransform<double> TransformType; The transform object is constructed below and passed to the registration method.
  • 243. 9.5. Centered Transforms 209 TransformType::Pointer transform = TransformType::New(); registration ->SetTransform(transform ); Since we are working with high resolution images and expected shifts are larger than the resolution, we will need to smooth the images in order to avoid the optimizer to get stucked on local minima. In order to do this, we will use a simple mean filter. typedef itk::MeanImageFilter < FixedImageType , FixedImageType > FixedFilterType; typedef itk::MeanImageFilter < MovingImageType , MovingImageType > MovingFilterType; FixedFilterType::Pointer fixedFilter = FixedFilterType::New(); MovingFilterType::Pointer movingFilter = MovingFilterType::New (); FixedImageType::SizeType indexFRadius; indexFRadius[0] = 4; // radius along x indexFRadius[1] = 4; // radius along y fixedFilter ->SetRadius (indexFRadius); MovingImageType:: SizeType indexMRadius; indexMRadius[0] = 4; // radius along x indexMRadius[1] = 4; // radius along y movingFilter ->SetRadius (indexMRadius); fixedFilter ->SetInput (fixedImageReader ->GetOutput ()); movingFilter ->SetInput(movingImageReader ->GetOutput ()); Now we can plug the output of the smoothing filters at the input of the registration method. registration ->SetFixedImage(fixedFilter ->GetOutput ()); registration ->SetMovingImage(movingFilter ->GetOutput ()); In this example, the input images are taken from readers. The code below updates the readers in order to ensure that the image parameters (size, origin and spacing) are valid when used to initialize the transform. We intend to use the center of the fixed image as the rotation center and then use the vector between the fixed image center and the moving image center as the initial translation to be applied after the rotation. fixedImageReader ->Update (); movingImageReader ->Update (); The center of rotation is computed using the origin, size and spacing of the fixed image. FixedImageType::Pointer fixedImage = fixedImageReader ->GetOutput ();
  • 244. 210 Chapter 9. Image Registration const SpacingType fixedSpacing = fixedImage ->GetSpacing (); const OriginType fixedOrigin = fixedImage ->GetOrigin (); const RegionType fixedRegion = fixedImage ->GetLargestPossibleRegion(); const SizeType fixedSize = fixedRegion .GetSize (); TransformType::InputPointType centerFixed ; centerFixed [0] = fixedOrigin [0] + fixedSpacing[0] * fixedSize [0] / 2.0; centerFixed [1] = fixedOrigin [1] + fixedSpacing[1] * fixedSize [1] / 2.0; The center of the moving image is computed in a similar way. MovingImageType::Pointer movingImage = movingImageReader ->GetOutput (); const SpacingType movingSpacing = movingImage ->GetSpacing (); const OriginType movingOrigin = movingImage ->GetOrigin (); const RegionType movingRegion = movingImage ->GetLargestPossibleRegion(); const SizeType movingSize = movingRegion.GetSize (); TransformType::InputPointType centerMoving; centerMoving[0] = movingOrigin[0] + movingSpacing[0] * movingSize [0] / 2.0; centerMoving[1] = movingOrigin[1] + movingSpacing[1] * movingSize [1] / 2.0; The most straightforward method of initializing the transform parameters is to configure the trans- form and then get its parameters with the method GetParameters(). Here we initialize the trans- form by passing the center of the fixed image as the rotation center with the SetCenter() method. Then the translation is set as the vector relating the center of the moving image to the center of the fixed image. This last vector is passed with the method SetTranslation(). transform ->SetCenter (centerFixed ); transform ->SetTranslation(centerMoving - centerFixed ); Let’s finally initialize the rotation with a zero angle. transform ->SetAngle (0.0); Now we pass the current transform’s parameters as the initial parameters to be used when the regis- tration process starts. registration ->SetInitialTransformParameters(transform ->GetParameters()); Keeping in mind that the scale of units in rotation and translation is quite different, we take advan- tage of the scaling functionality provided by the optimizers. We know that the first element of the parameters array corresponds to the angle that is measured in radians, while the other parameters correspond to translations that are measured in the units of the spacin (pixels in our case). For this reason we use small factors in the scales associated with translations and the coordinates of the rotation center .
  • 245. 9.5. Centered Transforms 211 typedef OptimizerType::ScalesType OptimizerScalesType; OptimizerScalesType optimizerScales(transform ->GetNumberOfParameters()); const double translationScale = 1.0 / 1000.0; optimizerScales[0] = 1.0; optimizerScales[1] = translationScale; optimizerScales[2] = translationScale; optimizerScales[3] = translationScale; optimizerScales[4] = translationScale; optimizer ->SetScales (optimizerScales); Next we set the normal parameters of the optimization method. In this case we are using an itk::RegularStepGradientDescentOptimizer. Below, we define the optimization parameters like the relaxation factor, initial step length, minimal step length and number of iterations. These last two act as stopping criteria for the optimization. double initialStepLength = 0.1; optimizer ->SetRelaxationFactor(0.6); optimizer ->SetMaximumStepLength(initialStepLength); optimizer ->SetMinimumStepLength(0.001); optimizer ->SetNumberOfIterations(200); Let’s execute this example over two of the images provided in Examples/Data: • QB Suburb.png • QB SuburbRotated10.png The second image is the result of intentionally rotating the first image by 10 degrees around the geometrical center of the image. Both images have unit-spacing and are shown in Figure 9.9. The registration takes 21 iterations and produces the results: [0.176168, 134.515, 103.011, -0.00182313, 0.0717891] These results are interpreted as • Angle = 0.176168 radians • Center = (134.515,103.011) pixels • Translation = (−0.00182313,0.0717891) pixels
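The listing above stops at the optimizer configuration. To actually produce the resampled image shown in Figure 9.10, the final parameters have to be fed back into a transform and used in a resample filter, exactly as in the “Hello World” example. A minimal sketch, assuming the readers and the registration object defined above:
#include "itkResampleImageFilter.h"
// The five parameters (angle, center, translation) fully define the
// CenteredRigid2DTransform, so SetParameters() is sufficient here.
TransformType::ParametersType finalParameters =
  registration->GetLastTransformParameters();
TransformType::Pointer finalTransform = TransformType::New();
finalTransform->SetParameters(finalParameters);
typedef itk::ResampleImageFilter<MovingImageType, FixedImageType>
  ResampleFilterType;
ResampleFilterType::Pointer resampler = ResampleFilterType::New();
resampler->SetTransform(finalTransform);
resampler->SetInput(movingImageReader->GetOutput());
// "fixedImage" was obtained from the fixed image reader earlier on.
resampler->SetSize(fixedImage->GetLargestPossibleRegion().GetSize());
resampler->SetOutputOrigin(fixedImage->GetOrigin());
resampler->SetOutputSpacing(fixedImage->GetSpacing());
resampler->SetDefaultPixelValue(100);   // arbitrary gray value for the borders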
  • 246. 212 Chapter 9. Image Registration Figure 9.9: Fixed and moving images are provided as input to the registration method using the Centered- Rigid2D transform. Figure 9.10: Resampled moving image (left). Differences between the fixed and moving images, before (cen- ter) and after (right) registration using the CenteredRigid2D transform.
  • 247. 9.5. Centered Transforms 213 As expected, these values match the misalignment intentionally introduced into the moving image quite well, since 10 degrees is about 0.174532 radians. Figure 9.10 shows from left to right the resampled moving image after registration, the difference between fixed and moving images before registration, and the difference between fixed and resam- pled moving image after registration. It can be seen from the last difference image that the rotational component has been solved but that a small centering misalignment persists. Let’s now consider the case in which rotations and translations are present in the initial registration, as in the following pair of images: • QB Suburb.png • QB SuburbR10X13Y17.png The second image is the result of intentionally rotating the first image by 10 degrees and then trans- lating it 13 pixels in X and 17 pixels in Y. Both images have unit-spacing and are shown in Figure 9.11. In order to accelerate convergence it is convenient to use a larger step length as shown here. optimizer->SetMaximumStepLength( 1.0 ); The registration now takes 34 iterations and produces the following results: [0.176125, 135.553, 102.159, -11.9102, -15.8045] These parameters are interpreted as • Angle = 0.176125 radians • Center = (135.553,102.159) millimeters • Translation = (−11.9102,−15.8045) millimeters These values approximately match the initial misalignment intentionally introduced into the moving image, since 10 degrees is about 0.174532 radians. The horizontal translation is well resolved while the vertical translation ends up being off by about one millimeter. Figure 9.12 shows the output of the registration. The rightmost image of this figure shows the difference between the fixed image and the resampled moving image after registration. 9.5.2 Centered Affine Transform The source code for this example can be found in the file Examples/Registration/ImageRegistration9.cxx.
  • 248. 214 Chapter 9. Image Registration Figure 9.11: Fixed and moving images provided as input to the registration method using the CenteredRigid2D transform. Figure 9.12: Resampled moving image (left). Differences between the fixed and moving images, before (cen- ter) and after (right) registration with the CenteredRigid2D transform.
  • 249. 9.5. Centered Transforms 215 This example illustrates the use of the itk::AffineTransform for performing registration. The example code is, for the most part, identical to previous ones. The main difference is the use of the AffineTransform here instead of the itk::CenteredRigid2DTransform. We will focus on the most relevant changes in the current code and skip the basic elements already explained in previous examples. Let’s start by including the header file of the AffineTransform. #include "itkAffineTransform.h" We define then the types of the images to be registered. const unsigned int Dimension = 2; typedef float PixelType ; typedef otb::Image <PixelType , Dimension > FixedImageType; typedef otb::Image <PixelType , Dimension > MovingImageType; The transform type is instantiated using the code below. The template parameters of this class are the representation type of the space coordinates and the space dimension. typedef itk::AffineTransform < double, Dimension > TransformType; The transform object is constructed below and passed to the registration method. TransformType::Pointer transform = TransformType::New(); registration ->SetTransform(transform ); Since we are working with high resolution images and expected shifts are larger than the resolution, we will need to smooth the images in order to avoid the optimizer to get stucked on local minima. In order to do this, we will use a simple mean filter. typedef itk::MeanImageFilter < FixedImageType , FixedImageType > FixedFilterType; typedef itk::MeanImageFilter < MovingImageType , MovingImageType > MovingFilterType; FixedFilterType::Pointer fixedFilter = FixedFilterType::New(); MovingFilterType::Pointer movingFilter = MovingFilterType::New (); FixedImageType::SizeType indexFRadius; indexFRadius[0] = 4; // radius along x indexFRadius[1] = 4; // radius along y fixedFilter ->SetRadius (indexFRadius); MovingImageType:: SizeType indexMRadius;
• 250. 216 Chapter 9. Image Registration
indexMRadius[0] = 4; // radius along x
indexMRadius[1] = 4; // radius along y
movingFilter->SetRadius(indexMRadius);
fixedFilter->SetInput(fixedImageReader->GetOutput());
movingFilter->SetInput(movingImageReader->GetOutput());
Now we can plug the output of the smoothing filters into the input of the registration method.
registration->SetFixedImage(fixedFilter->GetOutput());
registration->SetMovingImage(movingFilter->GetOutput());
In this example, we use the itk::CenteredTransformInitializer helper class in order to compute a reasonable value for the initial center of rotation and the translation. The initializer is set to use the center of mass of each image as the initial correspondence correction.
typedef itk::CenteredTransformInitializer<TransformType, FixedImageType, MovingImageType> TransformInitializerType;
TransformInitializerType::Pointer initializer = TransformInitializerType::New();
initializer->SetTransform(transform);
initializer->SetFixedImage(fixedImageReader->GetOutput());
initializer->SetMovingImage(movingImageReader->GetOutput());
initializer->MomentsOn();
initializer->InitializeTransform();
Now we pass the parameters of the current transform as the initial parameters to be used when the registration process starts.
registration->SetInitialTransformParameters(transform->GetParameters());
Keeping in mind that the scales of units in scaling, rotation and translation are quite different, we take advantage of the scaling functionality provided by the optimizers. We know that the first N × N elements of the parameters array correspond to the matrix coefficients and the last N are the components of the translation to be applied after multiplication with the matrix is performed; the rotation center found by the initializer is stored separately by the transform and is not part of the parameters array.
typedef OptimizerType::ScalesType OptimizerScalesType;
OptimizerScalesType optimizerScales(transform->GetNumberOfParameters());
optimizerScales[0] = 1.0;
optimizerScales[1] = 1.0;
optimizerScales[2] = 1.0;
optimizerScales[3] = 1.0;
optimizerScales[4] = translationScale;
optimizerScales[5] = translationScale;
optimizer->SetScales(optimizerScales);
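For a 2D AffineTransform the parameters array therefore has six elements: the four matrix coefficients in row-major order followed by the two translation components. The short sketch below, which is our own addition and assumes the transform defined above, simply prints them to make the ordering explicit:
// Sketch: inspect the six parameters of the 2D AffineTransform.
TransformType::ParametersType p = transform->GetParameters();
std::cout << "matrix: [" << p[0] << " " << p[1] << "; "
          << p[2] << " " << p[3] << "]" << std::endl;
std::cout << "translation: (" << p[4] << ", " << p[5] << ")" << std::endl;
std::cout << "center (fixed, not optimized): " << transform->GetCenter() << std::endl;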
  • 251. 9.5. Centered Transforms 217 We also set the usual parameters of the optimization method. In this case we are using an itk::RegularStepGradientDescentOptimizer. Below, we define the optimization parameters like initial step length, minimal step length and number of iterations. These last two act as stopping criteria for the optimization. optimizer ->SetMaximumStepLength(steplength ); optimizer ->SetMinimumStepLength(0.0001); optimizer ->SetNumberOfIterations(maxNumberOfIterations); We also set the optimizer to do minimization by calling the MinimizeOn() method. optimizer ->MinimizeOn (); Finally we trigger the execution of the registration method by calling the Update() method. The call is placed in a try/catch block in case any exceptions are thrown. try { registration ->Update (); } catch (itk:: ExceptionObject& err) { std::cerr << "ExceptionObject caught !" << std::endl; std::cerr << err << std::endl; return -1; } Once the optimization converges, we recover the parameters from the registration method. This is done with the GetLastTransformParameters() method. We can also recover the final value of the metric with the GetValue() method and the final number of iterations with the GetCurrentIteration() method. OptimizerType::ParametersType finalParameters = registration ->GetLastTransformParameters(); const double finalRotationCenterX = transform ->GetCenter ()[0]; const double finalRotationCenterY = transform ->GetCenter ()[1]; const double finalTranslationX = finalParameters[4]; const double finalTranslationY = finalParameters[5]; const unsigned int numberOfIterations = optimizer ->GetCurrentIteration(); const double bestValue = optimizer ->GetValue (); Let’s execute this example over two of the images provided in Examples/Data: • QB Suburb.png • QB SuburbR10X13Y17.png
  • 252. 218 Chapter 9. Image Registration Figure 9.13: Fixed and moving images provided as input to the registration method using the AffineTransform. The second image is the result of intentionally rotating the first image by 10 degrees and then trans- lating by (13,17). Both images have unit-spacing and are shown in Figure 9.13. We execute the code using the following parameters: step length=1.0, translation scale= 0.0001 and maximum number of iterations = 300. With these images and parameters the registration takes 83 iterations and produces 20.2134 [0.983291, -0.173507, 0.174626, 0.983028, -12.1899, -16.0882] These results are interpreted as • Iterations = 83 • Final Metric = 20.2134 • Center = (134.152,104.067) pixels • Translation = (−12.1899,−16.0882) pixels • Affine scales = (0.999024,0.997875) The second component of the matrix values is usually associated with sinθ. We obtain the rota- tion through SVD of the affine matrix. The value is 10.0401 degrees, which is approximately the intentional misalignment of 10.0 degrees. Figure 9.14 shows the output of the registration. The right most image of this figure shows the squared magnitude difference between the fixed image and the resampled moving image.
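The SVD step mentioned above is not shown in the listings. When the recovered scale factors are close to one, as they are here, the rotation angle can also be approximated directly from the matrix coefficients; the following sketch is our own addition, not part of ImageRegistration9.cxx, and is only valid under that near-rigid assumption.
#include <cmath>
#include "vnl/vnl_math.h"
// Approximation only: for a matrix close to a pure rotation,
// matrix(1,0) ~ sin(theta) and matrix(0,0) ~ cos(theta).
const TransformType::MatrixType matrix = transform->GetMatrix();
const double angleInRadians = std::atan2(matrix(1, 0), matrix(0, 0));
const double angleInDegrees = angleInRadians * 180.0 / vnl_math::pi;
std::cout << "Estimated rotation: " << angleInDegrees << " degrees" << std::endl;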
• 253. 9.6. Transforms 219
Figure 9.14: The resampled moving image (left), and the difference between the fixed and moving images before (center) and after (right) registration with the AffineTransform.
Figure 9.15: Geometric representation objects in ITK.
9.6 Transforms
In OTB, we use the Insight Toolkit itk::Transform objects, which encapsulate the mapping of points and vectors from an input space to an output space. If a transform is invertible, back transform methods are also provided. Currently, ITK provides a variety of transforms from simple translation, rotation and scaling to general affine and kernel transforms. Note that, while in this section we discuss transforms in the context of registration, transforms are general and can be used for other applications. Some of the most commonly used transforms will be discussed in detail later. Let’s begin by introducing the objects used in ITK for representing basic spatial concepts.
9.6.1 Geometrical Representation
ITK implements a consistent geometric representation of the space. The characteristics of classes involved in this representation are summarized in Table 9.1. In this regard, ITK takes full advantage of the capabilities of Object Oriented programming and resists the temptation of using simple arrays of float or double in order to represent geometrical objects. The use of basic arrays would have blurred the important distinction between the different geometrical concepts and would have allowed for the innumerable conceptual and programming errors that result from using a vector where a point
  • 254. 220 Chapter 9. Image Registration Class Geometrical concept itk::Point Position in space. In N-dimensional space it is represented by an array of N numbers associated with space coordinates. itk::Vector Relative position between two points. In N-dimensional space it is represented by an array of N numbers, each one associated with the distance along a coordinate axis. Vec- tors do not have a position in space. A vector is defined as the subtraction of two points. itk::CovariantVector Orthogonal direction to a (N − 1)-dimensional manifold in space. For example, in 3D it corresponds to the vector or- thogonal to a surface. This is the appropriate class for rep- resenting Gradients of functions. Covariant vectors do not have a position in space. Covariant vector should not be added to Points, nor to Vectors. Table 9.1: Summary of objects representing geometrical concepts in ITK. is needed or vice versa. Additional uses of the itk::Point, itk::Vector and itk::CovariantVector classes have been discussed in Chapter 5. Each one of these classes behaves differently under spatial transformations. It is therefore quite important to keep their distinction clear. Figure 9.15 illustrates the differences between these concepts. Transform classes provide different methods for mapping each one of the basic space- representation objects. Points, vectors and covariant vectors are transformed using the methods TransformPoint(), TransformVector() and TransformCovariantVector() respectively. One of the classes that deserve further comments is the itk::Vector. This ITK class tend to be misinterpreted as a container of elements instead of a geometrical object. This is a common misconception originated by the fact that Computer Scientist and Software Engineers misuse the term “Vector”. The actual word “Vector” is relatively young. It was coined by William Hamilton in his book “Elements of Quaternions” published in 1886 (post-mortem)[56]. In the same text Hamilton coined the terms: “Scalar”, “Versor” and “Tensor”. Although the modern term of “Tensor” is used in Calculus in a different sense of what Hamilton defined in his book at the time [41]. A “Vector” is, by definition, a mathematical object that embodies the concept of “direction in space”. Strictly speaking, a Vector describes the relationship between two Points in space, and captures both their relative distance and orientation. Computer scientists and software engineers misused the term vector in order to represent the concept of an “Indexed Set” [8]. Mechanical Engineers and Civil Engineers, who deal with the real world of physical objects will not commit this mistake and will keep the word “Vector” attached to a geometrical concept. Biologists, on the other hand, will associate “Vector” to a “vehicle” that allows
  • 255. 9.6. Transforms 221 them to direct something in a particular direction, for example, a virus that allows them to insert pieces of code into a DNA strand [90]. Textbooks in programming do not help to clarify those concepts and loosely use the term “Vector” for the purpose of representing an “enumerated set of common elements”. STL follows this trend and continue using the word “Vector” for what it was not supposed to be used [8, 3]. Linear algebra separates the “Vector” from its notion of geometric reality and makes it an abstract set of numbers with arithmetic operations associated. For those of you who are looking for the “Vector” in the Software Engineering sense, please look at the itk::Array and itk::FixedArray classes that actually provide such functionalities. Ad- ditionally, the itk::VectorContainer and itk::MapContainer classes may be of interest too. These container classes are intended for algorithms that require to insert and delete elements, and that may have large numbers of elements. The Insight Toolkit deals with real objects that inhabit the physical space. This is particularly true in the context of the image registration framework. We chose to give the appropriate name to the mathematical objects that describe geometrical relationships in N-Dimensional space. It is for this reason that we explicitly make clear the distinction between Point, Vector and CovariantVector, despite the fact that most people would be happy with a simple use of double[3] for the three concepts and then will proceed to perform all sort of conceptually flawed operations such as • Adding two Points • Dividing a Point by a Scalar • Adding a Covariant Vector to a Point • Adding a Covariant Vector to a Vector In order to enforce the correct use of the Geometrical concepts in ITK we organized these classes in a hierarchy that supports reuse of code and yet compartmentalize the behavior of the individual classes. The use of the itk::FixedArray as base class of the itk::Point, the itk::Vector and the itk::CovariantVector was a design decision based on calling things by their correct name. An itk::FixedArray is an enumerated collection with a fixed number of elements. You can instantiate a fixed array of letters, or a fixed array of images, or a fixed array of transforms, or a fixed array of geometrical shapes. Therefore, the FixedArray only implements the functionality that is necessary to access those enumerated elements. No assumptions can be made at this point on any other operations required by the elements of the FixedArray, except the fact of having a default constructor. The itk::Point is a type that represents the spatial coordinates of a spatial location. Based on geometrical concepts we defined the valid operations of the Point class. In particular we made sure that no operator+() was defined between Points, and that no operator*( scalar ) nor operator/( scalar ) were defined for Points. In other words, you could do in ITK operations such as:
  • 256. 222 Chapter 9. Image Registration • Vector = Point - Point • Point += Vector • Point -= Vector • Point = BarycentricCombination( Point, Point ) and you cannot (because you should not) do operation such as • Point = Point * Scalar • Point = Point + Point • Point = Point / Scalar The itk::Vector is, by Hamilton’s definition, the subtraction between two points. Therefore a Vector must satisfy the following basic operations: • Vector = Point - Point • Point = Point + Vector • Point = Point - Vector • Vector = Vector + Vector • Vector = Vector - Vector An itk::Vector object is intended to be instantiated over elements that support mathematical operation such as addition, subtraction and multiplication by scalars. 9.6.2 Transform General Properties Each transform class typically has several methods for setting its parameters. For example, itk::Euler2DTransform provides methods for specifying the offset, angle, and the entire rota- tion matrix. However, for use in the registration framework, the parameters are represented by a flat Array of doubles to facilitate communication with generic optimizers. In the case of the Eu- ler2DTransform, the transform is also defined by three doubles: the first representing the angle, and the last two the offset. The flat array of parameters is defined using SetParameters(). A description of the parameters and their ordering is documented in the sections that follow. In the context of registration, the transform parameters define the search space for optimizers. That is, the goal of the optimization is to find the set of parameters defining a transform that results in the best possible value of an image metric. The more parameters a transform has, the longer its
• 257. 9.6. Transforms 223
Table 9.2: Characteristics of the identity transform.
Behavior: Maps every point to itself, every vector to itself and every covariant vector to itself.
Number of parameters: 0
Parameter ordering: NA
Restrictions: Only defined when the input and output space has the same number of dimensions.
computational time will be when used in a registration method since the dimension of the search space will be equal to the number of transform parameters.
Another requirement that the registration framework imposes on the transform classes is the computation of their Jacobians. In general, metrics require the knowledge of the Jacobian in order to compute Metric derivatives. The Jacobian is a matrix whose elements are the partial derivatives of the output point with respect to the array of parameters that defines the transform:2
J = \begin{pmatrix}
\frac{\partial x_1}{\partial p_1} & \frac{\partial x_1}{\partial p_2} & \cdots & \frac{\partial x_1}{\partial p_m} \\
\frac{\partial x_2}{\partial p_1} & \frac{\partial x_2}{\partial p_2} & \cdots & \frac{\partial x_2}{\partial p_m} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial x_n}{\partial p_1} & \frac{\partial x_n}{\partial p_2} & \cdots & \frac{\partial x_n}{\partial p_m}
\end{pmatrix} \qquad (9.1)
where {p_i} are the transform parameters and {x_i} are the coordinates of the output point. Within this framework, the Jacobian is represented by an itk::Array2D of doubles and is obtained from the transform by the method GetJacobian(). The Jacobian can be interpreted as a matrix that indicates, for a point in the input space, how much its mapping on the output space will change in response to a small variation in one of the transform parameters. Note that the values of the Jacobian matrix depend on the point in the input space, so the Jacobian can actually be denoted as J(X), where X = {x_i}. The use of transform Jacobians enables the efficient computation of metric derivatives. When Jacobians are not available, metric derivatives have to be computed using finite differences at a price of 2M evaluations of the metric value, where M is the number of transform parameters.
The following sections describe the main characteristics of the transform classes available in ITK.
9.6.3 Identity Transform
The identity transform itk::IdentityTransform is mainly used for debugging purposes. It is provided to methods that require a transform and in cases where we want to have the certainty
2Note that the term Jacobian is also commonly used for the matrix representing the derivatives of output point coordinates with respect to input point coordinates. Sometimes the term is loosely used to refer to the determinant of such a matrix. [41]
• 258. 224 Chapter 9. Image Registration
Table 9.3: Characteristics of the TranslationTransform class.
Behavior: Represents a simple translation of points in the input space and has no effect on vectors or covariant vectors.
Number of parameters: Same as the input space dimension.
Parameter ordering: The i-th parameter represents the translation in the i-th dimension.
Restrictions: Only defined when the input and output space has the same number of dimensions.
that the transform will have no effect whatsoever in the outcome of the process. It is just a NULL operation. The main characteristics of the identity transform are summarized in Table 9.2.
9.6.4 Translation Transform
The itk::TranslationTransform is probably the simplest yet one of the most useful transformations. It maps all Points by adding a Vector to them. Vectors and covariant vectors remain unchanged under this transformation since they are not associated with a particular position in space. Translation is the best transform to use when starting a registration method. Before attempting to solve for rotations or scaling it is important to overlap the anatomical objects in both images as much as possible. This is done by resolving the translational misalignment between the images. Translations also have the advantage of being fast to compute and having parameters that are easy to interpret. The main characteristics of the translation transform are presented in Table 9.3.
9.6.5 Scale Transform
The itk::ScaleTransform represents a simple scaling of the vector space. Different scaling factors can be applied along each dimension. Points are transformed by multiplying each one of their coordinates by the corresponding scale factor for the dimension. Vectors are transformed in the same way as points. Covariant vectors, on the other hand, are transformed differently since anisotropic scaling does not preserve angles. Covariant vectors are transformed by dividing their components by the scale factor of the corresponding dimension. In this way, if a covariant vector was orthogonal to a vector, this orthogonality will be preserved after the transformation. The following equations summarize the effect of the transform on the basic geometric objects.
\begin{array}{lll}
\text{Point} & P' = T(P) : & P'_i = P_i \cdot S_i \\
\text{Vector} & V' = T(V) : & V'_i = V_i \cdot S_i \\
\text{CovariantVector} & C' = T(C) : & C'_i = C_i / S_i
\end{array} \qquad (9.2)
• 259. 9.6. Transforms 225
Table 9.4: Characteristics of the ScaleTransform class.
Behavior: Points are transformed by multiplying each one of their coordinates by the corresponding scale factor for the dimension. Vectors are transformed as points. Covariant vectors are transformed by dividing their components by the scale factor in the corresponding dimension.
Number of parameters: Same as the input space dimension.
Parameter ordering: The i-th parameter represents the scaling in the i-th dimension.
Restrictions: Only defined when the input and output space has the same number of dimensions.
where P_i, V_i and C_i are the i-th components of the point, vector and covariant vector, while S_i is the scaling factor along the i-th dimension. The following equation illustrates the effect of the scaling transform on a 3D point.
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} S_1 & 0 & 0 \\ 0 & S_2 & 0 \\ 0 & 0 & S_3 \end{pmatrix} \cdot
\begin{pmatrix} x \\ y \\ z \end{pmatrix} \qquad (9.3)
Scaling appears to be a simple transformation but there are actually a number of issues to keep in mind when using different scale factors along every dimension. There are subtle effects, for example, when computing image derivatives. Since derivatives are represented by covariant vectors, their values are not intuitively modified by scaling transforms.
One of the difficulties with managing scaling transforms in a registration process is that typical optimizers manage the parameter space as a vector space where addition is the basic operation. Scaling is better treated in the frame of a logarithmic space where additions result in regular multiplicative increments of the scale. Gradient descent optimizers have trouble updating step length, since the effect of an additive increment on a scale factor diminishes as the factor grows. In other words, a scale factor variation of (1.0 + ε) is quite different from a scale variation of (5.0 + ε).
Registrations involving scale transforms require careful monitoring of the optimizer parameters in order to keep the optimization progressing at a stable pace. Note that some of the transforms discussed in following sections, for example, the AffineTransform, have hidden scaling parameters and are therefore subject
  • 260. 226 Chapter 9. Image Registration Behavior Number of Parameters Parameter Ordering Restrictions Points are trans- formed by multi- plying each one of their coordinates by the corresponding scale factor for the dimension. Vectors are transformed as points. Covariant vectors are trans- formed by dividing their components by the scale factor in the corresponding dimension. Same as the input space dimension. The i-th parame- ter represents the scaling in the i-th dimension. Only defined when the in- put and output space has the same number of dimen- sions. The difference be- tween this transform and the ScaleTransform is that here the scaling factors are passed as logarithms, in this way their behavior is closer to the one of a Vector space. Table 9.5: Characteristics of the ScaleLogarithmicTransform class. to the same vulnerabilities of the ScaleTransform. In cases involving misalignments with simultaneous translation, rotation and scaling components it may be desirable to solve for these components independently. The main characteristics of the scale transform are presented in Table 9.4. 9.6.6 Scale Logarithmic Transform The itk::ScaleLogarithmicTransform is a simple variation of the itk::ScaleTransform. It is intended to improve the behavior of the scaling parameters when they are modified by optimiz- ers. The difference between this transform and the ScaleTransform is that the parameter factors are passed here as logarithms. In this way, multiplicative variations in the scale become additive variations in the logarithm of the scaling factors. 9.6.7 Euler2DTransform itk::Euler2DTransform implements a rigid transformation in 2D. It is composed of a plane rotation and a two-dimensional translation. The rotation is applied first, followed by the translation. The following equation illustrates the effect of this transform on a 2D point,
• 261. 9.6. Transforms 227
Table 9.6: Characteristics of the Euler2DTransform class.
Behavior: Represents a 2D rotation and a 2D translation. Note that the translation component has no effect on the transformation of vectors and covariant vectors.
Number of parameters: 3
Parameter ordering: The first parameter is the angle in radians and the last two parameters are the translation in each dimension.
Restrictions: Only defined for two-dimensional input and output spaces.
\begin{pmatrix} x' \\ y' \end{pmatrix} =
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \cdot
\begin{pmatrix} x \\ y \end{pmatrix} +
\begin{pmatrix} T_x \\ T_y \end{pmatrix} \qquad (9.4)
where θ is the rotation angle and (T_x, T_y) are the components of the translation.
A challenging aspect of this transformation is the fact that translations and rotations do not form a vector space and cannot be managed as linearly independent parameters. Typical optimizers make the loose assumption that parameters exist in a vector space and rely on the step length being small enough for this assumption to hold approximately.
In addition to the non-linearity of the parameter space, the most common difficulty found when using this transform is the difference in units used for rotations and translations. Rotations are measured in radians; hence, their values are in the range [−π, π]. Translations are measured in millimeters and their actual values vary depending on the image modality being considered. In practice, translations have values on the order of 10 to 100. This scale difference between the rotation and translation parameters is undesirable for gradient descent optimizers because they deviate from the trajectories of descent and make optimization slower and more unstable. In order to compensate for these differences, ITK optimizers accept an array of scale values that are used to normalize the parameter space. Registrations involving angles and translations should take advantage of the scale normalization functionality in order to obtain the best performance out of the optimizers.
The main characteristics of the Euler2DTransform class are presented in Table 9.6.
9.6.8 CenteredRigid2DTransform
itk::CenteredRigid2DTransform implements a rigid transformation in 2D. The main difference between this transform and the itk::Euler2DTransform is that here we can specify an arbitrary center of rotation, while the Euler2DTransform always uses the origin of the coordinate system as
• 262. 228 Chapter 9. Image Registration
Table 9.7: Characteristics of the CenteredRigid2DTransform class.
Behavior: Represents a 2D rotation around a user-provided center followed by a 2D translation.
Number of parameters: 5
Parameter ordering: The first parameter is the angle in radians. The second and third are the coordinates of the center of rotation and the last two parameters are the translation in each dimension.
Restrictions: Only defined for two-dimensional input and output spaces.
the center of rotation. This distinction is quite important in image registration since ITK images usually have their origin in the corner of the image rather than the middle. Rotational mis-registrations usually exist, however, as rotations around the center of the image, or at least as rotations around a point in the middle of the anatomical structure captured by the image. Using gradient descent optimizers, it is almost impossible to solve non-origin rotations using a transform with origin rotations since the deep basin of the real solution is usually located across a high ridge in the topography of the cost function.
In practice, the user must supply the center of rotation in the input space, the angle of rotation and a translation to be applied after the rotation. With these parameters, the transform initializes a rotation matrix and a translation vector that together perform the equivalent of translating the center of rotation to the origin of coordinates, rotating by the specified angle, translating back to the center of rotation and finally translating by the user-specified vector.
As with the Euler2DTransform, this transform suffers from the difference in units used for rotations and translations. Rotations are measured in radians; hence, their values are in the range [−π, π]. The center of rotation and the translations are measured in millimeters, and their actual values vary depending on the image modality being considered. Registrations involving angles and translations should take advantage of the scale normalization functionality of the optimizers in order to get the best performance out of them.
The following equation illustrates the effect of the transform on an input point (x, y) that maps to the output point (x', y'),
\begin{pmatrix} x' \\ y' \end{pmatrix} =
\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \cdot
\begin{pmatrix} x - C_x \\ y - C_y \end{pmatrix} +
\begin{pmatrix} T_x + C_x \\ T_y + C_y \end{pmatrix} \qquad (9.5)
where θ is the rotation angle, (C_x, C_y) are the coordinates of the rotation center and (T_x, T_y) are the components of the translation. Note that the center coordinates are subtracted before the rotation and
  • 263. 9.6. Transforms 229 Behavior Number of Parameters Parameter Ordering Restrictions Represents a 2D ro- tation, homogeneous scaling and a 2D translation. Note that the translation com- ponent has no effect on the transformation of vectors and covari- ant vectors. 4 The first pa- rameter is the scaling factor for all dimensions, the second is the angle in radians, and the last two parameters are the transla- tions in (x,y) respectively. Only defined for two- dimensional input and output spaces. Table 9.8: Characteristics of the Similarity2DTransform class. added back after the rotation. The main features of the CenteredRigid2DTransform are presented in Table 9.7. 9.6.9 Similarity2DTransform The itk::Similarity2DTransform can be seen as a rigid transform combined with an isotropic scaling factor. This transform preserves angles between lines. In its 2D implementation, the four parameters of this transformation combine the characteristics of the itk::ScaleTransform and itk::Euler2DTransform. In particular, those relating to the non-linearity of the parameter space and the non-uniformity of the measurement units. Gradient descent optimizers should be used with caution on such parameter spaces since the notions of gradient direction and step length are ill- defined. The following equation illustrates the effect of the transform on an input point (x,y) that maps to the output point (x′,y′), x′ y′ = λ 0 0 λ · cosθ −sinθ sinθ cosθ · x−Cx y−Cy + Tx +Cx Ty +Cy (9.6) where λ is the scale factor, θ is the rotation angle, (Cx,Cy) are the coordinates of the rotation center and (Tx,Ty) are the components of the translation. Note that the center coordinates are subtracted before the rotation and scaling, and they are added back afterwards. The main features of the Simi- larity2DTransform are presented in Table 9.8. A possible approach for controlling optimization in the parameter space of this transform is to dy- namically modify the array of scales passed to the optimizer. The effect produced by the parameter
  • 264. 230 Chapter 9. Image Registration Behavior Number of Parameters Parameter Ordering Restrictions Represents a 3D rotation and a 3D translation. The rota- tion is specified as a quater- nion, defined by a set of four numbers q. The relationship between quaternion and ro- tation about vector n by an- gle θ is as follows: q = (nsin(θ/2),cos(θ/2)) Note that if the quaternion is not of unit length, scaling will also result. 7 The first four pa- rameters defines the quaternion and the last three parameters the translation in each dimension. Only defined for three-dimensional input and output spaces. Table 9.9: Characteristics of the QuaternionRigidTransform class. scaling can be used to steer the walk in the parameter space (by giving preference to some of the parameters over others). For example, perform some iterations updating only the rotation angle, then balance the array of scale factors in the optimizer and perform another set of iterations updating only the translations. 9.6.10 QuaternionRigidTransform The itk::QuaternionRigidTransform class implements a rigid transformation in 3D space. The rotational part of the transform is represented using a quaternion while the translation is represented with a vector. Quaternions components do not form a vector space and hence raise the same concerns as the itk::Similarity2DTransform when used with gradient descent optimizers. The itk::QuaternionRigidTransformGradientDescentOptimizer was introduced into the toolkit to address these concerns. This specialized optimizer implements a variation of a gradi- ent descent algorithm adapted for a quaternion space. This class insures that after advancing in any direction on the parameter space, the resulting set of transform parameters is mapped back into the permissible set of parameters. In practice, this comes down to normalizing the newly-computed quaternion to make sure that the transformation remains rigid and no scaling is applied. The main characteristics of the QuaternionRigidTransform are presented in Table 9.9. The Quaternion rigid transform also accepts a user-defined center of rotation. In this way, the trans- form can easily be used for registering images where the rotation is mostly relative to the center of the image instead one of the corners. The coordinates of this rotation center are not subject to
  • 265. 9.6. Transforms 231 Behavior Number of Parameters Parameter Ordering Restrictions Represents a 3D ro- tation. The rotation is specified by a ver- sor or unit quater- nion. The rotation is performed around a user-specified cen- ter of rotation. 3 The three param- eters define the versor. Only defined for three- dimensional input and output spaces. Table 9.10: Characteristics of the Versor Transform optimization. They only participate in the computation of the mappings for Points and in the com- putation of the Jacobian. The transformations for Vectors and CovariantVector are not affected by the selection of the rotation center. 9.6.11 VersorTransform By definition, a Versor is the rotational part of a Quaternion. It can also be defined as a unit- quaternion [56, 75]. Versors only have three independent components, since they are restricted to reside in the space of unit-quaternions. The implementation of versors in the toolkit uses a set of three numbers. These three numbers correspond to the first three components of a quaternion. The fourth component of the quaternion is computed internally such that the quaternion is of unit length. The main characteristics of the itk::VersorTransform are presented in Table 9.10. This transform exclusively represents rotations in 3D. It is intended to rapidly solve the rotational component of a more general misalignment. The efficiency of this transform comes from using a parameter space of reduced dimensionality. Versors are the best possible representation for rotations in 3D space. Sequences of versors allow the creation of smooth rotational trajectories; for this reason, they behave stably under optimization methods. The space formed by versor parameters is not a vector space. Standard gradient descent algorithms are not appropriate for exploring this parameter space. An optimizer specialized for the versor space is available in the toolkit under the name of itk::VersorTransformOptimizer. This optimizer implements versor derivatives as originally defined by Hamilton [56]. The center of rotation can be specified by the user with the SetCenter() method. The center is not part of the parameters to be optimized, therefore it remains the same during an optimization process. Its value is used during the computations for transforming Points and when computing the Jacobian.
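As an illustration, the following minimal sketch (not taken from the OTB examples) shows how a VersorTransform can be set up with an explicit rotation and center of rotation. The class and method names are standard ITK; the axis, angle and center values are purely illustrative.

#include "itkVersorTransform.h"

typedef itk::VersorTransform<double> TransformType;
TransformType::Pointer transform = TransformType::New();

// Rotation of 10 degrees around the Z axis, expressed as a versor.
itk::Versor<double>     rotation;
itk::Vector<double, 3>  axis;
axis[0] = 0.0; axis[1] = 0.0; axis[2] = 1.0;
const double angle = 10.0 * 3.14159265358979 / 180.0;   // degrees to radians
rotation.Set(axis, angle);
transform->SetRotation(rotation);

// The center of rotation is not part of the optimized parameters.
TransformType::InputPointType center;
center[0] = 128.0; center[1] = 128.0; center[2] = 32.0;  // e.g. the image center
transform->SetCenter(center);

// Only the three versor components are exposed as parameters.
TransformType::ParametersType parameters = transform->GetParameters();  // size 3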
  • 266. 232 Chapter 9. Image Registration Behavior Number of Parameters Parameter Ordering Restrictions Represents a 3D rota- tion and a 3D trans- lation. The rotation is specified by a ver- sor or unit quater- nion, while the trans- lation is represented by a vector. Users can specify the coor- dinates of the center of rotation. 6 The first three parameters define the versor and the last three parameters the translation in each dimension. Only defined for three- dimensional input and output spaces. Table 9.11: Characteristics of the VersorRigid3DTransform class. 9.6.12 VersorRigid3DTransform The itk::VersorRigid3DTransform implements a rigid transformation in 3D space. It is a vari- ant of the itk::QuaternionRigidTransform and the itk::VersorTransform. It can be seen as a itk::VersorTransform plus a translation defined by a vector. The advantage of this class with respect to the QuaternionRigidTransform is that it exposes only six parameters, three for the versor components and three for the translational components. This reduces the search space for the op- timizer to six dimensions instead of the seven dimensional used by the QuaternionRigidTransform. This transform also allows the users to set a specific center of rotation. The center coordinates are not modified during the optimization performed in a registration process. The main features of this transform are summarized in Table 9.11. This transform is probably the best option to use when dealing with rigid transformations in 3D. Given that the space of Versors is not a Vector space, typical gradient descent opti- mizers are not well suited for exploring the parametric space of this transform. The itk::VersorRigid3DTranformOptimizer has been introduced in the ITK toolkit with the pur- pose of providing an optimizer that is aware of the Versor space properties on the rotational part of this transform, as well as the Vector space properties on the translational part of the transform. 9.6.13 Euler3DTransform The itk::Euler3DTransform implements a rigid transformation in 3D space. It can be seen as a rotation followed by a translation. This class exposes six parameters, three for the Euler angles that represent the rotation and three for the translational components. This transform also allows the users to set a specific center of rotation. The center coordinates are not modified during the opti- mization performed in a registration process. The main features of this transform are summarized in
Table 9.12.

Table 9.12: Characteristics of the Euler3DTransform class.
- Behavior: Represents a rigid rotation in 3D space, that is, a rotation followed by a 3D translation. The rotation is specified by three angles representing rotations applied around the X, Y and Z axes one after another. The translation part is represented by a Vector. Users can also specify the coordinates of the center of rotation.
- Number of Parameters: 6
- Parameter Ordering: The first three parameters are the rotation angles around the X, Y and Z axes, and the last three parameters are the translations along each dimension.
- Restrictions: Only defined for three-dimensional input and output spaces.

The fact that the three rotational parameters are non-linear and do not behave like Vector spaces must be taken into account when selecting an optimizer to work with this transform and when fine-tuning the parameters of such an optimizer. It is strongly recommended to use this transform by introducing very small variations on the rotational components. A small rotation will be in the range of 1 degree, which in radians is approximately 0.01745. You should not expect this transform to be able to compensate for large rotations just by being driven with the optimizer. In practice you must provide a reasonable initialization of the transform angles and only need to correct for residual rotations on the order of 10 or 20 degrees.

9.6.14 Similarity3DTransform

The itk::Similarity3DTransform implements a similarity transformation in 3D space. It can be seen as a homogeneous scaling followed by an itk::VersorRigid3DTransform. This class exposes seven parameters: one for the scaling factor, three for the versor components and three for the translational components. This transform also allows the user to set a specific center of rotation. The center coordinates are not modified during the optimization performed in a registration process. Both the rotation and scaling operations are performed with respect to the center of rotation. The main features of this transform are summarized in Table 9.13.
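Relating to the recommendation above on the Euler3DTransform, the following minimal sketch (not taken from the OTB examples) shows an initialization in which the user-provided angles are converted from degrees to radians before being handed to the transform; all numeric values are illustrative.

#include <cmath>
#include "itkEuler3DTransform.h"

typedef itk::Euler3DTransform<double> TransformType;
TransformType::Pointer transform = TransformType::New();

// Initial guess supplied in degrees and converted to radians.
const double degreesToRadians = std::atan(1.0) / 45.0;
transform->SetRotation(0.0  * degreesToRadians,   // angle around X
                       0.0  * degreesToRadians,   // angle around Y
                       12.0 * degreesToRadians);  // angle around Z

TransformType::OutputVectorType translation;
translation[0] = 15.0;   // expressed in physical units
translation[1] = -7.5;
translation[2] = 0.0;
transform->SetTranslation(translation);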
  • 268. 234 Chapter 9. Image Registration Behavior Number of Parameters Parameter Ordering Restrictions Represents a 3D ro- tation, a 3D trans- lation and homoge- neous scaling. The scaling factor is spec- ified by a scalar, the rotation is specified by a versor, and the translation is repre- sented by a vector. Users can also spec- ify the coordinates of the center of rotation, that is the same cen- ter used for scaling. 7 The first parame- ter is the scaling factor, the next three parameters define the versor and the last three parameters the translation in each dimension. Only defined for three- dimensional input and output spaces. Table 9.13: Characteristics of the Similarity3DTransform class. The fact that the scaling and rotational spaces are non-linear and do not behave like Vector spaces must be taken into account when selecting an optimizer to work with this transform and when fine tuning the parameters of such optimizer. 9.6.15 Rigid3DPerspectiveTransform The itk::Rigid3DPerspectiveTransform implements a rigid transformation in 3D space fol- lowed by a perspective projection. This transform is intended to be used in 3D/2D registration problems where a 3D object is projected onto a 2D plane. This is the case of Fluoroscopic images used for image guided intervention, and it is also the case for classical radiography. Users must provide a value for the focal distance to be used during the computation of the perspective trans- form. This transform also allows users to set a specific center of rotation. The center coordinates are not modified during the optimization performed in a registration process. The main features of this transform are summarized in Table 9.14. This transform is also used when creating Digitally Reconstructed Radiographs (DRRs). The strategies for optimizing the parameters of this transform are the same ones used for optimiz- ing the VersorRigid3DTransform. In particular, you can use the same VersorRigid3DTranform- Optimizer in order to optimize the parameters of this class.
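A minimal sketch of the non-optimized settings of this transform is given below (it is not part of the original examples); SetFocalDistance() and TransformPoint() are the standard ITK methods, and the numeric values are purely illustrative.

#include "itkRigid3DPerspectiveTransform.h"

typedef itk::Rigid3DPerspectiveTransform<double> TransformType;
TransformType::Pointer transform = TransformType::New();

// The focal distance is not part of the optimized parameters and must be
// chosen from the acquisition geometry (the value below is illustrative).
transform->SetFocalDistance(1000.0);

// A 3D point is mapped onto the 2D projection plane.
TransformType::InputPointType p3d;
p3d[0] = 12.0; p3d[1] = -3.0; p3d[2] = 250.0;
TransformType::OutputPointType p2d = transform->TransformPoint(p3d);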
Table 9.14: Characteristics of the Rigid3DPerspectiveTransform class.
- Behavior: Represents a rigid 3D transformation followed by a perspective projection. The rotation is specified by a Versor, while the translation is represented by a Vector. Users can specify the coordinates of the center of rotation. They must also specify a focal distance to be used for the perspective projection. The rotation center and the focal distance parameters are not modified during the optimization process.
- Number of Parameters: 6
- Parameter Ordering: The first three parameters define the Versor and the last three parameters the translation in each dimension.
- Restrictions: Only defined for three-dimensional input and two-dimensional output spaces. This is one of the few transforms where the input space has a different dimension from the output space.
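All the rigid 3D transforms above accept a user-defined center of rotation. A common way to initialize it, sketched below (not part of the original examples), is the itk::CenteredTransformInitializer, here used in geometry mode; FixedImageType, MovingImageType and the two readers are assumed to be defined elsewhere.

#include "itkCenteredTransformInitializer.h"
#include "itkVersorRigid3DTransform.h"

typedef itk::VersorRigid3DTransform<double> TransformType;
typedef itk::CenteredTransformInitializer<TransformType,
                                          FixedImageType,
                                          MovingImageType> InitializerType;

TransformType::Pointer   transform   = TransformType::New();
InitializerType::Pointer initializer = InitializerType::New();

initializer->SetTransform(transform);
initializer->SetFixedImage(fixedReader->GetOutput());    // assumed readers
initializer->SetMovingImage(movingReader->GetOutput());

// GeometryOn() centers the transform on the geometric centers of the images;
// MomentsOn() would use the centers of mass instead.
initializer->GeometryOn();
initializer->InitializeTransform();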
  • 270. 236 Chapter 9. Image Registration Behavior Number of Parameters Parameter Ordering Restrictions Represents an affine transform composed of rotation, scaling, shearing and transla- tion. The transform is specified by a N × N matrix and a N × 1 vector where N is the space dimension. (N + 1)× N The first N × N parameters define the matrix in column-major order (where the column in- dex varies the fastest). The last N parameters define the trans- lations for each dimension. Only defined when the input and output space have the same dimension. Table 9.15: Characteristics of the AffineTransform class. 9.6.16 AffineTransform The itk::AffineTransform is one of the most popular transformations used for image registra- tion. Its main advantage comes from the fact that it is represented as a linear transformation. The main features of this transform are presented in Table 9.15. The set of AffineTransform coefficients can actually be represented in a vector space of dimension (N + 1) × N. This makes it possible for optimizers to be used appropriately on this search space. However, the high dimensionality of the search space also implies a high computational complexity of cost-function derivatives. The best compromise in the reduction of this computational time is to use the transform’s Jacobian in combination with the image gradient for computing the cost-function derivatives. The coefficients of the N ×N matrix can represent rotations, anisotropic scaling and shearing. These coefficients are usually of a very different dynamic range compared to the translation coefficients. Coefficients in the matrix tend to be in the range [−1 : 1], but are not restricted to this interval. Translation coefficients, on the other hand, can be on the order of 10 to 100, and are basically related to the image size and pixel spacing. This difference in scale makes it necessary to take advantage of the functionality offered by the optimizers for rescaling the parameter space. This is particularly relevant for optimizers based on gradient descent approaches. This transform lets the user set an arbitrary center of rotation. The coordinates of the rotation center do not make part of the parameters array passed to the optimizer. Equation 9.7 illustrates the effect of applying the AffineTransform in a point in 3D space.
Table 9.16: Characteristics of the BSplineDeformableTransform class.
- Behavior: Represents a free-form deformation by providing a deformation field computed by interpolation of deformations defined on a coarse grid.
- Number of Parameters: M × N, where M is the number of nodes in the BSpline grid and N is the dimension of the space.
- Restrictions: Only defined when the input and output space have the same dimension. This transform has the advantage of allowing deformable registration. It also has the disadvantage of a very high-dimensional parametric space, and therefore requires long computation times.

\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} M_{00} & M_{01} & M_{02} \\ M_{10} & M_{11} & M_{12} \\ M_{20} & M_{21} & M_{22} \end{pmatrix} \cdot \begin{pmatrix} x - C_x \\ y - C_y \\ z - C_z \end{pmatrix} + \begin{pmatrix} T_x + C_x \\ T_y + C_y \\ T_z + C_z \end{pmatrix} \quad (9.7)

A registration based on the affine transform may be more effective when applied after simpler transformations have been used to remove the major components of misalignment. Otherwise it will incur an overwhelming computational cost. For example, using an affine transform, the first set of optimization iterations would typically focus on removing large translations. This task could instead be accomplished by a translation transform in a parameter space of size N instead of the (N + 1) × N associated with the affine transform.

Tracking the evolution of a registration process that uses AffineTransforms can be challenging, since it is difficult to represent the coefficients in a meaningful way. A simple printout of the transform coefficients generally does not offer a clear picture of the current behavior and trend of the optimization. A better approach is to use the affine transform to deform a wire-frame cube which is shown in a 3D visualization display.

9.6.17 BSplineDeformableTransform

The itk::BSplineDeformableTransform is designed to be used for solving deformable registration problems. This transform is equivalent to generating a deformation field where a deformation vector is assigned to every point in space. The deformation vectors are computed using BSpline interpolation from the deformation values of points located in a coarse grid, which is usually referred to as the BSpline grid.

The BSplineDeformableTransform is not flexible enough to account for large rotations, shearing,
or scaling differences. In order to compensate for this limitation, it provides the functionality of being composed with an arbitrary transform. This transform is known as the Bulk transform and it is applied to points before they are mapped with the displacement field.

This transform does not provide functionality for mapping Vectors or CovariantVectors; only Points can be mapped. The reason is that the variations of a vector under a deformable transform actually depend on the location of the vector in space. In other words, Vectors only make sense as the relative position between two points.

The BSplineDeformableTransform has a very large number of parameters and is therefore well suited for the itk::LBFGSOptimizer and itk::LBFGSBOptimizer. The use of this transform was proposed in the following papers [123, 95, 96].

9.6.18 KernelTransforms

Kernel Transforms are a set of Transforms that are also suitable for performing deformable registration. These transforms compute on the fly the displacements corresponding to a deformation field. The displacement values corresponding to every point in space are computed by interpolation from the vectors defined by a set of Source Landmarks and a set of Target Landmarks.

Several variations of these transforms are available in the toolkit. They differ in the type of interpolation kernel that is used when computing the deformation at a particular point in space. Note that these transforms are computationally expensive and that their numerical complexity is proportional to the number of landmarks and the space dimension. The following is the list of Transforms based on the KernelTransform.

• itk::ElasticBodySplineKernelTransform
• itk::ElasticBodyReciprocalSplineKernelTransform
• itk::ThinPlateSplineKernelTransform
• itk::ThinPlateR2LogRSplineKernelTransform
• itk::VolumeSplineKernelTransform

Details about the mathematical background of these transforms can be found in the paper by Davis et al. [34] and the papers by Rohr et al. [120, 121].
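As an illustration, the following sketch (not taken from the OTB examples) sets up a thin-plate spline kernel transform from a handful of landmark pairs; the class and method names are standard ITK, and the landmark coordinates are purely illustrative.

#include "itkThinPlateSplineKernelTransform.h"

typedef itk::ThinPlateSplineKernelTransform<double, 2> TransformType;
typedef TransformType::PointSetType                    PointSetType;

TransformType::Pointer tps = TransformType::New();

PointSetType::Pointer sourceLandmarks = PointSetType::New();
PointSetType::Pointer targetLandmarks = PointSetType::New();

// Landmark pairs (coordinates are illustrative); in practice they come from
// manual picking or from a point matching filter.
PointSetType::PointType p;
p[0] = 10.0; p[1] = 10.0; sourceLandmarks->SetPoint(0, p);
p[0] = 12.0; p[1] =  9.0; targetLandmarks->SetPoint(0, p);
p[0] = 50.0; p[1] = 40.0; sourceLandmarks->SetPoint(1, p);
p[0] = 51.5; p[1] = 42.0; targetLandmarks->SetPoint(1, p);
p[0] = 20.0; p[1] = 80.0; sourceLandmarks->SetPoint(2, p);
p[0] = 19.0; p[1] = 83.0; targetLandmarks->SetPoint(2, p);

tps->SetSourceLandmarks(sourceLandmarks);
tps->SetTargetLandmarks(targetLandmarks);
tps->ComputeWMatrix();   // solves for the spline coefficients

// The transform can now map any point of the source space.
TransformType::InputPointType q;
q[0] = 30.0; q[1] = 25.0;
TransformType::OutputPointType qMapped = tps->TransformPoint(q);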
  • 273. 9.7. Metrics 239 9.7 Metrics In OTB, itk::ImageToImageMetric objects quantitatively measure how well the transformed moving image fits the fixed image by comparing the gray-scale intensity of the images. These metrics are very flexible and can work with any transform or interpolation method and do not require reduction of the gray-scale images to sparse extracted information such as edges. The metric component is perhaps the most critical element of the registration framework. The selection of which metric to use is highly dependent on the registration problem to be solved. For example, some metrics have a large capture range while others require initialization close to the optimal position. In addition, some metrics are only suitable for comparing images obtained from the same type of sensor, while others can handle multi-sensor comparisons. Unfortunately, there are no clear-cut rules as to how to choose a metric. The basic inputs to a metric are: the fixed and moving images, a transform and an interpolator. The method GetValue() can be used to evaluate the quantitative criterion at the transform parameters specified in the argument. Typically, the metric samples points within a defined region of the fixed image. For each point, the corresponding moving image position is computed using the transform with the specified parameters, then the interpolator is used to compute the moving image intensity at the mapped position. The metrics also support region based evaluation. The SetFixedImageMask() and SetMovingImageMask() methods may be used to restrict evaluation of the metric within a spec- ified region. The masks may be of any type derived from itk::SpatialObject. Besides the measure value, gradient-based optimization schemes also require derivatives of the measure with respect to each transform parameter. The methods GetDerivatives() and GetValueAndDerivatives() can be used to obtain the gradient information. The following is the list of metrics currently available in OTB: • Mean squares itk::MeanSquaresImageToImageMetric • Normalized correlation itk::NormalizedCorrelationImageToImageMetric • Mean reciprocal squared difference itk::MeanReciprocalSquareDifferenceImageToImageMetric • Mutual information by Viola and Wells itk::MutualInformationImageToImageMetric • Mutual information by Mattes itk::MattesMutualInformationImageToImageMetric • Kullback Liebler distance metric by Kullback and Liebler itk::KullbackLeiblerCompareHistogramImageToImageMetric
  • 274. 240 Chapter 9. Image Registration • Normalized mutual information itk::NormalizedMutualInformationHistogramImageToImageMetric • Mean squares histogram itk::MeanSquaresHistogramImageToImageMetric • Correlation coefficient histogram itk::CorrelationCoefficientHistogramImageToImageMetric • Cardinality Match metric itk::MatchCardinalityImageToImageMetric • Kappa Statistics metric itk::KappaStatisticImageToImageMetric • Gradient Difference metric itk::GradientDifferenceImageToImageMetric In the following sections, we describe each metric type in detail. For ease of notation, we will refer to the fixed image f(X) and transformed moving image (m◦ T(X)) as images A and B. 9.7.1 Mean Squares Metric The itk::MeanSquaresImageToImageMetric computes the mean squared pixel-wise difference in intensity between image A and B over a user defined region: MS(A,B) = 1 N N ∑ i=1 (Ai − Bi)2 (9.8) Ai is the i-th pixel of Image A Bi is the i-th pixel of Image B N is the number of pixels considered The optimal value of the metric is zero. Poor matches between images A and B result in large values of the metric. This metric is simple to compute and has a relatively large capture radius. This metric relies on the assumption that intensity representing the same homologous point must be the same in both images. Hence, its use is restricted to images of the same modality. Additionally, any linear changes in the intensity result in a poor match value. Exploring a Metric Getting familiar with the characteristics of the Metric as a cost function is fundamental in order to find the best way of setting up an optimization process that will use this metric for solving a registration problem.
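A minimal sketch of such an exploration (not taken from the OTB examples) is given below: the mean squares metric is evaluated for a range of pure horizontal translations, which gives a one-dimensional profile of the cost function. ImageType and the two image pointers are assumed to be defined elsewhere.

#include <iostream>
#include "itkMeanSquaresImageToImageMetric.h"
#include "itkTranslationTransform.h"
#include "itkLinearInterpolateImageFunction.h"

typedef itk::MeanSquaresImageToImageMetric<ImageType, ImageType> MetricType;
typedef itk::TranslationTransform<double, 2>                     TransformType;
typedef itk::LinearInterpolateImageFunction<ImageType, double>   InterpolatorType;

MetricType::Pointer       metric       = MetricType::New();
TransformType::Pointer    transform    = TransformType::New();
InterpolatorType::Pointer interpolator = InterpolatorType::New();

metric->SetFixedImage(fixedImage);          // assumed to be read elsewhere
metric->SetMovingImage(movingImage);
metric->SetTransform(transform);
metric->SetInterpolator(interpolator);
metric->SetFixedImageRegion(fixedImage->GetBufferedRegion());
metric->Initialize();

// Evaluate the metric for horizontal shifts of -10..+10 pixels.
MetricType::TransformParametersType params(transform->GetNumberOfParameters());
params[1] = 0.0;
for (double dx = -10.0; dx <= 10.0; dx += 1.0)
  {
  params[0] = dx;
  std::cout << dx << " " << metric->GetValue(params) << std::endl;
  }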
  • 275. 9.7. Metrics 241 9.7.2 Normalized Correlation Metric The itk::NormalizedCorrelationImageToImageMetric computes pixel-wise cross-correlation and normalizes it by the square root of the autocorrelation of the images: NC(A,B) = −1 × ∑N i=1 (Ai ·Bi) ∑N i=1 A2 i ·∑N i=1 B2 i (9.9) Ai is the i-th pixel of Image A Bi is the i-th pixel of Image B N is the number of pixels considered Note the −1 factor in the metric computation. This factor is used to make the metric be optimal when its minimum is reached. The optimal value of the metric is then minus one. Misalignment between the images results in small measure values. The use of this metric is limited to images obtained using the same imaging modality. The metric is insensitive to multiplicative factors – illumination changes – between the two images. This metric produces a cost function with sharp peaks and well defined minima. On the other hand, it has a relatively small capture radius. 9.7.3 Mean Reciprocal Square Differences The itk::MeanReciprocalSquareDifferenceImageToImageMetric computes pixel-wise dif- ferences and adds them after passing them through a bell-shaped function 1 1+x2 : PI(A,B) = N ∑ i=1 1 1 + (Ai−Bi)2 λ2 (9.10) Ai is the i-th pixel of Image A Bi is the i-th pixel of Image B N is the number of pixels considered λ controls the capture radius The optimal value is N and poor matches results in small measure values. The characteristics of this metric have been studied by Penney and Holden [59][108]. This image metric has the advantage of producing poor values when few pixels are considered. This makes it consistent when its computation is subject to the size of the overlap region between the images. The capture radius of the metric can be regulated with the parameter λ. The profile of this metric is very peaky. The sharp peaks of the metric help to measure spatial misalignment with high precision. Note that the notion of capture radius is used here in terms of the intensity domain, not
  • 276. 242 Chapter 9. Image Registration the spatial domain. In that regard, λ should be given in intensity units and be associated with the differences in intensity that will make drop the metric by 50%. The metric is limited to images of the same image modality. The fact that its derivative is large at the central peak is a problem for some optimizers that rely on the derivative to decrease as the extrema are reached. This metric is also sensitive to linear changes in intensity. 9.7.4 Mutual Information Metric The itk::MutualInformationImageToImageMetric computes the mutual information between image A and image B. Mutual information (MI) measures how much information one random vari- able (image intensity in one image) tells about another random variable (image intensity in the other image). The major advantage of using MI is that the actual form of the dependency does not have to be specified. Therefore, complex mapping between two images can be modeled. This flexibility makes MI well suited as a criterion of multi-modality registration [112]. Mutual information is defined in terms of entropy. Let H(A) = − pA(a)log pA(a)da (9.11) be the entropy of random variable A, H(B) the entropy of random variable B and H(A,B) = pAB(a,b)log pAB(a,b)dadb (9.12) be the joint entropy of A and B. If A and B are independent, then pAB(a,b) = pA(a)pB(b) (9.13) and H(A,B) = H(A)+ H(B). (9.14) However, if there is any dependency, then H(A,B) < H(A)+ H(B). (9.15) The difference is called Mutual Information : I(A,B) I(A,B) = H(A)+ H(B)− H(A,B) (9.16) Parzen Windowing
  • 277. 9.7. Metrics 243 Sigma Gray levels Figure 9.16: In Parzen windowing, a continuous den- sity function is constructed by superimposing kernel functions (Gaussian function in this case) centered on the intensity samples obtained from the image. In a typical registration problem, direct access to the marginal and joint probability densities is not available and hence the densities must be estimated from the image data. Parzen win- dows (also known as kernel density estima- tors) can be used for this purpose. In this scheme, the densities are constructed by taking intensity samples S from the image and super- positioning kernel functions K(·) centered on the elements of S as illustrated in Figure 9.16: A variety of functions can be used as the smoothing kernel with the requirement that they are smooth, symmetric, have zero mean and integrate to one. For example, box- car, Gaussian and B-spline functions are suit- able candidates. A smoothing parameter is used to scale the kernel function. The larger the smoothing parameter, the wider the kernel function used and hence the smoother the density estimate. If the parameter is too large, features such as modes in the density will get smoothed out. On the other hand, if the smoothing parameter is too small, the resulting density may be too noisy. The estimation is given by the following equation. p(a) ≈ P∗ (a) = 1 N ∑ sj∈S K (a − sj) (9.17) Choosing the optimal smoothing parameter is a difficult research problem and beyond the scope of this software guide. Typically, the optimal value of the smoothing parameter will depend on the data and the number of samples used. Viola and Wells Implementation OTB, through ITK, has multiple implementations of the mutual information metric. One of the most commonly used is itk::MutualInformationImageToImageMetric and follows the method specified by Viola and Wells in [138]. In this implementation, two separate intensity samples S and R are drawn from the image: the first to compute the density, and the second to approximate the entropy as a sample mean: H(A) = 1 N ∑ rj∈R logP∗ (rj). (9.18) Gaussian density is used as a smoothing kernel, where the standard deviation σ acts as the smoothing parameter.
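The following plain C++ function (an illustration added here, not part of the toolkit) implements the estimate of Equation 9.17 with a Gaussian kernel; it may help to visualize how the smoothing parameter acts on the density.

#include <cmath>
#include <vector>

// Parzen estimate of the density p(a) from intensity samples S,
// using a Gaussian kernel of standard deviation sigma (Equation 9.17).
double ParzenDensity(double a, const std::vector<double>& samples, double sigma)
{
  const double norm = 1.0 / (std::sqrt(2.0 * 3.14159265358979) * sigma);
  double sum = 0.0;
  for (unsigned int j = 0; j < samples.size(); ++j)
    {
    const double u = (a - samples[j]) / sigma;
    sum += norm * std::exp(-0.5 * u * u);   // kernel centered on sample s_j
    }
  return sum / static_cast<double>(samples.size());
}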
  • 278. 244 Chapter 9. Image Registration The number of spatial samples used for computation is defined using the SetNumberOfSpatialSamples() method. Typical values range from 50 to 100. Note that computation involves an N × N loop and hence, the computation burden becomes very expensive when a large number of samples is used. The quality of the density estimates depends on the choice of the standard deviation of the Gaussian kernel. The optimal choice will depend on the content of the images. In our expe- rience with the toolkit, we have found that a standard deviation of 0.4 works well for images that have been normalized to have a mean of zero and standard deviation of 1.0. The stan- dard deviation of the fixed image and moving image kernel can be set separately using methods SetFixedImageStandardDeviation() and SetMovingImageStandardDeviation(). Mattes et al. Implementation Another form of mutual information metric available in ITK follows the method specified by Mattes et al. in [95] and is implemented by the itk::MattesMutualInformationImageToImageMetric class. In this implementation, only one set of intensity samples is drawn from the image. Using this set, the marginal and joint probability density function (PDF) is evaluated at discrete positions or bins uniformly spread within the dynamic range of the images. Entropy values are then computed by summing over the bins. The number of spatial samples used is set using method SetNumberOfSpatialSamples(). The number of bins used to compute the entropy values is set via SetNumberOfHistogramBins(). Since the fixed image PDF does not contribute to the metric derivatives, it does not need to be smooth. Hence, a zero order (boxcar) B-spline kernel is used for computing the PDF. On the other hand, to ensure smoothness, a third order B-spline kernel is used to compute the moving image intensity PDF. The advantage of using a B-spline kernel over a Gaussian kernel is that the B-spline kernel has a finite support region. This is computationally attractive, as each intensity sample only affects a small number of bins and hence does not require a N ×N loop to compute the metric value. During the PDF calculations, the image intensity values are linearly scaled to have a minimum of zero and maximum of one. This rescaling means that a fixed B-spline kernel bandwidth of one can be used to handle image data with arbitrary magnitude and dynamic range. 9.7.5 Kullback-Leibler distance metric The itk::KullbackLeiblerCompareHistogramImageToImageMetric is yet another informa- tion based metric. Kullback-Leibler distance measures the relative entropy between two discrete probability distributions. The distributions are obtained from the histograms of the two input im- ages, A and B.
The Kullback-Leibler distance between two histograms is given by

KL(A,B) = \sum_{i}^{N} p_A(i) \times \log \frac{p_A(i)}{p_B(i)} \quad (9.19)

The distance is always non-negative and is zero only if the two distributions are the same. Note that the distance is not symmetric; in other words, KL(A,B) \neq KL(B,A). Nevertheless, if the distributions are not too dissimilar, the difference between KL(A,B) and KL(B,A) is small. The implementation in ITK is based on [26].

9.7.6 Normalized Mutual Information Metric

Given two images, A and B, the normalized mutual information may be computed as

NMI(A,B) = 1 + \frac{I(A,B)}{H(A,B)} = \frac{H(A) + H(B)}{H(A,B)} \quad (9.20)

where the entropies of the images, H(A) and H(B), the mutual information, I(A,B), and the joint entropy, H(A,B), are computed as described in Section 9.7.4. Details of the implementation may be found in [55].

9.7.7 Mean Squares Histogram

The itk::MeanSquaresHistogramImageToImageMetric is an alternative implementation of the Mean Squares Metric. In this implementation the joint histogram of the fixed and the mapped moving image is built first. The user selects the number of bins to use in this joint histogram. Once the joint histogram is computed, the bins are visited with an iterator. Given that each bin is associated with a pair of intensities of the form {fixed intensity, moving intensity}, along with the number of pixel pairs in the images that fell into this bin, it is then possible to compute the sum of squared distances between the intensities of both images at the quantization levels defined by the joint histogram bins.

This metric can be represented with Equation 9.21:

MSH = \sum_f \sum_m H(f,m) (f - m)^2 \quad (9.21)

where H(f,m) is the count in the joint histogram bin identified by fixed image intensity f and moving image intensity m.
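For the histogram-based metrics of this and the following sections, the number of joint histogram bins is the main tuning parameter. A minimal configuration sketch (not taken from the OTB examples) is given below; SetHistogramSize() is the standard method of the histogram metric family, ImageType is assumed to be defined elsewhere, and the bin counts are illustrative.

#include "itkMeanSquaresHistogramImageToImageMetric.h"

typedef itk::MeanSquaresHistogramImageToImageMetric<ImageType, ImageType> MetricType;
MetricType::Pointer metric = MetricType::New();

// The joint histogram resolution drives the behavior of this metric family:
// too few bins hides structure, too many yields a noisy estimate.
MetricType::HistogramSizeType histogramSize;
histogramSize[0] = 50;   // fixed-image intensity bins (illustrative)
histogramSize[1] = 50;   // moving-image intensity bins
metric->SetHistogramSize(histogramSize);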
9.7.8 Correlation Coefficient Histogram

The itk::CorrelationCoefficientHistogramImageToImageMetric computes the cross-correlation coefficient between the intensities in the fixed image and the intensities of the mapped moving image. This metric is intended to be used with images of the same modality, where the relationship between the intensities of the fixed image and the intensities of the moving image is given by a linear equation.

The correlation coefficient is computed from the joint histogram as

CC = \frac{\sum_f \sum_m H(f,m) \, f \cdot m \; - \; \bar{f} \cdot \bar{m}}{\sum_f H(f)(f - \bar{f})^2 \; \cdot \; \sum_m H(m)(m - \bar{m})^2} \quad (9.22)

where H(f,m) is the joint histogram count for the bin identified by the fixed image intensity f and the moving image intensity m. The values \bar{f} and \bar{m} are the mean values of the fixed and moving images respectively. H(f) and H(m) are the histogram counts of the fixed and moving images respectively. The optimal value of the correlation coefficient is 1, which would indicate a perfect straight line in the histogram.

9.7.9 Cardinality Match Metric

The itk::MatchCardinalityImageToImageMetric computes the cardinality of the set of pixels that match exactly between the moving and fixed images. In other words, it computes the number of pixel matches and mismatches between the two images. The match is designed for label maps. All pixel mismatches are considered equal, whether they are between label 1 and label 2 or between label 1 and label 500. In other words, the magnitude of an individual label mismatch is not relevant; only the occurrence of a mismatch is important.

The spatial correspondence between the fixed and moving images is established using an itk::Transform set with the SetTransform() method and an interpolator set with SetInterpolator(). Given that we are matching pixels with labels, it is advisable to use Nearest Neighbor interpolation.

9.7.10 Kappa Statistics Metric

The itk::KappaStatisticImageToImageMetric computes the spatial intersection of two binary images. The metric is designed for matching pixels in two images with the same exact value, which may be set using SetForegroundValue(). Given two images A and B, the κ coefficient is computed as

\kappa = \frac{2 \, |A \cap B|}{|A| + |B|} \quad (9.23)
where |A| is the number of foreground pixels in image A. This computes the fraction of area in the two images that is common to both of them. In the computation of the metric, only foreground pixels are considered.

9.7.11 Gradient Difference Metric

The itk::GradientDifferenceImageToImageMetric evaluates the difference in the derivatives of the moving and fixed images. The derivatives are passed through the function \frac{1}{1+x} and then added. The purpose of this metric is to focus the registration on the edges of structures in the images. In this way the borders exert a larger influence on the result of the registration than do the homogeneous regions inside the image.

Figure 9.17: Class diagram of the optimizers hierarchy.

9.8 Optimizers

Optimization algorithms are encapsulated as itk::Optimizer objects within OTB. Optimizers are generic and can be used for applications other than registration. Within the registration framework, subclasses of itk::SingleValuedNonLinearOptimizer are used to optimize the metric criterion with respect to the transform parameters.

The basic input to an optimizer is a cost function object. In the context of registration, itk::ImageToImageMetric classes provide this functionality. The initial parameters are set using SetInitialPosition() and the optimization algorithm is invoked by StartOptimization(). Once the optimization has finished, the final parameters can be obtained using GetCurrentPosition().
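The following sketch (not taken from the OTB examples) summarizes the typical calling sequence, here with a regular step gradient descent optimizer and the parameter scaling discussed below; the metric and initialParameters variables are assumed to be configured elsewhere, and the numeric values are illustrative.

#include "itkRegularStepGradientDescentOptimizer.h"

typedef itk::RegularStepGradientDescentOptimizer OptimizerType;
OptimizerType::Pointer optimizer = OptimizerType::New();

// The metric (an ImageToImageMetric configured elsewhere) plays the role
// of the cost function.
optimizer->SetCostFunction(metric);

// Rescale the parameter space of an Euler2DTransform: one angle (radians)
// followed by two translations (pixels or physical units).
OptimizerType::ScalesType scales(3);
scales[0] = 1.0;            // angle
scales[1] = 1.0 / 1000.0;   // translation in x
scales[2] = 1.0 / 1000.0;   // translation in y
optimizer->SetScales(scales);

optimizer->SetMaximumStepLength(1.0);     // illustrative values
optimizer->SetMinimumStepLength(0.001);
optimizer->SetNumberOfIterations(200);

optimizer->SetInitialPosition(initialParameters);   // assumed defined elsewhere
optimizer->StartOptimization();

OptimizerType::ParametersType finalParameters = optimizer->GetCurrentPosition();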
Some optimizers also allow rescaling of their individual parameters. This is convenient for normalizing parameter spaces where some parameters have different dynamic ranges. For example, the first parameter of itk::Euler2DTransform represents an angle while the last two parameters represent translations. A unit change in angle has a much greater impact on an image than a unit change in translation. This difference in scale appears as long narrow valleys in the search space, making the optimization problem more difficult. Rescaling the translation parameters can help to fix this problem. Scales are represented as an itk::Array of doubles and are set using SetScales().

There are two main types of optimizers in OTB. In the first type we find optimizers that are suitable for dealing with cost functions that return a single value. These are indeed the most common type of cost functions, and are known as Single Valued functions; the corresponding optimizers are therefore known as Single Valued optimizers. The second type of optimizers are those suitable for managing cost functions that return multiple values at each evaluation. These cost functions are common in model-fitting problems and are known as Multi Valued or Multivariate functions. The corresponding optimizers are therefore called MultipleValued optimizers in OTB.

The itk::SingleValuedNonLinearOptimizer is the base class for the first type of optimizers while the itk::MultipleValuedNonLinearOptimizer is the base class for the second type of optimizers.

The types of single valued optimizers currently available in OTB are:

• Amoeba: Nelder-Mead downhill simplex. This optimizer is actually implemented in the vxl/vnl numerics toolkit. The ITK class itk::AmoebaOptimizer is merely an adaptor class.
• Conjugate Gradient: Fletcher-Reeves form of the conjugate gradient with or without preconditioning ( itk::ConjugateGradientOptimizer). It is also an adaptor to an optimizer in vnl.
• Gradient Descent: Advances parameters in the direction of the gradient where the step size is governed by a learning rate ( itk::GradientDescentOptimizer).
• Quaternion Rigid Transform Gradient Descent: A specialized version of GradientDescentOptimizer for QuaternionRigidTransform parameters, where the parameters representing the quaternion are normalized to a magnitude of one at each iteration to represent a pure rotation ( itk::QuaternionRigidTransformGradientDescent).
• LBFGS: Limited memory Broyden, Fletcher, Goldfarb and Shanno minimization. It is an adaptor to an optimizer in vnl ( itk::LBFGSOptimizer).
• LBFGSB: A modified version of the LBFGS optimizer that allows specifying bounds for the parameters in the search space. It is an adaptor to an optimizer in netlib. Details on this optimizer can be found in [19, 20] ( itk::LBFGSBOptimizer).
• One Plus One Evolutionary: Strategy that simulates the biological evolution of a set of samples in the search space ( itk::OnePlusOneEvolutionaryOptimizer). Details on this optimizer can be found in [130].
  • 283. 9.9. Landmark-based registration 249 • Regular Step Gradient Descent: Advances parameters in the direction of the gradient where a bipartition scheme is used to compute the step size ( itk::RegularStepGradientDescentOptimizer). • Powell Optimizer: Powell optimization method. For an N-dimensional parameter space, each iteration minimizes(maximizes) the function in N (initially orthogonal) directions. This optimizer is described in [114]. ( itk::PowellOptimizer). • SPSA Optimizer: Simultaneous Perturbation Stochastic Approximation Method. This optimizer is described in http://www.jhuapl.edu/SPSA and in [128]. ( itk::SPSAOptimizer). • Versor Transform Optimizer: A specialized version of the RegularStepGradientDescentOp- timizer for VersorTransform parameters, where the current rotation is composed with the gra- dient rotation to produce the new rotation versor. It follows the definition of versor gradients defined by Hamilton [56] ( itk::VersorTransformOptimizer). • Versor Rigid3D Transform Optimizer: A specialized version of the RegularStepGradi- entDescentOptimizer for VersorRigid3DTransform parameters, where the current rotation is composed with the gradient rotation to produce the new rotation versor. The transla- tional part of the transform parameters are updated as usually done in a vector space. ( itk::VersorRigid3DTransformOptimizer). A parallel hierarchy exists for optimizing multiple-valued cost functions. The base optimizer in this branch of the hierarchy is the itk::MultipleValuedNonLinearOptimizer whose only current derived class is: • Levenberg Marquardt: Non-linear least squares minimization. Adapted to an optimizer in vnl ( itk::LevenbergMarquardtOptimizer). This optimizer is described in [114]. Figure 9.17 illustrates the full class hierarchy of optimizers in OTB. Optimizers in the lower right corner are adaptor classes to optimizers existing in the vxl/vnl numerics toolkit. The optimizers interact with the itk::CostFunction class. In the registration framework this cost function is reimplemented in the form of ImageToImageMetric. 9.9 Landmark-based registration The source code for this example can be found in the file Examples/Patented/EstimateAffineTransformationExample.cxx. This example demonstrates the use of the otb::KeyPointSetsMatchingFilter for transfor- mation estimation between 2 images. The idea here is to match SIFTs extracted from both the fixed and the moving images. The use of SIFTs is demonstrated in section 14.2.2. The
  • 284. 250 Chapter 9. Image Registration otb::LeastSquareAffineTransformEstimator will be used to generate a transformation field by using mean square optimisation to estimate an affine transfrom from the point set. The first step toward the use of these filters is to include the appropriate header files. #include "otbKeyPointSetsMatchingFilter.h" #include "otbImageToSIFTKeyPointSetFilter.h" // Disabling deprecation warning if on visual #ifdef _MSC_VER #pragma warning(push) #pragma warning(disable :4996) #endif #include "itkDeformationFieldSource.h" // Enabling remaining deprecation warning #ifdef _MSC_VER #pragma warning(pop) #endif Then we must decide what pixel type to use for the image. We choose to do all the computations in floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef double RealType; typedef unsigned char OutputPixelType; typedef otb::Image <RealType , Dimension > ImageType ; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; The SIFTs obtained for the matching will be stored in vector form inside a point set. So we need the following types: typedef itk::VariableLengthVector <RealType > RealVectorType; typedef itk::PointSet <RealVectorType , Dimension > PointSetType; The filter for computing the SIFTs has a type defined as follows: typedef otb::ImageToSIFTKeyPointSetFilter<ImageType , PointSetType > ImageToSIFTKeyPointSetFilterType; Although many choices for evaluating the distances during the matching procedure exist, we choose here to use a simple Euclidean distance. We can then define the type for the matching filter. typedef itk::Statistics ::EuclideanDistance <RealVectorType > DistanceType; typedef otb::KeyPointSetsMatchingFilter<PointSetType , DistanceType > EuclideanDistanceMatchingFilterType; The following types are needed for dealing with the matched points. typedef PointSetType::PointType PointType ; typedef std::pair <PointType , PointType > MatchType ; typedef std::vector <MatchType > MatchVectorType; typedef EuclideanDistanceMatchingFilterType:: LandmarkListType
  • 285. 9.9. Landmark-based registration 251 LandmarkListType; typedef PointSetType::PointsContainer PointsContainerType; typedef PointsContainerType::Iterator PointsIteratorType; typedef PointSetType::PointDataContainer PointDataContainerType; typedef PointDataContainerType::Iterator PointDataIteratorType; We define the type for the image reader. typedef otb::ImageFileReader <ImageType > ReaderType ; Two readers are instantiated : one for the fixed image, and one for the moving image. ReaderType ::Pointer fixedReader = ReaderType ::New (); ReaderType ::Pointer movingReader = ReaderType ::New (); fixedReader ->SetFileName (fixedfname ); movingReader ->SetFileName (movingfname ); fixedReader ->UpdateOutputInformation(); movingReader ->UpdateOutputInformation(); We will now instantiate the 2 SIFT filters and the filter used for the matching of the points. ImageToSIFTKeyPointSetFilterType::Pointer filter1 = ImageToSIFTKeyPointSetFilterType::New(); ImageToSIFTKeyPointSetFilterType::Pointer filter2 = ImageToSIFTKeyPointSetFilterType::New(); EuclideanDistanceMatchingFilterType::Pointer euclideanMatcher = EuclideanDistanceMatchingFilterType::New (); We plug the pipeline and set the parameters. filter1 ->SetInput(fixedReader ->GetOutput ()); filter2 ->SetInput(movingReader ->GetOutput ()); filter1 ->SetOctavesNumber(octaves ); filter1 ->SetScalesNumber(scales); filter1 ->SetDoGThreshold(threshold ); filter1 ->SetEdgeThreshold(ratio); filter2 ->SetOctavesNumber(octaves ); filter2 ->SetScalesNumber(scales); filter2 ->SetDoGThreshold(threshold ); filter2 ->SetEdgeThreshold(ratio); We use a simple Euclidean distance to select matched points. euclideanMatcher ->SetInput1 (filter1 ->GetOutput ()); euclideanMatcher ->SetInput2 (filter2 ->GetOutput ());
euclideanMatcher->SetDistanceThreshold(secondOrderThreshold);
euclideanMatcher->SetUseBackMatching(useBackMatching);
euclideanMatcher->Update();

The matched points will be stored into a landmark list.

LandmarkListType::Pointer landmarkList;
landmarkList = euclideanMatcher->GetOutput();

Apply the mean square estimation algorithm:

typedef itk::Point<double, 2> MyPointType;
typedef otb::LeastSquareAffineTransformEstimator<MyPointType> EstimatorType;

// instantiation of the estimator of the affine transformation
EstimatorType::Pointer estimator = EstimatorType::New();

std::cout << "landmark list size " << landmarkList->Size() << std::endl;

for (LandmarkListType::Iterator it = landmarkList->Begin();
     it != landmarkList->End(); ++it)
  {
  estimator->AddTiePoints(it.Get()->GetPoint1(), it.Get()->GetPoint2());
  }

// Trigger computation
estimator->Compute();

Resample the initial image with the estimated transformation. It is common, as the last step of a registration task, to use the resulting transform to map the moving image into the fixed image space. This is easily done with the itk::ResampleImageFilter. First, a ResampleImageFilter type is instantiated using the image types.

typedef itk::ResampleImageFilter<ImageType, OutputImageType> ResampleFilterType;

A resampling filter is created and the moving image is connected as its input.

ResampleFilterType::Pointer resampler = ResampleFilterType::New();
resampler->SetInput(movingReader->GetOutput());

The transform that is produced as output does not need to be inverted, because we apply the resampling algorithm to the "moving" image in order to produce it in the geometry of the fixed image.

ImageType::Pointer fixedImage = fixedReader->GetOutput();
resampler->SetTransform(estimator->GetAffineTransform());
resampler->SetSize(fixedImage->GetLargestPossibleRegion().GetSize());
resampler->SetOutputOrigin(fixedImage->GetOrigin());
resampler->SetOutputSpacing(fixedImage->GetSpacing());
resampler->SetDefaultPixelValue(100);
Figure 9.18: From left to right and top to bottom: fixed input image, moving image, resampled moving image.

Write the resampled image to a PNG file:

typedef otb::ImageFileWriter<OutputImageType> WriterType;
WriterType::Pointer writer = WriterType::New();
writer->SetInput(resampler->GetOutput());
writer->SetFileName(outputImageFilename);
writer->Update();

Figure 9.18 shows the result of resampling the moving image with the transformation estimated from the SIFT points.
  • 289. CHAPTER TEN DISPARITY MAP ESTIMATION This chapter introduces the tools available in OTB for the estimation of geometric disparities be- tween images. 10.1 Disparity Maps The problem we want to deal with is the one of the automatic disparity map estimation of images acquired with different sensors. By different sensors, we mean sensors which produce images with different radiometric properties, that is, sensors which measure different physical magnitudes: optical sensors operating in different spectral bands, radar and optical sensors, etc. For this kind of image pairs, the classical approach of fine correlation [82, 44], can not always be used to provide the required accuracy, since this similarity measure (the correlation coefficient) can only measure similarities up to an affine transformation of the radiometries. There are two main questions which can be asked about what we want to do: 1. Can we define what the similarity is between, for instance, a radar and an optical image? 2. What does fine registration mean in the case where the geometric distortions are so big and the source of information can be located in different places (for instance, the same edge can be produced by the edge of the roof of a building in an optical image and by the wall-ground bounce in a radar image)? We can answer by saying that the images of the same object obtained by different sensors are two different representations of the same reality. For the same spatial location, we have two different measures. Both informations come from the same source and thus they have a lot of common information. This relationship may not be perfect, but it can be evaluated in a relative way: different
  • 290. 256 Chapter 10. Disparity Map Estimation geometrical distortions are compared and the one leading to the strongest link between the two measures is kept. When working with images acquired with the same (type of) sensor one can use a very effective approach. Since a correlation coefficient measure is robust and fast for similar images, one can afford to apply it in every pixel of one image in order to search for the corresponding HP in the other image. One can thus build a deformation grid (a sampling of the deformation map). If the sampling step of this grid is short enough, the interpolation using an analytical model is not needed and high frequency deformations can be estimated. The obtained grid can be used as a re-sampling grid and thus obtain the registered images. No doubt, this approach, combined with image interpolation techniques (in order to estimate sub-pixel deformations) and multi-resolution strategies allows for obtaining the best performances in terms of deformation estimation, and hence for the automatic image registration. Unfortunately, in the multi-sensor case, the correlation coefficient can not be used. We will thus try to find similarity measures which can be applied in the multi-sensor case with the same approach as the correlation coefficient. We start by giving several definitions which allow for the formalization of the image registration problem. First of all, we define the master image and the slave image: Definition 1 Master image: image to which other images will be registered; its geometry is consid- ered as the reference. Definition 2 Slave image: image to be geometrically transformed in order to be registered to the master image. Two main concepts are the one of similarity measure and the one of geometric transformation: Definition 3 Let I and J be two images and let c a similarity criterion, we call similarity measure any scalar, strictly positive function Sc(I,J) = f(I,J,c). (10.1) Sc has an absolute maximum when the two images I and J are identical in the sense of the criterion c. Definition 4 A geometric transformation T is an operator which, applied to the coordinates (x,y) of a point in the slave image, gives the coordinates (u,v) of its HP in the master image:
  • 291. 10.1. Disparity Maps 257 u v = T x y (10.2) Finally we introduce a definition for the image registration problem: Definition 5 Registration problem: 1. determine a geometric transformation T which maximizes the similarity between a master image I and the result of the transformation T ◦ J: Argmax T (Sc(I,T ◦ J)); (10.3) 2. re-sampling of J by applying T. 10.1.1 Geometric deformation modeling The geometric transformation of definition 4 is used for the correction of the existing deformation between the two images to be registered. This deformation contains informations which are linked to the observed scene and the acquisition conditions. They can be classified into 3 classes depending on their physical source: 1. deformations linked to the mean attitude of the sensor (incidence angle, presence or absence of yaw steering, etc.); 2. deformations linked to a stereo vision (mainly due to the topography); 3. deformations linked to attitude evolution during the acquisition (vibrations which are mainly present in push-broom sensors). These deformations are characterized by their spatial frequencies and intensities which are summa- rized in table 10.1. Depending on the type of deformation to be corrected, its model will be different. For example, if the only deformation to be corrected is the one introduced by the mean attitude, a physical model Intensity Spatial Frequency Mean Attitude Strong Low Stereo Medium High and Medium Attitude evolution Low Low to Medium Table 10.1: Characterization of the geometric deformation sources
for the acquisition geometry (independent of the image contents) will be enough. If the sensor is not well known, this deformation can be approximated by a simple analytical model. When the deformations to be modeled are high frequency, analytical (parametric) models are not suitable for a fine registration. In this case, one has to use a fine sampling of the deformation, which means using deformation grids. These grids give, for a set of pixels of the master image, their location in the slave image.

The following points summarize the problem of deformation modeling:
1. An analytical model is just an approximation of the deformation. It is often obtained as follows:
   (a) Directly from a physical model, without using any image content information.
   (b) By estimation of the parameters of an a priori model (polynomial, affine, etc.). These parameters can be estimated:
       i. either by solving the equations obtained by taking HP (the HP can be manually or automatically extracted);
       ii. or by maximization of a global similarity measure.
2. A deformation grid is a sampling of the deformation map.

The last point implies that the sampling period of the grid must be short enough in order to account for high-frequency deformations (Shannon theorem). Of course, if the deformations are non-stationary (which is usually the case for topographic deformations), the sampling can be irregular. A small sketch of how a dense displacement is recovered from such a grid is given below, after Table 10.2.

As a conclusion, we can say that definition 5 poses the registration problem as an optimization problem. This optimization can be either global or local, with a similarity measure which can also be either local or global. All this is synthesized in Table 10.2.

Table 10.2: Approaches to image registration

    Geometric model of the deformation     Similarity measure   Optimization
    Physical model                         None                 Global
    Analytical model with a priori HP      Local                Global
    Analytical model without a priori HP   Global               Global
    Grid                                   Local                Local
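To make the notion of deformation grid more concrete, here is a minimal sketch of how a dense displacement can be recovered from a coarsely sampled grid by bilinear interpolation. The container and function names are hypothetical and this is not OTB code (OTB provides dedicated filters for this purpose); it only illustrates the sampling argument made above.

#include <cstddef>
#include <vector>

// A deformation grid: displacement samples (dx, dy) stored every `step` pixels.
// The grid is assumed to have at least 2 x 2 nodes.
struct DeformationGrid
{
  std::size_t gridWidth;
  std::size_t gridHeight;
  double      step;                 // grid sampling period, in pixels
  std::vector<double> dx, dy;       // node displacements, row-major

  double DxAt(std::size_t i, std::size_t j) const { return dx[j * gridWidth + i]; }
  double DyAt(std::size_t i, std::size_t j) const { return dy[j * gridWidth + i]; }
};

// Bilinear interpolation of the grid at an arbitrary pixel (x, y) of the master image:
// returns the displacement pointing towards the homologous point in the slave image.
void InterpolateDisplacement(const DeformationGrid& g, double x, double y,
                             double& outDx, double& outDy)
{
  // Continuous position expressed in grid coordinates.
  double gx = x / g.step;
  double gy = y / g.step;

  // Indices of the top-left node, clamped so that (i0+1, j0+1) stays inside the grid.
  std::size_t i0 = static_cast<std::size_t>(gx);
  std::size_t j0 = static_cast<std::size_t>(gy);
  if (i0 >= g.gridWidth  - 1) i0 = g.gridWidth  - 2;
  if (j0 >= g.gridHeight - 1) j0 = g.gridHeight - 2;

  // Fractional offsets inside the grid cell.
  double fx = gx - static_cast<double>(i0);
  double fy = gy - static_cast<double>(j0);

  // Bilinear blend of the four surrounding node values.
  auto blend = [&](double v00, double v10, double v01, double v11)
  {
    return (1 - fx) * (1 - fy) * v00 + fx * (1 - fy) * v10
         + (1 - fx) * fy * v01       + fx * fy * v11;
  };

  outDx = blend(g.DxAt(i0, j0), g.DxAt(i0 + 1, j0), g.DxAt(i0, j0 + 1), g.DxAt(i0 + 1, j0 + 1));
  outDy = blend(g.DyAt(i0, j0), g.DyAt(i0 + 1, j0), g.DyAt(i0, j0 + 1), g.DyAt(i0 + 1, j0 + 1));
}

If the grid step is small enough with respect to the highest spatial frequency of the deformation, this interpolation introduces no significant error, which is exactly the sampling condition stated above.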
The ideal approach would consist in a registration which is locally optimized, both in similarity and deformation, in order to have the best registration quality. This is the case when deformation grids with dense sampling are used. Unfortunately, this case is the most computationally heavy, and one often uses either a low sampling rate of the grid, or the evaluation of the similarity on a small set of pixels for the estimation of an analytical model. Both of these choices lead to local registration errors which, depending on the topography, can amount to several pixels. Even if this registration accuracy can be enough for many applications (ortho-registration, import into a GIS, etc.), it is not acceptable in the case of data fusion, multi-channel segmentation or change detection [133]. This is why we will focus on the problem of deformation estimation using dense grids.

10.1.2 Similarity measures

The fine modeling of the geometric deformation we are looking for requires the estimation of the coordinates of nearly every pixel of the master image inside the slave image. In the classical mono-sensor case, where we use the correlation coefficient, we proceed as follows.

The geometric deformation is modeled by local rigid displacements. One wants to estimate the coordinates of each pixel of the master image inside the slave image. This can be represented by a displacement vector associated to every pixel of the master image. Each of the two components (lines and columns) of this vector field will be called a deformation grid.

We use a small window taken in the master image and we test the similarity for every possible shift within an exploration area inside the slave image (figure 10.1). That means that for each position we compute the correlation coefficient. The result is a correlation surface whose maximum gives the most likely local shift between both images:

\rho_{I,J}(\Delta x, \Delta y) = \frac{1}{N} \frac{\sum_{x,y}\big(I(x,y) - m_I\big)\big(J(x+\Delta x, y+\Delta y) - m_J\big)}{\sigma_I \sigma_J}.   (10.4)

In this expression, N is the number of pixels of the analysis window, m_I and m_J are the estimated mean values inside the analysis window of image I and image J respectively, and \sigma_I and \sigma_J are their standard deviations. Quality criteria can be applied to the estimated maximum in order to give a confidence factor to the estimated shift: width of the peak, maximum value, etc. A minimal standalone sketch of this exhaustive search is given below.
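Before looking at the OTB filters that implement this search, here is a minimal, self-contained sketch of the procedure of figure 10.1; the image structure and function names are hypothetical, not OTB classes. For a window of radius r centred on a master pixel, it scans every integer shift of the exploration area in the slave image, computes the correlation coefficient of equation (10.4), and keeps the shift giving the highest value.

#include <cmath>
#include <cstddef>
#include <vector>

// A trivial grey-level image: row-major pixel buffer.
struct SimpleImage
{
  std::size_t width, height;
  std::vector<double> data;
  double At(long x, long y) const { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Correlation coefficient (eq. 10.4) between the (2r+1)x(2r+1) window of the master
// image centered on (xm, ym) and the window of the slave image centered on
// (xm + dx, ym + dy). Both windows are assumed to lie inside their images.
double Correlation(const SimpleImage& master, const SimpleImage& slave,
                   long xm, long ym, long dx, long dy, long r)
{
  const double n = static_cast<double>((2 * r + 1) * (2 * r + 1));
  double sumI = 0, sumJ = 0, sumII = 0, sumJJ = 0, sumIJ = 0;
  for (long y = -r; y <= r; ++y)
  {
    for (long x = -r; x <= r; ++x)
    {
      const double i = master.At(xm + x, ym + y);
      const double j = slave.At(xm + dx + x, ym + dy + y);
      sumI += i;   sumJ += j;
      sumII += i * i;   sumJJ += j * j;   sumIJ += i * j;
    }
  }
  const double mI = sumI / n, mJ = sumJ / n;
  const double sI = std::sqrt(sumII / n - mI * mI);
  const double sJ = std::sqrt(sumJJ / n - mJ * mJ);
  if (sI == 0.0 || sJ == 0.0) return 0.0;   // flat window: correlation undefined
  return (sumIJ / n - mI * mJ) / (sI * sJ);
}

// Exhaustive search over the exploration area [-search, +search] in both directions.
// Returns the integer shift maximizing the correlation surface.
void FindBestShift(const SimpleImage& master, const SimpleImage& slave,
                   long xm, long ym, long r, long search,
                   long& bestDx, long& bestDy, double& bestRho)
{
  bestRho = -2.0;   // below any possible correlation value
  for (long dy = -search; dy <= search; ++dy)
  {
    for (long dx = -search; dx <= search; ++dx)
    {
      const double rho = Correlation(master, slave, xm, ym, dx, dy, r);
      if (rho > bestRho) { bestRho = rho; bestDx = dx; bestDy = dy; }
    }
  }
}

Repeating this search for every pixel (or every grid node) of the master image yields the deformation grid discussed above; sub-pixel accuracy is then obtained by interpolation, which the OTB filters presented in the next sections handle in an optimized, streamable way.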
Figure 10.1: Estimation of the correlation surface. (Diagram: candidate points and the estimation window in the reference image, the search window in the secondary image, and the similarity estimation and optimization steps yielding the optimum ∆x, ∆y.)

Sub-pixel shifts can be measured by applying fractional shifts to the sliding window. This can be done by image interpolation.

The interesting parameters of the procedure are:
• The size of the exploration area: it determines the computational load of the algorithm (we want to reduce it), but it has to be large enough in order to cope with large deformations.
• The size of the sliding window: the robustness of the correlation coefficient estimation increases with the window size, but the hypothesis of local rigid shifts may not be valid for large windows.

The correlation coefficient cannot be used with original grey-level images in the multi-sensor case. It could be used on extracted features (edges, etc.), but the feature extraction can introduce localization errors. Also, when the images come from sensors using very different modalities, it can be difficult to find similar features in both images. In this case, one can try to find the similarity at the pixel level, but with other similarity measures, and apply the same approach as we have just described.

The concept of similarity measure has been presented in definition 3. The difficulty of the procedure lies in finding the function f which properly represents the criterion c. We also need f to be easily and robustly estimated with small windows. We extend here what we proposed in [68]. A sketch of one such measure, mutual information, is given below.
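As an example of a similarity measure that does not assume a linear relationship between the grey levels of the two images — and which is therefore a classical choice in the multi-sensor case — the following sketch estimates the mutual information of two windows from their joint histogram. This is generic, illustrative code under simplifying assumptions (equally sized windows, pixel values inside [minVal, maxVal]); it is not the estimator implemented by the ITK metric classes used later in this chapter.

#include <cmath>
#include <cstddef>
#include <vector>

// Mutual information of two equally sized pixel windows, estimated from a joint
// histogram with `bins` x `bins` cells.
double MutualInformation(const std::vector<double>& winI, const std::vector<double>& winJ,
                         std::size_t bins, double minVal, double maxVal)
{
  const std::size_t n = winI.size();   // winJ is assumed to have the same size
  std::vector<double> joint(bins * bins, 0.0), pI(bins, 0.0), pJ(bins, 0.0);

  // Map a pixel value to a histogram bin, clamping the upper edge.
  auto toBin = [&](double v)
  {
    double t = (v - minVal) / (maxVal - minVal);
    std::size_t b = static_cast<std::size_t>(t * static_cast<double>(bins));
    return b >= bins ? bins - 1 : b;
  };

  // Accumulate the joint and marginal histograms, normalized to probabilities.
  const double w = 1.0 / static_cast<double>(n);
  for (std::size_t k = 0; k < n; ++k)
  {
    const std::size_t bi = toBin(winI[k]);
    const std::size_t bj = toBin(winJ[k]);
    joint[bi * bins + bj] += w;
    pI[bi] += w;
    pJ[bj] += w;
  }

  // MI = sum_ij p(i,j) * log( p(i,j) / (p(i) * p(j)) )
  double mi = 0.0;
  for (std::size_t i = 0; i < bins; ++i)
  {
    for (std::size_t j = 0; j < bins; ++j)
    {
      const double pij = joint[i * bins + j];
      if (pij > 0.0) mi += pij * std::log(pij / (pI[i] * pJ[j]));
    }
  }
  return mi;   // higher values indicate a stronger statistical dependence
}

Plugged into the exhaustive search sketched above in place of the correlation coefficient, this measure produces a similarity surface whose maximum can be exploited in exactly the same way.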
10.1.3 The correlation coefficient

We recall here the computation of the correlation coefficient between two image windows I and J. The coordinates of the pixels inside the windows are represented by (x,y):

\rho(I,J) = \frac{1}{N} \frac{\sum_{x,y}\big(I(x,y) - m_I\big)\big(J(x,y) - m_J\big)}{\sigma_I \sigma_J}.   (10.5)

In order to qualitatively characterize the different similarity measures, we propose the following experiment. We take two images which are perfectly registered and we extract a small window of size N × M from each of the images (this size is set to 101 × 101 for this experiment). For the master image, the window will be centered on coordinates (x0, y0) (the center of the image) and for the slave image, it will be centered on coordinates (x0 + ∆x, y0). With different values of ∆x (from -10 pixels to 10 pixels in our experiments), we obtain an estimate of ρ(I,J) as a function of ∆x, which we write as ρ(∆x) for short. The obtained curve should have a maximum for ∆x = 0, since the images are perfectly registered. We would also like to have an absolute maximum with a high value and a sharp peak, in order to have a good precision for the shift estimate.

10.2 Regular grid disparity map estimation

The source code for this example can be found in the file
Examples/DisparityMap/FineRegistrationImageFilterExample.cxx.

This example demonstrates the use of the otb::FineRegistrationImageFilter. This filter performs deformation estimation by looking for the extremum of an image-to-image metric inside a search window.

The first step toward the use of these filters is to include the proper header files.

#include "otbFineRegistrationImageFilter.h"

Several types of otb::Image are required to represent the input image, the metric field, and the deformation field.

typedef otb::Image<PixelType, ImageDimension>            InputImageType;
typedef otb::Image<PixelType, ImageDimension>            MetricImageType;
typedef otb::Image<DeformationPixelType, ImageDimension> DeformationFieldType;

To make the metric estimation more robust, the first required step is to blur the input images. This is done using the itk::RecursiveGaussianImageFilter:

typedef itk::RecursiveGaussianImageFilter<InputImageType,
    InputImageType> InputBlurType;
InputBlurType::Pointer fBlur = InputBlurType::New();
fBlur->SetInput(fReader->GetOutput());
fBlur->SetSigma(atof(argv[7]));

InputBlurType::Pointer mBlur = InputBlurType::New();
mBlur->SetInput(mReader->GetOutput());
mBlur->SetSigma(atof(argv[7]));

Now, we declare and instantiate the otb::FineRegistrationImageFilter which is going to perform the registration:

typedef otb::FineRegistrationImageFilter<InputImageType,
    MetricImageType,
    DeformationFieldType>
RegistrationFilterType;

RegistrationFilterType::Pointer registrator = RegistrationFilterType::New();
registrator->SetMovingInput(mBlur->GetOutput());
registrator->SetFixedInput(fBlur->GetOutput());

Some parameters need to be specified to the filter:

• The area where the search is performed. This area is defined by its radius:

typedef RegistrationFilterType::SizeType RadiusType;
RadiusType searchRadius;
searchRadius[0] = atoi(argv[8]);
searchRadius[1] = atoi(argv[8]);
registrator->SetSearchRadius(searchRadius);

• The window used to compute the local metric. This window is also defined by its radius:

RadiusType metricRadius;
metricRadius[0] = atoi(argv[9]);
metricRadius[1] = atoi(argv[9]);
registrator->SetRadius(metricRadius);

We also need to set the sub-pixel accuracy we want to obtain.

The default matching metric used by the otb::FineRegistrationImageFilter is the standard correlation. However, we may also use any other image-to-image metric provided by ITK. For instance, here is how we would use the itk::MeanReciprocalSquareDifferenceImageToImageMetric (do not forget to include the proper header).

typedef itk::MeanReciprocalSquareDifferenceImageToImageMetric
<InputImageType, InputImageType> MRSDMetricType;

MRSDMetricType::Pointer mrsdMetric = MRSDMetricType::New();
registrator->SetMetric(mrsdMetric);

The itk::MeanReciprocalSquareDifferenceImageToImageMetric produces low values for poor matches; therefore, the filter has to maximize the metric:

registrator->MinimizeOff();

The execution of the otb::FineRegistrationImageFilter will be triggered by the Update() call on the writer at the end of the pipeline. Make sure to use an otb::ImageFileWriter if you want to benefit from the streaming features.

Figure 10.2 shows the result of applying the otb::FineRegistrationImageFilter.

Figure 10.2: From left to right and top to bottom: fixed input image, moving image with a low stereo angle, local correlation field, local mean reciprocal square difference field, resampled image based on correlation, resampled image based on mean reciprocal square difference, estimated epipolar deformation using correlation, estimated epipolar deformation using mean reciprocal square difference.

10.3 Irregular grid disparity map estimation

Taking figure 10.1 as a starting point, we can generalize the approach by letting the user choose:
• the similarity measure;
• the geometric transform to be estimated (see definition 4).

In order to do this, we will use the ITK registration framework locally on a set of nodes. Once the disparity is estimated on a set of nodes, we will use it to generate a deformation field: the dense,
  • 298. 264 Chapter 10. Disparity Map Estimation regular vector field which gives the translation to be applied to a pixel of the secondary image to be positioned on its homologous point of the master image. The source code for this example can be found in the file Examples/DisparityMap/SimpleDisparityMapEstimationExample.cxx. This example demonstrates the use of the otb::DisparityMapEstimationMethod, along with the otb::NearestPointDeformationFieldGenerator. The first filter performs deformation estima- tion according to a given transform, using embedded ITK registration framework. It takes as input a possibly non regular point set and produces a point set with associated point data representing the deformation. The second filter generates a deformation field by using nearest neighbor interpolation on the de- formation values from the point set. More advanced methods for deformation field interpolation are also available. The first step toward the use of these filters is to include the proper header files. #include "otbDisparityMapEstimationMethod.h" #include "itkTranslationTransform.h" #include "itkNormalizedCorrelationImageToImageMetric.h" #include "itkWindowedSincInterpolateImageFunction.h" #include "itkZeroFluxNeumannBoundaryCondition.h" #include "itkGradientDescentOptimizer.h" #include "otbBSplinesInterpolateDeformationFieldGenerator.h" #include "otbWarpImageFilter.h" Then we must decide what pixel type to use for the image. We choose to do all the computation in floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef double PixelType ; typedef unsigned char OutputPixelType; The images are defined using the pixel type and the dimension. Please note that the otb::NearestPointDeformationFieldGenerator generates a otb::VectorImage to represent the deformation field in both image directions. typedef otb::Image <PixelType , Dimension > ImageType ; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; The next step is to define the transform we have chosen to model the deformation. In this example the deformation is modeled as a itk::TranslationTransform. typedef itk::TranslationTransform <double, Dimension > TransformType; typedef TransformType::ParametersType ParametersType; Then we define the metric we will use to evaluate the local registration be- tween the fixed and the moving image. In this example we choosed the itk::NormalizedCorrelationImageToImageMetric.
  • 299. 10.3. Irregular grid disparity map estimation 265 typedef itk::NormalizedCorrelationImageToImageMetric<ImageType , ImageType > MetricType ; Disparity map estimation implies evaluation of the moving image at non-grid po- sition. Therefore, an interpolator is needed. In this example we choosed the itk::WindowedSincInterpolateImageFunction. typedef itk::Function ::HammingWindowFunction <3> WindowFunctionType; typedef itk::ZeroFluxNeumannBoundaryCondition<ImageType > ConditionType; typedef itk::WindowedSincInterpolateImageFunction<ImageType , 3, WindowFunctionType , ConditionType , double> InterpolatorType; To perform local registration, an optimizer is needed. In this example we choosed the itk::GradientDescentOptimizer. typedef itk:: GradientDescentOptimizer OptimizerType; Now we will define the point set to represent the point where to compute local disparity. typedef itk::PointSet <ParametersType , Dimension > PointSetType; Now we define the disparity map estimation filter. typedef otb::DisparityMapEstimationMethod<ImageType , ImageType , PointSetType > DMEstimationType; typedef DMEstimationType::SizeType SizeType; The input image reader also has to be defined. typedef otb::ImageFileReader <ImageType > ReaderType ; Two readers are instantiated : one for the fixed image, and one for the moving image. ReaderType ::Pointer fixedReader = ReaderType ::New (); ReaderType ::Pointer movingReader = ReaderType ::New (); fixedReader ->SetFileName (argv[1]); movingReader ->SetFileName (argv[2]); fixedReader ->UpdateOutputInformation(); movingReader ->UpdateOutputInformation(); We will the create a regular point set where to compute the local disparity. SizeType fixedSize = fixedReader ->GetOutput ()->GetLargestPossibleRegion(). GetSize (); unsigned int NumberOfXNodes = (fixedSize [0] - 2 * atoi(argv[7]) - 1) / atoi(argv[5]);
  • 300. 266 Chapter 10. Disparity Map Estimation unsigned int NumberOfYNodes = (fixedSize [1] - 2 * atoi(argv[7]) - 1) / atoi(argv[6]); ImageType ::IndexType firstNodeIndex; firstNodeIndex[0] = atoi(argv[7]); firstNodeIndex[1] = atoi(argv[7]); PointSetType::Pointer nodes = PointSetType::New(); unsigned int nodeCounter = 0; for (unsigned int x = 0; x < NumberOfXNodes; x++) { for (unsigned int y = 0; y < NumberOfYNodes; y++) { PointType p; p[0] = firstNodeIndex[0] + x*atoi(argv[5]); p[1] = firstNodeIndex[1] + y*atoi(argv[6]); nodes ->SetPoint (nodeCounter ++, p); } } We build the transform, interpolator, metric and optimizer for the disparity map estimation filter. TransformType::Pointer transform = TransformType::New(); OptimizerType::Pointer optimizer = OptimizerType::New(); optimizer ->MinimizeOn (); optimizer ->SetLearningRate(atof(argv[9])); optimizer ->SetNumberOfIterations(atoi(argv[10])); InterpolatorType::Pointer interpolator = InterpolatorType::New (); MetricType ::Pointer metric = MetricType ::New (); metric ->SetSubtractMean(true); We then set up the disparity map estimation filter. This filter will perform a local registration at each point of the given point set using the ITK registration framework. It will produce a point set whose point data reflects the disparity locally around the associated point. Point data will contains the following data : 1. The final metric value found in the registration process, 2. the deformation value in the first image direction, 3. the deformation value in the second image direction, 4. the final parameters of the transform. Please note that in the case of a itk::TranslationTransform, the deformation values and the transform parameters are the same.
  • 301. 10.3. Irregular grid disparity map estimation 267 DMEstimationType::Pointer dmestimator = DMEstimationType::New(); dmestimator ->SetTransform(transform ); dmestimator ->SetOptimizer(optimizer ); dmestimator ->SetInterpolator(interpolator); dmestimator ->SetMetric (metric); SizeType windowSize , explorationSize; explorationSize.Fill(atoi(argv[7])); windowSize .Fill(atoi(argv[8])); dmestimator ->SetWinSize (windowSize ); dmestimator ->SetExploSize(explorationSize); The initial transform parameters can be set via the SetInitialTransformParameters() method. In our case, we simply fill the parameter array with null values. DMEstimationType:: ParametersType initialParameters(transform ->GetNumberOfParameters()); initialParameters[0] = 0.0; initialParameters[1] = 0.0; dmestimator ->SetInitialTransformParameters(initialParameters); Now we can set the input for the deformation field estimation filter. Fixed image can be set using the SetFixedImage() method, moving image can be set using the SetMovingImage(), and input point set can be set using the SetPointSet() method. dmestimator ->SetFixedImage(fixedReader ->GetOutput ()); dmestimator ->SetMovingImage(movingReader ->GetOutput ()); dmestimator ->SetPointSet (nodes); Once the estimation has been performed by the otb::DisparityMapEstimationMethod, one can generate the associated deformation field (that means translation in first and second image direction). It will be represented as a otb::VectorImage. typedef otb::VectorImage <PixelType , Dimension > DeformationFieldType; For the deformation field estimation, we will use the otb::BSplinesInterpolateDeformationFieldGenerato This filter will perform a nearest neighbor interpolation on the deformation values in the point set data. typedef otb:: BSplinesInterpolateDeformationFieldGenerator<PointSetType , DeformationFieldType > GeneratorType; The disparity map estimation filter is instanciated. GeneratorType::Pointer generator = GeneratorType::New();
  • 302. 268 Chapter 10. Disparity Map Estimation We must then specify the input point set using the SetPointSet() method. generator ->SetPointSet (dmestimator ->GetOutput ()); One must also specify the origin, size and spacing of the output deformation field. generator ->SetOutputOrigin(fixedReader ->GetOutput ()->GetOrigin ()); generator ->SetOutputSpacing(fixedReader ->GetOutput ()->GetSpacing ()); generator ->SetOutputSize(fixedReader ->GetOutput () ->GetLargestPossibleRegion(). GetSize ()); The local registration process can lead to wrong deformation values and transform parameters. To Select only points in point set for which the registration process was succesful, one can set a thresh- old on the final metric value : points for which the absolute final metric value is below this threshold will be discarded. This threshold can be set with the SetMetricThreshold() method. generator ->SetMetricThreshold(atof(argv[11])); The following classes provide similar functionality: • otb::NNearestPointsLinearInterpolateDeformationFieldGenerator • otb::BSplinesInterpolateDeformationFieldGenerator • otb::NearestTransformDeformationFieldGenerator • otb::NNearestTransformsLinearInterpolateDeformationFieldGenerator • otb::BSplinesInterpolateTransformDeformationFieldGenerator Now we can warp our fixed image according to the estimated deformation field. This will be per- formed by the itk::WarpImageFilter. First, we define this filter. typedef otb::WarpImageFilter <ImageType , ImageType , DeformationFieldType > ImageWarperType; Then we instantiate it. ImageWarperType::Pointer warper = ImageWarperType::New(); We set the input image to warp using the SetInput() method, and the deformation field using the SetDeformationField() method. warper ->SetInput(movingReader ->GetOutput ()); warper ->SetDeformationField(generator ->GetOutput ()); warper ->SetOutputOrigin(fixedReader ->GetOutput ()->GetOrigin ()); warper ->SetOutputSpacing(fixedReader ->GetOutput ()->GetSpacing ());
  • 303. 10.4. Stereo reconstruction 269 In order to write the result to a PNG file, we will rescale it on a proper range. typedef itk::RescaleIntensityImageFilter<ImageType , OutputImageType > RescalerType; RescalerType::Pointer outputRescaler = RescalerType::New (); outputRescaler ->SetInput(warper ->GetOutput ()); outputRescaler ->SetOutputMaximum(255); outputRescaler ->SetOutputMinimum(0); We can now write the image to a file. The filters are executed by invoking the Update() method. typedef otb::ImageFileWriter <OutputImageType > WriterType ; WriterType ::Pointer outputWriter = WriterType ::New (); outputWriter ->SetInput(outputRescaler ->GetOutput ()); outputWriter ->SetFileName (argv[4]); outputWriter ->Update(); We also want to write the deformation field along the first direction to a file. To achieve this we will use the otb::MultiToMonoChannelExtractROI filter. typedef otb::MultiToMonoChannelExtractROI<PixelType , PixelType > ChannelExtractionFilterType; ChannelExtractionFilterType::Pointer channelExtractor = ChannelExtractionFilterType::New (); channelExtractor ->SetInput (generator ->GetOutput ()); channelExtractor ->SetChannel (1); RescalerType::Pointer fieldRescaler = RescalerType::New (); fieldRescaler ->SetInput(channelExtractor ->GetOutput ()); fieldRescaler ->SetOutputMaximum(255); fieldRescaler ->SetOutputMinimum(0); WriterType ::Pointer fieldWriter = WriterType ::New (); fieldWriter ->SetInput (fieldRescaler ->GetOutput ()); fieldWriter ->SetFileName (argv[3]); fieldWriter ->Update(); Figure 10.3 shows the result of applying disparity map estimation on a stereo pair using a regular point set, followed by deformation field estimation using Splines and fixed image resampling. 10.4 Stereo reconstruction The source code for this example can be found in the file Examples/DisparityMap/StereoReconstructionExample.cxx.
  • 304. 270 Chapter 10. Disparity Map Estimation Figure 10.3: From left to right and top to bottom: fixed input image, moving image with a sinusoid deformation, estimated deformation field in the horizontal direction, resampled moving image. This example demonstrates the use of the stereo reconstruction chain from an image pair. The images are assumed to come from the same sensor but with different positions. The approach presented here has the following steps: • Epipolar resampling of the image pair • Dense disparity map estimation • Projection of the disparities on an existing Digital Elevation Model (DEM) It is important to note that this method requires the sensor models with a pose estimate for each image. #include "otbStereorectificationDeformationFieldSource.h" #include "otbStreamingWarpImageFilter.h" #include "otbPixelWiseBlockMatchingImageFilter.h" #include "otbBandMathImageFilter.h" #include "otbSubPixelDisparityImageFilter.h" #include "otbDisparityMapMedianFilter.h" #include "otbDisparityMapToDEMFilter.h" #include "otbImage.h" #include "otbVectorImage.h" #include "otbImageFileReader.h" #include "otbImageFileWriter.h" #include "otbBCOInterpolateImageFunction.h" #include "itkVectorCastImageFilter.h" #include "otbImageList.h" #include "otbImageListToVectorImageFilter.h" #include "itkRescaleIntensityImageFilter.h" #include "otbDEMHandler.h" This example demonstrates the use of the following filters : • otb::StereorectificationDeformationFieldSource • otb::StreamingWarpImageFilter • otb::PixelWiseBlockMatchingImageFilter
  • 305. 10.4. Stereo reconstruction 271 • otb::otbSubPixelDisparityImageFilter • otb::otbDisparityMapMedianFilter • otb::DisparityMapToDEMFilter typedef otb:: StereorectificationDeformationFieldSource <FloatImageType ,FloatVectorImageType > DeformationFieldSourceType; typedef itk::Vector <double,2> DeformationType; typedef otb::Image <DeformationType > DeformationFieldType; typedef itk:: VectorCastImageFilter <FloatVectorImageType , DeformationFieldType > DeformationFieldCastFilterType; typedef otb:: StreamingWarpImageFilter <FloatImageType , FloatImageType , DeformationFieldType > WarpFilterType; typedef otb:: BCOInterpolateImageFunction <FloatImageType > BCOInterpolationType; typedef otb::Functor ::NCCBlockMatching <FloatImageType ,FloatImageType > NCCBlockMatchingFunctorType; typedef otb:: PixelWiseBlockMatchingImageFilter <FloatImageType , FloatImageType , FloatImageType , FloatImageType , NCCBlockMatchingFunctorType> NCCBlockMatchingFilterType; typedef otb::BandMathImageFilter <FloatImageType > BandMathFilterType; typedef otb:: SubPixelDisparityImageFilter <FloatImageType , FloatImageType , FloatImageType , FloatImageType , NCCBlockMatchingFunctorType> NCCSubPixelDisparityFilterType; typedef otb:: DisparityMapMedianFilter <FloatImageType , FloatImageType , FloatImageType > MedianFilterType; typedef otb:: DisparityMapToDEMFilter <FloatImageType , FloatImageType , FloatImageType , FloatVectorImageType ,
FloatImageType> DisparityToElevationFilterType;

The image pair is supposed to be in sensor geometry. From two images covering nearly the same area, one can estimate a common epipolar geometry. In this geometry, an altitude variation corresponds to a horizontal shift between the two images. The filter otb::StereorectificationDeformationFieldSource computes the deformation grids for each image. These grids are sampled in epipolar geometry. They have two bands, containing the position offset (in physical space units) between the current epipolar point and the corresponding sensor point in the horizontal and vertical directions. They can be computed at a lower resolution than the sensor resolution. The application StereoRectificationGridGenerator also provides a simple tool to generate the epipolar grids for your image pair.

DeformationFieldSourceType::Pointer m_DeformationFieldSource =
    DeformationFieldSourceType::New();
m_DeformationFieldSource->SetLeftImage(leftReader->GetOutput());
m_DeformationFieldSource->SetRightImage(rightReader->GetOutput());
m_DeformationFieldSource->SetGridStep(4);
m_DeformationFieldSource->SetScale(1.0);
//m_DeformationFieldSource->SetAverageElevation(avgElevation);
m_DeformationFieldSource->Update();

Then, the sensor images can be resampled in epipolar geometry, using the otb::StreamingWarpImageFilter. The application GridBasedImageResampling also gives easy access to this filter. The user can choose the epipolar region to resample, as well as the resampling step and the interpolator. Note that the epipolar image size can be retrieved from the stereo rectification grid filter.

FloatImageType::SpacingType epipolarSpacing;
epipolarSpacing[0] = 1.0;
epipolarSpacing[1] = 1.0;

FloatImageType::SizeType epipolarSize;
epipolarSize = m_DeformationFieldSource->GetRectifiedImageSize();

FloatImageType::PointType epipolarOrigin;
epipolarOrigin[0] = 0.0;
epipolarOrigin[1] = 0.0;

FloatImageType::PixelType defaultValue = 0;

The deformation grids are cast into deformation fields, then the left and right sensor images are resampled.

DeformationFieldCastFilterType::Pointer m_LeftDeformationFieldCaster =
    DeformationFieldCastFilterType::New();
m_LeftDeformationFieldCaster->SetInput(m_DeformationFieldSource->GetLeftDeformationFieldOutput());
m_LeftDeformationFieldCaster->GetOutput()->UpdateOutputInformation();

BCOInterpolationType::Pointer leftInterpolator = BCOInterpolationType::New();
leftInterpolator->SetRadius(2);

WarpFilterType::Pointer m_LeftWarpImageFilter = WarpFilterType::New();
m_LeftWarpImageFilter->SetInput(leftReader->GetOutput());
m_LeftWarpImageFilter->SetDeformationField(m_LeftDeformationFieldCaster->GetOutput());
m_LeftWarpImageFilter->SetInterpolator(leftInterpolator);
m_LeftWarpImageFilter->SetOutputSize(epipolarSize);
m_LeftWarpImageFilter->SetOutputSpacing(epipolarSpacing);
m_LeftWarpImageFilter->SetOutputOrigin(epipolarOrigin);
m_LeftWarpImageFilter->SetEdgePaddingValue(defaultValue);

DeformationFieldCastFilterType::Pointer m_RightDeformationFieldCaster =
    DeformationFieldCastFilterType::New();
m_RightDeformationFieldCaster->SetInput(m_DeformationFieldSource->GetRightDeformationFieldOutput());
m_RightDeformationFieldCaster->GetOutput()->UpdateOutputInformation();

BCOInterpolationType::Pointer rightInterpolator = BCOInterpolationType::New();
rightInterpolator->SetRadius(2);

WarpFilterType::Pointer m_RightWarpImageFilter = WarpFilterType::New();
m_RightWarpImageFilter->SetInput(rightReader->GetOutput());
m_RightWarpImageFilter->SetDeformationField(m_RightDeformationFieldCaster->GetOutput());
m_RightWarpImageFilter->SetInterpolator(rightInterpolator);
m_RightWarpImageFilter->SetOutputSize(epipolarSize);
m_RightWarpImageFilter->SetOutputSpacing(epipolarSpacing);
m_RightWarpImageFilter->SetOutputOrigin(epipolarOrigin);
m_RightWarpImageFilter->SetEdgePaddingValue(defaultValue);

Since the resampling produces black regions around the image, it is useless to estimate disparities in these no-data regions. We use an otb::BandMathImageFilter to produce a mask on the left and right epipolar images.

BandMathFilterType::Pointer m_LBandMathFilter = BandMathFilterType::New();
m_LBandMathFilter->SetNthInput(0, m_LeftWarpImageFilter->GetOutput(), "inleft");
std::string leftExpr = "if(inleft != 0,255,0)";
m_LBandMathFilter->SetExpression(leftExpr);

BandMathFilterType::Pointer m_RBandMathFilter = BandMathFilterType::New();
m_RBandMathFilter->SetNthInput(0, m_RightWarpImageFilter->GetOutput(), "inright");
std::string rightExpr = "if(inright != 0,255,0)";
m_RBandMathFilter->SetExpression(rightExpr);

Once the two sensor images have been resampled in epipolar geometry, the disparity map can be computed. The approach presented here is a 2D matching based on a pixel-wise metric optimization. This approach doesn't give the best results compared to global optimization methods, but it is suitable for streaming and threading on large images. The major filter used for this step is otb::PixelWiseBlockMatchingImageFilter. The metric is computed on a window centered around the tested epipolar position. It performs a pixel-to-pixel matching between the two epipolar images. The output disparities are given as index offsets from the left to the right position. The following features are available in this filter:
• Available metrics: SSD, NCC and Lp pseudo-norm (computed on a square window)
  • 308. 274 Chapter 10. Disparity Map Estimation • Rectangular disparity exploration area. • Input masks for left and right images (optional). • Output metric values (optional). • Possibility to use input disparity estimate (as a uniform value or a full map) and an exploration radius around these values to reduce the size of the exploration area (optional). NCCBlockMatchingFilterType::Pointer m_NCCBlockMatcher = NCCBlockMatchingFilterType::New(); m_NCCBlockMatcher ->SetLeftInput(m_LeftWarpImageFilter ->GetOutput ()); m_NCCBlockMatcher ->SetRightInput(m_RightWarpImageFilter ->GetOutput ()); m_NCCBlockMatcher ->SetRadius (3); m_NCCBlockMatcher ->SetMinimumHorizontalDisparity(-24); m_NCCBlockMatcher ->SetMaximumHorizontalDisparity(0); m_NCCBlockMatcher ->SetMinimumVerticalDisparity(0); m_NCCBlockMatcher ->SetMaximumVerticalDisparity(0); m_NCCBlockMatcher ->MinimizeOff (); m_NCCBlockMatcher ->SetLeftMaskInput(m_LBandMathFilter ->GetOutput ()); m_NCCBlockMatcher ->SetRightMaskInput(m_RBandMathFilter ->GetOutput ()); Some other filters have been added to enhance these pixel-to-pixel disparities. The filter otb::SubPixelDisparityImageFilter can estimate the disparities with sub-pixel precision. Sev- eral interpolation methods can be used : parabolic fit, triangular fit, and dichotomy search. NCCSubPixelDisparityFilterType::Pointer m_NCCSubPixFilter = NCCSubPixelDisparityFilterType::New (); m_NCCSubPixFilter ->SetInputsFromBlockMatchingFilter(m_NCCBlockMatcher); m_NCCSubPixFilter ->SetRefineMethod(NCCSubPixelDisparityFilterType::DICHOTOMY ); The filter otb::DisparityMapMedianFilter can be used to remove outliers. It has two parame- ters: • The radius of the local neighborhood to compute the median • An incoherence threshold to reject disparities whose distance from the local median is superior to the threshold. MedianFilterType::Pointer m_HMedianFilter = MedianFilterType::New(); m_HMedianFilter ->SetInput (m_NCCSubPixFilter ->GetHorizontalDisparityOutput()); m_HMedianFilter ->SetRadius (2); m_HMedianFilter ->SetIncoherenceThreshold(2.0); m_HMedianFilter ->SetMaskInput(m_LBandMathFilter ->GetOutput ()); MedianFilterType::Pointer m_VMedianFilter = MedianFilterType::New(); m_VMedianFilter ->SetInput (m_NCCSubPixFilter ->GetVerticalDisparityOutput()); m_VMedianFilter ->SetRadius (2); m_VMedianFilter ->SetIncoherenceThreshold(2.0); m_VMedianFilter ->SetMaskInput(m_LBandMathFilter ->GetOutput ());
The application PixelWiseBlockMatching contains all these filters and provides a single interface to compute your disparity maps.

The disparity map obtained with the previous step usually gives a good idea of the altitude profile. However, it is more useful to study altitude with a DEM (Digital Elevation Model) representation. The filter otb::DisparityMapToDEMFilter performs this last step. The behavior of this filter is to:
• Compute the DEM extent from the left sensor image envelope (spacing is set by the user)
• Compute the left and right rays corresponding to each valid disparity
• Compute the intersection with the mid-point method
• If the 3D point falls inside a DEM cell and has a greater elevation than the current height, the cell height is updated

The rule of keeping the highest elevation makes sense for buildings seen from the side, because the elevation of the roof edges has to be kept. However, this rule is not suited to noisy disparities. The application DisparityMapToElevationMap also gives an example of use.

DisparityToElevationFilterType::Pointer m_DispToElev = DisparityToElevationFilterType::New();
m_DispToElev->SetHorizontalDisparityMapInput(m_HMedianFilter->GetOutput());
m_DispToElev->SetVerticalDisparityMapInput(m_VMedianFilter->GetOutput());
m_DispToElev->SetLeftInput(leftReader->GetOutput());
m_DispToElev->SetRightInput(rightReader->GetOutput());
m_DispToElev->SetLeftEpipolarGridInput(m_DeformationFieldSource->GetLeftDeformationFieldOutput());
m_DispToElev->SetRightEpipolarGridInput(m_DeformationFieldSource->GetRightDeformationFieldOutput());
m_DispToElev->SetElevationMin(avgElevation - 10.0);
m_DispToElev->SetElevationMax(avgElevation + 80.0);
m_DispToElev->SetDEMGridStep(2.5);
m_DispToElev->SetDisparityMaskInput(m_LBandMathFilter->GetOutput());
//m_DispToElev->SetAverageElevation(avgElevation);

WriterType::Pointer m_DEMWriter = WriterType::New();
m_DEMWriter->SetInput(m_DispToElev->GetOutput());
m_DEMWriter->SetFileName(argv[3]);
m_DEMWriter->Update();

RescalerType::Pointer fieldRescaler = RescalerType::New();
fieldRescaler->SetInput(m_DispToElev->GetOutput());
fieldRescaler->SetOutputMaximum(255);
fieldRescaler->SetOutputMinimum(0);

OutputWriterType::Pointer fieldWriter = OutputWriterType::New();
fieldWriter->SetInput(fieldRescaler->GetOutput());
fieldWriter->SetFileName(argv[4]);
fieldWriter->Update();
Figure 10.4: DEM image estimated from the disparity.

Figure 10.4 shows the result of applying terrain reconstruction based on pixel-wise block matching, sub-pixel interpolation and DEM estimation, using a pair of Pleiades images over the Stadium in Toulouse, France.
CHAPTER ELEVEN

ORTHORECTIFICATION AND MAP PROJECTION

Figure 11.1: Image Ortho-registration Procedure. (Flow chart with blocks: Input Series, Sensor Model, DEM, Homologous Points, Bundle-block Adjustment, Geographic Geometry, Map Projections, Cartographic Geometry.)

This chapter introduces the functionalities available in OTB for image ortho-registration. We define ortho-registration as the procedure that transforms an image in sensor geometry into a geographic or cartographic projection. Figure 11.1 shows a synoptic view of the different steps involved in a classical ortho-registration processing chain able to deal with image series. These steps are the following:
• Sensor modelling: the geometric sensor model converts image coordinates (line, column) into geographic coordinates (latitude, longitude); a rigorous modelling needs a digital elevation model (DEM) in order to take the terrain topography into account.
• Bundle-block adjustment: in the case of image series, the geometric models and their parameters can be refined by using homologous points between the images. This is an optional step and is not currently implemented in OTB.
• Map projection: this step goes from geographic coordinates to some specific cartographic projection such as Lambert, Mercator or UTM.

11.1 Sensor Models

A sensor model is a set of equations giving the relationship between image pixel coordinates (l,c) and ground coordinates (X,Y) for every pixel in the image. Typically, the ground coordinates are given in a geographic projection (latitude, longitude). The sensor model can be expressed either from image to ground – forward model – or from ground to image – inverse model. This can be written as follows:

Forward:   X = f_x(l, c, h, \theta), \quad Y = f_y(l, c, h, \theta)
Inverse:   l = g_l(X, Y, h, \theta), \quad c = g_c(X, Y, h, \theta)

where \theta is the set of parameters which describe the sensor and the acquisition geometry (platform altitude, viewing angle, focal length for optical sensors, Doppler centroid for SAR images, etc.).

In OTB, sensor models are implemented as itk::Transforms (see section 9.6 for details), which is the appropriate way to express coordinate changes. The base class for sensor models is otb::SensorModelBase, from which the classes otb::InverseSensorModel and otb::ForwardSensorModel inherit.

As one may note from the model equations, the height of the ground, h, must be known. Usually, this means that a Digital Elevation Model, DEM, will be used.

11.1.1 Types of Sensor Models

There exist two main types of sensor models. On the one hand, we have the so-called physical models, which are rigorous, complex, possibly highly non-linear equations of the sensor geometry. As
  • 313. 11.1. Sensor Models 279 such, they are difficult to inverse (obtain the inverse model from the forward one and vice-versa). They have the significant advantage of having parameters with physical meaning (angles, distances, etc.). They are specific of each sensor, which means that a library of models is required in the software. A library which has to be updated every time a new sensor is available. On the other hand, we have general analytical models, which approximate the physical models. These models can take the form of polynomials or ratios of polynomials, the so-called rational polynomial functions or Rational Polynomial Coefficients, RPC, also known as Rapid Positioning Capability. Since they are approximations, they are less accurate than the physical models. However, the achieved accuracy is usually high: in the case of Pl´eiades, RPC models have errors lower than 0.02 pixels with respect to the physical model. Since these models have a standard form they are easier to use and implement. However, they have the drawback of having parameters (coefficients, actually) without physical meaning. OTB, through the use of the OSSIM library – http://www.ossim.org – offers models for most of current sensors either through a physical or an analytical approach. This is transparent for the user, since the geometrical model for a given image is instantiated using the information stored in its meta-data. 11.1.2 Using Sensor Models The transformation of an image in sensor geometry to geographic geometry can be done using the following steps. 1. Read image meta-data and instantiate the model with the given parameters. 2. Define the ROI in ground coordinates (this is your output pixel array) 3. Iterate through the pixels of coordinates (X,Y): (a) Get h from the DEM (b) Compute (c,l) = G(X,Y,h,θ) (c) Interpolate pixel values if (c,l) are not grid coordinates. Actually, in OTB, you don’t have to manually instantiate the sensor model which is appropriate to your image. That is, you don’t have to manually choose a SPOT5 or a Quickbird sensor model. This task is automatically performed by the otb::ImageFileReader class in a similar way as the image format recognition is done. The appropriate sensor model will then be included in the image meta-data, so you can access it when needed. The source code for this example can be found in the file Examples/Projections/SensorModelExample.cxx.
  • 314. 280 Chapter 11. Orthorectification and Map Projection This example illustrates how to use the sensor model read from image meta-data in order to perform ortho-rectification. This is a very basic, step-by-step example, so you understand the different com- ponents involved in the process. When performing real ortho-rectifications, you can use the example presented in section 11.3. We will start by including the header file for the inverse sensor model. #include "otbInverseSensorModel.h" As explained before, the first thing to do is to create the sensor model in order to transform the ground coordinates in sensor geometry coordinates. The geoetric model will automatically be created by the image file reader. So we begin by declaring the types for the input image and the image reader. typedef otb::Image <unsigned int, 2> ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (argv[1]); ImageType ::Pointer inputImage = reader ->GetOutput (); We have just instantiated the reader and set the file name, but the image data and meta-data has not yet been accessed by it. Since we need the creation of the sensor model and all the image information (size, spacing, etc.), but we do not want to read the pixel data – it could be huge – we just ask the reader to generate the output information needed. reader ->GenerateOutputInformation(); std::cout << "Original input imagine spacing: " << reader ->GetOutput ()->GetSpacing () << std::endl; We can now instantiate the sensor model – an inverse one, since we want to convert ground coordi- nates to sensor geometry. Note that the choice of the specific model (SPOT5, Ikonos, etc.) is done by the reader and we only need to instantiate a generic model. typedef otb::InverseSensorModel <double> ModelType ; ModelType ::Pointer model = ModelType ::New (); The model is parameterized by passing to it the keyword list containing the needed information. model ->SetImageGeometry(reader ->GetOutput ()->GetImageKeywordlist()); Since we can not be sure that the image we read actually has sensor model information, we must check for the model validity. if (model ->IsValidSensorModel() == false) { std::cerr << "Unable to create a model" << std::endl; return 1; }
  • 315. 11.1. Sensor Models 281 The types for the input and output coordinate points can be now declared. The same is done for the index types. ModelType :: OutputPointType inputPoint ; typedef itk::Point <double, 2> PointType ; PointType outputPoint ; ImageType ::IndexType currentIndex; ImageType ::IndexType currentIndexBis; ImageType ::IndexType pixelIndexBis; We will now create the output image over which we will iterate in order to transform ground coor- dinates to sensor coordinates and get the corresponding pixel values. ImageType ::Pointer outputImage = ImageType ::New(); ImageType ::PixelType pixelValue ; ImageType ::IndexType start; start[0] = 0; start[1] = 0; ImageType ::SizeType size; size[0] = atoi(argv[5]); size[1] = atoi(argv[6]); The spacing in y direction is negative since origin is the upper left corner. ImageType :: SpacingType spacing; spacing [0] = 0.00001; spacing [1] = -0.00001; ImageType ::PointType origin; origin[0] = strtod(argv[3], NULL); //longitude origin[1] = strtod(argv[4], NULL); //latitude ImageType :: RegionType region; region.SetSize(size); region.SetIndex(start); outputImage ->SetOrigin (origin); outputImage ->SetRegions (region); outputImage ->SetSpacing (spacing ); outputImage ->Allocate (); We will now instantiate an extractor filter in order to get input regions by manual tiling. typedef itk::ExtractImageFilter <ImageType , ImageType > ExtractType ; ExtractType ::Pointer extract = ExtractType ::New();
  • 316. 282 Chapter 11. Orthorectification and Map Projection Since the transformed coordinates in sensor geometry may not be integer ones, we will need an interpolator to retrieve the pixel values (note that this assumes that the input image was correctly sampled by the acquisition system). typedef itk::LinearInterpolateImageFunction<ImageType , double> InterpolatorType; InterpolatorType::Pointer interpolator = InterpolatorType::New (); We proceed now to create the image writer. We will also use a writer plugged to the output of the extractor filter which will write the temporary extracted regions. This is just for monitoring the process. typedef otb::Image <unsigned char, 2> CharImageType; typedef otb::ImageFileWriter <CharImageType > CharWriterType; typedef otb::ImageFileWriter <ImageType > WriterType ; WriterType ::Pointer extractorWriter = WriterType ::New (); CharWriterType::Pointer writer = CharWriterType::New(); extractorWriter ->SetFileName ("image_temp .jpeg"); extractorWriter ->SetInput (extract ->GetOutput ()); Since the output pixel type and the input pixel type are different, we will need to rescale the intensity values before writing them to a file. typedef itk::RescaleIntensityImageFilter <ImageType , CharImageType > RescalerType; RescalerType::Pointer rescaler = RescalerType::New (); rescaler ->SetOutputMinimum(10); rescaler ->SetOutputMaximum(255); The tricky part starts here. Note that this example is only intended for educationnal purposes and that you do not need to proceed as this. See the example in section 11.3 in order to code ortho-rectification chains in a very simple way. You want to go on? OK. You have been warned. We will start by declaring an image region iterator and some convenience variables. typedef itk::ImageRegionIteratorWithIndex<ImageType > IteratorType; unsigned int NumberOfStreamDivisions; if (atoi(argv[7]) == 0) { NumberOfStreamDivisions = 10; } else { NumberOfStreamDivisions = atoi(argv[7]); } unsigned int count = 0;
  • 317. 11.1. Sensor Models 283 unsigned int It, j, k; int max_x , max_y , min_x , min_y; ImageType ::IndexType iterationRegionStart; ImageType ::SizeType iteratorRegionSize; ImageType :: RegionType iteratorRegion; The loop starts here. for (count = 0; count < NumberOfStreamDivisions; count++) { iteratorRegionSize[0] = atoi(argv[5]); if (count == NumberOfStreamDivisions - 1) { iteratorRegionSize[1] = (atoi(argv[6])) - ((int) ((( atoi(argv[6])) / NumberOfStreamDivisions) + 0.5)) * (count); iterationRegionStart[1] = (atoi(argv[5])) - (iteratorRegionSize[1]); } else { iteratorRegionSize[1] = (int) ((( atoi(argv[6])) / NumberOfStreamDivisions) + 0.5); iterationRegionStart[1] = count * iteratorRegionSize[1]; } iterationRegionStart[0] = 0; iteratorRegion.SetSize(iteratorRegionSize); iteratorRegion.SetIndex(iterationRegionStart); We create an array for storing the pixel indexes. unsigned int pixelIndexArrayDimension = iteratorRegionSize[0] * iteratorRegionSize[1] * 2; int *pixelIndexArray = new int[pixelIndexArrayDimension]; int *currentIndexArray = new int[pixelIndexArrayDimension]; We create an iterator for each piece of the image, and we iterate over them. IteratorType outputIt(outputImage , iteratorRegion); It = 0; for (outputIt.GoToBegin (); !outputIt.IsAtEnd (); ++outputIt ) { We get the current index. currentIndex = outputIt .GetIndex (); We transform the index to physical coordinates. outputImage ->TransformIndexToPhysicalPoint(currentIndex , outputPoint );
  • 318. 284 Chapter 11. Orthorectification and Map Projection We use the sensor model to get the pixel coordinates in the input image and we transform this coodinates to an index. Then we store the index in the array. Note that the TransformPoint() method of the model has been overloaded so that it can be used with a 3D point when the height of the ground point is known (DEM availability). inputPoint = model ->TransformPoint(outputPoint ); pixelIndexArray[It] = static_cast<int>(inputPoint [0]); pixelIndexArray[It + 1] = static_cast<int>(inputPoint [1]); currentIndexArray[It] = static_cast<int>(currentIndex[0]); currentIndexArray[It + 1] = static_cast<int>(currentIndex[1]); It = It + 2; } At this point, we have stored all the indexes we need for the piece of image we are processing. We can now compute the bounds of the area in the input image we need to extract. max_x = pixelIndexArray[0]; min_x = pixelIndexArray[0]; max_y = pixelIndexArray[1]; min_y = pixelIndexArray[1]; for (j = 0; j < It; ++j) { if (j % 2 == 0 && pixelIndexArray[j] > max_x) { max_x = pixelIndexArray[j]; } if (j % 2 == 0 && pixelIndexArray[j] < min_x) { min_x = pixelIndexArray[j]; } if (j % 2 != 0 && pixelIndexArray[j] > max_y) { max_y = pixelIndexArray[j]; } if (j % 2 != 0 && pixelIndexArray[j] < min_y) { min_y = pixelIndexArray[j]; } } We can now set the parameters for the extractor using a little bit of margin in order to cope with irregular geometric distortions which could be due to topography, for instance. ImageType ::RegionType extractRegion; ImageType ::IndexType extractStart; if (min_x < 10 && min_y < 10) {
  • 319. 11.1. Sensor Models 285 extractStart[0] = 0; extractStart[1] = 0; } else { extractStart[0] = min_x - 10; extractStart[1] = min_y - 10; } ImageType ::SizeType extractSize ; extractSize [0] = (max_x - min_x) + 20; extractSize [1] = (max_y - min_y) + 20; extractRegion.SetSize(extractSize ); extractRegion.SetIndex(extractStart); extract->SetExtractionRegion(extractRegion); extract->SetInput (reader ->GetOutput ()); extractorWriter ->Update (); We give the input image to the interpolator and we loop through the index array in order to get the corresponding pixel values. Note that for every point we check whether it is inside the extracted region. interpolator ->SetInputImage(extract->GetOutput ()); for (k = 0; k < It / 2; ++k) { pixelIndexBis[0] = pixelIndexArray[2 * k]; pixelIndexBis[1] = pixelIndexArray[2 * k + 1]; currentIndexBis[0] = currentIndexArray[2 * k]; currentIndexBis[1] = currentIndexArray[2 * k + 1]; if (interpolator ->IsInsideBuffer(pixelIndexBis)) { pixelValue = int(interpolator ->EvaluateAtIndex(pixelIndexBis)); } else { pixelValue = 0; } outputImage ->SetPixel(currentIndexBis , pixelValue ); } delete pixelIndexArray; delete currentIndexArray; } So we are done. We can now write the output image to a file after performing the intensity rescaling. writer ->SetFileName (argv[2]);
  • 320. 286 Chapter 11. Orthorectification and Map Projection rescaler ->SetInput (outputImage ); writer ->SetInput(rescaler ->GetOutput ()); writer ->Update(); 11.1.3 Evaluating Sensor Model If no appropriate sensor model is available in the image meta-data, OTB offers the possibility to estimate a sensor model from the image. The source code for this example can be found in the file Examples/Projections/EstimateRPCSensorModelExample.cxx. The following example illustrates the application of estimation of a sensor model to an image (lim- ited to a RPC sensor model for now). The otb::GCPsToRPCSensorModelImageFilter estimates a RPC sensor model from a list of user defined GCPs. Internally, it uses an ossimRpcSolver, which performs the estimation using the well known least-square method. Let’s look at the minimal code required to use this algorithm. First, the following header defining the otb::GCPsToRPCSensorModelImageFilter class must be included. #include <ios> #include "otbImage.h" #include "otbImageFileReader.h" #include "otbGCPsToRPCSensorModelImageFilter.h" We declare the image type based on a particular pixel type and dimension. In this case the float type is used for the pixels. typedef otb::Image <float, 2> ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::GCPsToRPCSensorModelImageFilter<ImageType > GCPsToSensorModelFilterType; typedef GCPsToSensorModelFilterType:: Point2DType Point2DType ; typedef GCPsToSensorModelFilterType:: Point3DType Point3DType ; The otb::GCPsToRPCSensorModelImageFilter is instantiated. GCPsToSensorModelFilterType::Pointer rpcEstimator = GCPsToSensorModelFilterType::New(); rpcEstimator ->SetInput(reader ->GetOutput ()); We retrieve the command line parameters and put them in the correct variables. Firstly, We deter- mine the number of GCPs set from the command line parameters and they are stored in:
  • 321. 11.1. Sensor Models 287 • otb::Point3DType : Store the sensor point (3D ground point) • otb::Point2DType : Pixel assotiated in the image (2D physical coordinates) Here we do not use DEM or MeanElevation. It is also possible to give a 2D ground point and use the DEM or MeanElevation to get the corresponding elevation. for (unsigned int gcpId = 0; gcpId < nbGCPs; ++gcpId) { Point2DType sensorPoint ; sensorPoint [0] = atof(argv[3 + gcpId * 5]); sensorPoint [1] = atof(argv[4 + gcpId * 5]); Point3DType geoPoint; geoPoint [0] = atof(argv[5 + 5 * gcpId]); geoPoint [1] = atof(argv[6 + 5 * gcpId]); geoPoint [2] = atof(argv[7 + 5 * gcpId]); std::cout << "Adding GCP sensor: " << sensorPoint << " <-> geo: " << geoPoint << std::endl; rpcEstimator ->AddGCP(sensorPoint , geoPoint ); } Note that the otb::GCPsToRPCSensorModelImageFilter needs at least 16 GCPs to estimate a proper RPC sensor model, although no warning will be reported to the user if the num- ber of GCPs is lower than 16. Actual estimation of the sensor model takes place in the GenerateOutputInformation() method. rpcEstimator ->GetOutput ()->UpdateOutputInformation(); The result of the RPC model estimation and the residual ground error is then save in a txt file. Note that This filter does not modify the image buffer, but only the metadata. std::ofstream ofs; ofs.open(outfname ); // Set floatfield to format properly ofs.setf(std::ios::fixed , std::ios::floatfield ); ofs.precision (10); ofs << (ImageType ::Pointer) rpcEstimator ->GetOutput () << std::endl; ofs << "Residual ground error: " << rpcEstimator ->GetRMSGroundError() << std::endl; ofs.close(); The output image can be now given to the otb::orthorectificationFilter. Note that this filter allows also to import GCPs from the image metadata, if any.
  • 322. 288 Chapter 11. Orthorectification and Map Projection 11.1.4 Limits of the Approach As you may understand by now, accurate geo-referencing needs accurate DEM and also accurate sensor models and parameters. In the case where we have several images acquired over the same area by different sensors or different geometric configurations, geo-referencing (geographical coordinates) or ortho-rectification (cartographic coordinates) is not usually enough. Indeed, when working with image series we usually want to compare them (fusion, change detection, etc.) at the pixel level. Since common DEM and sensor parameters do not allow for such an accuracy, we have to use clever strategies to improve the co-registration of the images. The classical one consists in refining the sensor parameters by taking homologous points between the images to co-register. This is called bundle block adjustment and will be implemented in coming versions of OTB. Even if the model parameters are refined, errors due to DEM accuracy can not be eliminated. In this case, image to image registration can be applied. These approaches are presented in chapters 9 and 10. 11.2 Map Projections Map projections describe the link between geographic coordinates and cartographic ones. So map projections allow to represent a 2-dimensional manifold of a 3-dimensional space (the Earth surface) in a 2-dimensional space (a map which used to be a sheet of paper!). This geometrical transforma- tion doesn’t have a unique solution, so over the cartography history, every country or region in the world has been able to express the belief of being the center of the universe. In other words, every cartographic projection tries to minimize the distortions of the 3D to 2D transformation for a given point of the Earth surface1 . In OTB the otb::MapProjection class is derived from the itk::Transform class, so the coordinate transformation points are overloaded with map projection equations. The otb::MapProjection class is templated over the type of cartographic projection, which is pro- vided by the OSSIM library. In order to hide the complexity of the approach, some type definitions for the more common projections are given in the file otbMapProjections.h file. Sometimes, you don’t know at compile time what map projection you will need in your application. In this case, the otb::GenericMapProjection allow you to set the map projection at run-time by passing the WKT identification for the projection. The source code for this example can be found in the file Examples/Projections/MapProjectionExample.cxx. Map projection is an important issue when working with satellite images. In the orthorectification 1We proposed to optimize an OTB map projection for Toulouse, but we didn’t get any help from OTB users.
  • 323. 11.2. Map Projections 289 process, converting between geographic and cartographic coordinates is a key step. In this process, everything is integrated and you don’t need to know the details. However, sometimes, you need to go hands-on and find out the nitty-gritty details. This example shows you how to play with map projections in OTB and how to convert coordinates. In most cases, the underlying work is done by OSSIM. First, we start by including the otbMapProjections header. In this file, over 30 projections are defined and ready to use. It is easy to add new one. The otbGenericMapProjection enables you to instanciate a map projection from a WKT (Well Known Text) string, which is popular with OGR for example. #include "otbMapProjections.h" #include "otbGenericMapProjection.h" We retrieve the command line parameters and put them in the correct variables. The transforms are going to work with an itk::Point. const char * outFileName = argv[1]; itk::Point <double, 2> point; point[0] = 1.4835345; point[1] = 43.55968261; The output of this program will be saved in a text file. We also want to make sure that the precision of the digits will be enough. std::ofstream file; file.open(outFileName ); file << std::setprecision(15); We can now instantiate our first map projection. Here, it is a UTM projection. We also need to provide the information about the zone and the hemisphere for the projection. These are specific to the UTM projection. otb:: UtmForwardProjection::Pointer utmProjection = otb::UtmForwardProjection::New(); utmProjection ->SetZone (31); utmProjection ->SetHemisphere(’N’); The TransformPoint() method returns the coordinates of the point in the new projection. file << "Forward UTM projection : " << std::endl; file << point << " -> "; file << utmProjection ->TransformPoint(point); file << std::endl << std::endl; We follow the same path for the Lambert93 projection:
  • 324. 290 Chapter 11. Orthorectification and Map Projection otb:: Lambert93ForwardProjection::Pointer lambertProjection = otb::Lambert93ForwardProjection::New(); file << "Forward Lambert93 projection : " << std::endl; file << point << " -> "; file << lambertProjection ->TransformPoint(point); file << std::endl << std::endl; If you followed carefully the previous examples, you’ve noticed that the target projections have been directly coded, which means that they can’t be changed at run-time. What happens if you don’t know the target projection when you’re writing the program? It can depend on some input provided by the user (image, shapefile). In this situation, you can use the otb::GenericMapProjection. It will accept a string to set the projection. This string should be in the WKT format. For example: std::string projectionRefWkt = "PROJCS[" UTM Zone 31, Northern Hemisphere "," "GEOGCS[" WGS 84", DATUM[" WGS_1984 ", SPHEROID [" WGS 84", 6378137 , "AUTHORITY [" EPSG" ,"7030"]] , TOWGS84 [0, 0, 0, 0, 0, 0, 0]," "AUTHORITY [" EPSG" ,"6326"]] , PRIMEM [" Greenwich ", 0, AUTHORITY [" "UNIT[" degree", 0.0174532925199433 , AUTHORITY ["EPSG" ,"9108"]] ," "AXIS["Lat", NORTH], AXIS[" Long", EAST]," "AUTHORITY [" EPSG" ,"4326"]] , PROJECTION [" Transverse_Mercator"]," "PARAMETER [" latitude_of_origin", 0], PARAMETER [" central_meridian" "PARAMETER [" scale_factor", 0.9996] , PARAMETER [" false_easting", 50 "PARAMETER [" false_northing", 0], UNIT[" Meter", 1]]"; This string is then passed to the projection using the SetWkt() method. typedef otb::GenericMapProjection <otb:: TransformDirection::FORWARD> GenericMapProjection; GenericMapProjection::Pointer genericMapProjection = GenericMapProjection::New(); genericMapProjection ->SetWkt(projectionRefWkt); file << "Forward generic projection : " << std::endl; file << point << " -> "; file << genericMapProjection ->TransformPoint(point); file << std::endl << std::endl; And of course, we don’t forget to close the file: file.close(); The final output of the program should be: Forward UTM projection: [1.4835345, 43.55968261] -> [377522.448427013, 4824086.71129131]
  • 325. 11.3. Orthorectification with OTB 291 Forward Lambert93 projection: [1.4835345, 43.55968261] -> [577437.889798954, 6274578.791561] Forward generic projection: [1.4835345, 43.55968261] -> [377522.448427013, 4824086.71129131] You will seldom use a map projection by itself, but rather in an ortho-rectification framework. An example is given in the next section. 11.3 Orthorectification with OTB The source code for this example can be found in the file Examples/Projections/OrthoRectificationExample.cxx. This example demonstrates the use of the otb::OrthoRectificationFilter. This filter is in- tended to orthorectify images which are in a distributor format with the appropriate meta-data de- scribing the sensor model. In this example, we will choose to use an UTM projection for the output image. The first step toward the use of these filters is to include the proper header files: the one for the ortho-rectification filter and the one defining the different projections available in OTB. #include "otbOrthoRectificationFilter.h" #include "otbMapProjections.h" We will start by defining the types for the images, the image file reader and the image file writer. The writer will be a otb::ImageFileWriter which will allow us to set the number of stream divisions we want to apply when writing the output image, which can be very large. typedef otb::Image <int, 2> ImageType ; typedef otb::VectorImage <int, 2> VectorImageType; typedef otb::ImageFileReader <VectorImageType > ReaderType ; typedef otb::ImageFileWriter <VectorImageType > WriterType ; ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer = WriterType ::New (); reader ->SetFileName (argv[1]); writer ->SetFileName (argv[2]); We can now proceed to declare the type for the ortho-rectification filter. The class otb::OrthoRectificationFilter is templated over the input and the output image types as well as over the cartographic projection. We define therefore the type of the projection we want, which is an UTM projection for this case.
  • 326. 292 Chapter 11. Orthorectification and Map Projection typedef otb::UtmInverseProjection utmMapProjectionType; typedef otb::OrthoRectificationFilter<VectorImageType , VectorImageType , utmMapProjectionType > OrthoRectifFilterType; OrthoRectifFilterType::Pointer orthoRectifFilter = OrthoRectifFilterType::New (); Now we need to instanciate the map projection, set the zone and hemisphere parameters and pass this projection to the orthorectification filter. utmMapProjectionType::Pointer utmMapProjection = utmMapProjectionType::New(); utmMapProjection ->SetZone(atoi(argv[3])); utmMapProjection ->SetHemisphere(*(argv[4])); orthoRectifFilter ->SetMapProjection(utmMapProjection); We then wire the input image to the orthorectification filter. orthoRectifFilter ->SetInput (reader ->GetOutput ()); Using the user-provided information, we define the output region for the image generated by the orthorectification filter. We also define the spacing of the deformation grid where actual deformation values are estimated. Choosing a bigger deformation field spacing will speed up computation. ImageType ::IndexType start; start[0] = 0; start[1] = 0; orthoRectifFilter ->SetOutputStartIndex(start); ImageType ::SizeType size; size[0] = atoi(argv[7]); size[1] = atoi(argv[8]); orthoRectifFilter ->SetOutputSize(size); ImageType ::SpacingType spacing; spacing [0] = atof(argv[9]); spacing [1] = atof(argv[10]); orthoRectifFilter ->SetOutputSpacing(spacing); ImageType ::SpacingType gridSpacing ; gridSpacing [0] = 2.*atof(argv[9]); gridSpacing [1] = 2.*atof(argv[10]); orthoRectifFilter ->SetDeformationFieldSpacing(gridSpacing ); ImageType ::PointType origin; origin[0] = strtod(argv[5], NULL); origin[1] = strtod(argv[6], NULL); orthoRectifFilter ->SetOutputOrigin(origin); We can now set plug the ortho-rectification filter to the writer and set the number of tiles we want to split the output image in for the writing step.
  • 327. 11.4. Vector data projection manipulation 293 writer ->SetInput(orthoRectifFilter ->GetOutput ()); writer ->SetAutomaticTiledStreaming(); Finally, we trigger the pipeline execution by calling the Update() method on the writer. Please note that the ortho-rectification filter is derived from the otb::StreamingResampleImageFilter in or- der to be able to compute the input image regions which are needed to build the output image. Since the resampler applies a geometric transformation (scale, rotation, etc.), this region computation is not trivial. writer ->Update(); 11.4 Vector data projection manipulation The source code for this example can be found in the file Examples/Projections/VectorDataProjectionExample.cxx. Let’s assume that you have a KML file (hence in geographical coordinates) that you would like to superpose to some image with a specific map projection. Of course, you could use the handy ogr2ogr tool to do that, but it won’t integrate so seamlessly into your OTB application. You can also suppose that the image on which you want to superpose the data is not in a specific map projection but a raw image from a particular sensor. Thanks to OTB, the same code below will be able to do the appropriate conversion. This example demonstrates the use of the otb::VectorDataProjectionFilter. Declare the vector data type that you would like to use in your application. typedef otb::VectorData <double> InputVectorDataType; typedef otb::VectorData <double> OutputVectorDataType; Declare and instantiate the vector data reader: otb::VectorDataFileReader. The call to the UpdateOutputInformation() method fill up the header information. typedef otb::VectorDataFileReader <InputVectorDataType > VectorDataFileReaderType; VectorDataFileReaderType::Pointer reader = VectorDataFileReaderType::New(); reader ->SetFileName (argv[1]); reader ->UpdateOutputInformation(); We need the image only to retrieve its projection information, i.e. map projection or sensor model parameters. Hence, the image pixels won’t be read, only the header information using the UpdateOutputInformation() method.
  • 328. 294 Chapter 11. Orthorectification and Map Projection typedef otb::Image <unsigned short int, 2> ImageType ; typedef otb::ImageFileReader <ImageType > ImageReaderType; ImageReaderType::Pointer imageReader = ImageReaderType::New(); imageReader ->SetFileName (argv[2]); imageReader ->UpdateOutputInformation(); The otb::VectorDataProjectionFilter will do the work of converting the vector data coor- dinates. It is usually a good idea to use it when you design applications reading or saving vector data. typedef otb::VectorDataProjectionFilter<InputVectorDataType , OutputVectorDataType > VectorDataFilterType; VectorDataFilterType::Pointer vectorDataProjection = VectorDataFilterType::New(); Information concerning the original projection of the vector data will be automatically retrieved from the metadata. Nothing else is needed from you: vectorDataProjection ->SetInput(reader ->GetOutput ()); Information about the target projection is retrieved directly from the image: vectorDataProjection ->SetOutputKeywordList( imageReader ->GetOutput ()->GetImageKeywordlist()); vectorDataProjection ->SetOutputOrigin( imageReader ->GetOutput ()->GetOrigin ()); vectorDataProjection ->SetOutputSpacing( imageReader ->GetOutput ()->GetSpacing ()); vectorDataProjection ->SetOutputProjectionRef( imageReader ->GetOutput ()->GetProjectionRef()); Finally, the result is saved into a new vector file. typedef otb::VectorDataFileWriter <OutputVectorDataType > VectorDataFileWriterType; VectorDataFileWriterType::Pointer writer = VectorDataFileWriterType::New(); writer ->SetFileName (argv[3]); writer ->SetInput(vectorDataProjection ->GetOutput ()); writer ->Update(); It is worth noting that none of this code is specific to the vector data format. Whether you pass a shapefile, or a KML file, the correct driver will be automatically instantiated. 11.5 Geometries projection manipulation The source code for this example can be found in the file Examples/Projections/GeometriesProjectionExample.cxx.
  • 329. 11.5. Geometries projection manipulation 295 Instead of using otb::VectorData to apply projections as explained in 11.4, we can also directly work on OGR data types thanks to otb::GeometriesProjectionFilter. This example demonstrates how to proceed with this alternative set of vector data types. Declare the geometries type that you would like to use in your application. Unlike otb::VectorData, otb::GeometriesSet is a single type for any kind of geometries set (OGR data source, or OGR layer). typedef otb::GeometriesSet InputGeometriesType; typedef otb::GeometriesSet OutputGeometriesType; First, declare and instantiate the data source otb::ogr::DataSource. Then, encapsulate this data source into a otb::GeometriesSet. otb::ogr ::DataSource ::Pointer input = otb::ogr::DataSource ::New( argv[1], otb::ogr::DataSource ::Modes::Read); InputGeometriesType::Pointer in_set = InputGeometriesType::New(input); We need the image only to retrieve its projection information, i.e. map projection or sensor model parameters. Hence, the image pixels won’t be read, only the header information using the UpdateOutputInformation() method. typedef otb::Image <unsigned short int, 2> ImageType ; typedef otb::ImageFileReader <ImageType > ImageReaderType; ImageReaderType::Pointer imageReader = ImageReaderType::New(); imageReader ->SetFileName (argv[2]); imageReader ->UpdateOutputInformation(); The otb::GeometriesProjectionFilter will do the work of converting the geometries coor- dinates. It is usually a good idea to use it when you design applications reading or saving vector data. typedef otb:: GeometriesProjectionFilter GeometriesFilterType; GeometriesFilterType::Pointer filter = GeometriesFilterType::New (); Information concerning the original projection of the vector data will be automatically retrieved from the metadata. Nothing else is needed from you: filter ->SetInput(in_set); Information about the target projection is retrieved directly from the image: // necessary for sensors filter ->SetOutputKeywordList(imageReader ->GetOutput ()->GetImageKeywordlist()); // necessary for sensors filter ->SetOutputOrigin(imageReader ->GetOutput ()->GetOrigin ()); // necessary for sensors filter ->SetOutputSpacing(imageReader ->GetOutput ()->GetSpacing ()); // ˜ wkt filter ->SetOutputProjectionRef( imageReader ->GetOutput ()->GetProjectionRef());
  • 330. 296 Chapter 11. Orthorectification and Map Projection
Finally, the result is saved into a new vector file. Unlike other OTB filters, otb::GeometriesProjectionFilter expects to be given a valid output geometries set in which to store the result of its processing; otherwise the result will be an in-memory data source, not stored in a file or a database.
Then, the processing is started by calling Update(). The actual serialization of the results is guaranteed to be completed when the output geometries set object goes out of scope, or when SyncToDisk() is called.
otb::ogr::DataSource::Pointer output = otb::ogr::DataSource::New(argv[3], otb::ogr::DataSource::Modes::Update_LayerCreateOnly);
OutputGeometriesType::Pointer out_set = OutputGeometriesType::New(output);
filter->SetOutput(out_set);
filter->Update();
Once again, it is worth noting that none of this code is specific to the vector data format. Whether you pass a shapefile or a KML file, the correct driver will be automatically instantiated.
11.6 Elevation management with OTB
The source code for this example can be found in the file Examples/IO/DEMHandlerExample.cxx.
OTB relies on OSSIM for elevation handling. Since release 3.16, there is a single configuration class, otb::DEMHandler, to manage elevation (in image projections or localization functions, for example). The configuration is done through proper instantiation and parameter setting of this class, which must happen before any call to geometric filters or functionalities. OSSIM internal accesses to elevation are also configured by this class, which ensures consistency throughout the library.
This class is a singleton; the New() method is deprecated and will be removed in a future release. We need to use the Instance() method instead.
otb::DEMHandler::Pointer demHandler = otb::DEMHandler::Instance();
It allows configuring a directory containing DEM tiles (DTED or SRTM supported) using the OpenDEMDirectory() method. The OpenGeoidFile() method allows providing a geoid file as well. Last, a default height above ellipsoid can be set using the SetDefaultHeightAboveEllipsoid() method.
demHandler->SetDefaultHeightAboveEllipsoid(defaultHeight);
if (!demHandler->IsValidDEMDirectory(demdir.c_str()))
  {
  std::cerr << "IsValidDEMDirectory(" << demdir << ") = false" << std::endl;
  fail = true;
  • 331. 11.6. Elevation management with OTB 297
  }
demHandler->OpenDEMDirectory(demdir);
demHandler->OpenGeoidFile(geoid);
We can now retrieve the height above ellipsoid or the height above Mean Sea Level (MSL) using the methods GetHeightAboveEllipsoid() and GetHeightAboveMSL(). The output of these methods depends on the configuration of the otb::DEMHandler class; the different cases are:
For GetHeightAboveEllipsoid():
• DEM and geoid both available: DEM value + geoid offset
• No DEM but geoid available: geoid offset
• DEM available, but no geoid: DEM value
• No DEM and no geoid available: default height above ellipsoid
For GetHeightAboveMSL():
• DEM and geoid both available: DEM (SRTM) value
• No DEM but geoid available: 0
• DEM available, but no geoid: DEM (SRTM) value
• No DEM and no geoid available: 0
otb::DEMHandler::PointType point;
point[0] = longitude;
point[1] = latitude;
double height = -32768;
height = demHandler->GetHeightAboveMSL(point);
std::cout << "height above MSL (" << longitude << "," << latitude << ") = " << height << " meters" << std::endl;
height = demHandler->GetHeightAboveEllipsoid(point);
std::cout << "height above ellipsoid (" << longitude << ", " << latitude << ") = " << height << " meters" << std::endl;
Note that OSSIM internal calls for sensor modelling use the height above ellipsoid, and follow the same logic as the GetHeightAboveEllipsoid() method.
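The two conventions above are linked by the geoid undulation N, the offset of the geoid above the reference ellipsoid at the requested point, which is what the geoid file provides:
h_ellipsoid(lat, lon) = h_MSL(lat, lon) + N(lat, lon)
As a purely illustrative example, if the DEM gives 150 m and the geoid offset at the point is 45 m, GetHeightAboveMSL() returns 150 m while GetHeightAboveEllipsoid() returns 195 m; with no DEM available, the same calls would return 0 m and 45 m respectively.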
  • 332. 298 Chapter 11. Orthorectification and Map Projection 11.7 Vector data area extraction The source code for this example can be found in the file Examples/Projections/VectorDataExtractROIExample.cxx. There is some vector data sets widely available on the internet. These data sets can be huge, covering an entire country, with hundreds of thousands objects. Most of the time, you won’t be interested in the whole area and would like to focuss only on the area corresponding to your satellite image. The otb::VectorDataExtractROI is able to extract the area corresponding to your satellite image, even if the image is still in sensor geometry (provided the sensor model is supported by OTB). Let’s see how we can do that. This example demonstrates the use of the otb::VectorDataExtractROI. After the usual declaration (you can check the source file for the details), we can declare the otb::VectorDataExtractROI: typedef otb::VectorDataExtractROI <VectorDataType > FilterType ; FilterType ::Pointer filter = FilterType ::New (); Then, we need to specify the region to extract. This region is a bit special as it contains also in- formation related to its reference system (cartographic projection or sensor model projection). We retrieve all these information from the image we gave as an input. TypedRegion region; TypedRegion ::SizeType size; TypedRegion ::IndexType index; size[0] = imageReader ->GetOutput ()->GetLargestPossibleRegion(). GetSize ()[0] * imageReader ->GetOutput ()->GetSpacing ()[0]; size[1] = imageReader ->GetOutput ()->GetLargestPossibleRegion(). GetSize ()[1] * imageReader ->GetOutput ()->GetSpacing ()[1]; index[0] = imageReader ->GetOutput ()->GetOrigin ()[0] - 0.5 * imageReader ->GetOutput ()->GetSpacing ()[0]; index[1] = imageReader ->GetOutput ()->GetOrigin ()[1] - 0.5 * imageReader ->GetOutput ()->GetSpacing ()[1]; region.SetSize(size); region.SetOrigin (index); otb:: ImageMetadataInterfaceBase::Pointer imageMetadataInterface = otb::ImageMetadataInterfaceFactory::CreateIMI ( imageReader ->GetOutput ()->GetMetaDataDictionary()); region.SetRegionProjection( imageMetadataInterface ->GetProjectionRef()); region.SetKeywordList(imageReader ->GetOutput ()->GetImageKeywordlist()); filter ->SetRegion (region);
  • 333. 11.7. Vector data area extraction 299 And finally, we can plug the filter in the pipeline: filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ());
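The example above refers to "the usual declaration" without showing it. For convenience, here is a minimal sketch of what those declarations typically look like, assuming the same vector data and image types as in section 11.4 and assuming the region type is otb::RemoteSensingRegion; the names are illustrative, not the exact code of the example.
typedef otb::VectorData<double>                     VectorDataType;
typedef otb::VectorDataFileReader<VectorDataType>   VectorDataReaderType;
typedef otb::VectorDataFileWriter<VectorDataType>   VectorDataWriterType;
typedef otb::Image<unsigned short, 2>               ImageType;
typedef otb::ImageFileReader<ImageType>             ImageReaderType;
typedef otb::RemoteSensingRegion<double>            TypedRegion;

VectorDataReaderType::Pointer reader      = VectorDataReaderType::New();
ImageReaderType::Pointer      imageReader = ImageReaderType::New();
VectorDataWriterType::Pointer writer      = VectorDataWriterType::New();

reader->SetFileName(argv[1]);            // large vector data set
imageReader->SetFileName(argv[2]);       // image defining the area of interest
writer->SetFileName(argv[3]);            // extracted vector data
imageReader->UpdateOutputInformation();  // header only, no pixel read
Note that UpdateOutputInformation() is sufficient here, since only the image geometry and metadata are needed to define the extraction region.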
  • 335. CHAPTER TWELVE RADIOMETRY Remote sensing is not just a matter of taking pictures, but also – mostly – a matter of measuring physical values. In order to properly deal with physical magnitudes, the numerical values provided by the sensors have to be calibrated. After that, several indices with physical meaning can be com- puted. 12.1 Radiometric Indices 12.1.1 Introduction With multispectral sensors, several indices can be computed, combining several spectral bands to show features that are not obvious using only one band. Indices can show: • Vegetation (Tab 12.1) • Soil (Tab 12.2) • Water (Tab 12.3) • Built up areas (Tab 12.4) A vegetation index is a quantitative measure used to measure biomass or vegetative vigor, usually formed from combinations of several spectral bands, whose values are added, divided, or multiplied in order to yield a single value that indicates the amount or vigor of vegetation. Numerous indices are available in OTB and are listed in table 12.1 to 12.4 with their references. The use of the different indices is very similar, and only few example are given in the next sections.
  • 336. 302 Chapter 12. Radiometry
NDVI Normalized Difference Vegetation Index [122]
RVI Ratio Vegetation Index [106]
PVI Perpendicular Vegetation Index [119, 149]
SAVI Soil Adjusted Vegetation Index [65]
TSAVI Transformed Soil Adjusted Vegetation Index [10, 9]
MSAVI Modified Soil Adjusted Vegetation Index [115]
MSAVI2 Modified Soil Adjusted Vegetation Index [115]
GEMI Global Environment Monitoring Index [111]
WDVI Weighted Difference Vegetation Index [27, 28]
AVI Angular Vegetation Index [113]
ARVI Atmospherically Resistant Vegetation Index [79]
TSARVI Transformed Soil Adjusted Vegetation Index [79]
EVI Enhanced Vegetation Index [66, 76]
IPVI Infrared Percentage Vegetation Index [32]
TNDVI Transformed NDVI [36]
Table 12.1: Vegetation indices
IR Redness Index [46]
IC Color Index [46]
IB Brilliance Index [101]
IB2 Brilliance Index [101]
Table 12.2: Soil indices
12.1.2 NDVI
NDVI was one of the most successful of many attempts to simply and quickly identify vegetated areas and their condition, and it remains the most well-known and used index to detect live green plant canopies in multispectral remote sensing data. Once the feasibility to detect vegetation had been demonstrated, users tended to also use the NDVI to quantify the photosynthetic capacity of plant canopies. This, however, can be a rather more complex undertaking if not done properly.
The source code for this example can be found in the file Examples/Radiometry/NDVIRAndNIRVegetationIndexImageFilter.cxx.
The following example illustrates the use of the otb::RAndNIRIndexImageFilter with the Normalized Difference Vegetation Index (NDVI). NDVI computes the difference between the NIR channel radiance, noted L_NIR, and the red channel radiance, noted L_r, reflected from the surface and transmitted through the atmosphere:
NDVI = (L_NIR - L_r) / (L_NIR + L_r)   (12.1)
The following classes provide similar functionality:
  • 337. 12.1. Radiometric Indices 303
SRWI Simple Ratio Water Index [155]
NDWI Normalized Difference Water Index [21]
NDWI2 Normalized Difference Water Index [97]
MNDWI Modified Normalized Difference Water Index [151]
NDPI Normalized Difference Pond Index [83]
NDTI Normalized Difference Turbidity Index [83]
SA Spectral Angle
Table 12.3: Water indices
NDBI Normalized Difference Built Up Index [100]
ISU Index Surfaces Built [2]
Table 12.4: Built-up indices
• otb::Functor::RVI
• otb::Functor::PVI
• otb::Functor::SAVI
• otb::Functor::TSAVI
• otb::Functor::MSAVI
• otb::Functor::GEMI
• otb::Functor::WDVI
• otb::Functor::IPVI
• otb::Functor::TNDVI
With the otb::RAndNIRIndexImageFilter class the filter inputs are one-channel images: one image represents the NIR channel, the other the red channel. Let's look at the minimal code required to use this algorithm. First, the following header defining the otb::RAndNIRIndexImageFilter class must be included.
#include "otbRAndNIRIndexImageFilter.h"
The image types are now defined using the pixel types and the dimension. Input and output images are defined as otb::Image.
const unsigned int Dimension = 2;
typedef double InputPixelType;
typedef float OutputPixelType;
typedef otb::Image<InputPixelType, Dimension>  InputRImageType;
typedef otb::Image<InputPixelType, Dimension>  InputNIRImageType;
typedef otb::Image<OutputPixelType, Dimension> OutputImageType;
  • 338. 304 Chapter 12. Radiometry The NDVI (Normalized Difference Vegetation Index) is instantiated using the images pixel type as template parameters. It is implemented as a functor class which will be passed as a parameter to an otb::RAndNIRIndexImageFilter. typedef otb::Functor ::NDVI <InputPixelType , InputPixelType , OutputPixelType > FunctorType ; The otb::RAndNIRIndexImageFilter type is instantiated using the images types and the NDVI functor as template parameters. typedef otb::RAndNIRIndexImageFilter <InputRImageType , InputNIRImageType , OutputImageType , FunctorType > RAndNIRIndexImageFilterType; Now the input images are set and a name is given to the output image. readerR ->SetFileName (argv[1]); readerNIR ->SetFileName (argv[2]); writer ->SetFileName (argv[3]); We set the processing pipeline: filter inputs are linked to the reader output and the filter output is linked to the writer input. filter ->SetInputR (readerR->GetOutput ()); filter ->SetInputNIR (readerNIR ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); Invocation of the Update() method on the writer triggers the execution of the pipeline. It is rec- ommended to place update() calls in a try/catch block in case errors occur and exceptions are thrown. try { writer ->Update (); } catch (itk::ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Let’s now run this example using as input the images NDVI 3.hdr and NDVI 4.hdr (images kindly and free of charge given by SISA and CNES) provided in the directory Examples/Data.
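As a quick numeric sanity check of equation (12.1), using illustrative values rather than the example data: healthy vegetation reflects strongly in the NIR and little in the red, so with L_NIR = 0.50 and L_r = 0.08,
NDVI = (0.50 - 0.08) / (0.50 + 0.08) ≈ 0.72,
whereas bare soil with L_NIR = 0.30 and L_r = 0.25 gives NDVI ≈ 0.09. Values close to 1 therefore indicate dense, healthy vegetation, values near 0 indicate bare surfaces, and negative values typically correspond to water or clouds.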
  • 339. 12.1. Radiometric Indices 305
Figure 12.1: NDVI input images on the left (Red channel and NIR channel), on the right the result of the algorithm.
12.1.3 ARVI
The source code for this example can be found in the file Examples/Radiometry/ARVIMultiChannelRAndBAndNIRVegetationIndexImageFilter.cxx.
The following example illustrates the use of the otb::MultiChannelRAndBAndNIRIndexImageFilter with the Atmospherically Resistant Vegetation Index (ARVI) otb::Functor::ARVI. ARVI is an improved version of the NDVI that is more robust to the atmospheric effect. In addition to the red and NIR channels (used in the NDVI), the ARVI takes advantage of the presence of the blue channel to accomplish a self-correction process for the atmospheric effect on the red channel. For this, it uses the difference in radiance between the blue and the red channels to correct the radiance in the red channel.
Let's define ρ*_NIR, ρ*_r and ρ*_b as the normalized radiances (that is to say, the radiances normalized to reflectance units) of the NIR, red and blue channels respectively. ρ*_rb is defined as
ρ*_rb = ρ*_r − γ (ρ*_b − ρ*_r)   (12.2)
The ARVI expression is
ARVI = (ρ*_NIR − ρ*_rb) / (ρ*_NIR + ρ*_rb)   (12.3)
This formula can be simplified to:
ARVI = (L_NIR − L_rb) / (L_NIR + L_rb)   (12.4)
For more details, refer to Kaufman and Tanré's work [79].
The following classes provide similar functionality:
• otb::Functor::TSARVI
  • 340. 306 Chapter 12. Radiometry • otb::Functor::EVI With the otb::MultiChannelRAndBAndNIRIndexImageFilter class the input has to be a multi channel image and the user has to specify index channel of the red, blue and NIR channel. Let’s look at the minimal code required to use this algorithm. First, the following header defining the otb::MultiChannelRAndBAndNIRIndexImageFilter class must be included. #include "otbMultiChannelRAndBAndNIRIndexImageFilter.h" The image types are now defined using pixel types and dimension. The input image is defined as an otb::VectorImage, the output is a otb::Image. const unsigned int Dimension = 2; typedef double InputPixelType; typedef float OutputPixelType; typedef otb::VectorImage <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; The ARVI (Atmospherically Resistant Vegetation Index) is instantiated using the image pixel types as template parameters. Note that we also can use other functors which operate with the Red, Blue and Nir channels such as EVI, ARVI and TSARVI. typedef otb::Functor ::ARVI <InputPixelType , InputPixelType , InputPixelType , OutputPixelType > FunctorType ; The otb::MultiChannelRAndBAndNIRIndexImageFilter type is defined using the image types and the ARVI functor as template parameters. We then instantiate the filter itself. typedef otb:: MultiChannelRAndBAndNIRIndexImageFilter <InputImageType , OutputImageType , FunctorType > MultiChannelRAndBAndNIRIndexImageFilterType; MultiChannelRAndBAndNIRIndexImageFilterType::Pointer filter = MultiChannelRAndBAndNIRIndexImageFilterType::New(); Now the input image is set and a name is given to the output image. reader ->SetFileName (argv[1]); writer ->SetFileName (argv[2]); The three used index bands (red, blue and NIR) are declared. filter ->SetRedIndex (:: atoi(argv[5])); filter ->SetBlueIndex(:: atoi(argv[6])); filter ->SetNIRIndex (:: atoi(argv[7]));
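As a quick numeric illustration of equations (12.2) and (12.3), with purely illustrative normalized radiances ρ*_NIR = 0.50, ρ*_r = 0.10 and ρ*_b = 0.08, and the default γ = 0.5 discussed just below:
ρ*_rb = 0.10 − 0.5 × (0.08 − 0.10) = 0.11
ARVI = (0.50 − 0.11) / (0.50 + 0.11) ≈ 0.64
The plain NDVI on the same pixel would be (0.50 − 0.10) / (0.50 + 0.10) ≈ 0.67; the small shift is the blue-based correction of the atmospheric contribution to the red channel.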
  • 341. 12.1. Radiometric Indices 307 Figure 12.2: ARVI result on the right with the left image in input. The γ parameter is set. The otb::MultiChannelRAndBAndNIRIndexImageFilter class sets the default value of γ to 0.5. This parameter is used to reduce the atmospheric effect on a global scale. filter ->GetFunctor (). SetGamma (:: atof(argv[8])); The filter input is linked to the reader output and the filter output is linked to the writer input. filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); The invocation of the Update() method on the writer triggers the execution of the pipeline. It is recommended to place update calls in a try/catch block in case errors occur and exceptions are thrown. try { writer ->Update (); } catch (itk:: ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Let’s now run this example using as input the image IndexVegetation.hd (image kindly and free of charge given by SISA and CNES) and γ=0.6 provided in the directory Examples/Data. 12.1.4 AVI The source code for this example can be found in the file Examples/Radiometry/AVIMultiChannelRAndGAndNIRVegetationIndexImageFilter.cxx. The following example illustrates the use of the otb::MultiChannelRAndGAndNIR VegetationIn- dexImageFilter with the use of the Angular Vegetation Index (AVI). The equation for the Angular
  • 342. 308 Chapter 12. Radiometry
Vegetation Index involves the green, red and near infra-red bands. λ1, λ2 and λ3 are the mid-band wavelengths for the green, red and NIR bands, and tan⁻¹ is the arctangent function.
The AVI expression is
A1 = (λ3 − λ2) / λ2   (12.5)
A2 = (λ2 − λ1) / λ2   (12.6)
AVI = tan⁻¹(A1 / (NIR − R)) + tan⁻¹(A2 / (G − R))   (12.7)
For more details, refer to Plummer's work [113].
With the otb::MultiChannelRAndGAndNIRIndexImageFilter class the input has to be a multi-channel image and the user has to specify the channel index of the red, green and NIR channels.
Let's look at the minimal code required to use this algorithm. First, the following header defining the otb::MultiChannelRAndGAndNIRIndexImageFilter class must be included.
#include "otbMultiChannelRAndGAndNIRIndexImageFilter.h"
The image types are now defined using pixel types and dimension. The input image is defined as an otb::VectorImage, the output is an otb::Image.
const unsigned int Dimension = 2;
typedef double InputPixelType;
typedef float OutputPixelType;
typedef otb::VectorImage<InputPixelType, Dimension> InputImageType;
typedef otb::Image<OutputPixelType, Dimension>      OutputImageType;
  • 343. 12.2. Atmospheric Corrections 309 reader ->SetFileName (argv[1]); writer ->SetFileName (argv[2]); The three used index bands (red, green and NIR) are declared. filter ->SetRedIndex (:: atoi(argv[5])); filter ->SetGreenIndex(:: atoi(argv[6])); filter ->SetNIRIndex (:: atoi(argv[7])); The λ R, G and NIR parameters are set. The otb::MultiChannelRAndGAndNIRIndexImageFilter class sets the default values of λ to 660, 560 and 830. filter ->GetFunctor (). SetLambdaR (::atof(argv[8])); filter ->GetFunctor (). SetLambdaG (::atof(argv[9])); filter ->GetFunctor (). SetLambdaNir(:: atof(argv[10])); The filter input is linked to the reader output and the filter output is linked to the writer input. filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); The invocation of the Update() method on the writer triggers the execution of the pipeline. It is recommended to place update calls in a try/catch block in case errors occur and exceptions are thrown. try { writer ->Update (); } catch (itk:: ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Let’s now run this example using as input the image verySmallFSATSW.tif provided in the direc- tory Examples/Data. 12.2 Atmospheric Corrections The source code for this example can be found in the file Examples/Radiometry/AtmosphericCorrectionSequencement.cxx. The following example illustrates the application of atmospheric corrections to an optical multispec- tral image similar to Pleiades. These corrections are made in four steps :
  • 344. 310 Chapter 12. Radiometry Figure 12.3: AVI result on the right with the left image in input. • digital number to luminance correction; • luminance to refletance image conversion; • atmospheric correction for TOA (top of atmosphere) to TOC (top of canopy) reflectance esti- mation; • correction of the adjacency effects taking into account the neighborhood contribution. The manipulation of each class used for the different steps and the link with the 6S radiometry library will be explained. Let’s look at the minimal code required to use this algorithm. First, the following header defining the otb::AtmosphericCorrectionSequencement class must be included. For the numerical to luminance image, luminance to refletance image, and reflectance to atmospheric correction image corrections and the neighborhood correction, four header files are required. #include "otbImageToLuminanceImageFilter.h" #include "otbLuminanceToReflectanceImageFilter.h" #include "otbReflectanceToSurfaceReflectanceImageFilter.h" #include "otbSurfaceAdjacencyEffect6SCorrectionSchemeFilter.h" This chain uses the 6S radiative transfer code to compute radiometric parameters. To manipulate 6S data, three classes are needed (the first one to store the metadata, the second one that calls 6S class and generates the information which will be stored in the last one). #include "otbAtmosphericCorrectionParameters.h" #include "otbAtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms.h" #include "otbAtmosphericRadiativeTerms.h" Image types are now defined using pixel types and dimension. The input image is defined as an otb::VectorImage, the output image is a otb::VectorImage. To simplify, input and output image types are the same one. const unsigned int Dimension = 2; typedef double PixelType ; typedef otb::VectorImage <PixelType , Dimension > ImageType ;
  • 345. 12.2. Atmospheric Corrections 311
The GenerateOutputInformation() reader method is called to know the number of components per pixel of the image. It is recommended to place GenerateOutputInformation calls in a try/catch block in case errors occur and exceptions are thrown.
reader->SetFileName(argv[1]);
try
  {
  reader->GenerateOutputInformation();
  }
catch (itk::ExceptionObject& excep)
  {
  std::cerr << "Exception caught !" << std::endl;
  std::cerr << excep << std::endl;
  }
The otb::ImageToLuminanceImageFilter type is defined and instantiated. This class uses a functor applied to each component of each pixel (X^k) whose formula is:
L^k_TOA = X^k / α_k + β_k   (12.8)
Where:
• L^k_TOA is the incident luminance (in W.m⁻².sr⁻¹.µm⁻¹);
• X^k is the measured digital number (i.e. the input image pixel component);
• α_k is the absolute calibration gain for the channel k;
• β_k is the absolute calibration bias for the channel k.
typedef otb::ImageToLuminanceImageFilter<ImageType, ImageType> ImageToLuminanceImageFilterType;
ImageToLuminanceImageFilterType::Pointer filterImageToLuminance = ImageToLuminanceImageFilterType::New();
Here, α and β are read from an ASCII file given in input, stored in a vector and passed to the class.
filterImageToLuminance->SetAlpha(alpha);
filterImageToLuminance->SetBeta(beta);
The otb::LuminanceToReflectanceImageFilter type is defined and instantiated. This class uses a functor applied to each component of each pixel of the luminance filter output (L^k_TOA):
ρ^k_TOA = π . L^k_TOA / (E^k_S . cos(θ_S) . d/d0)   (12.9)
Where:
  • 346. 312 Chapter 12. Radiometry • ρk TOA is the reflectance measured by the sensor; • θS is the zenithal solar angle in degrees; • Ek S is the solar illumination out of the atmosphere measured at a distance d0 from the Earth; • d/d0 is the ratio between the Earth-Sun distance at the acquisition date and the mean Earth- Sun distance. The ratio can be directly given to the class or computed using a 6S routine. In the last case (that is the one of this example), the user has to precise the month and the day of the acquisition. typedef otb::LuminanceToReflectanceImageFilter<ImageType , ImageType > LuminanceToReflectanceImageFilterType; LuminanceToReflectanceImageFilterType::Pointer filterLuminanceToReflectance = LuminanceToReflectanceImageFilterType::New(); The solar illumination is read from a ASCII file given in input, stored in a vector and given to the class. Day, month and zenital solar angle are inputs and can be directly given to the class. filterLuminanceToReflectance->SetZenithalSolarAngle( static_cast<double>(atof(argv[6]))); filterLuminanceToReflectance->SetDay(atoi(argv[7])); filterLuminanceToReflectance->SetMonth(atoi(argv[8])); filterLuminanceToReflectance->SetSolarIllumination(solarIllumination); At this step of the chain, radiometric informations are nedeed. Those informations will be computed from different parameters stored in a otb::AtmosphericCorrectionParameters class intance. This container will be given to an otb::AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms class instance which will call a 6S routine that will compute the needed radiometric in- formations and store them in a otb::AtmosphericRadiativeTerms class instance. For this, otb::AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms otb::AtmosphericCorrectionParameters and otb::AtmosphericRadiativeTerms types are defined and instancied. typedef otb:: AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms AtmosphericCorrectionParametersTo6SRadiativeTermsType; typedef otb:: AtmosphericCorrectionParameters AtmosphericCorrectionParametersType; typedef otb::AtmosphericRadiativeTerms AtmosphericRadiativeTermsType; The otb::AtmosphericCorrectionParameters class needs several parameters : • The zenithal and azimutal solar angles that describe the solar incidence configuration (in de- grees);
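Before configuring the 6S-related parameters, a quick numeric check of equations (12.8) and (12.9) may help; the calibration values below are purely illustrative, not actual sensor values. For a digital number X^k = 200 with α_k = 2.0 and β_k = 0, equation (12.8) gives L^k_TOA = 200 / 2.0 + 0 = 100 W.m⁻².sr⁻¹.µm⁻¹. With E^k_S = 1850 W.m⁻².µm⁻¹, θ_S = 30° (cos θ_S ≈ 0.866) and d/d0 = 1, equation (12.9) gives
ρ^k_TOA = π × 100 / (1850 × 0.866 × 1) ≈ 0.20,
a dimensionless top-of-atmosphere reflectance.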
  • 347. 12.2. Atmospheric Corrections 313 • The zenithal and azimuthal viewing angles that describe the viewing direction (in degrees); • The month and the day of the acquisition; • The atmospheric pressure; • The water vapor amount, that is, the total water vapor content over vertical atmospheric col- umn; • The ozone amount that is the Stratospheric ozone layer content; • The aerosol model that is the kind of particles (no aerosol, continental, maritime, urban, de- sertic); • The aerosol optical thickness at 550 nm that is the is the Radiative impact of aerosol for the reference wavelength 550 nm; • The filter function that is the values of the filter function for one spectral band, from λin f to λsup by step of 2.5 nm. One filter function by channel is required. This last parameter are read in text files, the other one are directly given to the class. dataAtmosphericCorrectionParameters->SetSolarZenithalAngle( static_cast<double>(atof(argv[6]))); dataAtmosphericCorrectionParameters->SetSolarAzimutalAngle( static_cast<double>(atof(argv[9]))); dataAtmosphericCorrectionParameters->SetViewingZenithalAngle( static_cast<double>(atof(argv[10]))); dataAtmosphericCorrectionParameters->SetViewingAzimutalAngle( static_cast<double>(atof(argv[11]))); dataAtmosphericCorrectionParameters->SetMonth(atoi(argv[8])); dataAtmosphericCorrectionParameters->SetDay(atoi(argv[7])); dataAtmosphericCorrectionParameters->SetAtmosphericPressure( static_cast<double>(atof(argv[12]))); dataAtmosphericCorrectionParameters->SetWaterVaporAmount( static_cast<double>(atof(argv[13]))); dataAtmosphericCorrectionParameters->SetOzoneAmount( static_cast<double>(atof(argv[14]))); AerosolModelType aerosolModel = static_cast<AerosolModelType >(::atoi(argv[15])); dataAtmosphericCorrectionParameters->SetAerosolModel(aerosolModel); dataAtmosphericCorrectionParameters->SetAerosolOptical( static_cast<double>(atof(argv[16])));
  • 348. 314 Chapter 12. Radiometry
Once those parameters are loaded and stored in the AtmosphericCorrectionParameters class instance, it is given as input to an instance of AtmosphericCorrectionParametersTo6SAtmosphericRadiativeTerms, which will compute the needed radiometric information.
AtmosphericCorrectionParametersTo6SRadiativeTermsType::Pointer filterAtmosphericCorrectionParametersTo6SRadiativeTerms = AtmosphericCorrectionParametersTo6SRadiativeTermsType::New();
filterAtmosphericCorrectionParametersTo6SRadiativeTerms->SetInput(dataAtmosphericCorrectionParameters);
filterAtmosphericCorrectionParametersTo6SRadiativeTerms->Update();
The output of this class will be an instance of the AtmosphericRadiativeTerms class. This class contains (for each channel of the image):
• The intrinsic atmospheric reflectance, which accounts for the molecular scattering and the aerosol scattering attenuated by water vapor absorption;
• The spherical albedo of the atmosphere;
• The total gaseous transmission (for all species);
• The total transmittance of the atmosphere from sun to ground (downward transmittance) and from ground to space sensor (upward transmittance).
Atmospheric corrections can now start. First, an instance of otb::ReflectanceToSurfaceReflectanceImageFilter is created.
typedef otb::ReflectanceToSurfaceReflectanceImageFilter<ImageType, ImageType> ReflectanceToSurfaceReflectanceImageFilterType;
ReflectanceToSurfaceReflectanceImageFilterType::Pointer filterReflectanceToSurfaceReflectanceImageFilter = ReflectanceToSurfaceReflectanceImageFilterType::New();
The aim of the atmospheric correction is to invert the surface reflectance (for each pixel of the input image) from the TOA reflectance and from simulations of the atmospheric radiative functions corresponding to the geometrical conditions of the observation and to the atmospheric components. The process is applied to each pixel of the image, band by band, with the following formula:
ρ^unif_S = A / (1 + S·A)   (12.10)
Where,
A = (ρ_TOA − ρ_atm) / (T(µ_S) . T(µ_V) . t^allgas_g)   (12.11)
  • 349. 12.2. Atmospheric Corrections 315
With:
• ρ_TOA is the reflectance at the top of the atmosphere;
• ρ^unif_S is the ground reflectance under the assumption of a lambertian surface and a uniform environment;
• ρ_atm is the intrinsic atmospheric reflectance;
• t^allgas_g is the total gaseous transmission (for all species);
• S is the spherical albedo of the atmosphere;
• T(µ_S) is the downward transmittance;
• T(µ_V) is the upward transmittance.
All those parameters are contained in the AtmosphericCorrectionParametersTo6SRadiativeTerms output.
filterReflectanceToSurfaceReflectanceImageFilter->SetAtmosphericRadiativeTerms(filterAtmosphericCorrectionParametersTo6SRadiativeTerms->GetOutput());
The next (and last) step is the neighborhood correction. For this, the SurfaceAdjacencyEffect6SCorrectionSchemeFilter class is used. The previous surface reflectance inversion is performed under the assumption of a homogeneous ground environment. The following step corrects the adjacency effect on the radiometry of pixels. The method is based on the decomposition of the observed signal into the sum of the target pixel's own contribution and the contribution of neighboring pixels, moderated by their distance to the target pixel. A simplified relation may be:
ρ_S = (ρ^unif_S . T(µ_V) − ⟨ρ_S⟩ . t_d(µ_V)) / exp(−δ/µ_V)   (12.12)
With:
• ρ^unif_S is the ground reflectance under the assumption of a homogeneous environment;
• T(µ_V) is the upward transmittance;
• t_d(µ_V) is the upward diffuse transmittance;
• exp(−δ/µ_V) is the upward direct transmittance;
• ⟨ρ_S⟩ is the environment contribution to the target pixel reflectance in the total observed signal:
⟨ρ_S⟩ = ∑_j ∑_i f(r(i, j)) × ρ^unif_S(i, j)   (12.13)
where,
  • 350. 316 Chapter 12. Radiometry
– r(i, j) is the distance between the pixel (i, j) and the central pixel of the window, in km;
– f(r) is the global environment function:
f(r) = (t^R_d(µ_V) . f_R(r) + t^A_d(µ_V) . f_A(r)) / t_d(µ_V)   (12.14)
The neighborhood consideration window size is given by the window radius.
An instance of otb::SurfaceAdjacencyEffect6SCorrectionSchemeFilter is created.
typedef otb::SurfaceAdjacencyEffect6SCorrectionSchemeFilter<ImageType, ImageType> SurfaceAdjacencyEffect6SCorrectionSchemeFilterType;
SurfaceAdjacencyEffect6SCorrectionSchemeFilterType::Pointer filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter = SurfaceAdjacencyEffect6SCorrectionSchemeFilterType::New();
This filter needs four inputs:
• Radiometric information (the output of the AtmosphericCorrectionParametersTo6SRadiativeTerms filter);
• The zenithal viewing angle;
• The neighborhood window radius;
• The pixel spacing in kilometers.
At this step, each filter of the chain is instantiated and each one has its input parameters set. A name can be given to the output image and the filters can be linked to one another to create the final processing chain.
writer->SetFileName(argv[2]);
filterImageToLuminance->SetInput(reader->GetOutput());
filterLuminanceToReflectance->SetInput(filterImageToLuminance->GetOutput());
filterReflectanceToSurfaceReflectanceImageFilter->SetInput(filterLuminanceToReflectance->GetOutput());
filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->SetInput(filterReflectanceToSurfaceReflectanceImageFilter->GetOutput());
writer->SetInput(filterSurfaceAdjacencyEffect6SCorrectionSchemeFilter->GetOutput());
The invocation of the Update() method on the writer triggers the execution of the pipeline. It is recommended to place this call in a try/catch block in case errors occur and exceptions are thrown.
try
  {
  writer->Update();
  • 351. 12.2. Atmospheric Corrections 317 } catch (itk:: ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; }
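As a rough sanity check of the surface reflectance inversion of equations (12.10) and (12.11), consider purely illustrative radiative terms (not actual 6S outputs): ρ_TOA = 0.20, ρ_atm = 0.05, T(µ_S) . T(µ_V) = 0.80, t^allgas_g = 0.90 and S = 0.10. Then
A = (0.20 − 0.05) / (0.80 × 0.90) ≈ 0.208
ρ^unif_S = 0.208 / (1 + 0.10 × 0.208) ≈ 0.204
so, for this pixel, the estimated ground reflectance is about 0.20 once the atmospheric path reflectance and the transmission losses have been accounted for.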
  • 353. CHAPTER THIRTEEN IMAGE FUSION Satellite sensors present an important diversity in terms of characteristics. Some provide a high spatial resolution while other focus on providing several spectral bands. The fusion process brings the information from different sensors with different characteristics together to get the best of both worlds. Most of the fusion methods in the remote sensing community deal with the pansharpening tech- nique. This fusion combines the image from the PANchromatic sensor of one satellite (high spatial resolution data) with the multispectral (XS) data (lower resolution in several spectral bands) to gener- ate images with a high resolution and several spectral bands. Several advantages make this situation easier: • PAN and XS images are taken simultaneously from the same satellite (or with a very short delay); • the imaged area is common to both scenes; • many satellites provide these data (SPOT 1-5, Quickbird, Pleiades) This case is well-studied in the literature and many methods exist. Only very few are available in OTB now but this should evolve soon. 13.1 Simple Pan Sharpening A simple way to view the pan-sharpening of data is to consider that, at the same resolution, the panchromatic channel is the sum of the XS channel. After putting the two images in the same geometry, after orthorectification (see chapter 11) with an oversampling of the XS image, we can proceed to the data fusion. The idea is to apply a low pass filter to the panchromatic band to give it a spectral content (in the
  • 354. 320 Chapter 13. Image Fusion
Fourier domain) equivalent to the XS data. Then we normalize the XS data with this low-pass panchromatic band and multiply the result with the original panchromatic band. The process is described in figure 13.1.
Figure 13.1: Simple pan-sharpening procedure.
The source code for this example can be found in the file Examples/Fusion/PanSharpeningExample.cxx.
Here we are illustrating the use of the otb::SimpleRcsPanSharpeningFusionImageFilter for PAN-sharpening. This example takes a PAN and the corresponding XS images as input. These images are supposed to be registered.
The fusion operation is defined as
(XS / Filtered(PAN)) · PAN   (13.1)
Figure 13.2 shows the result of applying this PAN sharpening filter to a Quickbird image.
We start by including the required headers and declaring the main function:
#include "otbImage.h"
#include "otbVectorImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "otbSimpleRcsPanSharpeningFusionImageFilter.h"
  • 355. 13.1. Simple Pan Sharpening 321 Figure 13.2: Result of applying the otb::SimpleRcsPanSharpeningFusionImageFilter to orthorectified Quickbird image. From left to right : original PAN image, original XS image and the result of the PAN sharpening
  • 356. 322 Chapter 13. Image Fusion int main(int argc , char* argv[]) { We declare the different image type used here as well as the image reader. Note that, the reader for the PAN image is templated by an otb::Image while the XS reader uses an otb::VectorImage. typedef otb::Image <double, 2> ImageType ; typedef otb::VectorImage <double, 2> VectorImageType; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileReader <VectorImageType > ReaderVectorType; typedef otb::VectorImage <unsigned short int, 2> VectorIntImageType; ReaderVectorType::Pointer readerXS = ReaderVectorType::New(); ReaderType ::Pointer readerPAN = ReaderType ::New(); We pass the filenames to the readers readerPAN ->SetFileName (argv[1]); readerXS ->SetFileName (argv[2]); We declare the fusion filter an set its inputs using the readers: typedef otb:: SimpleRcsPanSharpeningFusionImageFilter <ImageType , VectorImageType , VectorIntImageType > FusionFilterType; FusionFilterType::Pointer fusion = FusionFilterType::New (); fusion ->SetPanInput (readerPAN ->GetOutput ()); fusion ->SetXsInput (readerXS ->GetOutput ()); And finally, we declare the writer and call its Update() method to trigger the full pipeline execution. typedef otb::ImageFileWriter <VectorIntImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetFileName (argv[3]); writer ->SetInput(fusion ->GetOutput ()); writer ->Update(); 13.2 Bayesian Data Fusion The source code for this example can be found in the file Examples/Fusion/BayesianFusionImageFilter.cxx. The following example illustrates the use of the otb::BayesianFusionFilter. The Bayesian data fusion relies on the idea that variables of interest, denoted as vector Z, cannot be directly observed. They are linked to the observable variables Y through the following error-like model. Y = g(Z)+ E (13.2)
  • 357. 13.2. Bayesian Data Fusion 323
where g(Z) is a set of functionals and E is a vector of random errors that are stochastically independent from Z. This algorithm uses elementary probability calculus, and several assumptions, to compute the data fusion. For more explanation, see the publication by Fasbender, Radoux and Bogaert [42].
Three images and one parameter are used:
• a panchromatic image,
• a multispectral image resampled to the panchromatic image spatial resolution,
• a multispectral image interpolated to the panchromatic image spatial resolution, using, e.g., a cubic interpolator,
• a float λ, the weight given to the panchromatic image compared to the multispectral one.
Let's look at the minimal code required to use this algorithm. First, the following header defining the otb::BayesianFusionFilter class must be included.
#include "otbBayesianFusionFilter.h"
The image types are now defined using pixel types and a particular dimension. The panchromatic image is defined as an otb::Image and the multispectral one as an otb::VectorImage.
typedef double InternalPixelType;
const unsigned int Dimension = 2;
typedef otb::Image<InternalPixelType, Dimension>       PanchroImageType;
typedef otb::VectorImage<InternalPixelType, Dimension> MultiSpecImageType;
The Bayesian data fusion filter type is instantiated using the image types as template parameters.
typedef otb::BayesianFusionFilter<MultiSpecImageType, MultiSpecImageType, PanchroImageType, OutputImageType> BayesianFusionFilterType;
Next the filter is created by invoking the New() method and assigning the result to an itk::SmartPointer.
BayesianFusionFilterType::Pointer bayesianFilter = BayesianFusionFilterType::New();
Now the multispectral image, the interpolated multispectral image and the panchromatic image are given as inputs to the filter.
bayesianFilter->SetMultiSpect(multiSpectReader->GetOutput());
bayesianFilter->SetMultiSpectInterp(multiSpectInterpReader->GetOutput());
bayesianFilter->SetPanchro(panchroReader->GetOutput());
writer->SetInput(bayesianFilter->GetOutput());
  • 358. 324 Chapter 13. Image Fusion Figure 13.3: Input images used for this example ( c European Space Imaging). The BayesianFusionFilter requires defining one parameter : λ. The λ parameter can be used to tune the fusion toward either a high color consistency or sharp details. Typical λ value range in [0.5,1[, where higher values yield sharper details. by default λ is set at 0.9999. bayesianFilter ->SetLambda (atof(argv[9])); The invocation of the Update() method on the writer triggers the execution of the pipeline. It is recommended to place update calls in a try/catch block in case errors occur and exceptions are thrown. try { writer ->Update (); } catch (itk::ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Let’s now run this example using as input the images multiSpect.tif , multiSpectInterp.tif and panchro.tif provided in the directory Examples/Data. The results obtained for 2 different values for λ are shown in figure 13.3.
  • 359. 13.2. Bayesian Data Fusion 325 Figure 13.4: Fusion results for the Bayesian Data Fusion filter for λ = 0.5 on the left and λ = 0.9999 on the right.
  • 361. CHAPTER FOURTEEN FEATURE EXTRACTION
Under the term Feature Extraction we include several techniques aiming to detect or extract information of a low level of abstraction from images. These features can be objects: points, lines, etc. They can also be measures: moments, textures, etc.
14.1 Textures
14.1.1 Haralick Descriptors
This example illustrates the use of the otb::ScalarImageToTexturesFilter, which computes the standard Haralick textural features [57] presented in table 14.1, where µ_t and σ_t are the mean and standard deviation of the row (or column, due to symmetry) sums,
µ (weighted pixel average) = ∑_{i,j} i · g(i, j) = ∑_{i,j} j · g(i, j) (due to matrix symmetry), and
σ (weighted pixel variance) = ∑_{i,j} (i − µ)² · g(i, j) = ∑_{i,j} (j − µ)² · g(i, j) (due to matrix symmetry).
More features are available in otb::ScalarImageToAdvancedTexturesFilter.
The following classes provide similar functionality:
• otb::ScalarImageToAdvancedTexturesFilter
• otb::ScalarImageToPanTexTextureFilter
• otb::MaskedScalarImageToGreyLevelCooccurrenceMatrixGenerator
• itk::GreyLevelCooccurrenceMatrixTextureCoefficientsCalculator
• otb::GreyLevelCooccurrenceMatrixAdvancedTextureCoefficientsCalculator
The source code for this example can be found in the file Examples/FeatureExtraction/TextureExample.cxx.
The first step required to use the filter is to include the header file.
Energy                      f1 = Σ_{i,j} g(i, j)²
Entropy                     f2 = −Σ_{i,j} g(i, j) log₂ g(i, j), or 0 if g(i, j) = 0
Correlation                 f3 = Σ_{i,j} (i − µ)(j − µ) g(i, j) / σ²
Difference Moment           f4 = Σ_{i,j} g(i, j) / (1 + (i − j)²)
Inertia (a.k.a. Contrast)   f5 = Σ_{i,j} (i − j)² g(i, j)
Cluster Shade               f6 = Σ_{i,j} ((i − µ) + (j − µ))³ g(i, j)
Cluster Prominence          f7 = Σ_{i,j} ((i − µ) + (j − µ))⁴ g(i, j)
Haralick's Correlation      f8 = (Σ_{i,j} i · j · g(i, j) − µ_t²) / σ_t²

Table 14.1: Haralick features [57] available in otb::ScalarImageToTexturesFilter
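To make the definitions in table 14.1 concrete, the following minimal standalone sketch (not taken from the OTB sources) computes the energy, entropy and inertia features directly from a normalized co-occurrence matrix g(i, j); the matrix size and values are hypothetical.

#include <cmath>
#include <iostream>
#include <vector>

int main()
{
  // Hypothetical 4x4 normalized co-occurrence matrix g(i, j) (entries sum to 1).
  const unsigned int N = 4;
  std::vector<std::vector<double> > g(N, std::vector<double>(N, 1.0 / (N * N)));

  double energy = 0.0, entropy = 0.0, inertia = 0.0;
  for (unsigned int i = 0; i < N; ++i)
    for (unsigned int j = 0; j < N; ++j)
    {
      energy += g[i][j] * g[i][j];                                   // f1
      if (g[i][j] > 0.0)
        entropy -= g[i][j] * std::log(g[i][j]) / std::log(2.0);      // f2
      double d = double(i) - double(j);
      inertia += d * d * g[i][j];                                    // f5
    }

  std::cout << "Energy = " << energy << ", Entropy = " << entropy
            << ", Inertia = " << inertia << std::endl;
  return 0;
}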
#include "otbScalarImageToTexturesFilter.h"

After defining the types for the pixels and the images used in the example, we define the types for the textures filter. It is templated by the input and output image types.

typedef otb::ScalarImageToTexturesFilter<ImageType, ImageType> TexturesFilterType;

We can now instantiate the filters.

TexturesFilterType::Pointer texturesFilter = TexturesFilterType::New();

The texture filter takes at least 2 parameters: the radius of the neighborhood on which the texture will be computed and the offset used. Texture features are bivariate statistics, that is, they are computed using pairs of pixels. Each texture feature is defined for an offset defining the pixel pair. The radius parameter can be passed to the filter as a scalar parameter if the neighborhood is square, or as a SizeType in any case. The offset is always an array of N values, where N is the number of dimensions of the image.

typedef ImageType::SizeType SizeType;
SizeType sradius;
sradius.Fill(radius);
texturesFilter->SetRadius(sradius);

typedef ImageType::OffsetType OffsetType;
OffsetType offset;
offset[0] = xOffset;
offset[1] = yOffset;
texturesFilter->SetOffset(offset);

The textures filter will automatically derive the optimal bin size for the co-occurrence histogram, but it needs to know the input image minimum and maximum. These values can be set like this:

texturesFilter->SetInputImageMinimum(0);
texturesFilter->SetInputImageMaximum(255);

To tune the co-occurrence histogram resolution, you can use the SetNumberOfBinsPerAxis() method. We can now plug the pipeline.

texturesFilter->SetInput(reader->GetOutput());
writer->SetInput(texturesFilter->GetInertiaOutput());
writer->Update();

Figure 14.1 shows the result of applying the contrast texture computation.
Figure 14.1: Result of applying the otb::ScalarImageToTexturesFilter to an image. From left to right: original image, contrast.

14.1.2 PanTex

The source code for this example can be found in the file Examples/FeatureExtraction/PanTexExample.cxx.

This example illustrates the use of the otb::ScalarImageToPanTexTextureFilter. This texture parameter was first introduced in [110] and is very useful for urban area detection.

The following classes provide similar functionality:

• otb::ScalarImageToTexturesFilter
• otb::ScalarImageToAdvancedTexturesFilter

The first step required to use this filter is to include its header file.

#include "otbScalarImageToPanTexTextureFilter.h"

After defining the types for the pixels and the images used in the example, we define the type for the PanTex filter. It is templated by the input and output image types.

typedef otb::ScalarImageToPanTexTextureFilter<ImageType, ImageType> PanTexTextureFilterType;

We can now instantiate the filter.

PanTexTextureFilterType::Pointer textureFilter = PanTexTextureFilterType::New();
Figure 14.2: Result of applying the otb::ScalarImageToPanTexTextureFilter to an image. From left to right: original image, PanTex feature.

Then, we set the parameters of the filter: the radius of the neighborhood used to compute the texture and the number of bins per axis for histogram generation (it is the size of the co-occurrence matrix). Moreover, we have to specify the Min/Max of the input image. In the example, the image Min/Max is set by the user to 0 and 255. Alternatively, you can use the class itk::MinimumMaximumImageCalculator to calculate these values.

PanTexTextureFilterType::SizeType sradius;
sradius.Fill(4);
textureFilter->SetNumberOfBinsPerAxis(8);
textureFilter->SetRadius(sradius);
textureFilter->SetInputImageMinimum(0);
textureFilter->SetInputImageMaximum(255);

We can now plug the pipeline and trigger the execution by calling the Update() method of the writer.

textureFilter->SetInput(reader->GetOutput());
writer->SetInput(textureFilter->GetOutput());
writer->Update();

Figure 14.2 shows the result of applying the PanTex computation.

14.1.3 Structural Feature Set

The source code for this example can be found in the file Examples/FeatureExtraction/SFSExample.cxx.

This example illustrates the use of the otb::SFSTexturesImageFilter. This filter computes the Structural Feature Set as described in [62]. These features are textural parameters which give information about the structure of lines passing through each pixel of the image.

The first step required to use this filter is to include its header file.

#include "otbSFSTexturesImageFilter.h"
As with every OTB program, we start by defining the types for the images, the readers and the writers.

typedef otb::Image<PixelType, Dimension> ImageType;
typedef otb::ImageFileReader<ImageType> ReaderType;
typedef otb::ImageFileWriter<ImageType> WriterType;

Then we can instantiate the type for the SFS filter, which is templated over the input and output pixel types.

typedef otb::SFSTexturesImageFilter<ImageType, ImageType> SFSFilterType;

After that, we can instantiate the filter. We will also instantiate the reader and one writer for each output image, since the SFS filter generates 6 different features.

SFSFilterType::Pointer filter = SFSFilterType::New();
ReaderType::Pointer reader = ReaderType::New();
WriterType::Pointer writerLength = WriterType::New();
WriterType::Pointer writerWidth = WriterType::New();
WriterType::Pointer writerWMean = WriterType::New();
WriterType::Pointer writerRatio = WriterType::New();
WriterType::Pointer writerSD = WriterType::New();
WriterType::Pointer writerPsi = WriterType::New();

The SFS filter has several parameters which have to be selected. They are:

1. a spectral threshold to decide if 2 neighboring pixels are connected;
2. a spatial threshold defining the maximum length for an extracted line;
3. the number of directions which will be analyzed (the first one is to the right and they are equally distributed between 0 and 2π);
4. the α parameter for the ω-mean feature;
5. the RatioMax parameter for the ω-mean feature.

filter->SetSpectralThreshold(spectThresh);
filter->SetSpatialThreshold(spatialThresh);
filter->SetNumberOfDirections(dirNb);
filter->SetRatioMaxConsiderationNumber(maxConsideration);
filter->SetAlpha(alpha);

In order to disable the computation of a feature, the SetFeatureStatus parameter can be used. The true value enables the feature (default behavior) and the false value disables the computation. Therefore, the following line is useless, but is given here as an example.

filter->SetFeatureStatus(SFSFilterType::PSI, true);
  • 367. 14.2. Interest Points 333 Now, we plug the pipeline using all the writers. filter ->SetInput(reader ->GetOutput ()); writerLength ->SetFileName (outNameLength); writerLength ->SetInput(filter ->GetLengthOutput()); writerLength ->Update(); writerWidth ->SetFileName (outNameWidth); writerWidth ->SetInput (filter ->GetWidthOutput()); writerWidth ->Update(); writerWMean ->SetFileName (outNameWMean); writerWMean ->SetInput (filter ->GetWMeanOutput()); writerWMean ->Update(); writerRatio ->SetFileName (outNameRatio); writerRatio ->SetInput (filter ->GetRatioOutput()); writerRatio ->Update(); writerSD ->SetFileName (outNameSD ); writerSD ->SetInput(filter ->GetSDOutput ()); writerSD ->Update (); writerPsi ->SetFileName (outNamePsi ); writerPsi ->SetInput (filter ->GetPSIOutput()); writerPsi ->Update (); Figure 14.3 shows the result of applying the SFS computation to an image 14.2 Interest Points 14.2.1 Harris detector The source code for this example can be found in the file Examples/FeatureExtraction/HarrisExample.cxx. This example illustrates the use of the otb::HarrisImageFilter. The first step required to use this filter is to include its header file. #include "otbHarrisImageFilter.h" The otb::HarrisImageFilter is templated over the input and output image types, so we start by defining: typedef otb::HarrisImageFilter <InputImageType , InputImageType > HarrisFilterType;
Figure 14.3: Result of applying the otb::SFSTexturesImageFilter to an image. From left to right and top to bottom: original image, length, width, ω-mean, ratio, SD and Psi structural features.
Figure 14.4: Result of applying the otb::HarrisImageFilter to a Spot 5 image.

The otb::HarrisImageFilter needs some parameters to operate. The derivative computation is performed by a convolution with the derivative of a Gaussian kernel of variance σ_D (derivation scale) and the smoothing of the image is performed by convolving with a Gaussian kernel of variance σ_I (integration scale). This allows the computation of the following matrix:

µ(x, σ_I, σ_D) = σ_D² g(σ_I) ⋆ [ L_x²(x, σ_D)     L_x L_y(x, σ_D)
                                 L_x L_y(x, σ_D)  L_y²(x, σ_D) ]          (14.1)

The output of the detector is det(µ) − α trace²(µ).

harris->SetSigmaD(SigmaD);
harris->SetSigmaI(SigmaI);
harris->SetAlpha(Alpha);

Figure 14.4 shows the result of applying the interest point detector to a small patch extracted from a Spot 5 image.

The output of the otb::HarrisImageFilter is an image where, for each pixel, we obtain the intensity of the detection. Often, the user may want to get access to the set of points for which the output of the detector is higher than a given threshold. This can be obtained by using the otb::HarrisImageToPointSetFilter. This filter is only templated over the input image type, the output being an itk::PointSet with pixel type equal to the image pixel type.

typedef otb::HarrisImageToPointSetFilter<InputImageType> FunctionType;

We declare now the filter and a pointer to the output point set.

typedef FunctionType::OutputPointSetType OutputPointSetType;
FunctionType::Pointer harrisPoints = FunctionType::New();
OutputPointSetType::Pointer pointSet = OutputPointSetType::New();

The otb::HarrisImageToPointSetFilter takes the same parameters as the otb::HarrisImageFilter and an additional parameter: the threshold for the point selection.
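As an aside on equation (14.1), the following minimal standalone sketch (not part of the OTB example; the matrix entries and the α value are hypothetical) evaluates the cornerness measure det(µ) − α·trace²(µ) for a single 2×2 matrix µ, which is what the filter computes at every pixel.

#include <iostream>

int main()
{
  // Hypothetical entries of the smoothed moment matrix mu at one pixel.
  double muXX  = 2.0;   // term built from L_x^2
  double muXY  = 0.5;   // term built from L_x L_y
  double muYY  = 1.5;   // term built from L_y^2
  double alpha = 0.04;  // hypothetical value of the Alpha parameter

  double det        = muXX * muYY - muXY * muXY;
  double trace      = muXX + muYY;
  double cornerness = det - alpha * trace * trace;

  std::cout << "Harris response = " << cornerness << std::endl;
  return 0;
}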
  • 370. 336 Chapter 14. Feature Extraction harrisPoints ->SetInput (0, reader ->GetOutput ()); harrisPoints ->SetSigmaD (SigmaD); harrisPoints ->SetSigmaI (SigmaI); harrisPoints ->SetAlpha(Alpha); harrisPoints ->SetLowerThreshold(10); pointSet = harrisPoints ->GetOutput (); We can now iterate through the obtained pointset and access the coordinates of the points. We start by accessing the container of the points which is encapsulated into the point set (see section 5.2 for more information on using itk::PointSets) and declaring an iterator to it. typedef OutputPointSetType::PointsContainer ContainerType; ContainerType* pointsContainer = pointSet ->GetPoints (); typedef ContainerType::Iterator IteratorType; IteratorType itList = pointsContainer ->Begin(); And we get the points coordinates while (itList != pointsContainer ->End ()) { typedef OutputPointSetType:: PointType OutputPointType; OutputPointType pCoordinate = (itList.Value()); std::cout << pCoordinate << std::endl; ++itList; } 14.2.2 SIFT detector The source code for this example can be found in the file Examples/Patented/SIFTFastExample.cxx. This example illustrates the use of the otb::SiftFastImageFilter. The Scale-Invariant Feature Transform (or SIFT) is an algorithm in computer vision to detect and describe local features in images. The algorithm was published by David Lowe [91]. The detection and description of local image features can help in object recognition and image registration. The SIFT features are local and based on the appearance of the object at particular interest points, and are invariant to image scale and rotation. They are also robust to changes in illumination, noise, occlusion and minor changes in viewpoint. The first step required to use this filter is to include its header file. #include "otbSiftFastImageFilter.h" We will start by defining the required types. We will work with a scalar image of float pixels. We also define the corresponding image reader. typedef float RealType;
  • 371. 14.2. Interest Points 337 typedef otb::Image <RealType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; The SIFT descriptors will be stored in a point set containing the vector of features. typedef itk::VariableLengthVector <RealType > RealVectorType; typedef itk::PointSet <RealVectorType , Dimension > PointSetType; The SIFT filter itself is templated over the input image and the generated point set. typedef otb::SiftFastImageFilter <ImageType , PointSetType > ImageToFastSIFTKeyPointSetFilterType; We instantiate the reader. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (infname ); We instantiate the filter. ImageToFastSIFTKeyPointSetFilterType::Pointer filter = ImageToFastSIFTKeyPointSetFilterType::New (); We plug the filter and set the number of scales for the SIFT computation. We can afterwards run the processing with the Update() method. filter ->SetInput(reader ->GetOutput ()); filter ->SetScalesNumber(scales); filter ->Update(); Once the SIFT are computed, we may want to draw them on top of the input image. In order to do this, we will create the following RGB image and the corresponding writer: typedef unsigned char PixelType ; typedef itk::RGBPixel <PixelType > RGBPixelType; typedef otb::Image <RGBPixelType , 2> OutputImageType; typedef otb::ImageFileWriter <OutputImageType > WriterType ; OutputImageType::Pointer outputImage = OutputImageType::New(); We set the regions of the image by copying the information from the input image and we allocate the memory for the output image. outputImage ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion()); outputImage ->Allocate (); We can now proceed to copy the input image into the output one using region iterators. The input image is a grey level one. The output image will be made of color crosses for each SIFT on top of the grey level input image. So we start by copying the grey level values on each of the 3 channels of the color image.
  • 372. 338 Chapter 14. Feature Extraction itk::ImageRegionIterator <OutputImageType > iterOutput ( outputImage , outputImage -> GetLargestPossibleRegion()); itk::ImageRegionIterator <ImageType > iterInput (reader ->GetOutput (), reader ->GetOutput ()-> GetLargestPossibleRegion()); for (iterOutput .GoToBegin (), iterInput .GoToBegin (); !iterOutput .IsAtEnd (); ++iterOutput , ++iterInput ) { OutputImageType::PixelType rgbPixel; rgbPixel.SetRed(static_cast<PixelType >(iterInput .Get ())); rgbPixel.SetGreen (static_cast<PixelType >(iterInput .Get ())); rgbPixel.SetBlue(static_cast<PixelType >(iterInput .Get ())); iterOutput .Set(rgbPixel ); } We are now going to plot color crosses on the output image. We will need to define offsets (top, bottom, left and right) with respect to the SIFT position in order to draw the cross segments. ImageType ::OffsetType t = {{ 0, 1}}; ImageType ::OffsetType b = {{ 0, -1}}; ImageType ::OffsetType l = {{ 1, 0}}; ImageType ::OffsetType r = {{-1, 0}}; Now, we are going to access the point set generated by the SIFT filter. The points are stored into a points container that we are going to walk through using an iterator. These are the types needed for this task: typedef PointSetType::PointsContainer PointsContainerType; typedef PointsContainerType::Iterator PointsIteratorType; We set the iterator to the beginning of the point set. PointsIteratorType pIt = filter ->GetOutput ()->GetPoints ()->Begin(); We get the information about image size and spacing before drawing the crosses. ImageType ::SpacingType spacing = reader ->GetOutput ()->GetSpacing (); ImageType ::PointType origin = reader ->GetOutput ()->GetOrigin (); And we iterate through the SIFT set: while (pIt != filter ->GetOutput ()->GetPoints ()->End ()) {
  • 373. 14.2. Interest Points 339 We get the pixel coordinates for each SIFT by using the Value() method on the point set iterator. We use the information about size and spacing in order to convert the physical coordinates of the point into pixel coordinates. ImageType ::IndexType index; index[0] = static_cast<unsigned int>(vcl_floor ( static_cast<double>( (pIt.Value()[0] - origin [0]) / spacing [0] + 0.5 ))); index[1] = static_cast<unsigned int>(vcl_floor ( static_cast<double>( (pIt.Value()[1] - origin [1]) / spacing [1] + 0.5 ))); We create a green pixel. OutputImageType::PixelType keyPixel; keyPixel.SetRed (0); keyPixel.SetGreen (255); keyPixel.SetBlue (0); We draw the crosses using the offsets and checking that we are inside the image, since SIFTs on the image borders would cause an out of bounds pixel access. if (outputImage ->GetLargestPossibleRegion(). IsInside (index)) { outputImage ->SetPixel(index , keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + t)) outputImage -> SetPixel (index + t, keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + b)) outputImage -> SetPixel (index + b, keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + l)) outputImage -> SetPixel (index + l, keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + r)) outputImage -> SetPixel (index + r, keyPixel ); } ++pIt;
  • 374. 340 Chapter 14. Feature Extraction Figure 14.5: Result of applying the otb::SiftFastImageFilter to a Spot 5 image. } Finally, we write the image. WriterType ::Pointer writer = WriterType ::New (); writer ->SetFileName (outputImageFilename); writer ->SetInput(outputImage ); writer ->Update(); Figure 14.5 shows the result of applying the SIFT point detector to a small patch extracted from a Spot 5 image. 14.2.3 SURF detector The source code for this example can be found in the file Examples/FeatureExtraction/SURFExample.cxx. This example illustrates the use of the otb::ImageToSURFKeyPointSetFilter. The Speed-Up Robust Features (or SURF) is an algorithm in computer vision to detect and describe local features in images. The algorithm is detailed in [11]. The applications of SURF are the same as those for SIFT. The first step required to use this filter is to include its header file. #include "otbImageToSURFKeyPointSetFilter.h" We will start by defining the required types. We will work with a scalar image of float pixels. We also define the corresponding image reader.
  • 375. 14.2. Interest Points 341 typedef float RealType ; typedef otb::Image <RealType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; The SURF descriptors will be stored in a point set containing the vector of features. typedef itk::VariableLengthVector <RealType > RealVectorType; typedef itk::PointSet <RealVectorType , Dimension > PointSetType; The SURF filter itself is templated over the input image and the generated point set. typedef otb::ImageToSURFKeyPointSetFilter<ImageType , PointSetType > ImageToFastSURFKeyPointSetFilterType; We instantiate the reader. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (infname ); We instantiate the filter. ImageToFastSURFKeyPointSetFilterType::Pointer filter = ImageToFastSURFKeyPointSetFilterType::New (); We plug the filter and set the number of scales for the SURF computation. We can afterwards run the processing with the Update() method. filter ->SetInput(reader ->GetOutput ()); filter ->SetOctavesNumber(octaves ); filter ->SetScalesNumber(scales); filter ->Update(); Once the SURF are computed, we may want to draw them on top of the input image. In order to do this, we will create the following RGB image and the corresponding writer: typedef unsigned char PixelType ; typedef itk::RGBPixel <PixelType > RGBPixelType; typedef otb::Image <RGBPixelType , 2> OutputImageType; typedef otb::ImageFileWriter <OutputImageType > WriterType ; OutputImageType::Pointer outputImage = OutputImageType::New(); We set the regions of the image by copying the information from the input image and we allocate the memory for the output image. outputImage ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion()); outputImage ->Allocate ();
  • 376. 342 Chapter 14. Feature Extraction We can now proceed to copy the input image into the output one using region iterators. The input image is a grey level one. The output image will be made of color crosses for each SURF on top of the grey level input image. So we start by copying the grey level values on each of the 3 channels of the color image. itk::ImageRegionIterator <OutputImageType > iterOutput ( outputImage , outputImage -> GetLargestPossibleRegion()); itk::ImageRegionIterator <ImageType > iterInput (reader ->GetOutput (), reader ->GetOutput ()-> GetLargestPossibleRegion()); for (iterOutput .GoToBegin (), iterInput .GoToBegin (); !iterOutput .IsAtEnd (); ++iterOutput , ++iterInput ) { OutputImageType::PixelType rgbPixel; rgbPixel.SetRed(static_cast<PixelType >(iterInput .Get ())); rgbPixel.SetGreen (static_cast<PixelType >(iterInput .Get ())); rgbPixel.SetBlue(static_cast<PixelType >(iterInput .Get ())); iterOutput .Set(rgbPixel ); } We are now going to plot color crosses on the output image. We will need to define offsets (top, bottom, left and right) with respect to the SURF position in order to draw the cross segments. ImageType ::OffsetType t = {{ 0, 1}}; ImageType ::OffsetType b = {{ 0, -1}}; ImageType ::OffsetType l = {{ 1, 0}}; ImageType ::OffsetType r = {{-1, 0}}; Now, we are going to access the point set generated by the SURF filter. The points are stored into a points container that we are going to walk through using an iterator. These are the types needed for this task: typedef PointSetType::PointsContainer PointsContainerType; typedef PointsContainerType::Iterator PointsIteratorType; We set the iterator to the beginning of the point set. PointsIteratorType pIt = filter ->GetOutput ()->GetPoints ()->Begin(); We get the information about image size and spacing before drawing the crosses. ImageType :: SpacingType spacing = reader ->GetOutput ()->GetSpacing (); ImageType :: PointType origin = reader ->GetOutput ()->GetOrigin (); //OutputImageType::SizeType size = outputImage->GetLargestPossibleRegion().GetSize(); And we iterate through the SURF set:
  • 377. 14.2. Interest Points 343 while (pIt != filter ->GetOutput ()->GetPoints ()->End ()) { We get the pixel coordinates for each SURF by using the Value() method on the point set iterator. We use the information about size and spacing in order to convert the physical coordinates of the point into pixel coordinates. ImageType ::IndexType index; index[0] = static_cast<unsigned int>(vcl_floor ( static_cast<double>( (pIt.Value()[0] - origin [0]) / spacing [0] + 0.5 ))); index[1] = static_cast<unsigned int>(vcl_floor ( static_cast<double>( (pIt.Value()[1] - origin [1]) / spacing [1] + 0.5 ))); We create a green pixel. OutputImageType::PixelType keyPixel; keyPixel.SetRed (0); keyPixel.SetGreen (255); keyPixel.SetBlue (0); We draw the crosses using the offsets and checking that we are inside the image, since SURFs on the image borders would cause an out of bounds pixel access. if (outputImage ->GetLargestPossibleRegion(). IsInside (index)) { outputImage ->SetPixel(index , keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + t)) outputImage -> SetPixel (index + t, keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + b)) outputImage -> SetPixel (index + b, keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + l)) outputImage -> SetPixel (index + l, keyPixel ); if (outputImage ->GetLargestPossibleRegion(). IsInside(index + r))
  • 378. 344 Chapter 14. Feature Extraction Figure 14.6: Result of applying the otb::ImageToSURFKeyPointSetFilter to a Spot 5 image. outputImage -> SetPixel (index + r, keyPixel ); } ++pIt; } Finally, we write the image. WriterType ::Pointer writer = WriterType ::New (); writer ->SetFileName (outputImageFilename); writer ->SetInput(outputImage ); writer ->Update(); Figure 14.6 shows the result of applying the SURF point detector to a small patch extracted from a Spot 5 image. 14.3 Alignments The source code for this example can be found in the file Examples/FeatureExtraction/AlignmentsExample.cxx. This example illustrates the use of the otb::ImageToPathListAlignFilter. This filter allows to extract meaninful alignments. Alignments (that is edges and lines) are detected using the Gestalt approach proposed by Desolneux et al. [40]. In this context, an event is considered meaningful if the expectation of its occurrence would be very small in a random image. One can thus consider that in a random image the direction of the gradient of a given point is uniformly distributed, and that neighbouring pixels have a very low probability of having the same gradient direction. This
  • 379. 14.3. Alignments 345 algorithm gives a set of straight line segments defined by the two extremity coordinates under the form of a std::list of itk::PolyLineParametricPath. The first step required to use this filter is to include its header. #include "otbImageToPathListAlignFilter.h" In order to visualize the detected alignments, we will use the facility class otb::DrawPathFilter which draws a itk::PolyLineParametricPath on top of a given image. #include "itkPolyLineParametricPath.h" #include "otbDrawPathFilter.h" The otb::ImageToPathListAlignFilter is templated over the input image type and the output path type, so we start by defining: typedef itk::PolyLineParametricPath <Dimension > PathType ; typedef otb::ImageToPathListAlignFilter<InputImageType , PathType > ListAlignFilterType; Next, we build the pipeline. ListAlignFilterType::Pointer alignFilter = ListAlignFilterType::New(); alignFilter ->SetInput (reader ->GetOutput ()); We can choose the number of accepted false alarms in the detection with the method SetEps() for which the parameter is of the form −log10(max. number of false alarms). alignFilter ->SetEps(atoi(argv[3])); As stated, above, the otb::DrawPathFilter, is useful for drawint the detected alignments. This class is templated over the input image and path types and also on the output image type. typedef otb::DrawPathFilter <InputImageType , PathType , OutputImageType > DrawPathFilterType; We will now go through the list of detected paths and feed them to the otb::DrawPathFilter inside a loop. We will use a list iterator inside a while statement. typedef ListAlignFilterType::OutputPathListType ListType; ListType * pathList = alignFilter ->GetOutput (); ListType ::Iterator listIt = pathList ->Begin(); We define a dummy image will be iteratively fed to the otb::DrawPathFilter after the drawing of each alignment. InputImageType::Pointer backgroundImage = reader ->GetOutput ();
Figure 14.7: Result of applying the otb::ImageToPathListAlignFilter to a VHR image of a suburb.

We iterate through the list and write the result to a file.

while (listIt != pathList->End())
{
  DrawPathFilterType::Pointer drawPathFilter = DrawPathFilterType::New();
  drawPathFilter->SetImageInput(backgroundImage);
  drawPathFilter->SetInputPath(listIt.Get());
  drawPathFilter->SetValue(itk::NumericTraits<OutputPixelType>::max());
  drawPathFilter->Update();
  backgroundImage = drawPathFilter->GetOutput();
  ++listIt;
}

writer->SetInput(backgroundImage);

Figure 14.7 shows the result of applying the alignment detection to a small patch extracted from a VHR image.

14.4 Lines

14.4.1 Line Detection

The source code for this example can be found in the file Examples/FeatureExtraction/RatioLineDetectorExample.cxx.

This example illustrates the use of the otb::LineRatioDetectorImageFilter. This filter is used for line detection in SAR images. Its principle is described in [134]: a line is detected if two parallel edges are present in the image. These edges are detected with the ratio of means detector.
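As background, a common formulation of the ratio-of-means edge response between the two half-neighborhoods on either side of a candidate edge is r = 1 − min(µ1/µ2, µ2/µ1). The minimal sketch below illustrates this formulation only; it is not the OTB implementation, and the mean values are hypothetical.

#include <algorithm>
#include <iostream>

// Ratio-of-means edge response: close to 0 when the two means are similar,
// close to 1 when they differ strongly (a likely edge).
double RatioEdgeResponse(double mean1, double mean2)
{
  double ratio = std::min(mean1 / mean2, mean2 / mean1);
  return 1.0 - ratio;
}

int main()
{
  // Hypothetical mean radiometries on both sides of a candidate edge.
  std::cout << RatioEdgeResponse(120.0, 30.0) << std::endl;  // strong edge
  std::cout << RatioEdgeResponse(100.0, 95.0) << std::endl;  // weak edge
  return 0;
}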
  • 381. 14.4. Lines 347 The first step required to use this filter is to include its header file. #include "otbLineRatioDetectorImageFilter.h" Then we must decide what pixel type to use for the image. We choose to make all computations with floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef float InternalPixelType; typedef unsigned char OutputPixelType; The images are defined using the pixel type and the dimension. typedef otb::Image <InternalPixelType , 2> InternalImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter can be instantiated using the image types defined above. typedef otb:: LineRatioDetectorImageFilter <InternalImageType , InternalImageType > FilterType ; An otb::ImageFileReader class is also instantiated in order to read image data from a file. typedef otb::ImageFileReader <InternalImageType > ReaderType ; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter <OutputImageType > WriterType ; The intensity rescaling of the results will be carried out by the itk::RescaleIntensityImageFilter which is templated by the input and output image types. typedef itk::RescaleIntensityImageFilter<InternalImageType , OutputImageType > RescalerType; Both the filter and the reader are created by invoking their New() methods and assigning the result to SmartPointers. ReaderType ::Pointer reader = ReaderType ::New (); FilterType ::Pointer filter = FilterType ::New (); The same is done for the rescaler and the writer. RescalerType::Pointer rescaler = RescalerType::New (); WriterType ::Pointer writer = WriterType ::New(); The itk::RescaleIntensityImageFilter needs to know which is the minimu and maximum values of the output generated image. Those can be chosen in a generic way by using the NumericTraits functions, since they are templated over the pixel type.
Figure 14.8: Result of applying the otb::LineRatioDetectorImageFilter to a SAR image. From left to right: original image, line intensity and edge orientation.

rescaler->SetOutputMinimum(itk::NumericTraits<OutputPixelType>::min());
rescaler->SetOutputMaximum(itk::NumericTraits<OutputPixelType>::max());

The image obtained with the reader is passed as input to the otb::LineRatioDetectorImageFilter. The pipeline is built as follows.

filter->SetInput(reader->GetOutput());
rescaler->SetInput(filter->GetOutput());
writer->SetInput(rescaler->GetOutput());

The methods SetLengthLine() and SetWidthLine() allow setting the minimum length and the typical width of the lines which are to be detected.

filter->SetLengthLine(atoi(argv[4]));
filter->SetWidthLine(atoi(argv[5]));

The filter is executed by invoking the Update() method. If the filter is part of a larger image processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.

filter->Update();

We can also obtain the direction of the lines by invoking the GetOutputDirection() method.

rescaler->SetInput(filter->GetOutputDirection());
writer->SetInput(rescaler->GetOutput());
writer->Update();

Figure 14.8 shows the result of applying the LineRatio edge detector filter to a SAR image.

The following classes provide similar functionality:

• otb::LineCorrelationDetectorImageFilter
  • 383. 14.4. Lines 349 The source code for this example can be found in the file Examples/FeatureExtraction/CorrelationLineDetectorExample.cxx. This example illustrates the use of the otb::CorrelationLineDetectorImageFilter. This filter is used for line detection in SAR images. Its principle is described in [134]: a line is detected if two parallel edges are present in the images. These edges are detected with the correlation of means detector. The first step required to use this filter is to include its header file. #include "otbLineCorrelationDetectorImageFilter.h" Then we must decide what pixel type to use for the image. We choose to make all computations with floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef float InternalPixelType; typedef unsigned char OutputPixelType; The images are defined using the pixel type and the dimension. typedef otb::Image <InternalPixelType , 2> InternalImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter can be instantiated using the image types defined above. typedef otb::LineCorrelationDetectorImageFilter<InternalImageType , InternalImageType > FilterType ; An otb::ImageFileReader class is also instantiated in order to read image data from a file. typedef otb::ImageFileReader <InternalImageType > ReaderType ; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter <OutputImageType > WriterType ; The intensity rescaling of the results will be carried out by the itk::RescaleIntensityImageFilter which is templated by the input and output image types. typedef itk::RescaleIntensityImageFilter<InternalImageType , OutputImageType > RescalerType; Both the filter and the reader are created by invoking their New() methods and assigning the result to SmartPointers. ReaderType ::Pointer reader = ReaderType ::New (); FilterType ::Pointer filter = FilterType ::New ();
The same is done for the rescaler and the writer.

RescalerType::Pointer rescaler = RescalerType::New();
WriterType::Pointer writer = WriterType::New();

The itk::RescaleIntensityImageFilter needs to know the minimum and maximum values of the output generated image. Those can be chosen in a generic way by using the NumericTraits functions, since they are templated over the pixel type.

rescaler->SetOutputMinimum(itk::NumericTraits<OutputPixelType>::min());
rescaler->SetOutputMaximum(itk::NumericTraits<OutputPixelType>::max());

The image obtained with the reader is passed as input to the otb::LineCorrelationDetectorImageFilter. The pipeline is built as follows.

filter->SetInput(reader->GetOutput());
rescaler->SetInput(filter->GetOutput());
writer->SetInput(rescaler->GetOutput());

The methods SetLengthLine() and SetWidthLine() allow setting the minimum length and the typical width of the lines which are to be detected.

filter->SetLengthLine(atoi(argv[4]));
filter->SetWidthLine(atoi(argv[5]));

The filter is executed by invoking the Update() method. If the filter is part of a larger image processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.

filter->Update();

We can also obtain the direction of the lines by invoking the GetOutputDirection() method.

rescaler->SetInput(filter->GetOutputDirection());
writer->SetInput(rescaler->GetOutput());
writer->Update();

Figure 14.9 shows the result of applying the LineCorrelation edge detector filter to a SAR image.

The following classes provide similar functionality:

• otb::LineRatioDetectorImageFilter

The source code for this example can be found in the file Examples/FeatureExtraction/AssymmetricFusionOfLineDetectorExample.cxx.

This example illustrates the use of the otb::AssymmetricFusionOfLineDetectorImageFilter.

The first step required to use this filter is to include its header file.

#include "otbAssymmetricFusionOfLineDetectorImageFilter.h"
  • 385. 14.4. Lines 351 Figure 14.9: Result of applying the otb::LineCorrelationDetectorImageFilter to a SAR image. From left to right : original image, line intensity and edge orientation. Then we must decide what pixel type to use for the image. We choose to make all computations with floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef float InternalPixelType; typedef unsigned char OutputPixelType; The images are defined using the pixel type and the dimension. typedef otb::Image <InternalPixelType , 2> InternalImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter can be instantiated using the image types defined above. typedef otb:: AssymmetricFusionOfLineDetectorImageFilter<InternalImageType , InternalImageType > FilterType ; An otb::ImageFileReader class is also instantiated in order to read image data from a file. typedef otb::ImageFileReader <InternalImageType > ReaderType ; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter <OutputImageType > WriterType ; The intensity rescaling of the results will be carried out by the itk::RescaleIntensityImageFilter which is templated by the input and output image types. typedef itk::RescaleIntensityImageFilter<InternalImageType , OutputImageType > RescalerType; Both the filter and the reader are created by invoking their New() methods and assigning the result to SmartPointers.
ReaderType::Pointer reader = ReaderType::New();
FilterType::Pointer filter = FilterType::New();

The same is done for the rescaler and the writer.

RescalerType::Pointer rescaler = RescalerType::New();
WriterType::Pointer writer = WriterType::New();

The itk::RescaleIntensityImageFilter needs to know the minimum and maximum values of the output generated image. Those can be chosen in a generic way by using the NumericTraits functions, since they are templated over the pixel type.

rescaler->SetOutputMinimum(itk::NumericTraits<OutputPixelType>::min());
rescaler->SetOutputMaximum(itk::NumericTraits<OutputPixelType>::max());

The image obtained with the reader is passed as input to the otb::AssymmetricFusionOfLineDetectorImageFilter. The pipeline is built as follows.

filter->SetInput(reader->GetOutput());
rescaler->SetInput(filter->GetOutput());
writer->SetInput(rescaler->GetOutput());

The methods SetLengthLine() and SetWidthLine() allow setting the minimum length and the typical width of the lines which are to be detected.

filter->SetLengthLine(atoi(argv[3]));
filter->SetWidthLine(atoi(argv[4]));

The filter is executed by invoking the Update() method. If the filter is part of a larger image processing pipeline, calling Update() on a downstream filter will also trigger update of this filter.

filter->Update();

Figure 14.10 shows the result of applying the otb::AssymmetricFusionOfLineDetectorImageFilter to a SAR image.

The source code for this example can be found in the file Examples/FeatureExtraction/ParallelLineDetectionExample.cxx.

This example illustrates the details of the otb::ParallelLinePathListFilter.

14.4.2 Segment Extraction

Local Hough Transform

The source code for this example can be found in the file Examples/FeatureExtraction/LocalHoughExample.cxx.
  • 387. 14.4. Lines 353 Figure 14.10: Result of applying the otb::AssymetricFusionOfDetectorImageFilter to a SAR image. From left to right : original image, line intensity. This example illustrates the use of the otb::ExtractSegmentsImageFilter. The first step required to use this filter is to include its header file. #include "otbLocalHoughFilter.h" #include "otbDrawLineSpatialObjectListFilter.h" #include "otbLineSpatialObjectList.h" Then we must decide what pixel type to use for the image. We choose to make all computations with floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef float InternalPixelType; typedef unsigned char OutputPixelType; The images are defined using the pixel type and the dimension. typedef otb::Image <InternalPixelType , 2> InternalImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter can be instantiated using the image types defined above. typedef otb::LocalHoughFilter <InternalImageType > LocalHoughType; typedef otb::DrawLineSpatialObjectListFilter<InternalImageType , OutputImageType > DrawLineListType; An otb::ImageFileReader class is also instantiated in order to read image data from a file. typedef otb::ImageFileReader <InternalImageType > ReaderType ; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter <OutputImageType > WriterType ; Both the filter and the reader are created by invoking their New() methods and assigning the result to SmartPointers.
Figure 14.11: Result of applying the otb::LocalHoughFilter. From left to right: original image, extracted segments.

ReaderType::Pointer reader = ReaderType::New();
LocalHoughType::Pointer localHough = LocalHoughType::New();
DrawLineListType::Pointer drawLineList = DrawLineListType::New();

The same is done for the writer.

WriterType::Pointer writer = WriterType::New();

The image obtained with the reader is passed as input to the otb::LocalHoughFilter. The pipeline is built as follows.

localHough->SetInput(reader->GetOutput());
drawLineList->SetInput(reader->GetOutput());
drawLineList->SetInputLineSpatialObjectList(localHough->GetOutput());
writer->SetFileName(argv[2]);
writer->SetInput(drawLineList->GetOutput());
writer->Update();

Figure 14.11 shows the result of applying the otb::LocalHoughFilter.

Line Segment Detector

The source code for this example can be found in the file Examples/FeatureExtraction/LineSegmentDetectorExample.cxx.

This example illustrates the use of the otb::LineSegmentDetector [139], also known as Lucy in the Sky with Diamonds. This filter is designed to extract segments in mono channel images.

The first step required to use this filter is to include its header file.

#include "otbLineSegmentDetector.h"

As usual, we start by defining the types for the input image and the image file reader.
typedef otb::Image<InputPixelType, Dimension> ImageType;
typedef otb::ImageFileReader<ImageType> ReaderType;

We instantiate the reader and set the file name for the input image.

ReaderType::Pointer reader = ReaderType::New();
reader->SetFileName(infname);
reader->GenerateOutputInformation();

We define now the type for the segment detector filter. It is templated over the input image type and the precision with which the coordinates of the detected segments will be given. It is recommended to set this precision to a real type. The output of the filter will be an otb::VectorData.

typedef otb::LineSegmentDetector<ImageType, PrecisionType> LsdFilterType;
LsdFilterType::Pointer lsdFilter = LsdFilterType::New();

In order to be able to display the results, we will draw the detected segments on top of the input image. For this matter, we will use an otb::VectorDataToMapFilter, which is templated over the input vector data type and the output image type, and a combination of an itk::BinaryFunctorImageFilter and the otb::Functor::AlphaBlendingFunctor.

typedef otb::VectorData<PrecisionType> VectorDataType;
typedef otb::VectorDataToMapFilter<VectorDataType, ImageType> VectorDataRendererType;
VectorDataRendererType::Pointer vectorDataRenderer = VectorDataRendererType::New();

typedef otb::Functor::AlphaBlendingFunctor<InputPixelType, InputPixelType, InputPixelType> FunctorType;
typedef itk::BinaryFunctorImageFilter<ImageType, ImageType, ImageType, FunctorType> BlendingFilterType;
BlendingFilterType::Pointer blendingFilter = BlendingFilterType::New();

We can now define the type for the writer, instantiate it and set the file name for the output image.

typedef otb::ImageFileWriter<ImageType> WriterType;
WriterType::Pointer writer = WriterType::New();
writer->SetFileName(outfname);

We plug the pipeline.

lsdFilter->SetInput(reader->GetOutput());
vectorDataRenderer->SetInput(lsdFilter->GetOutput());
vectorDataRenderer->SetSize(reader->GetOutput()->GetLargestPossibleRegion().GetSize());
vectorDataRenderer->SetRenderingStyleType(VectorDataRendererType::Binary);
blendingFilter->SetInput1(reader->GetOutput());
blendingFilter->SetInput2(vectorDataRenderer->GetOutput());
Figure 14.12: Result of applying the otb::LineSegmentDetector to an image.

blendingFilter->GetFunctor().SetAlpha(0.25);
writer->SetInput(blendingFilter->GetOutput());

Before calling the Update() method of the writer in order to trigger the pipeline execution, we call the GenerateOutputInformation() method of the reader, so the LSD filter gets the information about image size and spacing.

reader->GenerateOutputInformation();
writer->Update();

Figure 14.12 shows the result of applying the line segment detection to an image.

14.4.3 Right Angle Detector

The source code for this example can be found in the file Examples/FeatureExtraction/RightAngleDetectionExample.cxx.

This example illustrates the use of the otb::VectorDataToRightAngleVectorDataFilter. This filter detects the right angles in an image by exploiting the output of a line detection algorithm. Typically the otb::LineSegmentDetector class will be used. The right angle detection algorithm is described in detail in [1].

The first step required to use this filter is to include its header file.

#include "otbVectorDataToRightAngleVectorDataFilter.h"

After defining, as usual, the types for the input image and the image reader, we define the specific types needed for this example. First of all, we will use a vector data to store the detected lines which will be provided by the line segment detector.
typedef otb::VectorData<PrecisionType> VectorDataType;

The right angle detector's output is a vector data where each point gives the coordinate of the detected angle.

Next, we define the type for the line segment detector. A detailed example for this detector can be found in section 14.4.2.

typedef otb::LineSegmentDetector<ImageType, PrecisionType> LsdFilterType;

We can finally define the type for the right angle detection filter. This filter is templated over the input vector data type provided by the line segment detector.

typedef otb::VectorDataToRightAngleVectorDataFilter<VectorDataType> RightAngleFilterType;

We instantiate the line segment detector and the right angle detector.

LsdFilterType::Pointer lsdFilter = LsdFilterType::New();
RightAngleFilterType::Pointer rightAngleFilter = RightAngleFilterType::New();

We plug the pipeline.

lsdFilter->SetInput(reader->GetOutput());
rightAngleFilter->SetInput(lsdFilter->GetOutput());

You can choose how far apart the right angle segments can be, and the tolerance for considering the angle between two segments as a right one.

rightAngleFilter->SetAngleThreshold(angleThreshold);
rightAngleFilter->SetDistanceThreshold(distanceThreshold);

We will now draw the right angles on top of the input image. For this, we will draw the detected points on top of the input image. For this matter, we will use an otb::VectorDataToMapFilter, which is templated over the input vector data type and the output image type, and a combination of an itk::BinaryFunctorImageFilter and the otb::Functor::AlphaBlendingFunctor.

typedef otb::VectorDataToMapFilter<VectorDataType, ImageType> VectorDataRendererType;
VectorDataRendererType::Pointer vectorDataRenderer = VectorDataRendererType::New();

typedef otb::Functor::AlphaBlendingFunctor<PixelType, PixelType, PixelType> FunctorType;
typedef itk::BinaryFunctorImageFilter<ImageType, ImageType, ImageType, FunctorType> BlendingFilterType;
BlendingFilterType::Pointer blendingFilter = BlendingFilterType::New();

vectorDataRenderer->SetInput(1, lsdFilter->GetOutput());
vectorDataRenderer->SetInput(rightAngleFilter->GetOutput());
Figure 14.13: Result of applying the otb::LineSegmentDetector and the otb::VectorDataToRightAngleVectorDataFilter to an image. From left to right: original image, detected right angles.

vectorDataRenderer->SetSize(reader->GetOutput()->GetLargestPossibleRegion().GetSize());
vectorDataRenderer->SetOrigin(reader->GetOutput()->GetOrigin());
vectorDataRenderer->SetSpacing(reader->GetOutput()->GetSpacing());
vectorDataRenderer->SetRenderingStyleType(VectorDataRendererType::Binary);
blendingFilter->SetInput1(reader->GetOutput());
blendingFilter->SetInput2(vectorDataRenderer->GetOutput());
blendingFilter->GetFunctor().SetAlpha(0.25);
writer->SetInput(blendingFilter->GetOutput());
writer->SetFileName(outfname);

Before calling the Update() method of the writer in order to trigger the pipeline execution, we call the GenerateOutputInformation() method of the reader, so the filter gets the information about image size and spacing.

reader->GenerateOutputInformation();
writer->Update();

Figure 14.13 shows the result of applying the right angle detection filter to an image.

14.5 Density Features

An interesting approach to feature extraction consists in computing the density of previously detected features such as simple edges or interest points.
14.5.1 Edge Density

The source code for this example can be found in the file Examples/FeatureExtraction/EdgeDensityExample.cxx.

This example illustrates the use of the otb::EdgeDensityImageFilter. This filter computes a local density of edges on an image and can be useful to detect man-made objects or urban areas, for instance. The filter has been implemented in a generic way, so that the way the edges are detected and the way their density is computed can be chosen by the user.

The first step required to use this filter is to include its header file.

#include "otbEdgeDensityImageFilter.h"

We also include the header files for the edge detector (a Canny filter) and the density estimation (a simple count on a binary image).

#include "itkCannyEdgeDetectionImageFilter.h"
#include "otbBinaryImageDensityFunction.h"

As usual, we start by defining the types for the images, the reader and the writer.

typedef otb::Image<PixelType, Dimension> ImageType;
typedef otb::ImageFileReader<ImageType> ReaderType;
typedef otb::ImageFileWriter<ImageType> WriterType;

We define now the type for the function which will be used by the edge density filter to estimate this density. Here we choose a function which counts the number of non-null pixels per area. The function takes as template parameter the type of the image to be processed.

typedef otb::BinaryImageDensityFunction<ImageType> CountFunctionType;

These non-null pixels will be the result of an edge detector. We use here the classical Canny edge detector, which is templated over the input and output image types.

typedef itk::CannyEdgeDetectionImageFilter<ImageType, ImageType> CannyDetectorType;

Finally, we can define the type for the edge density filter, which takes as template parameters the input and output image types, the edge detector type, and the count function type.

typedef otb::EdgeDensityImageFilter<ImageType, ImageType, CannyDetectorType, CountFunctionType> EdgeDensityFilterType;

We can now instantiate the different processing objects of the pipeline using the New() method.

ReaderType::Pointer reader = ReaderType::New();
  • 394. 360 Chapter 14. Feature Extraction Figure 14.14: Result of applying the otb::EdgeDensityImageFilter to an image. From left to right : original image, edge density. EdgeDensityFilterType::Pointer filter = EdgeDensityFilterType::New(); CannyDetectorType::Pointer cannyFilter = CannyDetectorType::New(); WriterType ::Pointer writer = WriterType ::New (); The edge detection filter needs to be instantiated because we need to set its parameters. This is what we do here for the Canny filter. cannyFilter ->SetUpperThreshold(upperThreshold); cannyFilter ->SetLowerThreshold(lowerThreshold); cannyFilter ->SetVariance (variance ); cannyFilter ->SetMaximumError(maximumError); After that, we can pass the edge detector to the filter which will be used it internally. filter ->SetDetector (cannyFilter ); filter ->SetNeighborhoodRadius(radius); Finally, we set the file names for the input and the output images and we plug the pipeline. The Update() method of the writer will trigger the processing. reader ->SetFileName (infname ); writer ->SetFileName (outfname ); filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); writer ->Update(); Figure 14.14 shows the result of applying the edge density filter to an image. 14.5.2 SIFT Density The source code for this example can be found in the file Examples/Patented/SIFTDensityExample.cxx. This example illustrates the use of the otb::KeyPointDensityImageFilter. This filter computes a local density of keypoints (SIFT or SURF, for instance) on an image and can be useful to detect
  • 395. 14.5. Density Features 361 man made objects or urban areas, for instance. The filter has been implemented in a generic way, so that the way the keypoints are detected can be chosen by the user. The first step required to use this filter is to include its header file. #include "otbKeyPointDensityImageFilter.h" As usual, we start by defining the types for the images, the reader and the writer. typedef otb::Image <PixelType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <ImageType > WriterType ; We define now the type for the keypoint detection. The keypoints will be stored in vector form (they may contain many descriptors) into a point set. The filter for detecting the SIFT is templated over the input image type and the output pointset type. typedef itk::VariableLengthVector <PixelType > RealVectorType; typedef itk::PointSet <RealVectorType , Dimension > PointSetType; typedef otb::ImageToSIFTKeyPointSetFilter<ImageType , PointSetType > DetectorType; We can now define the filter which will compute the SIFT density. It will be templated over the input and output image types and the SIFT detector. typedef otb::KeyPointDensityImageFilter<ImageType , ImageType , DetectorType > FilterType ; We can instantiate the reader and the writer as wella s the filter and the detector. The detector needs to be instantiated in order to set its parameters. ReaderType ::Pointer reader = ReaderType ::New(); WriterType ::Pointer writer = WriterType ::New(); FilterType ::Pointer filter = FilterType ::New (); DetectorType::Pointer detector = DetectorType::New (); We set the file names for the input and the output images. reader ->SetFileName (infname ); writer ->SetFileName (outfname ); We set the parameters for the SIFT detector (the number of octaves and the number of scales per octave). detector ->SetOctavesNumber(octaves ); detector ->SetScalesNumber(scales); And we pass the detector to the filter and we set the radius for the density estimation. filter ->SetDetector (detector ); filter ->SetNeighborhoodRadius(radius);
Figure 14.15: Result of applying the otb::KeyPointDensityImageFilter to an image. From left to right: original image, SIFT density.

We plug the pipeline.

filter->SetInput(reader->GetOutput());
writer->SetInput(filter->GetOutput());

We trigger the execution by calling the Update() method on the writer, but before that we run GenerateOutputInformation() on the reader so the filter gets the information about the image size (needed for the SIFT computation).

reader->GenerateOutputInformation();
writer->Update();

Figure 14.15 shows the result of applying the key point density filter to an image using the SIFT detector. The main difference with respect to figure 14.14 is that for SIFTs, individual trees contribute to the density.

14.6 Geometric Moments

14.6.1 Complex Moments

The complex geometric moments are defined as:

c_pq = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x + iy)^p (x − iy)^q f(x, y) dx dy,          (14.2)

where x and y are the coordinates of the image f(x, y), i is the imaginary unit and p + q is the order of c_pq. The geometric moments are particularly useful in the case of scale changes.

Complex Moments for Images

The source code for this example can be found in the file Examples/FeatureExtraction/ComplexMomentsImageFunctionExample.cxx.
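Before turning to the OTB example, equation (14.2) can be illustrated with a minimal standalone sketch (independent of OTB) that approximates c_pq on a small image patch by replacing the double integral with a sum over pixel coordinates; the patch values and the moment order are hypothetical.

#include <complex>
#include <iostream>
#include <vector>

int main()
{
  // Hypothetical 3x3 image patch f(x, y).
  std::vector<std::vector<double> > f(3, std::vector<double>(3, 1.0));
  f[1][1] = 5.0;

  const unsigned int p = 1, q = 1;
  std::complex<double> cpq(0.0, 0.0);

  // Discrete approximation of equation (14.2): integral becomes a sum over pixels.
  for (unsigned int y = 0; y < f.size(); ++y)
    for (unsigned int x = 0; x < f[y].size(); ++x)
    {
      std::complex<double> z(double(x), double(y));      // x + iy
      std::complex<double> zbar(double(x), -double(y));  // x - iy
      cpq += std::pow(z, static_cast<double>(p))
           * std::pow(zbar, static_cast<double>(q)) * f[y][x];
    }

  std::cout << "c_" << p << q << " = " << cpq << std::endl;
  return 0;
}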
This example illustrates the use of the otb::ComplexMomentsImageFunction.

The first step required to use this filter is to include its header file.

#include "otbComplexMomentsImageFunction.h"

The otb::ComplexMomentsImageFunction is templated over the input image type and the output complex type value, so we start by defining:

typedef otb::ComplexMomentsImageFunction<InputImageType> CMType;
typedef CMType::OutputType OutputType;

CMType::Pointer cmFunction = CMType::New();

Next, we plug the input image into the complex moment function and we set its parameters.

reader->Update();
cmFunction->SetInputImage(reader->GetOutput());
cmFunction->SetQmax(Q);
cmFunction->SetPmax(P);

We can choose the pixel of the image which will be used as center for the moment computation.

InputImageType::IndexType center;
center[0] = 50;
center[1] = 50;

We can also choose the size of the neighborhood around the center pixel for the moment computation. In order to get the value of the moment, we call the EvaluateAtIndex method.

OutputType Result = cmFunction->EvaluateAtIndex(center);
std::cout << "The moment of order (" << P << "," << Q << ") is equal to " << Result.at(P).at(Q) << std::endl;

Complex Moments for Paths

The source code for this example can be found in the file Examples/FeatureExtraction/ComplexMomentPathExample.cxx.

The complex moments can be computed on images, but sometimes we are interested in computing them on shapes extracted from images by segmentation algorithms. These shapes can be represented by itk::Paths. This example illustrates the use of the otb::ComplexMomentPathFunction for the computation of complex geometric moments on ITK paths.

The first step required to use this filter is to include its header file.

#include "otbComplexMomentPathFunction.h"
  • 398. 364 Chapter 14. Feature Extraction The otb::ComplexMomentPathFunction is templated over the input path type and the output complex type value, so we start by defining: const unsigned int Dimension = 2; typedef itk::PolyLineParametricPath <Dimension > PathType; typedef std::complex <double> ComplexType ; typedef otb::ComplexMomentPathFunction<PathType , ComplexType > CMType; CMType::Pointer cmFunction = CMType::New(); Next, we plug the input path into the complex moment function and we set its parameters. cmFunction ->SetInputPath(path); cmFunction ->SetQ(Q); cmFunction ->SetP(P); Since the paths are defined in physical coordinates, we do not need to set the center for the moment computation as we did with the otb::ComplexMomentImageFunction. The same applies for the size of the neighborhood around the center pixel for the moment computation. The moment computation is triggered by calling the Evaluate method. ComplexType Result = cmFunction ->Evaluate (); std::cout << "The moment of order (" << P << "," << Q << ") is equal to " << Result << std::endl; 14.6.2 Hu Moments Using the algebraic moment theory, Ming-Kuei Hu obtained a family of 7 invariants with respect to planar transformations called Hu invariants [61]. Those invariants can be seen as nonlinear combinations of the complex moments. Hu invariants have been widely used in object recognition during the last 30 years, since they are invariant to rotation, scaling and translation. [47] gives their expressions: φ1 = c_{11}; φ2 = c_{20}c_{02}; φ3 = c_{30}c_{03}; φ4 = c_{21}c_{12}; φ5 = Re(c_{30}c_{12}^3); φ6 = Re(c_{21}c_{12}^2); φ7 = Im(c_{30}c_{12}^3). (14.3) [43] have used these invariants for the recognition of aircraft silhouettes. Flusser and Suk have used them for image registration [73].
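To make the expressions in (14.3) concrete, the following standalone sketch evaluates the seven invariants from a handful of complex moment values using std::complex. The numeric values of the moments are made up for illustration only; in a real application they would come from the moment functions described in the previous section.

#include <complex>
#include <iostream>

int main()
{
  typedef std::complex<double> Cplx;

  // Made-up complex moments; for a real-valued image c_qp is the conjugate of c_pq,
  // which is why the combinations below come out real.
  const Cplx c11(4.0, 0.0), c20(1.0, 2.0), c02(1.0, -2.0);
  const Cplx c30(0.5, 0.1), c03(0.5, -0.1), c21(0.2, 0.3), c12(0.2, -0.3);

  const Cplx c12_2 = c12 * c12;
  const Cplx c12_3 = c12_2 * c12;

  // Hu invariants as nonlinear combinations of the complex moments (Eq. 14.3).
  const double phi1 = std::real(c11);
  const double phi2 = std::real(c20 * c02);
  const double phi3 = std::real(c30 * c03);
  const double phi4 = std::real(c21 * c12);
  const double phi5 = std::real(c30 * c12_3);
  const double phi6 = std::real(c21 * c12_2);
  const double phi7 = std::imag(c30 * c12_3);

  std::cout << phi1 << " " << phi2 << " " << phi3 << " " << phi4 << " "
            << phi5 << " " << phi6 << " " << phi7 << std::endl;
  return 0;
}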
  • 399. 14.6. Geometric Moments 365 Hu Moments for Images The source code for this example can be found in the file Examples/FeatureExtraction/HuMomentsImageFunctionExample.cxx. This example illustrates the use of the otb::HuMomentsImageFunction. The first step required to use this filter is to include its header file. #include "otbHuMomentsImageFunction.h" The otb::HuImageFunction is templated over the input image type and the output (real) type value, so we start by defining: typedef otb::HuMomentsImageFunction <InputImageType > HuType; typedef HuType::OutputType MomentType ; HuType::Pointer hmFunction = HuType::New(); We can choose the region and the pixel of the image which will used as coordinate origin for the moment computation InputImageType::RegionType region; InputImageType::SizeType size; InputImageType::IndexType start; start[0] = 0; start[1] = 0; size[0] = 50; size[1] = 50; reader ->Update(); InputImageType::Pointer image = reader ->GetOutput (); region.SetIndex(start); region.SetSize(size); image ->SetRegions (region); image ->Update(); InputImageType::IndexType center; center[0] = start[0] + size[0] / 2; center[1] = start[1] + size[1] / 2; Next, we plug the input image into the complex moment function and we set its parameters. hmFunction ->SetInputImage(image); hmFunction ->SetNeighborhoodRadius(radius); In order to get the value of the moment, we call the EvaluateAtIndex method. MomentType Result = hmFunction ->EvaluateAtIndex(center);
  • 400. 366 Chapter 14. Feature Extraction for (unsigned int j=0; j<7; ++j) { std::cout << "The moment of order " << j+1 << " is equal to " << Result[j] << std::endl; } The following classes provide similar functionality: • otb::HuPathFunction 14.6.3 Flusser Moments The Hu invariants have been modified and improved by several authors. Flusser used these moments in order to produce a new family of descriptors of order higher than 3 [47]. These descriptors are invariant to scale and rotation. They have the following expressions: ψ1 = c_{11} = φ1; ψ2 = c_{21}c_{12} = φ4; ψ3 = Re(c_{20}c_{12}^2) = φ6; ψ4 = Im(c_{20}c_{12}^2); ψ5 = Re(c_{30}c_{12}^3) = φ5; ψ6 = Im(c_{30}c_{12}^3) = φ7; ψ7 = c_{22}; ψ8 = Re(c_{31}c_{12}^2); ψ9 = Im(c_{31}c_{12}^2); ψ10 = Re(c_{40}c_{12}^4); ψ11 = Im(c_{40}c_{12}^4). (14.4) Examples Flusser Moments for Images The source code for this example can be found in the file Examples/FeatureExtraction/FlusserMomentsImageFunctionExample.cxx. This example illustrates the use of the otb::FlusserMomentsImageFunction. The first step required to use this filter is to include its header file. #include "otbFlusserMomentsImageFunction.h" The otb::FlusserMomentsImageFunction is templated over the input image type and the output (real) type value, so we start by defining: typedef otb::FlusserMomentsImageFunction<InputImageType > FlusserType ; typedef FlusserType ::OutputType MomentType ; FlusserType ::Pointer fmFunction = FlusserType ::New (); We can choose the region and the pixel of the image which will be used as coordinate origin for the moment computation
  • 401. 14.7. Road extraction 367 InputImageType::RegionType region; InputImageType::SizeType size; InputImageType::IndexType start; start[0] = 0; start[1] = 0; size[0] = 50; size[1] = 50; reader ->Update(); InputImageType::Pointer image = reader ->GetOutput (); region.SetIndex(start); region.SetSize(size); image ->SetRegions (region); image ->Update(); InputImageType::IndexType center; center[0] = start[0] + size[0] / 2; center[1] = start[1] + size[1] / 2; Next, we plug the input image into the complex moment function and we set its parameters. fmFunction ->SetInputImage(image); fmFunction ->SetNeighborhoodRadius(radius); In order to get the value of the moment, we call the EvaluateAtIndex method. MomentType Result = fmFunction ->EvaluateAtIndex(center); for (unsigned int j=0; j<11; ++j) { std::cout << "The moment of order " << j+1 << " is equal to " << Result[j] << std::endl; } The following classes provide similar functionality: • otb::FlusserPathFunction 14.7 Road extraction Road extraction is a critical feature for an efficient use of high resolution satellite images. There are many applications of road extraction: update of GIS database, reference for image registration, help for identification algorithms and rapid mapping for example. Road network can be used to register an optical image with a map or an optical image with a radar image for example. Road network extraction can help for other algorithms: isolated building detection, bridge detection. In these cases,
  • 402. 368 Chapter 14. Feature Extraction a rough extraction can be sufficient. In the context of response to crisis, a fast mapping is necessary: within 6 hours, infrastructures for the designated area are required. Within this timeframe, a manual extraction is inconceivable and an automatic help is necessary. 14.7.1 Road extraction filter The source code for this example can be found in the file Examples/FeatureExtraction/ExtractRoadExample.cxx. The easiest way to use the road extraction filter provided by OTB is to use the composite filter. If a modification in the pipeline is required to adapt to a particular situation, the step by step example, described in the next section can be adapted. This example demonstrates the use of the otb::RoadExtractionFilter. This filter is a composite filter achieving road extraction according to the algorithm adapted by E. Christophe and J. Inglada [25] from an original method proposed in [84]. The first step toward the use of this filter is the inclusion of the proper header files. #include "otbPolyLineParametricPathWithValue.h" #include "otbRoadExtractionFilter.h" #include "otbDrawPathListFilter.h" Then we must decide what pixel type to use for the image. We choose to do all the computation in floating point precision and rescale the results between 0 and 255 in order to export PNG images. typedef double InputPixelType; typedef unsigned char OutputPixelType; The images are defined using the pixel type and the dimension. Please note that the otb::RoadExtractionFilter needs an otb::VectorImage as input to handle multispectral im- ages. typedef otb::VectorImage <InputPixelType , Dimension > InputVectorImageType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; We define the type of the polyline that the filter produces. We use the otb::PolyLineParametricPathWithValue, which allows the filter to produce a likehood value along with each polyline. The filter is able to produce itk::PolyLineParametricPath as well. typedef otb::PolyLineParametricPathWithValue<InputPixelType , Dimension > PathType; Now we can define the otb::RoadExtractionFilter that takes a multi-spectral image as input and produces a list of polylines.
  • 403. 14.7. Road extraction 369 typedef otb::RoadExtractionFilter <InputVectorImageType , PathType > RoadExtractionFilterType; We also define an otb::DrawPathListFilter to draw the output polylines on an image, taking their likehood values into account. typedef otb::DrawPathListFilter <InputImageType , PathType , InputImageType > DrawPathFilterType; The intensity rescaling of the results will be carried out by the itk::RescaleIntensityImageFilter which is templated by the input and output image types. typedef itk::RescaleIntensityImageFilter<InputImageType , OutputImageType > RescalerType; An otb::ImageFileReader class is also instantiated in order to read image data from a file. Then, an otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileReader <InputVectorImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; The different filters composing our pipeline are created by invoking their New() methods, assigning the results to smart pointers. ReaderType ::Pointer reader = ReaderType ::New (); RoadExtractionFilterType::Pointer roadExtractionFilter = RoadExtractionFilterType::New(); DrawPathFilterType::Pointer drawingFilter = DrawPathFilterType::New(); RescalerType::Pointer rescaleFilter = RescalerType::New(); WriterType ::Pointer writer = WriterType ::New(); The otb::RoadExtractionFilter needs to have a reference pixel corresponding to the spectral content likely to represent a road. This is done by passing a pixel to the filter. Here we suppose that the input image has four spectral bands. InputVectorImageType::PixelType ReferencePixel; ReferencePixel.SetSize (4); ReferencePixel.SetElement (0, ::atof(argv[3])); ReferencePixel.SetElement (1, ::atof(argv[4])); ReferencePixel.SetElement (2, ::atof(argv[5])); ReferencePixel.SetElement (3, ::atof(argv[6])); roadExtractionFilter ->SetReferencePixel(ReferencePixel); We must also set the alpha parameter of the filter which allows us to tune the width of the roads we want to extract. Typical value is 1.0 and should be working in most situations. roadExtractionFilter ->SetAlpha(atof(argv[7]));
  • 404. 370 Chapter 14. Feature Extraction All other parameter should not influence the results too much in most situation and can be kept at the default value. The amplitude threshold parameter tunes the sensitivity of the vectorization step. A typical value is 5 ·10−5. roadExtractionFilter ->SetAmplitudeThreshold(atof(argv[8])); The tolerance threshold tunes the sensitivity of the path simplification step. Typical value is 1.0. roadExtractionFilter ->SetTolerance(atof(argv[9])); Roads are not likely to have sharp turns. Therefore we set the max angle parameter, as well as the link angular threshold. The value is typicaly π 8 . roadExtractionFilter ->SetMaxAngle (atof(argv[10])); roadExtractionFilter ->SetAngularThreshold(atof(argv[10])); The otb::RoadExtractionFilter performs two odd path removing operations at different stage of its execution. The first mean distance threshold and the second mean distance threshold set their criterion for removal. Path are removed if their mean distance between nodes is to small, since such path coming from previous filters are likely to be tortuous. The first removal operation as a typical mean distance threshold parameter of 1.0, and the second of 10.0. roadExtractionFilter ->SetFirstMeanDistanceThreshold(atof(argv[11])); roadExtractionFilter ->SetSecondMeanDistanceThreshold(atof(argv[12])); The otb::RoadExtractionFilter is able to link path whose ends are near according to an eu- clidean distance criterion. The threshold for this distance to link a path is the distance threshold parameter. A typical value is 25. roadExtractionFilter ->SetDistanceThreshold(atof(argv[13])); We will now create a black background image to draw the resulting polyline on. To achieve this we need to know the size of our input image. Therefore we trigger the GenerateOutputInformation() of the reader. reader ->GenerateOutputInformation(); InputImageType::Pointer blackBackground = InputImageType::New (); blackBackground ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion()); blackBackground ->Allocate (); blackBackground ->FillBuffer (0); We tell the otb::DrawPathListFilter to try to use the likehood value embedded within the polyline as a value for drawing this polyline if possible. drawingFilter ->UseInternalPathValueOn();
  • 405. 14.7. Road extraction 371 Figure 14.16: Result of applying the otb::RoadExtractionFilter to a fusionned Quickbird image. From left to right : original image, extracted road with their likehood values (color are inverted for display). The itk::RescaleIntensityImageFilter needs to know which is the minimum and maxi- mum values of the output generated image. Those can be chosen in a generic way by using the NumericTraits functions, since they are templated over the pixel type. rescaleFilter ->SetOutputMinimum(itk ::NumericTraits <OutputPixelType >:: min ()); rescaleFilter ->SetOutputMaximum(itk ::NumericTraits <OutputPixelType >:: max ()); Now it is time for some pipeline wiring. roadExtractionFilter ->SetInput(reader ->GetOutput ()); drawingFilter ->SetInput(blackBackground); drawingFilter ->SetInputPath(roadExtractionFilter ->GetOutput ()); rescaleFilter ->SetInput(drawingFilter ->GetOutput ()); The update of the pipeline is triggered by the Update() method of the rescale intensity filter. rescaleFilter ->Update(); Figure 14.16 shows the result of applying the road extraction filter to a fusionned Quickbird image. 14.7.2 Step by step road extraction The source code for this example can be found in the file Examples/FeatureExtraction/ExtractRoadByStepsExample.cxx. This example illustrates the details of the otb::RoadExtractionFilter. This filter, described in
  • 406. 372 Chapter 14. Feature Extraction Figure 14.17: Illustration of the spectral angle for one pixel of a three-band image. One of the vector is the reference pixel and the other is the current pixel. the previous section, is a composite filter that includes all the steps below. Individual filters can be replaced to design a road detector targeted at SAR images for example. The spectral angle is used to compute a grayscale image from the multispectral original image using otb::SpectralAngleDistanceImageFilter. The spectral angle is illustrated on Figure 14.17. Pixels corresponding to roads are in darker color. typedef otb::SpectralAngleDistanceImageFilter<MultiSpectralImageType , InternalImageType > SAFilterType; SAFilterType::Pointer saFilter = SAFilterType::New (); saFilter ->SetReferencePixel(pixelRef ); saFilter ->SetInput (multispectralReader ->GetOutput ()); A square root is applied to the spectral angle image in order to enhance contrast between darker pixels (which are pixels of interest) with itk::SqrtImageFilter. typedef itk::SqrtImageFilter <InternalImageType , InternalImageType > SqrtFilterType; SqrtFilterType::Pointer sqrtFilter = SqrtFilterType::New (); sqrtFilter ->SetInput (saFilter ->GetOutput ()); Use the Gaussian gradient filter compute the gradient in x and y direction respectively ( itk::GradientRecursiveGaussianImageFilter). double sigma = alpha * (1.2 / resolution + 1); typedef itk::GradientRecursiveGaussianImageFilter<InternalImageType , VectorImageType > GradientFilterType;
  • 407. 14.7. Road extraction 373 GradientFilterType::Pointer gradientFilter = GradientFilterType::New(); gradientFilter ->SetSigma(sigma); gradientFilter ->SetInput(sqrtFilter ->GetOutput ()); Compute the scalar product of the neighboring pixels and keep the minimum value and the direction with otb::NeighborhoodScalarProductFilter. This is the line detector described in [84]. typedef otb::NeighborhoodScalarProductFilter<VectorImageType , InternalImageType , InternalImageType > NeighborhoodScalarProductType; NeighborhoodScalarProductType::Pointer scalarFilter = NeighborhoodScalarProductType::New (); scalarFilter ->SetInput(gradientFilter ->GetOutput ()); The resulting image is passed to the otb::RemoveIsolatedByDirectionFilter filter to remove pixels with no neighbor having the same direction. typedef otb::RemoveIsolatedByDirectionFilter<InternalImageType , InternalImageType , InternalImageType > RemoveIsolatedByDirectionType; RemoveIsolatedByDirectionType::Pointer removeIsolatedByDirectionFilter = RemoveIsolatedByDirectionType::New (); removeIsolatedByDirectionFilter->SetInput (scalarFilter ->GetOutput ()); removeIsolatedByDirectionFilter ->SetInputDirection(scalarFilter ->GetOutputDirection()); We remove pixels having a direction corresponding to bright lines as we know that after the spectral angle, roads are in darker color with the otb::RemoveWrongDirectionFilter filter. typedef otb::RemoveWrongDirectionFilter<InternalImageType , InternalImageType , InternalImageType > RemoveWrongDirectionType; RemoveWrongDirectionType::Pointer removeWrongDirectionFilter = RemoveWrongDirectionType::New(); removeWrongDirectionFilter->SetInput( removeIsolatedByDirectionFilter->GetOutput ()); removeWrongDirectionFilter->SetInputDirection( scalarFilter ->GetOutputDirection()); We remove pixels which are not maximum on the direction perpendicular to the road direction with the otb::NonMaxRemovalByDirectionFilter. typedef otb::NonMaxRemovalByDirectionFilter<InternalImageType , InternalImageType , InternalImageType > NonMaxRemovalByDirectionType; NonMaxRemovalByDirectionType::Pointer nonMaxRemovalByDirectionFilter = NonMaxRemovalByDirectionType::New (); nonMaxRemovalByDirectionFilter->SetInput (
  • 408. 374 Chapter 14. Feature Extraction removeWrongDirectionFilter->GetOutput ()); nonMaxRemovalByDirectionFilter ->SetInputDirection(scalarFilter ->GetOutputDirection()); Extracted road are vectorized into polylines with otb::VectorizationPathListFilter. typedef otb::VectorizationPathListFilter<InternalImageType , InternalImageType , PathType > VectorizationFilterType; VectorizationFilterType::Pointer vectorizationFilter = VectorizationFilterType::New(); vectorizationFilter ->SetInput(nonMaxRemovalByDirectionFilter->GetOutput ()); vectorizationFilter ->SetInputDirection(scalarFilter ->GetOutputDirection()); vectorizationFilter ->SetAmplitudeThreshold(atof(argv[8])); However, this vectorization is too simple and need to be refined to be usable. First, we remove all aligned points to make one segment with otb::SimplifyPathListFilter. Then we break the polylines which have sharp angles as they are probably not road with otb::BreakAngularPathListFilter. Finally we remove path which are too short with otb::RemoveTortuousPathListFilter. typedef otb::SimplifyPathListFilter <PathType > SimplifyPathType; SimplifyPathType::Pointer simplifyPathListFilter = SimplifyPathType::New(); simplifyPathListFilter ->GetFunctor (). SetTolerance(1.0); simplifyPathListFilter ->SetInput (vectorizationFilter ->GetOutput ()); typedef otb::BreakAngularPathListFilter<PathType > BreakAngularPathType; BreakAngularPathType::Pointer breakAngularPathListFilter = BreakAngularPathType::New (); breakAngularPathListFilter->SetMaxAngle (otb::CONST_PI / 8.); breakAngularPathListFilter->SetInput(simplifyPathListFilter ->GetOutput ()); typedef otb::RemoveTortuousPathListFilter<PathType > RemoveTortuousPathType; RemoveTortuousPathType::Pointer removeTortuousPathListFilter = RemoveTortuousPathType::New (); removeTortuousPathListFilter->GetFunctor (). SetThreshold(1.0); removeTortuousPathListFilter->SetInput(breakAngularPathListFilter->GetOutput ()); Polylines within a certain range are linked ( otb::LinkPathListFilter) to try to fill gaps due to occultations by vehicules, trees, etc. before simplifying poly- lines ( otb::SimplifyPathListFilter) and removing the shortest ones with otb::RemoveTortuousPathListFilter. typedef otb::LinkPathListFilter <PathType > LinkPathType; LinkPathType::Pointer linkPathListFilter = LinkPathType::New(); linkPathListFilter ->SetDistanceThreshold(25.0 / resolution ); linkPathListFilter ->SetAngularThreshold(otb::CONST_PI / 8); linkPathListFilter ->SetInput(removeTortuousPathListFilter->GetOutput ()); SimplifyPathType::Pointer simplifyPathListFilter2 = SimplifyPathType::New(); simplifyPathListFilter2 ->GetFunctor (). SetTolerance(1.0); simplifyPathListFilter2 ->SetInput (linkPathListFilter ->GetOutput ());
  • 409. 14.7. Road extraction 375 RemoveTortuousPathType::Pointer removeTortuousPathListFilter2 = RemoveTortuousPathType::New (); removeTortuousPathListFilter2->GetFunctor (). SetThreshold(10.0); removeTortuousPathListFilter2->SetInput (simplifyPathListFilter2 ->GetOutput ()); A value can be associated with each polyline according to pixel values under the polyline with otb::LikelihoodPathListFilter. A higher value will mean a higher Likelihood to be a road. typedef otb::LikelihoodPathListFilter<PathType , InternalImageType > PathListToPathListWithValueType; PathListToPathListWithValueType::Pointer pathListConverter = PathListToPathListWithValueType::New(); pathListConverter ->SetInput (removeTortuousPathListFilter2->GetOutput ()); pathListConverter ->SetInputImage(nonMaxRemovalByDirectionFilter->GetOutput ()); A black background image is built to draw the path on. InternalImageType::Pointer output = InternalImageType::New (); output ->SetRegions (multispectralReader ->GetOutput () ->GetLargestPossibleRegion()); output ->Allocate (); output ->FillBuffer (0.0); Polylines are drawn on a black background image with otb::DrawPathListFilter. The SetUseIternalValues() tell the drawing filter to draw the path with its Likelihood value. typedef otb::DrawPathListFilter <InternalImageType , PathType , InternalImageType > DrawPathType; DrawPathType::Pointer drawPathListFilter = DrawPathType::New(); drawPathListFilter ->SetInput (output); drawPathListFilter ->SetInputPath(pathListConverter ->GetOutput ()); drawPathListFilter ->SetUseInternalPathValue(true); The output from the drawing filter contains very small values (Likelihood values). Therefore the image has to be rescaled to be viewed. The whole pipeline is executed by invoking the Update() method on this last filter. typedef itk::RescaleIntensityImageFilter<InternalImageType , InternalImageType > RescalerType; RescalerType::Pointer rescaler = RescalerType::New (); rescaler ->SetOutputMaximum(255); rescaler ->SetOutputMinimum(0); rescaler ->SetInput(drawPathListFilter ->GetOutput ()); rescaler ->Update (); Figures 14.18 and 14.19 show the result of applying the road extraction by steps to a fusionned Quickbird image. The result image is a RGB composition showing the extracted path in red. Full processing took about 3 seconds for each image.
  • 410. 376 Chapter 14. Feature Extraction Figure 14.18: Result of applying the road extraction by steps pipeline to a fused Quickbird image. From left to right: original image, extracted roads with their likelihood values. Figure 14.19: Result of applying the road extraction by steps pipeline to a fused Quickbird image. From left to right: original image, extracted roads with their likelihood values.
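If the extracted polylines are needed for further processing rather than only for drawing, the output list of the road extraction filter can also be traversed directly. The fragment below is a sketch that reuses the objects of the composite filter example of Section 14.7.1; the alias OutputPathListType is an assumption on our part and should be checked against the actual filter interface.

// Sketch: inspect the polylines produced by the road extraction filter.
// Assumes the pipeline of Section 14.7.1 has already been updated.
typedef RoadExtractionFilterType::OutputPathListType PathListType; // assumed alias
PathListType* paths = roadExtractionFilter->GetOutput();
for (PathListType::Iterator it = paths->Begin(); it != paths->End(); ++it)
{
  PathType::Pointer path = it.Get();
  std::cout << "Polyline with " << path->GetVertexList()->Size()
            << " vertices, likelihood value: " << path->GetValue() << std::endl;
}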
  • 411. 14.8. Cloud Detection 377 14.8 Cloud Detection The source code for this example can be found in the file Examples/FeatureExtraction/CloudDetectionExample.cxx. The cloud detection functor is a processing chain composed by the computation of a spectral angle (with SpectralAngleFunctor). The result is multiplied by a gaussian factor (with CloudEstimator- Functor) and finally thresholded to obtain a binary image (with CloudDetectionFilter). However, modifications can be added in the pipeline to adapt to a particular situation. This example demonstrates the use of the otb::CloudDetectionFilter. This filter uses the spectral angle principle to measure the radiometric gap between a reference pixel and the other pixels of the image. The first step toward the use of this filter is the inclusion of the proper header files. #include "otbCloudDetectionFunctor.h" #include "otbCloudDetectionFilter.h" Then we must decide what pixel type to use for the images. We choose to do all the computations in double precision. typedef double InputPixelType; typedef double OutputPixelType; The images are defined using the pixel type and the dimension. Please note that the otb::CloudDetectionFilter needs an otb::VectorImage as input to handle multispectral im- ages. typedef otb::VectorImage <InputPixelType , Dimension > VectorImageType; typedef VectorImageType::PixelType VectorPixelType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; We define the functor type that the filter will use. We use the otb::CloudDetectionFunctor. typedef otb::Functor ::CloudDetectionFunctor <VectorPixelType , OutputPixelType > FunctorType ; Now we can define the otb::CloudDetectionFilter that takes a multi-spectral image as input and produces a binary image. typedef otb::CloudDetectionFilter <VectorImageType , OutputImageType , FunctorType > CloudDetectionFilterType; An otb::ImageFileReader class is also instantiated in order to read image data from a file. Then, an otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileReader <VectorImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ;
  • 412. 378 Chapter 14. Feature Extraction The different filters composing our pipeline are created by invoking their New() methods, assigning the results to smart pointers. ReaderType ::Pointer reader = ReaderType ::New(); CloudDetectionFilterType::Pointer cloudDetection = CloudDetectionFilterType::New (); WriterType ::Pointer writer = WriterType ::New (); The otb::CloudDetectionFilter needs to have a reference pixel corresponding to the spectral content likely to represent a cloud. This is done by passing a pixel to the filter. Here we suppose that the input image has four spectral bands. VectorPixelType referencePixel; referencePixel.SetSize (4); referencePixel.Fill(0.); referencePixel[0] = (atof(argv[5])); referencePixel[1] = (atof(argv[6])); referencePixel[2] = (atof(argv[7])); referencePixel[3] = (atof(argv[8])); cloudDetection ->SetReferencePixel(referencePixel); We must also set the variance parameter of the filter and the parameter of the gaussian functor. The bigger the value, the more tolerant the detector will be. cloudDetection ->SetVariance (atof(argv[9])); The minimum and maximum thresholds are set to binarise the final result. These values have to be between 0 and 1. cloudDetection ->SetMinThreshold(atof(argv[10])); cloudDetection ->SetMaxThreshold(atof(argv[11])); writer ->SetFileName (argv[2]); writer ->SetInput(cloudDetection ->GetOutput ()); writer ->Update(); Figure 14.20 shows the result of applying the cloud detection filter to a cloudy image.
  • 413. 14.8. Cloud Detection 379 Figure 14.20: From left to right : original image, cloud mask resulting from processing.
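The processing chain described at the beginning of this section (spectral angle with respect to a reference pixel, Gaussian weighting, thresholding) can be summarised by a small standalone computation. The sketch below is only a simplified model of that chain for a single four-band pixel, with made-up radiometric values; the exact normalisation used by the OTB functors may differ.

#include <cmath>
#include <iostream>

// Simplified model of the cloud detection chain for one pixel:
// 1) spectral angle between the pixel and a reference "cloud" pixel,
// 2) Gaussian weighting of the angle,
// 3) binarisation between a minimum and a maximum threshold.
int main()
{
  const double pixel[4]     = {620.0, 600.0, 580.0, 610.0}; // made-up values
  const double reference[4] = {600.0, 600.0, 600.0, 600.0};
  const double variance = 0.1, minThreshold = 0.2, maxThreshold = 1.0;

  double dot = 0.0, normPixel = 0.0, normRef = 0.0;
  for (unsigned int b = 0; b < 4; ++b)
  {
    dot       += pixel[b] * reference[b];
    normPixel += pixel[b] * pixel[b];
    normRef   += reference[b] * reference[b];
  }
  const double angle = std::acos(dot / std::sqrt(normPixel * normRef));
  const double score = std::exp(-angle * angle / (2.0 * variance)); // Gaussian factor
  const bool isCloud = (score >= minThreshold && score <= maxThreshold);

  std::cout << "angle=" << angle << " score=" << score
            << " cloud=" << (isCloud ? "yes" : "no") << std::endl;
  return 0;
}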
  • 415. CHAPTER FIFTEEN MULTI-SCALE ANALYSIS 15.1 Introduction In this chapter, the tools for multi-scale and multi-resolution processing (analysis, synthesis and fusion) will be presented. Most of the algorithms are based on pyramidal approaches. These approaches were first used for image compression and they are based on the fact that, once an image has been low-pass filtered, it does not have details beyond the cut-off frequency of the low-pass filter any more. Therefore, the image can be subsampled – decimated – without any loss of information. A pyramidal decomposition is thus performed by applying the following 3 steps in an iterative way: 1. Low-pass filter the image I_n in order to produce F(I_n); 2. Compute the difference D_n = I_n − F(I_n), which corresponds to the details at level n; 3. Subsample F(I_n) in order to obtain I_{n+1}. The result is a series of decreasing resolution images I_k and a series of decreasing resolution details D_k. 15.2 Morphological Pyramid If the smoothing filter used in the pyramidal analysis is a morphological filter, one cannot safely subsample the filtered image without loss of information. However, by keeping the details possibly lost in the down-sampling operation, such a decomposition can be used. The Morphological Pyramid is an approach to such a decomposition. Its computation process is an iterative analysis involving smoothing by the morphological filter, computing the details lost in the smoothing, down-sampling the current image, and computing the details lost in the down-sampling.
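To illustrate the decomposition described in the introduction, the following standalone sketch builds a tiny pyramid on a one-dimensional signal, using a simple averaging low-pass filter and a decimation factor of 2. It only shows the recursion D_n = I_n − F(I_n), I_{n+1} = decimate(F(I_n)); the morphological pyramid additionally keeps the details lost in the decimation, and the OTB filters of course operate on 2-D images.

#include <iostream>
#include <vector>

typedef std::vector<double> Signal;

// Very simple 3-tap averaging low-pass filter (border samples are copied).
Signal lowPass(const Signal& in)
{
  Signal out(in);
  for (size_t i = 1; i + 1 < in.size(); ++i)
    out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
  return out;
}

int main()
{
  Signal current(16);
  for (size_t i = 0; i < current.size(); ++i)
    current[i] = (i % 4 == 0) ? 10.0 : 1.0; // toy signal with some "details"

  const unsigned int numberOfLevels = 3;
  std::vector<Signal> details; // D_n for each level

  for (unsigned int level = 0; level < numberOfLevels; ++level)
  {
    const Signal filtered = lowPass(current);        // F(I_n)

    Signal detail(current.size());                   // D_n = I_n - F(I_n)
    for (size_t i = 0; i < current.size(); ++i)
      detail[i] = current[i] - filtered[i];
    details.push_back(detail);

    Signal decimated;                                // I_{n+1} = decimate(F(I_n))
    for (size_t i = 0; i < filtered.size(); i += 2)
      decimated.push_back(filtered[i]);
    current = decimated;

    std::cout << "level " << level + 1 << ": " << current.size()
              << " samples left" << std::endl;
  }
  return 0;
}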
  • 416. 382 Chapter 15. Multi-scale Analysis The source code for this example can be found in the file Examples/MultiScale/MorphologicalPyramidAnalysisFilterExample.cxx. This example illustrates the use of the otb::MorphologicalPyramidAnalyseFilter. The first step required to use this filter is to include its header file. #include "otbMorphologicalPyramidAnalysisFilter.h" The mathematical morphology filters to be used have also to be included here. #include "otbOpeningClosingMorphologicalFilter.h" #include "itkBinaryBallStructuringElement.h" As usual, we start by defining the types needed for the pixels, the images, the image reader and the image writer. const unsigned int Dimension = 2; typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; Now, we define the types needed for the morphological filters which will be used to build the mor- phological pyramid. The first thing to do is define the structuring element, which in our case, will be a itk::BinaryBallStructuringElement which is templated over the pixel type and the di- mension of the image. typedef itk::BinaryBallStructuringElement<InputPixelType , Dimension > StructuringElementType; We can now define the type of the filter to be used by the morphological pyramid. In this case, we choose to use an otb::OpeningClosingMorphologicalFilter which is just the concatenation of an opening and a closing. This filter is templated over the input and output image types and the structuring element type that we just define above. typedef otb::OpeningClosingMorphologicalFilter<InputImageType , InputImageType , StructuringElementType > OpeningClosingFilterType; We can finally define the type of the morpholoical pyramid filter. The filter is templated over the input and output image types and the lowpas morphological filter to be used. typedef otb::MorphologicalPyramidAnalysisFilter<InputImageType , OutputImageType ,
  • 417. 15.2. Morphological Pyramid 383 OpeningClosingFilterType> PyramidFilterType; Since the otb::MorphologicalPyramidAnalyseFilter generates a list of images as output, it is useful to have an iterator to access the images. This is done as follows : typedef PyramidFilterType::OutputImageListType::Iterator ImageListIterator; We can now instantiate the reader in order to access the input image which has to be analysed. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFilename); We instantiate the morphological pyramid analysis filter and set its parameters which are: • the number of iterations or levels of the pyramid; • the subsample scale or decimation factor between two successive pyramid levels. After that, we plug the pipeline and run it by calling the Update() method. PyramidFilterType::Pointer pyramid = PyramidFilterType::New(); pyramid ->SetNumberOfLevels(numberOfLevels); pyramid ->SetDecimationRatio(decimationRatio); pyramid ->SetInput(reader ->GetOutput ()); pyramid ->Update(); The morphological pyramid has 5 types of output: • the analysed image at each level of the pyramid through the GetOutput() method; • the brighter details extracted from the filtering operation through the GetSupFilter() method; • the darker details extracted from the filtering operation through the GetInfFilter() method; • the brighter details extracted from the resampling operation through the GetSupDeci() method; • the darker details extracted from the resampling operation through the GetInfDeci() method; to decimation Each one of these methods provides a list of images (one for each level of analysis), so we can iterate through the image lists by using iterators. ImageListIterator itAnalyse = pyramid->GetOutput ()->Begin(); ImageListIterator itSupFilter = pyramid->GetSupFilter()->Begin(); ImageListIterator itInfFilter = pyramid->GetInfFilter()->Begin(); ImageListIterator itInfDeci = pyramid->GetSupDeci ()->Begin(); ImageListIterator itSupDeci = pyramid->GetInfDeci ()->Begin();
  • 418. 384 Chapter 15. Multi-scale Analysis We can now instantiate a writer and use it to write all the images to files. WriterType ::Pointer writer = WriterType ::New(); int i = 1; // Writing the results images std::cout << (itAnalyse != (pyramid->GetOutput ()->End ())) << std::endl; while (itAnalyse != pyramid ->GetOutput ()->End()) { writer ->SetInput (itAnalyse .Get ()); writer ->SetFileName (argv[0 * 4 + i + 1]); writer ->Update (); writer ->SetInput (itSupFilter .Get()); writer ->SetFileName (argv[1 * 4 + i + 1]); writer ->Update (); writer ->SetInput (itInfFilter .Get()); writer ->SetFileName (argv[2 * 4 + i + 1]); writer ->Update (); writer ->SetInput (itInfDeci .Get ()); writer ->SetFileName (argv[3 * 4 + i + 1]); writer ->Update (); writer ->SetInput (itSupDeci .Get ()); writer ->SetFileName (argv[4 * 4 + i + 1]); writer ->Update (); ++itAnalyse ; ++ itSupFilter ; ++ itInfFilter ; ++itInfDeci ; ++itSupDeci ; ++i; } Figure 15.1 shows the test image to be processed by the morphological pyramid. Figure 15.2 shows the 4 levels of analysis of the image. Figure 15.3 shows the 4 levels of bright details. Figure 15.4 shows the 4 levels of dark details. Figure 15.5 shows the 4 levels of bright decimation details. Figure 15.6 shows the 4 levels of dark decimation details. The source code for this example can be found in the file Examples/MultiScale/MorphologicalPyramidSynthesisFilterExample.cxx. This example illustrates the use of the otb::MorphologicalPyramidSynthesisFilter.
  • 419. 15.2. Morphological Pyramid 385 Figure 15.1: Test image for the morphological pyramid. Figure 15.2: Result of the analysis for 4 levels of the pyramid. Figure 15.3: Bright details for 4 levels of the pyramid. Figure 15.4: Dark details for 4 levels of the pyramid.
  • 420. 386 Chapter 15. Multi-scale Analysis Figure 15.5: Bright decimation details for 4 levels of the pyramid. Figure 15.6: Dark decimation details for 4 levels of the pyramid. The first step required to use this filter is to include its header file. #include "otbMorphologicalPyramidSynthesisFilter.h" The mathematical morphology filters to be used have also to be included here, as well as the otb::MorphologicalPyramidAnalyseFilter in order to perform the analysis step. #include "otbMorphologicalPyramidAnalysisFilter.h" #include "otbOpeningClosingMorphologicalFilter.h" #include "itkBinaryBallStructuringElement.h" As usual, we start by defining the types needed for the pixels, the images, the image reader and the image writer. const unsigned int Dimension = 2; typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; Now, we define the types needed for the morphological filters which will be used to build the mor- phological pyramid. The first thing to do is define the structuring element, which in our case, will be a itk::BinaryBallStructuringElement which is templated over the pixel type and the di- mension of the image. typedef itk::BinaryBallStructuringElement<InputPixelType , Dimension >
  • 421. 15.2. Morphological Pyramid 387 StructuringElementType; We can now define the type of the filter to be used by the morphological pyramid. In this case, we choose to use an otb::OpeningClosingMorphologicalFilter which is just the concatenation of an opening and a closing. This filter is theplated over the input and output image types and the structurung element type that we just define above. typedef otb::OpeningClosingMorphologicalFilter<InputImageType , InputImageType , StructuringElementType > OpeningClosingFilterType; We can now define the type of the morpholoical pyramid filter. The filter is templated over the input and output mage types and the lowpas morphological filter to be used. typedef otb::MorphologicalPyramidAnalysisFilter<InputImageType , OutputImageType , OpeningClosingFilterType> PyramidAnalysisFilterType; We can finally define the type of the morpholoical pyramid synthesis filter. The filter is templated over the input and output mage types. typedef otb::MorphologicalPyramidSynthesisFilter<InputImageType , OutputImageType > PyramidSynthesisFilterType; We can now instantiate the reader in order to access the input image which has to be analysed. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFilename); We instantiate the morphological pyramid analysis filter and set its parameters which are: • the number of iterations or levels of the pyramid; • the subsample scale or decimation factor between two successive pyramid levels. After that, we plug the pipeline and run it by calling the Update() method. PyramidAnalysisFilterType::Pointer pyramidAnalysis = PyramidAnalysisFilterType::New(); pyramidAnalysis ->SetNumberOfLevels(numberOfLevels); pyramidAnalysis ->SetDecimationRatio(decimationRatio); pyramidAnalysis ->SetInput(reader ->GetOutput ()); pyramidAnalysis ->Update (); Once the analysis step is finished we can proceed to the synthesis of the image from its different levels of decomposition. The morphological pyramid has 5 types of output:
  • 422. 388 Chapter 15. Multi-scale Analysis • the analysed image at each level of the pyramid through the GetOutput() method; • the brighter details extracted from the filtering operation through the GetSupFilter() method; • the darker details extracted from the filtering operation through the GetInfFilter() method; • the brighter details extracted from the resampling operation through the GetSupDeci() method; • the darker details extracted from the resampling operation through the GetInfDeci() method. These outputs can be used as input of the synthesis filter by using the appropriate methods. PyramidSynthesisFilterType::Pointer pyramidSynthesis = PyramidSynthesisFilterType::New(); pyramidSynthesis ->SetInput (pyramidAnalysis ->GetOutput ()->Back()); pyramidSynthesis ->SetSupFilter(pyramidAnalysis ->GetSupFilter()); pyramidSynthesis ->SetSupDeci (pyramidAnalysis ->GetSupDeci ()); pyramidSynthesis ->SetInfFilter(pyramidAnalysis ->GetInfFilter()); pyramidSynthesis ->SetInfDeci (pyramidAnalysis ->GetInfDeci ()); After that, we plug the pipeline and run it by calling the Update() method. pyramidSynthesis ->Update (); We finally instantiate the writer in order to save the result image to a file. WriterType ::Pointer writer = WriterType ::New (); writer ->SetFileName (outputFilename); writer ->SetInput(pyramidSynthesis ->GetOutput ()->Back()); writer ->Update(); Since the synthesis operation is applied on the result of the analysis, the input and the output images should be identical. This is the case, as shown in figure 15.7. Of course, in a real application, a specific processing will be applied after the analysis and before the synthesis in order to, for instance, denoise the image by removing pixels at the finer scales, etc.
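As an illustration of such an intermediate processing, the fragment below sketches a very crude denoising step: the detail images of the finest analysis level are simply set to zero before the synthesis is run. It reuses the objects of the example above; the assumption that the finest level sits at the front of each detail list, as well as the use of Front() and FillBuffer(), should be checked before being used for real.

// Hypothetical sketch: discard the finest-scale details before synthesis.
// We assume the first image of each detail list corresponds to the finest level.
pyramidAnalysis->GetSupFilter()->Front()->FillBuffer(0);
pyramidAnalysis->GetInfFilter()->Front()->FillBuffer(0);
pyramidAnalysis->GetSupDeci()->Front()->FillBuffer(0);
pyramidAnalysis->GetInfDeci()->Front()->FillBuffer(0);

// The synthesis filter is wired exactly as before; updating it now rebuilds the
// image without the removed details.
pyramidSynthesis->Update();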
  • 423. 15.2. Morphological Pyramid 389 Figure 15.7: Result of the morphological pyramid analysis and synthesis. Left: original image. Right: result of applying the analysis and the synthesis steps. This example illustrates the use of the otb::MorphologicalPyramid::Segmenter. This class performs the segmentation of a detail image extracted from a morphological pyramid analysis. The Segmentation is perfomed using the itk::ConnectedThresholdImageFilter. The seeds are extracted from the image using the otb::ImageToPointSetFilter. The thresolds are set by using quantiles computed with the HistogramGenerator. The first step required to use this filter is to include its header file. #include "otbMorphologicalPyramidSegmenter.h" As usual, we start by defining the types needed for the pixels, the images, the image reader and the image writer. Note that, for this example, an RGB image will be created to store the results of the segmentation. const unsigned int Dimension = 2; typedef double InputPixelType; typedef unsigned short LabelPixelType; typedef itk::RGBPixel <unsigned char> RGBPixelType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <LabelPixelType , Dimension > LabelImageType; typedef otb::Image <RGBPixelType , 2> RGBImageType; typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <RGBImageType > WriterType ; We define now the segmenter. Please pay attention to the fact that this class belongs to the morphologicalPyramid namespace. typedef otb:: MorphologicalPyramid::Segmenter <InputImageType , LabelImageType > SegmenterType;
  • 424. 390 Chapter 15. Multi-scale Analysis We instantiate the readers which will give us access to the image of details produced by the morpho- logical pyramid analysis and the original image (before analysis) which is used in order to produce segmented regions which are sharper than what would have been obtained with the detail image only. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFilename); ReaderType ::Pointer reader2 = ReaderType ::New(); reader2 ->SetFileName (originalFilename); We instantiate the segmenter and set its parameters as follows. We plug the output of the readers for the details image and the original image; we set the boolean variable which controls wether the segmented details are bright or dark; we set the quantile used to threshold the details image in order to obtain the seed points for the segmentation; we set the quantile for setting the threshold for the region growing segmentation; and finally, we set the minimum size for a segmented region to be kept in the final result. SegmenterType::Pointer segmenter = SegmenterType::New(); segmenter ->SetDetailsImage(reader ->GetOutput ()); segmenter ->SetOriginalImage(reader2->GetOutput ()); segmenter ->SetSegmentDarkDetailsBool(segmentDark ); segmenter ->SetSeedsQuantile(seedsQuantile); segmenter ->SetConnectedThresholdQuantile(segmentationQuantile); segmenter ->SetMinimumObjectSize(minObjectSize); The output of the segmenter is an image of integer labels, where a label denotes membership of a pixel in a particular segmented region. This value is usually coded using 16 bits. This format is not practical for visualization, so for the purposes of this example, we will convert it to RGB pixels. RGB images have the advantage that they can be saved as a simple png file and viewed using any standard image viewer software. The itk::Functor::ScalarToRGBPixelFunctor class is a special function object designed to hash a scalar value into an itk::RGBPixel. Plugging this functor into the itk::UnaryFunctorImageFilter creates an image filter for that converts scalar images to RGB images. typedef itk::Functor ::ScalarToRGBPixelFunctor <LabelPixelType > ColorMapFunctorType; typedef itk::UnaryFunctorImageFilter <LabelImageType , RGBImageType , ColorMapFunctorType > ColorMapFilterType; ColorMapFilterType::Pointer colormapper = ColorMapFilterType::New(); We can now plug the final segment of the pipeline by using the color mapper and the image file writer. colormapper ->SetInput (segmenter ->GetOutput ()); WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(colormapper ->GetOutput ()); writer ->SetFileName (outputFilename1); writer ->Update();
  • 425. 15.2. Morphological Pyramid 391 Figure 15.8: Morphological pyramid segmentation. From left to right: original image, image of bright details and result of the sementation. Figure 15.8 shows the results of the segmentation of the image of bright details obtained with the morphological pyramid analysis. This same approach can be applied to all the levels of the morphological pyramid analysis. The source code for this example can be found in the file Examples/MultiScale/MorphologicalPyramidSegmentationExample.cxx. This example illustrates the use of the otb::MorphologicalSegmentationFilter. This fil- ter performs a segmentation of the details supFilter and infFilter extracted with the mor- phological pyramid. The segmentation algorithm used is based on seeds extraction using the otb::ImageToPointSetFilter, followed by a connected threshold segmentation using the itk::ConnectedThresholdImageFilter. The threshold for seeds extraction and segmenta- tion are computed using quantiles. A pre processing step is applied by multiplying the full resolution brighter details (resp. darker details) with the original image (resp. the inverted original image). This perfoms an enhancement of the regions contour precision. The de- tails from the pyramid are set via the SetBrighterDetails() and SetDarkerDetails() meth- ods. The brighter and darker details depend on the filter used in the pyramid analysis. If the otb::OpeningClosingMorphologicalFilter filter is used, then the brighter details are those from the supFilter image list, whereas if the otb::ClosingOpeningMorphologicalFilter filter is used, the brighter details are those from the infFilter list. The output of the segmentation filter is a single segmentation images list, containing first the brighter details segmentation from higher scale to lower, and then the darker details in the same order. The attention of the user is drawn to the fact that since the label filter used internally will deal with a large number of labels, the OutputPixelType is required to be sufficiently precise. Unsigned short or Unsigned long would be a good choice, unless the user has a very good reason to think that a less precise type will be sufficient. The first step to use this filter is to include its header file. #include "otbMorphologicalPyramidSegmentationFilter.h" The mathematical morphology filters to be used have also to be included here, as well as the mor-
  • 426. 392 Chapter 15. Multi-scale Analysis phological pyramid analysis filter. #include "otbOpeningClosingMorphologicalFilter.h" #include "itkBinaryBallStructuringElement.h" #include "otbMorphologicalPyramidAnalysisFilter.h" As usual, we start by defining the types for the pixels, the images, the reader and the writer. We also define the types needed for the morphological pyramid analysis. const unsigned int Dimension = 2; typedef unsigned char InputPixelType; typedef unsigned short OutputPixelType; typedef otb::Image <InputPixelType , Dimension > InputImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; typedef itk::BinaryBallStructuringElement<InputPixelType , Dimension > StructuringElementType; typedef otb::OpeningClosingMorphologicalFilter<InputImageType , InputImageType , StructuringElementType > OpeningClosingFilterType; typedef otb::MorphologicalPyramidAnalysisFilter<InputImageType , InputImageType , OpeningClosingFilterType> PyramidFilterType; We can now define the type for the otb::MorphologicalPyramidSegmentationFilter which is templated over the input and output image types. typedef otb::MorphologicalPyramidSegmentationFilter<InputImageType , OutputImageType > SegmentationFilterType; Since the output of the segmentation filter is a list of images, we define an iterator type which will be used to access the segmented images. typedef SegmentationFilterType:: OutputImageListIteratorType OutputListIteratorType; The following code snippet shows how to read the input image and perform the morphological pyramid analysis. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFilename); PyramidFilterType::Pointer pyramid = PyramidFilterType::New(); pyramid ->SetNumberOfLevels(numberOfLevels); pyramid ->SetDecimationRatio(decimationRatio); pyramid ->SetInput(reader ->GetOutput ());
  • 427. 15.2. Morphological Pyramid 393 We can now instantiate the segmentation filter and set its parameters. As one can see, the SetReferenceImage() is used to pass the original image in order to obtain sharp region boundaries. Using the SetBrighterDetails() and SetDarkerDetails() the output of the analysis is passed to the filter. Finally, the parameters for the segmentation are set by using the SetSeedsQuantile(), SetConnectedThresholdQuantile() and SetMinimumObjectSize() methods. SegmentationFilterType::Pointer segmentation = SegmentationFilterType::New(); segmentation ->SetReferenceImage(reader ->GetOutput ()); segmentation ->SetBrighterDetails(pyramid ->GetSupFilter()); segmentation ->SetDarkerDetails(pyramid->GetInfFilter()); segmentation ->SetSeedsQuantile(seedsQuantile); segmentation ->SetConnectedThresholdQuantile(segmentationQuantile); segmentation ->SetMinimumObjectSize(minObjectSize); The pipeline is executed bu calling the Update() method. segmentation ->Update(); Finally, we get an iterator to the list generated as output for the segmentation and we use it to iterate through the list and write the images to files. OutputListIteratorType it = segmentation ->GetOutput ()->Begin(); WriterType ::Pointer writer; int index = 1; std:: stringstream oss; while (it != segmentation ->GetOutput ()->End ()) { oss << outputFilenamePrefix << index << "." << outputFilenameSuffix; writer = WriterType ::New(); writer ->SetInput(it.Get ()); writer ->SetFileName (oss.str (). c_str()); writer ->Update (); std::cout << oss.str() << " file written." << std::endl; oss.str(""); ++index; ++it; } The user will pay attention to the fact that the list contains first the brighter details segmentation from higher scale to lower, and then the darker details in the same order.
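If the correspondence between the position in the list and the pyramid level matters for later processing, it can be made explicit. The small standalone sketch below only illustrates the ordering stated above (brighter details from higher scale to lower, then darker details in the same order); the variable names are ours and the mapping should be verified on the actual output files.

#include <iostream>

int main()
{
  const unsigned int numberOfLevels = 4;          // same parameter as the analysis
  const unsigned int listSize = 2 * numberOfLevels;
  for (unsigned int index = 0; index < listSize; ++index)
  {
    const bool brighter = (index < numberOfLevels);
    const unsigned int level = numberOfLevels - (index % numberOfLevels);
    std::cout << "output " << index + 1 << ": "
              << (brighter ? "brighter" : "darker")
              << " details, pyramid level " << level << std::endl;
  }
  return 0;
}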
  • 429. CHAPTER SIXTEEN IMAGE SEGMENTATION Segmentation of remote sensing images is a challenging task. A myriad of different methods have been proposed and implemented in recent years. In spite of the huge effort invested in this problem, there is no single approach that can generally solve the problem of segmentation for the large variety of image modalities existing today. The most effective segmentation algorithms are obtained by carefully customizing combinations of components. The parameters of these components are tuned for the characteristics of the image modality used as input and the features of the objects to be segmented. The Insight Toolkit provides a basic set of algorithms that can be used to develop and customize a full segmentation application. They are therefore available in the Orfeo Toolbox. Some of the most commonly used segmentation components are described in the following sections. 16.1 Region Growing Region growing algorithms have proven to be an effective approach for image segmentation. The basic approach of a region growing algorithm is to start from a seed region (typically one or more pixels) that are considered to be inside the object to be segmented. The pixels neighboring this region are evaluated to determine if they should also be considered part of the object. If so, they are added to the region and the process continues as long as new pixels are added to the region. Region growing algorithms vary depending on the criteria used to decide whether a pixel should be included in the region or not, the type connectivity used to determine neighbors, and the strategy used to visit neighboring pixels. Several implementations of region growing are available in ITK. This section describes some of the most commonly used.
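Before looking at the ITK filters themselves, the following standalone sketch shows the skeleton common to region growing algorithms on a small 2-D array: a seed pixel, an intensity interval as inclusion criterion, 4-connectivity, and a queue of pixels to visit. It is deliberately minimal and is not how the ITK filters are implemented.

#include <iostream>
#include <queue>
#include <utility>
#include <vector>

int main()
{
  const int width = 6, height = 5;
  const unsigned char image[5][6] = {
    { 10, 12, 90, 91, 10, 11 },
    { 11, 13, 92, 95, 12, 10 },
    { 10, 90, 93, 94, 91, 11 },
    { 12, 11, 89, 90, 10, 12 },
    { 10, 12, 11, 10, 11, 10 } };
  const unsigned char lower = 85, upper = 100;    // inclusion criterion
  const std::pair<int, int> seed(2, 2);           // (x, y) inside the object

  std::vector<std::vector<bool> > inRegion(height, std::vector<bool>(width, false));
  std::queue<std::pair<int, int> > toVisit;
  toVisit.push(seed);
  inRegion[seed.second][seed.first] = true;

  const int dx[4] = { 1, -1, 0, 0 }, dy[4] = { 0, 0, 1, -1 }; // 4-connectivity
  while (!toVisit.empty())
  {
    const std::pair<int, int> p = toVisit.front();
    toVisit.pop();
    for (int n = 0; n < 4; ++n)
    {
      const int x = p.first + dx[n], y = p.second + dy[n];
      if (x < 0 || x >= width || y < 0 || y >= height) continue; // outside image
      if (inRegion[y][x]) continue;                              // already accepted
      if (image[y][x] >= lower && image[y][x] <= upper)          // criterion
      {
        inRegion[y][x] = true;
        toVisit.push(std::make_pair(x, y));
      }
    }
  }

  for (int y = 0; y < height; ++y)
  {
    for (int x = 0; x < width; ++x) std::cout << (inRegion[y][x] ? '#' : '.');
    std::cout << std::endl;
  }
  return 0;
}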
  • 430. 396 Chapter 16. Image Segmentation 16.1.1 Connected Threshold A simple criterion for including pixels in a growing region is to evaluate intensity value inside a specific interval. The source code for this example can be found in the file Examples/Segmentation/ConnectedThresholdImageFilter.cxx. The following example illustrates the use of the itk::ConnectedThresholdImageFilter. This filter uses the flood fill iterator. Most of the algorithmic complexity of a region growing method comes from visiting neighboring pixels. The flood fill iterator assumes this responsibility and greatly simplifies the implementation of the region growing algorithm. Thus the algorithm is left to establish a criterion to decide whether a particular pixel should be included in the current region or not. The criterion used by the ConnectedThresholdImageFilter is based on an interval of intensity values provided by the user. Values of lower and upper threshold should be provided. The region growing algorithm includes those pixels whose intensities are inside the interval. I(X) ∈ [lower,upper] (16.1) Let’s look at the minimal code required to use this algorithm. First, the following header defining the ConnectedThresholdImageFilter class must be included. #include "itkConnectedThresholdImageFilter.h" Noise present in the image can reduce the capacity of this filter to grow large regions. When faced with noisy images, it is usually convenient to pre-process the image by using an edge-preserving smoothing filter. In this particular example we use the itk::CurvatureFlowImageFilter, hence we need to include its header file. #include "itkCurvatureFlowImageFilter.h" We declare the image type based on a particular pixel type and dimension. In this case the float type is used for the pixels due to the requirements of the smoothing filter. typedef float InternalPixelType; const unsigned int Dimension = 2; typedef otb::Image <InternalPixelType , Dimension > InternalImageType; The smoothing filter is instantiated using the image type as a template parameter. typedef itk::CurvatureFlowImageFilter<InternalImageType , InternalImageType > CurvatureFlowImageFilterType; Then the filter is created by invoking the New() method and assigning the result to a itk::SmartPointer.
  • 431. 16.1. Region Growing 397 CurvatureFlowImageFilterType::Pointer smoothing = CurvatureFlowImageFilterType::New (); We now declare the type of the region growing filter. In this case it is the ConnectedThresholdIm- ageFilter. typedef itk::ConnectedThresholdImageFilter<InternalImageType , InternalImageType > ConnectedFilterType; Then we construct one filter of this class using the New() method. ConnectedFilterType::Pointer connectedThreshold = ConnectedFilterType::New(); Now it is time to connect a simple, linear pipeline. A file reader is added at the beginning of the pipeline and a cast filter and writer are added at the end. The cast filter is required to convert float pixel types to integer types since only a few image file formats support float types. smoothing ->SetInput (reader ->GetOutput ()); connectedThreshold ->SetInput (smoothing ->GetOutput ()); caster ->SetInput(connectedThreshold ->GetOutput ()); writer ->SetInput(caster ->GetOutput ()); The CurvatureFlowImageFilter requires a couple of parameters to be defined. The following are typical values, however they may have to be adjusted depending on the amount of noise present in the input image. smoothing ->SetNumberOfIterations(5); smoothing ->SetTimeStep (0.125); The ConnectedThresholdImageFilterhas two main parameters to be defined. They are the lower and upper thresholds of the interval in which intensity values should fall in order to be included in the region. Setting these two values too close will not allow enough flexibility for the region to grow. Setting them too far apart will result in a region that engulfs the image. connectedThreshold ->SetLower (lowerThreshold); connectedThreshold ->SetUpper (upperThreshold); The output of this filter is a binary image with zero-value pixels everywhere except on the extracted region. The intensity value set inside the region is selected with the method SetReplaceValue() connectedThreshold ->SetReplaceValue( itk::NumericTraits <OutputPixelType >::max ()); The initialization of the algorithm requires the user to provide a seed point. It is convenient to select this point to be placed in a typical region of the structure to be segmented. The seed is passed in the form of a itk::Index to the SetSeed() method.
  • 432. 398 Chapter 16. Image Segmentation Structure Seed Index Lower Upper Output Image Road (110,38) 50 100 Second from left in Figure 16.1 Shadow (118,100) 0 10 Third from left in Figure 16.1 Building (169,146) 220 255 Fourth from left in Figure 16.1 Table 16.1: Parameters used for segmenting some structures shown in Figure 16.1 with the filter itk::ConnectedThresholdImageFilter. Figure 16.1: Segmentation results for the ConnectedThreshold filter for various seed points. connectedThreshold ->SetSeed(index); The invocation of the Update() method on the writer triggers the execution of the pipeline. It is usually wise to put update calls in a try/catch block in case errors occur and exceptions are thrown. try { writer ->Update (); } catch (itk::ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Let’s run this example using as input the image QB Suburb.png provided in the directory Examples/Data. We can easily segment the major structures by providing seeds in the appropri- ate locations and defining values for the lower and upper thresholds. Figure 16.1 illustrates several examples of segmentation. The parameters used are presented in Table 16.1. Notice that some objects are not being completely segmented. This illustrates the vulnerability of the region growing methods when the structures to be segmented do not have a homogeneous statistical distribution over the image space. You may want to experiment with different values of the lower and upper thresholds to verify how the accepted region will extend. Another option for segmenting regions is to take advantage of the functionality provided by the ConnectedThresholdImageFilter for managing multiple seeds. The seeds can be passed one by one to the filter using the AddSeed() method. You could imagine a user interface in which an operator
  • 433. 16.1. Region Growing 399 clicks on multiple points of the object to be segmented and each selected point is passed as a seed to this filter. 16.1.2 Otsu Segmentation Another criterion for classifying pixels is to minimize the error of misclassification. The goal is to find a threshold that classifies the image into two clusters such that we minimize the area under the histogram for one cluster that lies on the other cluster’s side of the threshold. This is equivalent to minimizing the within class variance or equivalently maximizing the between class variance. The source code for this example can be found in the file Examples/Segmentation/OtsuThresholdImageFilter.cxx. This example illustrates how to use the itk::OtsuThresholdImageFilter. #include "itkOtsuThresholdImageFilter.h" The next step is to decide which pixel types to use for the input and output images. typedef unsigned char InputPixelType; typedef unsigned char OutputPixelType; The input and output image types are now defined using their respective pixel types and dimensions. typedef otb::Image <InputPixelType , 2> InputImageType; typedef otb::Image <OutputPixelType , 2> OutputImageType; The filter type can be instantiated using the input and output image types defined above. typedef itk::OtsuThresholdImageFilter< InputImageType , OutputImageType > FilterType ; An otb::ImageFileReader class is also instantiated in order to read image data from a file. (See Section 6 on page 99 for more information about reading and writing data.) typedef otb::ImageFileReader <InputImageType > ReaderType ; An otb::ImageFileWriter is instantiated in order to write the output image to a file. typedef otb::ImageFileWriter <InputImageType > WriterType ; Both the filter and the reader are created by invoking their New() methods and assigning the result to itk::SmartPointers. ReaderType ::Pointer reader = ReaderType ::New (); FilterType ::Pointer filter = FilterType ::New (); The image obtained with the reader is passed as input to the OtsuThresholdImageFilter.
  • 434. 400 Chapter 16. Image Segmentation Figure 16.2: Effect of the OtsuThresholdImageFilter. filter ->SetInput(reader ->GetOutput ()); The method SetOutsideValue() defines the intensity value to be assigned to those pixels whose intensities are outside the range defined by the lower and upper thresholds. The method SetInsideValue() defines the intensity value to be assigned to pixels with intensities falling in- side the threshold range. filter ->SetOutsideValue(outsideValue); filter ->SetInsideValue(insideValue ); The method SetNumberOfHistogramBins() defines the number of bins to be used for computing the histogram. This histogram will be used internally in order to compute the Otsu threshold. filter ->SetNumberOfHistogramBins(128); The execution of the filter is triggered by invoking the Update() method. If the filter’s output has been passed as input to subsequent filters, the Update() call on any posterior filters in the pipeline will indirectly trigger the update of this filter. filter ->Update(); We print out here the Threshold value that was computed internally by the filter. For this we invoke the GetThreshold method. int threshold = filter ->GetThreshold(); std::cout << "Threshold = " << threshold << std::endl; Figure 16.2 illustrates the effect of this filter. This figure shows the limitations of this filter for performing segmentation by itself. These limitations are particularly noticeable in noisy images and in images lacking spatial uniformity.
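For reference, the snippets above can be assembled into a minimal, self-contained program along the following lines. This is a hedged sketch rather than the exact example source: the file names come from the command line and the inside/outside values (255 and 0) are illustrative.

#include <iostream>
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkOtsuThresholdImageFilter.h"

int main(int argc, char* argv[])
{
  if (argc < 3)
  {
    std::cerr << "Usage: " << argv[0] << " inputImage outputImage" << std::endl;
    return 1;
  }

  typedef unsigned char PixelType;
  typedef otb::Image<PixelType, 2> ImageType;
  typedef otb::ImageFileReader<ImageType> ReaderType;
  typedef otb::ImageFileWriter<ImageType> WriterType;
  typedef itk::OtsuThresholdImageFilter<ImageType, ImageType> FilterType;

  ReaderType::Pointer reader = ReaderType::New();
  FilterType::Pointer filter = FilterType::New();
  WriterType::Pointer writer = WriterType::New();

  reader->SetFileName(argv[1]);
  writer->SetFileName(argv[2]);

  // The filter computes the Otsu threshold internally and produces a binary image.
  filter->SetInput(reader->GetOutput());
  filter->SetOutsideValue(0);    // value assigned to pixels classified outside the threshold range
  filter->SetInsideValue(255);   // value assigned to pixels classified inside the threshold range
  filter->SetNumberOfHistogramBins(128);
  writer->SetInput(filter->GetOutput());

  try
  {
    writer->Update();
  }
  catch (itk::ExceptionObject& excep)
  {
    std::cerr << excep << std::endl;
    return 1;
  }

  std::cout << "Threshold = " << filter->GetThreshold() << std::endl;
  return 0;
}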
  • 435. 16.1. Region Growing 401 The following classes provide similar functionality: • itk::ThresholdImageFilter The source code for this example can be found in the file Examples/Segmentation/OtsuMultipleThresholdImageFilter.cxx. This example illustrates how to use the itk::OtsuMultipleThresholdsCalculator. #include "itkOtsuMultipleThresholdsCalculator.h" OtsuMultipleThresholdsCalculator calculates thresholds for a give histogram so as to maximize the between-class variance. We use ScalarImageToHistogramGenerator to generate histograms typedef itk::Statistics :: ScalarImageToHistogramGenerator<InputImageType > ScalarImageToHistogramGeneratorType; typedef itk::OtsuMultipleThresholdsCalculator< ScalarImageToHistogramGeneratorType::HistogramType > CalculatorType; Once thresholds are computed we will use BinaryThresholdImageFilter to segment the input image into segments. typedef itk::BinaryThresholdImageFilter<InputImageType , OutputImageType > FilterType ; ScalarImageToHistogramGeneratorType::Pointer scalarImageToHistogramGenerator = ScalarImageToHistogramGeneratorType::New (); CalculatorType::Pointer calculator = CalculatorType::New (); FilterType ::Pointer filter = FilterType ::New(); scalarImageToHistogramGenerator->SetNumberOfBins(128); int nbThresholds = argc - 2; calculator ->SetNumberOfThresholds(nbThresholds); The pipeline will look as follows: scalarImageToHistogramGenerator->SetInput (reader ->GetOutput ()); calculator ->SetInputHistogram(scalarImageToHistogramGenerator->GetOutput ()); filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); Thresholds are obtained using the GetOutput method const CalculatorType::OutputType & thresholdVector = calculator ->GetOutput (); CalculatorType::OutputType ::const_iterator itNum = thresholdVector.begin();
  • 436. 402 Chapter 16. Image Segmentation Figure 16.3: Effect of the OtsuMultipleThresholdImageFilter. for (; itNum < thresholdVector.end(); itNum++) { std::cout << "OtsuThreshold[" << (int) (itNum - thresholdVector.begin()) << "] = " << static_cast<itk::NumericTraits <CalculatorType::MeasurementType >:: PrintType > (*itNum) << std::endl; upperThreshold = (*itNum); filter ->SetLowerThreshold(static_cast<OutputPixelType > (lowerThreshold)); filter ->SetUpperThreshold(static_cast<OutputPixelType > (upperThreshold)); lowerThreshold = upperThreshold; writer ->SetFileName (argv[2 + counter ]); ++counter; } Figure 16.3 illustrates the effect of this filter.
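As an alternative to wiring the histogram generator, the calculator and the threshold filter by hand, ITK also provides an itk::OtsuMultipleThresholdsImageFilter that performs the same computation in a single filter and outputs a labeled image with one label per intensity class. The sketch below assumes this class is available in the ITK version bundled with your OTB build; check its availability and API before relying on it.

#include "itkOtsuMultipleThresholdsImageFilter.h"

typedef itk::OtsuMultipleThresholdsImageFilter<InputImageType, OutputImageType>
  MultipleOtsuFilterType;

MultipleOtsuFilterType::Pointer multiOtsu = MultipleOtsuFilterType::New();
multiOtsu->SetInput(reader->GetOutput());
multiOtsu->SetNumberOfHistogramBins(128);
multiOtsu->SetNumberOfThresholds(nbThresholds);
multiOtsu->Update();
// multiOtsu->GetOutput() now contains labels 0 .. nbThresholds, one per intensity class.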
  • 437. 16.1. Region Growing 403 The following classes provide similar functionality: • itk::ThresholdImageFilter 16.1.3 Neighborhood Connected The source code for this example can be found in the file Examples/Segmentation/NeighborhoodConnectedImageFilter.cxx. The following example illustrates the use of the itk::NeighborhoodConnectedImageFilter. This filter is a close variant of the itk::ConnectedThresholdImageFilter. On one hand, the ConnectedThresholdImageFilter accepts a pixel in the region if its intensity is in the interval defined by two user-provided threshold values. The NeighborhoodConnectedImageFilter,on the other hand, will only accept a pixel if all its neighbors have intensities that fit in the interval. The size of the neighborhood to be considered around each pixel is defined by a user-provided integer radius. The reason for considering the neighborhood intensities instead of only the current pixel intensity is that small structures are less likely to be accepted in the region. The operation of this filter is equivalent to applying the ConnectedThresholdImageFilter followed by mathematical morphology erosion using a structuring element of the same shape as the neighborhood provided to the Neigh- borhoodConnectedImageFilter. #include "itkNeighborhoodConnectedImageFilter.h" The itk::CurvatureFlowImageFilter is used here to smooth the image while preserving edges. #include "itkCurvatureFlowImageFilter.h" We now define the image type using a particular pixel type and image dimension. In this case the float type is used for the pixels due to the requirements of the smoothing filter. typedef float InternalPixelType; const unsigned int Dimension = 2; typedef otb::Image <InternalPixelType , Dimension > InternalImageType; The smoothing filter type is instantiated using the image type as a template parameter. typedef itk::CurvatureFlowImageFilter<InternalImageType , InternalImageType > CurvatureFlowImageFilterType; Then, the filter is created by invoking the New() method and assigning the result to a itk::SmartPointer. CurvatureFlowImageFilterType::Pointer smoothing = CurvatureFlowImageFilterType::New (); We now declare the type of the region growing filter. In this case it is the NeighborhoodConnected- ImageFilter.
  • 438. 404 Chapter 16. Image Segmentation typedef itk::NeighborhoodConnectedImageFilter<InternalImageType , InternalImageType > ConnectedFilterType; One filter of this class is constructed using the New() method. ConnectedFilterType::Pointer neighborhoodConnected = ConnectedFilterType::New (); Now it is time to create a simple, linear data processing pipeline. A file reader is added at the beginning of the pipeline and a cast filter and writer are added at the end. The cast filter is required to convert float pixel types to integer types since only a few image file formats support float types. smoothing ->SetInput (reader ->GetOutput ()); neighborhoodConnected ->SetInput(smoothing ->GetOutput ()); caster ->SetInput(neighborhoodConnected ->GetOutput ()); writer ->SetInput(caster ->GetOutput ()); The CurvatureFlowImageFilter requires a couple of parameters to be defined. The following are typical values for 2D images. However they may have to be adjusted depending on the amount of noise present in the input image. smoothing ->SetNumberOfIterations(5); smoothing ->SetTimeStep (0.125); The NeighborhoodConnectedImageFilter requires that two main parameters are specified. They are the lower and upper thresholds of the interval in which intensity values must fall to be included in the region. Setting these two values too close will not allow enough flexibility for the region to grow. Setting them too far apart will result in a region that engulfs the image. neighborhoodConnected ->SetLower(lowerThreshold); neighborhoodConnected ->SetUpper(upperThreshold); Here, we add the crucial parameter that defines the neighborhood size used to determine whether a pixel lies in the region. The larger the neighborhood, the more stable this filter will be against noise in the input image, but also the longer the computing time will be. Here we select a filter of radius 2 along each dimension. This results in a neighborhood of 5 × 5 pixels. InternalImageType::SizeType radius; radius[0] = 2; // two pixels along X radius[1] = 2; // two pixels along Y neighborhoodConnected ->SetRadius (radius); As in the ConnectedThresholdImageFilter we must now provide the intensity value to be used for the output pixels accepted in the region and at least one seed point to define the initial region.
  • 439. 16.1. Region Growing 405 Structure Seed Index Lower Upper Output Image Road (110,38) 50 100 Second from left in Figure 16.4 Shadow (118,100) 0 10 Third from left in Figure 16.4 Building (169,146) 220 255 Fourth from left in Figure 16.4 Table 16.2: Parameters used for segmenting some structures shown in Figure 16.4 with the filter itk::NeighborhoodConnectedThresholdImageFilter. Figure 16.4: Segmentation results for the NeighborhoodConnectedThreshold filter for various seed points. neighborhoodConnected ->SetSeed(index); neighborhoodConnected ->SetReplaceValue(255); The invocation of the Update() method on the writer triggers the execution of the pipeline. It is usually wise to put update calls in a try/catch block in case errors occur and exceptions are thrown. try { writer ->Update (); } catch (itk:: ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Let’s run this example using as input the image QB Suburb.png provided in the directory Examples/Data. We can easily segment the major structures by providing seeds in the appropri- ate locations and defining values for the lower and upper thresholds. Figure 16.4 illustrates several examples of segmentation. The parameters used are presented in Table 16.2. As with the ConnectedThresholdImageFilter, several seeds could be provided to the filter by using the AddSeed() method. Compare the output of Figure 16.4 with those of Figure 16.1 produced by the ConnectedThresholdImageFilter. You may want to play with the value of the neighborhood ra- dius and see how it affect the smoothness of the segmented object borders, the size of the segmented region and how much that costs in computing time.
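As a brief illustration of the multi-seed usage mentioned above, several seeds can be registered before updating the pipeline. The coordinates below are purely illustrative; in practice they would come from user interaction or from the command line.

InternalImageType::IndexType seed1, seed2;
seed1[0] = 110; seed1[1] = 38;   // illustrative position on the structure of interest
seed2[0] = 120; seed2[1] = 45;   // a second illustrative position on the same structure

neighborhoodConnected->AddSeed(seed1);
neighborhoodConnected->AddSeed(seed2);
writer->Update();  // the region grown from all seeds is written out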
  • 440. 406 Chapter 16. Image Segmentation 16.1.4 Confidence Connected The source code for this example can be found in the file Examples/Segmentation/ConfidenceConnected.cxx. The following example illustrates the use of the itk::ConfidenceConnectedImageFilter. The criterion used by the ConfidenceConnectedImageFilter is based on simple statistics of the current region. First, the algorithm computes the mean and standard deviation of intensity values for all the pixels currently included in the region. A user-provided factor is used to multiply the standard deviation and define a range around the mean. Neighbor pixels whose intensity values fall inside the range are accepted and included in the region. When no more neighbor pixels are found that satisfy the criterion, the algorithm is considered to have finished its first iteration. At that point, the mean and standard deviation of the intensity levels are recomputed using all the pixels currently included in the region. This mean and standard deviation defines a new intensity range that is used to visit current region neighbors and evaluate whether their intensity falls inside the range. This iterative process is repeated until no more pixels are added or the maximum number of iterations is reached. The following equation illustrates the inclusion criterion used by this filter, I(X) ∈ [m− fσ,m+ fσ] (16.2) where m and σ are the mean and standard deviation of the region intensities, f is a factor defined by the user, I() is the image and X is the position of the particular neighbor pixel being considered for inclusion in the region. Let’s look at the minimal code required to use this algorithm. First, the following header defining the itk::ConfidenceConnectedImageFilter class must be included. #include "itkConfidenceConnectedImageFilter.h" Noise present in the image can reduce the capacity of this filter to grow large regions. When faced with noisy images, it is usually convenient to pre-process the image by using an edge-preserving smoothing filter. In this particular example we use the itk::CurvatureFlowImageFilter, hence we need to include its header file. #include "itkCurvatureFlowImageFilter.h" We now define the image type using a pixel type and a particular dimension. In this case the float type is used for the pixels due to the requirements of the smoothing filter. typedef float InternalPixelType; const unsigned int Dimension = 2; typedef otb::Image <InternalPixelType , Dimension > InternalImageType; The smoothing filter type is instantiated using the image type as a template parameter. typedef itk::CurvatureFlowImageFilter<InternalImageType , InternalImageType > CurvatureFlowImageFilterType;
  • 441. 16.1. Region Growing 407 Next the filter is created by invoking the New() method and assigning the result to a itk::SmartPointer. CurvatureFlowImageFilterType::Pointer smoothing = CurvatureFlowImageFilterType::New (); We now declare the type of the region growing filter. In this case it is the ConfidenceConnectedIm- ageFilter. typedef itk::ConfidenceConnectedImageFilter<InternalImageType , InternalImageType > ConnectedFilterType; Then, we construct one filter of this class using the New() method. ConnectedFilterType::Pointer confidenceConnected = ConnectedFilterType::New (); Now it is time to create a simple, linear pipeline. A file reader is added at the beginning of the pipeline and a cast filter and writer are added at the end. The cast filter is required here to convert float pixel types to integer types since only a few image file formats support float types. smoothing ->SetInput (reader ->GetOutput ()); confidenceConnected ->SetInput(smoothing ->GetOutput ()); caster ->SetInput(confidenceConnected ->GetOutput ()); writer ->SetInput(caster ->GetOutput ()); The CurvatureFlowImageFilter requires defining two parameters. The following are typical values. However they may have to be adjusted depending on the amount of noise present in the input image. smoothing ->SetNumberOfIterations(5); smoothing ->SetTimeStep (0.125); The ConfidenceConnectedImageFilter requires defining two parameters. First, the factor f that the defines how large the range of intensities will be. Small values of the multiplier will restrict the inclusion of pixels to those having very similar intensities to those in the current region. Larger values of the multiplier will relax the accepting condition and will result in more generous growth of the region. Values that are too large will cause the region to grow into neighboring regions that may actually belong to separate structures. confidenceConnected ->SetMultiplier(2.5); The number of iterations is specified based on the homogeneity of the intensities of the object to be segmented. Highly homogeneous regions may only require a couple of iterations. Regions with ramp effect, may require more iterations. In practice, it seems to be more important to carefully select the multiplier factor than the number of iterations. However, keep in mind that there is no reason to assume that this algorithm should converge to a stable region. It is possible that by letting the algorithm run for more iterations the region will end up engulfing the entire image.
  • 442. 408 Chapter 16. Image Segmentation confidenceConnected ->SetNumberOfIterations(5); The output of this filter is a binary image with zero-value pixels everywhere except on the ex- tracted region. The intensity value to be set inside the region is selected with the method SetReplaceValue() confidenceConnected ->SetReplaceValue(255); The initialization of the algorithm requires the user to provide a seed point. It is convenient to select this point to be placed in a typical region of the structure to be segmented. A small neighbor- hood around the seed point will be used to compute the initial mean and standard deviation for the inclusion criterion. The seed is passed in the form of a itk::Index to the SetSeed() method. confidenceConnected ->SetSeed(index); The size of the initial neighborhood around the seed is defined with the method SetInitialNeighborhoodRadius(). The neighborhood will be defined as an N-dimensional rect- angular region with 2r + 1 pixels on the side, where r is the value passed as initial neighborhood radius. confidenceConnected ->SetInitialNeighborhoodRadius(2); The invocation of the Update() method on the writer triggers the execution of the pipeline. It is recommended to place update calls in a try/catch block in case errors occur and exceptions are thrown. try { writer ->Update (); } catch (itk::ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Let’s now run this example using as input the image QB Suburb.png provided in the directory Examples/Data. We can easily segment structures by providing seeds in the appropriate locations. For example
  • 443. 16.2. Segmentation Based on Watersheds 409 Structure Seed Index Lower Upper Output Image Road (110,38) 50 100 Second from left in Figure 16.1 Shadow (118,100) 0 10 Third from left in Figure 16.1 Building (169,146) 220 255 Fourth from left in Figure 16.1 Table 16.3: Parameters used for segmenting some structures shown in Figure 16.1 with the filter itk::ConnectedThresholdImageFilter. Figure 16.5: Segmentation results for the ConfidenceConnected filter for various seed points. 16.2 Segmentation Based on Watersheds 16.2.1 Overview Watershed segmentation classifies pixels into regions using gradient descent on image features and analysis of weak points along region boundaries. Imagine water raining onto a landscape topology and flowing with gravity to collect in low basins. The size of those basins will grow with increasing amounts of precipitation until they spill into one another, causing small basins to merge together into larger basins. Regions (catchment basins) are formed by using local geometric structure to associate points in the image domain with local extrema in some feature measurement such as curvature or gradient magnitude. This technique is less sensitive to user-defined thresholds than classic region- growing methods, and may be better suited for fusing different types of features from different data sets. The watersheds technique is also more flexible in that it does not produce a single image segmentation, but rather a hierarchy of segmentations from which a single region or set of regions can be extracted a-priori, using a threshold, or interactively, with the help of a graphical user interface [152, 153]. The strategy of watershed segmentation is to treat an image f as a height function, i.e., the surface formed by graphing f as a function of its independent parameters, x ∈ U. The image f is often not the original input data, but is derived from that data through some filtering, graded (or fuzzy) feature extraction, or fusion of feature maps from different sources. The assumption is that higher values of f (or − f) indicate the presence of boundaries in the original data. Watersheds may therefore be considered as a final or intermediate step in a hybrid segmentation method, where the initial segmentation is the generation of the edge feature map.
  • 444. 410 Chapter 16. Image Segmentation WatershedDepth Intensity profile of input image Intensity profile of filtered image Watershed Segmentation Figure 16.6: A fuzzy-valued boundary map, from an image or set of images, is segmented using local minima and catchment basins. Gradient descent associates regions with local minima of f (clearly interior points) using the water- sheds of the graph of f, as in Figure 16.6. That is, a segment consists of all points in U whose paths of steepest descent on the graph of f terminate at the same minimum in f. Thus, there are as many segments in an image as there are minima in f. The segment boundaries are “ridges” [80, 81, 45] in the graph of f. In the 1D case (U ⊂ ℜ), the watershed boundaries are the local maxima of f, and the results of the watershed segmentation is trivial. For higher-dimensional image domains, the watershed boundaries are not simply local phenomena; they depend on the shape of the entire watershed. The drawback of watershed segmentation is that it produces a region for each local minimum—in practice too many regions—and an over segmentation results. To alleviate this, we can establish a minimum watershed depth. The watershed depth is the difference in height between the watershed minimum and the lowest boundary point. In other words, it is the maximum depth of water a region could hold without flowing into any of its neighbors. Thus, a watershed segmentation algorithm can sequentially combine watersheds whose depths fall below the minimum until all of the watersheds are of sufficient depth. This depth measurement can be combined with other saliency measurements, such as size. The result is a segmentation containing regions whose boundaries and size are signif- icant. Because the merging process is sequential, it produces a hierarchy of regions, as shown in Figure 16.7. Previous work has shown the benefit of a user-assisted approach that provides a graph- ical interface to this hierarchy, so that a technician can quickly move from the small regions that lie within an area of interest to the union of regions that correspond to the anatomical structure [153]. There are two different algorithms commonly used to implement watersheds: top-down and bottom- up. The top-down, gradient descent strategy was chosen for ITK because we want to consider the output of multi-scale differential operators, and the f in question will therefore have floating point values. The bottom-up strategy starts with seeds at the local minima in the image and grows regions outward and upward at discrete intensity levels (equivalent to a sequence of morphological operations and sometimes called morphological watersheds [126].) This limits the accuracy by enforcing a set of discrete gray levels on the image. Figure 16.8 shows how the ITK image-to-image watersheds filter is constructed. The filter is actually a collection of smaller filters that modularize the several steps of the algorithm in a mini-pipeline. The segmenter object creates the initial segmentation via steepest descent from each pixel to local minima. Shallow background regions are removed (flattened) before segmentation using a simple minimum value threshold (this helps to minimize oversegmentation of the image). The initial seg-
Figure 16.7: A watershed segmentation combined with a saliency measure (watershed depth) produces a hierarchy of regions. Structures can be derived from images by either thresholding the saliency measure or combining subtrees within the hierarchy.
Figure 16.8: The construction of the Insight watersheds filter (a mini-pipeline of segmenter, tree generator and relabeler objects, controlled by the Threshold, Maximum Flood Level and Output Flood Level parameters).
mentation is passed to a second sub-filter that generates a hierarchy of basins to a user-specified maximum watershed depth. The relabeler object at the end of the mini-pipeline uses the hierarchy and the initial segmentation to produce an output image at any scale below the user-specified maximum. Data objects are cached in the mini-pipeline so that changing watershed depths only requires a (fast) relabeling of the basic segmentation. The three parameters that control the filter are shown in Figure 16.8 connected to their relevant processing stages.

16.2.2 Using the ITK Watershed Filter

The source code for this example can be found in the file Examples/Segmentation/WatershedSegmentation.cxx. The following example illustrates how to preprocess and segment images using the itk::WatershedImageFilter. Note that the care with which the data is preprocessed will greatly affect the quality of your result. Typically, the best results are obtained by preprocessing the original image with an edge-preserving diffusion filter, such as one of the anisotropic diffusion filters, or with the bilateral image filter. As noted in Section 16.2.1, the height function used as input should be created such that higher positive values correspond to object boundaries. A suitable height function for many applications can be generated as the gradient magnitude of the image to be segmented. The itk::VectorGradientAnisotropicDiffusionImageFilter class is used to smooth the image and the itk::VectorGradientMagnitudeImageFilter is used to generate the height function. We begin by including all preprocessing filter header files and the header file for the WatershedImageFilter. We use the vector versions of these filters because the input data is a color image.

#include "itkVectorGradientAnisotropicDiffusionImageFilter.h"
#include "itkVectorGradientMagnitudeImageFilter.h"
#include "itkWatershedImageFilter.h"

We now declare the image and pixel types to use for instantiation of the filters. All of these filters expect real-valued pixel types in order to work properly. The preprocessing stages are done directly on the vector-valued data and the segmentation is done using floating point scalar data. Images are converted from RGB pixel type to numerical vector type using itk::VectorCastImageFilter. Please pay attention to the fact that we are using itk::Images, since the itk::VectorGradientMagnitudeImageFilter has some internal typedefs which make polymorphism impossible.

typedef itk::RGBPixel<unsigned char> RGBPixelType;
typedef otb::Image<RGBPixelType, 2> RGBImageType;
typedef itk::Vector<float, 3> VectorPixelType;
typedef itk::Image<VectorPixelType, 2> VectorImageType;
typedef itk::Image<unsigned long, 2> LabeledImageType;
typedef itk::Image<float, 2> ScalarImageType;

The various image processing filters are declared using the types created above and eventually used in the pipeline.
  • 447. 16.2. Segmentation Based on Watersheds 413 typedef otb::ImageFileReader <RGBImageType > FileReaderType; typedef itk::VectorCastImageFilter <RGBImageType , VectorImageType > CastFilterType; typedef itk:: VectorGradientAnisotropicDiffusionImageFilter<VectorImageType , VectorImageType > DiffusionFilterType; typedef itk::VectorGradientMagnitudeImageFilter<VectorImageType , float, ScalarImageType > GradientMagnitudeFilterType; typedef itk::WatershedImageFilter <ScalarImageType > WatershedFilterType; Next we instantiate the filters and set their parameters. The first step in the image processing pipeline is diffusion of the color input image using an anisotropic diffusion filter. For this class of filters, the CFL condition requires that the time step be no more than 0.25 for two-dimensional images, and no more than 0.125 for three-dimensional images. The number of iterations and the conductance term will be taken from the command line. See Section 8.7.2 for more information on the ITK anisotropic diffusion filters. DiffusionFilterType::Pointer diffusion = DiffusionFilterType::New (); diffusion ->SetNumberOfIterations(atoi(argv[4])); diffusion ->SetConductanceParameter(atof(argv[3])); diffusion ->SetTimeStep (0.125); The ITK gradient magnitude filter for vector-valued images can optionally take several parameters. Here we allow only enabling or disabling of principal component analysis. GradientMagnitudeFilterType::Pointer gradient = GradientMagnitudeFilterType::New (); gradient ->SetUsePrincipleComponents(atoi(argv[7])); Finally we set up the watershed filter. There are two parameters. Level controls watershed depth, and Threshold controls the lower thresholding of the input. Both parameters are set as a percentage (0.0 - 1.0) of the maximum depth in the input image. WatershedFilterType::Pointer watershed = WatershedFilterType::New (); watershed ->SetLevel (atof(argv[6])); watershed ->SetThreshold(atof(argv[5])); The output of WatershedImageFilter is an image of unsigned long integer labels, where a label denotes membership of a pixel in a particular segmented region. This format is not practical for visualization, so for the purposes of this example, we will convert it to RGB pixels. RGB images have the advantage that they can be saved as a simple png file and viewed using any standard image viewer software. The itk::Functor::ScalarToRGBPixelFunctor class is a special function object designed to hash a scalar value into an itk::RGBPixel. Plugging this functor into the itk::UnaryFunctorImageFilter creates an image filter for that converts scalar images to RGB images. typedef itk::Functor ::ScalarToRGBPixelFunctor <unsigned long>
  • 448. 414 Chapter 16. Image Segmentation Figure 16.9: Segmented RGB image. At left is the original image. The image in the middle was generated with parameters: conductance = 2.0, iterations = 10, threshold = 0.0, level = 0.05, principal components = on. The image on the right was generated with parameters: conductance = 2.0, iterations = 10, threshold = 0.001, level = 0.15, principal components = off. ColorMapFunctorType; typedef itk::UnaryFunctorImageFilter <LabeledImageType , RGBImageType , ColorMapFunctorType > ColorMapFilterType; ColorMapFilterType::Pointer colormapper = ColorMapFilterType::New(); The filters are connected into a single pipeline, with readers and writers at each end. caster ->SetInput(reader ->GetOutput ()); diffusion ->SetInput (caster ->GetOutput ()); gradient ->SetInput (diffusion ->GetOutput ()); watershed ->SetInput (gradient ->GetOutput ()); colormapper ->SetInput (watershed ->GetOutput ()); writer ->SetInput(colormapper ->GetOutput ()); Tuning the filter parameters for any particular application is a process of trial and error. The thresh- old parameter can be used to great effect in controlling oversegmentation of the image. Raising the threshold will generally reduce computation time and produce output with fewer and larger regions. The trick in tuning parameters is to consider the scale level of the objects that you are trying to segment in the image. The best time/quality trade-off will be achieved when the image is smoothed and thresholded to eliminate features just below the desired scale. Figure 16.9 shows output from the example code. Note that a critical difference between the two segmentations is the mode of the gradient magnitude calculation. A note on the computational complexity of the watershed algorithm is warranted. Most of the complexity of the ITK implementation lies in generating the hierarchy. Processing times for this stage are non-linear with respect to the number of catchment basins in the initial segmentation. This means that the amount of information contained in an image is more significant than the number of pixels in the image. A very large, but very flat input take less time to segment than a very small, but very detailed input.
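Because the merge hierarchy is cached in the mini-pipeline (Section 16.2.1), several segmentation scales can be extracted from a single run by changing only the Level parameter and re-running the tail of the pipeline. The sketch below assumes the watershed, colormapper and writer objects created above; the level values and file names are placeholders.

const double floodLevels[] = { 0.15, 0.10, 0.05 };   // illustrative flood levels, highest first
const char* outputNames[] = { "seg_015.png", "seg_010.png", "seg_005.png" };  // placeholder names

for (unsigned int i = 0; i < 3; ++i)
{
  // Starting from the highest level, lowering the level typically requires
  // only the fast relabeling step rather than a full recomputation.
  watershed->SetLevel(floodLevels[i]);
  writer->SetFileName(outputNames[i]);
  writer->Update();
}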
16.3 Level Set Segmentation

Figure 16.10: Concept of the zero set in a level set: the zero set f(x,y) = 0 separates the interior, where f(x,y) > 0, from the exterior, where f(x,y) < 0.

The paradigm of the level set is that it is a numerical method for tracking the evolution of contours and surfaces. Instead of manipulating the contour directly, the contour is embedded as the zero level set of a higher dimensional function called the level-set function, ψ(X,t). The level-set function is then evolved under the control of a differential equation. At any time, the evolving contour can be obtained by extracting the zero level-set Γ(X,t) = {ψ(X,t) = 0} from the output. The main advantages of using level sets are that arbitrarily complex shapes can be modeled and topological changes such as merging and splitting are handled implicitly.

Level sets can be used for image segmentation by using image-based features such as mean intensity, gradient and edges in the governing differential equation. In a typical approach, a contour is initialized by a user and is then evolved until it fits the form of an object in the image. Many different implementations and variants of this basic concept have been published in the literature. An overview of the field has been made by Sethian [127]. The following sections introduce practical examples of some of the level set segmentation methods available in ITK. The remainder of this section describes features common to all of these filters except the itk::FastMarchingImageFilter, which is derived from a different code framework. Understanding these features will aid in using the filters more effectively.

Each filter makes use of a generic level-set equation to compute the update to the solution ψ of the partial differential equation,

dψ/dt = −α A(x)·∇ψ − β P(x) |∇ψ| + γ Z(x) κ |∇ψ| (16.3)

where A is an advection term, P is a propagation (expansion) term, and Z is a spatial modifier term for the mean curvature κ. The scalar constants α, β, and γ weight the relative influence of each of the terms on the movement of the interface. A segmentation filter may use all of these terms in its calculations, or it may omit one or more terms. If a term is left out of the equation, then setting the corresponding scalar constant weighting will have no effect.

All of the level-set based segmentation filters must operate with floating point precision to produce
Figure 16.11: The implicit level set surface Γ is the black line superimposed over the image grid. The location of the surface is interpolated by the image pixel values. The grid pixels closest to the implicit surface are shown in gray.

valid results. The third, optional template parameter is the numerical type used for calculations and as the output image pixel type. The numerical type is float by default, but can be changed to double for extra precision. A user-defined, signed floating point type that defines all of the necessary arithmetic operators and has sufficient precision is also a valid choice. You should not use types such as int or unsigned char for the numerical parameter. If the input image pixel types do not match the numerical type, those inputs will be cast to an image of appropriate type when the filter is executed.

Most filters require two images as input, an initial model ψ(X,t = 0), and a feature image, which is either the image you wish to segment or some preprocessed version. You must specify the isovalue that represents the surface Γ in your initial model. The single image output of each filter is the function ψ at the final time step. It is important to note that the contour representing the surface Γ is the zero level-set of the output image, and not the isovalue you specified for the initial model. To represent Γ using the original isovalue, simply add that value back to the output. The solution Γ is calculated to subpixel precision. The best discrete approximation of the surface is therefore the set of grid positions closest to the zero-crossings in the image, as shown in Figure 16.11. The itk::ZeroCrossingImageFilter operates by finding exactly those grid positions and can be used to extract the surface.

There are two important considerations when analyzing the processing time for any particular level-set segmentation task: the surface area of the evolving interface and the total distance that the surface must travel. Because the level-set equations are usually solved only at pixels near the surface (fast marching methods are an exception), the time taken at each iteration depends on the number of points on the surface. This means that as the surface grows, the solver will slow down proportionally. Because the surface must evolve slowly to prevent numerical instabilities in the solution, the distance
Figure 16.12: Collaboration diagram of the FastMarchingImageFilter applied to a segmentation task (anisotropic diffusion, gradient magnitude and sigmoid filters produce the speed image; the fast marching output, a time-crossing map, is binarized by a threshold filter).

the surface must travel in the image dictates the total number of iterations required. Some level-set techniques are relatively insensitive to initial conditions and are therefore suitable for region-growing segmentation. Other techniques, such as the itk::LaplacianSegmentationLevelSetImageFilter, can easily become “stuck” on image features close to their initialization and should be used only when a reasonable prior segmentation is available as the initialization. For best efficiency, your initial model of the surface should be the best guess possible for the solution.

16.3.1 Fast Marching Segmentation

The source code for this example can be found in the file Examples/Segmentation/FastMarchingImageFilter.cxx. When the differential equation governing the level set evolution has a very simple form, a fast evolution algorithm called fast marching can be used. The following example illustrates the use of the itk::FastMarchingImageFilter. This filter implements a fast marching solution to a simple level set evolution problem. In this example, the speed term used in the differential equation is expected to be provided by the user in the form of an image. This image is typically computed as a function of the gradient magnitude. Several mappings are popular in the literature, for example, the negative exponential exp(−x) and the reciprocal 1/(1+x). In the current example we decided to use a Sigmoid function since it offers a good deal of control parameters that can be customized to shape a nice speed image. The mapping should be done in such a way that the propagation speed of the front will be very low close to high image gradients while it will move rather fast in low gradient areas. This arrangement will make the contour propagate until it reaches the edges of structures in the image and then slow down in front of those edges.

The output of the FastMarchingImageFilter is a time-crossing map that indicates, for each pixel, how much time it would take for the front to arrive at the pixel location. The application of a threshold in the output image is then equivalent to taking a snapshot of the contour at a particular time during its evolution. It is expected that the contour will take a longer time to cross over the edges of a particular structure. This should result in large changes on the time-crossing map values close to the structure edges. Segmentation is performed with this filter by locating a time range in which the contour was contained for a long time in a region of the image
  • 452. 418 Chapter 16. Image Segmentation space. Figure 16.12 shows the major components involved in the application of the FastMarchingIm- ageFilter to a segmentation task. It involves an initial stage of smoothing using the itk::CurvatureAnisotropicDiffusionImageFilter. The smoothed image is passed as the input to the itk::GradientMagnitudeRecursiveGaussianImageFilter and then to the itk::SigmoidImageFilter. Finally, the output of the FastMarchingImageFilter is passed to a itk::BinaryThresholdImageFilter in order to produce a binary mask representing the seg- mented object. The code in the following example illustrates the typical setup of a pipeline for performing segmen- tation with fast marching. First, the input image is smoothed using an edge-preserving filter. Then the magnitude of its gradient is computed and passed to a sigmoid filter. The result of the sigmoid filter is the image potential that will be used to affect the speed term of the differential equation. Let’s start by including the following headers. First we include the header of the Curvature- AnisotropicDiffusionImageFilter that will be used for removing noise from the input image. #include "itkCurvatureAnisotropicDiffusionImageFilter.h" The headers of the GradientMagnitudeRecursiveGaussianImageFilter and SigmoidImageFilter are included below. Together, these two filters will produce the image potential for regulating the speed term in the differential equation describing the evolution of the level set. #include "itkGradientMagnitudeRecursiveGaussianImageFilter.h" #include "itkSigmoidImageFilter.h" Of course, we will need the otb::Image class and the FastMarchingImageFilter class. Hence we include their headers. #include "otbImage.h" #include "itkFastMarchingImageFilter.h" The time-crossing map resulting from the FastMarchingImageFilter will be thresholded using the BinaryThresholdImageFilter. We include its header here. #include "itkBinaryThresholdImageFilter.h" Reading and writing images will be done with the otb::ImageFileReader and otb::ImageFileWriter. #include "otbImageFileReader.h" #include "otbImageFileWriter.h" We now define the image type using a pixel type and a particular dimension. In this case the float type is used for the pixels due to the requirements of the smoothing filter. typedef float InternalPixelType; const unsigned int Dimension = 2; typedef otb::Image <InternalPixelType , Dimension > InternalImageType;
  • 453. 16.3. Level Set Segmentation 419 The output image, on the other hand, is declared to be binary. typedef unsigned char OutputPixelType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; The type of the BinaryThresholdImageFilterfilter is instantiated below using the internal image type and the output image type. typedef itk::BinaryThresholdImageFilter<InternalImageType , OutputImageType > ThresholdingFilterType; ThresholdingFilterType::Pointer thresholder = ThresholdingFilterType::New(); The upper threshold passed to the BinaryThresholdImageFilter will define the time snapshot that we are taking from the time-crossing map. thresholder ->SetLowerThreshold(0.0); thresholder ->SetUpperThreshold(timeThreshold); thresholder ->SetOutsideValue(0); thresholder ->SetInsideValue(255); We instantiate reader and writer types in the following lines. typedef otb::ImageFileReader <InternalImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; The CurvatureAnisotropicDiffusionImageFilter type is instantiated using the internal image type. typedef itk::CurvatureAnisotropicDiffusionImageFilter< InternalImageType , InternalImageType > SmoothingFilterType; Then, the filter is created by invoking the New() method and assigning the result to a itk::SmartPointer. SmoothingFilterType::Pointer smoothing = SmoothingFilterType::New (); The types of the GradientMagnitudeRecursiveGaussianImageFilter and SigmoidImageFilter are in- stantiated using the internal image type. typedef itk::GradientMagnitudeRecursiveGaussianImageFilter< InternalImageType , InternalImageType > GradientFilterType; typedef itk::SigmoidImageFilter < InternalImageType , InternalImageType > SigmoidFilterType; The corresponding filter objects are instantiated with the New() method.
  • 454. 420 Chapter 16. Image Segmentation GradientFilterType::Pointer gradientMagnitude = GradientFilterType::New (); SigmoidFilterType::Pointer sigmoid = SigmoidFilterType::New(); The minimum and maximum values of the SigmoidImageFilter output are defined with the methods SetOutputMinimum() and SetOutputMaximum(). In our case, we want these two values to be 0.0 and 1.0 respectively in order to get a nice speed image to feed to the FastMarchingImageFilter. sigmoid ->SetOutputMinimum(0.0); sigmoid ->SetOutputMaximum(1.0); We now declare the type of the FastMarchingImageFilter. typedef itk::FastMarchingImageFilter <InternalImageType , InternalImageType > FastMarchingFilterType; Then, we construct one filter of this class using the New() method. FastMarchingFilterType::Pointer fastMarching = FastMarchingFilterType::New(); The filters are now connected in a pipeline shown in Figure 16.12 using the following lines. smoothing ->SetInput (reader ->GetOutput ()); gradientMagnitude ->SetInput (smoothing ->GetOutput ()); sigmoid ->SetInput(gradientMagnitude ->GetOutput ()); fastMarching ->SetInput(sigmoid->GetOutput ()); thresholder ->SetInput (fastMarching ->GetOutput ()); writer ->SetInput(thresholder ->GetOutput ()); The CurvatureAnisotropicDiffusionImageFilter class requires a couple of parameters to be defined. The following are typical values. However they may have to be adjusted depending on the amount of noise present in the input image. smoothing ->SetTimeStep (0.125); smoothing ->SetNumberOfIterations(10); smoothing ->SetConductanceParameter(2.0); The GradientMagnitudeRecursiveGaussianImageFilter performs the equivalent of a convolution with a Gaussian kernel followed by a derivative operator. The sigma of this Gaussian can be used to control the range of influence of the image edges. gradientMagnitude ->SetSigma (sigma); The SigmoidImageFilter class requires two parameters to define the linear transformation to be ap- plied to the sigmoid argument. These parameters are passed using the SetAlpha() and SetBeta() methods. In the context of this example, the parameters are used to intensify the differences between regions of low and high values in the speed image. In an ideal case, the speed value should be 1.0 in the homogeneous regions and the value should decay rapidly to 0.0 around the edges of structures.
  • 455. 16.3. Level Set Segmentation 421 The heuristic for finding the values is the following. From the gradient magnitude image, let’s call K1 the minimum value along the contour of the structure to be segmented. Then, let’s call K2 an average value of the gradient magnitude in the middle of the structure. These two values indicate the dynamic range that we want to map to the interval [0 : 1] in the speed image. We want the sigmoid to map K1 to 0.0 and K2 to 1.0. Given that K1 is expected to be higher than K2 and we want to map those values to 0.0 and 1.0 respectively, we want to select a negative value for alpha so that the sigmoid function will also do an inverse intensity mapping. This mapping will produce a speed image such that the level set will march rapidly on the homogeneous region and will definitely stop on the contour. The suggested value for beta is (K1 + K2)/2 while the suggested value for alpha is (K2 − K1)/6, which must be a negative number. In our simple example the values are provided by the user from the command line arguments. The user can estimate these values by observing the gradient magnitude image. sigmoid ->SetAlpha(alpha); sigmoid ->SetBeta(beta); The FastMarchingImageFilter requires the user to provide a seed point from which the contour will expand. The user can actually pass not only one seed point but a set of them. A good set of seed points increases the chances of segmenting a complex object without missing parts. The use of multiple seeds also helps to reduce the amount of time needed by the front to visit a whole object and hence reduces the risk of leaks on the edges of regions visited earlier. For example, when segmenting an elongated object, it is undesirable to place a single seed at one extreme of the object since the front will need a long time to propagate to the other end of the object. Placing several seeds along the axis of the object will probably be the best strategy to ensure that the entire object is captured early in the expansion of the front. One of the important properties of level sets is their natural ability to fuse several fronts implicitly without any extra bookkeeping. The use of multiple seeds takes good advantage of this property. The seeds are passed stored in a container. The type of this container is defined as NodeContainer among the FastMarchingImageFilter traits. typedef FastMarchingFilterType:: NodeContainer NodeContainer; typedef FastMarchingFilterType::NodeType NodeType ; NodeContainer::Pointer seeds = NodeContainer::New (); Nodes are created as stack variables and initialized with a value and an itk::Index position. NodeType node; const double seedValue = 0.0; node.SetValue (seedValue ); node.SetIndex (seedPosition); The list of nodes is initialized and then every node is inserted using the InsertElement(). seeds ->Initialize (); seeds ->InsertElement(0, node);
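Following the advice above on using several seeds, additional nodes can be appended to the same container before it is handed to the filter. The second seed position below is purely illustrative.

NodeType secondNode;
InternalImageType::IndexType secondPosition;
secondPosition[0] = 95;    // illustrative column, e.g. further along the same structure
secondPosition[1] = 180;   // illustrative row

secondNode.SetValue(seedValue);       // all trial points start at the same arrival time
secondNode.SetIndex(secondPosition);
seeds->InsertElement(1, secondNode);  // index 1: second element of the container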
  • 456. 422 Chapter 16. Image Segmentation Structure Seed Index σ α β Threshold Output Image from left Road (91,176) 0.5 -0.5 3.0 100 First Shadow (118,100) 1.0 -0.5 3.0 100 Second Building (145,21) 0.5 -0.5 3.0 100 Third Table 16.4: Parameters used for segmenting some structures shown in Figure 16.14 using the filter Fast- MarchingImageFilter. All of them used a stopping value of 100. The set of seed nodes is now passed to the FastMarchingImageFilter with the method SetTrialPoints(). fastMarching ->SetTrialPoints(seeds); The FastMarchingImageFilter requires the user to specify the size of the image to be produced as output. This is done using the SetOutputSize(). Note that the size is obtained here from the output image of the smoothing filter. The size of this image is valid only after the Update() methods of this filter has been called directly or indirectly. fastMarching ->SetOutputSize( reader ->GetOutput ()->GetBufferedRegion(). GetSize ()); Since the front representing the contour will propagate continuously over time, it is desirable to stop the process once a certain time has been reached. This allows us to save computation time under the assumption that the region of interest has already been computed. The value for stopping the process is defined with the method SetStoppingValue(). In principle, the stopping value should be a little bit higher than the threshold value. fastMarching ->SetStoppingValue(stoppingTime); The invocation of the Update() method on the writer triggers the execution of the pipeline. As usual, the call is placed in a try/catch block should any errors occur or exceptions be thrown. try { writer ->Update (); } catch (itk::ExceptionObject& excep) { std::cerr << "Exception caught !" << std::endl; std::cerr << excep << std::endl; } Now let’s run this example using the input image QB Suburb.png provided in the directory Examples/Data. We can easily segment structures by providing seeds in the appropriate locations. The following table presents the parameters used for some structures.
  • 457. 16.3. Level Set Segmentation 423 Figure 16.13: Images generated by the segmentation process based on the FastMarchingImageFilter. From left to right and top to bottom: input image to be segmented, image smoothed with an edge-preserving smoothing filter, gradient magnitude of the smoothed image, sigmoid of the gradient magnitude. This last image, the sigmoid, is used to compute the speed term for the front propagation Figure 16.13 presents the intermediate outputs of the pipeline illustrated in Figure 16.12. They are from left to right: the output of the anisotropic diffusion filter, the gradient magnitude of the smoothed image and the sigmoid of the gradient magnitude which is finally used as the speed image for the FastMarchingImageFilter. The following classes provide similar functionality: • itk::ShapeDetectionLevelSetImageFilter • itk::GeodesicActiveContourLevelSetImageFilter • itk::ThresholdSegmentationLevelSetImageFilter • itk::CannySegmentationLevelSetImageFilter • itk::LaplacianSegmentationLevelSetImageFilter See the ITK Software Guide for examples of the use of these classes.
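Finally, since the output of the FastMarchingImageFilter is a time-crossing map, several snapshots of the propagating front can be extracted from a single run simply by re-thresholding the map at different arrival times, without re-executing the fast marching step. The times and file names below are illustrative only and must stay below the value passed to SetStoppingValue().

const double snapshotTimes[] = { 40.0, 70.0, 100.0 };   // illustrative arrival times
const char* snapshotNames[] = { "front_040.png", "front_070.png", "front_100.png" };  // placeholders

for (unsigned int i = 0; i < 3; ++i)
{
  thresholder->SetUpperThreshold(snapshotTimes[i]);
  writer->SetFileName(snapshotNames[i]);
  writer->Update();   // only the thresholding and writing are re-executed
}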
  • 458. 424 Chapter 16. Image Segmentation Figure 16.14: Images generated by the segmentation process based on the FastMarchingImageFilter. From left to right: segmentation of the road, shadow, building.
CHAPTER SEVENTEEN

IMAGE SIMULATION

This chapter deals with image simulation algorithms. Using object transmittance and reflectance together with sensor characteristics, it is possible to generate realistic synthetic hyperspectral data sets. This chapter covers the PROSPECT (leaf optical properties) and SAIL (canopy bidirectional reflectance) models. Vegetation optical properties are modeled using the PROSPECT model [72].

17.1 PROSAIL model

The PROSAIL [72] model is the combination of the PROSPECT leaf optical properties model and the SAIL canopy bidirectional reflectance model. PROSAIL has also been used to develop new methods for the retrieval of vegetation biophysical properties. It links the spectral variation of canopy reflectance, which is mainly related to leaf biochemical contents, with its directional variation, which is primarily related to canopy architecture and soil/vegetation contrast. This link is key to the simultaneous estimation of canopy biophysical/structural variables for applications in agriculture, plant physiology, or ecology, at different scales. PROSAIL has become one of the most popular radiative transfer tools due to its ease of use, general robustness, and consistent validation by lab/field/space experiments over the years.

Here we present a first example, which returns hemispherical and viewing reflectance for wavelengths sampled from 400 to 2500 nm. Inputs are leaf and sensor (intrinsic and extrinsic) characteristics.

The source code for this example can be found in the file Examples/Simulation/ProsailModel.cxx. This example presents how to use the PROSAIL (PROSPECT + SAIL) model to generate viewing reflectance from leaf, vegetation, and viewing parameters. The output can be used, for example, to simulate images. Let's look at the minimal code required to use this algorithm. First, the following headers must be included.

#include "otbLeafParameters.h"
#include "otbSailModel.h"
#include "otbProspectModel.h"

We now define the leaf parameters, which characterize the vegetation composition.

typedef otb::LeafParameters LeafParametersType;

Next, the parameters variable is created by invoking the New() method and assigning the result to an itk::SmartPointer.

LeafParametersType::Pointer leafParams = LeafParametersType::New();

Leaf characteristics are then set. The input parameters are:

• Chlorophyll concentration (Cab) in µg/cm².
• Carotenoid concentration (Car) in µg/cm².
• Brown pigment content (CBrown) in arbitrary units.
• Water thickness EWT (Cw) in cm.
• Dry matter content LMA (Cm) in g/cm².
• Leaf structure parameter (N).

double Cab = static_cast<double>(atof(argv[1]));
double Car = static_cast<double>(atof(argv[2]));
double CBrown = static_cast<double>(atof(argv[3]));
double Cw = static_cast<double>(atof(argv[4]));
double Cm = static_cast<double>(atof(argv[5]));
double N = static_cast<double>(atof(argv[6]));

leafParams->SetCab(Cab);
leafParams->SetCar(Car);
leafParams->SetCBrown(CBrown);
leafParams->SetCw(Cw);
leafParams->SetCm(Cm);
leafParams->SetN(N);

Leaf parameters are used as the PROSPECT input.

typedef otb::ProspectModel ProspectType;
ProspectType::Pointer prospect = ProspectType::New();
prospect->SetInput(leafParams);

Now we use the SAIL model to generate the transmittance and reflectance spectra. The SAIL model is created by invoking the New() method and assigning the result to an itk::SmartPointer. The SAIL input parameters are:
• leaf area index (LAI).
• average leaf angle (Angl) in degrees.
• soil coefficient (PSoil).
• diffuse/direct radiation (Skyl).
• hot spot (HSpot).
• solar zenith angle (TTS) in degrees.
• observer zenith angle (TTO) in degrees.
• azimuth (PSI) in degrees.

double LAI = static_cast<double>(atof(argv[7]));
double Angl = static_cast<double>(atof(argv[8]));
double PSoil = static_cast<double>(atof(argv[9]));
double Skyl = static_cast<double>(atof(argv[10]));
double HSpot = static_cast<double>(atof(argv[11]));
double TTS = static_cast<double>(atof(argv[12]));
double TTO = static_cast<double>(atof(argv[13]));
double PSI = static_cast<double>(atof(argv[14]));

typedef otb::SailModel SailType;
SailType::Pointer sail = SailType::New();

sail->SetLAI(LAI);
sail->SetAngl(Angl);
sail->SetPSoil(PSoil);
sail->SetSkyl(Skyl);
sail->SetHSpot(HSpot);
sail->SetTTS(TTS);
sail->SetTTO(TTO);
sail->SetPSI(PSI);

The reflectance and transmittance are set from the PROSPECT output.

sail->SetReflectance(prospect->GetReflectance());
sail->SetTransmittance(prospect->GetTransmittance());

The invocation of the Update() method triggers the execution of the pipeline.

sail->Update();

The GetViewingReflectance() method provides the viewing reflectance vector (of size N×2, where N is the number of sampled wavelength values; the columns correspond respectively to wavelength and viewing reflectance) through its GetResponse() method. The GetHemisphericalReflectance() method provides the hemispherical reflectance vector (of size N×2, where N is the number of sampled wavelength values; the columns correspond to wavelength and hemispherical reflectance) through its GetResponse() method.
Note that PROSAIL simulations are done for 2100 samples, starting from 400 nm up to 2500 nm.

for (unsigned int i = 0; i < sail->GetViewingReflectance()->Size(); ++i)
{
  std::cout << "wavelength : ";
  std::cout << sail->GetViewingReflectance()->GetResponse()[i].first;
  std::cout << ". Viewing reflectance ";
  std::cout << sail->GetViewingReflectance()->GetResponse()[i].second;
  std::cout << ". Hemispherical reflectance ";
  std::cout << sail->GetHemisphericalReflectance()->GetResponse()[i].second;
  std::cout << std::endl;
}

Here are some example parameter values:

• Cab 30.0
• Car 10.0
• CBrown 0.0
• Cw 0.015
• Cm 0.009
• N 1.2
• LAI 2
• Angl 50
• PSoil 1
• Skyl 70
• HSpot 0.2
• TTS 30
• TTO 0
• PSI 0

More information and data about leaf properties can be found at Stéphane Jacquemoud's OPTICLEAF website.
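Purely as an illustration (not part of ProsailModel.cxx), these example values could be hard-coded into the setters shown above instead of being read from argv:

// Leaf parameters taken from the example list above
leafParams->SetCab(30.0);
leafParams->SetCar(10.0);
leafParams->SetCBrown(0.0);
leafParams->SetCw(0.015);
leafParams->SetCm(0.009);
leafParams->SetN(1.2);

// Canopy and acquisition parameters taken from the example list above
sail->SetLAI(2.0);
sail->SetAngl(50.0);
sail->SetPSoil(1.0);
sail->SetSkyl(70.0);
sail->SetHSpot(0.2);
sail->SetTTS(30.0);
sail->SetTTO(0.0);
sail->SetPSI(0.0);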
17.2 Image Simulation

Here we present a complete pipeline to simulate an image using sensor characteristics and object reflectance and transmittance properties. This example uses:

• an input image;
• a label image: describes image object properties;
• label properties: describes the characteristics of each label;
• a mask: the vegetation image mask;
• a cloud mask (optional);
• an acquisition parameter file: file containing the parameters of the acquisition;
• an RSR file: file name of the relative spectral response to be used;
• a sensor FTM file: file name for the sensor spatial interpolation.

The algorithm is divided into the following steps:

1. LAI (Leaf Area Index) image estimation using the NDVI formula.
2. Sensor Reduced Spectral Response (RSR) computation using the PROSAIL reflectance output interpolated at the sensor spectral bands.
3. Image simulation using the sensor RSR and the sensor FTM.

17.2.1 LAI image estimation

The source code for this example can be found in the file Examples/Simulation/LAIFromNDVIImageTransform.cxx.

This example presents a way to generate an LAI (Leaf Area Index) image using a formula dedicated to Formosat-2. The LAI image is used as an input of the image simulation process.

Let's look at the minimal code required to use this algorithm. First, the following headers must be included.

#include "otbMultiChannelRAndNIRIndexImageFilter.h"
#include "otbVegetationIndicesFunctor.h"

The filter type is a generic otb::MultiChannelRAndNIRIndexImageFilter using the Formosat-2 specific otb::LAIFromNDVIFormosat2Functor.
Figure 17.1: LAI generation (right) from NDVI applied to a Formosat-2 image (left).

typedef otb::Functor::LAIFromNDVIFormosat2Functor
    <InputImageType::InternalPixelType,
     InputImageType::InternalPixelType,
     OutputImageType::PixelType> FunctorType;

typedef otb::MultiChannelRAndNIRIndexImageFilter
    <InputImageType, OutputImageType, FunctorType>
    MultiChannelRAndNIRIndexImageFilterType;

Next, the filter is created by invoking the New() method and assigning the result to an itk::SmartPointer.

MultiChannelRAndNIRIndexImageFilterType::Pointer filter = MultiChannelRAndNIRIndexImageFilterType::New();

The filter input is set with the input image.

filter->SetInput(reader->GetOutput());

Then the red and NIR channel indices are set using SetRedIndex() and SetNIRIndex().

unsigned int redChannel = static_cast<unsigned int>(atoi(argv[5]));
unsigned int nirChannel = static_cast<unsigned int>(atoi(argv[6]));
filter->SetRedIndex(redChannel);
filter->SetNIRIndex(nirChannel);

The invocation of the Update() method triggers the execution of the pipeline.

filter->Update();

Figure 17.1 illustrates the LAI generation using Formosat-2 data.
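In a complete program the LAI image would then be written to disk. The following is only a sketch, assuming a hypothetical output filename variable outputLAIFileName and the writer types used elsewhere in this guide:

typedef otb::ImageFileWriter<OutputImageType> LAIWriterType;
LAIWriterType::Pointer laiWriter = LAIWriterType::New();
laiWriter->SetFileName(outputLAIFileName);   // hypothetical variable
laiWriter->SetInput(filter->GetOutput());
laiWriter->Update();                         // would trigger the pipeline instead of filter->Update()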
  • 465. 17.2. Image Simulation 431 17.2.2 Sensor RSR Image Simulation The source code for this example can be found in the file Examples/Simulation/LAIAndPROSAILToSensorResponse.cxx. The following code is an example of Sensor spectral response image generated using image of labeled objects image, objects properties (vegetation classes are handled using PROSAIL model, non-vegetation classes are characterized using Aster database characteristics provided by a text file), acquisition parameters, sensor characteristics, and LAI (Leaf Area Index) image. Sensor RSR is modeled by 6S (Second Simulation of a Satellite Signal in the Solar Spectrum) model [137]. Detailed information about 6S can be found here. Let’s look at the minimal code required to use this algorithm. First, the following headers must be included. #include "otbProspectModel.h" #include "otbSailModel.h" #include "otbLeafParameters.h" #include "otbSatelliteRSR.h" #include "otbReduceSpectralResponse.h" #include "otbImageSimulationMethod.h" #include "otbSpatialisationFilter.h" #include "otbAttributesMapLabelObject.h" #include "otbSpectralResponse.h" #include "itkTernaryFunctorImageFilter.h" #include "otbVegetationIndicesFunctor.h" #include "otbRAndNIRIndexImageFilter.h" #include "otbVectorDataToLabelMapWithAttributesFilter.h" ImageUniqueValuesCalculator class is defined here. Method GetUniqueValues() returns an array with all values contained in an image. This class is implemented and used to test if all labels in labeled image are present in label parameter file. template < class TImage > class ITK_EXPORT ImageUniqueValuesCalculator : public itk ::Object { public: typedef ImageUniqueValuesCalculator<TImage > Self; typedef itk::Object Superclass ; typedef itk::SmartPointer <Self > Pointer; typedef itk::SmartPointer <const Self > ConstPointer; itkNewMacro (Self); itkTypeMacro(ImageUniqueValuesCalculator, itk::Object); itkStaticConstMacro(ImageDimension , unsigned int, TImage::ImageDimension); typedef typename TImage::PixelType PixelType ; typedef TImage ImageType ;
  • 466. 432 Chapter 17. Image Simulation typedef std::vector <PixelType > ArrayType ; typedef typename ImageType ::Pointer ImagePointer; typedef typename ImageType ::ConstPointer ImageConstPointer; virtual void SetImage ( const ImageType * image ) { if ( m_Image != image ) { m_Image = image; this->Modified (); } } ArrayType GetUniqueValues() const { typedef typename ImageType :: IndexType IndexType ; if( !m_Image ) { itkExceptionMacro(<<"GetUniqueValues(): Null input image pointer."); } itk::ImageRegionConstIterator< ImageType > it( m_Image, m_Image->GetRequestedRegion() ); ArrayType uniqueValues; uniqueValues.push_back (it.Get ()); ++it; while( !it.IsAtEnd () ) { if( std::find(uniqueValues.begin(), uniqueValues.end(), it.Get ()) == uniqueValues.end()) { uniqueValues.push_back (it.Get ()); } ++it; } return uniqueValues; } protected: ImageUniqueValuesCalculator() { m_Image = NULL; } virtual ˜ImageUniqueValuesCalculator() { } void PrintSelf (std::ostream& os, itk::Indent indent) const
{
  Superclass::PrintSelf(os, indent);
  os << indent << "Image: " << m_Image.GetPointer() << std::endl;
}

private:
  ImageUniqueValuesCalculator(const Self&); //purposely not implemented
  void operator=(const Self&); //purposely not implemented

  ImageConstPointer m_Image;

}; // class ImageUniqueValuesCalculator

The ProsailSimulatorFunctor functor is defined here.

template<class TLAI, class TLabel, class TMask, class TOutput,
         class TLabelSpectra, class TLabelParameter,
         class TAcquistionParameter, class TSatRSR>
class ProsailSimulatorFunctor

The functor type definitions follow.

typedef TLAI LAIPixelType;
typedef TLabel LabelPixelType;
typedef TMask MaskPixelType;
typedef TOutput OutputPixelType;
typedef TLabelSpectra LabelSpectraType;
typedef TLabelParameter LabelParameterType;
typedef TAcquistionParameter AcquistionParameterType;
typedef TSatRSR SatRSRType;
typedef typename SatRSRType::Pointer SatRSRPointerType;
typedef typename otb::ProspectModel ProspectType;
typedef typename otb::SailModel SailType;

typedef double PrecisionType;
typedef std::pair<PrecisionType, PrecisionType> PairType;
typedef typename std::vector<PairType> VectorPairType;
typedef otb::SpectralResponse<PrecisionType, PrecisionType> ResponseType;
typedef ResponseType::Pointer ResponsePointerType;
typedef otb::ReduceSpectralResponse<ResponseType, SatRSRType> ReduceResponseType;
typedef typename ReduceResponseType::Pointer ReduceResponseTypePointerType;

In this example, spectra are generated from 400 to 2400 nm. The number of simulated bands is set by the SimNbBands value.

static const unsigned int SimNbBands = 2000;

The mask value is read to decide whether the pixel has to be computed; otherwise the output pixel is set to 0.

OutputPixelType pix;
pix.SetSize(m_SatRSR->GetNbBands());
  • 468. 434 Chapter 17. Image Simulation if ((! mask && !m_InvertedMask) || (mask && m_InvertedMask)) { for (unsigned int i = 0; i < m_SatRSR ->GetNbBands (); i++) pix[i] = static_cast<typename OutputPixelType::ValueType > (0); return pix; } Object reflectance hxSpectrum is calculated. If object label correspond to vegetation label then Prosail code is used, aster database is used otherwise. VectorPairType hxSpectrum ; for (unsigned int i = 0; i < SimNbBands ; i++) { PairType resp; resp.first = static_cast<PrecisionType > ((400.0 + i) / 1000); hxSpectrum .push_back (resp); } // either the spectrum has to be simulated by Prospect+Sail if (m_LabelParameters.find(label) != m_LabelParameters.end()) { ProspectType::Pointer prospect = ProspectType::New(); prospect ->SetInput(m_LabelParameters[label]); SailType ::Pointer sail = SailType ::New (); sail ->SetLAI(lai); sail ->SetAngl(m_AcquisitionParameters[std::string("Angl")]); sail ->SetPSoil (m_AcquisitionParameters[std::string("PSoil")]); sail ->SetSkyl(m_AcquisitionParameters[std::string("Skyl")]); sail ->SetHSpot (m_AcquisitionParameters[std::string("HSpot")]); sail ->SetTTS(m_AcquisitionParameters[std::string("TTS")]); sail ->SetTTO(m_AcquisitionParameters[std::string("TTO")]); sail ->SetPSI(m_AcquisitionParameters[std::string("PSI")]); sail ->SetReflectance(prospect ->GetReflectance()); sail ->SetTransmittance(prospect ->GetTransmittance()); sail ->Update (); for (unsigned int i = 0; i < SimNbBands ; i++) { hxSpectrum [i].second = static_cast<typename OutputPixelType::ValueType > (sail ->GetHemisphericalReflectance()->GetResponse ()[i].second); } } // or the spectra has been set from outside the functor (ex. bare soil, etc.) else if (m_LabelSpectra.find(label) != m_LabelSpectra.end ()) { for (unsigned int i = 0; i < SimNbBands ; i++) hxSpectrum [i].second = static_cast<typename OutputPixelType::ValueType > (m_LabelSpectra[label][i]);
}
// or the class does not exist
else
{
  for (unsigned int i = 0; i < SimNbBands; i++)
    hxSpectrum[i].second = static_cast<typename OutputPixelType::ValueType>(0);
}

The spectral response aResponse is set using hxSpectrum.

ResponseType::Pointer aResponse = ResponseType::New();
aResponse->SetResponse(hxSpectrum);

The satellite RSR is initialized and set with aResponse.

ReduceResponseTypePointerType reduceResponse = ReduceResponseType::New();
reduceResponse->SetInputSatRSR(m_SatRSR);
reduceResponse->SetInputSpectralResponse(aResponse);
reduceResponse->CalculateResponse();
VectorPairType reducedResponse = reduceResponse->GetReduceResponse()->GetResponse();

The pix value is returned for the desired satellite bands.

for (unsigned int i = 0; i < m_SatRSR->GetNbBands(); i++)
  pix[i] = static_cast<typename OutputPixelType::ValueType>(reducedResponse[i].second);
return pix;

The TernaryFunctorImageFilterWithNBands class is defined here. This class inherits from itk::TernaryFunctorImageFilter, with an additional number-of-bands parameter. It is implemented to process the label, LAI, and mask images with the simulation functor.

template <class TInputImage1, class TInputImage2, class TInputImage3,
          class TOutputImage, class TFunctor>
class ITK_EXPORT TernaryFunctorImageFilterWithNBands :
  public itk::TernaryFunctorImageFilter<TInputImage1, TInputImage2, TInputImage3,
                                        TOutputImage, TFunctor>
{
public:
  typedef TernaryFunctorImageFilterWithNBands Self;
  typedef itk::TernaryFunctorImageFilter<TInputImage1, TInputImage2, TInputImage3,
                                         TOutputImage, TFunctor> Superclass;
  typedef itk::SmartPointer<Self> Pointer;
  typedef itk::SmartPointer<const Self> ConstPointer;

  /** Method for creation through the object factory. */
  itkNewMacro(Self);

  /** Macro defining the type */
  itkTypeMacro(TernaryFunctorImageFilterWithNBands, Superclass);
  /** Accessors for the number of bands */
  itkSetMacro(NumberOfOutputBands, unsigned int);
  itkGetConstMacro(NumberOfOutputBands, unsigned int);

protected:
  TernaryFunctorImageFilterWithNBands() {}
  virtual ~TernaryFunctorImageFilterWithNBands() {}

  void GenerateOutputInformation()
  {
    Superclass::GenerateOutputInformation();
    this->GetOutput()->SetNumberOfComponentsPerPixel(m_NumberOfOutputBands);
  }

private:
  TernaryFunctorImageFilterWithNBands(const Self &); //purposely not implemented
  void operator =(const Self&); //purposely not implemented

  unsigned int m_NumberOfOutputBands;
};

The input image typedefs are presented below. This example uses a double LAI image, binary vegetation and cloud masks, and an integer label image.

typedef double LAIPixelType;
typedef unsigned short LabelType;
typedef unsigned short MaskPixelType;
typedef float OutputPixelType;

// Image typedef
typedef otb::Image<LAIPixelType, 2> LAIImageType;
typedef otb::Image<LabelType, 2> LabelImageType;
typedef otb::Image<MaskPixelType, 2> MaskImageType;
typedef otb::VectorImage<OutputPixelType, 2> SimulatedImageType;

The leaf parameters typedefs are defined.

typedef otb::LeafParameters LeafParametersType;
typedef LeafParametersType::Pointer LeafParametersPointerType;
typedef std::map<LabelType, LeafParametersPointerType> LabelParameterMapType;

The sensor spectral response typedefs are defined.

typedef double PrecisionType;
typedef std::vector<PrecisionType> SpectraType;
typedef std::map<LabelType, SpectraType> SpectraParameterType;

The acquisition parameters typedef is defined.

typedef std::map<std::string, double> AcquistionParsType;
The satellite RSR typedef is defined.

typedef otb::SatelliteRSR<PrecisionType, PrecisionType> SatRSRType;

The filter type is the specific TernaryFunctorImageFilterWithNBands defined above, with the specific functor.

typedef otb::Functor::ProsailSimulatorFunctor
    <LAIPixelType, LabelType, MaskPixelType, SimulatedImageType::PixelType,
     SpectraParameterType, LabelParameterMapType, AcquistionParsType, SatRSRType> SimuFunctorType;

typedef otb::TernaryFunctorImageFilterWithNBands
    <LAIImageType, LabelImageType, MaskImageType,
     SimulatedImageType, SimuFunctorType> SimulatorType;

Acquisition parameters are loaded from a text file. A detailed definition of the acquisition parameters can be found in the otb::SailModel class.

AcquistionParsType acquistionPars;
acquistionPars[std::string("Angl")] = 0.0;
acquistionPars[std::string("PSoil")] = 0.0;
acquistionPars[std::string("Skyl")] = 0.0;
acquistionPars[std::string("HSpot")] = 0.0;
acquistionPars[std::string("TTS")] = 0.0;
acquistionPars[std::string("TTO")] = 0.0;
acquistionPars[std::string("PSI")] = 0.0;

std::ifstream acquistionParsFile;

try
{
  acquistionParsFile.open(apfname);
}
catch (...)
{
  std::cerr << "Could not open file " << apfname << std::endl;
  return EXIT_FAILURE;
}

while (acquistionParsFile.good())
{
  std::string line;
  std::getline(acquistionParsFile, line);
  std::stringstream ss(line);
  std::string parName;
  ss >> parName;
  double parValue;
  ss >> parValue;
  acquistionPars[parName] = parValue;
}

acquistionParsFile.close();
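The guide does not reproduce the acquisition parameter file itself. Given the parser above (one parameter name followed by its value per line), a plausible file, reusing the example values from Section 17.1, could look like this (illustrative only, not the file shipped with OTB):

Angl 50
PSoil 1
Skyl 70
HSpot 0.2
TTS 30
TTO 0
PSI 0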
  • 472. 438 Chapter 17. Image Simulation Label parameters are loaded using text file. Two type of object characteristic can be found. If label corresponds to vegetation class, then leaf parameters are loaded. A detailled definition of leaf parameters can be found in class otb::LeafParameters class. Otherwise object reflectance is generated from 400 to 2400nm using Aster database. LabelParameterMapType labelParameters; std::ifstream labelParsFile; SpectraParameterType spectraParameters; try { labelParsFile.open(lpfname, std::ifstream ::in); } catch (...) { std::cerr << "Could not open file " << lpfname << std::endl; return EXIT_FAILURE; } while (labelParsFile.good()) { char fileLine [256]; labelParsFile.getline(fileLine , 256); if (fileLine [0] != ’#’) { std::stringstream ss(fileLine ); unsigned short label; ss >> label; unsigned short paramsOrSpectra; ss >> paramsOrSpectra; if (paramsOrSpectra == 1) { double Cab; ss >> Cab; double Car; ss >> Car; double CBrown; ss >> CBrown; double Cw; ss >> Cw; double Cm; ss >> Cm; double N; ss >> N; LeafParametersType::Pointer leafParams = LeafParametersType::New (); leafParams ->SetCab(Cab); leafParams ->SetCar(Car); leafParams ->SetCBrown (CBrown); leafParams ->SetCw(Cw); leafParams ->SetCm(Cm); leafParams ->SetN(N);
  • 473. 17.2. Image Simulation 439 labelParameters[label] = leafParams ; } else { std::string spectraFilename = rootPath; ss >> spectraFilename; spectraFilename = rootPath + spectraFilename; typedef otb::SpectralResponse <PrecisionType , PrecisionType > ResponseType; ResponseType::Pointer resp = ResponseType::New(); // Coefficient 100 since Aster database is given in % reflectance resp ->Load(spectraFilename , 100.0); SpectraType spec; for (unsigned int i = 0; i < SimuFunctorType::SimNbBands ; i++) //Prosail starts at 400 and lambda in Aster DB is in micrometers spec.push_back (static_cast<PrecisionType > ((*resp)((i + 400.0) / 1000.0))); spectraParameters[label] = spec; } } } labelParsFile.close(); LAI image is read. LAIReaderType::Pointer laiReader = LAIReaderType::New(); laiReader ->SetFileName (laiifname ); laiReader ->Update (); LAIImageType::Pointer laiImage = laiReader ->GetOutput (); Label image is then read. Label image is processed using ImageUniqueValuesCalculator in order to check if all the labels are present in the labelParameters file. LabelReaderType::Pointer labelReader = LabelReaderType::New(); labelReader ->SetFileName (lifname ); labelReader ->Update(); LabelImageType::Pointer labelImage = labelReader ->GetOutput (); typedef otb::ImageUniqueValuesCalculator<LabelImageType > UniqueCalculatorType; UniqueCalculatorType::Pointer uniqueCalculator = UniqueCalculatorType::New(); uniqueCalculator ->SetImage (labelImage ); UniqueCalculatorType::ArrayType uniqueVals = uniqueCalculator ->GetUniqueValues(); std::cout << "Labels are " << std::endl; UniqueCalculatorType::ArrayType ::const_iterator uvIt = uniqueVals .begin(); while (uvIt != uniqueVals .end ())
  • 474. 440 Chapter 17. Image Simulation { std::cout << (*uvIt) << ", "; ++uvIt; } std::cout << std::endl; uvIt = uniqueVals .begin(); while (uvIt != uniqueVals .end ()) { if (labelParameters.find(static_cast<LabelType > (*uvIt)) == labelParameters.end() && spectraParameters.find(static_cast<LabelType > (*uvIt)) == spectraParameters.end() && static_cast<LabelType > (*uvIt) != 0) { std::cout << "label " << (*uvIt) << " not found in " << lpfname << std::endl; return EXIT_FAILURE; } ++uvIt; } Mask image is read. If cloud mask is filename is given, a new mask image is generated with masks concatenation. MaskReaderType::Pointer miReader = MaskReaderType::New(); miReader ->SetFileName (mifname); miReader ->UpdateOutputInformation(); MaskImageType::Pointer maskImage = miReader ->GetOutput (); if (cmifname != NULL) { MaskReaderType::Pointer cmiReader = MaskReaderType::New (); cmiReader ->SetFileName (cmifname ); cmiReader ->UpdateOutputInformation(); typedef itk::OrImageFilter <MaskImageType , MaskImageType , MaskImageType > OrType; OrType::Pointer orfilter = OrType::New(); orfilter ->SetInput1 (miReader ->GetOutput ()); orfilter ->SetInput2 (cmiReader ->GetOutput ()); orfilter ->Update(); maskImage = orfilter ->GetOutput (); } A test is done. All images must have the same size. if (laiImage ->GetLargestPossibleRegion(). GetSize ()[0] !=
    labelImage->GetLargestPossibleRegion().GetSize()[0] ||
    laiImage->GetLargestPossibleRegion().GetSize()[1] !=
    labelImage->GetLargestPossibleRegion().GetSize()[1] ||
    laiImage->GetLargestPossibleRegion().GetSize()[0] !=
    maskImage->GetLargestPossibleRegion().GetSize()[0] ||
    laiImage->GetLargestPossibleRegion().GetSize()[1] !=
    maskImage->GetLargestPossibleRegion().GetSize()[1])
{
  std::cerr << "Image of labels, mask and LAI image must have the same size" << std::endl;
  return EXIT_FAILURE;
}

The satellite RSR (Relative Spectral Response) is defined using the filename and the number of bands given as command-line arguments.

SatRSRType::Pointer satRSR = SatRSRType::New();
satRSR->SetNbBands(nbBands);
satRSR->SetSortBands(false);
satRSR->Load(rsfname);

for (unsigned int i = 0; i < nbBands; ++i)
  std::cout << i << " " << (satRSR->GetRSR())[i]->GetInterval().first
            << " " << (satRSR->GetRSR())[i]->GetInterval().second << std::endl;

At this step, all the initializations have been done. The next step is to instantiate and initialize the simulation functor ProsailSimulatorFunctor.

SimuFunctorType simuFunctor;
simuFunctor.SetLabelParameters(labelParameters);
simuFunctor.SetLabelSpectra(spectraParameters);
simuFunctor.SetAcquisitionParameters(acquistionPars);
simuFunctor.SetRSR(satRSR);
simuFunctor.SetInvertedMask(true);

The inputs and the functor are plugged into the simulator filter.

SimulatorType::Pointer simulator = SimulatorType::New();
simulator->SetInput1(laiImage);
simulator->SetInput2(labelImage);
simulator->SetInput3(maskImage);
simulator->SetFunctor(simuFunctor);
simulator->SetNumberOfOutputBands(nbBands);

The invocation of the Update() method triggers the execution of the pipeline.

simulator->Update();
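The excerpt stops at the simulator update. To actually save the simulated image, a writer would typically be plugged in; the following is only a sketch, assuming a hypothetical output filename variable outfname and the streamed writer used throughout this guide:

typedef otb::ImageFileWriter<SimulatedImageType> SimulatedWriterType;
SimulatedWriterType::Pointer simWriter = SimulatedWriterType::New();
simWriter->SetFileName(outfname);             // hypothetical output filename
simWriter->SetInput(simulator->GetOutput());
simWriter->Update();                          // could replace the direct simulator->Update() call above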
CHAPTER EIGHTEEN

DIMENSION REDUCTION

Dimension reduction is a statistical process which concentrates the amount of information in multivariate data into a smaller number of variables (or dimensions). An interesting review of the domain has been done by Fodor [48].

Though there are plenty of non-linear methods in the literature, OTB provides only linear dimension reduction techniques applied to images for now.

Usually, linear dimension-reduction algorithms try to find a set of linear combinations of the input image bands that maximise a given criterion, often chosen so that the image information concentrates on the first components. Algorithms differ by the criterion to optimise and also by their handling of the signal or image noise.

In remote-sensing image processing, dimension reduction algorithms are of great interest for denoising, or as a preliminary processing step for the classification of feature images or the unmixing of hyperspectral images. In addition to the denoising effect, the advantage of dimension reduction in the two latter cases is that it lowers the size of the data to be analysed and, as such, speeds up the processing without too much loss of accuracy.

18.1 Principal Component Analysis

The source code for this example can be found in the file Examples/DimensionReduction/PCAExample.cxx.

This example illustrates the use of the otb::PCAImageFilter. This filter computes a Principal Component Analysis using an efficient method based on the inner product in order to compute the covariance matrix.

The first step required to use this filter is to include its header file.

#include "otbPCAImageFilter.h"
We start by defining the types for the images, the reader and the writer. We choose to work with an otb::VectorImage, since we will produce a multi-channel image (the principal components) from a multi-channel input image.

typedef otb::VectorImage<PixelType, Dimension> ImageType;
typedef otb::ImageFileReader<ImageType> ReaderType;
typedef otb::ImageFileWriter<ImageType> WriterType;

We now instantiate the image reader and we set the image file name.

ReaderType::Pointer reader = ReaderType::New();
reader->SetFileName(inputFileName);

We define the type for the filter. It is templated over the input and the output image types and also the transformation direction. The internal structure of this filter is a filter-to-filter like structure. We can now instantiate the filter.

typedef otb::PCAImageFilter<ImageType, ImageType, otb::Transform::FORWARD> PCAFilterType;
PCAFilterType::Pointer pcafilter = PCAFilterType::New();

The only parameter needed for the PCA is the number of principal components required as output. Principal components are linear combinations of the input components (here the input image bands), which are selected using the Singular Value Decomposition eigenvectors sorted by eigenvalue. We can choose to get fewer principal components than the number of input bands.

pcafilter->SetNumberOfPrincipalComponentsRequired(numberOfPrincipalComponentsRequired);

We now instantiate the writer and set the file name for the output image.

WriterType::Pointer writer = WriterType::New();
writer->SetFileName(outputFilename);

We finally plug the pipeline and trigger the PCA computation with the Update() method of the writer.

pcafilter->SetInput(reader->GetOutput());
writer->SetInput(pcafilter->GetOutput());
writer->Update();

otb::PCAImageFilter also allows computing the inverse transformation from the PCA coefficients. In reverse mode, the covariance matrix or the transformation matrix (which may not be square) has to be given.

typedef otb::PCAImageFilter<ImageType, ImageType, otb::Transform::INVERSE> InvPCAFilterType;
InvPCAFilterType::Pointer invFilter = InvPCAFilterType::New();
  • 479. 18.2. Noise-Adjusted Principal Components Analysis 445 Figure 18.1: Result of applying the otb::PCAImageFilter to an image. From left to right: original image, color composition with first three principal components and output of the inverse mode (the input RGB image). invFilter ->SetInput (pcafilter ->GetOutput ()); invFilter ->SetTransformationMatrix(pcafilter ->GetTransformationMatrix()); WriterType ::Pointer invWriter = WriterType ::New(); invWriter ->SetFileName (outputInverseFilename ); invWriter ->SetInput (invFilter ->GetOutput () ); invWriter ->Update (); Figure 18.1 shows the result of applying forward and reverse PCA transformation to a 8 bands Worldview2 image. 18.2 Noise-Adjusted Principal Components Analysis The source code for this example can be found in the file Examples/DimensionReduction/NAPCAExample.cxx. This example illustrates the use of the otb::NAPCAImageFilter. This filter computes a Noise- Adjusted Principal Component Analysis transform [86] using an efficient method based on the inner product in order to compute the covariance matrix. The Noise-Adjusted Principal Component Analysis transform is a sequence of two Principal Com- ponent Analysis transforms. The first transform is based on an estimated covariance matrix of the noise, and intends to whiten the input image (noise with unit variance and no correlation between bands). The second Principal Component Analysis is then applied to the noise-whitened image, giving the Maximum Noise Fraction transform. Applying PCA on noise-whitened image consists in ranking
  • 480. 446 Chapter 18. Dimension Reduction Principal Components according to signal to noise ratio. It is basically a reformulation of the Maximum Noise Fraction algorithm. The first step required to use this filter is to include its header file. #include "otbNAPCAImageFilter.h" We also need to include the header of the noise filter. #include "otbLocalActivityVectorImageFilter.h" We start by defining the types for the images, the reader and the writer. We choose to work with a otb::VectorImage, since we will produce a multi-channel image (the principal components) from a multi-channel input image. typedef otb::VectorImage <PixelType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <ImageType > WriterType ; We instantiate now the image reader and we set the image file name. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFileName); In contrast with standard Principal Component Analysis, NA-PCA needs an estimation of the noise correlation matrix in the dataset prior to transformation. A classical approach is to use spatial gradient images and infer the noise correlation matrix from it. The method of noise estimation can be customized by templating the otb::NAPCAImageFilter with the desired noise estimation method. In this implementation, noise is estimated from a local window. We define the type of the noise filter. typedef otb::LocalActivityVectorImageFilter<ImageType ,ImageType > NoiseFilterType; We define the type for the filter. It is templated over the input and the output image types, the noise estimation filter type, and also the transformation direction. The internal structure of this filter is a filter-to-filter like structure. We can now the instantiate the filter. typedef otb::NAPCAImageFilter <ImageType , ImageType , NoiseFilterType , otb::Transform ::FORWARD > NAPCAFilterType; NAPCAFilterType::Pointer napcafilter = NAPCAFilterType::New (); We then set the number of principal components required as output. We can choose to get less PCs than the number of input bands. napcafilter ->SetNumberOfPrincipalComponentsRequired( numberOfPrincipalComponentsRequired);
  • 481. 18.3. Maximum Noise Fraction 447 We set the radius of the sliding window for noise estimation. NoiseFilterType:: RadiusType radius = {{ vradius , vradius }}; napcafilter ->GetNoiseImageFilter()->SetRadius (radius); Last, we can activate normalisation. napcafilter ->SetUseNormalization( normalization ); We now instantiate the writer and set the file name for the output image. WriterType ::Pointer writer = WriterType ::New(); writer ->SetFileName (outputFilename); We finally plug the pipeline and trigger the NA-PCA computation with the method Update() of the writer. napcafilter ->SetInput (reader ->GetOutput ()); writer ->SetInput(napcafilter ->GetOutput ()); writer ->Update(); otb::NAPCAImageFilter allows also to compute inverse transformation from NA-PCA coeffi- cients. In reverse mode, the covariance matrix or the transformation matrix (which may not be square) has to be given. typedef otb::NAPCAImageFilter < ImageType , ImageType , NoiseFilterType , otb::Transform ::INVERSE > InvNAPCAFilterType; InvNAPCAFilterType::Pointer invFilter = InvNAPCAFilterType::New (); invFilter ->SetMeanValues( napcafilter ->GetMeanValues() ); if ( normalization ) invFilter ->SetStdDevValues( napcafilter ->GetStdDevValues() ); invFilter ->SetTransformationMatrix( napcafilter ->GetTransformationMatrix() ); invFilter ->SetInput (napcafilter ->GetOutput ()); WriterType ::Pointer invWriter = WriterType ::New(); invWriter ->SetFileName (outputInverseFilename ); invWriter ->SetInput (invFilter ->GetOutput () ); invWriter ->Update (); Figure 18.2 shows the result of applying forward and reverse NA-PCA transformation to a 8 bands Worldview2 image. 18.3 Maximum Noise Fraction The source code for this example can be found in the file Examples/DimensionReduction/MNFExample.cxx.
  • 482. 448 Chapter 18. Dimension Reduction Figure 18.2: Result of applying the otb::NAPCAImageFilter to an image. From left to right: original image, color composition with first three principal components and output of the inverse mode (the input RGB image). This example illustrates the use of the otb::MNFImageFilter. This filter computes a Maximum Noise Fraction transform [53] using an efficient method based on the inner product in order to compute the covariance matrix. The Maximum Noise Fraction transform is a sequence of two Principal Component Analysis trans- forms. The first transform is based on an estimated covariance matrix of the noise, and intends to whiten the input image (noise with unit variance and no correlation between bands). The second Principal Component Analysis is then applied to the noise-whitened image, giving the Maximum Noise Fraction transform. In this implementation, noise is estimated from a local window. The first step required to use this filter is to include its header file. #include "otbMNFImageFilter.h" We also need to include the header of the noise filter. #include "otbLocalActivityVectorImageFilter.h" We start by defining the types for the images, the reader, and the writer. We choose to work with a otb::VectorImage, since we will produce a multi-channel image (the principal components) from a multi-channel input image. typedef otb::VectorImage <PixelType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <ImageType > WriterType ; We instantiate now the image reader and we set the image file name. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFileName);
  • 483. 18.3. Maximum Noise Fraction 449 In contrast with standard Principal Component Analysis, MNF needs an estimation of the noise correlation matrix in the dataset prior to transformation. A classical approach is to use spatial gradient images and infer the noise correlation matrix from it. The method of noise estimation can be customized by templating the otb::MNFImageFilter with the desired noise estimation method. In this implementation, noise is estimated from a local window. We define the type of the noise filter. typedef otb::LocalActivityVectorImageFilter<ImageType ,ImageType > NoiseFilterType; We define the type for the filter. It is templated over the input and the output image types and also the transformation direction. The internal structure of this filter is a filter-to-filter like structure. We can now the instantiate the filter. typedef otb::MNFImageFilter <ImageType , ImageType , NoiseFilterType , otb::Transform ::FORWARD> MNFFilterType; MNFFilterType::Pointer MNFfilter = MNFFilterType::New (); We then set the number of principal components required as output. We can choose to get less PCs than the number of input bands. MNFfilter ->SetNumberOfPrincipalComponentsRequired( numberOfPrincipalComponentsRequired); We set the radius of the sliding window for noise estimation. NoiseFilterType:: RadiusType radius = {{ vradius , vradius }}; MNFfilter ->GetNoiseImageFilter()->SetRadius (radius); Last, we can activate normalisation. MNFfilter ->SetUseNormalization( normalization ); We now instantiate the writer and set the file name for the output image. WriterType ::Pointer writer = WriterType ::New(); writer ->SetFileName (outputFilename); We finally plug the pipeline and trigger the MNF computation with the method Update() of the writer. MNFfilter ->SetInput (reader ->GetOutput ()); writer ->SetInput(MNFfilter ->GetOutput ()); writer ->Update();
Figure 18.3: Result of applying the otb::MNFImageFilter to an image. From left to right: original image, color composition with the first three principal components, and output of the inverse mode (the input RGB image).

otb::MNFImageFilter also allows computing the inverse transformation from the MNF coefficients. In reverse mode, the covariance matrix or the transformation matrix (which may not be square) has to be given.

typedef otb::MNFImageFilter<ImageType, ImageType, NoiseFilterType, otb::Transform::INVERSE> InvMNFFilterType;
InvMNFFilterType::Pointer invFilter = InvMNFFilterType::New();
invFilter->SetMeanValues(MNFfilter->GetMeanValues());
if (normalization)
  invFilter->SetStdDevValues(MNFfilter->GetStdDevValues());
invFilter->SetTransformationMatrix(MNFfilter->GetTransformationMatrix());
invFilter->SetInput(MNFfilter->GetOutput());

WriterType::Pointer invWriter = WriterType::New();
invWriter->SetFileName(outputInverseFilename);
invWriter->SetInput(invFilter->GetOutput());
invWriter->Update();

Figure 18.3 shows the result of applying the forward and reverse MNF transformation to an 8-band WorldView-2 image.

18.4 Fast Independent Component Analysis

The source code for this example can be found in the file Examples/DimensionReduction/ICAExample.cxx.

This example illustrates the use of the otb::FastICAImageFilter. This filter computes a Fast
  • 485. 18.4. Fast Independant Component Analysis 451 Independent Components Analysis transform. Like Principal Components Analysis, Independent Component Analysis [77] computes a set of or- thogonal linear combinations, but the criterion of Fast ICA is different: instead of maximizing vari- ance, it tries to maximize statistical independence between components. In the Fast ICA algorithm [67], statistical independence is measured by evaluating non-Gaussianity of the components, and the maximization is done in an iterative way. The first step required to use this filter is to include its header file. #include "otbFastICAImageFilter.h" We start by defining the types for the images, the reader, and the writer. We choose to work with a otb::VectorImage, since we will produce a multi-channel image (the independent components) from a multi-channel input image. typedef otb::VectorImage <PixelType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <ImageType > WriterType ; We instantiate now the image reader and we set the image file name. ReaderType ::Pointer reader = ReaderType ::New(); reader ->SetFileName (inputFileName); We define the type for the filter. It is templated over the input and the output image types and also the transformation direction. The internal structure of this filter is a filter-to-filter like structure. We can now the instantiate the filter. typedef otb::FastICAImageFilter <ImageType , ImageType , otb::Transform ::FORWARD> FastICAFilterType; FastICAFilterType::Pointer FastICAfilter = FastICAFilterType::New (); We then set the number of independent components required as output. We can choose to get less ICs than the number of input bands. FastICAfilter ->SetNumberOfPrincipalComponentsRequired( numberOfPrincipalComponentsRequired); We set the number of iterations of the ICA algorithm. FastICAfilter ->SetNumberOfIterations(numIterations); We also set the µ parameter. FastICAfilter ->SetMu( mu ); We now instantiate the writer and set the file name for the output image.
WriterType::Pointer writer = WriterType::New();
writer->SetFileName(outputFilename);

We finally plug the pipeline and trigger the ICA computation with the Update() method of the writer.

FastICAfilter->SetInput(reader->GetOutput());
writer->SetInput(FastICAfilter->GetOutput());
writer->Update();

otb::FastICAImageFilter also allows computing the inverse transformation from the ICA coefficients. In reverse mode, the covariance matrix or the transformation matrix (which may not be square) has to be given.

typedef otb::FastICAImageFilter<ImageType, ImageType, otb::Transform::INVERSE> InvFastICAFilterType;
InvFastICAFilterType::Pointer invFilter = InvFastICAFilterType::New();
invFilter->SetMeanValues(FastICAfilter->GetMeanValues());
invFilter->SetStdDevValues(FastICAfilter->GetStdDevValues());
invFilter->SetTransformationMatrix(FastICAfilter->GetTransformationMatrix());
invFilter->SetPCATransformationMatrix(FastICAfilter->GetPCATransformationMatrix());
invFilter->SetInput(FastICAfilter->GetOutput());

WriterType::Pointer invWriter = WriterType::New();
invWriter->SetFileName(outputInverseFilename);
invWriter->SetInput(invFilter->GetOutput());
invWriter->Update();

Figure 18.4 shows the result of applying the forward and reverse FastICA transformation to an 8-band WorldView-2 image.

18.5 Maximum Autocorrelation Factor

The source code for this example can be found in the file Examples/DimensionReduction/MaximumAutocorrelationFactor.cxx.

This example illustrates the class otb::MaximumAutocorrelationFactorImageFilter, which performs a Maximum Autocorrelation Factor transform [103]. Like PCA, MAF tries to find a set of orthogonal linear transforms, but the criterion to maximize is the spatial auto-correlation rather than the variance.

Auto-correlation is the correlation between a component and a unit-shifted version of that component.
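To make the criterion concrete, here is a sketch of one standard way of writing it, based on the usual MAF formulation in the literature rather than taken from this guide; the symbols below are introduced only for illustration. For a linear combination $Y = \mathbf{a}^T X(\mathbf{x})$ of the image bands and a unit shift $\Delta$, and assuming spatial stationarity,

\[
\rho(\mathbf{a}) \;=\; \mathrm{Corr}\!\left(\mathbf{a}^T X(\mathbf{x}),\; \mathbf{a}^T X(\mathbf{x}+\Delta)\right)
\;=\; 1 \;-\; \frac{\mathbf{a}^T \Sigma_{\Delta}\,\mathbf{a}}{2\,\mathbf{a}^T \Sigma\,\mathbf{a}},
\]

where $\Sigma$ is the covariance matrix of the image and $\Sigma_{\Delta}$ the covariance matrix of the difference $X(\mathbf{x}) - X(\mathbf{x}+\Delta)$. Maximizing $\rho$ therefore amounts to minimizing the generalized Rayleigh quotient $\mathbf{a}^T \Sigma_{\Delta}\mathbf{a} / \mathbf{a}^T \Sigma\,\mathbf{a}$, which leads to a generalized eigenvalue problem.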
  • 487. 18.5. Maximum Autocorrelation Factor 453 Figure 18.4: Result of applying the otb::FastICAImageFilter to an image. From left to right: original image, color composition with first three principal components and output of the inverse mode (the input RGB image). Please note that the inverse transform is not implemented yet. We start by including the corresponding header file. #include "otbMaximumAutocorrelationFactorImageFilter.h" We then define the types for the input image and the output image. typedef otb::VectorImage <unsigned short, 2> InputImageType; typedef otb::VectorImage <double, 2> OutputImageType; We can now declare the types for the reader. Since the images can be very large, we will force the pipeline to use streaming. For this purpose, the file writer will be streamed. This is achieved by using the otb::ImageFileWriter class. typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; The otb::MultivariateAlterationDetectorImageFilter is templated over the type of the input images and the type of the generated change image. typedef otb::MaximumAutocorrelationFactorImageFilter<InputImageType , OutputImageType > FilterType ; The different elements of the pipeline can now be instantiated. ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer = WriterType ::New (); FilterType ::Pointer filter = FilterType ::New (); We set the parameters of the different elements of the pipeline.
  • 488. 454 Chapter 18. Dimension Reduction Figure 18.5: Results of the Maximum Autocorrelation Factor algorithm applied to a 8 bands Worldview2 image (3 first components). reader ->SetFileName (infname ); writer ->SetFileName (outfname ); We build the pipeline by plugging all the elements together. filter ->SetInput(reader ->GetOutput ()); writer ->SetInput(filter ->GetOutput ()); And then we can trigger the pipeline update, as usual. writer ->Update(); Figure 18.5 shows the results of Maximum Autocorrelation Factor applied to an 8 bands Worldview2 image.
CHAPTER NINETEEN

CLASSIFICATION

19.1 Introduction

Image classification consists in extracting added-value information from images. Such processing methods classify the pixels of an image into geographically connected zones with similar properties, identified by a common class label. The classification can be either unsupervised or supervised.

Unsupervised classification does not require any additional information about the properties of the input image in order to classify it. On the contrary, supervised methods need a preliminary learning step, computed over training datasets with properties similar to the image to classify, in order to build a classification model.

19.2 Unsupervised classification

19.2.1 K-Means Classification

Simple version

The source code for this example can be found in the file Examples/Classification/ScalarImageKmeansClassifier.cxx.

This example shows how to use the K-Means model for classifying the pixels of a scalar image.

The itk::Statistics::ScalarImageKmeansImageFilter takes a scalar image and applies the K-Means algorithm in order to define classes that represent statistical distributions of intensity values in the pixels. The classes are then used in this filter for generating a labeled image where every pixel is assigned to one of the classes.

#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
  • 490. 456 Chapter 19. Classification #include "itkScalarImageKmeansImageFilter.h" First we define the pixel type and dimension of the image that we intend to classify. With this image type we can also declare the otb::ImageFileReader needed for reading the input image, create one and set its input filename. typedef signed short PixelType ; const unsigned int Dimension = 2; typedef otb::Image <PixelType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputImageFileName); With the ImageType we instantiate the type of the itk::ScalarImageKmeansImageFilter that will compute the K-Means model and then classify the image pixels. typedef itk::ScalarImageKmeansImageFilter<ImageType > KMeansFilterType; KMeansFilterType::Pointer kmeansFilter = KMeansFilterType::New (); kmeansFilter ->SetInput(reader ->GetOutput ()); const unsigned int numberOfInitialClasses = atoi(argv[4]); In general the classification will produce as output an image whose pixel values are integers as- sociated to the labels of the classes. Since typically these integers will be generated in order (0, 1, 2, ...N), the output image will tend to look very dark when displayed with naive viewers. It is therefore convenient to have the option of spreading the label values over the dynamic range of the output image pixel type. When this is done, the dynamic range of the pixels is divided by the number of classes in order to define the increment between labels. For example, an output image of 8 bits will have a dynamic range of [0:255], and when it is used for holding four classes, the non-contiguous labels will be (0, 64, 128, 192). The selection of the mode to use is done with the method SetUseContiguousLabels(). const unsigned int useNonContiguousLabels = atoi(argv[3]); kmeansFilter ->SetUseNonContiguousLabels(useNonContiguousLabels); For each one of the classes we must provide a tentative initial value for the mean of the class. Given that this is a scalar image, each one of the means is simply a scalar value. Note however that in a general case of K-Means, the input image would be a vector image and therefore the means will be vectors of the same dimension as the image pixels. for (unsigned k = 0; k < numberOfInitialClasses; ++k) { const double userProvidedInitialMean = atof(argv[k + argoffset ]); kmeansFilter ->AddClassWithInitialMean(userProvidedInitialMean); }
  • 491. 19.2. Unsupervised classification 457 The itk::ScalarImageKmeansImageFilter is predefined for producing an 8 bits scalar image as output. This output image contains labels associated to each one of the classes in the K-Means algorithm. In the following lines we use the OutputImageType in order to instantiate the type of a otb::ImageFileWriter. Then create one, and connect it to the output of the classification filter. typedef KMeansFilterType::OutputImageType OutputImageType; typedef otb::ImageFileWriter <OutputImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(kmeansFilter ->GetOutput ()); writer ->SetFileName (outputImageFileName); We are now ready for triggering the execution of the pipeline. This is done by simply invoking the Update() method in the writer. This call will propagate the update request to the reader and then to the classifier. try { writer ->Update (); } catch (itk:: ExceptionObject& excp) { std::cerr << "Problem encountered while writing "; std::cerr << " image file : " << argv[2] << std::endl; std::cerr << excp << std::endl; return EXIT_FAILURE; } At this point the classification is done, the labeled image is saved in a file, and we can take a look at the means that were found as a result of the model estimation performed inside the classifier filter. KMeansFilterType:: ParametersType estimatedMeans = kmeansFilter ->GetFinalMeans(); const unsigned int numberOfClasses = estimatedMeans.Size(); for (unsigned int i = 0; i < numberOfClasses; ++i) { std::cout << "cluster[" << i << "] "; std::cout << " estimated mean : " << estimatedMeans[i] << std::endl; } Figure 19.1 illustrates the effect of this filter with three classes. The means can be estimated by ScalarImageKmeansModelEstimator.cxx. The source code for this example can be found in the file Examples/Classification/ScalarImageKmeansModelEstimator.cxx. This example shows how to compute the KMeans model of an Scalar Image.
  • 492. 458 Chapter 19. Classification Figure 19.1: Effect of the KMeans classifier. Left: original image. Right: image of classes. The itk::Statistics::KdTreeBasedKmeansEstimator is used for taking a scalar image and applying the K-Means algorithm in order to define classes that represents statistical distributions of intensity values in the pixels. One of the drawbacks of this technique is that the spatial distribution of the pixels is not considered at all. It is common therefore to combine the classification resulting from K-Means with other segmentation techniques that will use the classification as a prior and add spatial information to it in order to produce a better segmentation. // Create a List from the scalar image typedef itk::Statistics ::ScalarImageToListAdaptor<ImageType > AdaptorType ; AdaptorType ::Pointer adaptor = AdaptorType ::New(); adaptor ->SetImage(reader ->GetOutput ()); // Define the Measurement vector type from the AdaptorType typedef AdaptorType ::MeasurementVectorType MeasurementVectorType; // Create the K-d tree structure typedef itk::Statistics :: WeightedCentroidKdTreeGenerator< AdaptorType > TreeGeneratorType; TreeGeneratorType::Pointer treeGenerator = TreeGeneratorType::New(); treeGenerator ->SetSample (adaptor ); treeGenerator ->SetBucketSize(16); treeGenerator ->Update(); typedef TreeGeneratorType::KdTreeType TreeType; typedef itk::Statistics ::KdTreeBasedKmeansEstimator<TreeType > EstimatorType; EstimatorType::Pointer estimator = EstimatorType::New(); const unsigned int numberOfClasses = 4;
EstimatorType::ParametersType initialMeans(numberOfClasses);
initialMeans[0] = 25.0;
initialMeans[1] = 125.0;
initialMeans[2] = 250.0;

estimator->SetParameters(initialMeans);
estimator->SetKdTree(treeGenerator->GetOutput());
estimator->SetMaximumIteration(200);
estimator->SetCentroidPositionChangesThreshold(0.0);
estimator->StartOptimization();

EstimatorType::ParametersType estimatedMeans = estimator->GetParameters();

for (unsigned int i = 0; i < numberOfClasses; ++i)
{
  std::cout << "cluster[" << i << "] " << std::endl;
  std::cout << "    estimated mean : " << estimatedMeans[i] << std::endl;
}

General approach

The source code for this example can be found in the file Examples/Classification/KMeansImageClassificationExample.cxx.

The K-Means classification proposed by ITK for images is limited to scalar images and is not streamed. In this example, we show how the use of the otb::KMeansImageClassificationFilter allows for a simple implementation of a K-Means classification application. We will start by including the appropriate header file.

#include "otbKMeansImageClassificationFilter.h"

We will assume double precision input images and will also define the type for the labeled pixels.

const unsigned int Dimension = 2;
typedef double PixelType;
typedef unsigned short LabeledPixelType;

Our classifier will be generic enough to be able to process images with any number of bands. We read the images as otb::VectorImage. The labeled image will be a scalar image.

typedef otb::VectorImage<PixelType, Dimension> ImageType;
typedef otb::Image<LabeledPixelType, Dimension> LabeledImageType;

We can now define the type for the classifier filter, which is templated over its input and output image types.

typedef otb::KMeansImageClassificationFilter<ImageType, LabeledImageType> ClassificationFilterType;
typedef ClassificationFilterType::KMeansParametersType KMeansParametersType;
  • 494. 460 Chapter 19. Classification And finally, we define the reader and the writer. Since the images to classify can be very big, we will use a streamed writer which will trigger the streaming ability of the classifier. typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileWriter <LabeledImageType > WriterType ; We instantiate the classifier and the reader objects and we set their parameters. Please note the call of the GenerateOutputInformation() method on the reader in order to have available the information about the input image (size, number of bands, etc.) without needing to actually read the image. ClassificationFilterType::Pointer filter = ClassificationFilterType::New(); ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (infname ); reader ->GenerateOutputInformation(); The classifier needs as input the centroids of the classes. We declare the parameter vector, and we read the centroids from the arguments of the program. const unsigned int sampleSize = ClassificationFilterType::MaxSampleDimension; const unsigned int parameterSize = nbClasses * sampleSize ; KMeansParametersType parameters ; parameters .SetSize(parameterSize); parameters .Fill(0); for (unsigned int i = 0; i < nbClasses ; ++i) { for (unsigned int j = 0; j < reader ->GetOutput ()->GetNumberOfComponentsPerPixel(); ++j) { parameters [i * sampleSize + j] = atof(argv[4 + i * reader ->GetOutput ()->GetNumberOfComponentsPerPixel() + j]); } } std::cout << "Parameters : " << parameters << std::endl; We set the parameters for the classifier, we plug the pipeline and trigger its execution by updating the output of the writer. filter ->SetCentroids(parameters ); filter ->SetInput(reader ->GetOutput ()); WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(filter ->GetOutput ()); writer ->SetFileName (outfname ); writer ->Update();
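For illustration only (not part of KMeansImageClassificationExample.cxx), the flattened centroid layout described above can also be filled directly in code rather than parsed from argv. The band count and centroid values below are made-up placeholders; sampleSize is the ClassificationFilterType::MaxSampleDimension constant used in the example.

// Hypothetical direct fill of the centroid vector for 3 classes on a 4-band image
// (i.e. nbClasses == 3 in the example above).
const unsigned int nbBands = 4;                                  // assumed number of bands
const double centroids[3][4] = { {110.0, 105.0, 95.0, 180.0},    // class 0 (illustrative values)
                                 { 60.0,  58.0, 50.0,  90.0},    // class 1
                                 { 25.0,  24.0, 20.0,  35.0} };  // class 2
for (unsigned int c = 0; c < 3; ++c)
  for (unsigned int b = 0; b < nbBands; ++b)
    parameters[c * sampleSize + b] = centroids[c][b];  // class-major layout, zero-padded up to sampleSize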
  • 495. 19.2. Unsupervised classification 461 k-d Tree Based k-Means Clustering The source code for this example can be found in the file Examples/Classification/KdTreeBasedKMeansClustering.cxx. K-means clustering is a popular clustering algorithm because it is simple and usually converges to a reasonable solution. The k-means algorithm works as follows: 1. Obtains the initial k means input from the user. 2. Assigns each measurement vector in a sample container to its closest mean among the k number of means (i.e., update the membership of each measurement vectors to the nearest of the k clusters). 3. Calculates each cluster’s mean from the newly assigned measurement vectors (updates the centroid (mean) of k clusters). 4. Repeats step 2 and step 3 until it meets the termination criteria. The most common termination criteria is that if there is no measurement vector that changes its cluster membership from the previous iteration, then the algorithm stops. The itk::Statistics::KdTreeBasedKmeansEstimator is a variation of this logic. The k-means clustering algorithm is computationally very expensive because it has to recalculate the mean at each iteration. To update the mean values, we have to calculate the distance between k means and each and every measurement vector. To reduce the computational burden, the KdTreeBasedKmeansEs- timator uses a special data structure: the k-d tree ( itk::Statistics::KdTree) with additional information. The additional information includes the number and the vector sum of measurement vectors under each node under the tree architecture. With such additional information and the k-d tree data structure, we can reduce the computational cost of the distance calculation and means. Instead of calculating each measurement vectors and k means, we can simply compare each node of the k-d tree and the k means. This idea of utilizing a k-d tree can be found in multiple articles [6] [107] [78]. Our implementation of this scheme follows the article by the Kanungo et al [78]. We use the itk::Statistics::ListSample as the input sample, the itk::Vector as the mea- surement vector. The following code snippet includes their header files. #include "itkVector .h" #include "itkListSample.h" Since this k-means algorithm requires a itk::Statistics::KdTree object as an in- put, we include the KdTree class header file. As mentioned above, we need a k-d tree with the vector sum and the number of measurement vectors. There- fore we use the itk::Statistics::WeightedCentroidKdTreeGenerator instead of the itk::Statistics::KdTreeGenerator that generate a k-d tree without such additional informa- tion.
  • 496. 462 Chapter 19. Classification #include "itkKdTree .h" #include "itkWeightedCentroidKdTreeGenerator.h" The KdTreeBasedKmeansEstimator class is the implementation of the k-means algorithm. It does not create k clusters. Instead, it returns the mean estimates for the k clusters. #include "itkKdTreeBasedKmeansEstimator.h" To generate the clusters, we must create k instances of itk::Statistics::EuclideanDistance function as the membership functions for each cluster and plug that—along with a sample—into an itk::Statistics::SampleClassifier object to get a itk::Statistics::MembershipSample that stores pairs of measurement vectors and their associated class labels (k labels). #include "itkMinimumDecisionRule.h" #include "itkEuclideanDistance.h" #include "itkSampleClassifier.h" We will fill the sample with random variables from two normal distribution using the itk::Statistics::NormalVariateGenerator. #include "itkNormalVariateGenerator.h" Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector type as one component vector. We then, create a ListSample object for data inputs. Each measure- ment vector is of length 1. We set this using the SetMeasurementVectorSize() method. typedef itk::Vector <double, 1> MeasurementVectorType; typedef itk::Statistics ::ListSample <MeasurementVectorType > SampleType ; SampleType ::Pointer sample = SampleType ::New (); sample ->SetMeasurementVectorSize(1); The following code snippet creates a NormalVariateGenerator object. Since the random variable generator returns values according to the standard normal distribution (The mean is zero, and the standard deviation is one), before pushing random values into the sample, we change the mean and standard deviation. We want two normal (Gaussian) distribution data. We have two for loops. Each for loop uses different mean and standard deviation. Before we fill the sample with the sec- ond distribution data, we call Initialize(random seed) method, to recreate the pool of random variables in the normalGenerator. To see the probability density plots from the two distribution, refer to the Figure 19.2. typedef itk::Statistics :: NormalVariateGenerator NormalGeneratorType; NormalGeneratorType::Pointer normalGenerator = NormalGeneratorType::New (); normalGenerator ->Initialize (101); MeasurementVectorType mv;
  • 497. 19.2. Unsupervised classification 463 Figure 19.2: Two normal distributions’ probability density plot (The means are 100 and 200, and the standard deviation is 30 ) double mean = 100; double standardDeviation = 30; for (unsigned int i = 0; i < 100; ++i) { mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean; sample ->PushBack(mv); } normalGenerator ->Initialize (3024); mean = 200; standardDeviation = 30; for (unsigned int i = 0; i < 100; ++i) { mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean; sample ->PushBack(mv); } We create a k-d tree. typedef itk::Statistics :: WeightedCentroidKdTreeGenerator<SampleType > TreeGeneratorType; TreeGeneratorType::Pointer treeGenerator = TreeGeneratorType::New (); treeGenerator ->SetSample (sample); treeGenerator ->SetBucketSize(16);
  • 498. 464 Chapter 19. Classification treeGenerator ->Update(); Once we have the k-d tree, it is a simple procedure to produce k mean estimates. We create the KdTreeBasedKmeansEstimator. Then, we provide the initial mean values using the SetParameters(). Since we are dealing with two normal distribution in a 1-D space, the size of the mean value array is two. The first element is the first mean value, and the second is the second mean value. If we used two normal distributions in a 2-D space, the size of array would be four, and the first two elements would be the two components of the first normal distribution’s mean vector. We plug-in the k-d tree using the SetKdTree(). The remaining two methods specify the termination condition. The estimation process stops when the number of iterations reaches the maximum iteration value set by the SetMaximumIteration(), or the distances between the newly calculated mean (centroid) values and previous ones are within the threshold set by the SetCentroidPositionChangesThreshold(). The final step is to call the StartOptimization() method. The for loop will print out the mean estimates from the estimation process. typedef TreeGeneratorType::KdTreeType TreeType; typedef itk::Statistics ::KdTreeBasedKmeansEstimator<TreeType > EstimatorType; EstimatorType::Pointer estimator = EstimatorType::New(); EstimatorType::ParametersType initialMeans(2); initialMeans[0] = 0.0; initialMeans[1] = 0.0; estimator ->SetParameters(initialMeans); estimator ->SetKdTree (treeGenerator ->GetOutput ()); estimator ->SetMaximumIteration(200); estimator ->SetCentroidPositionChangesThreshold(0.0); estimator ->StartOptimization(); EstimatorType::ParametersType estimatedMeans = estimator ->GetParameters(); for (unsigned int i = 0; i < 2; ++i) { std::cout << "cluster[" << i << "] " << std::endl; std::cout << " estimated mean : " << estimatedMeans[i] << std::endl; } If we are only interested in finding the mean estimates, we might stop. However, to illustrate how a classifier can be formed using the statistical classification framework. We go a little bit further in this example. Since the k-means algorithm is an minimum distance classifier using the estimated k means and the measurement vectors. We use the EuclideanDistance class as membership functions. Our choice for the decision rule is the itk::Statistics::MinimumDecisionRule that returns the index of the membership functions that have the smallest value for a measurement vector. After creating a SampleClassifier object and a MinimumDecisionRule object, we plug-in the
  • 499. 19.2. Unsupervised classification 465 decisionRule and the sample to the classifier. Then, we must specify the number of classes that will be considered using the SetNumberOfClasses() method. The remainder of the following code snippet shows how to use user-specified class labels. The classification result will be stored in a MembershipSample object, and for each measurement vector, its class label will be one of the two class labels, 100 and 200 (unsigned int). typedef itk::Statistics :: EuclideanDistance <MeasurementVectorType > MembershipFunctionType; typedef itk::MinimumDecisionRule DecisionRuleType; DecisionRuleType::Pointer decisionRule = DecisionRuleType::New (); typedef itk::Statistics ::SampleClassifier <SampleType > ClassifierType; ClassifierType::Pointer classifier = ClassifierType::New (); classifier ->SetDecisionRule((itk::DecisionRuleBase::Pointer) decisionRule); classifier ->SetSample (sample); classifier ->SetNumberOfClasses(2); std::vector <unsigned int> classLabels ; classLabels .resize (2); classLabels [0] = 100; classLabels [1] = 200; classifier ->SetMembershipFunctionClassLabels(classLabels ); The classifier is almost ready to do the classification process except that it needs two membership functions that represents two clusters respectively. In this example, the two clusters are modeled by two Euclidean distance functions. The distance function (model) has only one parameter, its mean (centroid) set by the SetOrigin() method. To plug-in two distance functions, we call the AddMembershipFunction() method. Then invocation of the Update() method will perform the classification. std::vector <MembershipFunctionType::Pointer> membershipFunctions; MembershipFunctionType:: OriginType origin( sample ->GetMeasurementVectorSize()); int index = 0; for (unsigned int i = 0; i < 2; ++i) { membershipFunctions.push_back (MembershipFunctionType::New ()); for (unsigned int j = 0; j < sample ->GetMeasurementVectorSize(); ++j) { origin[j] = estimatedMeans[index++]; } membershipFunctions[i]->SetOrigin (origin); classifier ->AddMembershipFunction(membershipFunctions[i].GetPointer ()); } classifier ->Update (); The following code snippet prints out the measurement vectors and their class labels in the sample.
ClassifierType::OutputType* membershipSample = classifier->GetOutput();
ClassifierType::OutputType::ConstIterator iter = membershipSample->Begin();
while (iter != membershipSample->End())
{
  std::cout << "measurement vector = " << iter.GetMeasurementVector()
            << "class label = " << iter.GetClassLabel() << std::endl;
  ++iter;
}

19.2.2 Kohonen's Self Organizing Map

The Self Organizing Map (SOM), introduced by Kohonen, is an unsupervised neural learning algorithm. The map is composed of neighboring cells which are in competition by means of mutual interactions, and which adapt in order to match the characteristic patterns of the examples given during the learning. The SOM is usually defined on a plane (2D). The algorithm implements a nonlinear projection from a high-dimensional feature space to a lower-dimensional space, usually 2D. It is able to find the correspondence between a set of structured data and a network of much lower dimension while keeping the topological relationships existing in the feature space. Thanks to this topological organization, the final map presents clusters and their relationships.

Figure 19.3: Kohonen's Self Organizing Map.

Kohonen's SOM is usually represented as an array of cells where each cell $i$ is associated with a feature (or weight) vector $m_i = [m_{i1}, m_{i2}, \ldots, m_{in}]^T \in \mathbb{R}^n$ (figure 19.3).

A cell (or neuron) in the map is a good detector for a given input vector $x = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^n$ if the latter is close to the former. This distance between vectors can be represented by the scalar
product $x^T \cdot m_i$, but in most cases other distances can be used, as for instance the Euclidean one. The cell having the weight vector closest to the input vector is called the winner.

The goal of the learning step is to get a map which is representative of an input example set. It is an iterative procedure which consists in passing each input example to the map, testing the response of each neuron and modifying the map to get it closer to the examples.

Algorithm 1 SOM learning:
1. $t = 0$.
2. Initialize the weight vectors of the map (randomly, for instance).
3. While $t <$ number of iterations, do:
   (a) $k = 0$.
   (b) While $k <$ number of examples, do:
       i. Find the vector $m_i(t)$ which minimizes the distance $d(x_k, m_i(t))$.
       ii. For a neighborhood $N_c(t)$ around the winner cell, apply the transformation:
           $m_i(t+1) = m_i(t) + \beta(t)\,[x_k(t) - m_i(t)]$   (19.1)
       iii. $k = k + 1$.
   (c) $t = t + 1$.

In 19.1, $\beta(t)$ is a function which decreases with the geometrical distance to the winner cell. For instance:
$\beta(t) = \beta_0(t)\, e^{-\frac{\|r_i - r_c\|^2}{\sigma^2(t)}},$   (19.2)
with $\beta_0(t)$ and $\sigma(t)$ decreasing functions of time and $r$ the cell coordinates in the output (map) space.

Therefore the algorithm consists in getting the map closer to the learning set. The use of a neighborhood around the winner cell allows the organization of the map into areas which specialize in the recognition of different patterns. This neighborhood also ensures that cells which are topologically close are also close in terms of the distance defined in the feature space.

Building a color table

The source code for this example can be found in the file Examples/Learning/SOMExample.cxx.

This example illustrates the use of the otb::SOM class for building Kohonen's Self Organizing Maps.
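Before going through the OTB code, the update step of Algorithm 1 can be sketched in a few lines. The following standalone snippet is not OTB code: the map size, the example values, $\beta_0$ and $\sigma$ are made up, and their decrease with time is omitted. It simply applies equations 19.1 and 19.2 to a small 2-D map of scalar weights.

// Sketch (not OTB code): one learning pass of Algorithm 1 on a 4x4 map of
// scalar weights, using the updates of equations 19.1 and 19.2.
#include <cmath>
#include <iostream>
#include <vector>

int main()
{
  const unsigned int W = 4;
  const unsigned int H = 4;
  std::vector<double> m(W * H);                         // weight vectors m_i (here 1-D)
  for (unsigned int i = 0; i < m.size(); ++i) m[i] = 0.1 * i; // crude initialization

  const double examples[3] = {0.2, 1.1, 0.7};           // made-up input examples
  const double beta0 = 0.5;
  const double sigma = 1.5;

  for (unsigned int k = 0; k < 3; ++k)
  {
    const double x = examples[k];

    // Step (b)i: find the winner cell c, i.e. the weight closest to x.
    unsigned int c = 0;
    for (unsigned int i = 1; i < m.size(); ++i)
      if (std::fabs(x - m[i]) < std::fabs(x - m[c])) c = i;

    // Step (b)ii: update every cell with a rate decreasing with the map
    // distance to the winner (equation 19.2), then apply equation 19.1.
    for (unsigned int i = 0; i < m.size(); ++i)
    {
      const double dx = double(i % W) - double(c % W);
      const double dy = double(i / W) - double(c / W);
      const double beta = beta0 * std::exp(-(dx * dx + dy * dy) / (sigma * sigma));
      m[i] += beta * (x - m[i]);
    }
  }
  std::cout << "m[0] after one pass: " << m[0] << std::endl;
  return 0;
}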
  • 502. 468 Chapter 19. Classification We will use the SOM in order to build a color table from an input image. Our input image is coded with 3 × 8 bits and we would like to code it with only 16 levels. We will use the SOM in order to learn which are the 16 most representative RGB values of the input image and we will assume that this is the optimal color table for the image. The first thing to do is include the header file for the class. We will also need the header files for the map itself and the activation map builder whose utility will be explained at the end of the example. #include "otbSOMMap .h" #include "otbSOM.h" #include "otbSOMActivationBuilder.h" Since the otb::SOM class uses a distance, we will need to include the header file for the one we want to use #include "itkEuclideanDistance.h" The Self Organizing Map itself is actually an N-dimensional image where each pixel contains a neuron. In our case, we decide to build a 2-dimensional SOM, where the neurons store RGB values with floating point precision. const unsigned int Dimension = 2; typedef double PixelType ; typedef otb::VectorImage <PixelType , Dimension > ImageType ; typedef ImageType ::PixelType VectorType ; The distance that we want to apply between the RGB values is the Euclidean one. Of course we could choose to use other type of distance, as for instance, a distance defined in any other color space. typedef itk::Statistics ::EuclideanDistance <VectorType > DistanceType; We can now define the type for the map. The otb::SOMMap::class is templated over the neuron type – PixelType here –, the distance type and the number of dimensions. Note that the number of dimensions of the map could be different from the one of the images to be processed. typedef otb::SOMMap <VectorType , DistanceType , Dimension > MapType; We are going to perform the learning directly on the pixels of the input image. Therefore, the image type is defined using the same pixel type as we used for the map. We also define the type for the imge file reader. typedef otb::ImageFileReader <ImageType > ReaderType ; Since the otb::SOM class works on lists of samples, it will need to access the input image through an adaptor. Its type is defined as follows: typedef itk::Statistics ::ListSample <VectorType > SampleListType;
  • 503. 19.2. Unsupervised classification 469 We can now define the type for the SOM, which is templated over the input sample list and the type of the map to be produced and the two functors that hold the training behavior. typedef otb::Functor ::CzihoSOMLearningBehaviorFunctor LearningBehaviorFunctorType; typedef otb::Functor ::CzihoSOMNeighborhoodBehaviorFunctor NeighborhoodBehaviorFunctorType; typedef otb::SOM<SampleListType , MapType , LearningBehaviorFunctorType, NeighborhoodBehaviorFunctorType> SOMType; As an alternative to standart SOMType, one can decide to use an otb::PeriodicSOM, which behaves like otb::SOM but is to be considered to as a torus instead of a simple map. Hence, the neighborhood behavior of the winning neuron does not depend on its location on the map... otb::PeriodicSOM is defined in otbPeriodicSOM.h. We can now start building the pipeline. The first step is to instantiate the reader and pass its output to the adaptor. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFileName); reader ->Update(); SampleListType::Pointer sampleList = SampleListType::New (); sampleList ->SetMeasurementVectorSize(reader ->GetOutput ()->GetVectorLength()); itk::ImageRegionIterator <ImageType > imgIter (reader ->GetOutput (), reader ->GetOutput ()-> GetBufferedRegion()); imgIter.GoToBegin (); itk::ImageRegionIterator <ImageType > imgIterEnd (reader ->GetOutput (), reader ->GetOutput ()-> GetBufferedRegion()); imgIterEnd .GoToEnd (); do { sampleList ->PushBack(imgIter.Get()); ++imgIter; } while (imgIter != imgIterEnd ); We can now instantiate the SOM algorithm and set the sample list as input. SOMType ::Pointer som = SOMType::New (); som->SetListSample(sampleList ); We use a SOMType::SizeType array in order to set the sizes of the map. SOMType ::SizeType size; size[0] = sizeX;
  • 504. 470 Chapter 19. Classification size[1] = sizeY; som->SetMapSize (size); The initial size of the neighborhood of each neuron is set in the same way. SOMType ::SizeType radius; radius[0] = neighInitX ; radius[1] = neighInitY ; som->SetNeighborhoodSizeInit(radius); The other parameters are the number of iterations, the initial and the final values for the learning rate – β – and the maximum initial value for the neurons (the map will be randomly initialized). som->SetNumberOfIterations(nbIterations); som->SetBetaInit (betaInit ); som->SetBetaEnd (betaEnd); som->SetMaxWeight(static_cast<PixelType >(initValue )); Now comes the intialization of the functors. LearningBehaviorFunctorType learningFunctor; learningFunctor.SetIterationThreshold(radius , nbIterations); som->SetBetaFunctor(learningFunctor); NeighborhoodBehaviorFunctorType neighborFunctor; som->SetNeighborhoodSizeFunctor(neighborFunctor); som->Update (); Finally, we set up the las part of the pipeline where the plug the output of the SOM into the writer. The learning procedure is triggered by calling the Update() method on the writer. Since the map is itself an image, we can write it to disk with an otb::ImageFileWriter. Figure 19.4 shows the result of the SOM learning. Since we have performed a learning on RGB pixel values, the produced SOM can be interpreted as an optimal color table for the input image. It can be observed that the obtained colors are topologically organised, so similar colors are also close in the map. This topological organisation can be exploited to further reduce the number of coding levels of the pixels without performing a new learning: we can subsample the map to get a new color table. Also, a bilinear interpolation between the neurons can be used to increase the number of coding levels. We can now compute the activation map for the input image. The activation map tells us how many times a given neuron is activated for the set of examples given to the map. The activation map is stored as a scalar image and an integer pixel type is usually enough. typedef unsigned char OutputPixelType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; typedef otb::ImageFileWriter <OutputImageType > ActivationWriterType;
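Going back to the map writer mentioned a few paragraphs above, which does not appear in the extracted snippet: a minimal sketch of that step, assuming the MapType defined earlier and a hypothetical output file name, could look as follows.

// Sketch of the writer step described above: since the map is itself an image,
// it can be written with otb::ImageFileWriter. The file name is hypothetical.
typedef otb::ImageFileWriter<MapType> MapWriterType;

MapWriterType::Pointer mapWriter = MapWriterType::New();
mapWriter->SetInput(som->GetOutput());
mapWriter->SetFileName("som_map.tif"); // hypothetical output file name
mapWriter->Update();                   // triggers the execution of the pipeline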
Figure 19.4: Result of the SOM learning. Left: RGB image. Center: SOM. Right: Activation map.

In a similar way to the otb::SOM class, the otb::SOMActivationBuilder is templated over the sample list given as input, the SOM map type and the activation map to be built as output.

typedef otb::SOMActivationBuilder<SampleListType, MapType, OutputImageType> SOMActivationBuilderType;

We instantiate the activation map builder and set as input the SOM map built before and the image (using the adaptor).

SOMActivationBuilderType::Pointer somAct = SOMActivationBuilderType::New();
somAct->SetInput(som->GetOutput());
somAct->SetListSample(sampleList);
somAct->Update();

The final step is to write the activation map to a file.

if (actMapFileName != NULL)
{
  ActivationWriterType::Pointer actWriter = ActivationWriterType::New();
  actWriter->SetFileName(actMapFileName);
  actWriter->SetInput(somAct->GetOutput());
  actWriter->Update();
}

The right-hand side of figure 19.4 shows the activation map obtained.

SOM Classification

The source code for this example can be found in the file Examples/Learning/SOMClassifierExample.cxx.

This example illustrates the use of the otb::SOMClassifier class for performing a classification using an existing Kohonen's Self Organizing Map. Actually, the SOM classification consists only in the attribution of the winner neuron index to a given feature vector.
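To make the notion of "winner neuron index" concrete, here is a tiny standalone sketch (not OTB code; the scalar weights and the test value are made up) of the nearest-neuron search that underlies the SOM classification.

// Sketch (not OTB code): SOM classification amounts to returning the index of
// the neuron whose weight vector is closest to the input vector.
#include <cmath>
#include <iostream>

// Return the index of the neuron whose (here scalar) weight is closest to x.
unsigned int WinnerIndex(const double* neurons, unsigned int nbNeurons, double x)
{
  unsigned int winner = 0;
  for (unsigned int i = 1; i < nbNeurons; ++i)
    if (std::fabs(x - neurons[i]) < std::fabs(x - neurons[winner]))
      winner = i;
  return winner; // this index is the label assigned to x
}

int main()
{
  const double neurons[4] = {0.1, 0.4, 0.9, 1.3}; // made-up 1-D neuron weights
  std::cout << "winner for 1.0: " << WinnerIndex(neurons, 4, 1.0) << std::endl;
  return 0;
}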
  • 506. 472 Chapter 19. Classification We will use the SOM created in section 19.2.2 and we will assume that each neuron represents a class in the image. The first thing to do is include the header file for the class. #include "otbSOMClassifier.h" As for the SOM learning step, we must define the types for the otb::SOMMap, and therefore, also for the distance to be used. We will also define the type for the SOM reader, which is actually an otb::ImageFileReader::which the appropiate image type. typedef itk::Statistics ::EuclideanDistance <PixelType > DistanceType; typedef otb::SOMMap <PixelType , DistanceType , Dimension > SOMMapType ; typedef otb::ImageFileReader <SOMMapType > SOMReaderType; The classification will be performed by the otb::SOMClassifier::, which, as most of the clas- sifiers, works on itk::Statistics::ListSamples. In order to be able to perform an image clas- sification, we will need to use the itk::Statistics::ImageToListAdaptor which is templated over the type of image to be adapted. The SOMClassifier is templated over the sample type, the SOMMap type and the pixel type for the labels. typedef itk::Statistics ::ListSample <PixelType > SampleType ; typedef otb::SOMClassifier <SampleType , SOMMapType , LabelPixelType > ClassifierType; The result of the classification will be stored on an image and saved to a file. Therefore, we define the types needed for this step. typedef otb::Image <LabelPixelType , Dimension > OutputImageType; typedef otb::ImageFileWriter <OutputImageType > WriterType ; We can now start reading the input image and the SOM given as inputs to the program. We instantiate the readers as usual. ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (imageFilename); reader ->Update(); SOMReaderType::Pointer somreader = SOMReaderType::New(); somreader ->SetFileName (mapFilename ); somreader ->Update (); The conversion of the input data from image to list sample is easily done using the adaptor. SampleType ::Pointer sample = SampleType ::New (); itk::ImageRegionIterator <InputImageType > it(reader ->GetOutput (), reader ->GetOutput ()-> GetLargestPossibleRegion());
  • 507. 19.2. Unsupervised classification 473 it.GoToBegin (); while (!it.IsAtEnd ()) { sample ->PushBack(it.Get ()); ++it; } The classifier can now be instantiated. The input data is set by using the SetSample() method and the SOM si set using the SetMap() method. The classification is triggered by using the Update() method. ClassifierType::Pointer classifier = ClassifierType::New (); classifier ->SetSample (sample.GetPointer ()); classifier ->SetMap(somreader ->GetOutput ()); classifier ->Update (); Once the classification has been performed, the sample list obtained at the output of the classifier must be converted into an image. We create the image as follows : OutputImageType::Pointer outputImage = OutputImageType::New(); outputImage ->SetRegions (reader ->GetOutput ()->GetLargestPossibleRegion()); outputImage ->Allocate (); We can now get a pointer to the classification result. ClassifierType::OutputType * membershipSample = classifier ->GetOutput (); And we can declare the iterators pointing to the front and the back of the sample list. ClassifierType::OutputType ::ConstIterator m_iter = membershipSample ->Begin(); ClassifierType::OutputType ::ConstIterator m_last = membershipSample ->End(); We also declare an itk::ImageRegionIterator::in order to fill the output image with the class labels. typedef itk::ImageRegionIterator <OutputImageType > OutputIteratorType; OutputIteratorType outIt(outputImage , outputImage ->GetLargestPossibleRegion()); We iterate through the sample list and the output image and assign the label values to the image pixels. outIt.GoToBegin (); while (m_iter != m_last && !outIt.IsAtEnd ()) { outIt.Set(m_iter.GetClassLabel()); ++m_iter; ++outIt; }
Figure 19.5: Result of the SOM learning. Left: RGB image. Center: SOM. Right: Classified Image.

Finally, we write the classified image to a file.

WriterType::Pointer writer = WriterType::New();
writer->SetFileName(outputFilename);
writer->SetInput(outputImage);
writer->Update();

Figure 19.5 shows the result of the SOM classification.

Multi-band, streamed classification

The source code for this example can be found in the file Examples/Classification/SOMImageClassificationExample.cxx.

In previous examples, we have used the otb::SOMClassifier, which uses the ITK classification framework. This is good for compatibility with the ITK framework, but it introduces two limitations: streaming cannot be used, and the number of bands of the image to be classified has to be known at compile time. In OTB we have avoided these limitations by developing the otb::SOMImageClassificationFilter. In this example we will illustrate its use. We start by including the appropriate header file.

#include "otbSOMImageClassificationFilter.h"

We will assume double precision input images and will also define the type for the labeled pixels.

const unsigned int Dimension = 2;
typedef double PixelType;
typedef unsigned short LabeledPixelType;

Our classifier will be generic enough to be able to process images with any number of bands. We read the images as otb::VectorImages. The labeled image will be a scalar image.

typedef otb::VectorImage<PixelType, Dimension> ImageType;
typedef otb::Image<LabeledPixelType, Dimension> LabeledImageType;
  • 509. 19.2. Unsupervised classification 475 We can now define the type for the classifier filter, which is templated over its input and output image types and the SOM type. typedef otb::SOMMap <ImageType ::PixelType > SOMMapType ; typedef otb::SOMImageClassificationFilter<ImageType , LabeledImageType , SOMMapType > ClassificationFilterType; And finally, we define the readers (for the input image and theSOM) and the writer. Since the images, to classify can be very big, we will use a streamed writer which will trigger the streaming ability of the classifier. typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::ImageFileReader <SOMMapType > SOMReaderType; typedef otb::ImageFileWriter <LabeledImageType > WriterType ; We instantiate the classifier and the reader objects and we set the existing SOM obtained in a previ- ous training step. ClassificationFilterType::Pointer filter = ClassificationFilterType::New(); ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (infname ); SOMReaderType::Pointer somreader = SOMReaderType::New(); somreader ->SetFileName (somfname ); somreader ->Update (); filter ->SetMap(somreader ->GetOutput ()); We plug the pipeline and trigger its execution by updating the output of the writer. filter ->SetInput(reader ->GetOutput ()); WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(filter ->GetOutput ()); writer ->SetFileName (outfname ); writer ->Update(); 19.2.3 Bayesian Plug-In Classifier The source code for this example can be found in the file Examples/Classification/BayesianPluginClassifier.cxx. In this example, we present a system that places measurement vectors into two Gaussian classes. The Figure 19.6 shows all the components of the classifier system and the data flow. This system differs with the previous k-means clustering algorithms in several ways. The biggest difference is that this classifier uses the itk::Statistics::GaussianDensityFunctions as membership
functions instead of the itk::Statistics::EuclideanDistance. Since the membership function is different, it requires a different set of parameters: mean vectors and covariance matrices. We choose the itk::Statistics::MeanCalculator (sample mean) and the itk::Statistics::CovarianceCalculator (sample covariance) as the estimation algorithms for these two parameters. If we want more robust estimation algorithms, we can replace them with alternatives without changing the other components in the classifier system.

Figure 19.6: Bayesian plug-in classifier for two Gaussian classes.

It is a bad idea to use the same sample for training (parameter estimation) and for testing. However, for simplicity, in this example we use a single sample for both.

We use the itk::Statistics::ListSample as the sample (test and training). The itk::Vector is our measurement vector class. To store measurement vectors into two separate sample containers, we use the itk::Statistics::Subsample objects.

#include "itkVector.h"
  • 511. 19.2. Unsupervised classification 477 #include "itkListSample.h" #include "itkSubsample.h" The following two files provides us the parameter estimation algorithms. #include "itkMeanCalculator.h" #include "itkCovarianceCalculator.h" The following files define the components required by ITK statistical classification framework: the decision rule, the membership function, and the classifier. #include "itkMaximumRatioDecisionRule.h" #include "itkGaussianDensityFunction.h" #include "itkSampleClassifier.h" We will fill the sample with random variables from two normal distribution using the itk::Statistics::NormalVariateGenerator. #include "itkNormalVariateGenerator.h" Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector type as a one component vector. We then, create a ListSample object for data inputs. We also create two Subsample objects that will store the measurement vectors in sample into two separate sample containers. Each Subsample object stores only the measurement vectors belonging to a single class. This class sample will be used by the parameter estimation algorithms. typedef itk::Vector <double, 1> MeasurementVectorType; typedef itk::Statistics ::ListSample <MeasurementVectorType > SampleType ; SampleType ::Pointer sample = SampleType ::New (); sample ->SetMeasurementVectorSize(1); // length of measurement vectors // in the sample. typedef itk::Statistics ::Subsample <SampleType > ClassSampleType; std::vector <ClassSampleType::Pointer> classSamples; for (unsigned int i = 0; i < 2; ++i) { classSamples.push_back (ClassSampleType::New ()); classSamples[i]->SetSample (sample); } The following code snippet creates a NormalVariateGenerator object. Since the random variable generator returns values according to the standard normal distribution (the mean is zero, and the standard deviation is one) before pushing random values into the sample, we change the mean and standard deviation. We want two normal (Gaussian) distribution data. We have two for loops. Each for loop uses different mean and standard deviation. Before we fill the sample with the second dis- tribution data, we call Initialize(random seed) method, to recreate the pool of random variables
  • 512. 478 Chapter 19. Classification in the normalGenerator. In the second for loop, we fill the two class samples with measurement vectors using the AddInstance() method. To see the probability density plots from the two distributions, refer to Figure 19.2. typedef itk::Statistics :: NormalVariateGenerator NormalGeneratorType; NormalGeneratorType::Pointer normalGenerator = NormalGeneratorType::New (); normalGenerator ->Initialize (101); MeasurementVectorType mv; double mean = 100; double standardDeviation = 30; SampleType :: InstanceIdentifier id = 0UL; for (unsigned int i = 0; i < 100; ++i) { mv.Fill((normalGenerator ->GetVariate () * standardDeviation) + mean); sample ->PushBack (mv); classSamples[0]->AddInstance (id); ++id; } normalGenerator ->Initialize (3024); mean = 200; standardDeviation = 30; for (unsigned int i = 0; i < 100; ++i) { mv.Fill((normalGenerator ->GetVariate () * standardDeviation) + mean); sample ->PushBack (mv); classSamples[1]->AddInstance (id); ++id; } In the following code snippet, notice that the template argument for the MeanCalculator and Covari- anceCalculator is ClassSampleType (i.e., type of Subsample) instead of SampleType (i.e., type of ListSample). This is because the parameter estimation algorithms are applied to the class sample. typedef itk::Statistics ::MeanCalculator <ClassSampleType > MeanEstimatorType; typedef itk::Statistics ::CovarianceCalculator <ClassSampleType > CovarianceEstimatorType; std::vector <MeanEstimatorType::Pointer> meanEstimators; std::vector <CovarianceEstimatorType::Pointer> covarianceEstimators; for (unsigned int i = 0; i < 2; ++i) { meanEstimators.push_back (MeanEstimatorType::New ()); meanEstimators[i]->SetInputSample(classSamples[i]); meanEstimators[i]->Update(); covarianceEstimators.push_back (CovarianceEstimatorType::New ()); covarianceEstimators[i]->SetInputSample(classSamples[i]); covarianceEstimators[i]->SetMean(meanEstimators[i]->GetOutput ()); covarianceEstimators[i]->Update();
}

We print out the estimated parameters.

for (unsigned int i = 0; i < 2; ++i)
{
  std::cout << "class[" << i << "] " << std::endl;
  std::cout << "    estimated mean : "
            << *(meanEstimators[i]->GetOutput())
            << "    covariance matrix : "
            << *(covarianceEstimators[i]->GetOutput()) << std::endl;
}

After creating a SampleClassifier object and a MaximumRatioDecisionRule object, we plug the decisionRule and the sample into the classifier. Then, we specify the number of classes that will be considered using the SetNumberOfClasses() method.

The MaximumRatioDecisionRule requires a vector of a priori probability values. Such a priori probability will be the $P(\omega_i)$ of the following variation of the Bayes decision rule:

Decide $\omega_i$ if $\dfrac{p(\vec{x}\,|\,\omega_i)}{p(\vec{x}\,|\,\omega_j)} > \dfrac{P(\omega_j)}{P(\omega_i)}$ for all $j \neq i$   (19.3)

The remainder of the code snippet shows how to use user-specified class labels. The classification result will be stored in a MembershipSample object, and for each measurement vector, its class label will be one of the two class labels, 100 and 200 (unsigned int).

typedef itk::Statistics::GaussianDensityFunction<MeasurementVectorType> MembershipFunctionType;
typedef itk::MaximumRatioDecisionRule DecisionRuleType;
DecisionRuleType::Pointer decisionRule = DecisionRuleType::New();

DecisionRuleType::APrioriVectorType aPrioris;
aPrioris.push_back(classSamples[0]->GetTotalFrequency() / sample->GetTotalFrequency());
aPrioris.push_back(classSamples[1]->GetTotalFrequency() / sample->GetTotalFrequency());
decisionRule->SetAPriori(aPrioris);

typedef itk::Statistics::SampleClassifier<SampleType> ClassifierType;
ClassifierType::Pointer classifier = ClassifierType::New();

classifier->SetDecisionRule((itk::DecisionRuleBase::Pointer) decisionRule);
classifier->SetSample(sample);
classifier->SetNumberOfClasses(2);

std::vector<unsigned int> classLabels;
classLabels.resize(2);
classLabels[0] = 100;
classLabels[1] = 200;

classifier->SetMembershipFunctionClassLabels(classLabels);
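Note that in this example each class sample holds 100 of the 200 measurement vectors, so the a priori vector computed above is $P(\omega_0) = P(\omega_1) = 100/200 = 0.5$. The right-hand side of equation 19.3 is therefore 1, and the decision rule reduces to picking the class with the larger Gaussian likelihood $p(\vec{x}\,|\,\omega_i)$.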
The classifier is almost ready to perform the classification, except that it needs two membership functions that represent the two clusters. In this example, the two clusters are modeled by two Gaussian density functions whose parameters, the mean and the covariance, are set with the SetMean() and SetCovariance() methods using the values estimated above. To plug in the two membership functions, we call the AddMembershipFunction() method. Then invocation of the Update() method will perform the classification.

std::vector<MembershipFunctionType::Pointer> membershipFunctions;
for (unsigned int i = 0; i < 2; ++i)
{
  membershipFunctions.push_back(MembershipFunctionType::New());
  membershipFunctions[i]->SetMean(meanEstimators[i]->GetOutput());
  membershipFunctions[i]->SetCovariance(covarianceEstimators[i]->GetOutput());
  classifier->AddMembershipFunction(membershipFunctions[i].GetPointer());
}
classifier->Update();

The following code snippet prints out pairs of a measurement vector and its class label in the sample.

ClassifierType::OutputType* membershipSample = classifier->GetOutput();
ClassifierType::OutputType::ConstIterator iter = membershipSample->Begin();
while (iter != membershipSample->End())
{
  std::cout << "measurement vector = " << iter.GetMeasurementVector()
            << "class label = " << iter.GetClassLabel() << std::endl;
  ++iter;
}

19.2.4 Expectation Maximization Mixture Model Estimation

The source code for this example can be found in the file Examples/Classification/ExpectationMaximizationMixtureModelEstimator.cxx.

In this example, we present ITK's implementation of the expectation maximization (EM) process to generate parameter estimates for a two Gaussian component mixture model.

The Bayesian plug-in classifier example (see Section 19.2.3) used two Gaussian probability density functions (PDFs) to model two Gaussian distribution classes (two models for two classes). However, in some cases, we want to model a distribution as a mixture of several different distributions. Therefore, the probability density function $p(x)$ of a mixture model can be stated as follows:

$p(x) = \sum_{i=0}^{c} \alpha_i f_i(x)$   (19.4)
  • 515. 19.2. Unsupervised classification 481 where i is the index of the component, c is the number of components, αi is the proportion of the component, and fi is the probability density function of the component. Now the task is to find the parameters(the component PDF’s parameters and the proportion values) to maximize the likelihood of the parameters. If we know which component a measurement vector belongs to, the solutions to this problem is easy to solve. However, we don’t know the membership of each measurement vector. Therefore, we use the expectation of membership instead of the exact membership. The EM process splits into two steps: 1. E step: calculate the expected membership values for each measurement vector to each classes. 2. M step: find the next parameter sets that maximize the likelihood with the expected member- ship values and the current set of parameters. The E step is basically a step that calculates the a posteriori probability for each measurement vector. The M step is dependent on the type of each PDF. Most of distributions be- longing to exponential family such as Poisson, Binomial, Exponential, and Nor- mal distributions have analytical solutions for updating the parameter set. The itk::Statistics::ExpectationMaximizationMixtureModelEstimator class assumes that such type of components. In the following example we use the itk::Statistics::ListSample as the sample (test and training). The itk::Vector::is our measurement vector class. To store measurement vectors into two separate sample container, we use the itk::Statistics::Subsample objects. #include "itkVector .h" #include "itkListSample.h" The following two files provide us the parameter estimation algorithms. #include "itkGaussianMixtureModelComponent.h" #include "itkExpectationMaximizationMixtureModelEstimator.h" We will fill the sample with random variables from two normal distribution using the itk::Statistics::NormalVariateGenerator. #include "itkNormalVariateGenerator.h" Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector type as a one component vector. We then, create a ListSample object for data inputs. We also create two Subsample objects that will store the measurement vectors in the sample into two separate sample containers. Each Subsample object stores only the measurement vectors belonging to a single class. This class sample will be used by the parameter estimation algorithms. unsigned int numberOfClasses = 2;
  • 516. 482 Chapter 19. Classification typedef itk::Vector <double, 1> MeasurementVectorType; typedef itk::Statistics ::ListSample <MeasurementVectorType > SampleType ; SampleType ::Pointer sample = SampleType ::New (); sample ->SetMeasurementVectorSize(1); // length of measurement vectors // in the sample. The following code snippet creates a NormalVariateGenerator object. Since the random variable generator returns values according to the standard normal distribution (the mean is zero, and the standard deviation is one) before pushing random values into the sample, we change the mean and standard deviation. We want two normal (Gaussian) distribution data. We have two for loops. Each for loop uses different mean and standard deviation. Before we fill the sample with the second distribution data, we call Initialize() method to recreate the pool of random variables in the normalGenerator. In the second for loop, we fill the two class samples with measurement vectors using the AddInstance() method. To see the probability density plots from the two distribution, refer to Figure 19.2. typedef itk::Statistics :: NormalVariateGenerator NormalGeneratorType; NormalGeneratorType::Pointer normalGenerator = NormalGeneratorType::New (); normalGenerator ->Initialize (101); MeasurementVectorType mv; double mean = 100; double standardDeviation = 30; for (unsigned int i = 0; i < 100; ++i) { mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean; sample ->PushBack (mv); } normalGenerator ->Initialize (3024); mean = 200; standardDeviation = 30; for (unsigned int i = 0; i < 100; ++i) { mv[0] = (normalGenerator ->GetVariate () * standardDeviation) + mean; sample ->PushBack (mv); } In the following code snippet notice that the template argument for the MeanCalculator and Covari- anceCalculator is ClassSampleType (i.e., type of Subsample) instead of SampleType (i.e., type of ListSample). This is because the parameter estimation algorithms are applied to the class sample. typedef itk::Array <double> ParametersType; ParametersType params(2); std::vector <ParametersType > initialParameters(numberOfClasses); params[0] = 110.0; params[1] = 800.0;
  • 517. 19.2. Unsupervised classification 483 initialParameters[0] = params; params[0] = 210.0; params[1] = 850.0; initialParameters[1] = params; typedef itk::Statistics :: GaussianMixtureModelComponent<SampleType > ComponentType; std::vector <ComponentType::Pointer > components ; for (unsigned int i = 0; i < numberOfClasses; ++i) { components .push_back (ComponentType::New()); (components [i])->SetSample (sample); (components [i])->SetParameters(initialParameters[i]); } We run the estimator. typedef itk::Statistics :: ExpectationMaximizationMixtureModelEstimator< SampleType > EstimatorType; EstimatorType::Pointer estimator = EstimatorType::New(); estimator ->SetSample (sample); estimator ->SetMaximumIteration(200); itk::Array <double> initialProportions(numberOfClasses); initialProportions[0] = 0.5; initialProportions[1] = 0.5; estimator ->SetInitialProportions(initialProportions); for (unsigned int i = 0; i < numberOfClasses; ++i) { estimator ->AddComponent((ComponentType::Superclass *) (components [i]). GetPointer ()); } estimator ->Update (); We then print out the estimated parameters. for (unsigned int i = 0; i < numberOfClasses; ++i) { std::cout << "Cluster[" << i << "]" << std::endl; std::cout << " Parameters :" << std::endl; std::cout << " " << (components [i])->GetFullParameters() << std::endl; std::cout << " Proportion : "; std::cout << " " << (*estimator ->GetProportions())[i] << std::endl; }
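For reference, the quantities manipulated in the two steps described at the beginning of this section can be written down explicitly; this is the standard EM formulation for the mixture of equation 19.4 (a sketch, not a transcription of the class documentation). The E-step computes, for each measurement vector $x_k$, the expected membership

$\hat{P}(i \mid x_k) = \dfrac{\alpha_i f_i(x_k)}{\sum_{j} \alpha_j f_j(x_k)},$

and the M-step updates the proportions (and, for Gaussian components, the means and covariances) from these weights, for instance

$\alpha_i \leftarrow \dfrac{1}{N} \sum_{k=1}^{N} \hat{P}(i \mid x_k).$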
19.2.5 Statistical Segmentations

Stochastic Expectation Maximization

The Stochastic Expectation Maximization (SEM) approach is a stochastic version of the EM mixture estimation seen in section 19.2.4. It has been introduced by [23] to prevent the EM approach from converging to local minima. It avoids the analytical maximization step by integrating a stochastic sampling procedure in the estimation process, and it induces an almost sure (a.s.) convergence of the algorithm.

From the initial two-step formulation of the EM mixture estimation, the SEM may be decomposed into 3 steps:

1. E-step, calculates the expected membership values of each measurement vector to each class.
2. S-step, performs a stochastic sampling of the membership of each measurement vector to the classes, according to the membership values computed in the E-step (see the short formulation at the end of this introduction).
3. M-step, updates the parameters of the membership probabilities (parameters to be defined through the class itk::Statistics::ModelComponentBase and its inherited classes).

The implementation of the SEM has been turned into a contextual SEM, in the sense that the evaluation of the membership parameters is conditioned on the membership values in the spatial neighborhood of each pixel.

The source code for this example can be found in the file Examples/Learning/SEMModelEstimatorExample.cxx.

In this example, we present OTB's implementation of SEM, through the class otb::SEMClassifier. This class performs a stochastic version of the EM algorithm, but instead of inheriting from itk::ExpectationMaximizationMixtureModelEstimator, we chose to inherit from itk::Statistics::ListSample< TSample >, in the same way as otb::SVMClassifier. The program takes an otb::VectorImage as input and outputs an otb::Image. The appropriate header files have to be included:

#include "otbImage.h"
#include "otbVectorImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"

otb::SEMClassifier performs the estimation of a mixture model to fit the initial histogram. Currently, mixtures of Gaussian pdfs can be estimated. Those generic pdfs are handled by otb::Statistics::ModelComponentBase; the Gaussian model is handled by the class otb::Statistics::GaussianModelComponent.
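The S-step referred to in the list above can be written compactly (this is the generic SEM formulation, not a quotation of the class documentation): given the E-step memberships $\hat{P}(i \mid x_k)$, each measurement vector receives a single, hard label drawn at random,

$\Pr(z_k = i) = \hat{P}(i \mid x_k),$

and the M-step then re-estimates each class model only from the vectors actually drawn into it. The contextual variant mentioned above additionally conditions these memberships on the labels found in the spatial neighborhood of each pixel.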
  • 519. 19.2. Unsupervised classification 485 #include "otbGaussianModelComponent.h" #include "otbSEMClassifier.h" Input/Output images type are define in a classical way. In fact, a itk::VariableLengthVector is to be considered for the templated MeasurementVectorType, which will be used in the ListSample interface. typedef double PixelType ; typedef otb::VectorImage <PixelType , 2> ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; typedef otb::Image <unsigned char, 2> OutputImageType; typedef otb::ImageFileWriter <OutputImageType > WriterType ; Once the input image is opened, the classifier may be initialised by SmartPointer. typedef otb::SEMClassifier <ImageType , OutputImageType > ClassifType ; ClassifType ::Pointer classifier = ClassifType ::New(); Then, it follows, classical initializations of the pipeline. classifier ->SetNumberOfClasses(numberOfClasses); classifier ->SetMaximumIteration(numberOfIteration); classifier ->SetNeighborhood(neighborhood); classifier ->SetTerminationThreshold(terminationThreshold); classifier ->SetSample (reader ->GetOutput ()); When an initial segmentation is available, the classifier may use it as image (of type OutputImageType) or as a itk::SampleClassifier result (of type itk::Statistics::MembershipSample< SampleType >). if (fileNameImgInit != NULL) { typedef otb::ImageFileReader <OutputImageType > ImgInitReaderType; ImgInitReaderType::Pointer segReader = ImgInitReaderType::New (); segReader ->SetFileName (fileNameImgInit); segReader ->Update(); classifier ->SetClassLabels(segReader ->GetOutput ()); } By default, otb::SEMClassifier performs initialization of ModelComponentBase by as many instanciation of otb::Statistics::GaussianModelComponent as the number of classes to esti- mate in the mixture. Nevertheless, the user may add specific distribution into the mixture estimation. It is permited by the use of AddComponent for the given class number and the specific distribution. typedef ClassifType ::ClassSampleType ClassSampleType; typedef otb::Statistics ::GaussianModelComponent <ClassSampleType > GaussianType;
for (int i = 0; i < numberOfClasses; ++i)
  classifier->AddComponent(i, GaussianType::New());

Once the pipeline is instantiated, the segmentation itself may be launched by using the Update() function.

try
{
  classifier->Update();
}

The segmentation may output a result of type itk::Statistics::MembershipSample< SampleType >, as is the case for the otb::SVMClassifier. But when using GetOutputImage() the output is directly an image. For visualization purposes only, we choose to rescale the image of classes before saving it to a file. We will use the itk::RescaleIntensityImageFilter for this purpose.

typedef itk::RescaleIntensityImageFilter<OutputImageType, OutputImageType> RescalerType;
RescalerType::Pointer rescaler = RescalerType::New();
rescaler->SetOutputMinimum(itk::NumericTraits<unsigned char>::min());
rescaler->SetOutputMaximum(itk::NumericTraits<unsigned char>::max());
rescaler->SetInput(classifier->GetOutputImage());

WriterType::Pointer writer = WriterType::New();
writer->SetFileName(fileNameOut);
writer->SetInput(rescaler->GetOutput());
writer->Update();

Figure 19.7 shows the result of the SEM segmentation with 4 different classes and a contextual neighborhood of 3 pixels.

Since the segmentation is performed by an iterative stochastic process, it is worth checking the output status: did the segmentation end because it converged, or only because the maximum number of iterations was reached?

std::cerr << "Program terminated with a ";
if (classifier->GetTerminationCode() == ClassifType::CONVERGED)
  std::cerr << "converged ";
else
  std::cerr << "not-converged ";
std::cerr << "code...\n";

The text output gives, for each class, the parameters of the pdf (e.g. the mean of each component of the class and their covariance matrix, in the case of a Gaussian mixture model).

classifier->Print(std::cerr);
Figure 19.7: SEM Classification results.

19.2.6 Classification using Markov Random Fields

Markov Random Fields are probabilistic models that use the statistical dependency between pixels in a neighborhood to infer the value of a given pixel.

ITK framework

The itk::Statistics::MRFImageFilter uses the maximum a posteriori (MAP) estimates for modeling the MRF. The object traverses the data set and uses the model generated by the Mahalanobis distance classifier to get the distance between each pixel in the data set and a set of known classes, updates the distances by evaluating the influence of its neighboring pixels (based on a MRF model) and finally, assigns each pixel to the class which has the minimum distance to that pixel (taking the neighborhood influence into consideration). The energy function minimization is done using the iterated conditional modes (ICM) algorithm [13].

The source code for this example can be found in the file Examples/Classification/ScalarImageMarkovRandomField1.cxx.

This example shows how to use the Markov Random Field approach for classifying the pixels of a scalar image.

The itk::Statistics::MRFImageFilter is used for refining an initial classification by introducing the spatial coherence of the labels. The user should provide two images as input. The first image is the one to be classified while the second image is an image of labels representing an initial classification.

The following headers are related to reading input images, writing the output image, and making the necessary conversions between scalar and vector images.
  • 522. 488 Chapter 19. Classification #include "otbImage.h" #include "itkFixedArray.h" #include "otbImageFileReader.h" #include "otbImageFileWriter.h" #include "itkScalarToArrayCastImageFilter.h" The following headers are related to the statistical classification classes. #include "itkMRFImageFilter.h" #include "itkDistanceToCentroidMembershipFunction.h" #include "itkMinimumDecisionRule.h" #include "itkImageClassifierBase.h" First we define the pixel type and dimension of the image that we intend to classify. With this image type we can also declare the otb::ImageFileReader needed for reading the input image, create one and set its input filename. typedef unsigned char PixelType ; const unsigned int Dimension = 2; typedef otb::Image <PixelType , Dimension > ImageType ; typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputImageFileName); As a second step we define the pixel type and dimension of the image of labels that provides the initial classification of the pixels from the first image. This initial labeled image can be the output of a K-Means method like the one illustrated in section 19.2.1. typedef unsigned char LabelPixelType; typedef otb::Image <LabelPixelType , Dimension > LabelImageType; typedef otb::ImageFileReader <LabelImageType > LabelReaderType; LabelReaderType::Pointer labelReader = LabelReaderType::New(); labelReader ->SetFileName (inputLabelImageFileName); Since the Markov Random Field algorithm is defined in general for images whose pixels have multiple components, that is, images of vector type, we must adapt our scalar image in order to satisfy the interface expected by the MRFImageFilter. We do this by using the itk::ScalarToArrayCastImageFilter. With this filter we will present our scalar image as a vector image whose vector pixels contain a single component. typedef itk::FixedArray <LabelPixelType , 1> ArrayPixelType; typedef otb::Image <ArrayPixelType , Dimension > ArrayImageType; typedef itk::ScalarToArrayCastImageFilter< ImageType , ArrayImageType > ScalarToArrayFilterType;
  • 523. 19.2. Unsupervised classification 489 ScalarToArrayFilterType::Pointer scalarToArrayFilter = ScalarToArrayFilterType::New(); scalarToArrayFilter ->SetInput(reader ->GetOutput ()); With the input image type ImageType and labeled image type LabelImageType we instantiate the type of the itk::MRFImageFilter that will apply the Markov Random Field algorithm in order to refine the pixel classification. typedef itk::MRFImageFilter <ArrayImageType , LabelImageType > MRFFilterType; MRFFilterType::Pointer mrfFilter = MRFFilterType::New(); mrfFilter ->SetInput (scalarToArrayFilter ->GetOutput ()); We set now some of the parameters for the MRF filter. In particular, the number of classes to be used during the classification, the maximum number of iterations to be run in this filter and the error tolerance that will be used as a criterion for convergence. mrfFilter ->SetNumberOfClasses(numberOfClasses); mrfFilter ->SetMaximumNumberOfIterations(numberOfIterations); mrfFilter ->SetErrorTolerance(1e-7); The smoothing factor represents the tradeoff between fidelity to the observed image and the smooth- ness of the segmented image. Typical smoothing factors have values between 1 5. This factor will multiply the weights that define the influence of neighbors on the classification of a given pixel. The higher the value, the more uniform will be the regions resulting from the classification refinement. mrfFilter ->SetSmoothingFactor(smoothingFactor); Given that the MRF filter needs to continually relabel the pixels, it needs access to a set of mem- bership functions that will measure to what degree every pixel belongs to a particular class. The classification is performed by the itk::ImageClassifierBase class, that is instantiated using the type of the input vector image and the type of the labeled image. typedef itk::ImageClassifierBase < ArrayImageType , LabelImageType > SupervisedClassifierType; SupervisedClassifierType::Pointer classifier = SupervisedClassifierType::New (); The classifier needs a decision rule to be set by the user. Note that we must use GetPointer() in the call of the SetDecisionRule() method because we are passing a SmartPointer, and smart pointer cannot perform polymorphism, we must then extract the raw pointer that is associated to the smart pointer. This extraction is done with the GetPointer() method. typedef itk::MinimumDecisionRule DecisionRuleType; DecisionRuleType::Pointer classifierDecisionRule = DecisionRuleType::New();
  • 524. 490 Chapter 19. Classification classifier ->SetDecisionRule(classifierDecisionRule.GetPointer ()); We now instantiate the membership functions. In this case we use the itk::Statistics::DistanceToCentroidMembershipFunction class templated over the pixel type of the vector image, which in our example happens to be a vector of dimension 1. typedef itk::Statistics :: DistanceToCentroidMembershipFunction< ArrayPixelType > MembershipFunctionType; typedef MembershipFunctionType::Pointer MembershipFunctionPointer; double meanDistance = 0; vnl_vector <double> centroid (1); for (unsigned int i = 0; i < numberOfClasses; ++i) { MembershipFunctionPointer membershipFunction = MembershipFunctionType::New (); centroid [0] = atof(argv[i + numberOfArgumentsBeforeMeans]); membershipFunction ->SetCentroid (centroid ); classifier ->AddMembershipFunction(membershipFunction); meanDistance += static_cast<double> (centroid [0]); } meanDistance /= numberOfClasses; and we set the neighborhood radius that will define the size of the clique to be used in the compu- tation of the neighbors’ influence in the classification of any given pixel. Note that despite the fact that we call this a radius, it is actually the half size of an hypercube. That is, the actual region of influence will not be circular but rather an N-Dimensional box. For example, a neighborhood radius of 2 in a 3D image will result in a clique of size 5x5x5 pixels, and a radius of 1 will result in a clique of size 3x3x3 pixels. mrfFilter ->SetNeighborhoodRadius(1); We should now set the weights used for the neighbors. This is done by passing an array of values that contains the linear sequence of weights for the neighbors. For example, in a neighborhood of size 3x3x3, we should provide a linear array of 9 weight values. The values are packaged in a std::vector and are supposed to be double. The following lines illustrate a typical set of values for a 3x3x3 neighborhood. The array is arranged and then passed to the filter by using the method SetMRFNeighborhoodWeight(). std::vector <double> weights; weights.push_back (1.5); weights.push_back (2.0); weights.push_back (1.5); weights.push_back (2.0);
  • 525. 19.2. Unsupervised classification 491 weights.push_back (0.0); // This is the central pixel weights.push_back (2.0); weights.push_back (1.5); weights.push_back (2.0); weights.push_back (1.5); We now scale weights so that the smoothing function and the image fidelity functions have com- parable value. This is necessary since the label image and the input image can have different dynamic ranges. The fidelity function is usually computed using a distance function, such as the itk::DistanceToCentroidMembershipFunction or one of the other membership functions. They tend to have values in the order of the means specified. double totalWeight = 0; for (std ::vector <double>:: const_iterator wcIt = weights.begin(); wcIt != weights.end (); ++wcIt) { totalWeight += *wcIt; } for (std ::vector <double>:: iterator wIt = weights.begin(); wIt != weights.end (); wIt ++) { *wIt = static_cast<double> ((*wIt) * meanDistance / (2 * totalWeight )); } mrfFilter ->SetMRFNeighborhoodWeight(weights); Finally, the classifier class is connected to the Markov Random Fields filter. mrfFilter ->SetClassifier(classifier ); The output image produced by the itk::MRFImageFilter has the same pixel type as the labeled input image. In the following lines we use the OutputImageType in order to instantiate the type of a otb::ImageFileWriter. Then create one, and connect it to the output of the classification filter after passing it through an intensity rescaler to rescale it to an 8 bit dynamic range typedef MRFFilterType::OutputImageType OutputImageType; typedef otb::ImageFileWriter <OutputImageType > WriterType ; WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(intensityRescaler ->GetOutput ()); writer ->SetFileName (outputImageFileName); We are now ready for triggering the execution of the pipeline. This is done by simply invoking the Update() method in the writer. This call will propagate the update request to the reader and then to the MRF filter. try
  • 526. 492 Chapter 19. Classification Figure 19.8: Effect of the MRF filter. { writer ->Update (); } catch (itk::ExceptionObject& excp) { std::cerr << "Problem encountered while writing "; std::cerr << " image file : " << argv[2] << std::endl; std::cerr << excp << std::endl; return EXIT_FAILURE; } Figure 19.8 illustrates the effect of this filter with four classes. In this example the filter was run with a smoothing factor of 3. The labeled image was produced by ScalarImageKmeansClassifier.cxx and the means were estimated by ScalarImageKmeansModelEstimator.cxx described in section 19.2.1. The obtained result can be compared with the one of figure 19.1 to see the interest of using the MRF approach in order to ensure the regularization of the classified image. OTB framework The ITK approach was considered not to be flexible enough for some remote sensing applications. Therefore, we decided to implement our own framework. The source code for this example can be found in the file Examples/Markov/MarkovClassification1Example.cxx. This example illustrates the details of the otb::MarkovRandomFieldFilter. This filter is an application of the Markov Random Fields for classification, segmentation or restauration. This example applies the otb::MarkovRandomFieldFilter to classify an image into four classes defined by their mean and variance. The optimization is done using an Metropolis algorithm with a random sampler. The regularization energy is defined by a Potts model and the fidelity by a Gaussian
  • 527. 19.2. Unsupervised classification 493 Figure 19.9: OTB Markov Framework. model. The first step toward the use of this filter is the inclusion of the proper header files. #include "otbMRFEnergyPotts.h" #include "otbMRFEnergyGaussianClassification.h" #include "otbMRFOptimizerMetropolis.h" #include "otbMRFSamplerRandom.h" Then we must decide what pixel type to use for the image. We choose to make all computations with double precision. The labelled image is of type unsigned char which allows up to 256 different classes. const unsigned int Dimension = 2; typedef double InternalPixelType; typedef unsigned char LabelledPixelType; typedef otb::Image <InternalPixelType , Dimension > InputImageType; typedef otb::Image <LabelledPixelType , Dimension > LabelledImageType; We define a reader for the image to be classified, an initialization for the classification (which could be random) and a writer for the final classification. typedef otb::ImageFileReader <InputImageType > ReaderType ; typedef otb::ImageFileWriter <LabelledImageType > WriterType ; ReaderType ::Pointer reader = ReaderType ::New ();
  • 528. 494 Chapter 19. Classification WriterType ::Pointer writer = WriterType ::New (); const char * inputFilename = argv[1]; const char * outputFilename = argv[2]; reader ->SetFileName (inputFilename); writer ->SetFileName (outputFilename); Finally, we define the different classes necessary for the Markov classification. A otb::MarkovRandomFieldFilter is instanciated, this is the main class which connect the other to do the Markov classification. typedef otb::MarkovRandomFieldFilter <InputImageType , LabelledImageType > MarkovRandomFieldFilterType; An otb::MRFSamplerRandomMAP, which derives from the otb::MRFSampler, is instan- ciated. The sampler is in charge of proposing a modification for a given site. The otb::MRFSamplerRandomMAP, randomly pick one possible value according to the MAP probability. typedef otb::MRFSamplerRandom <InputImageType , LabelledImageType > SamplerType ; An otb::MRFOptimizerMetropoli, which derives from the otb::MRFOptimizer, is instanci- ated. The optimizer is in charge of accepting or rejecting the value proposed by the sampler. The otb::MRFSamplerRandomMAP, accept the proposal according to the variation of energy it causes and a temperature parameter. typedef otb::MRFOptimizerMetropolis OptimizerType; Two energy, deriving from the otb::MRFEnergy class need to be instanciated. One energy is required for the regularization, taking into account the relashionship between neighborhing pixels in the classified image. Here it is done with the otb::MRFEnergyPotts which implement a Potts model. The second energy is for the fidelity to the original data. Here it is done with an otb::MRFEnergyGaussianClassification class, which defines a gaussian model for the data. typedef otb::MRFEnergyPotts <LabelledImageType , LabelledImageType > EnergyRegularizationType; typedef otb:: MRFEnergyGaussianClassification <InputImageType , LabelledImageType > EnergyFidelityType; The different filters composing our pipeline are created by invoking their New() methods, assigning the results to smart pointers. MarkovRandomFieldFilterType::Pointer markovFilter = MarkovRandomFieldFilterType::New(); EnergyRegularizationType::Pointer energyRegularization = EnergyRegularizationType::New (); EnergyFidelityType::Pointer energyFidelity = EnergyFidelityType::New ();
  • 529. 19.2. Unsupervised classification 495 OptimizerType::Pointer optimizer = OptimizerType::New (); SamplerType ::Pointer sampler = SamplerType ::New(); Parameter for the otb::MRFEnergyGaussianClassification class, meand and standard devia- tion are created. unsigned int nClass = 4; energyFidelity ->SetNumberOfParameters(2 * nClass); EnergyFidelityType::ParametersType parameters ; parameters .SetSize(energyFidelity ->GetNumberOfParameters()); parameters [0] = 10.0; //Class 0 mean parameters [1] = 10.0; //Class 0 stdev parameters [2] = 80.0; //Class 1 mean parameters [3] = 10.0; //Class 1 stdev parameters [4] = 150.0; //Class 2 mean parameters [5] = 10.0; //Class 2 stdev parameters [6] = 220.0; //Class 3 mean parameters [7] = 10.0; //Class 3 stde energyFidelity ->SetParameters(parameters ); Parameters are given to the different class an the sampler, optimizer and energies are connected with the Markov filter. OptimizerType::ParametersType param(1); param.Fill(atof(argv[5])); optimizer ->SetParameters(param); markovFilter ->SetNumberOfClasses(nClass); markovFilter ->SetMaximumNumberOfIterations(atoi(argv[4])); markovFilter ->SetErrorTolerance(0.0); markovFilter ->SetLambda (atof(argv[3])); markovFilter ->SetNeighborhoodRadius(1); markovFilter ->SetEnergyRegularization(energyRegularization); markovFilter ->SetEnergyFidelity(energyFidelity); markovFilter ->SetOptimizer(optimizer ); markovFilter ->SetSampler (sampler ); The pipeline is connected. An itk::RescaleIntensityImageFilter rescale the classified image before saving it. markovFilter ->SetInput(reader ->GetOutput ()); typedef itk:: RescaleIntensityImageFilter <LabelledImageType , LabelledImageType > RescaleType ; RescaleType ::Pointer rescaleFilter = RescaleType ::New(); rescaleFilter ->SetOutputMinimum(0); rescaleFilter ->SetOutputMaximum(255); rescaleFilter ->SetInput(markovFilter ->GetOutput ()); writer ->SetInput(rescaleFilter ->GetOutput ());
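For reference, the quantity minimized by such a Markov model can be written schematically as a weighted sum of the two energies configured above. The formulation below is the generic textbook one, not a transcription of the OTB implementation, so the exact weighting convention of the Lambda parameter should be checked against the otb::MarkovRandomFieldFilter documentation:

$$U(x \mid y) \;=\; U_{fidelity}(x, y) \;+\; \lambda\, U_{regularization}(x), \qquad U_{regularization}(x) \;=\; \sum_{(s,t)\ \mathrm{neighbors}} \beta\,\bigl(1 - \delta(x_s, x_t)\bigr)$$

where $x$ is the label image, $y$ the observed image, $\delta$ the Kronecker delta, and the Potts term adds a constant penalty $\beta$ for every pair of neighboring pixels carrying different labels.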
  • 530. 496 Chapter 19. Classification Figure 19.10: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird image for classification. The result is obtained after 20 iterations with a random sampler and a Metropolis optimizer. From left to right : original image, classification. Finally, the pipeline execution is trigerred. writer ->Update(); Figure 19.10 shows the output of the Markov Random Field classification after 20 iterations with a random sampler and a Metropolis optimizer. The source code for this example can be found in the file Examples/Markov/MarkovClassification2Example.cxx. Using a similar structure as the previous program and the same energy function, we are now going to slightly alter the program to use a different sampler and optimizer. The proposed sample is proposed randomly according to the MAP probability and the optimizer is the ICM which accept the proposed sample if it enable a reduction of the energy. First, we need to include header specific to these class: #include "otbMRFSamplerRandomMAP.h" #include "otbMRFOptimizerICM.h" And to declare these new type: typedef otb::MRFSamplerRandomMAP <InputImageType , LabelledImageType > SamplerType ; // typedef otb::MRFSamplerRandom< InputImageType, LabelledImageType> SamplerType; typedef otb::MRFOptimizerICM OptimizerType; As the otb::MRFOptimizerICM does not have any parameters, the call to optimizer->SetParameters() must be removed
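To make the difference between the two optimizers concrete, the following standalone sketch (not OTB code; the function names and structure are purely illustrative) contrasts the two acceptance rules applied to the energy variation deltaEnergy produced by a proposed label change:

#include <cmath>
#include <cstdlib>

// ICM: accept a proposal only if it strictly decreases the energy.
bool AcceptICM(double deltaEnergy)
{
  return deltaEnergy < 0.0;
}

// Metropolis: always accept energy decreases; otherwise accept with
// probability exp(-deltaEnergy / T), T being the temperature parameter
// mentioned above (the single value passed through SetParameters()).
bool AcceptMetropolis(double deltaEnergy, double temperature)
{
  if (deltaEnergy < 0.0)
  {
    return true;
  }
  const double u = static_cast<double>(std::rand()) / RAND_MAX; // uniform draw in [0, 1]
  return u < std::exp(-deltaEnergy / temperature);
}

This also explains why the ICM variant converges in fewer iterations (5 instead of 20 here) but can remain trapped in a local minimum of the energy, whereas the Metropolis variant can escape such minima at the cost of more iterations.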
  • 531. 19.2. Unsupervised classification 497 Figure 19.11: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird image for classification. The result is obtained after 5 iterations with a MAP random sampler and an ICM optimizer. From left to right : original image, classification. Apart from these, no further modification is required. Figure 19.11 shows the output of the Markov Random Field classification after 5 iterations with a MAP random sampler and an ICM optimizer. The source code for this example can be found in the file Examples/Markov/MarkovClassification3Example.cxx. This example illustrates the details of the MarkovRandomFieldFilter by using the Fisher distribu- tion to model the likelihood energy. This filter is an application of the Markov Random Fields for classification. This example applies the MarkovRandomFieldFilter to classify an image into four classes defined by their Fisher distribution parameters L, M and mu. The optimization is done using a Metropo- lis algorithm with a random sampler. The regularization energy is defined by a Potts model and the fidelity or likelihood energy is modelled by a Fisher distribution. The parameter of the Fisher distribution was determined for each class in a supervised step. ( See the File OtbParameterEstima- tioOfFisherDistribution ) This example is a contribution from Jan Wegner. Then we must decide what pixel type to use for the image. We choose to make all computations with double precision. The labeled image is of type unsigned char which allows up to 256 different classes. const unsigned int Dimension = 2; typedef double InternalPixelType; typedef unsigned char LabelledPixelType; typedef otb::Image <InternalPixelType , Dimension > InputImageType; typedef otb::Image <LabelledPixelType , Dimension > LabelledImageType; We define a reader for the image to be classified, an initialization for the classification (which could
  • 532. 498 Chapter 19. Classification be random) and a writer for the final classification. typedef otb::ImageFileReader < InputImageType > ReaderType ; typedef otb::ImageFileWriter < LabelledImageType > WriterType ; ReaderType ::Pointer reader = ReaderType ::New (); WriterType ::Pointer writer = WriterType ::New (); Finally, we define the different classes necessary for the Markov classification. A MarkovRan- domFieldFilter is instantiated, this is the main class which connect the other to do the Markov classification. typedef otb::MarkovRandomFieldFilter <InputImageType , LabelledImageType > MarkovRandomFieldFilterType; An MRFSamplerRandomMAP, which derives from the MRFSampler, is instanciated. The sampler is in charge of proposing a modification for a given site. The MRFSamplerRandomMAP, randomly pick one possible value according to the MAP probability. typedef otb::MRFSamplerRandom < InputImageType , LabelledImageType > SamplerType ; An MRFOptimizerMetropolis, which derives from the MRFOptimizer, is instanciated. The opti- mizer is in charge of accepting or rejecting the value proposed by the sampler. The MRFSampler- RandomMAP, accept the proposal according to the variation of energy it causes and a temperature parameter. typedef otb::MRFOptimizerMetropolis OptimizerType; Two energy, deriving from the MRFEnergy class need to be instantiated. One energy is required for the regularization, taking into account the relationship between neighboring pixels in the classified image. Here it is done with the MRFEnergyPotts, which implements a Potts model. The second energy is used for the fidelity to the original data. Here it is done with a MRFEnergy- FisherClassification class, which defines a Fisher distribution to model the data. typedef otb::MRFEnergyPotts <LabelledImageType , LabelledImageType > EnergyRegularizationType; typedef otb::MRFEnergyFisherClassification <InputImageType , LabelledImageType > EnergyFidelityType; The different filters composing our pipeline are created by invoking their New() methods, assigning the results to smart pointers. MarkovRandomFieldFilterType::Pointer markovFilter = MarkovRandomFieldFilterType::New (); EnergyRegularizationType::Pointer energyRegularization = EnergyRegularizationType::New(); EnergyFidelityType::Pointer energyFidelity = EnergyFidelityType::New (); OptimizerType::Pointer optimizer = OptimizerType::New(); SamplerType ::Pointer sampler = SamplerType ::New();
  • 533. 19.2. Unsupervised classification 499 Parameter for the MRFEnergyFisherClassification class are created. The shape parameters M, L and the weighting parameter mu are computed in a supervised step unsigned int nClass =4; energyFidelity ->SetNumberOfParameters(3* nClass); EnergyFidelityType::ParametersType parameters ; parameters .SetSize(energyFidelity ->GetNumberOfParameters()); //Class 0 parameters [0] = 12.353042; //Class 0 mu parameters [1] = 2.156422; //Class 0 L parameters [2] = 4.920403; //Class 0 M //Class 1 parameters [3] = 72.068291; //Class 1 mu parameters [4] = 11.000000; //Class 1 L parameters [5] = 50.950001; //Class 1 M //Class 2 parameters [6] = 146.665985; //Class 2 mu parameters [7] = 11.000000; //Class 2 L parameters [8] = 50.900002; //Class 2 M //Class 3 parameters [9] = 200.010132; //Class 3 mu parameters [10] = 11.000000; //Class 3 L parameters [11] = 50.950001; //Class 3 M energyFidelity ->SetParameters(parameters ); Parameters are given to the different classes and the sampler, optimizer and energies are connected with the Markov filter. OptimizerType::ParametersType param(1); param.Fill(atof(argv[6])); optimizer ->SetParameters(param); markovFilter ->SetNumberOfClasses(nClass); markovFilter ->SetMaximumNumberOfIterations(atoi(argv[5])); markovFilter ->SetErrorTolerance(0.0); markovFilter ->SetLambda (atof(argv[4])); markovFilter ->SetNeighborhoodRadius(1); markovFilter ->SetEnergyRegularization(energyRegularization); markovFilter ->SetEnergyFidelity(energyFidelity); markovFilter ->SetOptimizer(optimizer ); markovFilter ->SetSampler (sampler ); The pipeline is connected. An itkRescaleIntensityImageFilter rescales the classified image before saving it. markovFilter ->SetInput(reader ->GetOutput ()); typedef itk:: RescaleIntensityImageFilter < LabelledImageType , LabelledImageType > RescaleType ; RescaleType ::Pointer rescaleFilter = RescaleType ::New(); rescaleFilter ->SetOutputMinimum(0); rescaleFilter ->SetOutputMaximum(255);
  • 534. 500 Chapter 19. Classification Figure 19.12: Result of applying the otb::MarkovRandomFieldFilter to an extract from a PAN Quickbird image for classification into four classes using the Fisher-distribution as likehood term. From left to right : original image, classification. rescaleFilter ->SetInput( markovFilter ->GetOutput () ); writer ->SetInput( rescaleFilter ->GetOutput () ); writer ->Update(); We can now create an image file writer and save the image. typedef otb::ImageFileWriter <RGBImageType > WriterRescaledType; WriterRescaledType::Pointer writerRescaled = WriterRescaledType::New (); writerRescaled ->SetFileName ( outputRescaledImageFileName ); writerRescaled ->SetInput( colormapper ->GetOutput () ); writerRescaled ->Update (); Figure 19.12 shows the output of the Markov Random Field classification into four classes using the Fisher-distribution as likelihood term. The source code for this example can be found in the file Examples/Markov/MarkovRegularizationExample.cxx. This example illustrates the use of the otb::MarkovRandomFieldFilter. to regularize a classifi- cation obtained previously by another classifier. Here we will apply the regularization to the output of an SVM classifier presented in 19.3.4. The reference image and the starting image are both going to be the original classification. Both regularization and fidelity energy are defined by Potts model. The convergence of the Markov Random Field is done with a random sampler and a Metropolis model as in example 1. As you should get use to the general program structure to use the MRF
  • 535. 19.3. Supervised classification 501 Figure 19.13: Result of applying the otb::MarkovRandomFieldFilter to regularized the result of another classification. From left to right : original classification, regularized classification framework, we are not going to repeat the entire example. However, remember you can find the full source code for this example in your OTB source directory. To find the number of classes available in the original image we use the itk::LabelStatisticsImageFilter and more particularly the method GetNumberOfLabels(). typedef itk:: LabelStatisticsImageFilter <LabelledImageType , LabelledImageType > LabelledStatType; LabelledStatType::Pointer labelledStat = LabelledStatType::New (); labelledStat ->SetInput(reader ->GetOutput ()); labelledStat ->SetLabelInput(reader ->GetOutput ()); labelledStat ->Update(); unsigned int nClass = labelledStat ->GetNumberOfLabels(); Figure 19.13 shows the output of the Markov Random Field regularization on the classification output of another method. 19.3 Supervised classification 19.3.1 Generic machine learning framework The OTB supervised classification is implemented as a generic Machine Learning framework, sup- porting several possible machine learning libraries as backends. As of now both libSVM (the ma- chine learning library historically integrated in OTB) and the machine learning methods of OpenCV library ([15]) are available. The current list of classifiers available through the same generic interface within the OTB is: • LibSVM: Support Vector Machines classifier based on libSVM.
  • 536. 502 Chapter 19. Classification • SVM: Support Vector Machines classifier based on OpenCV, itself based on libSVM. • Bayes: Normal Bayes classifier based on OpenCV. • Boost: Boost classifier based on OpenCV. • DT: Decision Tree classifier based on OpenCV. • RF: Random Forests classifier based on the Random Trees in OpenCV. • GBT: Gradient Boosted Tree classifier based on OpenCV. • KNN: K-Nearest Neighbors classifier based on OpenCV. • ANN: Artificial Neural Network classifier based on OpenCV. 19.3.2 An example of supervised classification method: Support Vector Machines SVM general description Kernel based learning methods in general and the Support Vector Machines (SVM) in particular, have been introduced in the last years in learning theory for classification and regression tasks, [136]. SVM have been successfully applied to text categorization, [74], and face recognition, [105]. Recently, they have been successfully used for the classification of hyperspectral remote-sensing images, [16]. Simply stated, the approach consists in searching for the separating surface between 2 classes by the determination of the subset of training samples which best describes the boundary between the 2 classes. These samples are called support vectors and completely define the classification system. In the case where the two classes are nonlinearly separable, the method uses a kernel expansion in order to make projections of the feature space onto higher dimensionality spaces where the separation of the classes becomes linear. SVM mathematical formulation This subsection reminds the basic principles of SVM learning and classification. A good tutorial on SVM can be found in, [18]. We have N samples represented by the couple (yi,xi),i = 1...N where yi ∈ {−1,+1} is the class label and xi ∈ Rn is the feature vector of dimension n. A classifier is a function f(x,α) : x → y where α are the classifier parameters. The SVM finds the optimal separating hyperplane which fulfills the following constraints :
• The samples with labels $+1$ and $-1$ are on different sides of the hyperplane.
• The distance of the closest vectors to the hyperplane is maximised. These are the support vectors (SV) and this distance is called the margin.

The separating hyperplane has the equation $w \cdot x + b = 0$, with $w$ being its normal vector and $x$ being any point of the hyperplane. Its orthogonal distance to the origin is $|b| / \|w\|$. Vectors located outside the hyperplane have either $w \cdot x + b > 0$ or $w \cdot x + b < 0$. Therefore, the classifier function can be written as $f(x, w, b) = \mathrm{sgn}(w \cdot x + b)$.

The SVs are placed on two hyperplanes which are parallel to the optimal separating one. In order to find the optimal hyperplane, one scales $w$ and $b$ so that $w \cdot x + b = \pm 1$ on these two hyperplanes. Since there must not be any vector inside the margin, the following constraints can be used:

$w \cdot x_i + b \geq +1$ if $y_i = +1$;
$w \cdot x_i + b \leq -1$ if $y_i = -1$;

which can be rewritten as $y_i (w \cdot x_i + b) - 1 \geq 0 \;\; \forall i$.

The orthogonal distances of the two parallel hyperplanes to the origin are $|1 - b| / \|w\|$ and $|-1 - b| / \|w\|$. Therefore the width of the margin is equal to $2 / \|w\|$ and it has to be maximised. Thus, the problem to be solved is:

• find $w$ and $b$ which minimise $\frac{1}{2}\|w\|^2$
• under the constraints $y_i (w \cdot x_i + b) \geq 1, \;\; i = 1 \ldots N$.

This problem can be solved by using Lagrange multipliers, with one multiplier per sample. It can be shown that only the support vectors have a positive Lagrange multiplier.

In the case where the two classes are not exactly linearly separable, one can relax the constraints above by using

$w \cdot x_i + b \geq +1 - \xi_i$ if $y_i = +1$;
$w \cdot x_i + b \leq -1 + \xi_i$ if $y_i = -1$;
$\xi_i \geq 0 \;\; \forall i$.

If $\xi_i > 1$, the sample is considered to be misclassified. The function which then has to be minimised is $\frac{1}{2}\|w\|^2 + C \sum_i \xi_i$, where $C$ is a tolerance parameter. The optimisation problem is the same as in the linear case, but one multiplier has to be added for each new constraint $\xi_i \geq 0$.

If the decision surface needs to be non-linear, this solution cannot be applied and the kernel approach has to be adopted.

One drawback of the SVM is that, in its basic version, it can only solve two-class problems. Some works exist in the field of multi-class SVM (see [5, 141], and the comparison made by [60]), but they are not used in our system. Be aware that, in order to achieve better convergence of the algorithm, it is strongly advised to normalize the feature vector components to the $[-1; 1]$ interval.

For problems with $N > 2$ classes, one can choose either to train $N$ SVMs (one class against all the others), or to train $N \times (N - 1)$ SVMs (one class against each of the others). In the second approach, which is the one we use, the final decision is taken by choosing the class which is most often selected by the whole set of SVMs.

19.3.3 Learning from samples

The source code for this example can be found in the file Examples/Learning/TrainMachineLearningModelFromSamplesExample.cxx.

This example illustrates the use of the otb::SVMMachineLearningModel class, which inherits from the otb::MachineLearningModel class. This class allows the estimation of a classification model (supervised learning) from samples. In this example, we will train an SVM model with 4 output classes from 1000 randomly generated training samples, each of them having 7 components. We start by including the appropriate header files.

// List sample generator
#include "otbListSampleGenerator.h"
// Random number generator
#include "itkMersenneTwisterRandomVariateGenerator.h"
// SVM model Estimator
#include "otbSVMMachineLearningModel.h"

The input parameters of the sample generator and of the SVM classifier are initialized.

int nbSamples = 1000;
int nbSampleComponents = 7;
int nbClasses = 4;
  • 539. 19.3. Supervised classification 505 Two lists are generated into a itk::Statistics::ListSample which is the structure used to handle both lists of samples and of labels for the machine learning classes derived from otb::MachineLearningModel. The first list is composed of feature vectors representing multi- component samples, and the second one is filled with their corresponding class labels. The list of labels is composed of scalar values. // Input related typedefs typedef float InputValueType; typedef itk::VariableLengthVector <InputValueType > InputSampleType; typedef itk::Statistics ::ListSample <InputSampleType > InputListSampleType; // Target related typedefs typedef int TargetValueType; typedef itk::FixedArray <TargetValueType , 1> TargetSampleType; typedef itk::Statistics ::ListSample <TargetSampleType > TargetListSampleType; InputListSampleType::Pointer InputListSample = InputListSampleType::New (); TargetListSampleType::Pointer TargetListSample = TargetListSampleType::New(); In this example, the list of multi-component training samples is randomly filled with a random num- ber generator based on the itk::Statistics::MersenneTwisterRandomVariateGenerator class. Each component’s value is generated from a normal law centered around the correspond- ing class label of each sample multiplied by 100, with a standard deviation of 10. itk::Statistics :: MersenneTwisterRandomVariateGenerator::Pointer randGen; randGen = itk::Statistics :: MersenneTwisterRandomVariateGenerator::GetInstance (); // Filling the two input training lists for (int i = 0; i < nbSamples ; ++i) { InputSampleType sample; TargetValueType label = (i % nbClasses ) + 1; // Multi-component sample randomly filled from a normal law for each component sample.SetSize(nbSampleComponents); for (int itComp = 0; itComp < nbSampleComponents; ++itComp) { sample[itComp] = randGen ->GetNormalVariate(100 * label , 10); } InputListSample ->PushBack (sample); TargetListSample ->PushBack (label); } Once both sample and label lists are generated, the second step consists in declaring the machine learning classifier. In our case we use an SVM model with the help of the otb::SVMMachineLearningModel class which is derived from the otb::MachineLearningModel class. This pure virtual class is based on the machine learning framework of the OpenCV library ([15]) which handles other classifiers than the SVM. typedef otb::SVMMachineLearningModel <InputValueType , TargetValueType > SVMType; SVMType ::Pointer SVMClassifier = SVMType ::New ();
  • 540. 506 Chapter 19. Classification SVMClassifier ->SetInputListSample(InputListSample); SVMClassifier ->SetTargetListSample(TargetListSample); SVMClassifier ->SetKernelType(CvSVM::LINEAR); Once the classifier is parametrized with both input lists and default parameters, except for the kernel type in our example of SVM model estimation, the model training is computed with the Train method. Finally, the Save method exports the model to a text file. All the available classifiers based on OpenCV are implemented with these interfaces. Like for the SVM model training, the other classifiers can be parametrized with specific settings. SVMClassifier ->Train(); SVMClassifier ->Save(outputModelFileName); 19.3.4 Learning from images The source code for this example can be found in the file Examples/Learning/TrainMachineLearningModelFromImagesExample.cxx. This example illustrates the use of the otb::MachineLearningModel class. This class allows the estimation of a classification model (supervised learning) from images. In this example, we will train an SVM with 4 classes. We start by including the appropriate header files. // List sample generator #include "otbListSampleGenerator.h" // Extract a ROI of the vectordata #include "otbVectorDataIntoImageProjectionFilter.h" // SVM model Estimator #include "otbSVMMachineLearningModel.h" In this framework, we must transform the input samples store in a vector data into a itk::Statistics::ListSample which is the structure compatible with the machine learning classes. On the one hand, we are using feature vectors for the characterization of the classes, and on the other hand, the class labels are scalar values. We first re-project the input vector data over the input image, using the otb::VectorDataIntoImageProjectionFilter class. To con- vert the input samples store in a vector data into a itk::Statistics::ListSample, we use the otb::ListSampleGenerator class. // VectorData projection filter typedef otb::VectorDataIntoImageProjectionFilter<VectorDataType , InputImageType > VectorDataReprojectionType; InputReaderType::Pointer inputReader = InputReaderType::New(); inputReader ->SetFileName (inputImageFileName);
  • 541. 19.3. Supervised classification 507 InputImageType::Pointer image = inputReader ->GetOutput (); image ->UpdateOutputInformation(); // Read the Vectordata VectorDataReaderType::Pointer vectorReader = VectorDataReaderType::New (); vectorReader ->SetFileName (trainingShpFileName); vectorReader ->Update(); VectorDataType::Pointer vectorData = vectorReader ->GetOutput (); vectorData ->Update (); VectorDataReprojectionType::Pointer vdreproj = VectorDataReprojectionType::New (); vdreproj ->SetInputImage(image); vdreproj ->SetInput(vectorData ); vdreproj ->SetUseOutputSpacingAndOriginFromImage(false); vdreproj ->Update (); typedef otb::ListSampleGenerator <InputImageType , VectorDataType > ListSampleGeneratorType; ListSampleGeneratorType::Pointer sampleGenerator; sampleGenerator = ListSampleGeneratorType::New(); sampleGenerator ->SetInput(image); sampleGenerator ->SetInputVectorData(vdreproj ->GetOutput ()); sampleGenerator ->SetClassKey ("Class"); sampleGenerator ->Update (); Now, we need to declare the machine learning model which will be used by the classifier. In this example, we train an SVM model. The otb::SVMMachineLearningModel class inherits from the pure virtual class otb::MachineLearningModel which is templated over the type of values used for the measures and the type of pixels used for the labels. Most of the classification and regression algorithms available through this interface in OTB is based on the OpenCV library [15]. Specific methods can be used to set classifier parameters. In the case of SVM, we set here the type of the kernel. Other parameters are let with their default values. typedef otb:: SVMMachineLearningModel <InputImageType::InternalPixelType , ListSampleGeneratorType::ClassLabelType > SVMType; SVMType ::Pointer SVMClassifier = SVMType ::New (); SVMClassifier ->SetInputListSample(sampleGenerator ->GetTrainingListSample()); SVMClassifier ->SetTargetListSample(sampleGenerator ->GetTrainingListLabel()); SVMClassifier ->SetKernelType(CvSVM::LINEAR); The machine learning interface is generic and gives access to other classifiers. We now train the SVM model using the Train and save the model to a text file using the Save method.
  • 542. 508 Chapter 19. Classification SVMClassifier ->Train(); SVMClassifier ->Save(outputModelFileName); You can now use the Predict method which takes a itk::Statistics::ListSample as input and estimates the label of each input sample using the model. Finally, the otb::ImageClassificationModel inherits from the itk::ImageToImageFilter and allows to classify pixels in the input image by predicting their labels using a model. 19.3.5 Multi-band, streamed classification The source code for this example can be found in the file Examples/Classification/SupervisedImageClassificationExample.cxx. In OTB, a generic streamed filter called otb::ImageClassificationFilter is available to clas- sify any input multi-channel image according to an input classification model file. This filter is generic because it works with any classification model type (SVM, KNN, Artificial Neural Net- work,...) generated within the OTB generic Machine Learning framework based on OpenCV ([15]). The input model file is smartly parsed according to its content in order to identify which learning method was used to generate it. Once the classification method and model are known, the input image can be classified. More details are given in subsections 19.3.3 and 19.3.4 to generate a clas- sification model either from samples or from images. In this example we will illustrate its use. We start by including the appropriate header files. #include "otbMacro.h" #include "otbMachineLearningModelFactory.h" #include "otbImageClassificationFilter.h" We will assume double precision input images and will also define the type for the labeled pixels. const unsigned int Dimension = 2; typedef double PixelType ; typedef unsigned short LabeledPixelType; Our classifier is generic enough to be able to process images with any number of bands. We read the input image as a otb::VectorImage. The labeled image will be a scalar image. typedef otb::VectorImage <PixelType , Dimension > ImageType ; typedef otb::Image <LabeledPixelType , Dimension > LabeledImageType; We can now define the type for the classifier filter, which is templated over its input and output image types. typedef otb::ImageClassificationFilter<ImageType , LabeledImageType > ClassificationFilterType; typedef ClassificationFilterType::ModelType ModelType ;
Moreover, it is necessary to define an otb::MachineLearningModelFactory, which is templated over its input and output pixel types. This factory is used to parse the input model file and to determine which classification method to use.

typedef otb::MachineLearningModelFactory<PixelType, LabeledPixelType>
    MachineLearningModelFactoryType;

And finally, we define the reader and the writer. Since the images to classify can be very big, we will use a streamed writer which will trigger the streaming ability of the classifier.

typedef otb::ImageFileReader<ImageType> ReaderType;
typedef otb::ImageFileWriter<LabeledImageType> WriterType;

We instantiate the classifier and the reader objects and we set the existing model obtained in a previous training step.

ClassificationFilterType::Pointer filter = ClassificationFilterType::New();
ReaderType::Pointer reader = ReaderType::New();
reader->SetFileName(infname);

The input model file is parsed according to its content and the generated model is then loaded within the otb::ImageClassificationFilter.

ModelType::Pointer model;
model = MachineLearningModelFactoryType::CreateMachineLearningModel(
          modelfname,
          MachineLearningModelFactoryType::ReadMode);
model->Load(modelfname);
filter->SetModel(model);

We plug the pipeline and trigger its execution by updating the output of the writer.

filter->SetInput(reader->GetOutput());
WriterType::Pointer writer = WriterType::New();
writer->SetInput(filter->GetOutput());
writer->SetFileName(outfname);
writer->Update();

19.3.6 Generic Kernel SVM

OTB provides a specific interface for user-defined kernels. Note, however, that the functions described below use a deprecated OTB interface. A function $k(\cdot,\cdot)$ is considered to be a kernel when

$\forall g(\cdot) \in L^2(\mathbb{R}^n)$ such that $\int g(x)^2\, dx$ is finite, (19.5)

$\iint k(x,y)\, g(x)\, g(y)\, dx\, dy \geq 0,$
  • 544. 510 Chapter 19. Classification which is known as the Mercer condition. When defined through the OTB, a kernel is a class that inherits from GenericKernelFunctorBase. Several virtual functions have to be overloaded: • The Evaluate function, which implements the behavior of the kernel itself. For instance, the classical linear kernel could be re-implemented with: double MyOwnNewKernel ::Evaluate ( const svm_node * x, const svm_node * y, const svm_parameter & param ) const { return this->dot(x,y); } This simple example shows that the classical dot product is already implemented into otb::GenericKernelFunctorBase::dot() as a protected function. • The Update() function which synchronizes local variables and their integration into the initial SVM procedure. The following examples will show the way to use it. Some pre-defined generic kernels have already been implemented in OTB: • otb::MixturePolyRBFKernelFunctor which implements a linear mixture of a polynomial and a RBF kernel; • otb::NonGaussianRBFKernelFunctor which implements a non gaussian RBF kernel; • otb::SpectralAngleKernelFunctor, a kernel that integrates the Spectral Angle, instead of the Euclidean distance, into an inverse multiquadric kernel. This kernel may be appropriated when using multispectral data. • otb::ChangeProfileKernelFunctor, a kernel which is dedicated to the supervized classi- fication of the multiscale change profile presented in section 21.5.1. Learning with User Defined Kernels The source code for this example can be found in the file Examples/Learning/SVMGenericKernelImageModelEstimatorExample.cxx. This example illustrates the modifications to be added to the use of otb::SVMImageModelEstimator in order to add a user defined kernel. This initial program has been explained in section 19.3.4. The first thing to do is to include the header file for the new kernel.
#include "otbSVMImageModelEstimator.h"
#include "otbMixturePolyRBFKernelFunctor.h"

Once the otb::SVMImageModelEstimator is instantiated, it is possible to add the new kernel and its parameters. In addition to the initial code:

EstimatorType::Pointer svmEstimator = EstimatorType::New();
svmEstimator->SetSVMType(C_SVC);
svmEstimator->SetInputImage(inputReader->GetOutput());
svmEstimator->SetTrainingImage(trainingReader->GetOutput());

the kernel has to be instantiated. The kernel used here is a linear combination of a polynomial kernel and an RBF one. It is written as

$\mu\, k_1(x,y) + (1 - \mu)\, k_2(x,y)$ with $k_1(x,y) = (\gamma_1\, x \cdot y + c_0)^d$ and $k_2(x,y) = \exp(-\gamma_2 \|x - y\|^2)$.

The specific parameters of this kernel are therefore:

• Mixture ($\mu$),
• GammaPoly ($\gamma_1$),
• CoefPoly ($c_0$),
• DegreePoly ($d$),
• GammaRBF ($\gamma_2$).

They are set through the SetValue function.

otb::MixturePolyRBFKernelFunctor myKernel;
myKernel.SetValue("Mixture", 0.5);
myKernel.SetValue("GammaPoly", 1.0);
myKernel.SetValue("CoefPoly", 0.0);
myKernel.SetValue("DegreePoly", 1);
myKernel.SetValue("GammaRBF", 1.5);
myKernel.Update();

Once the kernel's parameters are set and the kernel updated, it is connected to the otb::SVMImageModelEstimator.

svmEstimator->SetKernelFunctor(&myKernel);
svmEstimator->SetKernelType(GENERIC);

The model estimation procedure is triggered by calling the estimator's Update method.
  • 546. 512 Chapter 19. Classification svmEstimator ->Update(); The rest of the code remains unchanged... svmEstimator ->SaveModel (outputModelFileName); In the file outputModelFileName a specific line will appear when using a generic kernel. It gives the name of the kernel and its parameters name and value. Classification with user defined kernel The source code for this example can be found in the file Examples/Learning/SVMGenericKernelImageClassificationExample.cxx. This example illustrates the modifications to be added to use the otb::SVMClassifier class for performing SVM classification on images with a user-defined kernel. In this example, we will use an SVM model estimated in the previous section to separate between water and non-water pixels by using the RGB values only. The first thing to do is include the header file for the class as well as the header of the appropriated kernel to be used. #include "otbSVMClassifier.h" #include "otbMixturePolyRBFKernelFunctor.h" We need to declare the SVM model which is to be used by the classifier. The SVM model is templated over the type of value used for the measures and the type of pixel used for the labels. typedef otb::SVMModel <PixelType , LabelPixelType > ModelType ; ModelType ::Pointer model = ModelType ::New (); After instantiation, we can load a model saved to a file (see section 19.3.4 for an example of model estimation and storage to a file). When using a user defined kernel, an explicit instanciation has to be performed. otb:: MixturePolyRBFKernelFunctor myKernel ; model ->SetKernelFunctor(&myKernel ); Then, the rest of the classification program remains unchanged. model ->LoadModel (modelFilename);
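To summarize this subsection, here is a minimal sketch of what a user-defined kernel could look like with this (deprecated) interface. Only the Evaluate() signature and the protected dot() helper mentioned earlier are taken from the guide; the class name, the tanh-based kernel and the way its parameters are stored are illustrative assumptions to be checked against the OTB headers:

#include <cmath>
#include "otbGenericKernelFunctorBase.h"

// Hypothetical sigmoid-like kernel: k(x, y) = tanh(gamma * <x, y> + c0).
class MySigmoidKernelFunctor : public otb::GenericKernelFunctorBase
{
public:
  MySigmoidKernelFunctor() : m_Gamma(1.0), m_C0(0.0) {}

  // Behavior of the kernel itself; dot() is the protected helper of the base class.
  virtual double Evaluate(const svm_node* x, const svm_node* y,
                          const svm_parameter& /*param*/) const
  {
    return std::tanh(m_Gamma * this->dot(x, y) + m_C0);
  }

  // Synchronizes local variables before the SVM procedure uses the kernel.
  virtual void Update() {}

private:
  double m_Gamma;
  double m_C0;
};

Such a functor would then be plugged into the estimator or the model exactly like the otb::MixturePolyRBFKernelFunctor above, through SetKernelFunctor() and the GENERIC kernel type.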
19.4 Fusion of Classification maps

19.4.1 General approach of image fusion

In order to obtain a relevant image classification, it is sometimes necessary to fuse several classification maps coming from different classification methods (SVM, KNN, Random Forests, Artificial Neural Networks, ...). The fusion combines these maps into a single, more robust and more precise one. Two fusion methods are available in OTB: majority voting and the Dempster Shafer framework.

19.4.2 Majority voting

General description

For each input pixel, the Majority Voting method chooses the most frequent class label among all the classification maps to fuse. When the most frequent class label is not unique, the undecided value is set for such pixels in the fused output image.

An example of majority voting fusion

The source code for this example can be found in the file Examples/Classification/MajorityVotingFusionOfClassificationMapsExample.cxx.

The Majority Voting fusion filter used here, itk::LabelVotingImageFilter, comes from ITK. For each pixel, it chooses the most frequent class label among the input classification maps. When the most frequent class label is not unique, the output pixel is set to the undecidedLabel value. We start by including the appropriate header file.

#include "itkLabelVotingImageFilter.h"

We will assume unsigned short type input labeled images.

const unsigned int Dimension = 2;
typedef unsigned short LabelPixelType;

The input labeled images to be fused are expected to be scalar images.

typedef otb::Image<LabelPixelType, Dimension> LabelImageType;

The itk::LabelVotingImageFilter is templated over the input and output labeled image types.

// Majority Voting
typedef itk::LabelVotingImageFilter<LabelImageType, LabelImageType>
    LabelVotingFilterType;
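As an illustration of what the fusion filter computes for a single pixel, the following standalone helper (plain C++, not part of OTB) returns the most frequent label among the values taken by that pixel in the input maps, or the undecided label when the maximum frequency is reached by more than one label:

#include <map>
#include <vector>

typedef unsigned short LabelPixelType;

LabelPixelType MajorityLabel(const std::vector<LabelPixelType>& labels,
                             LabelPixelType undecidedLabel)
{
  // Histogram of the labels proposed by the different classification maps.
  std::map<LabelPixelType, unsigned int> histogram;
  for (unsigned int i = 0; i < labels.size(); ++i)
  {
    ++histogram[labels[i]];
  }

  LabelPixelType best = undecidedLabel;
  unsigned int bestCount = 0;
  bool tie = false;
  for (std::map<LabelPixelType, unsigned int>::const_iterator it = histogram.begin();
       it != histogram.end(); ++it)
  {
    if (it->second > bestCount)
    {
      bestCount = it->second;
      best = it->first;
      tie = false;
    }
    else if (it->second == bestCount)
    {
      tie = true;
    }
  }
  return tie ? undecidedLabel : best;
}

For instance, with three maps proposing the labels {2, 2, 5} for a pixel, the fused label is 2; with {2, 5, 7}, no unique majority exists and the pixel receives the undecided label.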
  • 548. 514 Chapter 19. Classification Both reader and writer are defined. Since the images to classify can be very big, we will use a streamed writer which will trigger the streaming ability of the fusion filter. typedef otb::ImageFileReader <LabelImageType > ReaderType ; typedef otb::ImageFileWriter <LabelImageType > WriterType ; The input classification maps to be fused are pushed into the itk::LabelVotingImageFilter. Moreover, the label value for the undecided pixels (in case of not unique majority voting) is set too. ReaderType ::Pointer reader; LabelVotingFilterType::Pointer labelVotingFilter = LabelVotingFilterType::New (); for (unsigned int itCM = 0; itCM < nbClassificationMaps; ++itCM) { std::string fileNameClassifiedImage = argv[itCM + 1]; reader = ReaderType ::New(); reader ->SetFileName (fileNameClassifiedImage); reader ->Update (); labelVotingFilter ->SetInput(itCM , reader ->GetOutput ()); } labelVotingFilter ->SetLabelForUndecidedPixels(undecidedLabel); Once it is plugged the pipeline triggers its execution by updating the output of the writer. WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(labelVotingFilter ->GetOutput ()); writer ->SetFileName (outfname ); writer ->Update(); 19.4.3 Dempster Shafer General description A more adaptive fusion method using the Dempster Shafer theory (http://en.wikipedia.org/wiki/Dempster-Shafer theory) is available within the OTB. This method is adaptive as it is based on the so-called belief function of each class label for each classification map. Thus, each classified pixel is associated to a degree of confidence according to the classifier used. In the Dempster Shafer framework, the expert’s point of view (i.e. with a high belief function) is considered as the truth. In order to estimate the belief function of each class label, we use the Dempster Shafer combination of masses of belief for each class label and for each classification map. In this framework, the output fused label of each pixel is the one with the maximal belief function. Like for the majority voting method, the Dempster Shafer fusion handles not unique class labels with the maximal belief function. In this case, the output fused pixels are set to the undecided value.
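For readers unfamiliar with the theory, the standard Dempster combination rule and belief function are recalled below in their usual textbook form (general Dempster Shafer notation, not a transcription of the OTB implementation):

$$(m_1 \oplus m_2)(A) \;=\; \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \qquad K \;=\; \sum_{B \cap C = \varnothing} m_1(B)\, m_2(C),$$

$$bel(A) \;=\; \sum_{B \subseteq A,\; B \neq \varnothing} m(B),$$

where the $m_i$ are masses of belief (here derived from the confusion matrices of the classifiers), $K$ measures the conflict between the two sources, and the fused label of a pixel is the class $A_i$ maximizing $bel(A_i)$ once the masses from all the classification maps have been combined.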
  • 549. 19.4. Fusion of Classification maps 515 The confidence levels of all the class labels are estimated from a comparision of the classification maps to fuse with a ground truth, which results in a confusion matrix. For each classification maps, these confusion matrices are then used to estimate the mass of belief of each class label. Mathematical formulation of the combination algorithm A description of the mathematical formulation of the Dempster Shafer combination algorithm is available in the following OTB Wiki page: http://wiki.orfeo-toolbox.org/index.php/Information fusion framework. An example of Dempster Shafer fusion The source code for this example can be found in the file Examples/Classification/DempsterShaferFusionOfClassificationMapsExample.cxx. The fusion filter otb::DSFusionOfClassifiersImageFilter is based on the Dempster Shafer (DS) fusion framework. For each pixel, it chooses the class label Ai for which the belief function bel(Ai) is maximal after the DS combination of all the available masses of belief of all the class labels. The masses of belief (MOBs) of all the labels present in each classification map are read from input *.CSV confusion matrix files. Moreover, the pixels into the input classification maps to be fused which are equal to the nodataLabel value are ignored by the fusion process. In case of not unique class labels with the maximal belief function, the output pixels are set to the undecidedLabel value. We start by including the appropriate header files. #include "otbImageListToVectorImageFilter.h" #include "otbConfusionMatrixToMassOfBelief.h" #include "otbDSFusionOfClassifiersImageFilter.h" #include <fstream> We will assume unsigned short type input labeled images. We define a type for confu- sion matrices as itk::VariableSizeMatrix which will be used to estimate the masses of belief of all the class labels for each input classification map. For this purpose, the otb::ConfusionMatrixToMassOfBelief will be used to convert each input confusion matrix into masses of belief for each class label. typedef unsigned short LabelPixelType; typedef unsigned long ConfusionMatrixEltType; typedef itk::VariableSizeMatrix <ConfusionMatrixEltType > ConfusionMatrixType; typedef otb:: ConfusionMatrixToMassOfBelief <ConfusionMatrixType , LabelPixelType > ConfusionMatrixToMassOfBeliefType; typedef ConfusionMatrixToMassOfBeliefType::MapOfClassesType MapOfClassesType; The input labeled images to be fused are expected to be scalar images. const unsigned int Dimension = 2;
  • 550. 516 Chapter 19. Classification typedef otb::Image <LabelPixelType , Dimension > LabelImageType; typedef otb::VectorImage <LabelPixelType , Dimension > VectorImageType; We declare an otb::ImageListToVectorImageFilter which will stack all the input clas- sification maps to be fused as a single VectorImage for which each band is a classifica- tion map. This VectorImage will then be the input of the Dempster Shafer fusion filter otb::DSFusionOfClassifiersImageFilter. typedef otb::ImageList <LabelImageType > LabelImageListType; typedef otb::ImageListToVectorImageFilter <LabelImageListType , VectorImageType > ImageListToVectorImageFilterType; The Dempster Shafer fusion filter otb::DSFusionOfClassifiersImageFilter is declared. // Dempster Shafer typedef otb:: DSFusionOfClassifiersImageFilter <VectorImageType , LabelImageType > DSFusionOfClassifiersImageFilterType; Both reader and writer are defined. Since the images to classify can be very big, we will use a streamed writer which will trigger the streaming ability of the fusion filter. typedef otb::ImageFileReader <LabelImageType > ReaderType ; typedef otb::ImageFileWriter <LabelImageType > WriterType ; The image list of input classification maps is filled. Moreover, the input confusion matrix files are converted into masses of belief. ReaderType ::Pointer reader; LabelImageListType::Pointer imageList = LabelImageListType::New (); ConfusionMatrixToMassOfBeliefType::Pointer confusionMatrixToMassOfBeliefFilter; confusionMatrixToMassOfBeliefFilter = ConfusionMatrixToMassOfBeliefType::New (); MassOfBeliefDefinitionMethod massOfBeliefDef; // Several parameters are available to estimate the masses of belief // from the confusion matrices: PRECISION, RECALL, ACCURACY and KAPPA massOfBeliefDef = ConfusionMatrixToMassOfBeliefType::PRECISION ; VectorOfMapOfMassesOfBeliefType vectorOfMapOfMassesOfBelief; for (unsigned int itCM = 0; itCM < nbClassificationMaps; ++itCM) { std::string fileNameClassifiedImage = argv[itCM + 1]; std::string fileNameConfMat = argv[itCM + 1 + nbClassificationMaps]; reader = ReaderType ::New(); reader ->SetFileName (fileNameClassifiedImage); reader ->Update (); imageList ->PushBack (reader ->GetOutput ()); MapOfClassesType mapOfClassesClk; ConfusionMatrixType confusionMatrixClk;
  • 551. 19.5. Classification map regularization 517 // The data (class labels and confusion matrix values) are read and // extracted from the *.CSV file with an ad-hoc file parser CSVConfusionMatrixFileReader( fileNameConfMat , mapOfClassesClk , confusionMatrixClk); // The parameters of the ConfusionMatrixToMassOfBelief filter are set confusionMatrixToMassOfBeliefFilter->SetMapOfClasses(mapOfClassesClk); confusionMatrixToMassOfBeliefFilter->SetConfusionMatrix(confusionMatrixClk); confusionMatrixToMassOfBeliefFilter->SetDefinitionMethod(massOfBeliefDef); confusionMatrixToMassOfBeliefFilter->Update (); // Vector containing ALL the K (= nbClassificationMaps) std::map<Label, MOB> // of Masses of Belief vectorOfMapOfMassesOfBelief.push_back ( confusionMatrixToMassOfBeliefFilter->GetMapMassOfBelief()); } The image list of input classification maps is converted into a VectorImage to be used as input of the otb::DSFusionOfClassifiersImageFilter. // Image List To VectorImage ImageListToVectorImageFilterType::Pointer imageListToVectorImageFilter; imageListToVectorImageFilter = ImageListToVectorImageFilterType::New(); imageListToVectorImageFilter->SetInput(imageList ); DSFusionOfClassifiersImageFilterType::Pointer dsFusionFilter; dsFusionFilter = DSFusionOfClassifiersImageFilterType::New (); // The parameters of the DSFusionOfClassifiersImageFilter are set dsFusionFilter ->SetInput(imageListToVectorImageFilter->GetOutput ()); dsFusionFilter ->SetInputMapsOfMassesOfBelief(& vectorOfMapOfMassesOfBelief); dsFusionFilter ->SetLabelForNoDataPixels(nodataLabel ); dsFusionFilter ->SetLabelForUndecidedPixels(undecidedLabel); Once it is plugged the pipeline triggers its execution by updating the output of the writer. WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(dsFusionFilter ->GetOutput ()); writer ->SetFileName (outfname ); writer ->Update(); 19.5 Classification map regularization The source code for this example can be found in the file Examples/Classification/ClassificationMapRegularizationExample.cxx. After having generated a classification map, it is possible to regularize such a labeled image in order to obtain more homogeneous areas, which makes the interpretation of its classes easier. For
  • 552. 518 Chapter 19. Classification this purpose, the otb::NeighborhoodMajorityVotingImageFilter was implemented. Like a morphological filter, this filter uses majority voting in a ball shaped neighborhood in order to set each pixel of the classification map to the more representative label value in its neighborhood. In this example we will illustrate its use. We start by including the appropriate header file. #include "otbNeighborhoodMajorityVotingImageFilter.h" Since the input image is a classification map, we will assume a single band input image for which each pixel value is a label coded on 8 bits as an integer between 0 and 255. typedef unsigned char IOLabelPixelType; // 8 bits const unsigned int Dimension = 2; Thus, both input and output images are single band labeled images, which are composed of the same type of pixels in this example (unsigned char). typedef otb::Image <IOLabelPixelType , Dimension > IOLabelImageType; We can now define the type for the neighborhood majority voting filter, which is templated over its input and output images types and over its structuring element type. Choosing only the input image type in the template of this filter induces that, both input and output images types are the same and that the structuring element is a ball ( itk::BinaryBallStructuringElement). // Neighborhood majority voting filter type typedef otb::NeighborhoodMajorityVotingImageFilter<IOLabelImageType > NeighborhoodMajorityVotingFilterType; Since the otb::NeighborhoodMajorityVotingImageFilter is a neighborhood based image fil- ter, it is necessary to set the structuring element which will be used for the majority voting process. By default, the structuring element is a ball ( itk::BinaryBallStructuringElement) with a ra- dius defined by two sizes (respectively along X and Y). Thus, it is possible to handle anisotropic structuring elements such as ovals. // Binary ball Structuring Element type typedef NeighborhoodMajorityVotingFilterType::KernelType StructuringType; typedef StructuringType::RadiusType RadiusType ; Finally, we define the reader and the writer. typedef otb::ImageFileReader <IOLabelImageType > ReaderType ; typedef otb::ImageFileWriter <IOLabelImageType > WriterType ; We instantiate the otb::NeighborhoodMajorityVotingImageFilter and the reader objects. // Neighborhood majority voting filter NeighborhoodMajorityVotingFilterType::Pointer NeighMajVotingFilter; NeighMajVotingFilter = NeighborhoodMajorityVotingFilterType::New (); ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (inputFileName);
  • 553. 19.5. Classification map regularization 519 The ball shaped structuring element seBall is instantiated and its two radii along X and Y are initial- ized. StructuringType seBall; RadiusType rad; rad [0] = radiusX; rad [1] = radiusY; seBall.SetRadius (rad); seBall.CreateStructuringElement(); Then, this ball shaped neighborhood is used as the kernel structuring element for the otb::NeighborhoodMajorityVotingImageFilter. NeighMajVotingFilter ->SetKernel (seBall); Not classified input pixels are assumed to have the noDataValue label and will keep this label in the output image. NeighMajVotingFilter ->SetLabelForNoDataPixels(noDataValue ); Moreover, since the majority voting regularization may lead to not unique majority labels in the neighborhood, it is important to define which behaviour the filter must have in this case. For this purpose, a Boolean parameter is used in the filter to choose whether pixels with more than one majority class are set to undecidedValue (true), or to their Original labels (false = default value) in the output image. NeighMajVotingFilter ->SetLabelForUndecidedPixels(undecidedValue); if (KeepOriginalLabelBoolStr.compare("true") == 0) { NeighMajVotingFilter ->SetKeepOriginalLabelBool(true); } else { NeighMajVotingFilter ->SetKeepOriginalLabelBool(false); } We plug the pipeline and trigger its execution by updating the output of the writer. NeighMajVotingFilter ->SetInput(reader ->GetOutput ()); WriterType ::Pointer writer = WriterType ::New (); writer ->SetFileName (outputFileName); writer ->SetInput(NeighMajVotingFilter ->GetOutput ()); writer ->Update();
CHAPTER TWENTY

OBJECT-BASED IMAGE ANALYSIS

Object-Based Image Analysis (OBIA) focuses on analyzing images at the object level instead of at the pixel level. This approach is particularly well adapted to high resolution images and leads to more robust and less noisy results. OTB implements OBIA through ITK’s Label Object framework (http://www.insight-journal.org/browse/publication/176), which represents a segmented image as a set of regions rather than as a set of pixels. Besides the compression achieved by this kind of description, the main advantage of this approach is the possibility to operate at the segment (or object) level. A classical OBIA pipeline uses the following steps (a minimal sketch of such a pipeline is given after this list):

1. Image segmentation (of the whole image or only parts of it);
2. Image to LabelObjectMap (a kind of std::map<LabelObject>) transformation;
3. Optional relabeling;
4. Attribute computation for the regions, using the image before segmentation:
(a) Shape attributes;
(b) Statistics attributes;
(c) Attributes for radiometry, textures, etc.
5. Object filtering:
(a) Remove/select objects under a condition (area less than X, NDVI higher than X, etc.);
(b) Keep N objects;
(c) etc.
6. LabelObjectMap to image transformation.
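The following sketch illustrates the skeleton of such a pipeline with the label map classes used in the examples of this chapter. It is only an outline under the assumption that a labeled segmentation image is already available on disk and that itk::ShapeOpeningLabelMapFilter is available in the ITK build; the file names, the "PhysicalSize" attribute and the area threshold are illustrative placeholders.

// Outline of a typical OBIA pipeline: labeled image -> label map -> shape
// attributes -> attribute-based opening -> back to a labeled image.
#include "otbImage.h"
#include "otbImageFileReader.h"
#include "otbImageFileWriter.h"
#include "itkShapeLabelObject.h"
#include "itkLabelMap.h"
#include "itkLabelImageToLabelMapFilter.h"
#include "itkShapeLabelMapFilter.h"
#include "itkShapeOpeningLabelMapFilter.h"
#include "itkLabelMapToLabelImageFilter.h"

typedef otb::Image<unsigned long, 2>            LabelImageType;
typedef itk::ShapeLabelObject<unsigned long, 2> LabelObjectType;
typedef itk::LabelMap<LabelObjectType>          LabelMapType;

int main(int argc, char* argv[])
{
  if (argc < 3) return 1;  // argv[1]: labeled segmentation, argv[2]: filtered output

  otb::ImageFileReader<LabelImageType>::Pointer reader = otb::ImageFileReader<LabelImageType>::New();
  reader->SetFileName(argv[1]);                 // step 1 (segmentation) done beforehand

  // Step 2: image -> label map
  typedef itk::LabelImageToLabelMapFilter<LabelImageType, LabelMapType> I2MType;
  I2MType::Pointer i2m = I2MType::New();
  i2m->SetInput(reader->GetOutput());

  // Step 4a: shape attributes for every region
  typedef itk::ShapeLabelMapFilter<LabelMapType> ShapeType;
  ShapeType::Pointer shape = ShapeType::New();
  shape->SetInput(i2m->GetOutput());

  // Step 5: remove small regions (attribute-based opening on the physical size)
  typedef itk::ShapeOpeningLabelMapFilter<LabelMapType> OpeningType;
  OpeningType::Pointer opening = OpeningType::New();
  opening->SetInput(shape->GetOutput());
  opening->SetAttribute("PhysicalSize");        // attribute name assumed for illustration
  opening->SetLambda(100.0);                    // placeholder threshold

  // Step 6: label map -> labeled image
  typedef itk::LabelMapToLabelImageFilter<LabelMapType, LabelImageType> M2IType;
  M2IType::Pointer m2i = M2IType::New();
  m2i->SetInput(opening->GetOutput());

  otb::ImageFileWriter<LabelImageType>::Pointer writer = otb::ImageFileWriter<LabelImageType>::New();
  writer->SetFileName(argv[2]);
  writer->SetInput(m2i->GetOutput());
  writer->Update();
  return 0;
}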
  • 556. 522 Chapter 20. Object-based Image Analysis 20.1 From Images to Objects The source code for this example can be found in the file Examples/OBIA/ImageToLabelToImage.cxx. This example shows the basic approach for the transformation of a segmented (labeled) image into a LabelObjectMap and then back to an image. For this matter we will need the following header files which contain the basic classes. #include "itkLabelObject.h" #include "itkLabelMap .h" #include "itkBinaryImageToLabelMapFilter.h" #include "itkLabelMapToLabelImageFilter.h" The image types are defined using pixel types and dimension. The input image is defined as an otb::Image. const int dim = 2; typedef unsigned short PixelType ; typedef otb::Image <PixelType , dim> ImageType ; typedef itk::LabelObject <PixelType , dim> LabelObjectType; typedef itk::LabelMap <LabelObjectType > LabelMapType; As usual, the reader is instantiated and the input image is set. typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (argv[1]); Then the binary image is transformed to a collection of label objects. Arguments are: • FullyConnected: Set whether the connected components are defined strictly by face connec- tivity or by face+edge+vertex connectivity. Default is FullyConnectedOff. • InputForegroundValue/OutputBackgroundValue: Specify the pixel value of input/output of the foreground/background. typedef itk::BinaryImageToLabelMapFilter<ImageType , LabelMapType > I2LType; I2LType ::Pointer i2l = I2LType ::New (); i2l->SetInput (reader ->GetOutput ()); i2l->SetFullyConnected(atoi(argv[5])); i2l->SetInputForegroundValue(atoi(argv[6])); i2l->SetOutputBackgroundValue(atoi(argv[7])); Then the inverse process is used to recreate a image of labels. The itk::LabelMapToLabelImageFilter converts a LabelMap to a labeled image.
Figure 20.1: Transforming an image (left) into a label object map and back to an image (right).

typedef itk::LabelMapToLabelImageFilter<LabelMapType, ImageType> L2IType;
L2IType::Pointer l2i = L2IType::New();
l2i->SetInput(i2l->GetOutput());

The output can be passed to a writer. The invocation of the Update() method on the writer triggers the execution of the pipeline.

typedef otb::ImageFileWriter<ImageType> WriterType;
WriterType::Pointer writer = WriterType::New();
writer->SetInput(l2i->GetOutput());
writer->SetFileName(argv[2]);
writer->Update();

Figure 20.1 shows the effect of transforming an image into a label object map and back to an image.

20.2 Object Attributes

The source code for this example can be found in the file Examples/OBIA/ShapeAttributeComputation.cxx.

This basic example shows how to compute shape attributes at the object level. The input image is first converted into a set of regions (itk::ShapeLabelObject), some attribute values of each object are computed and then saved to an ASCII file.

#include "itkShapeLabelObject.h"
#include "itkLabelMap.h"
#include "itkLabelImageToLabelMapFilter.h"
#include "itkShapeLabelMapFilter.h"

The image types are defined using pixel types and dimensions. The input image is defined as an otb::Image.
  • 558. 524 Chapter 20. Object-based Image Analysis const int dim = 2; typedef unsigned long PixelType ; typedef otb::Image <PixelType , dim> ImageType ; typedef unsigned long LabelType ; typedef itk::ShapeLabelObject <LabelType , dim> LabelObjectType; typedef itk::LabelMap <LabelObjectType > LabelMapType; typedef itk::LabelImageToLabelMapFilter <ImageType , LabelMapType > ConverterType; Firstly, the image reader is instantiated. typedef otb::ImageFileReader <ImageType > ReaderType ; ReaderType ::Pointer reader = ReaderType ::New (); reader ->SetFileName (argv[1]); Here the itk::ShapeLabelObject type is chosen in order to read some attributes re- lated to the shape of the objects, by opposition to the content of the object, with the itk::StatisticsLabelObject. typedef itk::ShapeLabelMapFilter <LabelMapType > ShapeFilterType; The input image is converted in a collection of objects ConverterType::Pointer converter = ConverterType::New(); converter ->SetInput (reader ->GetOutput ()); converter ->SetBackgroundValue(itk::NumericTraits <LabelType >:: min ()); ShapeFilterType::Pointer shape = ShapeFilterType::New(); shape ->SetInput(converter ->GetOutput ()); Update the shape filter, so its output will be up to date. shape ->Update(); Then, we can read the attribute values we’re interested in. The itk::BinaryImageToShapeLabelMapFilter produces consecutive labels, so we can use a for loop and GetLabelObject() method to retrieve the label objects. If the labels are not consecutive, the GetNthLabelObject() method must be use instead of GetLabelObject(), or an iterator on the label object container of the label map. In this example, we write 2 shape attributes of each object to a text file (the size and the centroid coordinates). std::ofstream outfile(argv[2]); LabelMapType::Pointer labelMap = shape ->GetOutput (); for (unsigned long label = 1; label <= labelMap ->GetNumberOfLabelObjects(); label++) { // We don’t need a SmartPointer of the label object here,
// because the reference is kept in the label map.
const LabelObjectType * labelObject = labelMap->GetLabelObject(label);
outfile << label << "\t" << labelObject->GetPhysicalSize()
        << "\t" << labelObject->GetCentroid() << std::endl;
}
outfile.close();

20.3 Object Filtering based on radiometric and statistics attributes

The source code for this example can be found in the file Examples/OBIA/RadiometricAttributesLabelMapFilterExample.cxx.

This example shows the basic approach to perform object-based analysis on an image. The input image is first segmented using the otb::MeanShiftImageFilter. Then each segmented region is converted to a map of label objects. Afterwards, the otb::MultiChannelRAndNIRIndexImageFilter computes a radiometric feature for each object; in this example the NDVI is computed. The computed feature is passed to the otb::BandsStatisticsAttributesLabelMapFilter, which computes statistics over the resulting band. The statistics of a region over each band can then be accessed by concatenating STATS, the band number and the statistical attribute, separated by double colons. In this example the mean of the first band (which contains the NDVI) is accessed for all the regions with the attribute 'STATS::Band1::Mean'.

First, the input image is segmented using the Mean Shift algorithm (see section 8.7.2 for a deeper explanation).

typedef otb::MeanShiftVectorImageFilter<VectorImageType, VectorImageType, LabeledImageType> FilterType;

FilterType::Pointer filter = FilterType::New();
filter->SetSpatialRadius(spatialRadius);
filter->SetRangeRadius(rangeRadius);
filter->SetMinimumRegionSize(minRegionSize);
filter->SetScale(scale);

The otb::MeanShiftImageFilter is then connected to the input reader.

filter->SetInput(vreader->GetOutput());

The itk::LabelImageToLabelMapFilter type is instantiated using the output of the otb::MeanShiftImageFilter. The mean shift filter produces a labeled image in which each segmented region has a unique label.

LabelMapFilterType::Pointer labelMapFilter = LabelMapFilterType::New();
labelMapFilter->SetInput(filter->GetLabeledClusteredOutput());
labelMapFilter->SetBackgroundValue(itk::NumericTraits<LabelType>::min());
ShapeLabelMapFilterType::Pointer shapeLabelMapFilter = ShapeLabelMapFilterType::New();
shapeLabelMapFilter->SetInput(labelMapFilter->GetOutput());

Instantiate the otb::BandsStatisticsAttributesLabelMapFilter to compute statistics of the feature image on each label object.

RadiometricLabelMapFilterType::Pointer radiometricLabelMapFilter = RadiometricLabelMapFilterType::New();

The feature image can be any of the following indices:

• GEMI
• NDVI
• IR
• IC
• IB
• NDWI2
• Intensity

The input image must first be converted to the desired index. In our case, statistics are computed on the NDVI values of each label object.

NDVIImageFilterType::Pointer ndviImageFilter = NDVIImageFilterType::New();
ndviImageFilter->SetRedIndex(3);
ndviImageFilter->SetNIRIndex(4);
ndviImageFilter->SetInput(vreader->GetOutput());

ImageToVectorImageCastFilterType::Pointer ndviVectorImageFilter = ImageToVectorImageCastFilterType::New();
ndviVectorImageFilter->SetInput(ndviImageFilter->GetOutput());

radiometricLabelMapFilter->SetInput(shapeLabelMapFilter->GetOutput());
radiometricLabelMapFilter->SetFeatureImage(ndviVectorImageFilter->GetOutput());

The otb::AttributesMapOpeningLabelMapFilter will perform the selection. It has three parameters: AttributeName specifies the radiometric attribute, Lambda controls the thresholding of the input, and ReverseOrdering makes the filter remove the objects with an attribute value greater than Lambda instead.
OpeningLabelMapFilterType::Pointer opening = OpeningLabelMapFilterType::New();
opening->SetInput(radiometricLabelMapFilter->GetOutput());
opening->SetAttributeName(attr);
opening->SetLambda(thresh);
opening->SetReverseOrdering(lowerThan);
opening->Update();

Then, the selected label objects are transformed into a label image using the itk::LabelMapToLabelImageFilter.

LabelMapToBinaryImageFilterType::Pointer labelMap2LabeledImage = LabelMapToBinaryImageFilterType::New();
labelMap2LabeledImage->SetInput(opening->GetOutput());

And finally, we declare the writer and call its Update() method to trigger the full pipeline execution.

WriterType::Pointer writer = WriterType::New();
writer->SetFileName(outfname);
writer->SetInput(labelMap2LabeledImage->GetOutput());
writer->Update();

Figure 20.2 shows the result of applying the object selection based on radiometric attributes.

Figure 20.2: Vegetation mask resulting from processing.
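Instead of (or in addition to) writing an image, one may want to inspect the computed radiometric statistics directly. The snippet below is a sketch, not part of the original example: it assumes that the label objects of this example are otb::AttributesMapLabelObject instances exposing a GetAttribute() accessor, and that LabelMapType and LabelObjectType are the corresponding typedefs; the attribute key 'STATS::Band1::Mean' is the one introduced above.

// Sketch: print the mean NDVI ('STATS::Band1::Mean') of every remaining object.
LabelMapType::Pointer labelMap = opening->GetOutput();
for (unsigned int i = 0; i < labelMap->GetNumberOfLabelObjects(); ++i)
  {
  const LabelObjectType* lo = labelMap->GetNthLabelObject(i);
  std::cout << "Object " << lo->GetLabel()
            << " mean NDVI = " << lo->GetAttribute("STATS::Band1::Mean")
            << std::endl;
  }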
  • 562. 528 Chapter 20. Object-based Image Analysis 20.4 Hoover metrics to compare segmentations The source code for this example can be found in the file Examples/OBIA/HooverMetricsEstimation.cxx. The following example shows how to compare two segmentations, using Hoover metrics. For in- stance, it can be used to compare a segmentation produced by your algorithm against a partial ground truth segmentation. In this example, the ground truth segmentation will be refered by the letters GT whereas the machine segmentation will be refered by MS. The estimation of Hoover metrics is done with two filters : otb::HooverMatrixFilter and otb::HooverInstanceFilter. The first one produces a matrix containing the number of over- lapping pixels between MS regions and GT regions. The second one classifies each region among four types (called Hoover instances): • Correct detection : a region is matched with an other one in the opposite segmentation, be- cause they cover nearly the same area. • Over-segmentation : a GT region is matched with a group of MS regions because they cover nearly the same area. • Under-segmentation : a MS region is matched with a group of GT regions because they cover nearly the same area. • Missed detection (for GT regions) or Noise (for MS region) : un-matched regions. Note that a region can be tagged with two types. When the Hoover instance have been found, the instance filter computes overall scores for each category : they are the Hoover metrics 1. #include "otbHooverMatrixFilter.h" #include "otbHooverInstanceFilter.h" #include "otbLabelMapToAttributeImageFilter.h" The filters otb::HooverMatrixFilter and otb::HooverInstanceFilter are designed to han- dle itk::LabelMap images, made with otb::AttributesMapLabelObject. This type of label object allows to store generic attributes. Each region can store a set of attributes: in this case, Hoover instances and metrics will be stored. typedef otb::AttributesMapLabelObject<unsigned int, 2, float> LabelObjectType; typedef itk::LabelMap <LabelObjectType > LabelMapType; typedef otb::HooverMatrixFilter <LabelMapType > HooverMatrixFilterType; typedef otb::HooverInstanceFilter <LabelMapType > InstanceFilterType; typedef otb::Image <unsigned int, 2> ImageType ; typedef itk::LabelImageToLabelMapFilter <ImageType , LabelMapType > ImageToLabelMapFilterType; 1see http://www.trop.mips.uha.fr/pdf/ORASIS-2009.pdf
  • 563. 20.4. Hoover metrics to compare segmentations 529 typedef otb::VectorImage <float, 2> VectorImageType; typedef otb:: LabelMapToAttributeImageFilter <LabelMapType , VectorImageType > AttributeImageFilterType; The first step is to convert the images to label maps : we use itk::LabelImageToLabelMapFilter. The background value sets the label value of regions considered as background: there is no label object for the background region. ImageToLabelMapFilterType::Pointer gt_filter = ImageToLabelMapFilterType::New (); gt_filter ->SetInput (gt_reader ->GetOutput ()); gt_filter ->SetBackgroundValue(0); ImageToLabelMapFilterType::Pointer ms_filter = ImageToLabelMapFilterType::New (); ms_filter ->SetInput (ms_reader ->GetOutput ()); ms_filter ->SetBackgroundValue(0); The Hoover matrix filter has to be updated here. This matrix must be computed before being given to the instance filter. HooverMatrixFilterType::Pointer hooverFilter = HooverMatrixFilterType::New(); hooverFilter ->SetGroundTruthLabelMap(gt_filter ->GetOutput ()); hooverFilter ->SetMachineSegmentationLabelMap(ms_filter ->GetOutput ()); hooverFilter ->Update(); The instance filter computes the Hoover metrics for each region. These metrics are stored as at- tributes in each label object. The threshold parameter corresponds to the overlapping ratio above which two regions can be matched. The extended attributes can be used if the user wants to keep a trace of the associations between MS and GT regions : i.e. if a GT region has been matched as a correct detection, it will carry an attribute containing the label value of the associated MS region (the same principle goes for other types of instance). InstanceFilterType::Pointer instances = InstanceFilterType::New (); instances ->SetGroundTruthLabelMap(gt_filter ->GetOutput ()); instances ->SetMachineSegmentationLabelMap(ms_filter ->GetOutput ()); instances ->SetThreshold(0.75); instances ->SetHooverMatrix(hooverFilter ->GetHooverConfusionMatrix()); instances ->SetUseExtendedAttributes(false); The otb::LabelMapToAttributeImageFilter is designed to extract attributes values from a la- bel map and output them in the channels of a vector image. We set the attribute to plot in each channel. AttributeImageFilterType::Pointer attributeImageGT = AttributeImageFilterType::New (); attributeImageGT ->SetInput (instances ->GetOutputGroundTruthLabelMap()); attributeImageGT ->SetAttributeForNthChannel(0, InstanceFilterType::GetNameFromAttribute(Inst attributeImageGT ->SetAttributeForNthChannel(1, InstanceFilterType::GetNameFromAttribute(Inst attributeImageGT ->SetAttributeForNthChannel(2, InstanceFilterType::GetNameFromAttribute(Inst attributeImageGT ->SetAttributeForNthChannel(3, InstanceFilterType::GetNameFromAttribute(Inst WriterType ::Pointer writer = WriterType ::New (); writer ->SetInput(attributeImageGT ->GetOutput ()); writer ->SetFileName (argv[3]);
writer->Update();

The output image contains, for each GT region, its correct detection score ("RC", band 1), its over-segmentation score ("RF", band 2), its under-segmentation score ("RA", band 3) and its missed detection score ("RM", band 4).

std::cout << "Mean RC =" << instances->GetMeanRC() << std::endl;
std::cout << "Mean RF =" << instances->GetMeanRF() << std::endl;
std::cout << "Mean RA =" << instances->GetMeanRA() << std::endl;
std::cout << "Mean RM =" << instances->GetMeanRM() << std::endl;
std::cout << "Mean RN =" << instances->GetMeanRN() << std::endl;

The Hoover scores are also computed for the segmentations as a whole. The score names read as follows: C = correct, F = fragmentation, A = aggregation, M = missed, N = noise.
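Beyond these global means, the per-region scores can also be inspected, since the instance filter stores them as attributes of each label object. The snippet below is only a sketch: it relies on the InstanceFilterType::GetNameFromAttribute() accessor used in the (truncated) calls above, and the enumerator name ATTRIBUTE_RC is an illustrative assumption.

// Sketch: read the per-region correct-detection score (RC) stored by the
// instance filter in each ground-truth label object.
LabelMapType::Pointer gtMap = instances->GetOutputGroundTruthLabelMap();
const std::string rcName =
  InstanceFilterType::GetNameFromAttribute(InstanceFilterType::ATTRIBUTE_RC); // assumed enumerator
for (unsigned int i = 0; i < gtMap->GetNumberOfLabelObjects(); ++i)
  {
  const LabelObjectType* region = gtMap->GetNthLabelObject(i);
  std::cout << "GT region " << region->GetLabel()
            << " RC = " << region->GetAttribute(rcName.c_str()) << std::endl;
  }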
  • 565. CHAPTER TWENTYONE CHANGE DETECTION 21.1 Introduction Change detection techniques try to detect and locate areas which have changed between two or more observations of the same scene. These changes can be of different types, with different origins and of different temporal length. This allows to distinguish different kinds of applications: • land use monitoring, which corresponds to the characterization of the evolution of the vege- tation, or its seasonal changes; • natural resources management, which corresponds mainly to the characterization of the evo- lution of the urban areas, the evolution of the deforestation, etc. • damage mapping, which corresponds to the location of damages caused by natural or indus- trial disasters. From the point of view of the observed phenomena, one can distinguish 2 types of changes whose nature is rather different: the abrupt changes and the progressive changes, which can eventually be periodic. From the data point of view, one can have: • Image pairs before and after the event. The applications are mainly the abrupt changes. • Multi-temporal image series on which 2 types on changes may appear: – The slow changes like for instance the erosion, vegetation evolution, etc. The knowledge of the studied phenomena and of their consequences on the geometrical and radiomet- rical evolution at the different dates is a very important information for this kind of analysis. – The abrupt changes may pose different kinds of problems depending on whether the date of the change is known in the image series or not. The detection of areas affected by a change occurred at a known date may exploit this a priori information in order to split
  • 566. 532 Chapter 21. Change Detection the image series into two sub-series (before an after) and use the temporal redundancy in order to improve the detection results. On the other hand, when the date of the change is not known, the problem has a higher difficulty. From this classification of the different types of problems, one can infer 4 cases for which one can look for algorithms as a function of the available data: 1. Abrupt changes in an image pair. This is no doubt the field for which more work has been done. One can find tools at the 3 classical levels of image processing: data level (differences, ratios, with or without pre-filtering, etc.), feature level (edges, targets, etc.), and interpretation level (post-classification comparison). 2. Abrupt changes within an image series and a known date. One can rely on bi-date techniques, either by fusing the images into 2 stacks (before and after), or by fusing the results obtained by different image couples (one after and one before the event). One can also use specific discontinuity detection techniques to be applied in the temporal axis. 3. Abrupt changes within an image series and an unknown date. This case can be seen either as a generalization of the preceding one (testing the N-1 positions for N dates) or as a particular case of the following one. 4. Progressive changes within an image series. One can work in two steps: (a) detect the change areas using stability criteria in the temporal areas; (b) identify the changes using prior information about the type of changes of interest. 21.1.1 Surface-based approaches In this section we discuss about the damage assessment techniques which can be applied when only two images (before/after) are available. As it has been shown in recent review works [31, 92, 116, 118], a relatively high number of methods exist, but most of them have been developed for optical and infrared sensors. Only a few recent works on change detection with radar images exist [129, 17, 104, 69, 39, 12, 71]. However, the intrinsic limits of passive sensors, mainly related to their dependence on meteorological and illumi- nation conditions, impose severe constraints for operational applications. The principal difficulties related to change detection are of four types: 1. In the case of radar images, the speckle noise makes the image exploitation difficult. 2. The geometric configuration of the image acquisition can produce images which are difficult to compare.
3. Also, the temporal gap between the two acquisitions, and thus the sensor aging and the inter-calibration, are sources of variability which are difficult to deal with.

4. Finally, the normal evolution of the observed scenes must not be confused with the changes of interest.

The problem of detecting abrupt changes between a pair of images is the following: let I1, I2 be two images acquired at different dates t1, t2; we aim at producing a thematic map which shows the areas where changes have taken place. Three main categories of methods exist:

• Strategy 1: Post Classification Comparison
The principle of this approach [35] is to obtain two land-use maps independently, one for each date, and to compare them.

• Strategy 2: Joint classification
This method consists in producing the change map directly from a joint classification of both images.

• Strategy 3: Simple detectors
The last approach consists in producing an image of change likelihood (by differences, ratios or any other approach) and thresholding it in order to produce the change map.

Because of its simplicity and its low computation overhead, the third strategy is the one which has been chosen for the processing presented here.

21.2 Change Detection Framework

The source code for this example can be found in the file Examples/ChangeDetection/ChangeDetectionFrameworkExample.cxx.

This example illustrates the Change Detector framework implemented in OTB. This framework uses the generic programming approach. All change detection filters are otb::BinaryFunctorNeighborhoodImageFilters, that is, filters taking two images as input and providing one image as output. The change detection computation itself is performed on the neighborhood of each pixel of the input images.

The first step required to build a change detection filter is to include the header of the parent class.

#include "otbBinaryFunctorNeighborhoodImageFilter.h"
  • 568. 534 Chapter 21. Change Detection The change detection operation itself is one of the templates of the change detection filters and takes the form of a function, that is, something accepting the syntax foo(). This can be implemented using classical C/C++ functions, but it is preferable to implement it using C++ functors. These are classical C++ classes which overload the () operator. This allows to use them with the same syntax as C/C++ functions. Since change detectors operate on neighborhoods, the functor call will take 2 arguments which are itk::ConstNeighborhoodIterators. The change detector functor is templated over the types of the input iterators and the output result type. The core of the change detection is implemented in the operator() section. template<class TInput1 , class TInput2, class TOutput> class MyChangeDetector { public: // The constructor and destructor. MyChangeDetector() {} ˜MyChangeDetector() {} // Change detection operation inline TOutput operator ()(const TInput1& itA, const TInput2& itB) { TOutput result = 0.0; for (unsigned long pos = 0; pos < itA.Size(); ++pos) { result += static_cast<TOutput >(itA.GetPixel (pos) - itB.GetPixel (pos )); } return static_cast<TOutput >(result / itA.Size()); } }; The interest of using functors is that complex operations can be performed using internal protected class methods and that class variables can be used to store information so different pixel locations can access to results of previous computations. The next step is the definition of the change detector filter. As stated above, this filter will inherit from otb::BinaryFunctorNeighborhoodImageFilter which is templated over the 2 input im- age types, the output image type and the functor used to perform the change detection operation. Inside the class only a few typedefs and the constructors and destructors have to be declared. template <class TInputImage1 , class TInputImage2 , class TOutputImage > class ITK_EXPORT MyChangeDetectorImageFilter : public otb:: BinaryFunctorNeighborhoodImageFilter< TInputImage1 , TInputImage2 , TOutputImage , MyChangeDetector < typename itk::ConstNeighborhoodIterator<TInputImage1 >, typename itk::ConstNeighborhoodIterator<TInputImage2 >,
  • 569. 21.2. Change Detection Framework 535 typename TOutputImage::PixelType > > { public: /** Standard class typedefs. */ typedef MyChangeDetectorImageFilter Self; typedef typename otb::BinaryFunctorNeighborhoodImageFilter< TInputImage1 , TInputImage2 , TOutputImage , MyChangeDetector < typename itk::ConstNeighborhoodIterator<TInputImage1 >, typename itk::ConstNeighborhoodIterator<TInputImage2 >, typename TOutputImage::PixelType > > Superclass ; typedef itk::SmartPointer <Self > Pointer; typedef itk::SmartPointer <const Self > ConstPointer; /** Method for creation through the object factory. */ itkNewMacro (Self); protected: MyChangeDetectorImageFilter() {} virtual ˜MyChangeDetectorImageFilter() {} private: MyChangeDetectorImageFilter(const Self &); //purposely not implemented void operator =(const Self&); //purposely not implemented }; Pay attention to the fact that no .txx file is needed, since filtering operation is implemented in the otb::BinaryFunctorNeighborhoodImageFilter class. So all the algorithmics part is inside the functor. We can now write a program using the change detector. As usual, we start by defining the image types. The internal computations will be performed with floating point precision, while the output image will be stored using one byte per pixel. typedef float InternalPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InternalPixelType , Dimension > InputImageType1; typedef otb::Image <InternalPixelType , Dimension > InputImageType2; typedef otb::Image <InternalPixelType , Dimension > ChangeImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; We declare the readers, the writer, but also the itk::RescaleIntensityImageFilter which will be used to rescale the result before writing it to a file. typedef otb::ImageFileReader <InputImageType1 > ReaderType1 ; typedef otb::ImageFileReader <InputImageType2 > ReaderType2 ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; typedef itk::RescaleIntensityImageFilter<ChangeImageType ,
  • 570. 536 Chapter 21. Change Detection OutputImageType > RescalerType; The next step is declaring the filter for the change detection. typedef MyChangeDetectorImageFilter<InputImageType1 , InputImageType2 , ChangeImageType > FilterType ; We connect the pipeline. reader1 ->SetFileName (inputFilename1); reader2 ->SetFileName (inputFilename2); writer ->SetFileName (outputFilename); rescaler ->SetOutputMinimum(itk::NumericTraits <OutputPixelType >:: min()); rescaler ->SetOutputMaximum(itk::NumericTraits <OutputPixelType >:: max()); filter ->SetInput1 (reader1->GetOutput ()); filter ->SetInput2 (reader2->GetOutput ()); filter ->SetRadius (atoi(argv[3])); rescaler ->SetInput (filter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); And that is all. 21.3 Simple Detectors 21.3.1 Mean Difference The simplest change detector is based on the pixel-wise differencing of image values: ID(i, j) = I2(i, j)− I1(i, j). (21.1) In order to make the algorithm robust to noise, one actually uses local means instead of pixel values. The source code for this example can be found in the file Examples/ChangeDetection/DiffChDet.cxx. This example illustrates the class otb::MeanDifferenceImageFilter for detecting changes be- tween pairs of images. This filter computes the mean intensity in the neighborhood of each pixel of the pair of images to be compared and uses the difference of means as a change indicator. This example will use the images shown in figure 21.1. These correspond to the near infrared band of two Spot acquisitions before and during a flood. We start by including the corresponding header file. #include "otbMeanDifferenceImageFilter.h"
Figure 21.1: Images used for the change detection. Left: Before the flood. Right: during the flood.

We start by declaring the types for the two input images, the change image and the image to be stored in a file for visualization.

typedef float InternalPixelType;
typedef unsigned char OutputPixelType;
typedef otb::Image<InternalPixelType, Dimension> InputImageType1;
typedef otb::Image<InternalPixelType, Dimension> InputImageType2;
typedef otb::Image<InternalPixelType, Dimension> ChangeImageType;
typedef otb::Image<OutputPixelType, Dimension> OutputImageType;

We can now declare the types for the readers and the writer.

typedef otb::ImageFileReader<InputImageType1> ReaderType1;
typedef otb::ImageFileReader<InputImageType2> ReaderType2;
typedef otb::ImageFileWriter<OutputImageType> WriterType;

The change detector will give positive and negative values depending on the sign of the difference. We are usually interested only in the absolute value of the difference. For this purpose, we will use the itk::AbsImageFilter. Also, before saving the image to a file in, for instance, PNG format, we will rescale the results of the change detection in order to use the full range of the output pixel type.

typedef itk::AbsImageFilter<ChangeImageType, ChangeImageType> AbsType;
typedef itk::RescaleIntensityImageFilter<ChangeImageType, OutputImageType> RescalerType;
  • 572. 538 Chapter 21. Change Detection The otb::MeanDifferenceImageFilter is templated over the types of the two input images and the type of the generated change image. typedef otb::MeanDifferenceImageFilter< InputImageType1 , InputImageType2 , ChangeImageType > FilterType ; The different elements of the pipeline can now be instantiated. ReaderType1 ::Pointer reader1 = ReaderType1 ::New (); ReaderType2 ::Pointer reader2 = ReaderType2 ::New (); WriterType ::Pointer writer = WriterType ::New(); FilterType ::Pointer filter = FilterType ::New(); AbsType ::Pointer absFilter = AbsType ::New(); RescalerType::Pointer rescaler = RescalerType::New (); We set the parameters of the different elements of the pipeline. reader1 ->SetFileName (inputFilename1); reader2 ->SetFileName (inputFilename2); writer ->SetFileName (outputFilename); rescaler ->SetOutputMinimum(itk::NumericTraits <OutputPixelType >:: min()); rescaler ->SetOutputMaximum(itk::NumericTraits <OutputPixelType >:: max()); The only parameter for this change detector is the radius of the window used for computing the mean of the intensities. filter ->SetRadius (atoi(argv[4])); We build the pipeline by plugging all the elements together. filter ->SetInput1 (reader1->GetOutput ()); filter ->SetInput2 (reader2->GetOutput ()); absFilter ->SetInput (filter ->GetOutput ()); rescaler ->SetInput (absFilter ->GetOutput ()); writer ->SetInput(rescaler ->GetOutput ()); Since the processing time of large images can be long, it is interesting to monitor the evolution of the computation. In order to do so, the change detectors can use the command/observer design pattern. This is easily done by attaching an observer to the filter. typedef otb::CommandProgressUpdate <FilterType > CommandType ; CommandType ::Pointer observer = CommandType ::New (); filter ->AddObserver (itk::ProgressEvent(), observer ); Figure 21.2 shows the result of the change detection by difference of local means.
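The rescaled difference image is a change likelihood image, not yet a change map. Following the third strategy described in section 21.1.1, a binary change map can be obtained by thresholding it. The snippet below is a sketch, not part of the original example; the threshold value and the output file name are illustrative assumptions to be tuned for the data at hand.

// Sketch: threshold the absolute mean-difference image to obtain a binary
// change map (likelihood image + threshold). Requires
// #include "itkBinaryThresholdImageFilter.h".
typedef itk::BinaryThresholdImageFilter<ChangeImageType, OutputImageType>
  ThresholdFilterType;

ThresholdFilterType::Pointer threshold = ThresholdFilterType::New();
threshold->SetInput(absFilter->GetOutput());
threshold->SetLowerThreshold(30.0);   // illustrative value: likelihood above 30 -> change
threshold->SetUpperThreshold(itk::NumericTraits<InternalPixelType>::max());
threshold->SetInsideValue(255);       // changed pixels
threshold->SetOutsideValue(0);        // unchanged pixels

WriterType::Pointer mapWriter = WriterType::New();
mapWriter->SetFileName("ChangeMap.png");  // illustrative output file name
mapWriter->SetInput(threshold->GetOutput());
mapWriter->Update();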
Figure 21.2: Result of the mean difference change detector.

21.3.2 Ratio Of Means

This detector is similar to the previous one except that it uses a ratio instead of the difference:

IR(i, j) = I2(i, j) / I1(i, j). (21.2)

The use of the ratio makes this detector robust to multiplicative noise, which is a good model for the speckle phenomenon present in radar images. In order to have a bounded and normalized detector, the following expression is actually used:

IR(i, j) = 1 − min{ I2(i, j)/I1(i, j), I1(i, j)/I2(i, j) }. (21.3)

The source code for this example can be found in the file Examples/ChangeDetection/RatioChDet.cxx.

This example illustrates the class otb::MeanRatioImageFilter for detecting changes between pairs of images. This filter computes the mean intensity in the neighborhood of each pixel of the pair of images to be compared and uses the ratio of means as a change indicator. This change indicator is then normalized between 0 and 1 by using the classical

r = 1 − min{ µA/µB, µB/µA }, (21.4)
  • 574. 540 Chapter 21. Change Detection Figure 21.3: Images used for the change detection. Left: Before the eruption. Right: after the eruption. where µA and µB are the local means. This example will use the images shown in figure 21.3. These correspond to 2 Radarsat fine mode acquisitions before and after a lava flow resulting from a volcanic eruption. We start by including the corresponding header file. #include "otbMeanRatioImageFilter.h" We start by declaring the types for the two input images, the change image and the image to be stored in a file for visualization. typedef float InternalPixelType; typedef unsigned char OutputPixelType; typedef otb::Image <InternalPixelType , Dimension > InputImageType1; typedef otb::Image <InternalPixelType , Dimension > InputImageType2; typedef otb::Image <InternalPixelType , Dimension > ChangeImageType; typedef otb::Image <OutputPixelType , Dimension > OutputImageType; We can now declare the types for the readers. Since the images can be vey large, we will force the pipeline to use streaming. For this purpose, the file writer will be streamed. This is achieved by using the otb::ImageFileWriter class. typedef otb::ImageFileReader <InputImageType1 > ReaderType1 ; typedef otb::ImageFileReader <InputImageType2 > ReaderType2 ; typedef otb::ImageFileWriter <OutputImageType > WriterType ; The change detector will give a normalized result between 0 and 1. In order to store the result in PNG format we will rescale the results of the change detection in order to use all the output pixel type range of values. typedef itk::ShiftScaleImageFilter <ChangeImageType , OutputImageType > RescalerType; The otb::MeanRatioImageFilter is templated over the types of the two input images and the type of the generated change image. typedef otb::MeanRatioImageFilter <
  • 575. 21.3. Simple Detectors 541 Figure 21.4: Result of the ratio of means change detector InputImageType1 , InputImageType2 , ChangeImageType > FilterType ; The different elements of the pipeline can now be instantiated. ReaderType1 ::Pointer reader1 = ReaderType1 ::New(); ReaderType2 ::Pointer reader2 = ReaderType2 ::New(); WriterType ::Pointer writer = WriterType ::New(); FilterType ::Po