Image Information Mining and Knowledge Discovery from Earth Observation Data
Towards the Sentinels Era

P.G. Marchetti (ESA), M. Iapaolo (Randstad)
Ground Segment and Mission Operations Department
Research and Ground Segment Technology Section
Earth Observation Programmes Directorate
pier.giorgio.marchetti@esa.int michele.iapaolo@esa.int

International Conference Frontiers in Diagnostic Technologies (ICFDT 2013)

26/11/2013
Outline

1. Background on the European Space Agency
2. Motivation
3. Overview of ESA activities in the IIM field
4. Systems and services for EO data exploitation
5. The road ahead

The Heritage: ERS and ENVISAT
• ERS and Envisat missions 1991-2012
• More than 2 Petabytes of data
• Two decades of global change records
• Need for data preservation, availability and exploitation

ESA UNCLASSIFIED – For Official Use

Ten Years of Envisat Science
5000 scientific projects using Envisat data
Envisat was the Sentinel “precursor” for many operational users

Timeline (figure):
• Launch: Mar 02
• Envisat Symposium, Salzburg (A): Sep 04
• Envisat Symposium, Montreux (CH): Apr 07
• Living Planet Symposium, Bergen (N): Jun 10
• Living Planet Symposium, Edinburgh (UK): Sep 13
and many workshops dedicated to specific Envisat user communities

Image highlights (figure): First images; Prestige tanker oil slick; Bam earthquake; Hurricane Katrina; Ozone hole 2005; B-15A iceberg; Arctic 2007; L’Aquila 2009; Iceland 2010; Japan 2011; Global air pollution; Chlorophyll concentration; CO2 map
The Copernicus Programme
Copernicus (formerly known as GMES) is a European space flagship programme led by the European Union
Provides the necessary data for operational monitoring of the environment and for civil security
Components (figure): Space Component; In-Situ Component; Services Component
ESA coordinates the space(*) component
(*) spacecraft, flight operation segment, ground segment

Copernicus Space Component: Dedicated Missions

S1A/B: Radar Mission
S2A/B: High Resolution Optical Mission
S3A/B: Medium Resolution Imaging and Altimetry Mission
S4A/B: Geostationary Atmospheric Chemistry Mission
S5P: Low Earth Orbit Atmospheric Chemistry Precursor Mission
S5A/B/C: Low Earth Orbit Atmospheric Chemistry Mission
Jason-CS A/B: Altimetry Mission
Copernicus Contributing Missions

• COSMO-SkyMed
• SPOT (VGT)
• SPOT (HRS)
• TerraSAR-X
• TanDEM-X
• PROBA-V
• Radarsat
• DMC
• Pléiades
• CryoSat
• Deimos-2
• RapidEye
• Jason
• MetOp
• Meteosat 2nd Generation
• Atmospheric missions

Motivation
1. Foster the use of IIM and derived technologies in support of EO
data exploitation
2. Develop state-of-the-art data processing to improve access to and
dissemination of future EO data (e.g. the Sentinel missions)
3. Implement systems and services supporting the “scientific
exploitation” of EO data
4. Investigate new approaches and methodologies to exploit data from
all available missions and archives (joint effort with the Long Term Data
Preservation programme)
Image Information Mining
Coordination Group (IIMCG)
The Image Information Mining Coordination Group (IIMCG):
• Space Agencies (ESA, DLR, CNES, ASI)
• European Institutions (EUSC, JRC)
• National Research Institutes (Uni-Trento, ETHZ, INGV, Mississippi
State University)

Main objectives:
• Inform Agencies and partners, promote research and technological
activities on IIM (automatic information extraction from EO data for
image understanding and retrieval)
• Promote the use of IIM techniques for management and exploitation
of very large EO data archives/missions (PB of data)
• Foster the role of IIM in the context of future missions and existing
archives
• Involve industry and agency partners to increase the relevance of
IIM activities in Europe

10 years of IIM activities at ESA
Timeline 2000-2012 (figure):
• Knowledge Driven Information Mining Prototype
• Knowledge-centred Earth Observation Prototype
• Multi-sensor Evolution Analysis Prototype

Technology activities over the last decade

Main achievements:
• KIM System: IIM reference prototype @ ESRIN
• Platforms for EO data exploitation (KEO, GPOD, SSE, etc.)
• Tools for multi-temporal and evolution analysis (MEA)

Issues:
• Limited number of scientific and industrial partners involved
• National efforts not coordinated and harmonised
• Funds limited with respect to the size of the research goals

Image Information Mining
From Data to Information

Figure labels: Data (PB); Acquisition; EO Data; Catalogue & Ordering; Image Information Mining (IIM); Information; Algorithms & Applications; Knowledge; Models & Ground Truth; Information (KB)

• Provides processing tools to extract features from images and associate meaning
to extracted features (bridging the gap between data and information)
• Empowers users (researchers, service providers, decision makers) to identify and
reuse relevant information for their applications
• Encourages the use of common cooperative environments to build shared
knowledge
KIM (Knowledge-based Information Mining)

The KIM prototype developed @ ESRIN permits:
• Intelligent and effective access to information in large EO datasets
• Improved exploration and use of EO images for scientific research
• Extraction of relevant information for different applications (change
detection, global monitoring, disaster management, …)
• Implementation, integration and validation of services derived from IIM methods

Three main components:
1. Ingestion Software (Primitive Feature Extraction / Clustering)
2. Database (storing extracted information)
3. Interactive Client Application
   i. Training and definition of “semantic rules”
   ii. Application of training (rules) to the entire collection
   iii. Definition of “semantic labels” for extracted information
   iv. Storage for subsequent re-use
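
The ingestion, training and labelling steps listed above can be illustrated with a minimal sketch. It is not KIM's actual implementation: the feature set (per-tile mean and variance), the plain k-means clustering, the rule format (a set of cluster ids), the sample data and all function names are assumptions chosen for illustration only.

```python
from statistics import mean


def extract_features(pixels):
    """Ingestion, step 1: primitive features of one image tile (mean, variance)."""
    m = mean(pixels)
    return (m, mean((p - m) ** 2 for p in pixels))


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given feature vector."""
    return min(range(len(centroids)), key=lambda i: dist2(point, centroids[i]))


def cluster(features, k, iterations=10):
    """Ingestion, step 2: k-means with deterministic farthest-point initialisation."""
    points = list(features.values())
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [tuple(mean(axis) for axis in zip(*g)) if g else c
                     for g, c in zip(groups, centroids)]
    return {name: nearest(f, centroids) for name, f in features.items()}


def define_rule(cluster_of, examples):
    """Training: here a "semantic rule" is the set of clusters hit by user examples."""
    return {cluster_of[name] for name in examples}


def apply_rule(cluster_of, rule):
    """Apply the trained rule to the entire collection."""
    return sorted(name for name, c in cluster_of.items() if c in rule)


# Hypothetical collection: image id -> pixel values of one tile.
collection = {
    "img_001": [10, 12, 11, 9],
    "img_002": [200, 210, 190, 205],
    "img_003": [11, 10, 13, 12],
    "img_004": [198, 202, 207, 195],
    "img_005": [9, 8, 12, 10],
}

features = {name: extract_features(px) for name, px in collection.items()}
cluster_of = cluster(features, k=2)
rule = define_rule(cluster_of, examples=["img_001"])  # the user marks one dark tile
database = {"water": apply_rule(cluster_of, rule)}    # semantic label stored for re-use
print(database)
```

The user supplies a single example, and the stored label then covers every tile that falls in the same cluster, which mirrors the "train once, apply to the entire collection" idea of the client application.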

KIM Architectural Elements
Input
• EO images

Output
• Identifiers of searched images
• Feature Maps / Thematic maps

Architecture (figure labels): EO Images; Data Repositories; AUX; Ingestion (Feature Extraction, Clustering); Database; Information Mining; Client; Content Based Image Retrieval; Unsupervised Classification; Time series analysis
KIM Search and label
KIM lets the user inspect a collection of images…
…interactively define “semantic features” using the “primitive features” extracted by the
system…
…search for the defined feature within the entire collection…

KIM Information Extraction
…and extract Feature Maps or Thematic Maps:
• Cloud masks
• Flooded areas
• Forest Monitoring
KIM Primitive features
Spectral: Spectral signature

Texture: Structural information extracted with the Gibbs Markov Random Fields (GMRF) model
(S0: full resolution images; S1: sub-sampled images)

DCT: Discrete Cosine Transform, which transforms signals and images from the spatial domain to
the frequency domain

EMBD: Enhanced-Model-Based-Despeckling, which performs a high quality despeckling of SAR
images

Area: Area of the objects detected with the segmentation process

Compactness: Compactness of the objects detected with the segmentation process

Spectral Mean: Mean value of the radiometric information of the image inside the closed area detected
by the segmenter

Spectral Variance: Variance of the radiometric information of the image inside the closed area detected by
the segmenter

Hu Moments: Hu-Moment Invariants, i.e. shape information conveyed by the contour points. Hu moments
are invariant to scale, rotation and translation (the first 4 of the 7 invariant moments are used as
shape descriptors).
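
As an illustration of the object-based features above, the following sketch computes area, compactness, spectral mean, spectral variance and the first four Hu invariants for one segmented region. Several points are assumptions rather than KIM's documented behaviour: compactness is taken as the isoperimetric quotient 4πA/P², the perimeter is counted as exposed pixel edges, and the Hu moments are computed over all region pixels instead of contour points only. The sample image and region are made up.

```python
import math


def region_features(image, region):
    """Area, compactness, spectral mean and variance of one segmented region.

    image  : 2D list of radiometric values
    region : set of (row, col) pixels belonging to the segment
    """
    area = len(region)
    # Perimeter: pixel edges not shared with another pixel of the region.
    perimeter = sum((r + dr, c + dc) not in region
                    for r, c in region
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    values = [image[r][c] for r, c in region]
    mean = sum(values) / area
    variance = sum((v - mean) ** 2 for v in values) / area
    return {
        "area": area,
        # Assumed definition: isoperimetric quotient 4*pi*A / P^2.
        "compactness": 4 * math.pi * area / perimeter ** 2,
        "spectral_mean": mean,
        "spectral_variance": variance,
    }


def hu_moments(region):
    """First four Hu invariants of a binary region (set of (row, col) pixels)."""
    n = len(region)
    r0 = sum(r for r, _ in region) / n
    c0 = sum(c for _, c in region) / n

    def eta(p, q):
        # Normalised central moment; mu_00 equals the pixel count n.
        mu = sum((c - c0) ** p * (r - r0) ** q for r, c in region)
        return mu / n ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return (
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
    )


# Made-up 5x5 image with one brighter pixel in the centre.
image = [[10] * 5 for _ in range(5)]
image[2][2] = 19
square = {(r, c) for r in range(1, 4) for c in range(1, 4)}
features = region_features(image, square)

# An L-shaped region and the same shape rotated by 90 degrees.
l_shape = {(0, 0), (1, 0), (2, 0), (2, 1)}
rotated = {(c, -r) for r, c in l_shape}
print(features, hu_moments(l_shape), hu_moments(rotated))
```

The rotated L-shape yields the same four values as the original up to floating-point rounding, which is the invariance property the slide refers to.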

KIM Validation
KIM has been tested and validated with different datasets:
1. MERIS RR / MERIS FR
2. ERS / ASAR
3. SPOT
4. Landsat
5. Maps (Level 2 / Level 3 products)
Large number of collections created
Low number of significant semantic features identified

KIM for Information Extraction
1. Flood Detection (SAR data)
2. Cloud Detection (MERIS RR)
3. Long-term Forest Monitoring (Landsat)
4. Rapid Mapping / Damage Assessment (VHR optical data)

The potential of the tool has been highlighted and
confirmed in different contexts

End-user expectations not always met

KEO (Knowledge centred Earth Observation)

KEO is a distributed Component-based Processing Environment (CPE)
that makes it possible to:
a. Create and semantically identify internal/external Processing Components
b. Graphically chain Processing Components into processing chains
c. Create Processing Components from IIM components (KIM training)
d. Export and store outputs into Web Servers (WFS, WMS, WCS)
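
The idea of chaining semantically described components can be sketched as plain function composition. This is only a conceptual illustration, not the KEO API: the registry layout, the component names (`calibrate`, `threshold`, `count`) and the sample data are hypothetical.

```python
from functools import reduce

# Hypothetical component registry: name -> (semantic description, callable).
components = {
    "calibrate": ("scale 8-bit digital numbers to the range 0..1",
                  lambda pixels: [v / 255 for v in pixels]),
    "threshold": ("binary mask of bright pixels",
                  lambda pixels: [v > 0.5 for v in pixels]),
    "count": ("number of flagged pixels",
              lambda mask: sum(mask)),
}


def build_chain(names):
    """Compose registered components, in order, into one processing chain."""
    steps = [components[name][1] for name in names]
    return lambda data: reduce(lambda value, step: step(value), steps, data)


chain = build_chain(["calibrate", "threshold", "count"])
result = chain([0, 64, 200, 255])
print(result)
```

Because each component only consumes the output of the previous one, a chain is fully described by an ordered list of component names, which is what a graphical designer would store.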

KEO also provides some relevant Reference Data Sets:
a. Heterogeneous data and information, growing with external
contributions (images, documents, DEMs, photos, processors, etc.)
b. In support of various applications (Classification, Time Series
Analysis, Ortho-rectification, Urban Monitoring, Interferometry, etc.)

KEO CPE
Graphical Processor Designer

MEA (Multi-temporal Evolution Analysis)

Multi-temporal analysis of HR / VHR products:
1. Select multi-temporal applications that might benefit from such extension
2. Design, implement and integrate the automatic multi-temporal algorithms
to support the selected applications
3. Create the needed HR/VHR Reference Data Sets and Evolution Models
4. Develop standard interfaces between the different systems for common
exploitation of ingested data and processing capabilities
5. Integrate algorithms and Evolution Models provided by other independent
projects
6. Validate (with the support of a Validation Group) the automatic
multi-temporal algorithms and Evolution Models

MEA (Multi-temporal Evolution Analysis)

The MEA-ASIM system aims at providing:
1. Advanced tools for Land Use / Land Cover change analysis
2. Level-2 EO products for real time exploitation
3. Interfaces to external systems (G-POD, KEO, data providers, etc.)
4. Access to data via the standard OGC WCS interface
5. Native support for Sentinel-2 datasets
(figure label: RSS Data Farm)
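
Access through a standard WCS interface means a client can request a spatial subset of a coverage with a plain HTTP query. The sketch below builds a WCS 2.0 GetCoverage request in key-value-pair encoding. The endpoint, the coverage identifier and the axis labels (`Lat`, `Long`) are placeholders: real values depend on the server and on the coverage description, and nothing here is taken from the MEA system itself.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def getcoverage_url(endpoint, coverage_id, subsets, fmt="image/tiff"):
    """Build a WCS 2.0 GetCoverage request using key-value-pair encoding."""
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("format", fmt),
    ]
    # One "subset" parameter per trimmed axis, e.g. Lat(41.5,42.5).
    params += [("subset", f"{axis}({low},{high})")
               for axis, (low, high) in subsets.items()]
    return f"{endpoint}?{urlencode(params)}"


url = getcoverage_url(
    "https://example.org/wcs",   # placeholder endpoint, not a real service address
    "EXAMPLE_COVERAGE",          # hypothetical coverage identifier
    {"Lat": (41.5, 42.5), "Long": (12.0, 13.0)},
)
query = parse_qs(urlsplit(url).query)
print(url)
```

Passing the parameters as a list of pairs, rather than a dictionary, allows the `subset` key to be repeated once per axis.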

MEA Pixel and Coverage analysis

Time-Series Analysis
Single and multi plot functionality

MEA Pixel and Coverage analysis

Cross-comparison of EO products

Exploitation Platforms for EO
Development and implementation of collaborative Exploitation
Platforms (G-POD, SSEP, E-CEO, etc.):
1. Fostering the scientific exploitation of EO data
2. Automating the creation of data mining and information extraction
experiments and algorithms
3. Supporting the creation of EO-based applications and services
4. Supporting the entire scientific research process:
   a. Addressing specific scientific challenges and tackling new research
   problems in a “parallel and collaborative way”
   b. Generating reproducible results that can be easily shared and
   validated

Research and Service Support:
Research Process

RSS G-POD Process Steps

Steps shared between Principal Investigators (PI) and RSS (figure):
• EO algorithms delivery
• Data type and range indication
• Data are made available in the RSS catalogue
• Algorithm porting and integration
• Output validation
• Test and validation (involving the PI)
• On-demand EO data processing
• Use of produced data (scientific projects delivery)
• Delivery
• Publications

RSS Flexible Resources
On-demand processing service (figure labels): EO Scientists / Principal Investigators; G-POD Platform; Process; EO data delivery; Infrastructure

Volume accessed by PI projects in 2012:
• Total number of submitted jobs: 38,774
• Average number of products per job: 35
• Average product size: 700 MB
• Total size of data processed: 906 TB
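
The total volume can be cross-checked from the other three figures. The sketch below assumes that the averages can simply be multiplied and that the slide uses binary units (1 TB = 1024 × 1024 MB); both are assumptions, since the slide does not state how the total was derived.

```python
jobs = 38_774
products_per_job = 35      # average
product_size_mb = 700      # average, in MB

total_products = jobs * products_per_job
total_mb = total_products * product_size_mb

total_tb_binary = total_mb / 1024 ** 2    # 1 TB = 1024 * 1024 MB
total_tb_decimal = total_mb / 1000 ** 2   # 1 TB = 1000 * 1000 MB

print(f"{total_products:,} products")
print(f"{total_tb_binary:.1f} TB (binary), {total_tb_decimal:.1f} TB (decimal)")
```

Under these assumptions the binary figure rounds to the 906 TB quoted above, while the decimal convention would give roughly 950 TB.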
Infrastructure:
• ESRIN: 172 cores, 400 TB
• UK-PAC: 96 cores, 300 TB
• Flexible/Unlimited Infrastructure: 10-200 cores, 1-10 TB

Flexible Infrastructure satisfies:
• HW requirements
• Connectivity requirements
• SLA (HA, help desk, ticketing systems, etc.)
RSS Facts & Figures
On-demand Processing: actual figures in the last 3 years
• Supported more than 40 active users per year
• Supported >20 processing/re-processing campaigns (including entire missions, e.g.
MERIS, ASAR, SMOS and TPM)
• Integrated ~10 new algorithms per year
• Upgraded ~15 algorithms per year
• Set up flexible (additional) processing capacity in less than 2 working days
• Managed >450 TB data farm (ESA, TPM and scientific products):
  • ESA: ENVISAT (~320 TB), ERS (~50 TB), SMOS (~10 TB)
  • TPM: MSG (~19 TB), METOP (~11 TB), ALOS (~2 TB)
  • Scientific products: AARDVARC Swansea University and MGVI JRC (produced
  by GPOD and distributed via SSE), MKL3 ACRI

SMOS Testbed
Service Purpose
SMOS Testbed aims to provide a flexible test environment to support the ESA calibration team for L1
calibration, and the Expert Support Laboratories (ESLs) for L2 Soil Moisture and Ocean Salinity pre-validation.

G-POD support elements
• Fast integration of new versions
• SMOS L0 NRT ingestion chain set-up for L1 NRT custom re-processing
• Access to online data for bulk re-processing
• Access to flexible cloud resources for meeting deadlines
• On-demand SMOS L1 and L2 processors available for SMOS Teams

Workflow (figure labels): SMOS New Processor Delivery; Processor Integration in G-POD; G-POD Processing Campaign; Results Analysis and Validation; Auxiliary and Calibration Datasets

SMOS Testbed
SMOS L1 TESTBED
Processors: Calibration, Telemetry, Level 1A, 1B, 1C
Supported versions: 3.46, 5.00, 5.01, 5.02, 5.03, 5.04, 5.05, 6.00, 6.01
Reference data series: L0, L1A, L1B, L1C (reprocessed)
Auxiliary data baseline: as per Operational environment

SMOS L2 SOIL MOISTURE TESTBED
Processors: SM L2, SM L2 post-processing
Supported versions: 4.00, 4.01
Reference data series: L1C (reprocessed)
Auxiliary data baseline: as per CESBIO reprocessing

SMOS L2 OCEAN SALINITY TESTBED
Processor: OS L2
Supported versions: 5.00, 5.50
Reference data series: L1C (reprocessed)
Auxiliary data baseline: as per Operational environment

The road ahead (1)
Funds for a full-scale research programme are needed to foster:
• The widening of competence and expertise in several research
centres/industrial actors in Europe
• The widening of efforts to cover time series analysis and data
analytics in general
• Research and development of multi-dimensional and scalable DB
solutions (including NoSQL databases, Hadoop, etc.)
• Large collaborative and persistent effort on crowdsourcing,
benchmarking, image and feature annotation and evaluation
• Establishing a theoretical framework to bridge the semantic gap and
be able to assign “discriminating power” to extracted features and
“categorization” of extracted classes/objects
• High quality software and algorithm developments able to reach at
least the “software prototype” readiness level

The road ahead (2)
To achieve these goals it is necessary to:
• Establish a common “Big Data Mining” framework with
interdisciplinary partners
• Establish an R&D network to sustain this field
• Establish a network of users, and give them access to IIM resources
(system, data, …)
• Enlarge the scope of “Image” Mining to the physical parameters
measured by EO instruments
• Address the “instrument” gap, instrument-application
• Develop methods to use heterogeneous data: in situ, metadata,
linked data, models, etc.

The road ahead (3)
• Activities to be started:
  • Promote IIM technology acceptance for EO users
  • Extend and adapt methods from multimedia and social nets
  • Apply human computing, gather knowledge from the use of the
  system, adaptation, personalization, etc.
  • Focus on Web/Internet based systems
  • Develop simple and specific HMI and GUI
  • Focus on Visual Data Mining, Visual Analytics, and related methods
• In the PDGS identify “long term data preservation” and “interactive
data exploitation” components
• Design data representations: actionable information

The road ahead (4)
Merge the best of:
• data mining approach
• time series capability
• ability to support and host the user algorithm

MANY THANKS!



Big Data, Data and Information Mining for Earth Observation

  • 1. Image Information Mining and Knowledge Discovery from Earth Observation Data Towards the Sentinels Era P.G. Marchetti ESA, M. Iapaolo Randstad Ground Segment and Mission Operations Department Research and Ground Segment Technology Section Earth Observation Programmes Directorate pier.giorgio.marchetti@esa.int michele.iapaolo@esa.int International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 2. Outline 1. Background on the European Space Agency 2. Motivation 3. Overview of ESA activities in the IIM field 4. Systems and services for EO data exploitation 5. The road ahead International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 3. The Heritage: ERS and ENVISAT • ERS and Envisat missions 1991-2012 • More than 2 Petabytes of data • Two decades of global change records • Need for data preservation, availability and exploitation ESA UNCLASSIFIED – For Official Use International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 4. Ten Years of Envisat Science 5000 scientific projects using Envisat data Iceland 2010 Ozone hole 2005 Arctic 2007 First images L’Aquila 2009 Global air pollution B-15A iceberg Chlorophyll concentration Japan 2011 Bam earthquake Prestige tanker oil slick CO2 map Launch Hurricane Katrina Envisat Symposium Salzburg (A) Mar 02 Sep 04 Envisat was the Sentinel “precursor” for many operational users Envisat Symposium Montreux (CH) Apr 07 Living Planet Symposium Bergen (N) Jun 10 Living Planet Symposium Edinburgh (UK) Sep 13 and many workshops dedicated to specific Envisat user communities ESA UNCLASSIFIED – For Official Use International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 5. The Copernicus Programme Copernicus (formerly known as GMES) is a European space flagship programme led by the European Union Space Component Provides the necessary data for operational monitoring of the environment and for civil security ESA coordinates the space(*) component (*)spacecraft, flight operation segment, ground segment ESA UNCLASSIFIED – For Official Use In-Situ Component Services Component 6 International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 6. Copernicus Space Component: Dedicated Missions S1A/B: Radar Mission S2A/B: High Resolution Optical Mission S3A/B: Medium Resolution Imaging and Altimetry Mission S4A/B: Geostationary Atmospheric Chemistry Mission S5P: Low Earth Orbit Atmospheric Chemistry Precursor Mission S5A/B/C: Low Earth Orbit Atmospheric Chemistry Mission Jason-CS A/B: Altimetry Mission 7 ESA UNCLASSIFIED – For Official Use International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 7. Copernicus Contributing Missions COSMO-Skymed SPOT (VGT) TerraSAR–X Tandem-X PROBA-V Radarsat DMC Pléiades Copernicus Contributing Missions Cryosat Deimos-2 RapidEye Jason Atmospheric missions SPOT (HRS) MetOp ESA UNCLASSIFIED – For Official Use Meteosat 2nd Generation International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 8. Motivation 1. Foster the use of IIM and derived technologies in support of the EO data exploitation 2. Develop state-of-the-art data processing for improving access and dissemination of future EO data (e.g. Sentinels mission) 3. Implement systems and services for supporting the “scientific exploitation” of EO data 4. Investigate new approaches and methodologies to exploit data from all available missions and archives (joint effort with Long Term Data Preservation programme) International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 9. Image Information Mining Coordination Group (IIMCG) The Image Information Mining Coordination Group (IIMCG): Space Agencies (ESA, DLR, CNES, ASI) European Institutions (EUSC, JRC)  National Research Institutes (Uni-Trento, ETHZ, INGV, Mississippi State University)   Main objectives:  Inform Agencies and partners, promote research and technological activities on IIM (automatic information extraction from EO data for image understanding and retrieval)  Promote the use of IIM techniques for management and exploitation of very large EO data archives/missions (PB of data)  Foster the role of IIM in the context of future missions and existing archives  Involve industry and agency partners to increase the relevance of IIM activities in Europe International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 10. 10 years of IIM activities at ESA 2000 2002 2004 2006 2008 2010 2012 Knowledge Driven Information Mining Prototype Knowledge-centred Earth Observation Prototype Multi-sensor Evolution Analysis Prototype  Technology activities over last decade  Main achievements:     KIM System: IIM reference prototype @ ESRIN Platforms for EO data exploitation (KEO, GPOD, SSE, etc.) Tools for multi-temporal and evolution analysis (MEA) Issues:    Limited number of scientific and industrial partners involved National efforts not coordinated and harmonised Funds limited wrt the size of the research goals International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 11. Image Information Mining From Data to Information Data (PB) Acquisition EO Data Catalogue & Ordering Image Information Mining (IIM) Information Algorithms & Applications Knowledge Models & Ground Truth Information (KB)  Provides processing tools to extract features from images and associate meaning to extracted features (bridging the gap between data and information)  Empower users (researchers, service providers, decision makers) to identify and reuse relevant information for their applications  Encourage the use of common cooperative environments to achieve a common knowledge International Conference Frontiers in Diagnostic Technologies (ICFDT 2013) 26/11/2013
  • 12. KIM (Knowledge-based Information Mining) The KIM prototype developed at ESRIN permits: intelligent and effective access to information in large EO datasets; improved exploration and use of EO images for scientific research; extraction of relevant information for different applications (change detection, global monitoring, disaster management, …); implementation, integration and validation of services derived from IIM methods. Three main components: 1. Ingestion Software (Primitive Feature Extraction / Clustering) 2. Database (storing extracted information) 3. Interactive Client Application: (i) training and definition of “semantic rules”; (ii) application of training (rules) to the entire collection; (iii) definition of “semantic labels” for extracted information; (iv) storage for subsequent re-use.
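The three components listed on this slide can be pictured with a deliberately tiny sketch. It is not the KIM algorithm: the tile names, the cluster IDs and the set-difference "rule" are all assumptions made up for illustration, standing in for whatever learning method the prototype actually uses. It only shows the flow from stored cluster IDs, to a rule trained on user examples, to a search over the whole collection.

```python
# Toy sketch only: NOT the KIM algorithm. Tile names and cluster IDs are invented.

def learn_rule(positives, negatives, tile_clusters):
    """Return the cluster IDs seen in positive example tiles but in no negative one."""
    pos = set()
    for tile in positives:
        pos |= tile_clusters[tile]
    neg = set()
    for tile in negatives:
        neg |= tile_clusters[tile]
    return pos - neg


def apply_rule(rule, tile_clusters):
    """Return, sorted, every tile containing at least one cluster ID of the rule."""
    return sorted(tile for tile, ids in tile_clusters.items() if ids & rule)


# Components 1 and 2 (ingestion + database): cluster IDs per tile, assumed precomputed.
TILE_CLUSTERS = {
    "tile_a": {1, 4},
    "tile_b": {2, 4},
    "tile_c": {1, 3},
    "tile_d": {2, 3},
}

# Component 3 (client): the user marks examples, a rule is derived,
# given a semantic label, stored, and applied to the entire collection.
rule = learn_rule(["tile_a"], ["tile_b"], TILE_CLUSTERS)
semantic_labels = {"water": rule}  # hypothetical label name
matches = apply_rule(semantic_labels["water"], TILE_CLUSTERS)
print(rule, matches)
```

Storing the rule under a label is what makes step (iv), later re-use, possible: the same rule can be re-applied when new tiles are ingested.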
  • 13. KIM Architectural Elements Input: EO images. Output: identifiers of searched images; Feature Maps / Thematic maps. [Architecture diagram labels: EO Images, Ingestion, Feature Extraction, Clustering, Database, Information Mining, Client, Content Based Image Retrieval, Data Repositories, AUX, Unsupervised Classification, Time series analysis]
  • 14. KIM Search and label KIM allows the user to inspect a collection of images, interactively define “semantic features” using the “primitive features” extracted by the system, and search for the defined feature within the entire collection…
  • 15. KIM Information Extraction …and extract Feature Maps or Thematic Maps. Examples shown: cloud masks, flooded areas, forest monitoring.
  • 16. KIM Primitive features Spectral: spectral signature. Texture: structural information extracted with the Gibbs Markov Random Fields (GMRF) model (S0: full resolution images; S1: sub-sampled images). DCT: Discrete Cosine Transform, which transforms signals and images from the spatial domain to the frequency domain. EMBD: Enhanced-Model-Based-Despeckling, which performs high-quality despeckling of SAR images. Area: area of the objects detected by the segmentation process. Compactness: compactness of the objects detected by the segmentation process. Spectral Mean: mean value of the radiometric information of the image inside the closed area detected by the segmenter. Spectral Variance: variance of the radiometric information of the image inside the closed area detected by the segmenter. Hu Moments: Hu moment invariants, i.e. shape information conveyed by the contour points; Hu moments are invariant to scale, rotation and translation (the first 4 of the 7 invariant moments are used as shape descriptors).
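Two of the primitive features in this list have short textbook definitions that can be written down directly. The sketch below is an illustration under assumptions, not the KIM ingestion code: it uses the unnormalised 1-D DCT-II, and it takes compactness to be the isoperimetric quotient 4πA/P², which is one common definition; the slide does not say which definition KIM uses.

```python
import math


def dct2(signal):
    """Unnormalised 1-D DCT-II: maps samples to cosine (frequency) coefficients."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def compactness(area, perimeter):
    """Isoperimetric quotient 4*pi*A / P**2 (assumed definition): 1.0 for a disc."""
    return 4.0 * math.pi * area / perimeter ** 2


# A constant signal has all of its energy in the zero-frequency coefficient.
flat = dct2([3.0, 3.0, 3.0, 3.0])

# A disc of radius 2 is maximally compact; a unit square is less so.
disc = compactness(math.pi * 2.0 ** 2, 2.0 * math.pi * 2.0)
square = compactness(1.0, 4.0)
print(flat, disc, square)
```

For images the same transform is applied along rows and then columns; the 1-D form is shown only to keep the example short.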
  • 17. KIM Validation KIM has been tested and validated with different datasets: 1. MERIS RR / MERIS FR 2. ERS / ASAR 3. SPOT 4. Landsat 5. Maps (Level 2 / Level 3 products). A large number of collections were created, but only a small number of significant semantic features were identified.
  • 18. KIM for Information Extraction 1. Flood Detection (SAR data) 2. Cloud Detection (MERIS RR) 3. Long-term Forest Monitoring (Landsat) 4. Rapid Mapping / Damage Assessment (VHR optical data). The potential of the tool has been highlighted and confirmed in different contexts. End-user expectations were not always met.
  • 19. KEO (Knowledge centred Earth Observation) KEO is a distributed Component-based Processing Environment (CPE) that allows users to: a. Create and semantically identify internal/external Processing Components b. Graphically chain Processing Components into processing chains c. Create Processing Components from IIM components (KIM training) d. Export and store outputs into Web Servers (WFS, WMS, WCS). KEO also provides some relevant Reference Data Sets: a. Heterogeneous data and information, growing with external contributions (images, documents, DEMs, photos, processors, etc.) b. In support of various applications (Classification, Time Series Analysis, Ortho-rectification, Urban Monitoring, Interferometry, etc.)
  • 20. KEO CPE Graphical Processor Designer
  • 21. MEA (Multi-temporal Evolution Analysis) Multi-temporal analysis of HR / VHR products: 1. Select multi-temporal applications that might benefit from such an extension 2. Design, implement and integrate the automatic multi-temporal algorithms to support the selected applications 3. Create the needed HR/VHR Reference Data Sets and Evolution Models 4. Develop standard interfaces between the different systems for common exploitation of ingested data and processing capabilities 5. Integrate algorithms and Evolution Models provided by other independent projects 6. Validate (with the support of a Validation Group) the automatic multi-temporal algorithms and Evolution Models
  • 22. MEA (Multi-temporal Evolution Analysis) The MEA-ASIM system aims at providing: 1. Advanced tools for Land Use / Land Cover change analysis 2. Level-2 EO products for real time exploitation 3. Interfaces to external systems (G-POD, KEO, data providers, etc.) 4. Access to data via the standard OGC WCS interface [figure label: RSS Data Farm] 5. Native support for Sentinel-2 datasets
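Item 4 refers to the OGC Web Coverage Service. As a rough illustration of what a client request can look like, the sketch below builds a WCS 2.0 key-value `GetCoverage` URL with the Python standard library. The endpoint, the coverage identifier and the axis labels are hypothetical placeholders, not MEA values; real axis labels and supported formats depend on what the server advertises for each coverage.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getcoverage_url(endpoint, coverage_id, subsets, fmt="image/tiff"):
    """Build a WCS 2.0 KVP GetCoverage URL; one `subset` key per trimmed axis."""
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("format", fmt),
    ]
    params += [
        ("subset", f"{axis}({low},{high})") for axis, (low, high) in subsets.items()
    ]
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint, coverage name and axis labels, for illustration only.
url = build_getcoverage_url(
    "https://example.org/wcs",
    "hypothetical_coverage",
    {"Lat": (40, 42), "Long": (12, 14)},
)
query = parse_qs(urlsplit(url).query)
print(url)
```

Passing the parameters as a list of pairs rather than a dict is what allows the `subset` key to repeat, once per axis.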
  • 23. MEA Pixel and Coverage analysis Time-Series Analysis: single and multi plot functionality
  • 24. MEA Pixel and Coverage analysis Cross-comparison of EO products
  • 25. Exploitation Platforms for EO Development and implementation of collaborative Exploitation Platforms (G-POD, SSEP, E-CEO, etc.): 1. Fostering the scientific exploitation of EO data 2. Automating the creation of data mining and information extraction experiments and algorithms 3. Supporting the creation of EO-based applications and services 4. Supporting the entire scientific research process: a. Addressing specific scientific challenges and tackling new research problems in a “parallel and collaborative way” b. Generating reproducible results that can be easily shared and validated
  • 26. Research and Service Support: Research Process
  • 27. RSS G-POD Process Steps [Two-column slide: Principal Investigators and RSS. Steps in extraction order:] EO algorithms delivery; data type and range indication; data are made available in the RSS catalogue; algorithm porting and integration; output validation; test and validation (involving the PI); on-demand EO data processing (listed in both columns); use of produced data (scientific projects delivery); delivery; publications.
  • 28. RSS Flexible Resources On-demand processing service [diagram labels: Process, EO data delivery, G-POD, EO Scientists, Principal Investigators, Platform, Infrastructure]. Volume accessed by PI projects in 2012: total number of submitted jobs: 38,774; average number of products per job: 35; average product size: 700 MB; total size of data processed: 906 TB. Infrastructure: ESRIN, 172 cores, 400 TB; UK-PAC, 96 cores, 300 TB; flexible/unlimited infrastructure, 10-200 cores, 1-10 TB. The flexible infrastructure satisfies: HW requirements; connectivity requirements; SLA (HA, help desk, ticketing systems, etc.)
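The 2012 volume figures on this slide can be cross-checked with simple arithmetic. The slide does not state how the total was derived; the sketch below assumes it is jobs × products per job × average product size, and shows that this product matches the quoted 906 TB when megabytes are converted with binary prefixes (1 TB taken as 1024² MB), while decimal prefixes give roughly 950 TB.

```python
# Figures quoted on the slide.
JOBS = 38_774
PRODUCTS_PER_JOB = 35
PRODUCT_SIZE_MB = 700

# Assumed derivation: total = jobs * products per job * average product size.
total_products = JOBS * PRODUCTS_PER_JOB
total_mb = total_products * PRODUCT_SIZE_MB

total_tb_decimal = total_mb / 1000 ** 2  # 1 TB = 10^6 MB
total_tb_binary = total_mb / 1024 ** 2   # 1 TB = 2^20 MB

print(f"{total_products} products, "
      f"{total_tb_decimal:.1f} TB (decimal), {total_tb_binary:.1f} TB (binary)")
```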
  • 29. RSS Facts & Figures On-demand Processing: actual figures in the last 3 years • Supported more than 40 active users per year • Supported >20 processing/re-processing campaigns (including entire missions, e.g. MERIS, ASAR, SMOS and TPM) • Integrated ~10 new algorithms per year • Upgraded ~15 algorithms per year • Set up flexible (additional) processing capacity in less than 2 working days • Managed >450 TB data farm (ESA, TPM and scientific products) • ESA: ENVISAT (~320 TB), ERS (~50 TB), SMOS (~10 TB) • TPM: MSG (~19 TB), METOP (~11 TB), ALOS (~2 TB) • Scientific products: AARDVARC Swansea University and MGVI JRC (produced by GPOD and distributed via SSE), MKL3 ACRI
  • 30. SMOS Testbed Service Purpose: the SMOS Testbed aims to provide a flexible test environment to support the ESA calibration team for L1 calibration, and the Expert Support Laboratories (ESLs) for L2 Soil Moisture and Ocean Salinity pre-validation. G-POD support elements: fast integration of new versions; SMOS L0 NRT ingestion chain set-up for L1 NRT custom re-processing; access to online data for bulk re-processing; access to flexible cloud resources for meeting deadlines; on-demand SMOS L1 and L2 processors available for SMOS Teams. [Workflow diagram labels: SMOS New Processor Delivery; Processor Integration in G-POD; G-POD Processing Campaign; Results Analysis and Validation; Auxiliary and Calibration Datasets]
  • 31. SMOS Testbed SMOS L1 TESTBED: Processors: Calibration, Telemetry, Level 1A, 1B, 1C; Supported versions: 3.46, 5.00, 5.01, 5.02, 5.03, 5.04, 5.05, 6.00, 6.01; Reference data series: L0, L1A, L1B, L1C (reprocessed); Auxiliary data baseline: as per Operational environment. SMOS L2 SOIL MOISTURE TESTBED: Processor: SM L2, SM L2 postprocessing; Supported versions: 4.00, 4.01; Reference data series: L1C (reprocessed); Auxiliary data baseline: as per CESBIO reprocessing. SMOS L2 OCEAN SALINITY TESTBED: Processor: OS L2; Supported versions: 5.00, 5.50; Reference data series: L1C (reprocessed); Auxiliary data baseline: as per Operational environment.
  • 32. The road ahead (1) Funds for a full scale research programme are needed to foster: the widening of competence and expertise in several research centres/industrial actors in Europe; the widening of efforts to cover time series analysis and data analytics in general; research and development of multi-dimensional and scalable DB solutions (including NoSQL databases, Hadoop, etc.); a large, collaborative and persistent effort on crowdsourcing, benchmarking, image and feature annotation and evaluation; the establishment of a theoretical framework to bridge the semantic gap and to assign “discriminating power” to extracted features and “categorization” to extracted classes/objects; high quality software and algorithm developments able to reach at least the “software prototype” readiness level.
  • 33. The road ahead (2) To achieve these goals it is necessary to: establish a common “Big Data Mining” framework with interdisciplinary partners; establish an R&D network to sustain this field; establish a network of users, and give them access to IIM resources (system, data, …); enlarge the scope of “Image” Mining to the physical parameters measured by EO instruments; address the “instrument” gap, instrument-application; develop methods to use heterogeneous data: in situ, metadata, linked data, models, etc.
  • 34. The road ahead (3) Activities to be started: promote IIM technology acceptance for EO users; extend and adapt methods from multimedia and social nets; apply human computing, gather knowledge from the use of the system, adaptation, personalization, etc.; focus on Web/Internet based systems; develop simple and specific HMI and GUI; focus on Visual Data Mining, Visual Analytics, and related methods; in the PDGS identify “long term data preservation” and “interactive data exploitation” components; design data representations: actionable information.
  • 35. The road ahead (4) Merge the best of: • data mining approach • time series capability • ability to support and host the user algorithm
  • 36. MANY THANKS!

Editor's Notes

  • #7: European independence in data sources for environment and security monitoring. European independence & contribution to global observing system. Copernicus is a European system for monitoring the Earth. Copernicus consists of a complex set of systems which collect data from multiple sources: earth observation satellites and in situ sensors such as ground stations, airborne and sea-borne sensors. It processes these data and provides users with reliable and up-to-date information through a set of services related to environmental and security issues.
  • #8: The Sentinel satellites (S1A/B, S2A/B, S3A/B, S4A/B and S5 Precursor) are under development; S-5 and Jason-CS are under definition. Satellite launches from the beginning of 2014. The ground segment (data reception, processing and dissemination) is getting ready for Sentinel launches. EU is responsible for Copernicus overall and for services. ESA is responsible for the Space Component.
  • #9: Available today or planned at European, national and international level. Developed for other purposes but making important data available for Copernicus. List not exhaustive (+ Seosar, SPOT-6/-7, TanDEM-X, EnMAP, Venμs, Altika, Deimos2, etc.); will evolve based on service requirements and mission availabilities.