INSPIRE Transformation with Stetl
A Lightweight Python Framework for Geospatial ETL
Just van den Broecke
EuroGeographics - KEN Workshop
Paris, Oct 8, 2013
www.justobjects.nl
About Me
Independent Open Source Geospatial Professional
Secretary OSGeo Dutch Local Chapter
Member of the Dutch OpenGeoGroep
Just van den Broecke
just@justobjects.nl
www.justobjects.nl
We have a Problem
The Rich GML
Problem
Rich GML = Complex Mess
INSPIRE
Dutch National Datasets
Germany:AFIS-ALKIS-ATKIS
UK: OS Mastermap
...
“Semi GML”
e.g. Dutch Addresses & Buildings (BAG)
Arbitrary Nesting
The Street Name!
A Street Element in an INSPIRE Annex I Address..
Complex Model Transformations
100+ MB GML Files
Millions of Objects
10s of Millions of <Elements>
Multiple Transformation Steps
Solution is Spatial ETL
But How? (with FOSS)
FOSS ETL - DIY ? Maybe
FOSS ETL - High Level
FOSS ETL - Lower Level
Each is powerful individually, but none can do the entire ETL
ogr2ogr
FOSS ETL - How to Combine?
[Figure: tool logos + ogr2ogr = ?]
Example - 2011 Kadaster ESDIN
http://inspire.kademo.nl/doc/design-etl.html
Good ideas, but hard to scale and reuse.
Need a framework.
FOSS ETL: Add Python to the Equation
[Figure: (tool logos + ogr2ogr) + Python = ?]
[Figure: (tool logos + ogr2ogr) + Python = Stetl]
Stetl = Simple, Streaming, Spatial, Speedy ETL
GML1
GML2
Stetl
From Barrels of GML to Maps
From Local National Data to INSPIRE DL Services
[Diagram labels: Source <GML>, NLExtract, Stetl, deegree WFS, INSPIRE <GML>, Atom Feed, INSPIRE Addresses, Dutch Addresses + Buildings, deegree blobstore, Stetl]
Stetl Concepts
Process Chain
Input -> Filter -> Filter -> Output
Stetl concepts
Source -> Process Chain (Input -> Filter -> Filter -> Output) -> Target
gml
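The process chain above can be sketched in plain Python: an Input produces packets, each Filter transforms them, and an Output consumes them. This is only an illustration of the idea. The names `Packet`, `ListInput`, `UpperCaseFilter`, `CollectOutput` and `run_chain` are hypothetical and are not Stetl's actual API.

```python
class Packet:
    """Carries data plus a simple status flag between components."""

    def __init__(self, data=None):
        self.data = data
        self.end_of_stream = False


class ListInput:
    """Hypothetical Input: emits one packet per item and flags the last one."""

    def __init__(self, items):
        self.items = list(items)

    def read(self):
        for i, item in enumerate(self.items):
            packet = Packet(item)
            packet.end_of_stream = (i == len(self.items) - 1)
            yield packet


class UpperCaseFilter:
    """Hypothetical Filter: transforms the packet data in place."""

    def invoke(self, packet):
        if packet.data is not None:
            packet.data = packet.data.upper()
        return packet


class CollectOutput:
    """Hypothetical Output: stores whatever arrives."""

    def __init__(self):
        self.received = []

    def write(self, packet):
        self.received.append(packet.data)
        return packet


def run_chain(source, filters, target):
    """Push every packet from the input through all filters to the output."""
    for packet in source.read():
        for component in filters:
            packet = component.invoke(packet)
        target.write(packet)


output = CollectOutput()
run_chain(ListInput(["<gml/>", "<gml2/>"]), [UpperCaseFilter()], output)
print(output.received)
```

Because every component only sees one packet at a time, filters can be added or swapped without touching the input or output.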
Example: GML to PostGIS
Reader -> ogr2ogr (gml)
Example: INSPIRE Model Transform
ogr2ogr -> XSLT -> Writer (gml)
Simple Features -> Complex Features
Example: deegree Store
ogr2ogr -> XSLT -> deegree Writer
Or via WFS-T
Process Chain - How?
Input -> Filters -> Output
Example: XML to Shape
XML Input -> XSLT Filter -> ogr2ogr Output
The Source
XML Input
XML Input -> XSLT Filter
Prepare XSLT Script
XSLT GML Output
XML Input -> XSLT Filter -> ogr2ogr Output
The Stetl Config File
Process Chain: XML Input -> XSLT Filter -> ogr2ogr Output
Running Stetl
stetl -c etl.cfg
Result Shapefile viewed in QGIS
Installing Stetl via PyPI
Deps
• GDAL + Python bindings
• lxml (XML processing)
• psycopg2 (Postgres)
sudo pip install stetl
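Before installing, it can help to check which of the listed dependencies are already importable. The sketch below uses only the standard library. The helper name `missing_dependencies` is hypothetical, and the import names are an assumption based on how these packages are normally imported (`osgeo` for the GDAL Python bindings).

```python
import importlib.util

# Import names of the dependencies listed above (assumed usual import names).
DEPENDENCIES = {
    "GDAL Python bindings": "osgeo",
    "lxml": "lxml",
    "psycopg2": "psycopg2",
}


def missing_dependencies(modules=DEPENDENCIES):
    """Return the labels of dependencies that cannot be imported."""
    return [label for label, name in modules.items()
            if importlib.util.find_spec(name) is None]


print(missing_dependencies())
```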
Speed: Streaming
Input Filter Output
gml
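Streaming means that a large GML file is never loaded as a whole: elements such as `gml:featureMember` are handed on one at a time. The sketch below illustrates the technique with the standard library's `xml.etree.ElementTree.iterparse`, swapped in here for lxml so that the example has no third-party dependencies. The function `stream_elements` and the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"

# Tiny stand-in for a large GML file.
SAMPLE = (
    '<gml:FeatureCollection xmlns:gml="%s">'
    '<gml:featureMember><city>Amsterdam</city></gml:featureMember>'
    '<gml:featureMember><city>Paris</city></gml:featureMember>'
    '</gml:FeatureCollection>' % GML_NS
).encode("utf-8")


def stream_elements(source, tag):
    """Yield each element with the given qualified tag once it is complete."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem
            # Drop the subtree so memory use stays flat on large files.
            elem.clear()


names = []
for member in stream_elements(io.BytesIO(SAMPLE), "{%s}featureMember" % GML_NS):
    names.append(member.find("city").text)
print(names)
```

Each element must be fully processed before the next one is requested, because it is cleared as soon as the generator resumes.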
Speed: Going Native
Input -> Filter -> Output (gml)
ogr2ogr, Stetl, Stetl
Calls Native C Libs/Progs
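"Going native" means delegating the heavy lifting to existing compiled programs such as `ogr2ogr`. A minimal sketch of that idea, assuming the GDAL command-line tools are on the PATH: Python only builds the argument list and starts the process. The helper names and file paths are hypothetical, and this is not how Stetl's own ogr2ogr component is implemented.

```python
import shutil
import subprocess


def build_ogr2ogr_command(src_path, dst_path, driver="ESRI Shapefile"):
    """Return the argument list for converting src_path to dst_path."""
    # ogr2ogr expects the destination before the source.
    return ["ogr2ogr", "-f", driver, dst_path, src_path]


def run_ogr2ogr(src_path, dst_path, driver="ESRI Shapefile"):
    """Run ogr2ogr if it is installed; return True when it succeeded."""
    if shutil.which("ogr2ogr") is None:
        return False  # GDAL command-line tools are not on the PATH
    command = build_ogr2ogr_command(src_path, dst_path, driver)
    return subprocess.run(command).returncode == 0


command = build_ogr2ogr_command("output/cities.gml", "output/cities.shp")
print(" ".join(command))
```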
Example Components
Input       Filters           Output
XMLFile     XSLT              GMLFile
ogr2ogr     XMLAssembler      ogr2ogr
LineStream  XMLValidator      WFS-T
deegree*    FeatureExtractor  deegree*
YourInput   YourFilter        YourOutput
Example: XsltFilter Python
from util import Util, etree
from filter import Filter
from packet import FORMAT

log = Util.get_log("xsltfilter")


class XsltFilter(Filter):
    # Constructor
    def __init__(self, configdict, section):
        Filter.__init__(self, configdict, section, consumes=FORMAT.etree_doc, produces=FORMAT.etree_doc)
        self.xslt_file_path = self.cfg.get('script')
        self.xslt_file = open(self.xslt_file_path, 'r')
        # Parse XSLT file only once
        self.xslt_doc = etree.parse(self.xslt_file)
        self.xslt_obj = etree.XSLT(self.xslt_doc)
        self.xslt_file.close()

    def invoke(self, packet):
        if packet.data is None:
            return packet
        return self.transform(packet)

    def transform(self, packet):
        packet.data = self.xslt_obj(packet.data)
        log.info("XSLT Transform OK")
        return packet
Your Own Components
Step 1 - Define Class
# Imports mirror the XsltFilter example; adjust to your package layout
from util import Util
from filter import Filter
from packet import FORMAT

log = Util.get_log("myfilter")


class MyFilter(Filter):
    # Constructor
    def __init__(self, configdict, section):
        Filter.__init__(self, configdict, section, consumes=FORMAT.etree_doc,
                        produces=FORMAT.etree_doc)

    # Receives a packet and returns it unchanged
    def invoke(self, packet):
        log.info("CALLING MyFilter OK!!!!")
        return packet
Step 2 - Config Class
[etl]
chains = input_xml_file|my_filter|output_std

[input_xml_file]
class = inputs.fileinput.XmlFileInput
file_path = input/cities.xml

# My custom component
[my_filter]
class = my.myfilter.MyFilter

[output_std]
class = outputs.standardoutput.StandardXmlOutput
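How a config like this could be turned into a chain can be sketched with the standard library: read the `chains` setting, split it on `,` and `|`, and resolve each section's `class` value to a Python class by its dotted name. The helpers `parse_chains` and `load_class` are hypothetical and only illustrate the mechanism; they are not Stetl's own loader.

```python
import configparser
import importlib

CONFIG_TEXT = """
[etl]
chains = input_xml_file|my_filter|output_std

[input_xml_file]
class = inputs.fileinput.XmlFileInput
file_path = input/cities.xml

[my_filter]
class = my.myfilter.MyFilter

[output_std]
class = outputs.standardoutput.StandardXmlOutput
"""


def parse_chains(config):
    """Split 'a|b|c, d|e' into [['a', 'b', 'c'], ['d', 'e']]."""
    raw = config.get("etl", "chains")
    return [[name.strip() for name in chain.split("|")]
            for chain in raw.split(",") if chain.strip()]


def load_class(dotted_name):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, class_name = dotted_name.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), class_name)


config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

chains = parse_chains(config)
class_names = [config.get(section, "class") for section in chains[0]]
print(chains)
print(class_names)
# load_class works for any importable dotted name, shown here with a stdlib class.
print(load_class("collections.OrderedDict").__name__)
```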
Data Structures
• Components exchange Packets
• Packet contains data and status
• Data formats, e.g.:
  xml_line_stream
  etree_doc
  etree_element (feature)
  etree_element_array
  string
  any
  ...
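One use of declared data formats is checking that neighbouring components in a chain fit together: what one component produces must be what the next one consumes. The sketch below shows such a check under that assumption. The `Format` enum reuses the format names from the list above, while `is_compatible`, `check_chain` and the example formats assigned to each component are hypothetical.

```python
from enum import Enum


class Format(Enum):
    """Data format names from the list above; 'any' matches everything."""
    XML_LINE_STREAM = "xml_line_stream"
    ETREE_DOC = "etree_doc"
    ETREE_ELEMENT = "etree_element"
    ETREE_ELEMENT_ARRAY = "etree_element_array"
    STRING = "string"
    ANY = "any"


def is_compatible(produces, consumes):
    """True when a producer of one format can feed a consumer of another."""
    return Format.ANY in (produces, consumes) or produces == consumes


def check_chain(components):
    """components: list of (name, consumes, produces); return mismatched links."""
    errors = []
    for (left, _, produces), (right, consumes, _) in zip(components, components[1:]):
        if not is_compatible(produces, consumes):
            errors.append((left, right))
    return errors


# Formats per component are made up for the example.
chain = [
    ("input_xml_file", None, Format.ETREE_DOC),
    ("transformer_xslt", Format.ETREE_DOC, Format.ETREE_DOC),
    ("output_ogr2ogr", Format.STRING, None),
]
print(check_chain(chain))
```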
deegree Integration
• Input
  DeegreeBlobstoreInput
• Output
  DeegreeBlobstoreOutput
  DeegreeFSLoaderOutput
  WFSTOutput
Cases - The Netherlands
• INSPIRE Download Services
  publish to deegree store (WFS)
  generate GML files (for Atom Feed)
• National GML Datasets
  GML to PostGIS (Top10NL, BGT)
[etl]
chains = input_sql_pre|schema_name_filter|output_postgres,
    input_big_gml_files|xml_assembler|transformer_xslt|output_ogr2ogr,
    input_sql_post|schema_name_filter|output_postgres

# Pre SQL file inputs to be executed
[input_sql_pre]
class = inputs.fileinput.StringFileInput
file_path = sql/drop-tables.sql,sql/create-schema.sql

# Post SQL file inputs to be executed
[input_sql_post]
class = inputs.fileinput.StringFileInput
file_path = sql/delete-duplicates.sql

# Generic filter to substitute Python-format string values like {schema} in string
[schema_name_filter]
class = filters.stringfilter.StringSubstitutionFilter
# format args {schema} is schema name
format_args = schema:{schema}

[output_postgres]
class = outputs.dboutput.PostgresDbOutput
database = {database}
host = {host}
port = {port}
user = {user}
password = {password}
schema = {schema}

# The source input file(s) from dir and produce gml:featureMember elements
[input_big_gml_files]
class = inputs.fileinput.XmlElementStreamerFileInput
file_path = {gml_files}
element_tags = featureMember
Top10NL Extract
Parameter Substitution
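The parameter substitution in the config above relies on Python-format placeholders such as `{schema}`. A minimal sketch of the mechanism using `str.format`: the helper names, the space-separated form of multiple `key:value` pairs, the SQL text and the schema name `top10nl` are all assumptions made for illustration.

```python
def parse_format_args(spec):
    """Turn 'schema:top10nl host:localhost' into {'schema': 'top10nl', ...}."""
    return dict(pair.split(":", 1) for pair in spec.split())


def substitute(text, args):
    """Replace {name} placeholders using Python's str.format."""
    return text.format(**args)


sql_template = "DROP SCHEMA IF EXISTS {schema} CASCADE; CREATE SCHEMA {schema};"
args = parse_format_args("schema:top10nl")
print(substitute(sql_template, args))
```

The same SQL and config files can then be reused for different databases or schemas by changing only the arguments.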
Top10NL+BAG (Dutch Topo + Buildings)
BGT - Dutch Large Scale Topo
Cases - INSPIRE Transforms
• Simple: Dutch Admin Borders to AU
• Advanced: Dutch Addresses to AD
INSPIRE - XSLT STRUCTURE
Theme CP: Generate CP INSPIRE GML
  Local CP GML to INSPIRE SpatialDataset
  Local CP GML to INSPIRE GML
Theme AU: Generate AU INSPIRE GML
  Local AU GML to INSPIRE SpatialDataset
  Local AU GML to INSPIRE GML
Theme GN: Generate GN INSPIRE GML
  Local GN GML to INSPIRE SpatialDataset
  Local GN GML to INSPIRE GML
Reusable XSLT Scripts
Called by All
Locally Specific XSL
Generic XSL
XSLT Template Call
XSLT - 3 MAIN STEPS/SCRIPTS
1. Generate Spatial Dataset GML Container (specific)
2. Extract data values from local OGR simple feature data (specific)
3. Call XSLT template per Theme Feature type (generic)
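The split between dataset-specific and generic steps can be illustrated outside XSLT. The sketch below uses plain Python with `xml.etree.ElementTree` as a stand-in for the XSLT scripts, so it shows the structure only, not the actual transformation. All element names, function names and sample values are hypothetical and do not follow the real INSPIRE schemas.

```python
import xml.etree.ElementTree as ET


def make_container():
    """Step 1 (dataset-specific): create the enclosing dataset element."""
    return ET.Element("SpatialDataset")


def extract_values(simple_feature):
    """Step 2 (dataset-specific): pull flat values from a local simple feature."""
    return {
        "id": simple_feature.findtext("code"),
        "name": simple_feature.findtext("naam"),
    }


def administrative_unit(values):
    """Step 3 (generic): build the theme feature from neutral values."""
    unit = ET.Element("AdministrativeUnit")
    ET.SubElement(unit, "localId").text = values["id"]
    ET.SubElement(unit, "name").text = values["name"]
    return unit


# Made-up local source data with flat, simple features.
source = ET.fromstring(
    "<features>"
    "<feature><code>0363</code><naam>Amsterdam</naam></feature>"
    "<feature><code>0599</code><naam>Rotterdam</naam></feature>"
    "</features>"
)

dataset = make_container()
for feature in source.findall("feature"):
    member = ET.SubElement(dataset, "member")
    member.append(administrative_unit(extract_values(feature)))

print(ET.tostring(dataset, encoding="unicode"))
```

Only steps 1 and 2 know the local data model, so the generic step 3 can be shared by every dataset that feeds the same theme.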
XSLT AU - STEP 1
XSLT AU - STEP 2
XSLT AU - STEP 3
XSLT - REUSE
STETL CONFIG
STETL CONFIG AD
Case: INSPIRE DL Services - Dutch Addresses
[Diagram labels: Source <GML>, NLExtract, Stetl, deegree WFS, INSPIRE <GML>, Atom Feed, INSPIRE Addresses, Dutch Addresses + Buildings, deegree blobstore, Stetl]
Other Uses (Geocoder etc)
Project Status - Sept 21, 2013
• v1.0.4 installable via PyPI
• Documentation on www.stetl.org
• Real world transforms done
• Seeking feedback, support and contributors
Rich GML Problem Solved?
Thank You!
www.stetl.org
github.com/justb4/stetl
