in conjunction with
Data Management & Warehousing   http://www.datamgmt.com
What is the Spatial Module?

•  It’s the ability to analyse information in a geographic context:
   –  Where is the nearest petrol station?
   –  Which road am I on?
   –  How many ATMs are in this area?
•  It’s not maps and images
   –  These come later with tools that help present the information

Wednesday, July 28, 2010   © 2010 Data Management & Warehousing
  
The three types of data & many questions

•  Points
   –  OS Grid
   –  Latitude & Longitude
•  Lines
   –  Pairs of points
   –  e.g. Road Segments
•  Polygons
   –  A series of points that define a boundary
   –  e.g. Postcode Boundaries

•  How close are two points?
•  Does a point touch a line?
•  Is a point inside or outside a polygon?
•  Does a line cross a polygon?
•  How many points are in a polygon?
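
The questions above can be sketched in a few lines of Python. This is only an illustration, assuming flat (planar) coordinates such as OS Grid eastings and northings in metres; the function names are hypothetical and are not part of Netezza Spatial. The polygon test uses the standard ray-casting technique.

```python
from math import hypot

# A point is an (easting, northing) pair in metres. Planar coordinates
# are an assumption made to keep the sketch short.
Point = tuple[float, float]


def distance(a: Point, b: Point) -> float:
    """How close are two points? Straight-line distance on a flat grid."""
    return hypot(b[0] - a[0], b[1] - a[1])


def point_in_polygon(p: Point, polygon: list[Point]) -> bool:
    """Is a point inside a polygon? Ray casting: count edge crossings."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the boundary
        # Only edges that straddle the horizontal line through p can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_in_polygon(points: list[Point], polygon: list[Point]) -> int:
    """How many points are in a polygon?"""
    return sum(point_in_polygon(p, polygon) for p in points)


# A made-up 10 m square used purely as an example boundary.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
```

Points lying exactly on a boundary are not handled specially here; a production spatial engine defines that case precisely.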
  
Using Spatial Data Is Complex

•  Different distances between points at different longitudes and latitudes
•  Measurement over a curved irregular surface
•  Multiple input and output formats
•  Multiple co-ordinate systems, see: A Guide to Coordinate Systems in Great Britain
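
To illustrate the first point, the sketch below uses the haversine formula to measure one degree of longitude at two different latitudes. It assumes a perfectly spherical Earth with a mean radius of 6371 km, which is a simplification of the curved, irregular surface mentioned above; the function name is hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius; assumes a sphere, not the real irregular shape


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two latitude/longitude points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# One degree of longitude is roughly 111 km at the equator ...
at_equator = haversine_km(0.0, 0.0, 0.0, 1.0)
# ... but only roughly 56 km at 60 degrees north.
at_60_north = haversine_km(60.0, 0.0, 60.0, 1.0)
```

The same difference in longitude covers about half the ground distance at 60°N, which is why raw coordinate differences cannot be compared directly.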
  
Sources of Information – GPS

•  In Car Device
   –  Sends frequent data sets to processing centre
   –  Point Data
      •  Speed, Direction, Location and G-force
   –  Aggregate Data
      •  Speed and Direction
•  Other Devices
   –  Sat Nav Systems
   –  Smart Phone Apps e.g. ‘GPS Tracker’
   –  Cameras
  
Sources of Information – Ordnance Survey

•  Integrated Road Network: A series of 3 million ‘linestrings’ and 17 million points that describe every road in the UK
•  Linestrings have between 2 and 655 points; most have fewer than 10
•  23 points for this picture
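
As a rough illustration of how linestrings answer "Which road am I on?", the sketch below finds the linestring closest to a position by measuring the distance to each segment. It assumes planar coordinates in metres and uses made-up road names and shapes; it is not the Integrated Road Network schema, and a real system would use a spatial index rather than scanning every road.

```python
from math import hypot

Point = tuple[float, float]  # (easting, northing) in metres, assumed planar


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))


def linestring_distance(p: Point, line: list[Point]) -> float:
    """Distance from p to a linestring, i.e. to its nearest segment."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


def nearest_road(p: Point, roads: dict[str, list[Point]]) -> str:
    """Name of the road whose linestring is closest to p (linear scan)."""
    return min(roads, key=lambda name: linestring_distance(p, roads[name]))


# Hypothetical roads, two points each, for illustration only.
roads = {
    "High Street": [(0.0, 0.0), (100.0, 0.0)],
    "Mill Lane": [(0.0, 50.0), (0.0, 150.0)],
}
```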
  
Sources of Information – Post Office/GAdm

•  Postal Address File: A series of c.1.75M UK postcodes
   –  Postcode Boundaries
   –  Over 28M complete addresses
•  Global Admin Boundaries
   –  National and regional boundaries for c.245 countries
   –  http://www.gadm.org
  
Data Layers – Enriching what you have

•  Data Layers are sets of information tied to a geographic point
   –  Road Speed for a given road segment
   –  ATM Location
   –  House Price for a postcode
•  Where data has location information it is known as ‘Geo-tagged’
  
Data Layer Sources (1)

•  Ordnance Survey
   –  Road Types, Limits, Closures, etc.
•  Government
   –  UK Government now providing masses of geo-tagged info (http://data.gov.uk)
•  Met Office / HM Nautical Almanac Office
   –  Weather, Daylight to Postcode Level
  
Data Layer Sources (2)

•  Wikipedia
   –  Geo-tag Access API – what’s nearby?
•  Google Maps
   –  Road level photographic images
•  Commercial Sources
   –  Fast Food Outlets, Supermarkets, Petrol Stations, ATMs, etc.

•  Massive growth in both commercial and public domain geo-tagged data
  
Issues with Geo-tagged data

•  Geo-tagging uses different formats
   –  Longitude & Latitude, OS Grid Reference, etc.
•  Geo-tagging at different levels
   –  Data may be tagged to a postcode or to an entire county, which makes it difficult to compare
•  Geo-tagging coverage is patchy and/or historic
   –  Rate of change of fine detail data is very high
   –  e.g. OS issues monthly updates to the UK mapping
•  Multiple standards and formats
   –  XML & CSV, different file formats, etc.
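
One small example of dealing with different formats: OS Grid References are often written with two letters and a run of digits (e.g. TQ 30 80), while analysis needs numeric eastings and northings. The sketch below converts the lettered form to metres using the lettering scheme of the British National Grid. The function name is hypothetical, the result is the south-west corner of the referenced square, and conversion onward to latitude and longitude is deliberately left out because it needs a datum transformation.

```python
def parse_os_grid_ref(ref: str) -> tuple[int, int]:
    """Convert a lettered OS grid reference to (easting, northing) in metres."""
    ref = ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if len(letters) != 2 or not letters.isalpha() or "I" in letters:
        raise ValueError(f"bad grid letters in {ref!r}")
    if len(digits) % 2 != 0 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError(f"bad grid digits in {ref!r}")

    # Letter positions in a 5x5 block; the grid alphabet skips the letter I.
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # Index of the 100 km square, counted from the grid's false origin.
    # Letter pairs outside Great Britain are not rejected in this sketch.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)

    # Pad each half of the digits to 5 places to get metres within the square.
    half = len(digits) // 2
    e_part = int(digits[:half].ljust(5, "0"))
    n_part = int(digits[half:].ljust(5, "0"))
    return e100km * 100_000 + e_part, n100km * 100_000 + n_part
```

For example, `parse_os_grid_ref("TQ 30 80")` gives `(530000, 180000)`.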
  
Our Model For Delivering Spatial Data

1.  Load Multiple File Formats
2.  Standardise Geo-Tagging
3.  Extract & Load CSVs
4.  Perform Spatial Analysis
5.  Create User Access Area

[Diagram: several Sources feed a (Small) Postgres Database (steps 1 and 2); data is extracted and loaded (step 3) into Netezza, which holds Spatial Analysis (Proximity, Contains, Excludes) (step 4) and Spatial Presentation (Sets of data with spatial attributes) (step 5); Query & Presentation Tools (Tableau, Google Maps, etc.) sit on top.]
  
Netezza Spatial Value Add

•  Netezza Spatial is fast
   –  Analysis
      •  Look up a typical 18 point trip in the 3M linestrings to find the roads that the vehicle was on in less than 1 second
      •  Overnight batch process matching 300,000 points to road names in under 30 minutes
   –  Presentation
      •  Tools rely on fast query access to render any queried map with sub-second response times
•  Netezza Spatial is easy
   –  Distance and proximity calculations are simple
   –  ‘Touches’, ‘Overlaps’ & ‘Contains’ queries allow instant value add
•  Netezza Spatial integrates
   –  Works well with Tableau
   –  Easy to generate KML for use with Google Earth and Google Maps
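
As an indication of why KML is easy to generate, the sketch below turns a list of positions into a minimal KML document with one Placemark containing a LineString, using only the Python standard library. It is an assumption about how such output might be produced, not the method used in this work; the function name and the sample coordinates are illustrative. KML expects each coordinate as longitude,latitude.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def linestring_to_kml(name: str, coords: list[tuple[float, float]]) -> str:
    """Build a minimal KML document; coords are (longitude, latitude) in degrees."""
    ET.register_namespace("", KML_NS)  # serialise with KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    line = ET.SubElement(placemark, f"{{{KML_NS}}}LineString")
    # KML coordinate tuples are "lon,lat" separated by whitespace.
    ET.SubElement(line, f"{{{KML_NS}}}coordinates").text = " ".join(
        f"{lon},{lat}" for lon, lat in coords
    )
    return ET.tostring(kml, encoding="unicode")


# Illustrative positions only, not taken from any real trip.
trip_kml = linestring_to_kml("Trip", [(-0.1276, 51.5072), (-0.1419, 51.5014)])
```

The resulting string can be saved with a .kml extension and opened in Google Earth.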
  
Netezza Spatial Limitations

•  Fails the Slartibartfast Test:
   –  Polygons for very detailed maps are too big to be loaded as Netezza limits the maximum block size to 64000 characters
   –  Named after the Hitch-Hikers Guide to the Galaxy coastline designer responsible for the twiddly bits around the Norwegian fjords
•  Work-around:
   –  Use regional boundaries (e.g. UK Counties, US States, etc.) and then aggregate into national boundaries
   –  If a point is in Berkshire then by definition it is also in England

[Pictures: Norway; Slartibartfast]
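
The work-around can be sketched as a two-step lookup: test the point against the smaller regional polygons, then map the matching region to its nation. The example below is a toy illustration in Python; the region shapes are made-up squares rather than real boundaries, and the names `REGION_POLYGONS`, `REGION_TO_NATION` and `locate` are hypothetical.

```python
Point = tuple[float, float]


def point_in_polygon(p: Point, polygon: list[Point]) -> bool:
    """Ray-casting test: True if p lies inside the polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


# Toy shapes only: each region is a small square, NOT its real boundary.
REGION_POLYGONS: dict[str, list[Point]] = {
    "Berkshire": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
    "Fife": [(20.0, 0.0), (30.0, 0.0), (30.0, 10.0), (20.0, 10.0)],
}

# Aggregation table: the national answer is derived from the region,
# so no oversized national polygon has to be loaded.
REGION_TO_NATION = {"Berkshire": "England", "Fife": "Scotland"}


def locate(p: Point):
    """Return (region, nation) for the first region containing p, else None."""
    for region, polygon in REGION_POLYGONS.items():
        if point_in_polygon(p, polygon):
            return region, REGION_TO_NATION[region]
    return None
```

Each regional polygon stays well below the size limit, and the national result follows from the mapping table.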
  
Current Uses …

•  M/A/B road driving profiles
•  Time of day driving profiles
•  Speed Limits vs. Driven Speed
•  Matching GPS positions to road names
•  Out of bounds driving
•  Customer Demographic Profiles

… but this is only the start in a very short time
  
in conjunction with
Data Management & Warehousing   http://www.datamgmt.com


Implementing Netezza Spatial
