OpenCage FOSSGIS 2015
http://worldwideberlin.com/
Overview
I. place name disambiguation (homonyms)
– with & without spellcheck
II. Nominatim
III. other (open data) geocoders
– 2015 trends
– opportunities to share data, config, tests
IV. shared ranking/scoring data
OpenCage Geocoder
Which Münster do you mean?
Nominatim geocoder
Mühlheim vs Mülheim
“eifelturm”
“eiffel turm”
“eiffeltower” => no result
“eifel tower”
=> fair ground, Varna Bulgaria (fixed last week)
“eiffel tower”
=> one in Paris
=> replicas around the world
=> restaurants around the world
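The spelling variants above show what a geocoder without spellcheck has to cope with. As a minimal sketch of one possible approach, the Python below uses the standard library's difflib to suggest close matches. The name list and the 0.8 cutoff are assumptions for illustration only; this is not how Nominatim or the OpenCage Geocoder handles misspellings.

```python
import difflib

# Hypothetical mini gazetteer; a real geocoder indexes millions of names.
KNOWN_NAMES = ["eiffel tower", "eiffelturm", "eifel", "tower bridge"]


def normalize(query):
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def suggest(query, cutoff=0.8):
    """Return up to three known names similar to the query, best match first."""
    return difflib.get_close_matches(normalize(query), KNOWN_NAMES, n=3, cutoff=cutoff)


if __name__ == "__main__":
    for q in ["eifelturm", "eiffeltower", "eifel tower"]:
        print(q, "=>", suggest(q))
```

A similarity cutoff is a trade-off: set it too low and "eifel tower" also matches the Eifel region, set it too high and the missing space in "eiffeltower" is not forgiven.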
http://www.openstreetmap.org/#map=17/39.80885/116.28163
Nominatim
● OSM data, minutely updates
● + UK postal codes, TIGER
● 1TB PostGIS
● import in C, setup scripts in PHP, Postgres stored procedures, PHP frontend, Python & PHP test suite
● autocomplete if you add Photon geocoder
● no spellcheck
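For readers who have not used Nominatim, here is a small sketch of how a search request URL can be assembled in Python. It only builds the URL and sends nothing. The helper name build_search_url is made up for this example; the q, format, limit and countrycodes parameters are part of Nominatim's public search API. The public instance has a usage policy, so check it before sending real traffic.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(query, country_codes=None, limit=5):
    """Build a Nominatim search URL (hypothetical helper, no request is sent)."""
    params = {"q": query, "format": "json", "limit": limit}
    if country_codes:
        # Restrict results to the given ISO 3166-1 alpha-2 country codes.
        params["countrycodes"] = ",".join(country_codes)
    return NOMINATIM_SEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("Münster", country_codes=["de"]))
```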
regression/blackbox tests
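A blackbox test treats the geocoder as a function from query string to coordinates and checks only the output. The sketch below is one way to write such a check in Python, assuming a geocode callable that returns a (lat, lon) tuple or None. The fake geocoder, the test cases and the coordinates are rough illustrative values, not data from the Nominatim test suite.

```python
import math
from typing import NamedTuple


class Case(NamedTuple):
    query: str
    lat: float
    lon: float
    max_km: float  # how far off the result may be and still pass


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def run_cases(geocode, cases):
    """Return the queries whose result is missing or too far from the expectation."""
    failures = []
    for case in cases:
        result = geocode(case.query)
        if result is None or haversine_km(result[0], result[1], case.lat, case.lon) > case.max_km:
            failures.append(case.query)
    return failures


# Approximate coordinates, for illustration only.
PARIS_TOWER = (48.8584, 2.2945)
CASES = [
    Case("eiffel tower", *PARIS_TOWER, max_km=1.0),
    Case("eifel tower", *PARIS_TOWER, max_km=1.0),
]


def fake_geocode(query):
    """Stand-in geocoder that sends the misspelt query to Varna, Bulgaria."""
    return {"eiffel tower": PARIS_TOWER, "eifel tower": (43.21, 27.91)}.get(query)


if __name__ == "__main__":
    print("failing queries:", run_cases(fake_geocode, CASES))
```

Because the harness only needs a callable, the same case list could be run against several geocoders to compare them.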
other geocoders
Closed source         | Open source, high resources | Open source, low resources
Google Maps           | Mapzen “Pelias”             | OpenStreetMap “Nominatim”
Bing/Yahoo            | Mapbox “Carmen”             | OpenCage (multiple)
Mapquest              | Mapquest open (Nominatim)   | geonames
ESRI/ArcGIS Online    | Foursquare “Quattroshapes”  | geocod.io (Tiger data)
Baidu                 | Scout                       | Photon (Nominatim)
Yandex                | Cloudmade                   | geo.io (Nominatim)
TomTom                |                             | DSTK (Tiger, geonames)
Amazon (Android only) |                             | SmartyStreets
Telenav               |                             | ...
Nokia/Ovi/Here        |                             |
Apple (iOS only)      |                             |
...                   |                             |
trends
● SSD
● Add commercial sources
● Full builds, downloadable index
● Highly parallel (map/reduce, Node.js), cloud scaling, NoSQL
● Community building, guidelines
● Test suites
typical features to improve
● horizontal scaling
● autocomplete
● spellcheck
● improve text parsing (App 3, 111-113b)
● crossings (Main & 2nd N, New Orleans)
● “4km north of $cityname on the N6”
● tests for non-Latin alphabets
● postal code boundaries
● localsearch/POIs
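To make the "4km north of $cityname on the N6" item concrete, here is a rough Python sketch of the text-parsing half of the problem. It handles only this one English phrasing; the pattern, the RelativeQuery type and the sample place name are assumptions for illustration and are not taken from any existing geocoder.

```python
import re
from typing import NamedTuple, Optional


class RelativeQuery(NamedTuple):
    distance_km: float
    direction: str
    place: str
    road: Optional[str]


# Matches e.g. "4km north of Athlone on the N6"; the road part is optional.
PATTERN = re.compile(
    r"^\s*(?P<dist>\d+(?:\.\d+)?)\s*km\s+"
    r"(?P<dir>north|south|east|west)\s+of\s+"
    r"(?P<place>.+?)"
    r"(?:\s+on\s+the\s+(?P<road>\S+))?\s*$",
    re.IGNORECASE,
)


def parse_relative(query):
    """Split a relative-location query into its parts, or return None if it does not match."""
    m = PATTERN.match(query)
    if m is None:
        return None
    return RelativeQuery(float(m["dist"]), m["dir"].lower(), m["place"], m["road"])


if __name__ == "__main__":
    print(parse_relative("4km north of Athlone on the N6"))
    print(parse_relative("Main & 2nd N, New Orleans"))
```

Parsing is the easy part. A geocoder would still have to resolve the place, find the road, and walk the stated distance along it.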
what should be shared
● aka. don't reinvent everything
● standard test suite to compare geocoders
● hierarchy data
● address parsing
● address formatting
● language configuration
● data parsing, e.g. OSM tags
openaddresses.io
● 110m addresses
● 10GB of text files
1174 SMITH CREEK WAY, BRASSFIELD, WAKE FOREST, NC 27587
732 STEWARTS ROAD, LANEXA, VA 23124
address formatting
https://github.com/lokku/address-formatting/
– configuration
– test cases for 33 countries
– reference implementation in Perl
{ country_code: 'dk', village: 'Ærøskøbing', county: 'Ærø Municipality',
  house_number: '17A', neighbourhood: 'Paradiset', postcode: '5970',
  road: 'Baggårde', state: 'Region of Southern Denmark' }
Baggårde 17A, 5970 Ærøskøbing, Denmark
Adama Asnyka 1, 59-700 Bolesławiec, Poland
CAI, Cerrito 1250, Retiro, C1010AAZ Buenos Aires, Argentina
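The idea behind per-country formatting can be sketched in a few lines of Python. Everything here is a simplified stand-in: the TEMPLATES dict, the COUNTRY_NAMES lookup and the format_address function are invented for this example. The address-formatting project keeps its real templates in its own configuration files, and its reference implementation is in Perl.

```python
from collections import defaultdict

# Hypothetical templates: one pattern per output line, keyed by country code.
TEMPLATES = {
    "dk": ["{road} {house_number}", "{postcode} {village}", "{country}"],
    "default": ["{house_number} {road}", "{village} {postcode}", "{country}"],
}

# Hypothetical lookup; the input only carries a country code.
COUNTRY_NAMES = {"dk": "Denmark"}


def format_address(components):
    """Render address components on one line using the country's template."""
    code = components.get("country_code", "").lower()
    fields = defaultdict(str, components)  # missing components render as ""
    if not fields["country"]:
        fields["country"] = COUNTRY_NAMES.get(code, "")
    template = TEMPLATES.get(code, TEMPLATES["default"])
    # Fill each pattern, collapse leftover whitespace, and drop empty lines.
    lines = (" ".join(p.format_map(fields).split()) for p in template)
    return ", ".join(line for line in lines if line)


if __name__ == "__main__":
    print(format_address({
        "country_code": "dk", "village": "Ærøskøbing", "county": "Ærø Municipality",
        "house_number": "17A", "neighbourhood": "Paradiset", "postcode": "5970",
        "road": "Baggårde", "state": "Region of Southern Denmark",
    }))
```

Keeping the templates as data rather than code is what would let several geocoders share them, along with the test cases.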
wikipedia data
core geocoding logic
1. tokenize
2. filter
   • fixed bounding box, browser window, country
   • OSM tags/POI search
   • min-max admin
3. search
4. rank
   • country bias
   • language bias (client, explicit)
   • location boost (client, explicit, history)
   • maybe: spellcheck
   • maybe: retry/failover/remove phrases
   • importance boost
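The four steps above can be sketched end to end on a toy in-memory gazetteer. This Python example is a deliberately naive illustration: the place list, the importance values and the 0.2 country boost are made-up numbers, and a real geocoder would search an index rather than scan a list.

```python
import unicodedata
from typing import NamedTuple


class Place(NamedTuple):
    name: str
    country: str
    importance: float  # hypothetical 0..1 value


# Toy data, for illustration only.
GAZETTEER = [
    Place("Münster", "de", 0.6),
    Place("Munster", "ie", 0.5),
    Place("Munster", "fr", 0.3),
    Place("Paris", "fr", 0.9),
]


def fold(text):
    """Lowercase and strip diacritics so that 'Münster' and 'munster' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(query):
    return fold(query).replace(",", " ").split()


def geocode(query, country=None, bias_country=None):
    tokens = tokenize(query)                                    # 1. tokenize
    candidates = [p for p in GAZETTEER
                  if country is None or p.country == country]   # 2. filter (hard limit)
    hits = [p for p in candidates if fold(p.name) in tokens]    # 3. search

    def score(place):                                           # 4. rank
        boost = 0.2 if place.country == bias_country else 0.0   # country bias (soft)
        return place.importance + boost

    return sorted(hits, key=score, reverse=True)


if __name__ == "__main__":
    print(geocode("Münster"))
    print(geocode("Munster", bias_country="ie"))
```

The sketch keeps the distinction the outline makes: a filter removes candidates outright, while a bias only reorders them.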
http://blog.mayflower.de/755-Schnelle-Volltextsuche-mit-Solr.html
map to hierarchy (ranks)
http://wiki.openstreetmap.org/wiki/Nominatim/Development_overview
names, names, names
name is one of many factors
ranking examples:
● Altona
– type: suburb vs train station vs towns in US/Canada
● Germany
– admin_level=2 (country) vs island
● Mt Everest
– importance: viewpoint vs peak vs island
● Oktoberfest
– actually an alt_name of Theresienwiese
● Königsberg
– 10x a peak, 1x old_name of Kaliningrad
● Hitlerberg
– old_name:1934-1945 of Heigelkopf
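One way to picture "name is one of many factors" is a score that multiplies a place's importance by a weight for the tag that matched. The Python sketch below does this for the Königsberg example. The weights and importance values are invented for illustration and do not come from Nominatim or any other geocoder.

```python
from typing import NamedTuple

# Hypothetical weights: a match on the current name counts more than one on an old name.
NAME_WEIGHT = {"name": 1.0, "alt_name": 0.8, "old_name": 0.6}


class Candidate(NamedTuple):
    label: str
    matched_tag: str   # which OSM name tag matched the query
    importance: float  # hypothetical 0..1 value


def score(candidate):
    return candidate.importance * NAME_WEIGHT[candidate.matched_tag]


def rank(candidates):
    """Order candidates by score, best first."""
    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    results = rank([
        Candidate("Königsberg (peak)", "name", 0.2),
        Candidate("Kaliningrad", "old_name", 0.7),
    ])
    for c in results:
        print(c.label, round(score(c), 2))
```

With these made-up numbers the city outranks the peak even though it matched only on old_name, which is the behaviour most users searching for "Königsberg" would probably expect.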
status on wikipedia_articles.bin
● version 1: wikipedia pageview logs
– https://en.wikipedia.org/wiki/Wikipedia:Notability
● version 2 (current): parsing wikipedia articles and counting links
– last updated 2013
– 80m wikipedia entries + 15m redirects
– 0.6m places in OSM have wikipedia tag set (2013: 0.4m)
● version 3 (TBD): parsing wikipedia geo exports
– http://de.wikipedia.org/wiki/Wikipedia:WikiProjekt_Georeferenzierung/Hauptseite/Wikipedia-World/en
– 3.4m entries, more languages, regular dumps, new documentation
● version 4 (?)
– use wikidata exports
– used by multiple geocoders
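To illustrate how link counts (version 2 above) could become a ranking signal, the Python sketch below maps an inbound-link count to a 0..1 importance value. The logarithmic scaling is an assumption chosen for illustration, and the function name and sample counts are hypothetical; the source does not state what formula wikipedia_articles.bin actually uses.

```python
import math


def importance(link_count, max_link_count):
    """Map an inbound-link count to 0..1 (assumed log scaling, not a documented formula)."""
    if link_count <= 0 or max_link_count <= 0:
        return 0.0
    # The log compresses the long tail: a tenfold increase in links adds a fixed step.
    return min(1.0, math.log1p(link_count) / math.log1p(max_link_count))


if __name__ == "__main__":
    # Hypothetical link counts for three articles.
    for links in (10, 1000, 100000):
        print(links, round(importance(links, 100000), 3))
```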
what can mappers do?
● add wikipedia tags
● fix administrative levels
● don't add wrong names (typos)
● file bugs (github)
http://nominatim.openstreetmap.org/
… and if all else fails: rename the city
Questions?
mtm@opencagedata.com

Geocoding Overview