Geo/Spatial Search with MySQL
Alexander Rubin, Senior Consultant, MySQL AB

Why Geo Search?
Stores: find locations near you
Social networks: find friends close to you
Online maps: find points of interest near your position
Online newspapers/yellow pages: find show times near your home

POI Search Example

Common Tasks
Task: find 10 nearby hotels and sort them by distance
What do we have:
  A given point on Earth: latitude, longitude
  A hotels table with columns: hotel name, latitude, longitude
Question: how do we calculate the distance between us and each hotel?

Latitudes and Longitudes
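
Distance between 2 points: The Haversine Formula
For two points on a sphere (of radius R) with latitudes φ1 and φ2, latitude separation Δφ = φ1 − φ2, and longitude separation Δλ, the distance d between the two points is:
$$ d = 2R \arcsin\left(\sqrt{\sin^2\left(\frac{\Delta\varphi}{2}\right) + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\Delta\lambda}{2}\right)}\right) $$
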
The Haversine Formula in MySQL
R = Earth's radius
Δlat = lat2 − lat1; Δlong = long2 − long1
a = sin²(Δlat/2) + cos(lat1) * cos(lat2) * sin²(Δlong/2)
c = 2 * atan2(√a, √(1−a)), which is equivalent to 2 * asin(√a), the form used below
d = R * c
In MySQL (angles need to be in radians; 3956 is the Earth's radius in miles):
    3956 * 2 * ASIN(SQRT(
        POWER(SIN((orig.lat - dest.lat) * pi()/180 / 2), 2) +
        COS(orig.lat * pi()/180) * COS(dest.lat * pi()/180) *
        POWER(SIN((orig.lon - dest.lon) * pi()/180 / 2), 2)
    )) AS distance

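To check the expression outside the database, here is a minimal Python sketch of the same calculation. The function name `haversine_miles` and the argument order are illustrative assumptions; only the formula and the 3956-mile radius come from the slide.

```python
import math

EARTH_RADIUS_MILES = 3956  # same constant as in the MySQL expression


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)   # latitude separation in radians
    dlam = math.radians(lon2 - lon1)   # longitude separation in radians
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return EARTH_RADIUS_MILES * 2 * math.asin(math.sqrt(a))
```

One degree of latitude along a meridian comes out at about 69 miles with this radius, which matches the rule of thumb used later for the bounding rectangle.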
MySQL Query: Find Nearby Hotels
    -- Sample values as shown on the slide. 122.4058 is outside the valid
    -- latitude range, so lat and lon appear to be transposed in this data set.
    SET @orig_lat = 122.4058;
    SET @orig_lon = 37.7907;
    SET @dist = 10;
    SELECT *,
        3956 * 2 * ASIN(SQRT(
            POWER(SIN((@orig_lat - ABS(dest.lat)) * pi()/180 / 2), 2) +
            COS(@orig_lat * pi()/180) * COS(ABS(dest.lat) * pi()/180) *
            POWER(SIN((@orig_lon - dest.lon) * pi()/180 / 2), 2)
        )) AS distance
    FROM hotels dest
    HAVING distance < @dist
    ORDER BY distance
    LIMIT 10;
Lat can be negative!

Find Nearby Hotels: Results
    +----------------+--------+-------+--------+
    | hotel_name     | lat    | lon   | dist   |
    +----------------+--------+-------+--------+
    | Hotel Astori.. | 122.41 | 37.79 | 0.0054 |
    | Juliana Hote.. | 122.41 | 37.79 | 0.0069 |
    | Orchard Gard.. | 122.41 | 37.79 | 0.0345 |
    | Orchard Gard.. | 122.41 | 37.79 | 0.0345 |
    ...
    +----------------+--------+-------+--------+
    10 rows in set (4.10 sec)
4 seconds: very slow for a web query!

MySQL Explain query
    mysql> EXPLAIN …
    select_type: SIMPLE
    table: dest
    type: ALL
    possible_keys: NULL
    key: NULL
    key_len: NULL
    ref: NULL
    rows: 1787219
    Extra: Using filesort
    1 row in set (0.00 sec)

How to speed up the query
We only need hotels within a 10-mile radius, so there is no need to scan the whole table.
Instead, search only a rectangle that encloses the 10-mile circle.

How to calculate needed coordinates
1° of latitude ~= 69 miles
1° of longitude ~= cos(latitude) * 69 miles
To calculate lon and lat for the rectangle:
    set lon1 = mylon - dist / abs(cos(radians(mylat)) * 69);
    set lon2 = mylon + dist / abs(cos(radians(mylat)) * 69);
    set lat1 = mylat - (dist / 69);
    set lat2 = mylat + (dist / 69);

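The rectangle arithmetic can be tried out with a small Python sketch. It uses the same 69 miles-per-degree approximation as the slide; the function name `bounding_box` and the order of the returned values are assumptions made for illustration.

```python
import math

MILES_PER_DEGREE = 69  # approximation used on the slide


def bounding_box(lat, lon, dist_miles):
    """Return (lat1, lat2, lon1, lon2) of a rectangle around the search circle."""
    # A degree of latitude is roughly constant in length.
    dlat = dist_miles / MILES_PER_DEGREE
    # A degree of longitude shrinks with cos(latitude).
    dlon = dist_miles / abs(math.cos(math.radians(lat)) * MILES_PER_DEGREE)
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon
```

At the equator a 69-mile radius gives one degree in each direction; at 60° latitude the same radius spans about two degrees of longitude, because cos(60°) is 0.5. The approximation breaks down close to the poles, where cos(latitude) approaches zero.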
Modify the query
    SELECT destination.*,
        3956 * 2 * ASIN(SQRT(
            POWER(SIN((origin.latitude - destination.latitude) * pi()/180 / 2), 2) +
            COS(origin.latitude * pi()/180) * COS(destination.latitude * pi()/180) *
            POWER(SIN((origin.longitude - destination.longitude) * pi()/180 / 2), 2)
        )) AS distance
    FROM users destination, users origin
    WHERE origin.id = userid
      AND destination.longitude BETWEEN lon1 AND lon2
      AND destination.latitude BETWEEN lat1 AND lat2

Stored procedure
    CREATE PROCEDURE geodist (IN userid int, IN dist int)
    BEGIN
        declare mylon double;
        declare mylat double;
        declare lon1 float;
        declare lon2 float;
        declare lat1 float;
        declare lat2 float;
        -- get the original lon and lat for the userid
        select longitude, latitude into mylon, mylat
        from users5 where id = userid limit 1;
        -- calculate lon and lat for the rectangle:
        set lon1 = mylon - dist / abs(cos(radians(mylat)) * 69);
        set lon2 = mylon + dist / abs(cos(radians(mylat)) * 69);
        set lat1 = mylat - (dist / 69);
        set lat2 = mylat + (dist / 69);

Stored Procedure, Contd
        -- run the query:
        SELECT destination.*,
            3956 * 2 * ASIN(SQRT(
                POWER(SIN((origin.latitude - destination.latitude) * pi()/180 / 2), 2) +
                COS(origin.latitude * pi()/180) * COS(destination.latitude * pi()/180) *
                POWER(SIN((origin.longitude - destination.longitude) * pi()/180 / 2), 2)
            )) AS distance
        FROM users destination, users origin
        WHERE origin.id = userid
          AND destination.longitude BETWEEN lon1 AND lon2
          AND destination.latitude BETWEEN lat1 AND lat2
        HAVING distance < dist
        ORDER BY distance
        LIMIT 10;
    END $$

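The procedure works in two stages: a cheap rectangle filter first, then the exact distance only for rows inside the rectangle. The Python sketch below mirrors that logic over an in-memory list. The function names and the `(name, lat, lon)` tuple layout are assumptions for illustration; in MySQL the rectangle test is what lets the optimizer use an index range scan, which a plain Python loop does not reproduce.

```python
import math


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the slide's radius of 3956."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 3956 * 2 * math.asin(math.sqrt(a))


def nearby(points, lat, lon, dist_miles, limit=10):
    """Return up to `limit` (distance, name) pairs within dist_miles, nearest first.

    points is an iterable of (name, lat, lon) tuples (hypothetical layout).
    """
    dlat = dist_miles / 69
    dlon = dist_miles / abs(math.cos(math.radians(lat)) * 69)
    hits = []
    for name, plat, plon in points:
        # Stage 1: rectangle test, the equivalent of the BETWEEN clauses.
        if not (lat - dlat <= plat <= lat + dlat):
            continue
        if not (lon - dlon <= plon <= lon + dlon):
            continue
        # Stage 2: exact distance, the equivalent of HAVING distance < dist.
        d = haversine_miles(lat, lon, plat, plon)
        if d < dist_miles:
            hits.append((d, name))
    hits.sort()          # ORDER BY distance
    return hits[:limit]  # LIMIT 10
```

The second stage is still needed because the corners of the rectangle lie farther away than the requested radius.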
Speed comparison
Test data: US and Canada zip code table, 800K records
Original query (full table scan): 8 seconds
Optimized query (stored procedure): 0.06 to 1.2 seconds, depending on the number of POIs/records in the given radius

Stored Procedure: Explain Plan
    mysql> CALL geodist(946842, 10)\G
    table: origin
    type: const
    key: PRIMARY
    key_len: 4
    ref: const
    rows: 1
    Extra: Using filesort

    table: destination
    type: range
    key: latitude
    key_len: 18
    ref: NULL
    rows: 25877
    Extra: Using where

Geo Search with Sphinx
Sphinx search (www.sphinxsearch.com) can perform geo distance searches since version 0.9.8.
It is possible to set up an "anchor point" in the API code, then use the "geodist" function and specify the radius.
Sphinx Search returns in 0.55 seconds for the test data, regardless of the radius and zip code:
    $ php test.php -i zipdist -s @geodist,asc
    Query '' retrieved 1000 matches in 0.552 sec.

Speed comparison of all solutions

Different Types of Coordinates
Decimal degrees (what we used): 37.3248 LAT, 121.9163 LON
Degrees-minutes-seconds (used in most GPSes): 37°19′29″N LAT, 121°54′59″E LON
Most GPSes can be configured to use decimal degrees
Other

Converting between coordinates
Degrees-minutes-seconds to decimal degrees: degrees + (minutes/60) + (seconds/3600)
    CREATE FUNCTION `convert_from_dms` (degrees INT, minutes INT, seconds INT)
    RETURNS double DETERMINISTIC
    BEGIN
        RETURN degrees + (minutes/60) + (seconds/3600);
    END $$
    mysql> select convert_from_dms(46, 20, 10) as DMS\G
    dms: 46.33611111

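The same conversion in Python, as a minimal sketch. The optional `hemisphere` argument is an assumption that is not part of the MySQL function: it applies the usual convention that southern latitudes and western longitudes are negative in decimal degrees.

```python
def convert_from_dms(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees-minutes-seconds to decimal degrees.

    hemisphere is a hypothetical extension: "S" and "W" give negative values.
    """
    value = degrees + minutes / 60 + seconds / 3600
    if hemisphere.upper() in ("S", "W"):
        value = -value
    return value
```

For example, `convert_from_dms(46, 20, 10)` reproduces the 46.33611111 shown in the MySQL output, and passing `"W"` for a western longitude returns the negated value.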
Geo Search with Full Text search
Sometimes we need BOTH geo search and full text search
Example 1: find the 10 nearest POIs with "school" in the name
Example 2: find the nearest streets whose name contains "OAK"
Create a FullText index and an index on LAT, LON:
    ALTER TABLE geonames ADD FULLTEXT KEY (name);
MySQL will choose which index to use

Geo Search with Full Text search: example
Grab POI data from www.geonames.org, upload it to MySQL, add a full text index
    -- @orig_lat, @orig_lon and @dist are user variables, set as in the hotel query
    mysql> SELECT destination.*,
        3956 * 2 * ASIN(SQRT(
            POWER(SIN((@orig_lat - destination.lat) * pi()/180 / 2), 2) +
            COS(@orig_lat * pi()/180) * COS(destination.lat * pi()/180) *
            POWER(SIN((@orig_lon - destination.lon) * pi()/180 / 2), 2)
        )) AS distance
    FROM geonames destination
    WHERE MATCH(name) AGAINST ('OAK' IN BOOLEAN MODE)
    HAVING distance < @dist
    ORDER BY distance
    LIMIT 10;

Geo Search with Full Text search: Explain
    mysql> explain SELECT destination.*, 3956 * 2 * ASIN(SQRT(POWER(SIN(…
    table: destination
    type: fulltext
    possible_keys: name_fulltext
    key: name_fulltext
    key_len: 0
    ref:
    rows: 1
    Extra: Using where; Using filesort

DEMO: Find POI near us
Use GPS
All POIs near GPS point
Match keyword

Using MySQL Spatial Extension
    CREATE TABLE `zipcode_spatial` (
        `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
        `zipcode` char(7) NOT NULL,
        …
        `lon` int(11) DEFAULT NULL,
        `lat` int(11) DEFAULT NULL,
        `loc` point NOT NULL,
        PRIMARY KEY (`id`),
        KEY `zipcode` (`zipcode`),
        SPATIAL KEY `loc` (`loc`)
    ) ENGINE=MyISAM;

Zipcode with Spatial Extension
    mysql> select zipcode, lat, lon, AsText(loc)
           from zipcode_spatial
           where city_name = 'Santa Clara' and state = 'CA' limit 1\G
    ****** 1. row ********
    zipcode: 95050
    lat: 373519
    lon: 1219520
    AsText(loc): POINT(1219520 373519)

Spatial Search: Distance
Spatial Extension: no built-in distance function
    CREATE FUNCTION `distance` (a POINT, b POINT)
    RETURNS double DETERMINISTIC
    BEGIN
        RETURN round(glength(linestringfromwkb(
            linestring(asbinary(a), asbinary(b)))));
    END $$
( forge.mysql.com/tools/tool.php?id=41 )

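As far as I can tell, the length of a two-point linestring here is the straight-line (planar) length in the units of the stored coordinates, not a great-circle distance in miles. A rough Python equivalent, under that assumption, is sketched below; `planar_distance` and the `(x, y)` tuple layout are illustrative names, and Python's `round` may treat exact .5 cases differently from MySQL's.

```python
import math


def planar_distance(a, b):
    """Rounded straight-line distance between two (x, y) points, in coordinate units."""
    dx = b[0] - a[0]
    dy = b[1] - a[1]
    return round(math.hypot(dx, dy))
```

With points stored as scaled integers such as POINT(1219520 373519), the result is in those scaled units, so any radius passed to a query has to be expressed in the same units.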
Spatial Search Example
    SELECT DISTINCT dest.zipcode,
        distance(orig.loc, dest.loc) AS sdistance
    FROM zipcode_spatial orig, zipcode_spatial dest
    WHERE orig.zipcode = '27712'
    HAVING sdistance < 10
    ORDER BY sdistance
    LIMIT 10;



Editor's Notes

  • #18: ADD EXPLAIN PLAN
  • #25: ADD EXPLAIN PLAN