I’ve got a project where I need to build a store locator for a client.
I’m using a custom post type “restaurant-location
” and I have written the code to geocode the addresses stored in postmeta using the Google Geocoding API (heres the link that geocodes the US White House in JSON and I’ve stored the latitude and longitude back to custom fields.
I’ve written a get_posts_by_geo_distance()
function that returns a list of posts in order of those which are closest geographically using the formula I found in the slideshow at this post. You might call my function like so (I’m starting with a fixed “source” lat/long):
include "wp-load.php";
$source_lat = 30.3935337;
$source_long = -86.4957833;
$results = get_posts_by_geo_distance(
'restaurant-location',
'geo_latitude',
'geo_longitude',
$source_lat,
$source_long);
echo '<ul>';
foreach($results as $post) {
$edit_url = get_edit_url($post->ID);
echo "<li>{$post->distance}: <a href="{$edit_url}" target="_blank">{$post->location}</a></li>";
}
echo '</ul>';
return;
Here’s the function get_posts_by_geo_distance()
itself:
function get_posts_by_geo_distance($post_type,$lat_key,$lng_key,$source_lat,$source_lng) {
global $wpdb;
$sql =<<<SQL
SELECT
rl.ID,
rl.post_title AS location,
ROUND(3956*2*ASIN(SQRT(POWER(SIN(({$source_lat}-abs(lat.lat))*pi()/180/2),2)+
COS({$source_lat}*pi()/180)*COS(abs(lat.lat)*pi()/180)*
POWER(SIN(({$source_lng}-lng.lng)*pi()/180/2),2))),3) AS distance
FROM
wp_posts rl
INNER JOIN (SELECT post_id,CAST(meta_value AS DECIMAL(11,7)) AS lat FROM wp_postmeta lat WHERE lat.meta_key='{$lat_key}') lat ON lat.post_id = rl.ID
INNER JOIN (SELECT post_id,CAST(meta_value AS DECIMAL(11,7)) AS lng FROM wp_postmeta lng WHERE lng.meta_key='{$lng_key}') lng ON lng.post_id = rl.ID
WHERE
rl.post_type='{$post_type}' AND rl.post_name<>'auto-draft'
ORDER BY
distance
SQL;
$sql = $wpdb->prepare($sql,$source_lat,$source_lat,$source_lng);
return $wpdb->get_results($sql);
}
My concern is that the SQL is about as non-optimized as you can get. MySQL can’t order by any available index since the source geo is changeable and there’s not a finite set of source geos to cache. Currently I’m stumped as to ways to optimize it.
Taking into consideration what I’ve done already the question is: How would you go about optimizing this use-case?
It’s not important that I keep anything I’ve done if a better solution would have me throw it out. I’m open to considering almost any solution except for one that requires doing something like installing a Sphinx server or anything that requires a customized MySQL configuration. Basically the solution needs to be able to work on any plain vanilla WordPress install. (That said, it would be great if anyone wants to list alternate solutions for others who might be able to get more advanced and for posterity.)
Resources Found
FYI, I did a bit of research on this so rather than have you do the research again or rather than have you post any of these links as an answer I’ll go ahead and include them.
- http://jebaird.com/blog/calculating-distance-miles-latitude-and-longitude
- http://wordpress.org/extend/plugins/geolocation/screenshots/
- http://code.google.com/apis/maps/articles/phpsqlsearch.html
- http://www.rooftopsolutions.nl/blog/229
- http://planet.mysql.com/entry/?id=18085
- http://blog.peoplesdns.com/archives/24
- http://www.petefreitag.com/item/622.cfm
- http://www.phpro.org/tutorials/Geo-Targetting-With-PHP-And-MySQL.html
- http://forum.geonames.org/gforum/posts/list/692.page
- http://forums.mysql.com/list.php?23
- http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL
- http://developer.yahoo.com/maps/rest/V1/geocode.html
- http://geocoder.us/
Regarding Sphinx Search
- http://sphinxsearch.com/
- https://launchpad.net/wp-sphinx-plugin
- http://forums.site5.com/showthread.php?t=28981
- http://wordpress.org/extend/plugins/wordpress-sphinx-plugin/
- http://wordpress.org/extend/plugins/sphinx-search/
- http://www.mysqlperformanceblog.com/2008/02/15/mysql-performance-blog-now-uses-sphinx-for-site-search/
What precision do you need? if it’s a state/national wide search maybe you could do a lat-lon to zip lookup and have precomputed distance from zip area to zip area of the restaurant. If you need accurate distances that won’t be a good option.
You should look into a Geohash solution, in the Wikipedia article there is a link to a PHP library to encode decode lat long to geohashs.
Here you have a good article explaining why and how they use it in Google App Engine (Python code but easy to follow.) Because of the need to use geohash in GAE you can find some good python libraries and examples.
As this blog post explains, the advantage of using geohashes is that you can create an index on the MySQL table on that field.
This might be too late for you, but I’m going to reply anyway, with a similar answer as I gave to this related question, so future visitors can refer to both questions.
I would not store these values in the post metadata table, or at least not only there. You want a table with
post_id
,lat
,lon
columns, so you can place an index oflat, lon
and query on that. This should not be too hard to keep up to date with a hook on post save and update.When you query the database, you define a bounding box around the starting point, so you can do an efficient query for all
lat, lon
pairs between the North-South and East-West borders of the box.After you get this reduced result, you can do a more advanced (circular or actual driving directions) distance calculation to filter out the locations that are in the corners of the bounding box and therefor further away than you desire.
Here you find a simple code example that works in the admin area. You need to create the extra database table yourself. The code is ordered from most to least interesting.
I’m late to the party on this one, but looking back at this, the
get_post_meta
is really the problem here, rather than the SQL query you’re using.I recently had to do a similar geo lookup on a site that I run, and rather than using the meta table to store lat and lon (which requires at best two joins to look up and, if you’re using get_post_meta, two additional database queries per location), I created a new table with a spatially indexed geometry POINT data type.
My query looked a lot like yours, with MySQL doing a lot of the heavy lifting (I left out the trig functions and simplified everything to two-dimensional space, because it was close enough for my purposes):
where $client_location is a value returned by a public geo IP lookup service (I used geoio.com, but there are a number of similar ones.)
It may seem unwieldy, but in testing it, it consistently returned the nearest 5 locations out of a 80,000-row table in under .4 sec.
Until MySQL rolls out the DISTANCE function that’s being proposed, this seems like the best way I found to implement location lookups.
EDIT: Adding the table structure for this particular table. Its a set of property listings, so it may or may not be similar to any other use-case.
The
geolocation
column is the only thing relevant for the purposes here; it consists of x(lon),y(lat) coordinates that I just look up from the address upon importing new values into the database.Just pre-calculate the distances between all entities. I would store that into a database table on it’s own, with the ability to index values.