Shapefile Tiles with PHP and GD

Last week, I demonstrated some tools for importing Shapefile data into an SQL database. It’s time to put that stuff to work.

If you’re in a rush, here’s the final demo: translucent state overlays on a Google Map. It’s not the quickest thing in the world, nor is it entirely quirk-free; it exists as an initial demonstration that will be improved upon in future articles. And it’s likely you can think of some interesting state-based data that might make a good mashup.

Choices

There are two big decisions that needed to be made in assembling this project. Both deserve mention, as future demos may explore the alternatives.

The first choice is which image library to use. I’ve chosen to use GD here because it is the simplest for most readers. It has calls built directly into PHP, which makes it a snap to get started on. The downside of GD is that its support for alpha-channel techniques is extremely poor. Even using this awkward but ingenious hack, I wasn’t able to get the kind of crisp, antialiased polygons that ImageMagick renders. If you view one of the demo’s individual tiles, you’ll see that it’s a single-colour GIF with just the kind of rough edge that you’d expect from binary transparency.

The second choice is which Maps API method to use. Again, I’ve picked the simplest option: GTileLayer. Although it may seem intimidating, it actually requires a surprisingly small amount of configuration to get off the ground. Also, readers of the book will recognize a lot of crossover between this demo and the custom tiles one from chapter 7.

The difficulty with this choice is that every new tile layer added introduces (at typical screen resolutions) a dozen or more img elements to the DOM tree. It’s okay for the first one or two, but after that you leave your users’ browsers begging for air. This limitation can be overcome (of course) by offering only one or two overlays; rather than serving each state as its own layer, combine them all into a single mega-overlay. Of course, to do that, you’re forced to use PNGs over GIFs—if you want to show different regions at different opacity levels.

The alternative to using GTileLayer is to build up your own GOverlay from scratch. This is a more complex affair, but it gives you a finer grain of control. Rather than loading up the page with piles of mostly-empty image tags, you can have exactly as many as you want per overlay. One state, one image, fifty images total… perfect, right? Sort of. Once the user starts zooming in, those images get larger and larger and quickly become unmanageable. Instead of serving a handful of tiles to the user, you’re serving fifty images that might each be more than a thousand pixels in each dimension!

Clearly, the solution in this case is to have the program make a judgement call between a GOverlay and a GTileLayer, depending on how big the region is at the current zoom level. But that’s a trick for the future.

For now, how can we get off the ground with GD and GTileLayer?

The List

Readers will recognize the layout of this mashup from a chapter 6 example. The markup and styles haven’t changed substantially from what was presented then.

The first big change is the code generating the list for the sidebar. It’s still located in map_data.php, but now it’s fetching this data out of last week’s shape_polygons table, with the following query:

SELECT code,
min(latitude_min) as latitude_min,
max(latitude_max) as latitude_max,
min(longitude_min) as longitude_min,
max(longitude_max) as longitude_max
FROM shape_polygons GROUP BY code ORDER BY code ASC

If you’ve never seen a GROUP BY clause before, you might need to read up a bit, but the basic gist of this isn’t hard. It simply takes the thousands of polygons in the table, groups them together by what code (US state) they’re in, and then returns the extents of all the geometry in that group. [1]
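For anyone who finds GROUP BY easier to follow as a loop, here is a rough JavaScript sketch of the same aggregation. It is only an illustration of what the database does for us: the function name extentsByCode and the sample numbers are made up, and the 'US.States.Hawaii' code is a guess by analogy with the Texas code that appears later on.

```javascript
// Illustrative rows only: the numbers are invented, and the field names
// simply mirror the columns read by the query above.
const samplePolygons = [
  { code: 'US.States.Texas', latitude_min: 25.8, latitude_max: 36.5, longitude_min: -106.6, longitude_max: -93.5 },
  { code: 'US.States.Hawaii', latitude_min: 18.9, latitude_max: 20.3, longitude_min: -156.1, longitude_max: -154.8 },
  { code: 'US.States.Hawaii', latitude_min: 21.2, latitude_max: 21.7, longitude_min: -158.3, longitude_max: -157.6 }
];

// Hypothetical helper: one output row per code, holding the combined extents.
function extentsByCode(rows) {
  const extents = new Map();
  for (const row of rows) {
    const seen = extents.get(row.code);
    if (!seen) {
      // First polygon for this code: start from its own bounding box.
      extents.set(row.code, { ...row });
    } else {
      // Later polygons only ever widen the box.
      seen.latitude_min = Math.min(seen.latitude_min, row.latitude_min);
      seen.latitude_max = Math.max(seen.latitude_max, row.latitude_max);
      seen.longitude_min = Math.min(seen.longitude_min, row.longitude_min);
      seen.longitude_max = Math.max(seen.longitude_max, row.longitude_max);
    }
  }
  // Equivalent of ORDER BY code ASC.
  return [...extents.values()].sort((a, b) => (a.code < b.code ? -1 : a.code > b.code ? 1 : 0));
}

console.log(extentsByCode(samplePolygons));
```

In practice the database is the right place for this work, since it avoids shipping thousands of rows to the script just to throw most of them away.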

Finally, it renders the result down to a simple JS object, just as previous iterations of map_data.php did. You can see the results by visiting the live script.

Tilesets and the API

In map_functions.js, I’ve used a global array called activeOverlays to keep a list of all the overlays currently on the map. This is convenient because I can simply key them to the database’s code field; to check whether an overlay is currently displayed, I just check whether activeOverlays[code] evaluates to a non-false value. In the click handler for the sidebar buttons, that condition splits between showing and hiding a given overlay:

function regionListClickHandler() {
    var code = this.parentNode.id;
    if (!activeOverlays[code])
    {
        var tilelayer = new GTileLayer(new GCopyrightCollection(''));
        tilelayer.getTileUrl = function(tile, zoom) {
            return 'cache/' + code + '/' + tile.x + '-' + tile.y + '-' + zoom + '.gif';
        }
        tilelayer.isPng = function() { return false; }
        tilelayer.getOpacity = function() { return 0.4; }
        activeOverlays[code] = new GTileLayerOverlay(tilelayer);
        map.addOverlay(activeOverlays[code]);
        this.parentNode.className = 'visible';
    }
    else
    {
        map.removeOverlay(activeOverlays[code]);
        activeOverlays[code] = null;
        this.parentNode.className = '';
    }

    return false;
}

The important thing to note here is the hardcoded opacity value. If you were displaying actual data using this method, you could have that getOpacity function hit up the regions array (or some other source) to get a value that reflects a concentration or some other factor.
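As a rough sketch of that idea, the function below scales a data value into an opacity range. Everything in it is an assumption for illustration: the name opacityForValue, the default range of 0.1 to 0.7, and the linear scaling are not part of the demo.

```javascript
// Hypothetical helper: map a value in [min, max] linearly onto an opacity range.
function opacityForValue(value, min, max, lowOpacity = 0.1, highOpacity = 0.7) {
  if (max === min) {
    // No spread in the data, so settle on the middle of the range.
    return (lowOpacity + highOpacity) / 2;
  }
  // Clamp so out-of-range values cannot produce an invalid opacity.
  const t = Math.min(1, Math.max(0, (value - min) / (max - min)));
  return lowOpacity + t * (highOpacity - lowOpacity);
}

// Made-up figures: a region with value 30 on a scale from 10 to 50.
console.log(opacityForValue(30, 10, 50));
```

Inside the click handler, getOpacity could then return something like opacityForValue(regionValues[code], lowest, highest), where regionValues is whatever per-state data source you have loaded.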

The other important thing is the getTileUrl function. It will end up returning a URL like cache/US.States.Texas/3-6-4.gif, which looks not like a script, but like a real file sitting there. And that brings us to…

Caching

Generating an image involving a massive polygon is an expensive task. There’s no need to do it more than once. Once the image has been generated the first time, successive calls needn’t even invoke PHP; they can simply grab the image directly from the webserver. To set this up, I needed a rewrite rule in my .htaccess file, like so:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^cache/(.*\.(png|gif))$ drawtile.php?tile=$1 [L]

The third line there funnels all image requests inside the cache folder to a single drawtile.php script. The previous line tells it “but only do this if there isn’t already the exact file sitting right there.”

It now becomes the responsibility of drawtile.php to not only generate the image, but to save it into the correct location for future requests.
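Before it can draw or save anything, the script has to pull the region code and tile numbers back out of the tile parameter. Here is a sketch of that parsing step, written in JavaScript to match map_functions.js (drawtile.php would need the PHP equivalent). The function name parseTileParam and the strict pattern are my own assumptions; the strictness is deliberate, because the region code ends up in both a file path and an SQL query.

```javascript
// Hypothetical parser for values like 'US.States.Texas/3-6-4.gif'.
// The region code must be dot-separated alphanumeric words, which rules out
// quotes, slashes and '..' before the value goes anywhere near SQL or the disk.
const TILE_PATTERN = /^([A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*)\/(\d+)-(\d+)-(\d+)\.(gif|png)$/;

function parseTileParam(tile) {
  const match = TILE_PATTERN.exec(tile);
  if (!match) {
    return null; // Caller should answer with a 404 rather than draw anything.
  }
  return {
    set: match[1],
    x: parseInt(match[2], 10),
    y: parseInt(match[3], 10),
    zoom: parseInt(match[4], 10),
    extension: match[5]
  };
}

console.log(parseTileParam('US.States.Texas/3-6-4.gif'));
console.log(parseTileParam("US'; DROP TABLE x/3-6-4.gif")); // rejected
```

Rejecting anything that does not match also keeps stray requests from filling the cache folder with junk files.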

Tile Drawing

Mercifully, the drawing of the tiles itself is not too difficult. My method in drawtile.php is inefficient and wasteful, but it gets the job done… and remember, the job only has to happen once. [2]

To draw a tile, we must grab all the points from the polygons that intersect it, using a query such as this one:

SELECT p.id, v.ordering, v.latitude, v.longitude
FROM shape_polygons AS p
LEFT JOIN shape_vertices AS v ON p.id = v.polygon_id
WHERE p.code = '$set'
AND NOT (latitude_max < $tile_bottom OR latitude_min > $tile_top
OR longitude_max < $tile_left OR longitude_min > $tile_right)
ORDER BY p.id, v.ordering

What does it do? It grabs all the points belonging to the region of interest, but it discards any whose whole polygon lies outside the tile’s bounding box. This is a good optimization that comes practically for free; even when you’re drawing part of a state on a tile, many states have little coastal islands (or whatever) that can be ignored while drawing the inland bits.
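To make the bounding-box test concrete, here is a sketch of where the four tile edges might come from and how the overlap check works. It assumes the usual spherical Mercator tile numbering (zoom 0 is one tile covering the world, with x and y counted from the top-left corner); the function names are hypothetical, and the demo itself leans on the book's utility class rather than this code.

```javascript
// Hypothetical helper: geographic edges of a tile under spherical Mercator.
function tileBounds(x, y, zoom) {
  const tilesAcross = Math.pow(2, zoom);
  const longitudeOf = (column) => (column / tilesAcross) * 360 - 180;
  const latitudeOf = (row) =>
    (Math.atan(Math.sinh(Math.PI * (1 - (2 * row) / tilesAcross))) * 180) / Math.PI;
  return {
    left: longitudeOf(x),
    right: longitudeOf(x + 1),
    top: latitudeOf(y),       // rows count downward, so row y is the northern edge
    bottom: latitudeOf(y + 1)
  };
}

// Same logic as the NOT (...) clause in the query: two boxes overlap unless
// one lies entirely above, below, left or right of the other.
function polygonTouchesTile(polygon, tile) {
  return !(
    polygon.latitude_max < tile.bottom ||
    polygon.latitude_min > tile.top ||
    polygon.longitude_max < tile.left ||
    polygon.longitude_min > tile.right
  );
}

const tile = tileBounds(3, 6, 4); // the 3-6-4 tile from the example URL
console.log(tile);
// Made-up bounding boxes, one inside the tile and one far to the west.
console.log(polygonTouchesTile({ latitude_min: 29, latitude_max: 31, longitude_min: -100, longitude_max: -95 }, tile));
console.log(polygonTouchesTile({ latitude_min: 18, latitude_max: 21, longitude_min: -160, longitude_max: -154 }, tile));
```

Under that assumption, tile 3-6-4 runs from -112.5 to -90 degrees of longitude and from roughly 22 to 41 degrees of latitude, which is consistent with it being a Texas tile.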

In terms of the drawing itself, having got all the points in a sequential array (from the query), it’s just a matter of breaking them up into their polygon groups and calling the relevant GD functions in PHP. The conversion from latitude and longitude co-ordinates to pixels is handled entirely by the Google Maps Utility class that was introduced in the book.
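The two steps described above can be sketched as follows. This is not the code from drawtile.php or the book's utility class: it swaps in the standard spherical Mercator formula, assumes 256-pixel tiles, and uses hypothetical function names, in JavaScript for consistency with map_functions.js.

```javascript
// Split the ordered query rows into one point list per polygon id.
// Relies on the rows arriving sorted by polygon id, then vertex ordering.
function groupByPolygon(rows) {
  const polygons = [];
  let current = null;
  for (const row of rows) {
    if (row.latitude === null || row.longitude === null) {
      continue; // a LEFT JOIN can yield a polygon row with no vertex attached
    }
    if (!current || current.id !== row.id) {
      current = { id: row.id, points: [] };
      polygons.push(current);
    }
    current.points.push({ latitude: row.latitude, longitude: row.longitude });
  }
  return polygons;
}

// Convert a latitude/longitude pair to pixel offsets within one tile.
function toTilePixel(latitude, longitude, tileX, tileY, zoom, tileSize = 256) {
  const worldSize = tileSize * Math.pow(2, zoom); // whole map width in pixels
  const worldX = ((longitude + 180) / 360) * worldSize;
  const sinLat = Math.sin((latitude * Math.PI) / 180);
  const worldY = (0.5 - Math.log((1 + sinLat) / (1 - sinLat)) / (4 * Math.PI)) * worldSize;
  // Subtract the tile's own origin to get coordinates local to the image.
  return { x: Math.round(worldX - tileX * tileSize), y: Math.round(worldY - tileY * tileSize) };
}

// Made-up rows in the shape the query returns.
const rows = [
  { id: 7, ordering: 0, latitude: 30, longitude: -100 },
  { id: 7, ordering: 1, latitude: 31, longitude: -99 },
  { id: 9, ordering: 0, latitude: 29, longitude: -95 }
];
for (const polygon of groupByPolygon(rows)) {
  console.log(polygon.id, polygon.points.map((p) => toTilePixel(p.latitude, p.longitude, 3, 6, 4)));
}
```

Each resulting list of pixel pairs is what a GD call such as imagefilledpolygon expects, once flattened into a single array of alternating x and y values. Points that fall outside the tile simply come out with negative or too-large offsets and get clipped by the drawing library.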

Wrapup

Obviously, there are countless variations to be made on this theme. Mapping state boundaries is quite dull, really, but I hope it helps open your eyes to the possibilities of region data. There’s an almost absurd amount of state-based information available at the US Census site; how about doing a bit of parsing on these pages? A mashup that compares the distribution of public servant salaries in different states? Or one that compares state tax revenues?

Tired of states? Move on to any of the multitude of topics covered by National Atlas data. Want to go outside America? Try mixing up some data from eurostat with these international shapefiles.

This has been a brief explanation, but hopefully, between reading the source and asking us questions, you should be able to piece together how it went down. (Questions in the comment area are best; that way everyone benefits… but you can always email us, if you’d prefer.)

The Files Involved

Notes

1. The one place this trick very seriously breaks down is in dealing with Alaska. Since it spans the Date Line, its apparent range of longitude is from -180 all the way to +180. Although the API is good at handling Date Line weirdness, our own code is a bit more frail. I welcome any suggestions for elegant solutions to this difficulty. (For the moment, the range values aren’t even in use, but they could become part of an important optimization in the near future.)

2. To verify this for yourself, try panning/zooming the map to some area where the tiles are coming in really slowly, suggesting that they’re actually being generated. Then, open up a different browser (or clear your cache) and surf back to that same spot. You should see the same area come into view much more quickly. The GIF files themselves are tiny; it’s just the initial generation of them that takes a lot of time.