Waiting for PostGIS 2.2 - ST_ClipByBox2D - Map dicing Redux in concert with ...

BostonGIS BostonGIS Home About Postgres OnLine Journal PostGIS in Action

Tuesday, September 30. 2014

Posted by Regina Obe in postgis postgresql, postgis_raster, waiting_postgis_22

Comments (0)
Trackbacks (0)

Waiting for PostGIS 2.2 - ST_ClipByBox2D - Map dicing Redux in concert with ST_Tile

One of the new features coming in PostGIS 2.2 is ST_ClipByBox2D (thanks to Sandro Santilli's recent commits funded by CartoDB ). However to take advantage of it, you are going to need your PostGIS compiled with GEOS 3.5+ (very recent build) which has not been released yet. Windows folks, the PostGIS 2.2 9.3 and 9.4 experimental binaries are built with the latest GEOS 3.5 development branch, so you should be able to test this out with Winnie's experimental builds.

Since the dawn of PostGIS, PostGIS users have needed to mutilate their geometries in often painful and horrific ways. Why is ST_ClipByBox2D function useful, because its a much faster way of mutilating your geometries by a rectangular mold than using ST_Intersection. Why would you want to mutilate your geometries? There are many reasons, but I'll give you one: As your geometry approaches the area of your bounding box and as your bounding box size decreases, the more efficient your spatial index becomes. You can consider this article, Map dicing redux of the article I wrote (eons ago) - Map Dicing and other stuff which describes the same approach with much older technology. Though I will be using more or less the same dataset Massachusetts TOWNSSURVEY_POLYM (its newer so you can't really compare) and I tried to simplify my exercise a bit (not resorting to temp tables and such), my focus in this article will be to compare the speed between the new ST_ClipByBox2D approach and the old ST_Intersection approach. The spoiler for those who don't have the patience for this exercise is that using ST_ClipByBox2D at least on my sample data set on my puny Windows 7 64-bit desktop using PostgreSQL 9.4beta2 64-bit was about 4-5 times faster than using ST_Intersection. There was a downside to this speedup. With the ST_Intersection approach, I had no invalid polygons. In the case of ST_ClipByBox2D, I had one invalid polygon. So as noted in the docs, use with caution. We'd be very interested in hearing other people's experiences with it.

One other benefit that ST_ClipByBox2D has over ST_Intersection which I didn't test out is that although ST_Intersection doesn't work with invalid geometries, ST_ClipByBox2D can.

For these exercises, I'm going to also abuse the PostGIS raster function ST_Tile for geometry use, cause for some strange reason, we have no ST_Tile function for PostGIS geometry. For those who missed the improper use of ST_Tile for raster, refer to Waiting for PostGIS 2.1 - ST_Tile.

ST_Tile usage (can be replaced with ST_PixelAsPolygons)

The code in both uses ST_Tile of an empty raster to create square boxes. This is expressed as a common table expression called grid. Sadly I realized that using a CTE instead of just inlining was about 50-100ms slower (seemed consistently). You probably also gain performance by using a temp table and indexing that. Boxes don't need to be square, they can be rectangular too. The 10, -10 you see is in this case saying each pixel is 10 meters by 10 meters (negative because in reverse of geometric space) per pixel. For this I could have just kept it at 1, -1 and it would be fine. The empty raster pixel width/height I set to 100x100 (and also experimented with 200x200) which would resolve to 1000x1000 meters, 2000x2000 meters respectively in Massachusetts State Plane geometric space. I have the timings in the comments of the chopping.

Now some people may wonder why I didn't opt for using ST_PixelAsPolygons raster function. I found the code slightly more unpleasant to look at, and the speed was slightly worse -- 29932 ms for my 1000x1000m (using ST_PixelAsPolygons) vs. 29092 ms (using ST_Tile). I suspect the reason might be the unneeded pixel value returned and the ST_AddBand call. ST_Tile works happily with an empty raster, but ST_PixelAsPolygons seems to require at least one band.

If you would like to compare replace the WITH grid AS (..)

WITH grid As (
SELECT ( ST_PixelAsPolygons(
    ST_AddBand(
     ST_MakeEmptyRaster(ceiling( (ST_XMax(ext) - ST_Xmin(ext))/1000 )::integer, 
   ceiling((ST_YMax(ext) - ST_Ymin(ext))/1000)::integer, ST_XMin(ext), 
  ST_YMax(ext), 1000, -1000,0,0, 26986), '1BB'::text,1,0) )
   ).geom As b
FROM (SELECT ST_Extent(geom) As ext from towns) As f )

ST_ClipByBox Code: Much faster

DROP TABLE IF EXISTS towns_grid_clipbox;
CREATE TABLE towns_grid_clipbox(id serial primary key, 
 town varchar(50), geom geometry(MUltiPolygon, 26986));
-- 9351 rows affected, 9920 ms execution time. (2000 m x 2000 m tiles)
-- 29117 rows affected, 29092 ms execution time. (1000 m x 1000 m tiles)
INSERT INTO towns_grid_clipbox(town, geom)
WITH grid As (
SELECT ST_Tile(
 ST_MakeEmptyRaster(
   ceiling( (ST_XMax(ext) - ST_Xmin(ext))/10 )::integer, 
   ceiling((ST_YMax(ext) - ST_Ymin(ext))/10)::integer, 
   ST_XMin(ext), ST_YMax(ext), 10, -10,0,0, 26986), 100,100
    )::geometry As b
FROM (SELECT ST_Extent(geom) As ext from towns) As f )
SELECT t.town, ST_Multi(ST_ClipByBox2D(t.geom, grid.b::box2d) )
FROM towns AS t INNER JOIN grid ON ST_Intersects(t.geom, grid.b) ;

ST_Intersection Code: Slower

DROP TABLE IF EXISTS towns_grid_inter;
CREATE TABLE towns_grid_inter(id serial primary key, 
 town varchar(50), 
 geom geometry(Multipolygon, 26986));
-- 9351 rows affected, 42544 ms execution time. (2000 m,2000 m tile)
-- 29117 rows affected, 129649 ms execution time (1000 m, 1000 m tile)

INSERT INTO towns_grid_inter(town, geom)
WITH grid As (
SELECT ST_Tile(
 ST_MakeEmptyRaster(
   ceiling( (ST_XMax(ext) - ST_Xmin(ext))/10 )::integer, 
   ceiling((ST_YMax(ext) - ST_Ymin(ext))/10)::integer, 
   ST_XMin(ext), ST_YMax(ext), 10, -10,0,0, 26986), 100,100
    )::geometry As b
FROM (SELECT ST_Extent(geom) As ext from towns) As f )
SELECT t.town, ST_Multi(ST_Intersection(t.geom, grid.b) )
FROM towns AS t INNER JOIN  grid ON ST_Intersects(t.geom, grid.b) ;

Validity Test

Though clip box was much faster, it produced one invalid geometry where as the ST_Intersection approach produced no invalid geometries. As such you probably want to check the validaty of your geometries after and run an ST_MakeValid on the invalid ones.

SELECT t.id, t.town, (d).reason, ST_AsText((d).location)
 from towns_grid_clipbox AS t, ST_IsValidDetail(t.geom) As d
 WHERE NOT ST_IsValid(geom) ;

 id   |   town   |         reason         |                st_astext
------+----------+------------------------+-----------------------------------------
15444 | WEYMOUTH | Ring Self-intersection | POINT(247863.732500002 889768.12460001)

Trackbacks

Trackback specific URI for this entry

No Trackbacks

Comments

Display comments as (Linear | Threaded)

No comments

Add Comment

Name
Email
Homepage
In reply to
Comment	E-Mail addresses will not be displayed and will only be used for E-Mail notifications. To prevent automated Bots from commentspamming, please enter the string you see in the image below in the appropriate input box. Your comment will only be submitted if the strings match. Please ensure that your browser supports and accepts cookies, or your comment cannot be verified correctly. Enter the string from the spam-prevention image above:
	Remember Information? Subscribe to this entry