I’ve been looking hard at datastores recently, in particular PostgreSQL + PostGIS. In school, I did a few GIS related things in class and I assumed it was magic. It turns out there is no magic.

At work, we have an application that accepts polygons and returns objects whose location are within that polygon. In GIS-speak this is often called an Area of Interest or AOI and they’re often represented as a few data types: GeoJSON, Well-known Text and Well-known Binary (WKT/WKB).

So what happens when the request is accepted and how does PostGIS make this all work?

Here’s a graphical representation of my GeoJSON (the ‘raw’ link should show the GeoJSON document):

https://gist.github.com/andyhky/b1a41c61e3368e4457471a0152f508df

And the same as WKT:

https://gist.github.com/andyhky/8591d9574b4571d125a26fa2c803794a

In the PostGIS world, geospatial data is stored as a geometry type.

Here’s a query returning WKT->Geometry:

https://gist.github.com/andyhky/293d1a868e57138ad6fc8e27448c1c60

The PostGIS geometry datatype is an lwgeom struct.

So, to find all of the objects in a given AOI, the SQL query would look something like this ( PostGIS FAQ):

https://gist.github.com/andyhky/2b26fb5b42b0b58e86680c4ce0e1c778

OK, but what does the && operator do? Thanks to this awesome dba.stackexchange answer, we finally get some answers! In a nutshell:

  • && is short for geometry_overlaps
  • geometry_overlaps passes through several layers of abstraction (the answer did not show them all)
  • in the end, floating point comparisons see if the object in the database’s polygon is within the requested polygon