3: Spatially Enabled Relational Databases
3.3: Spatial Analysis using SQL and PostGIS
You’ve been hired as a GIS consultant for the nonprofit Friends of Northampton Trails and Greenways. They want to improve drainage on their bike trails by installing some infrastructure on the sides of their trails (gravel, drainage ditches, etc). The bike trails they maintain go through some sensitive habitat however, and they need to get an estimate for how much sensitive habitat could be affected by this project. They’ve asked you to find the answer to this question:
What is the total area of conserved lands within 100m of their bike paths in Easthampton, Northampton, Hadley, and Amherst?
You are probably accustomed to answering a question like this using ArcGIS or QGIS. This exercise shows how we can use PostGIS and a relational database to perform spatial analyses similar to what we do in desktop GIS programs. You’ll start with three layers from MassGIS:
- “Town Boundaries” – all towns in Massachusetts
- “Bicycle Trials” – all bike trails in Massachusetts
- “Openspace” – all conserved land in Massachusetts
From these layers, you will generate…
- …a layer representing our study area (Easthampton, Northampton, Hadley, and Amherst)
- …a layer of bike trails that fall within the study area
- …a 100m buffer around these bike trails
- …a layer representing conserved areas that overlap this buffer.
Finally, you’ll calculate the total area of the last layer.
In this exercise, you’ll use three PostGIS tools – ST_Buffer(), ST_Intersects(), and ST_Intersection(). Look up these tools in the PostGIS documentation (http://postgis.net/documentation/) and try to answer the following questions:
- what are the inputs that each tool takes? In other words, what arguments does each tool take?
- what is the output of each tool?
When you’ve answered those questions, try to match each tool with the step that you’ll use it in. Note that one step does not require any PostGIS tool – you can do it with regular old SQL!
Step | Which Tool? |
---|---|
Select 4 towns in Massachusetts (Easthampton, Northampton, Hadley, Amherst) | _ |
Select bike trails inside these 4 towns | _ |
Create buffer around these trails | _ |
Create layer of overlap between buffers and conserved lands | _ |
Options:
ST_Intersects()
ST_Intersection()
ST_Buffer()
No PostGIS tool required
(see A in Answers below)
Upload data to PostgreSQL
- In pgAdmin, create a spatially-enabled database (i.e., a database with the PostGIS extension added) for this exercise
- Open QGIS
- Connect to the MassGIS WFS
- Add the layers “Bicycle Trails”, “Openspace”, and “Town Boundaries” to the map
- Upload these layers to your new PostgreSQL database using DBManager
Note: if you’ve forgotten how to do any of these steps, consult the GitBook exercises for previous weeks, especially section 4.3: Key Postgres and PostGIS steps from Weeks 2 and 3
Create study area from towns layer
In pgAdmin, Open the SQL window and write a query to select the towns of Easthampton, Northampton, Hadley, and Amherst. Here’s something to help get you started:
SELECT * FROM “Town Boundaries” WHERE “TOWN” = ‘EASTHAMPTON’ OR __________
Note: do not try to copy and paste this code into pgAdmin. You must use straight quotes instead of the curly quotes shown here.
Fill in the blank, and you’re all set! (for the solution see B in “Answers”)
Instead of simply selecting these four towns, we’d like to make a new table/layer out of them:
Insert “CREATE TABLE
_______
AS” before the previous SQL statement, where______
is whatever name you’d like to give to the new table. I suggest studyarea. You should see your new table appear in pgAdmin (remember to refresh)Connect to your database in QGIS, and add the new layer. Do you see the four towns in the study area?
Select bicycle trails in our study area
- Back in pgAdmin, open the SQL window.
Write the following query to create a new table of bicycle trails that overlap (i.e., intersect) the study region, filling in the blanks where necessary (see C in Answers):
CREATE TABLE __________ AS SELECT “Bicycle Trails”.id, “Bicycle Trails”.geom, ___________.”TRAILNAME” FROM “Bicycle Trails”, “studyarea” WHERE ST_Intersects(“Bicycle Trails”.geom, ______________)
If you’re stuck on this last blank, check out the PostGIS documentation for the tool ST_Intersects()
Add the new table to the map in QGIS to verify that the query functioned as planned.
Create a 100m buffer around the bike trails
- In pgAdmin, open the SQL window.
- Write the following query to create a new table of buffered bike trails in our area, filing in the blanks where necessary (see D in Answers):
Note: check out the PostGIS documentation on the buffer tool if you’re stuck on the last two blanks.CREATE TABLE ___________ AS SELECT “__________”.id ST_Buffer(“__________”.geom, ____) AS geom, “__________”.TRAILNAME FROM “_________”
- Add the new table to the map in QGIS to verify that the query functioned as planned.
Create a table/layer showing where the bike trail buffers and conserved land overlap
- In pgAdmin, open the SQL window.
Write the following query to create a new table/layer which shows the overlap between the bike trail buffers and conserved land. In the output table, make sure you include the column “PRIM_PURP” from the table/layer “Openspace” – we’ll use this column in a future query (see E in Answers).
CREATE TABLE ____________ AS SELECT “___________”.___________ ________________________________________________ AS geom, “___________”.___________ “___________”.___________ FROM “__________”, “__________” WHERE ST_Intersects(“_________”.geom, “________”.geom)
Note: check out the PostGIS documentation on ST_Intersects() and ST_Intersection() – they are different, and the difference is pretty important!
Another note: the “WHERE” clause is technically unneccesary for this query. However, remember that “Openspace” covers all of Massachusetts. The “WHERE” statement tells PostGIS to include only areas of “Openspace” that touch the bike trails, thereby significantly reducing the processing time. In QGIS, check to make sure this layer makes sense!
Find the total area of conserved lands within 100m of bike trails
- Open the SQL window in pgAdmin
- Use the following query to calculate the sum the area of all polygons in the layer you created in the last step (I called my layer “consland_bike”)
(see F in Answers)SELECT SUM(ST_Area(____________)) FROM “consland_bike”
Find total area of conserved lands without recreational access within 100m of bike trails
Try doing this one completely on your own! Here’s a hint: It will look a lot like the last query, but you will be incorporating the column “PRIM_PURP” from the table “consland_bike”. Any polygon whose value for “PRIM_PURP” is “C” (which stands for conservation as opposed to recreation) is not legally accesed for recreational purposes. Another hint: you should be able to figure out by looking at the GitBook from the first week we used PostgreSQL. (see G in Answers)
Conclusion
At this point, you may be thinking, “Why would I write all of these cumbersome SQL/PostGIS queries when I could do this same thing using the much simpler graphic interfaces of QGIS or ArcGIS?” If Friends of Northampton Trails and Greenways were only interested in answering this one question (What is the total area of conserved lands within 100m of their bike paths?), using QGIS or ArcGIS would be just fine. But, in the future, they might need to find out the total area of conserved lands within 100m of their hiking trails. Or, maybe they’ll add a new bike trail, or the Kestrel land trust will buy more land for conservation, and they’ll need to recalculate the answer to this question. Or, in the open-source spirit, they want to share the ability to answer this question with other trail nonprofits in Massachusetts, the US, or across the world.
In these scenarios, simply clicking through the tools would not be enough – rather, we need a way to automate the process. Automating a process in GIS lets users repeat the same sequence of tools on any data sets they choose without having to redo the process every time. You can automate a process in ArcGIS or QGIS using Model Builder. Using an ArcGIS or QGIS model, however, requires having both the software and a staff member who can use the software. PostGIS gives us a lot more flexibility. You can (and will) use JavaScript to send this type of query to your PostgreSQL database and display the results on a web map. Make this process into a JavaScript web mapping application, and suddenly anyone with an Internet browser can perform this analysis with the click of a button.
Answers
A.
Select 4 towns in Massachusetts (Easthampton, Northampton, Hadley, Amherst): this can be done with SQL tools – select these four towns using the column of town names
Select bike trails inside these 4 towns: ST_Intersects()
Create buffer around these trails: ST_Buffer()
Create layer of overlap between buffers and conserved lands: ST_Intersection()
B.
CREATE TABLE “studyarea”
SELECT * FROM “Town Boundaries”
WHERE “TOWN” = ‘EASTHAMPTON’ OR “TOWN” = ‘NORTHAMPTON’ OR “TOWN” = ‘HADLEY’ OR “TOWN” = ‘AMHERST’
C.
CREATE TABLE “local_biketrails” AS
SELECT
“Bicycle Trails”.id,
“Bicycle Trails”.geom,
“Bicycle Trails”.”TRAILNAME”
FROM “Bicycle Trails”, “studyarea”
WHERE ST_Intersects(“Bicycle Trails”.geom, “studyarea”.geom)
D.
CREATE TABLE “trail_buffers” AS
SELECT
“local_biketrails”.id
ST_Buffer(“local_biketrails”.geom, 100) AS geom,
“local_biketrails”.TRAILNAME
FROM “local_biketrails”
E.
CREATE TABLE “consland_bike” AS
SELECT
“trail_buffers”.id,
ST_Intersection(“trail_buffers”.geom, “Openspace”.geom) AS geom,
“Openspace”.”NAME”,
“Openspace”.”PRIM_PURP”
FROM
“trail_buffers”, “Openspace”
WHERE
ST_Intersects(“trail_buffers”.geom, “Openspace”.geom)
F.
SELECT
SUM(ST_Area(“consland_bike”.geom))
FROM “consland_bike”
Answer: 1527608.51….. m^2
G.
SELECT
SUM(ST_Area(“consland_bike”.geom)),
“consland_bike”.”PRIM_PURP”
FROM “consland_bike”
GROUP BY “publiclands2”.”PRIM_PURP”
Answer: 666793.64….….. m^2