Introduction to GIS and ABM

Lesson 4 with Nick Magliocca


Lesson Goals

This tutorial will introduce you to:

  1. Pros/Cons of introducing GIS data to your models
  2. Downloading, cleaning, and importing GIS data into models
  3. Implementing ABMs with GIS data


Quick reminder: GIS data types and formats

With the GIS extension, you can use vector data stored as shapefiles and raster data stored as ASCII grid or TIFF files.

Using GIS with ABMs: Benefits

Using GIS with ABMs: Costs


Let’s Make a Model!

Overview of Model Process

Scoping the model

A note about this model …

For this exercise we focus only on a sample model that brings GIS and CSV data into NetLogo, based on the themes above. This base model does not simulate the spread of the disease and cannot support those experiments on its own. However, I encourage you to build on it to do so.

See Dr. Andrew Crooks’s work on Ebola

What data do we need?

Reminder about Shapefiles

A shapefile is actually a set of component files that share the same base name. Three are required (.shp, .shx, .dbf); the others (.prj, .sbn, .sbx, .cpg, .xml) are optional, although this lesson also relies on the .prj file for the projection.

Pro Tip: Try to keep all of these together in the same folder!

Reminder about Tabular Data

Use CSV files for structured data in text format separated by commas.

Viewed in text editor:

District,Province,Capital,Area,Population
Kailahun,Eastern,Kailahun,3859,358190
Kenema,Eastern,Kenema,6053,497948

View in tabular form:

District   Province   Capital    Area   Population
Kailahun   Eastern    Kailahun   3859   358190
Kenema     Eastern    Kenema     6053   497948

Where can we find and download GIS data (shp)?

Where can we find and download tabular data?

Data Organization

Review the Data

The dbf file contains the GIS data attributes that you will use in NetLogo.

Note: You may have to refer back to the data source to learn what the column names mean. Here they are the district name (GLOBAL_A_1), the country name (GLOBAL_A_3), and the number of confirmed Ebola cases to date (V_ADM2_C3).
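If you are unsure which columns a shapefile carries, you can also list them from inside NetLogo. The following is a minimal sketch, not the lesson's own code: the path "data/districts.shp" and the procedure name inspect-districts are placeholders, since the actual file names are not given here.

```netlogo
extensions [ gis ]
globals [ districts ]

; Hypothetical helper: print the attribute columns of the districts layer.
to inspect-districts
  ; "data/districts.shp" is a placeholder path, not the lesson's real file name
  set districts gis:load-dataset "data/districts.shp"
  ; list every column name stored in the .dbf file
  show gis:property-names districts
  ; print the district-name column for the first feature
  let first-feature first gis:feature-list-of districts
  show gis:property-value first-feature "GLOBAL_A_1"
end
```

Calling inspect-districts from the Command Center prints the column names, which you can then compare against the documentation of the data source.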

Review the Data

Review the projection: the .prj file holds the spatial reference information and projection of the shapefile. Ideally all of our data share the same projection, and luckily that is the case for this exercise.

A Note About GIS Software

ArcGIS/QGIS to View and Process GIS data

QGIS: https://www.qgis.org/en/site/forusers/download.html

QGIS is free, open-source GIS software for creating and manipulating geographic data. Use it to reproject, edit, trim, simplify, and clean datasets for a model, and to create additional GIS data layers if the model needs them.

NetLogo Extensions

NetLogo Homepage: http://ccl.northwestern.edu/netlogo/

Or, in NetLogo, go to Help > NetLogo User Manual

NetLogo and GIS

A great reference for working with GIS data in NetLogo is the model in the model library:

Note: You can use this model to copy and adapt code sections for your own models.


Let’s Make the Ebola Model!

Define Global Variables

Find the following code at the top of your script to define globals:

extensions [ gis csv ]
globals [ sites roads districts SL ]
breed [ admin-labels admin-label ]
patches-own [ district-name confirmed population road-here ]
turtles-own [ name status time-infected ]

What entities will be created from data? What attributes will patches have? Turtles?

Setup Button to Load GIS

Write Code to Load Data

Find the next code block starting with:

to setup
	...

We need to assign data to our global variables and load the data files.

Tasks

  1. Specify the projection file in the ‘gis:load-coordinate-system’ command.
  2. Specify the data file assigned to each global variable.
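The two tasks above might look roughly like the sketch below. Every file path is a placeholder (the lesson's real file names are not shown here), and fitting the world to the districts layer is an assumption about how the extent is initially set.

```netlogo
extensions [ gis ]
globals [ sites roads districts ]

to setup
  clear-all
  ; Task 1: projection file (placeholder path)
  gis:load-coordinate-system "data/districts.prj"
  ; Task 2: one data file per global variable (placeholder paths)
  set sites     gis:load-dataset "data/sites.shp"
  set roads     gis:load-dataset "data/roads.shp"
  set districts gis:load-dataset "data/districts.shp"
  ; assumption: fit the NetLogo world to the area covered by the districts layer
  gis:set-world-envelope gis:envelope-of districts
  reset-ticks
end
```

Loading the coordinate system before the datasets follows the pattern used in the GIS extension's own example models.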

Visualize GIS Layers

Make a button called ‘draw’, then go to the Code tab.

Look at the next code block starting with:

to draw
	...

What colors are we making sites, districts, and roads? How do we assign roads to specific patches?
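As a point of comparison while you read the lesson's draw procedure, here is one way such a procedure could be written. It is a sketch under assumptions: the globals are expected to be loaded already, the color for sites is a placeholder, and flagging road patches with gis:intersects? is one possible approach rather than necessarily the one used in the lesson file.

```netlogo
extensions [ gis ]
globals [ sites roads districts ]
patches-own [ road-here ]

to draw
  clear-drawing
  ; outline each layer in its own color
  gis:set-drawing-color gray
  gis:draw districts 1
  gis:set-drawing-color black
  gis:draw roads 1
  gis:set-drawing-color blue     ; placeholder color for sites
  gis:draw sites 3
  ; flag every patch that a road passes through
  ask patches [ set road-here gis:intersects? roads self ]
end
```

The last line is what makes roads usable by agents: the drawn lines are only decoration, while road-here is a patch attribute that turtles can query.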

Model Extent

What is our model extent now? (Hint: Remember that our roads data is just for Sierra Leone)

Controlling Model Extent

To change the model extent, we first need to find where it is defined in the code.
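In the GIS extension, the extent is controlled by the world envelope. The sketch below shows two hypothetical helpers (zoom-to-roads and zoom-to-all are illustrative names) and assumes the roads and districts globals already hold loaded datasets.

```netlogo
extensions [ gis ]
globals [ roads districts ]

; Hypothetical helper: shrink the world to the area covered by the roads layer.
to zoom-to-roads
  gis:set-world-envelope gis:envelope-of roads
  clear-drawing   ; old drawings no longer line up with the new extent
end

; Hypothetical helper: fit the world to both layers at once.
to zoom-to-all
  gis:set-world-envelope (gis:envelope-union-of (gis:envelope-of roads)
                                                (gis:envelope-of districts))
  clear-drawing
end
```

After changing the envelope, redraw the layers and reapply any patch attributes so they match the new extent.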

Labelling Districts

Let’s color districts with the Ebola data. We already defined a patch attribute called ‘confirmed’ that we will use to hold the value for the number of confirmed cases, which we will get from the GIS data for districts.

Use the Command Center to get a feel for the values of ‘confirmed’.

Show values for 10 patches, type into the Command Center:

ask n-of 10 patches [show confirmed]

Or just show the value for a specific patch:

ask (patch 1 1) [show confirmed]

Color Districts By Confirmed Cases

gis:apply-coverage districts "V_ADM2_C_3" confirmed
ask patches [
  ifelse (confirmed > 0)
    [ set pcolor scale-color red confirmed 5000 0 ]
    [ set pcolor white ]
]

Note: Always color patches first.

A Quick Note About gis:draw

Note that ‘gis:draw’ only adds the GIS data to the display for visual effect. It does not apply any attributes to patches or turtles, and turtles and patches cannot interact with features drawn by the gis:draw command.

Also note that the red square in the image represents the size of one patch. The gray boundary line and the black road line look coarse when inspected at the patch level; this is just the way NetLogo converts vector data for display. The vector GIS features also carry finer detail than a single patch can represent.

Label Each District

Go to the Interface tab. Create a button called ‘load-pop’.

Go to the code block starting with:

to load-pop
	...

Look at the structure of the loop beginning with the ‘foreach’ command. See how we can assign labels from our data file with this loop?
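For reference, a labelling loop of this kind can be sketched as follows. The procedure name label-districts is hypothetical, and the districts global is assumed to be loaded; the loop places one admin-label turtle at each district's centroid and uses the GLOBAL_A_1 column as the label text.

```netlogo
extensions [ gis ]
globals [ districts ]
breed [ admin-labels admin-label ]

; Hypothetical helper: put a text label at the center of each district.
to label-districts
  foreach gis:feature-list-of districts [ feature ->
    ; centroid in NetLogo coordinates; an empty list if it lies outside the world
    let location gis:location-of gis:centroid-of feature
    if not empty? location [
      create-admin-labels 1 [
        setxy item 0 location item 1 location
        set size 0                 ; hide the turtle shape, keep only the label
        set label gis:property-value feature "GLOBAL_A_1"
        set label-color black
      ]
    ]
  ]
end
```

The empty-list check matters when the world envelope covers less than the full dataset, because districts outside the view have no valid NetLogo location.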

What Do We Have So Far?

The model display shows the spatial extent of the country of Sierra Leone, although the country is not outlined or labeled.

All of this uses freely downloaded GIS data and the NetLogo GIS extension. Although this model is not complete, we have made a nice visual base model upon which to add more data and behavior.

What Story Are We Trying To Tell?

We can see which districts had more confirmed Ebola cases, but we don’t know how much of the population in each district was affected.

The GIS data we’ve added so far only records the number of people infected with Ebola, so we do not know the infection rate relative to total population.

We need to get more data about the population in each district of Sierra Leone.

Population Data

For a quick source of data, go to Wikipedia: Districts of Sierra Leone

You can simply grab this table by highlighting the table contents including headers.

Copy it, paste it in a spreadsheet, and save as ‘SL_pop.csv’ in the data folder with your GIS data. (Already done for you.)

Inspect the Data

Open in Excel (or other spreadsheet or text editor program).

Note:

  1. Columns of interest are the first (‘District’) and fifth (‘Population’) columns.
  2. Formatting such as upper/lower/mixed case
  3. The population is too large for a 1:1 representation with agents in NetLogo

Let’s add agents to the model in proportion to the population of each district, based on the population value stored in the CSV file you just saved.
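The scaling idea can be sketched as below. The ratio of 10000 people per agent, the procedure name populate-district, and the "susceptible" status value are all assumptions for illustration. The sketch also assumes each patch already stores its district in district-name (for example via gis:apply-coverage).

```netlogo
patches-own [ district-name ]
turtles-own [ status ]

; Hypothetical helper: create one agent per `people-per-agent` residents.
to populate-district [ target-district district-pop ]
  let people-per-agent 10000     ; assumed scale, adjust to taste
  let n-agents round (district-pop / people-per-agent)
  let home-patches patches with [ district-name = target-district ]
  if any? home-patches [
    create-turtles n-agents [
      move-to one-of home-patches
      set status "susceptible"   ; placeholder status value
    ]
  ]
end
```

With this ratio, Kailahun's population of 358190 becomes 36 agents and Kenema's 497948 becomes 50, which keeps the model small while preserving the relative sizes of the districts.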

Load CSV file

Go to the ‘load-pop’ section of the code. Begin with the section that starts:

file-open "...

Specify the file name to load.

Check the code, push the ‘load-pop’ button.
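To see what a file-reading loop of this kind does, here is a stripped-down sketch that only prints each row. The procedure name read-population is hypothetical, and the path assumes the data folder is named "data" and sits next to the model file.

```netlogo
extensions [ csv ]

; Hypothetical helper: print district names and populations from the CSV.
to read-population
  file-close-all
  file-open "data/SL_pop.csv"            ; assumed location of the lesson's CSV
  let header csv:from-row file-read-line ; first line holds the column names
  show header
  while [ not file-at-end? ] [
    let row csv:from-row file-read-line
    let district_name item 0 row         ; first column: District
    let pop item 4 row                   ; fifth column: Population
    show (list district_name pop)
  ]
  file-close
end
```

Printing the rows first is a cheap way to confirm that the file path and column positions are right before any agents are created.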

What Happened?

Look at the CSV file again.

Note how the district names in the GIS data, as shown on the display, are all upper case, while the district names in the CSV data are mixed case. NetLogo does not do fuzzy matching: the names must be exactly the same, so let’s add code to fix this.

Fix the Labels

We could fix this in the CSV file, but let’s fix it in NetLogo with some code.

Go back to the ‘file-open’ code block and make these changes to the second and third lines after the ‘while’ command:

let d_name item 0 row
let district_name upper-case-string d_name

This change calls a reporter named ‘upper-case-string’ to convert the text stored in ‘d_name’ to upper case and stores the result as district_name.

Check the code, and put up a green sticky when you have made this change successfully.

Go to the bottom of the code. Find the two statements starting with:

to-report upper-case-string [s]
	...

and:

to-report upper-case-char [c]
	...

These two functions convert the input to uppercase.
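If you want to see how such reporters can work, the following is one possible implementation, written as a sketch rather than a copy of the lesson file. It assumes NetLogo 6 or later (for the arrow syntax) and handles only the 26 unaccented Latin letters.

```netlogo
; Convert every character of string s to upper case.
to-report upper-case-string [ s ]
  let result ""
  foreach n-values (length s) [ i -> item i s ] [ c ->
    set result word result upper-case-char c
  ]
  report result
end

; Convert a one-character string c to upper case; other characters pass through.
to-report upper-case-char [ c ]
  let lower "abcdefghijklmnopqrstuvwxyz"
  let upper "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  let pos position c lower
  report ifelse-value (pos = false) [ c ] [ item pos upper ]
end
```

For example, typing show upper-case-string "Kailahun" in the Command Center reports "KAILAHUN", which matches the upper-case names in the GIS data.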

Check the code, and then click the ‘load-pop’ button.

Put up a green sticky when your model produces agents in proportion to population.


What Story Are We Trying To Tell?

We now have a nice base model.

We’re interested in how Ebola spreads, so we need to model how people move around between infected areas and treatment centers.

Make agents move along roads

Open the NetLogo model ‘expanded_ebola_model.nlogo’.

Let’s check out the ‘go’ code blocks.
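As a mental model for road-following movement, consider the minimal sketch below. It is not the code from the expanded model. It assumes that road-here has been set to true on road patches and that reset-ticks was called during setup.

```netlogo
patches-own [ road-here ]

to go
  ask turtles [
    ; candidate steps: neighboring patches that a road passes through
    let road-steps neighbors with [ road-here = true ]
    if any? road-steps [ move-to one-of road-steps ]
  ]
  tick
end
```

Agents that are not next to a road stay where they are in this sketch, so a fuller model would first move them onto the nearest road patch.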


Sharing the Model

To send this model to someone, you need to send the NetLogo model file plus the data folder. It’s best to zip the model and the data folder together.

Review

A Few Parting Tips …

Things to consider when thinking about using GIS data in a Model …

Tip 1: Models can be effective and efficient without incorporating GIS

Tip 2: If you use GIS data, you raise the expectation that your model represents realistic behavior. This means your audience may have less tolerance for inconsistencies in a GIS-based model than in an abstract model.

Tip 3: If you use GIS data, try to also visually represent the layers in a meaningful and visually appealing way.

Tip 4: If you create a model with GIS data, you have to send the model and the folder of GIS data to the end user as a zip package. Alternatively, you can export the NetLogo world and send that file as the base data to import into the model.
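The export route in Tip 4 can be sketched with NetLogo's built-in export-world and import-world commands. The procedure names and the file name "ebola_base_world.csv" are placeholders.

```netlogo
; Hypothetical helper: save the current world, including patch values and turtles.
to save-base-world
  export-world "ebola_base_world.csv"   ; placeholder file name
end

; Hypothetical helper: restore a previously saved world.
to load-base-world
  import-world "ebola_base_world.csv"
end
```

Test the exported file on a second machine before relying on it, since values that come from extensions may not restore the same way as ordinary patch and turtle variables.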

Another good resource for GIS and NetLogo: Turtles in space!
