Effective at SESYNC's closure in December 2022, this page is no longer maintained. The information may be out of date or inaccurate.

Introduction to GIS and ABM

Handouts for this lesson need to be saved on your computer. Download and unzip this material into the directory (a.k.a. folder) where you plan to work.


Lesson Goals

This tutorial will introduce you to:

  1. Pros/Cons of introducing GIS data to your models
  2. Downloading, cleaning, and importing GIS data into models
  3. Implementing ABMs with GIS data



Quick reminder: GIS data types and formats

Using the GIS extension, you can work with geographic data in vector format (shapefiles) or in raster format (ASCII and TIFF).

  • Vector datasets (e.g., shapefiles)
    • cities as points
    • roads as lines
    • buildings as polygons
  • Raster datasets (e.g., ASCII, TIFF)
    • elevation data
    • remote sensing imagery

Using GIS with ABMs: Benefits

  • Support more “realistic” models and tie the model to a specific place
  • GIS data can provide attributes for patches and agents that would be too time-consuming to code manually
  • Models can be more visually appealing with layers of information displayed in NetLogo
  • It is not challenging to add GIS and CSV data to NetLogo

Using GIS with ABMs: Costs

  • Agent behavior needs to be realistic at the spatial scale of the GIS data used in the model
  • Scale matters. At a detailed map scale, it is a challenge to match the spatial reference system to the NetLogo world so that, for example, 1 patch = 10 km.
  • GIS data may slow down model load and run time
  • GIS data acquisition, cleaning, and processing may be time-intensive and require tools like QGIS and OpenOffice
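
One rough way to check the scale point above is to ask how many map units a single patch covers once the GIS extent has been mapped onto the NetLogo world. This is a minimal sketch, assuming a transformation has already been set (for example with gis:set-world-envelope). The reporter name gis-units-per-patch is made up for illustration, and the result is in whatever units the projection uses (degrees or meters), not necessarily kilometers.

```netlogo
extensions [ gis ]

; Hypothetical helper: width of one patch in GIS coordinate units.
; Assumes gis:set-world-envelope (or gis:set-transformation) was already called;
; gis:world-envelope raises an error otherwise.
to-report gis-units-per-patch
  let env gis:world-envelope            ; [min-x max-x min-y max-y] in GIS space
  report (item 1 env - item 0 env) / world-width
end
```

After loading the data, typing show gis-units-per-patch in the Command Center should give a sense of how coarse the patch grid is relative to the map.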



Let’s Make a Model!

Overview of Model Process

  • Scope topic
  • Identify what you need
  • Getting/Organizing/Reviewing/Cleaning Data
  • Importing data and building a NetLogo model
  • Troubleshooting and testing throughout the process
  • Validation
  • Sharing your model

Scoping the model

  • Topic: Ebola
  • What is being modeled: spread of Ebola, rate of spread
  • Experiments: Identify the impact on the rate of spread if travel is restricted
  • Spatial Scope: Africa > Sierra Leone > district level
  • Temporal Scope: Variable
  • Agents: People (categories based on SEIR model)
  • Environment: (layers we want to show such as with GIS) Admin districts, treatment facilities, roads, cities, airports

A note about this model …

For this exercise we are only focusing on a sample model to bring in the GIS and CSV data into NetLogo, based on these themes above. We will not actually model the spread of the disease or be able to support the experiments with this base model. However, I encourage you to build upon the model to do this.

See Dr. Andrew Crooks’s work on Ebola

  • Github: https://github.com/rohan-suri/ebola-abm
  • Video: https://youtu.be/IlvnW32uUdg

What data do we need?

  • Map data
  • Demographic data
  • Ebola counts

Reminder about Shapefiles

A shapefile is actually composed of several component files, typically at least 4 and up to 8, with the following extensions: .shp, .dbf, .shx, .prj, .sbn, .sbx, .cpg, .xml.

Pro Tip: Try to keep all of these together in the same folder!

Reminder about Tabular Data

Use CSV files for structured data in text format separated by commas.

Viewed in text editor:

District,Province,Capital,Area,Population
Kailahun,Eastern,Kailahun,3859,358190
Kenema,Eastern,Kenema,6053,497948

Viewed in tabular form:

District   Province   Capital    Area   Population
Kailahun   Eastern    Kailahun   3859   358190
Kenema     Eastern    Kenema     6053   497948
       

Where can we find and download GIS data (shp)?

  • Various US datasets: http://catalog.data.gov/dataset
  • Base level GIS global: https://lib.stanford.edu/GIS/data
  • List of sources: http://gisgeography.com/best-free-gis-datasources-raster-vector/
  • US Census data: https://www.census.gov/geo/maps-data/data/tiger-data.html
  • Global humanitarian datasets: https://data.hdx.rwlabs.org
  • Open Street Map and other datasets: WeoGeo Market - Trimble Data Marketplace: http://market.weogeo.com/

Where can we find and download tabular data?

  • US Census data: https://www.census.gov/geo/maps-data/data/tiger-data.html
  • Global humanitarian datasets: https://data.hdx.rwlabs.org
  • Wikipedia: https://en.wikipedia.org/wiki/Districts_of_Sierra_Leone

Data Organization

  • Make a folder for this project called ‘GIS_ebola’
  • If you were starting from scratch, you would make two folders, one called ‘source_data’ and one called ‘data’
  • Why two folders?

Review the Data

The dbf file contains the GIS data attributes that you will use in NetLogo.

  • Open the .dbf file for “Cases_at_Admin2” in Excel or OpenOffice (do not save!).
  • Note the column names of interest for the model (district name, country name, number of confirmed cases), and then close the file without saving.

Note: You may have to refer back to the data source to find out what the column names mean: district name (GLOBAL_A_1), country name (GLOBAL_A_3), and the number of confirmed Ebola cases to date (V_ADM2_C3)
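
The same column names can also be inspected from inside NetLogo once the shapefile is loaded. The following is a small sketch rather than part of the handout: it assumes a global named districts already holds the loaded dataset, and the procedure name show-district-fields is hypothetical.

```netlogo
extensions [ gis ]
globals [ districts ]

; Hypothetical helper; assumes `districts` was set with gis:load-dataset beforehand.
to show-district-fields
  print gis:property-names districts              ; all column names from the .dbf
  let f first gis:feature-list-of districts       ; one example feature
  print gis:property-value f "GLOBAL_A_1"         ; its district name
end
```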

Review the Data

Review projection: The .prj file holds the spatial reference information and projection of the shapefile. Ideally all our data should be in the same projection, and luckily that is the case for this exercise.

  • Open a .prj file in a text editor, review it, and close it without saving.

  • Copy any one of the .prj files and name the copy ‘projection.prj’, so that NetLogo has a single, separate projection file for the model.

  • Optional: review the spatial extent of the data in QGIS.

A Note About GIS Software

ArcGIS/QGIS to View and Process GIS data

QGIS: https://www.qgis.org/en/site/forusers/download.html

QGIS is free, open-source GIS software for creating and manipulating geographic data. Use it to reproject, edit, trim, simplify, and clean datasets for a model, and to create additional GIS data layers if the model needs them.

  • Load data (image, vectors, csv) (tips to make it faster with pixel size and smaller input file via smoothing)
  • Make the map: Display, label features (draw features, apply color, label)
  • Create agents based on map attributes
  • Agent-environment interaction (e.g. movement constrained by map properties - road following, move within state, etc.)
  • Saving/Sharing model

NetLogo Extensions

NetLogo Homepage: http://ccl.northwestern.edu/netlogo/

Or, in NetLogo, go to Help > NetLogo User Manual

NetLogo and GIS

A great reference for working with GIS data in NetLogo is the GIS General Examples model in the Models Library:

  • Open NetLogo > Models Library > Code Examples > GIS > GIS General Examples
  • Click setup (to load the GIS)
  • Click display-countries
  • Click display-population-in-patches
  • Go to code tab (notice extensions at top, setup commands, good commenting throughout).

Note: You can use this model to copy and adapt code sections for your own models.



Let’s Make the Ebola Model!

  • Create a new model in NetLogo: File > New
  • Right-click on the View, go to Model Settings, and change the values to match my settings
  • Save the model in the ‘GIS_ebola’ folder, but not inside any of the other folders.

Define Global Variables

Find the following code at the top of your script to define globals:

extensions [ gis csv ]
globals [ sites roads districts SL ]
breed [ admin-labels admin-label ]
patches-own [ district-name confirmed population road-here ]
turtles-own [ name status time-infected ]


What entities will be created from data? What attributes will patches have? Turtles?

Setup Button to Load GIS

  • On the Interface tab, make a button called “setup” (right-click > choose Button > type in setup)
  • Then go to the Code tab

Write Code to Load Data

Find the next code block starting with:

to setup
	...

We need to assign data to our global variables and load the data files.

Tasks

  1. Specify projection file in the ‘gis:load-coordinate-system’ command.
  2. Specify data files assigned to each global variable.
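
The two tasks above could look roughly like the sketch below. It is not the handout's setup procedure: the procedure name setup-sketch is hypothetical, only two of the layers are loaded, and every file path is a placeholder (the ‘data’ folder and the roads file name in particular are assumptions) to be replaced with the actual names in your data folder.

```netlogo
extensions [ gis ]
globals [ roads districts ]

; A sketch only; file paths are placeholders, not the handout's actual names.
to setup-sketch
  clear-all
  gis:load-coordinate-system "data/projection.prj"           ; the copied .prj file
  set districts gis:load-dataset "data/Cases_at_Admin2.shp"  ; assumed location
  set roads gis:load-dataset "data/roads.shp"                ; hypothetical file name
  ; fit the NetLogo world to the combined extent of both layers
  gis:set-world-envelope (gis:envelope-union-of (gis:envelope-of districts)
                                                (gis:envelope-of roads))
  reset-ticks
end
```

Loading the coordinate system before the datasets lets the extension project each layer as it is read.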

Visualize GIS Layers

Make a button called ‘draw,’ then go to code tab.

Look at the next code block starting with:

to draw
	...

What colors are we making sites, districts, and roads? How do we assign roads to specific patches?
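
For comparison while answering these questions, here is a hedged sketch of what a drawing procedure can look like. The procedure name draw-sketch, the colors, and the line thicknesses are assumptions, not the handout's values. The last line shows one way to mark road patches, using gis:intersects? to test each patch against the roads layer.

```netlogo
extensions [ gis ]
globals [ sites roads districts ]
patches-own [ road-here ]

; A sketch only; assumes the three datasets were loaded in setup.
to draw-sketch
  clear-drawing
  gis:set-drawing-color gray
  gis:draw districts 2                 ; district outlines, thickness 2
  gis:set-drawing-color black
  gis:draw roads 1                     ; road lines, thickness 1
  gis:set-drawing-color red
  gis:draw sites 3                     ; point features are drawn as circles
  ; store a true/false flag on each patch that a road passes through
  ask patches [ set road-here gis:intersects? roads self ]
end
```

Testing every patch against a detailed road layer can be slow on a large world, which is one of the load-time costs mentioned earlier.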

Model Extent

What is our model extent now? (Hint: Remember that our roads data is just for Sierra Leone)

Controlling Model Extent

To change model extent, we need to find where it is defined

  • Go back up to the ‘setup’ code block.

  • Comment-out entities with undesired extents to specify our desired extent; in this case, just Sierra Leone.

  • Return to the interface, press ‘setup’ and ‘draw.’

  • Green stickies when your model extent aligns with Sierra Leone only.
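
As an illustration of the idea, the extent is whatever envelope is passed to gis:set-world-envelope. The sketch below assumes, following the hint above, that the roads layer covers only Sierra Leone; the procedure name use-roads-extent is hypothetical, and your setup block may build its envelope from different layers.

```netlogo
extensions [ gis ]
globals [ roads districts ]

; A sketch only; assumes both datasets are already loaded.
to use-roads-extent
  ; Wider extent (commented out): the union of several layers' envelopes.
  ; gis:set-world-envelope (gis:envelope-union-of (gis:envelope-of districts)
  ;                                               (gis:envelope-of roads))
  ; Narrower extent: only the layer that covers Sierra Leone.
  gis:set-world-envelope gis:envelope-of roads
end
```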

Labelling Districts

Let’s color districts with the Ebola data. We already defined a patch attribute called ‘confirmed’ that we will use to hold the value for the number of confirmed cases, which we will get from the GIS data for districts.

Use the Command Center to get a feel for the values of ‘confirmed’.

Show values for 10 patches, type into the Command Center:

ask n-of 10 patches [show confirmed]

Or just show the value for a specific patch:

ask (patch 1 1) [show confirmed]

Color Districts By Confirmed Cases

  • In the ‘draw’ code block, add the code shown below, check the code, then click the draw button.

gis:apply-coverage districts "V_ADM2_C_3" confirmed
ask patches
[ ifelse (confirmed > 0)
    [ set pcolor scale-color red confirmed 5000 0 ]   ; more cases -> darker red
    [ set pcolor white ]
]

  • Make sure to place this block of code in the same order within the ‘to draw’ section.

Note: Always color patches first.
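
The fixed upper bound of 5000 works for this dataset, but the range could also be derived from the data. The following is a possible variation, not the handout's code: the procedure name is hypothetical and the 1.5 headroom factor is an arbitrary assumption, chosen so the district with the most cases is dark red rather than black.

```netlogo
patches-own [ confirmed ]

; A sketch only; assumes gis:apply-coverage has already filled `confirmed`.
to color-by-confirmed-sketch
  let affected patches with [ confirmed > 0 ]
  ask patches [ set pcolor white ]
  if any? affected [
    let top max [ confirmed ] of affected
    ; reversed range (high to 0) makes larger values darker
    ask affected [ set pcolor scale-color red confirmed (top * 1.5) 0 ]
  ]
end
```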

A Quick Note About gis:draw

Note that ‘gis:draw’ only adds the GIS data to the display for visual effect. It does not apply any attributes to patches or turtles, and turtles and patches cannot interact with features drawn by the gis:draw command.

Also note that the red square in the image represents the size of one patch. You can see that the gray line of the boundary and the black line of the road are coarse when inspected at the patch level. This is just the way NetLogo converts vector data for display. Also note that these vector GIS features are finer than the size of a patch.

Label Each District

Go to the Interface tab. Create a button called ‘load-pop’.

Go to the code block starting with:

to load-pop
	...

Look at the structure of the loop beginning with the ‘foreach’ command. See how we can assign labels from our data file with this loop?
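
A labeling loop of this kind generally follows the pattern in the GIS General Examples model: visit each feature, find its centroid, and place an invisible turtle there to carry the label. The sketch below is written under that assumption and is not a copy of the handout's load-pop code. The procedure name label-districts-sketch is hypothetical, and the arrow syntax requires NetLogo 6 or later.

```netlogo
extensions [ gis ]
globals [ districts ]
breed [ admin-labels admin-label ]

; A sketch only; assumes `districts` is loaded and a transformation is set.
to label-districts-sketch
  foreach gis:feature-list-of districts [ feature ->
    let centroid gis:location-of gis:centroid-of feature
    ; centroid is an empty list if it falls outside the NetLogo world
    if not empty? centroid [
      create-admin-labels 1 [
        set xcor item 0 centroid
        set ycor item 1 centroid
        set size 0                      ; hide the turtle, keep only its label
        set label-color black
        set label gis:property-value feature "GLOBAL_A_1"
      ]
    ]
  ]
end
```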

What Do We Have So Far?

The model display shows the spatial extent of the country of Sierra Leone, although the country is not outlined or labeled.

  • We can see which districts have more confirmed Ebola cases. Those with more cases are shown in darker red. Those with fewer cases are shown in lighter shades of pinkish-white.
  • The districts are outlined with thick, light gray lines. Each district name is labeled in black.
  • The treatment centers are shown as red circles. The road network is shown as thin black lines, which appear to connect most treatment centers.

All this was done with freely downloaded GIS data and the NetLogo GIS extension. Although this model is not complete, we have made a nice visual base model upon which to add more data and behavior.

What Story Are We Trying To Tell?

We can see which districts had more confirmed Ebola cases, but we don’t know how much of the population in each district was impacted.

The GIS data we’ve added thus far only has information about the number of people infected with Ebola, so we do not know the infection rate relative to total population.

We need to get more data about the population in each district of Sierra Leone.

Population Data

For a quick source of data, go to Wikipedia: Districts of Sierra Leone

You can simply grab this table by highlighting the table contents including headers.

Copy it, paste it in a spreadsheet, and save as ‘SL_pop.csv’ in the data folder with your GIS data. (Already done for you.)

Inspect the Data

Open in Excel (or other spreadsheet or text editor program).

Note:

  1. Columns of interest are the first (‘District’) and fifth (‘Population’) columns.
  2. Formatting such as upper/lower/mixed case
  3. Size of population is too large for a 1:1 representation with agents in NetLogo

Let’s add agents to the model in proportion to the population of each district, based on the population value stored in the CSV file you just saved.
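
The proportional scaling is simple arithmetic: divide each district's population by a fixed number of people per agent and round. The reporter below is a hypothetical illustration; the ratio of 10000 people per agent used in the example is an assumption, not the value used in the handout.

```netlogo
; Hypothetical helper: number of agents that stand in for `pop` people.
to-report agents-for [ pop people-per-agent ]
  report round (pop / people-per-agent)
end
```

For example, show agents-for 358190 10000 reports 36 for Kailahun, since 358190 / 10000 = 35.819.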

Load CSV file

Go to the ‘load-pop’ section of the code. Begin with section that starts:

file-open "...

Specify the file name to load.

Check the code, push the ‘load-pop’ button.
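
For orientation, a file-reading loop built on the csv extension typically has the shape sketched below. This is an assumed outline rather than the handout's load-pop code: the procedure name and the file path are placeholders, and the column positions follow the notes above (district in the first column, population in the fifth).

```netlogo
extensions [ csv ]

; A sketch only; the path is an assumption about where SL_pop.csv was saved.
to read-population-sketch
  file-close-all
  file-open "data/SL_pop.csv"
  let header csv:from-row file-read-line      ; read and discard the header row
  while [ not file-at-end? ] [
    let row csv:from-row file-read-line       ; one list per line, numbers parsed
    let d_name item 0 row                     ; first column: District
    let pop item 4 row                        ; fifth column: Population
    print (word d_name ": " pop)
  ]
  file-close
end
```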

What Happened?

Look at the CSV file again.

Note how the district names in the GIS data, as shown on the display, are all upper case, while the district names in the CSV data are mixed case. NetLogo does not do fuzzy matching. The names must be exactly the same, so let’s add code to fix this.

Fix the Labels

We could fix this in the CSV file, but let’s fix it in NetLogo with some code.

Go back to the ‘file-open’ code block and make these changes to the second and third lines after the ‘while’ command:

let d_name item 0 row
let district_name upper-case-string d_name

This change calls a function named ‘upper-case-string’ to convert the data stored in ‘d_name’ to be uppercase and store this final version as district_name.

Check the code and green stickies when you’ve made this change successfully.

Go to the bottom of the code. Find the two statements starting with:

to-report upper-case-string [s]
	...

and:

to-report upper-case-char [c]
	...

These two functions convert the input to uppercase.
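
To show the general idea behind such a conversion, here is one possible way to upper-case a string using only core primitives. It is a sketch under a hypothetical name (to-upper-sketch) and is not the handout's implementation, which uses the two reporters named above. It handles only the unaccented letters a to z and needs NetLogo 6 or later for the arrow syntax.

```netlogo
; Hypothetical reporter: returns `s` with a-z replaced by A-Z.
to-report to-upper-sketch [ s ]
  let lower-letters "abcdefghijklmnopqrstuvwxyz"
  let upper-letters "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  let result ""
  foreach n-values (length s) [ i -> i ] [ i ->
    let c item i s                              ; the i-th character
    let pos position c lower-letters            ; index in a-z, or false
    set result word result (ifelse-value (is-number? pos)
      [ item pos upper-letters ]                ; swap in the capital letter
      [ c ])                                    ; leave everything else as is
  ]
  report result
end
```

For example, show to-upper-sketch "Kailahun" reports "KAILAHUN".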

Check the code, and then click the ‘load-pop’ button.

Green stickies when you’ve gotten the model to produce agents in proportion to population.



What Story Are We Trying To Tell?

We now have a nice base model.

We’re interested in how Ebola spreads, so we need to model how people move around between infected areas and treatment centers.

Make agents move along roads

Open the NetLogo model ‘expanded_ebola_model.nlogo’

Let’s check out the ‘go’ code blocks.

  • subroutine for ‘travel’
  • right turn and move along roads
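
As a point of comparison while reading the ‘travel’ code, the sketch below shows one simple way to keep turtles on roads: step to a random neighboring patch that is flagged as a road. It is a hypothetical alternative, not the expanded model's routine, and it assumes road-here holds true on road patches.

```netlogo
patches-own [ road-here ]

; Hypothetical turtle procedure; call it with: ask turtles [ travel-sketch ]
to travel-sketch
  let options neighbors with [ road-here = true ]   ; adjacent road patches
  if any? options [
    move-to one-of options                          ; otherwise stay put
  ]
end
```

Because the step is random, turtles wander back and forth along the network rather than heading toward a destination.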



Sharing the Model

To send this model to someone, you need to send the NetLogo model plus the data folder. It’s best to zip the model and data folder together.

Review

  • You learned about pros/cons of GIS data and NetLogo
  • You searched for, obtained, and reviewed GIS data
  • You loaded GIS data into NetLogo and displayed it
  • You applied GIS attributes to NetLogo patches and turtles
  • You loaded CSV data and used it to make more turtles
  • You overcame a few challenges that popped up
  • You learned how to share your model

A Few Parting Tips …

Things to consider when thinking about using GIS data in a Model …

Tip 1: Models can be effective and efficient without incorporating GIS

Tip 2: If you use GIS data, you raise the expectation that your model represents realistic behavior. This means your audience may have less tolerance for inconsistencies with a GIS based model than if they were viewing an abstract model.

Tip 3: If you use GIS data, try to also visually represent the layers in a meaningful and visually appealing way.

Tip 4: If you create a model with GIS data, you have to send the model and the folder of GIS data to the end user as a zip package. Or you can export the NetLogo world and send this as the base data to import into the model.

Another good resource for GIS and NetLogo: Turtles in space!


