General Information

This summer’s Computational Institute will provide science teams from the socio-environmental synthesis community with hands-on training in open source tools for collaborative coding and data management, analysis, visualization, and dissemination. The goals of the workshop are to learn new concepts, skills, and approaches for data-driven research, advance work on team projects, and become familiar with tools compatible with the cyberinfrastructure available at most research institutions.

Instructors:
Ian Carroll, Kelly Hondula, Philippe Marchand, Mary Shelley

Assistant:
Kate Weiss

Requirements:
Participants must bring a laptop.

Contact:
Please email icarroll@sesync.org with any questions and for information not covered here.

Acknowledgements & Support

Portions of the instructional materials are adapted from Data Carpentry and Software Carpentry. The structure of the curriculum and the teaching style are both informed by Software Carpentry.

Schedule

Plenary sessions begin promptly at 9:00 am. Come prepared to survive until the first coffee break at 10:30 am and on-site lunch provided by SESYNC at 12:30 pm. Trainees are responsible for their own breakfast and dinner arrangements (we can make recommendations).

Tuesday 9:00 am Welcome and Overview of SESYNC
  9:15 Collaborative & Reproducible Workflows
  10:30 Break
  10:45 Data Storage and Access for All
  12:30 pm Lunch
  1:30 Introduce ‘data2doc’ Project & Team Meetings
  3:30 Break
  4:00 One Hour Language
    choose one new language: R, Python, SQL, or JavaScript
  5:00 Reception (informal with snacks, tasty beverages, etc.)
Wednesday 9:00 am The Landscape of Spatial Data Tools
  10:30 Break
  10:45 Scripting Geospatial Analysis
  12:30 pm Lunch
  1:30 Geospatial Packages in R
  3:30 Break
  3:45 Coaching Sessions
Thursday 9:00 am Version Control & Data Provenance
  10:30 Break
  10:45 Shiny Apps
  12:30 pm Lunch
  1:30 Coaching Sessions
  3:30 Break
  3:45 Data Manipulation in R
Friday 9:00 am Coaching Sessions and ad-hoc Plenary
  12:00 pm Wrap-up and Review
  12:30 Lunch
  1:30 Presentation of ‘data2doc’ documents

Pre-Arrival Installations & Downloads

To participate, you will need working copies of the software described below. Please make sure to install everything before the start of the short course.

Working Directory

We encourage participants to create a directory (i.e. a folder) for all course material; we will refer to it as %sandbox%. Please download the sidebar data and unzip it into your %sandbox%. You should then have a README file at %sandbox%\data\README.md (on Mac and Linux, %sandbox%/data/README.md).
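To confirm that the data unzipped into the right place, a short script can look for the README. The following is a minimal sketch and not part of the course material: the function names and the example sandbox location (a folder named sandbox in your home directory) are assumptions for illustration. It avoids newer language features so that it should run under Python 2.7 as well as Python 3.

```python
import os


def readme_path(sandbox):
    """Return the location where the README is expected inside the sandbox."""
    return os.path.join(sandbox, "data", "README.md")


def sandbox_is_ready(sandbox):
    """Return True if the README file exists at the expected location."""
    return os.path.isfile(readme_path(sandbox))


if __name__ == "__main__":
    # Hypothetical sandbox location; replace it with your own directory.
    sandbox = os.path.join(os.path.expanduser("~"), "sandbox")
    if sandbox_is_ready(sandbox):
        print("Found " + readme_path(sandbox))
    else:
        print("Missing " + readme_path(sandbox))
```

Because the path is built with os.path.join, the same script works with the backslash-separated paths on Windows and the forward-slash paths on Mac and Linux.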

GitHub

If you do not already have a GitHub account, please create one at https://www.github.com. Participants with a SESYNC account may instead use https://gitlab.sesync.org for their projects, but are encouraged to try GitHub in our lessons.

Editor

When you’re writing scripts or text, it’s nice to have a text editor that is optimized for writing code. These editors feature automatic color-coding of keywords, line numbers, and maybe even tab-completion. In the shell on Mac OS X and Linux, the default text editor is usually Vim, which is not famous for being intuitive. If you accidentally find yourself stuck in Vim, press the Escape key, type ‘:q!’ (colon, lower-case ‘q’, exclamation mark), then hit Return to exit. This will lose any unsaved changes to the file! You have three options for an editor: use your operating system’s default text editor (e.g. Notepad on Windows, TextEdit on Mac OS X), install a text editor of your own choosing, or use the text editor built into RStudio (see below) for all kinds of scripting (not just in the R language).

Command Line Interface / Shell

When writing scripts, creating programs, and working with data, the best way (sometimes the only way!) to proceed is often by working on the operating system’s command line interface (CLI), or shell. Access to the shell varies by OS:

  • Windows: Select “Start” and “Run” and type in “cmd.exe”.
  • Mac: Open the Terminal in Applications.
  • Linux: I bet you already know.

Make sure you know how to access the shell on your system. Customize its look and feel if you like!

Install this Software before Arrival

The table below lists software we will use in this short course. Unless noted (and especially for git), please use the default installation options. For Windows users, an installer for each item is available at the given download site. Mac users are encouraged to use Homebrew, the missing package manager for OS X, via the shell. Most packages in the list below can be installed with brew install %package%, but packages marked with an * require brew cask install %package%. Ubuntu users may install from the shell with sudo apt-get install %package%, and other Linux users are on their own.

Software      Download Site                                        Homebrew Package(s)  Aptitude Package(s)
git           https://git-scm.com/downloads                        git                  git
R             https://cran.rstudio.com/                            r                    r-base
RStudio       https://www.rstudio.com/products/rstudio/download2/  rstudio*             (none listed)
Python 2.7.x  https://www.python.org/downloads/                    python               python
QGIS          https://trac.osgeo.org/osgeo4w/ (see note 1)         qgis*, gdal          qgis

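1: Choose the express Desktop install. Windows users then need to add the OSGeo4W directory to the PATH environment variable. Open the “Start” menu, search for “environment”, and choose to edit environment variables for your account. Add a new variable named “PATH” with value “C:\OSGeo4W64\bin”. If your account already has a PATH variable, edit it instead and append this value, separated from the existing entries by a semicolon.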
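Once the installers have finished, it can be useful to check which command-line tools the shell can actually find on your PATH. The following is a minimal sketch, not an official course script: the function name find_on_path is hypothetical, and the command names in the list (git, R, python, gdalinfo) are assumptions about what each installer places on the PATH, which varies by operating system and installation method. It is written to run under Python 2.7 as well as Python 3.

```python
import os


def find_on_path(name, path=None):
    """Return the full path of the executable `name`, or None if not found.

    `path` is a PATH-style string; by default the PATH environment
    variable of the current process is searched.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    # On Windows, executables carry extensions such as .EXE or .BAT.
    extensions = [""]
    if os.name == "nt":
        pathext = os.environ.get("PATHEXT", ".EXE;.BAT;.CMD")
        extensions += pathext.split(os.pathsep)
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


if __name__ == "__main__":
    # Assumed command names; adjust them to match your installation.
    for tool in ["git", "R", "python", "gdalinfo"]:
        location = find_on_path(tool)
        if location is None:
            print(tool + ": not found on PATH")
        else:
            print(tool + ": " + location)
```

A tool reported as not found may still be installed; it may simply live in a directory that is not listed in PATH, which is the situation the note above addresses for OSGeo4W on Windows.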

The following R packages need to be installed. Open RStudio and, for each package below, type install.packages("%package%") at the prompt, with the package name in quotes (for example, install.packages("tidyr")), and press return. Follow all prompts.

  • tidyr
  • dplyr
  • RPostgreSQL
  • sp
  • rgdal
  • rgeos
  • raster
  • shiny
  • shinythemes
  • leaflet