General Information
This summer’s Computational Institute will provide science teams from the socio-environmental synthesis community with hands-on training in open source tools for collaborative coding and data management, analysis, visualization, and dissemination. The goals of the workshop are to learn new concepts, skills, and approaches for data-driven research, advance work on team projects, and become familiar with tools compatible with the cyberinfrastructure available at most research institutions.
Instructors:
Ian Carroll, Kelly Hondula, Philippe Marchand, Mary Shelley
Assistant:
Kate Weiss
Requirements:
Participants must bring a laptop.
Contact:
Please email icarroll@sesync.org with any questions and for information not covered here.
Acknowledgements & Support
Portions of the instructional materials are adapted from Data Carpentry and Software Carpentry. The structure of the curriculum and the teaching style are informed by Software Carpentry.
Schedule
Plenary sessions begin promptly at 9:00 am. Come prepared to survive until the first coffee break at 10:30 am and on-site lunch provided by SESYNC at 12:30 pm. Trainees are responsible for their own breakfast and dinner arrangements (we can make recommendations).
Day | Time | Session |
---|---|---|
Tuesday | 9:00 am | Welcome and Overview of SESYNC |
 | 9:15 | Collaborative & Reproducible Workflows |
 | 10:30 | Break |
 | 10:45 | Data Storage and Access for All |
 | 12:30 pm | Lunch |
 | 1:30 | Introduce “data2doc” Project & Team Meetings |
 | 3:30 | Break |
 | 4:00 | One Hour Language (choose one new language: R, Python, SQL, or JavaScript) |
 | 5:00 | Reception (informal with snacks, tasty beverages, etc.) |
Wednesday | 9:00 am | The Landscape of Spatial Data Tools |
 | 10:30 | Break |
 | 10:45 | Scripting Geospatial Analysis |
 | 12:30 pm | Lunch |
 | 1:30 | Geospatial Packages in R |
 | 3:30 | Break |
 | 3:45 | Coaching Sessions |
Thursday | 9:00 am | Version Control & Data Provenance |
 | 10:30 | Break |
 | 10:45 | Shiny Apps |
 | 12:30 pm | Lunch |
 | 1:30 | Coaching Sessions |
 | 3:30 | Break |
 | 3:45 | Data Manipulation in R |
Friday | 9:00 am | Coaching Sessions and ad-hoc Plenary |
 | 12:00 pm | Wrap-up and Review |
 | 12:30 | Lunch |
 | 1:30 | Presentation of “data2doc” documents |
Pre-Arrival Installations & Downloads
To participate, you will need working copies of the software described below. Please make sure to install everything before the start of the short course.
Working Directory
We encourage participants to create a directory (i.e., a folder) for all course material; we will refer to it using the variable %sandbox%.
Please download the sidebar data and unzip it into your %sandbox%.
You should then have a README file at %sandbox%\data\README.md.
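For Mac and Linux users, the directory setup above can be checked from the shell. This is only a minimal sketch: the function name check_sandbox is hypothetical, and it assumes you pass in whatever location you chose for %sandbox%. It creates the directory if needed and reports whether the unzipped README is in place.

```shell
# Create the course directory if it does not exist, then report
# whether the unzipped data (data/README.md) is present inside it.
# Usage: check_sandbox /path/to/your/sandbox
check_sandbox() {
    sandbox="$1"
    mkdir -p "$sandbox" || return 1
    if [ -f "$sandbox/data/README.md" ]; then
        echo "found $sandbox/data/README.md"
    else
        echo "missing $sandbox/data/README.md"
        return 1
    fi
}
```

For example, `check_sandbox "$HOME/sandbox"` (an assumed location; any folder works) prints a "missing" line until the sidebar data has been unzipped there.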
GitHub
If you do not already have a GitHub account, please create one at https://www.github.com. Participants with a SESYNC account may instead use https://gitlab.sesync.org for their projects, but are encouraged to try GitHub in our lessons.
Editor
When you’re writing scripts or text, it’s nice to have a text editor that is optimized for writing code. These editors feature automatic color-coding of keywords, line numbers, and maybe even tab-completion. The default text editor in the shell on Mac OS X and Linux is usually Vim, which is not famous for being intuitive. If you accidentally find yourself stuck in Vim, press the Escape key, type ‘:q!’ (colon, lower-case ‘q’, exclamation mark), then hit Return to exit. This will discard any unsaved changes to the file! You have three options for the course: use your operating system’s default text editor (e.g. Notepad on Windows, TextEdit on Mac OS X), install a text editor of your own choosing, or use the text editor built into RStudio (see below) for all kinds of scripting (not just in the R language).
Command Line Interface / Shell
When writing scripts, creating programs, and working with data, the best way (sometimes the only way!) to proceed is often by working on the operating system’s command line interface (CLI), or shell. Access to the shell varies by OS:
- Windows: Select “Start” and “Run” and type in “cmd.exe”.
- Mac: Open the Terminal in Applications.
- Linux: I bet you already know.
Make sure you know how to access the shell on your system. Customize its look and feel if you like!
Install this Software before Arrival
The table below lists software we will use in this short course.
Unless noted (and especially for git), please use the default installation options.
For Windows users, an installer for each item is available at the given download site.
Mac users are encouraged to use Homebrew – the missing package manager for OS X – via the shell.
Most packages in the list below can be installed with brew install %package%, but packages marked with an * require brew cask install %package%.
Ubuntu users may install from the shell with sudo apt-get install %package%; other Linux users are on their own.
Software | Download Site | Homebrew Package(s) | Aptitude Package(s) |
---|---|---|---|
git | https://git-scm.com/downloads | git | git |
R | https://cran.rstudio.com/ | r | r-base |
RStudio | https://www.rstudio.com/products/rstudio/download2/ | rstudio * | |
Python 2.7.x | https://www.python.org/downloads/ | python | python |
QGIS | https://trac.osgeo.org/osgeo4w/ (see note 1) | qgis *, gdal | qgis |
1: Choose the express Desktop install. Windows users also need to add the install location to the PATH environment variable. Open the “Start” menu, search for “environment”, and choose to edit environment variables for your account. Add a new variable named “PATH” with value “C:\OSGeo4W64\bin”. If a variable named “PATH” already exists, edit it and append this value instead of creating a new one.
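After installing, one quick way to confirm that the command-line tools are reachable is to look each one up on the PATH. The sketch below is for Mac and Linux shells; the function name check_tools is hypothetical, and the command names in the usage line are assumptions about what each installer places on the PATH. Desktop applications such as RStudio and QGIS may not appear there even when installed correctly.

```shell
# Report, for each command name given, whether the shell can find it.
# Returns a non-zero status if any command is missing.
# Usage: check_tools git R python gdalinfo
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_tools git R python gdalinfo` prints one line per tool, so a missing installation is easy to spot before arrival.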
The following R packages need to be installed. Open RStudio and, for each package below, type install.packages("%package%") at the prompt (with the package name in quotes) and press Return. Follow all prompts.
tidyr
dplyr
RPostgreSQL
sp
rgdal
rgeos
raster
shiny
shinythemes
leaflet
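As an alternative to typing one command per package in RStudio, the whole list can be installed in a single call from the shell. This is a sketch under assumptions, not the course's prescribed method: it assumes Rscript is on your PATH, uses the CRAN mirror listed in the software table, and the helper name r_install_expr is hypothetical. The helper only builds the R expression; nothing is installed until you pass it to Rscript.

```shell
# Print a single R expression that installs every package named
# in the arguments, e.g. install.packages(c("tidyr", "dplyr"), ...).
r_install_expr() {
    list=""
    for pkg in "$@"; do
        # Append each name in double quotes, comma-separated.
        list="${list:+$list, }\"$pkg\""
    done
    printf '%s\n' "install.packages(c($list), repos = \"https://cran.rstudio.com/\")"
}
```

To run the installation, pass the expression to R: `Rscript -e "$(r_install_expr tidyr dplyr RPostgreSQL sp rgdal rgeos raster shiny shinythemes leaflet)"`. Some of these packages may still prompt for or require system libraries, so check the output for errors.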