General Information

This year’s Summer Institute brings together seven science teams for a short course on data and software skills in socio-environmental synthesis. Through lectures, hands-on computer labs, and project consultation, SESYNC staff will aim to accelerate your team’s adoption of cyber resources for all phases of data-driven research and dissemination.

Participants should expect to:

  • learn new scientific computing skills
  • overcome specific or conceptual project hurdles
  • gain coding confidence
  • have fun

Please review the agenda below and follow the pre-arrival installation instructions.

Instructors:

  • Ian Carroll, Data Scientist
  • Mary Shelly, Associate Director for Synthesis
  • Benoit Parmentier, Data Scientist
  • Kelly Hondula, Quantitative Researcher and Computer Programmer

When:

Optional day for basic R training: Monday, July 17

Tuesday, July 18, 2017 to Friday, July 21, 2017

Where:

1 Park Place, Suite 300
Annapolis, MD 21401

Get directions with OpenStreetMap or Google Maps.

Contact:

Please email icarroll@sesync.org with any questions, including installation issues, or for information not covered here.

Requirements

  • Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.), and have installed the software described below the schedule.
  • At least one team member must bring data for the mini-project; a sample or an incomplete data set is okay.
  • After the course, participants must complete a reimbursement form to recover allowed travel expenses.

Schedule

Sessions begin promptly at 9:00 am.

Food will be available at the 10:30 am coffee break, at the on-site lunch provided by SESYNC at 12:15 pm, and at an afternoon break. Trainees are responsible for their own breakfast and dinner arrangements (we can make recommendations).

[Monday] 9:00 am Introduction to the RStudio IDE Ian
  9:30 Installation Help & Reading Comprehension Exercise  
  10:30 Break  
  10:45 Basic R: Part I Ian
  12:15 pm Lunch  
  1:15 Basic R: Part II Mary
  3:15 Break  
  3:30 Scripting Challenges  
Tuesday 9:00 am Welcome and Overview of SESYNC Mary
  9:15 Collaborative Workflows & Reproducible Pipelines Ian
  10:30 Break  
  10:45 Introduce Coaches & “data2doc” Project Ian
  11:15 Coaching Sessions & Installation Help  
  12:15 pm Lunch  
  1:15 Database Principles and Use Benoit
  3:15 Break  
  3:30 Manipulating Tabular Data Kelly
  5:00 Reception (informal with snacks, tasty beverages, etc.)  
Wednesday 9:00 am Visualization with ggplot2 Mary
  10:30 Break  
  10:45 Mini-languages for Statistical Models Ian
  12:15 pm Lunch  
  1:15 Cyberinfrastructure @SESYNC Mary
  2:15 Coaching Sessions  
  3:15 Break  
  3:30 Introduction to Shiny Apps Kelly
Thursday 9:00 am Geospatial Packages in R Benoit
  10:30 Break  
  10:45 Introduction to Python + Pandas Kelly
  12:15 pm Lunch  
  1:15 Database-to-Doc with RMarkdown Ian
  1:45 Coaching Sessions  
  3:15 Break  
  3:30 Coaching Sessions  
Friday 9:00 am Web Services and APIs with Python Ian
  10:30 Break  
  10:45 Coaching Sessions  
  12:15 pm Lunch  
  1:15 “data2doc” Project Presentations  

Pre-Arrival Installation Instructions

A bundle of all the software needed for the Summer Institute is available as a Docker “container”, a virtual server that your laptop will run in the background. To use the container, you “only” need to install Docker with Kitematic (it may be harder than the average install). If you cannot get Docker running, you must install several pieces of software separately. In short, please complete only one of the three sets of instructions below:

Option 1: Docker

If you run Windows 10 Pro, Education, or Enterprise (64-bit), you can probably install Docker for Windows. The installer will ask to enable the Windows 10 utility Hyper-V, which you should accept by clicking “Ok”. After restart, Docker will show up in the lower-right system tray (it may be hidden, so expand the tray to see all running services). Docker may display an error message if your laptop’s virtualization technology is turned off in the system BIOS. In that case, search the internet for system-specific instructions for changing your BIOS settings using the keywords “enable vt-x %laptop type%” (e.g. “enable vt-x thinkpad”) or “enable amd-v %laptop type%” on non-Intel PCs. To complete installation, right-click the Docker icon in your system tray, choose “Kitematic”, and download the .zip file when prompted. Move the contents of the downloaded .zip file to a new folder called “Kitematic” within “C:\Program Files\Docker”. Kitematic will then launch from the right-click menu of the Docker icon in the system tray.

If you run macOS 10.11+ (El Capitan or newer), you can probably install Docker for Mac. Download and open the “Stable” installer and drag the Docker app icon into your Applications folder, as instructed. Once the Docker icon appears in the menu bar, click it to open a menu that includes Kitematic.

On both Windows and macOS, run Kitematic and skip account sign-up if asked. Search for “sesync”, and create the “teaching-lab” container. If some text appears in the “Container Logs”, you are ready to go. You can “Stop” the container and quit Docker.
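As an optional extra check from the command line, the minimal Python sketch below asks whether the docker client can reach a running Docker engine. It assumes Python 3 is already installed and that the docker command is on your PATH; the function name docker_ready is purely illustrative, and the Kitematic check described above remains the one that matters.

```python
import shutil
import subprocess


def docker_ready(timeout_seconds=30):
    """Return True if the docker client can talk to a running Docker engine."""
    # The docker command-line client is installed alongside Docker itself.
    if shutil.which("docker") is None:
        return False
    try:
        # "docker info" exits with a non-zero status when the engine is not running.
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if docker_ready():
        print("Docker is ready.")
    else:
        print("Docker is not running (or not installed).")
```

Run the check while Docker is still running; after you quit Docker, the same check is expected to report that it is not running.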

Option 2: Docker Toolbox

If you run 64-bit Windows 7 or higher, you can probably install Docker Toolbox, a legacy version of “Docker for Windows”. After running the installer (leaving all the default settings), you will have three new applications: the Docker Quickstart Terminal, Kitematic, and Oracle VM VirtualBox. When you are in a patient mood, launch Kitematic. You may see an error ending with a complaint about “VT-X/AMD-v” and the “BIOS” if your laptop’s virtualization technology is turned off. In that case, search the internet for system-specific instructions for changing your BIOS settings using the keywords “enable vt-x %laptop type%” (e.g. “enable vt-x thinkpad”) or “enable amd-v %laptop type%” on non-Intel PCs.

If you run macOS 10.8+ (Mountain Lion or newer), you can probably install Docker Toolbox, a legacy version of “Docker for Mac”. Choose “Get Docker Toolbox for Mac” from the installation guide to download the installer, open the package, and follow its instructions to complete the installation.

On both Windows and macOS, run Kitematic and skip account sign-up if asked. Search for “sesync”, and create the “teaching-lab” container. If some text appears in the “Container Logs”, you are ready to go. You can “Stop” the container and quit Docker.

Option 3: Itemized Installation

The table below lists software we use in the short course. Unless noted (and especially for git) please use the default installation options. For Windows users, an installer for each item is available at the given download site. Mac users are encouraged to use Homebrew – the missing package manager for OS X – via the shell. Most packages in the list below can be installed with brew install %package%, but packages with an * require brew cask install %package%. Ubuntu users may install from the shell with sudo apt-get install %package%, and other Linux users are on their own.

Software     Download Site                                         Homebrew Package(s)   apt-get Package(s)
git          https://git-scm.com/downloads                         git                   git
R            https://cran.rstudio.com/                             r                     r-base
RStudio      https://www.rstudio.com/products/rstudio/download2/   rstudio*
Python 3.x   https://www.python.org/downloads/                     python3               python3
GDAL/OGR     https://trac.osgeo.org/osgeo4w/                       gdal2 [1], geos       gdal-bin [2]

[1] macOS users will need to execute brew tap osgeo/osgeo4mac prior to running brew install gdal2.

[2] Ubuntu users will need to add the UbuntuGIS repository prior to running apt-get install gdal-bin.
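Once the itemized installation is done, a quick way to confirm that the command-line tools are reachable is to look for them on your PATH. The sketch below is one way to do that in Python; the command names are assumptions (R and gdalinfo are the usual executables for R and GDAL, and the Python command may be python rather than python3 on Windows), and RStudio is not checked because it is a desktop application.

```python
import shutil

# Assumed command names for the software in the table above.
REQUIRED_COMMANDS = ["git", "R", "python3", "gdalinfo"]


def missing_commands(commands):
    """Return the commands from the list that are not found on the PATH."""
    return [name for name in commands if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_commands(REQUIRED_COMMANDS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All command-line tools were found.")
```

A tool reported as missing may still be installed but absent from the PATH, which is common on Windows, so treat the result as a hint rather than a verdict.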

The following R packages need to be installed. Open RStudio and, for each package below, type install.packages("%package%") at the prompt (the quotation marks are required) and press return. Follow all prompts.

  • tidyr
  • ggplot2
  • RSQLite
  • rgdal
  • rgeos
  • shiny
  • leaflet
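If you would rather install all of the packages in one step, R accepts a vector of package names in a single install.packages call. The Python sketch below only builds that R expression as text for you to paste at the RStudio prompt; it does not install anything itself. The helper name install_expression is hypothetical, and the repos value reuses the CRAN address from the table above as an assumed mirror.

```python
# Package names copied from the list above.
R_PACKAGES = ["tidyr", "ggplot2", "RSQLite", "rgdal", "rgeos", "shiny", "leaflet"]

# Assumed CRAN mirror; any mirror you prefer should work.
CRAN_MIRROR = "https://cran.rstudio.com/"


def install_expression(packages, repos=CRAN_MIRROR):
    """Build one R expression that installs every package in the list."""
    quoted = ", ".join('"{}"'.format(name) for name in packages)
    return 'install.packages(c({}), repos = "{}")'.format(quoted, repos)


if __name__ == "__main__":
    # Paste the printed line at the RStudio prompt and press return.
    print(install_expression(R_PACKAGES))
```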

Acknowledgments

Portions of the instructional materials are adapted from Data Carpentry and Software Carpentry. The structure of the curriculum and the teaching style are informed by Software Carpentry.