January 11 and 13, 2016
10:00 am - 4:00 pm
We'll explore ways to introduce state-of-the-art, practical computation to undergraduates in the social sciences.
Instructors: Mary Shelley, Mike Smorul
Requirements: Participants must bring a laptop with the required software pre-installed (see "Setup" below).
| Day | Time | Session |
|---|---|---|
| Monday | 10:00 am | Welcome |
| | 10:15 am | Introductions and overview of issues |
| | 11:00 am | Using OpenRefine to explore and clean up data |
| | 12:00 pm | Lunch |
| | 1:00 pm | Intro to R and RStudio |
| | 3:30 pm | Q&A and Day 1 wrap-up |
| Wednesday | 10:00 am | Recap of data visualization; introduction to RShiny and accessing API-driven data |
| | 12:00 pm | Lunch |
| | 1:00 pm | Quick Jupyter notebook demo |
| | 2:00 pm | Brainstorm ideas for applying in courses |
| | 3:30 pm | Workshop wrap-up |
install.packages("swirl"); library("swirl"); swirl()
To participate, you will need working copies of the software described below. Please make sure to install everything before the start of the workshop. Mary and Mike will be on-site at 9 a.m. to help anyone who has trouble.
R is a programming language that specializes in statistical computing. It is a powerful tool for exploratory data analysis and statistical modelling. We will interact with R using RStudio, an Integrated Development Environment (IDE).
This will require installing two pieces of software. First, download and install the version of R for your operating system from the Comprehensive R Archive Network (CRAN). Second, install RStudio from the RStudio website, again selecting the installer for your operating system.
We will use a selection of data from GapMinder to illustrate and explore these tools. Columns on population, life expectancy, and per capita GDP are included for multiple countries for multiple years. You do not need to download the data ahead of time; we'll use the URLs below to load it in directly.
For the data cleaning exercise, these data have been "dirtied" with typos. Use this URL for the OpenRefine exercise: https://goo.gl/38QDoy
For the R exercises, we'll use the clean version of the data, available here: https://goo.gl/NpMmTZ
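As a rough sketch of what loading the data directly into R might look like: `read.csv()` accepts a URL in place of a file path, so the clean file could be read without downloading it first. The example below uses a small inline stand-in so it runs offline. The column names (`country`, `year`, `pop`, `lifeExp`, `gdpPercap`) and every value are made-up placeholders, not the contents of the workshop file.

```r
# Toy stand-in for the clean workshop data. Column names and values are
# assumptions for illustration only, not the real Gapminder figures.
csv_text <- "country,year,pop,lifeExp,gdpPercap
Canada,2002,1000,70,100
Canada,2007,1100,80,110
Kenya,2002,2000,50,10
Kenya,2007,2200,60,12"

# read.csv() also takes a URL as its first argument; text = is used here
# so the sketch does not need a network connection.
gapminder <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the structure: one row per country per year.
str(gapminder)

# Average life expectancy per country across the available years.
mean_life <- aggregate(lifeExp ~ country, data = gapminder, FUN = mean)
print(mean_life)
```

To try this against the workshop file, replace `text = csv_text` with the URL as the first argument, assuming the link resolves to a plain CSV file.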
Portions of the instructional materials are adapted from Data Carpentry and Software Carpentry. The structure of the curriculum and the teaching style are informed by Software Carpentry.