Teaching Computation in the Social Sciences

Sponsored by BSOS and SESYNC

January 11 and 13, 2016
10:00 am - 4:00 pm

General Information

We'll explore ways to introduce state-of-the-art, practical computation to undergraduates in the social sciences.

Instructors: Mary Shelley, Mike Smorul

Requirements: Participants must bring a laptop with the required software pre-installed (see "Setup" below).

Schedule (tentative)

Monday 10:00am Welcome
10:15 Introductions and overview of issues
11:00 Using OpenRefine to explore and clean up data
12:00pm Lunch
1:00 Intro to R and RStudio
3:30 Q&A/Day 1 Wrap up
Wednesday 10:00am Recap data visualization; introduce R Shiny and accessing API-driven data
12:00pm Lunch
1:00 Quick Jupyter notebook demo
2:00 Brainstorm ideas for applying in courses
3:30 Workshop wrap up

Resources

GapMinder Data

R Resources

Jupyter notebooks

Python

Course materials

OpenRefine

UMD R Study Group: https://umd-r-users.github.io/studyGroup/
Campus-wide R users mailing list: https://listserv.umd.edu/cgi-bin/wa?SUBED1=umd-r&A=1

Setup

To participate, you will need working copies of the software described below. Please make sure to install everything before the start of the workshop. Mary and Mike will be on-site at 9 a.m. to help anyone who has trouble.

OpenRefine

OpenRefine is a free, open source tool for working with messy data. Please download and install the appropriate kit of version 2.5 for your operating system from here. We recommend extracting the zipped download to a new folder on your desktop so it's easy to find. OpenRefine runs locally on your computer using your web browser as an interface. For best results, make sure you have Firefox installed. OpenRefine may not work with all web browsers.

R

R is a programming language that specializes in statistical computing. It is a powerful tool for exploratory data analysis and statistical modelling. We will interact with R using RStudio, an Integrated Development Environment (IDE).

This will require installing two pieces of software. First, download and install the version of R corresponding to your operating system from here. Second, you will need to install the RStudio interface from here, again selecting the installer for your operating system.

Data

We will use a selection of data from GapMinder to illustrate and explore these tools. Columns on population, life expectancy, and per capita GDP are included for multiple countries for multiple years. You do not need to download the data ahead of time; we'll use the URLs below to load it in directly.

For the data cleaning exercise, these data have been "dirtied" with typos. Use this URL for the OpenRefine exercise: https://goo.gl/38QDoy
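To give a sense of what "dirtied with typos" means in practice, here is a minimal Python sketch that flags misspelled country names by comparing them against a list of known spellings. It is an illustration only and is not how OpenRefine works internally; the country lists, the function name `suggest_corrections`, and the 0.8 similarity cutoff are all assumptions made for this example.

```python
import difflib

# Hypothetical reference spellings and "dirtied" values, for illustration only.
REFERENCE = ["Afghanistan", "Albania", "Algeria"]
observed = ["Afganistan", "Albania", "Algerria"]


def suggest_corrections(values, reference, cutoff=0.8):
    """Map each value missing from the reference list to its closest
    reference spelling, if one is similar enough."""
    suggestions = {}
    for value in values:
        if value in reference:
            continue  # already spelled correctly
        # get_close_matches returns the best matches above the cutoff, best first.
        matches = difflib.get_close_matches(value, reference, n=1, cutoff=cutoff)
        if matches:
            suggestions[value] = matches[0]
    return suggestions


suggestions = suggest_corrections(observed, REFERENCE)
print(suggestions)
```

In the workshop this kind of clean-up is done interactively in OpenRefine; the sketch just shows the underlying idea of matching near-duplicates to a trusted spelling.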

We'll use clean data for R, which can be found here: https://goo.gl/NpMmTZ
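The workshop loads this file in R, but the idea of reading a table straight from a URL carries over to other tools. The following is a rough Python sketch using only the standard library. The column names (`country`, `year`, `pop`, `lifeExp`, `gdpPercap`), the comma delimiter, and the sample rows are assumptions for illustration; the actual file may differ. The helper names `read_csv_rows` and `mean_by_year` are hypothetical.

```python
import csv
import io
import urllib.request


def read_csv_rows(source):
    """Return a list of dict rows from a URL string or an open text file object."""
    if isinstance(source, str):
        # Treat a string as a URL and download its contents as text.
        with urllib.request.urlopen(source) as response:
            source = io.StringIO(response.read().decode("utf-8"))
    return list(csv.DictReader(source))


def mean_by_year(rows, column):
    """Average a numeric column across all rows that share the same year."""
    totals = {}
    for row in rows:
        year = int(row["year"])
        total, count = totals.get(year, (0.0, 0))
        totals[year] = (total + float(row[column]), count + 1)
    return {year: total / count for year, (total, count) in sorted(totals.items())}


# Made-up sample with assumed column names, so the sketch runs without a network.
SAMPLE = """country,year,pop,lifeExp,gdpPercap
Aland,1952,1000,40.0,500.0
Aland,2007,2000,60.0,1500.0
Borduria,1952,3000,50.0,700.0
Borduria,2007,4000,70.0,2500.0
"""

rows = read_csv_rows(io.StringIO(SAMPLE))
life_expectancy = mean_by_year(rows, "lifeExp")
print(life_expectancy)  # {1952: 45.0, 2007: 65.0}
```

To try it on the real data, pass the URL above to `read_csv_rows` in place of the in-memory sample, adjusting the column names if they differ.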

Acknowledgements & Support

Portions of the instructional materials are adapted from Data Carpentry and Software Carpentry. The structure of the curriculum and the teaching style are informed by Software Carpentry.