General Information
A SESYNC Data Skills Workshop provides researchers from the socio-environmental synthesis community with hands-on training in open source tools for collaborative coding, data management, analysis, visualization, and dissemination. The goal of this two-day workshop is to introduce novice and intermediate scientific coders to concepts, skills and approaches for data-driven research.
The first day of the workshop (Tuesday) will utilize R and tools available through the RStudio development environment. The second day (Thursday) will introduce Python and several command line tools. The schedule below provides an overview of the specific topics we will address through a series of 8 lessons that integrate live-coding and trainee challenge exercises.
Registration is open to any faculty or research staff in the Behavioral and Social Sciences College of the University of Maryland. Participants are welcome to attend either one or both days.
Instructors:
- Ian Carroll, Data Scientist @SESYNC
- Mary Shelley, Associate Director of Synthesis @SESYNC
When:
The workshop runs for two full, non-consecutive days during Winter Term.
Tuesday, January 16, 2018 and Thursday, January 18, 2018
Where:
1101 Morrill Hall
Get directions with OpenStreetMap or Google Maps.
Requirements:
Participants must bring a laptop with a full keyboard and mouse/trackpad (not a tablet, iPad, etc.) and have a fully functioning browser installed (e.g. Chrome, Firefox, Safari, or Internet Explorer).
Contact:
Please email icarroll@sesync.org with any questions, or for information not covered here.
Registration
Schedule
Please note, we plan to end each day with sufficient time to answer any lengthy follow-up questions with individuals as needed.
Tuesday | 9:00 | Introductions & Orientation |
Tuesday | 9:15 | Basic R |
Tuesday | 10:45 | Coffee Break |
Tuesday | 11:00 | Model Building Mini-Languages |
Tuesday | 12:15 pm | Lunch Break |
Tuesday | 1:00 | Data Manipulation with “dplyr” |
Tuesday | 2:30 | Stretch Break |
Tuesday | 2:45 | Visualizations with “ggplot2” |
Tuesday | 4:15 | FIN |
Wednesday | | NOT MEETING |
Thursday | 9:00 | [Re-]Introductions & Orientation |
Thursday | 9:15 | git and More Tools in the Shell |
Thursday | 10:30 | Coffee Break |
Thursday | 10:45 | Basic Python |
Thursday | 12:15 pm | Lunch Break |
Thursday | 1:00 | Software Portals (PyPI and CRAN) |
Thursday | 1:30 | Web Services and APIs with Python |
Thursday | 2:30 | Stretch Break |
Thursday | 2:45 | |
Thursday | 4:15 | FIN |
Setup
- Day 1
  - Log in to https://lab.sesync.org/rstudio/ with the username from your e-mail address
  - Run
    unzip('/tmp/handouts.zip', exdir = 'handouts')
- Day 2
  - Settle in next to a friend … or make one!
  - Sign in or sign up at http://github.com
  - Start at https://lab.sesync.org
  - Log in to JupyterLab; ask for your username and password
Software
Use the default installation options for all packages. For Windows users, an installer for each item is available at the given download site. Mac users are encouraged to use Homebrew – the missing package manager for OS X – via the shell, although the download links also provide .dmg installers.
- git
  - https://git-scm.com/downloads
    brew install git
- R
  - https://cran.rstudio.com/
    brew install r
- RStudio (free version)
  - https://www.rstudio.com/products/rstudio/download2/
  - Use the downloader.
- Python 3.x
  - https://www.python.org/downloads/
    brew install python3
The following R packages need to be installed after R and RStudio are installed.
Open RStudio and, for each package below, type install.packages("%package%") at the prompt and press return. Follow all prompts.
- tidyr
- dplyr
- magrittr
- stringr
- ggplot2
- data.table
- lme4
The following Python packages need to be installed after Python is installed. Open a shell/terminal and, for each package below, run pip3 install %package%.
- pandas
- jupyterlab
- beautifulsoup4
- requests
- census
- ggplot
After installing jupyterlab, run jupyter serverextension enable --py jupyterlab --sys-prefix in the shell/terminal to complete installation. JupyterLab runs through your browser. To launch it, enter jupyter lab in the shell/terminal, and stop it with Ctrl-C.
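Once the Python packages listed above are installed, a short check like the sketch below can confirm that your interpreter can find them. This is not part of the workshop materials: the helper name find_missing is hypothetical, and the module names are assumptions based on how each package is usually imported (beautifulsoup4, for example, is imported as bs4).

```python
from importlib.util import find_spec

# Map each pip package name from the list above to the module name it is
# assumed to provide (beautifulsoup4 is imported as bs4).
WORKSHOP_PACKAGES = {
    "pandas": "pandas",
    "jupyterlab": "jupyterlab",
    "beautifulsoup4": "bs4",
    "requests": "requests",
    "census": "census",
    "ggplot": "ggplot",
}


def find_missing(packages):
    """Return the pip names whose module this interpreter cannot locate."""
    return [
        pip_name
        for pip_name, module_name in packages.items()
        if find_spec(module_name) is None  # None means the module was not found
    ]


if __name__ == "__main__":
    missing = find_missing(WORKSHOP_PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All workshop packages were found.")
```

Save the sketch to a file and run it with python3. Any name it reports can be installed with pip3 install as described above. The check only locates each module and does not import it, so it runs quickly but will not catch a broken installation.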
Acknowledgements
Portions of the instructional materials, along with the structure of the curriculum and teaching approach, are adopted from Data Carpentry and Software Carpentry.