R for Social Scientists: A Practical and Hands-On Introduction

Dr. Jeffrey Girard, Carnegie Mellon University


July 22nd-23rd, 2019 – Both days will involve lectures and planned exercises. Participants should bring a laptop computer and are encouraged to bring their own data, as the planned exercises will provide opportunities to apply the learned techniques to both provided and personal data. The instructor may be able to provide access to a laptop computer to those without one (please ask before registering). Both days are anticipated to start at 9am and end by 4pm or 5pm with a one-hour break for lunch.


Trainees (students and post-docs) – $250

Professionals (including faculty) – $500

NOTE: If also registering for Multilevel Modeling for Longitudinal Data, a 20% fee reduction will be applied to both registrations.


R is a freely available statistical computing environment that runs on all major platforms (Windows, Mac OS X, and Linux). It allows users to wrangle, visualize, and analyze data in a highly customizable and easily reproducible manner. R enjoys a large and diverse community of users from many backgrounds; in addition to being fun and welcoming, this community means that (a) it is easy to get online support for R and (b) new features and techniques are constantly being added to it, often months or years before they are added to expensive alternatives. For social scientists, the main drawback of R is that it has not historically been taught in undergraduate and graduate courses, and therefore can seem unfamiliar and intimidating to many. This workshop addresses this drawback by gently introducing participants to R.

This 2-day workshop offers a practical and hands-on introduction to R suitable for both trainees and professionals. It is designed to empower participants and take them from “0 to 45 miles per hour,” i.e., from no experience with R to being able to confidently import, wrangle, visualize, and analyze research data using common statistical techniques. It also provides previews of, and pointers toward resources for, advanced material, making it well suited to those wishing to eventually extend beyond “45 miles per hour.” The content is tailored to be especially relevant and engaging for social scientists (e.g., researchers from psychology and education), but the knowledge and techniques covered will have broad applicability to anyone who frequently works with quantitative data and wants to deepen their understanding of it.

Day 1 will include lectures on (1) installing RStudio and understanding the R environment and language, (2) basic data wrangling (using “dplyr” and other “tidyverse” packages), and (3) basic data visualization (using the “ggplot2” package), as well as previews of advanced programming, data wrangling, and data visualization techniques. Day 1 will also include carefully constructed, hands-on activities in which participants will practice applying the newly learned techniques to actual data (provided and personal).

Day 2 will include lectures on (4) basic statistical analyses (e.g., correlations and group comparisons), (5) general linear modeling (e.g., regression, ANOVA, F-tests, and t-tests), and (6) model diagnostics and exploration (e.g., assumption testing, effect sizes, confidence intervals, tables, and figures), as well as previews of advanced statistical analysis and open science techniques. Day 2 will also include hands-on activities with actual data, and time will be set aside for live consultation with the instructor.