Given below are three data visualizations that violate many data visualization best practices. Improve these visualizations using R and the tips for effective visualizations that we introduced in class. You should produce one visualization per dataset. Your visualization should be accompanied by a brief paragraph describing the choices you made in your improvement, specifically discussing what you didn’t like in the original plots and why, and how you addressed them in the visualization you created.
On the due date you will give a brief presentation describing one of your improved visualizations and the reasoning for the choices you made.
The learning goals for this lab are:
Go to the course GitHub organization and locate your lab repo. Grab the URL of the repo, and clone it in RStudio. Refer to Lab 01 if you would like to see step-by-step instructions for cloning a repo into an RStudio project.
First, open the R Markdown document and Knit it. Make sure it compiles without errors. The output will be in a markdown (.md) file with the same name.
Before we can get started we need to take care of some required housekeeping. Specifically, we need to do some configuration so that RStudio can communicate with GitHub. This requires two pieces of information: your email address and your name.
Your email address is the address tied to your GitHub account, and your name should be your first and last name.
Run the following (but update it for your name and email!) in the Console to configure git:
library(usethis)
use_git_config(user.name = "Your Name", user.email = "firstname.lastname@example.org")
This is the second week you’re working in teams, so we’re going to make things a little more interesting and let all of you make changes and push those changes to your team repository. Sometimes things will go swimmingly, and sometimes you’ll run into merge conflicts. So our first task today is to walk you through a merge conflict!
Take turns completing the exercise, with only one team member working at a time.
Run the following code in the Console to load this package.
The American Association of University Professors (AAUP) is a nonprofit membership association of faculty and other academic professionals. This report compiled by the AAUP shows trends in instructional staff employees between 1975 and 2011, and contains an image very similar to the one given below.
Let’s start by loading the data used to create this plot.
Each row in this dataset represents a faculty type, and the columns are the years for which we have data. The values are the percentage of hires of that type of faculty for each year.
## # A tibble: 5 x 12
##   faculty_type `1975` `1989` `1993` `1995` `1999` `2001` `2003` `2005` `2007` `2009`
##   <chr>         <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
## 1 Full-Time T…   29     27.6   25     24.8   21.8   20.3   19.3   17.8   17.2   16.8
## 2 Full-Time T…   16.1   11.4   10.2    9.6    8.9    9.2    8.8    8.2    8      7.6
## 3 Full-Time N…   10.3   14.1   13.6   13.6   15.2   15.5   15     14.8   14.9   15.1
## 4 Part-Time F…   24     30.4   33.1   33.2   35.5   36     37     39.3   40.5   41.1
## 5 Graduate St…   20.5   16.5   18.1   18.8   18.7   19     20     19.9   19.5   19.4
## # … with 1 more variable: `2011` <dbl>
In order to recreate this visualization we first need to reshape the data to have one variable for faculty type and one variable for year. In other words, we will convert the data from the wide format to the long format.
But before we do so, a thought exercise: If the long data will have a row for each year/faculty type combination, and there are 5 faculty types and 11 years of data, how many rows will the data have?
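One way to check your reasoning is to try it on a smaller case first. The sketch below uses a made-up toy table (2 faculty types, 3 years; the names and values are hypothetical, not the AAUP data) and assumes the tidyr package is installed. It compares the row count you would predict with the row count after reshaping with pivot_longer(), the function introduced next.

```r
library(tidyr)

# Hypothetical toy data (NOT the AAUP figures): 2 faculty types, 3 years
toy_wide <- data.frame(
  faculty_type = c("Type A", "Type B"),
  `1975` = c(60, 40),
  `1995` = c(55, 45),
  `2011` = c(50, 50),
  check.names = FALSE  # keep the year column names as-is
)

# Each type/year cell in the wide table becomes one row in the long table,
# so the prediction is (number of types) x (number of year columns)
expected_rows <- nrow(toy_wide) * (ncol(toy_wide) - 1)

# Reshape: every column except faculty_type is a year
toy_long <- pivot_longer(
  toy_wide,
  cols = -faculty_type,
  names_to = "year",
  values_to = "percentage"
)

# TRUE if the prediction matches the reshaped data
nrow(toy_long) == expected_rows
```

The same counting argument applies to the full dataset; only the number of faculty types and year columns changes.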
We do the wide-to-long conversion using a new function: pivot_longer(). The animation below shows how this function works, as well as its counterpart