Lab 11 - Grading the professor, Pt. 2

Modelling with multiple predictors

In this lab we revisit the professor evaluations data we modelled in the previous lab. In the last lab we modelled evaluation scores using a single predictor at a time. This time we will use multiple predictors to model evaluation scores.

For context, review the previous lab’s introduction before continuing on to the exercises.

Getting started

Go to the course GitHub organization and locate your homework repo, clone it in RStudio and open the R Markdown document. Knit the document to make sure it compiles without errors.

Warm up

Let’s warm up with some simple exercises. Update the YAML of your R Markdown file with your information, knit, commit, and push your changes. Make sure to commit with a meaningful commit message. Then, go to your repo on GitHub and confirm that your changes are visible in your Rmd and md files. If anything is missing, commit and push again.

Packages

We’ll use the tidyverse package for much of the data wrangling and visualisation, the tidymodels package for modeling and inference, and the data lives in the dsbox package. These packages are already installed for you. You can load them by running the following in your Console:

library(tidyverse) 
library(tidymodels)
library(openintro)

Data

The data can be found in the dsbox package, and it’s called evals. Since the dataset is distributed with the package, we don’t need to load it separately; it becomes available to us when we load the package. You can find out more about the dataset by inspecting its documentation, which you can access by running ?evals in the Console or using the Help menu in RStudio to search for evals. You can also find this information here.

Exercises

  1. Load the data by including the appropriate code in your R Markdown file.

Simple linear regression

  1. Fit a linear model (one you have fit before): score_bty_fit, predicting average professor evaluation score based on average beauty rating (bty_avg) only. Write the linear model, and note the \(R^2\) and the adjusted \(R^2\).

🧶 ✅ ⬆️ If you haven’t done so recently, knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.

Multiple linear regression

  1. Fit a linear model (one you have fit before): score_bty_gender_fit, predicting average professor evaluation score based on average beauty rating (bty_avg) and gender. Write the linear model, and note the \(R^2\) and the adjusted \(R^2\).

  2. Interpret the slopes and intercept of score_bty_gen_fit in context of the data.

  3. What percent of the variability in score is explained by the model score_bty_gen_fit.

  4. What is the equation of the line corresponding to just male professors?

  5. For two professors who received the same beauty rating, which gender tends to have the higher course evaluation score?

  6. How does the relationship between beauty and evaluation score vary between male and female professors?

  7. How do the adjusted \(R^2\) values of score_bty_gen_fit and score_bty_fit compare? What does this tell us about how useful gender is in explaining the variability in evaluation scores when we already have information on the beuaty score of the professor.

  8. Compare the slopes of bty_avg under the two models (score_bty_fit and score_bty_gen_fit). Has the addition of gender to the model changed the parameter estimate (slope) for bty_avg?

  9. Create a new model called score_bty_rank_fit with gender removed and rank added in. Write the equation of the linear model and interpret the slopes and intercept in context of the data.

🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards and review the md document on GitHub to make sure you’re happy with the final state of your work.