In kimbrianj/umdpsyc: UMD PSYC R Learning Package

library(learnr)
library(umdpsyc)
library(gradethis)
library(ggplot2)
knitr::opts_chunk$set(exercise.checker = gradethis::grade_learnr)
knitr::opts_chunk$set(echo = FALSE)
learnr::tutorial_options(
  exercise.timelimit = 60)

Introduction

One of the most common uses for R is investigating relationships between numerical variables and running regression models. The way R handles models can be a bit different from how other statistical software does it. In R, we create a model object, which contains the fitted model, then use the summary function to access the relevant information from the model we just ran. This means it typically takes one line of code to run a model and a separate line to see the results, so pay close attention to how we get the information we need.

In this tutorial, we will look at the relationship between student motivation (on a 1 to 10 scale) and exam scores, as well as the relationship between the amount of sleep that a student got in the week before the exam and the exam score. You can find papers on these topics here for motivation and exam scores and here for sleep and exam scores.

As a reminder, here is what the data looks like.

data(exam_scores)
head(exam_scores)

Correlation

Scatterplots

Let's start with some scatterplots to examine the relationships visually.

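# Exam score against motivation (qplot() is deprecated as of ggplot2 3.4.0)
ggplot(exam_scores, aes(x = motivation, y = exam1)) + geom_point()

# Exam score against sleep in the week before the exam
ggplot(exam_scores, aes(x = sleep_week, y = exam1)) + geom_point()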

Finding correlations

The cor function finds both Spearman's rho rank correlation and Pearson's correlation. The first two arguments are the variables you want to correlate, and the method argument specifies the type of correlation. If you leave method out, cor computes Pearson's correlation.

Spearman's Rho

cor(exam_scores$motivation, exam_scores$exam1, method = 'spearman')

Pearson Correlation

cor(exam_scores$sleep_week, exam_scores$exam1, method = 'pearson')
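
As a rough illustration of what the method argument changes, the sketch below uses simulated data. The sim data frame and its values are made up for this example and are not the exam_scores data. It checks that Pearson is the default method and that Spearman's rho equals Pearson's correlation computed on the ranks of the two variables.

```r
# Simulated stand-in data (an assumption for illustration; NOT exam_scores)
set.seed(42)
sim <- data.frame(motivation = sample(1:10, 50, replace = TRUE))
sim$exam1 <- 60 + 2 * sim$motivation + rnorm(50, sd = 5)

# Pearson is used when no method is given
r_default <- cor(sim$motivation, sim$exam1)
r_pearson <- cor(sim$motivation, sim$exam1, method = "pearson")

# Spearman's rho is Pearson's correlation applied to the ranks
r_spearman <- cor(sim$motivation, sim$exam1, method = "spearman")
r_ranks <- cor(rank(sim$motivation), rank(sim$exam1))

c(pearson = r_pearson, spearman = r_spearman)
```

Because Spearman's rho only uses ranks, it is less sensitive to extreme values than Pearson's correlation, which is one reason to prefer it for ordinal scales such as a 1 to 10 motivation rating.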

Regression

We'll start by looking at the relationship between exam scores and the amount of sleep a student got in the week before the exam.

Let's fit a least squares regression line to this using the lm function. To do this, you'll have to use formula notation for the first argument. The formula notation is simply the form:

$$[\text{outcome}] \sim [\text{predictor}]$$

So you specify the outcome variable, then a ~, then your predictor variable. The last argument is the data argument, which specifies the data frame so that we can use the variable names by themselves within the formula.

The lm function returns an lm object. To see the results we want, such as the coefficients, their associated p-values, and the R-squared value, we need to use summary on the model object.

exammod <- lm(exam1 ~ sleep_week, data = exam_scores)
summary(exammod)
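
The model object and its summary can also be queried piece by piece, which is handy when you need a single number rather than the full printout. The following is a minimal sketch on simulated data: the sim data frame, the fit and fit_summary names, and the assumed sleep values in minutes are all made up for illustration.

```r
# Simulated stand-in data (an assumption for illustration; NOT exam_scores)
set.seed(1)
sim <- data.frame(sleep_week = rnorm(40, mean = 3000, sd = 300))
sim$exam1 <- 50 + 0.01 * sim$sleep_week + rnorm(40, sd = 5)

# One line fits the model, a second line summarizes it
fit <- lm(exam1 ~ sleep_week, data = sim)
fit_summary <- summary(fit)

# Intercept and slope as a named vector
coef(fit)

# Matrix of estimates, standard errors, t values, and p-values
fit_summary$coefficients

# Proportion of variance in exam1 explained by the model
fit_summary$r.squared
```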

Regression Diagnostics

To get regression diagnostic plots, we can simply use plot on the model object, which produces four plots. The first is the residuals vs. fitted plot, which you should check to make sure there is no pattern and that the variance is constant. The second is a QQ-plot for checking the normality of the residuals. The third is the scale-location plot, which gives another check of constant variance, and the fourth plots the residuals against leverage to help identify outliers and influential points.

plot(exammod)
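
By default the diagnostic plots are drawn one after another. The sketch below, again on simulated data (the sim data frame and fit are made up for this example), shows one way to place all four on a single page with par, and how the which argument of plot selects an individual plot.

```r
# Simulated stand-in data (an assumption for illustration; NOT exam_scores)
set.seed(2)
sim <- data.frame(sleep_week = rnorm(40, mean = 3000, sd = 300))
sim$exam1 <- 50 + 0.01 * sim$sleep_week + rnorm(40, sd = 5)
fit <- lm(exam1 ~ sleep_week, data = sim)

# Draw the four default diagnostic plots in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)  # restore the previous plotting layout

# Draw only the residuals vs. fitted plot
plot(fit, which = 1)
```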

Multiple Regression

To do multiple regression, we add additional variables to the right-hand side of the tilde (~), using + or *. Using + simply adds another variable, whereas using * includes each of the variables it joins as well as all of their interaction terms.

Let's take a look at a model to predict exam scores using motivation and sleep, including an interaction term.

exammod2 <- lm(exam1 ~ motivation * sleep_week, data = exam_scores)
summary(exammod2)
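
To see what the * operator expands to, the sketch below fits the same model three ways on simulated data. The sim data frame and the mod_star, mod_long, and mod_plus names are made up for this example. In formula notation, a:b denotes the interaction term on its own, so a * b is shorthand for a + b + a:b.

```r
# Simulated stand-in data (an assumption for illustration; NOT exam_scores)
set.seed(3)
sim <- data.frame(
  motivation = sample(1:10, 60, replace = TRUE),
  sleep_week = rnorm(60, mean = 3000, sd = 300)
)
sim$exam1 <- 40 + 1.5 * sim$motivation + 0.01 * sim$sleep_week + rnorm(60, sd = 5)

# Main effects plus interaction, written with *
mod_star <- lm(exam1 ~ motivation * sleep_week, data = sim)

# The same model written out in full
mod_long <- lm(exam1 ~ motivation + sleep_week + motivation:sleep_week, data = sim)

# Main effects only, no interaction
mod_plus <- lm(exam1 ~ motivation + sleep_week, data = sim)

names(coef(mod_star))
names(coef(mod_plus))
all.equal(coef(mod_star), coef(mod_long))
```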

Exercises

Exercise 1

How would you find a linear regression model using scores on the first exam as the outcome and student motivation as the predictor? Output the summary of the model (using summary).

grade_result(
  pass_if(~ identical(.result, summary(lm(exam1 ~ motivation, data = exam_scores))), "Good Job!"),
  fail_if(~ TRUE)
)

Exercise 2

learnr::question("Based on the output of the model from Exercise 1, what is the coefficient for motivation (the slope of the regression line)?",
         answer("1.7005", correct = TRUE,  message = "Good job!"),
         answer("71.6990", correct = FALSE),
         answer("0.000797",  correct = FALSE),
         answer("3.474",  correct = FALSE),
         allow_retry = TRUE,
         random_answer_order = TRUE
)

Exercise 3

How would you find a linear regression model using scores on the first exam as the outcome and two predictor variables: minutes of sleep the week before and student motivation, including an interaction term? Output the summary of the model (using summary).

grade_result(
  pass_if(~ identical(.result, summary(lm(exam1 ~ motivation*sleep_week, data = exam_scores))), "Good Job!"),
  fail_if(~ TRUE)
)

Submitting work

submission_code <- function(UID){
  httr::sha1_hash('regression121',as.character(UID))
}

Generate your submission code by entering your UID in the function below. For example, if your UID is 2, then your code should look like submission_code(UID = 2).

# Replace the number below with your UID
submission_code(2)

kimbrianj/umdpsyc documentation built on Jan. 27, 2021, 7:36 a.m.
