In reyesem/IntroAnalysis: Functions for introductory statistics using linear models

```r
IntroAnalysis:::ma223_setup()
library(learnr)

knitr::opts_chunk$set(echo = FALSE)
learnr::tutorial_options(exercise.cap = "Exercise")
```

```{css, echo=FALSE}
/* Solution Functionality */
.accordion > input[type="checkbox"] { position: absolute; left: -100vw; }

.accordion .content { overflow-y: hidden; height: 0; transition: height 0.3s ease; }

.accordion > input[type="checkbox"]:checked ~ .content { height: auto; overflow: visible; }

.accordion label { display: block; }

/* Styling */

.accordion { margin-bottom: 1em; }

.accordion > input[type="checkbox"]:checked ~ .content { padding: 15px; border: 1px solid #e8e8e8; border-top: 0; }

.accordion .handle { margin: 0; }

.accordion label { cursor: pointer; font-weight: normal; padding: 15px; }

.accordion label:hover, .accordion label:focus { background: #d8d8d8; }

/* Turning arrow */
.accordion .handle label:before { font-family: 'fontawesome'; content: "\f054"; display: inline-block; margin-right: 10px; font-size: .58em; line-height: 1.556em; vertical-align: middle; }

.accordion > input[type="checkbox"]:checked ~ .handle label:before { content: "\f078"; }

.tipbox { padding: 1em 1em 1em 1em; border: 1px solid black; border-radius: 10px; box-shadow: 2px 2px #888888; }

.keyidea { border: 1px solid #54585A; background: #4F758B; color: white; }

.keyidea:before { content: "Key Idea: "; font-weight: bold; }

.tip { background: lightyellow; }

.tip:before { content: "Tip: "; font-weight: bold; }

.title{ color: #800000; background: none; }

.solution { background: #ebebeb; }

h2{ color: white; background-color: #800000; }

h3{ color: #800000; border-bottom: thick solid #696969; }

h4{ color: #800000; }

blockquote { font-size: inherit; padding: 0px 20px; }
```

## Background
The COVID-19 pandemic provided challenges to both physical and mental well-being for individuals across the globe. It has also provided a unique opportunity to examine what strategies were helpful in maintaining mental well-being in the face of extreme challenges. A [2021 study](https://www.frontiersin.org/articles/10.3389/fpsyg.2021.647951/full) by researchers in the UK examined the role of gratitude in protecting mental well-being. Expressing gratitude is a common element of mindfulness, a series of practices often recommended to improve mental health.

> If you are interested in practicing gratitude, a simple exercise is to maintain a gratitude journal. A colleague of mine assigns this as required homework for students in her statistics course. Each evening, her students are asked to record three things (which they have not included before) for which they are grateful that day.

There have been several studies suggesting that focusing on gratitude can improve well-being. It is believed that those who practice gratitude are faster to recognize the benefits within a situation and more apt to persist through challenges.

Researchers surveyed 138 UK residents recruited primarily through social media. The survey took place in the early days of lockdown protocols within the UK, when it was not clear how long these protocols would be in place. In addition to general demographics, participants completed a series of questionnaires to quantify various aspects of their well-being. The data is available in the `gratitude` dataset. Gratitude was measured using the Gratitude Questionnaire-Six-Item Form (GQ-6); participants responded to six questions using a 7-point scale. Responses were collated into a score (`Gratitude`, ranging from 6 to 42); higher values indicate higher levels of gratitude. Overall well-being was measured using the Warwick-Edinburgh Mental Well-being Scale (WEMWBS); participants responded to 14 questions using a 5-point scale. Responses were collated into a score (`Wellbeing`, ranging from 14 to 70); higher values indicate better well-being.

Researchers are primarily interested in determining if higher levels of gratitude are associated with improved well-being; this suggests the following model for the data generating process:

$$(\text{Wellbeing})_i = \beta_0 + \beta_1 (\text{Gratitude})_i + \varepsilon_i$$

This model can be fit with the following code:

```r
gratitude.model = specify_mean_model(Wellbeing ~ 1 + Gratitude, data = gratitude)
```
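To make the model statement concrete, the sketch below simulates data from a process of the same form and fits it with base R's `lm()`, used here as a stand-in because the internals of `specify_mean_model()` are not shown in this tutorial. The data frame `sim` and every parameter value (intercept 20, slope 0.85, error standard deviation 8) are invented for illustration; only the sample size and the range of the gratitude score come from the study description.

```r
set.seed(223)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for the survey data; all parameter values are made up.
n <- 138
sim <- data.frame(Gratitude = sample(6:42, size = n, replace = TRUE))
sim$Wellbeing <- 20 + 0.85 * sim$Gratitude + rnorm(n, mean = 0, sd = 8)

# Least-squares fit of Wellbeing = beta0 + beta1 * Gratitude + error.
sim.model <- lm(Wellbeing ~ 1 + Gratitude, data = sim)

# Estimates of beta0 (intercept) and beta1 (slope).
coef(sim.model)
```

Because the data is simulated, the estimates should land near the values used to generate it, which is a quick way to check that the formula matches the intended model.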

## Assessing Conditions

### Exercise: Compute Residuals

Obtain the residuals and fitted values for the above model, and store them in a dataset called `gratitude.diag`.

```r
# Starter template: the fitted model still needs to be supplied as the argument.
gratitude.diag = obtain_diagnostics()
```

```r
# Completed call: residuals and fitted values from the full model.
gratitude.diag = obtain_diagnostics(gratitude.model)
```

You must always obtain the residuals from the full unconstrained model for the data generating process (not the one under the null hypothesis). This must be done before trying to assess conditions.
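As a rough illustration of what residuals and fitted values are, the following base R sketch builds a diagnostics data frame by hand. It assumes simulated data (`sim`, with invented parameter values) and uses `lm()`, `fitted()`, and `residuals()` as stand-ins; the column names `.fitted` and `.resid` simply mirror the names used in the plotting code of this tutorial, and nothing here is claimed about how `obtain_diagnostics()` is implemented.

```r
set.seed(223)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for the survey data; all parameter values are made up.
n <- 138
sim <- data.frame(Gratitude = sample(6:42, size = n, replace = TRUE))
sim$Wellbeing <- 20 + 0.85 * sim$Gratitude + rnorm(n, mean = 0, sd = 8)

# Fit the full (unconstrained) model, not the model under a null hypothesis.
sim.model <- lm(Wellbeing ~ 1 + Gratitude, data = sim)

# Attach fitted values and residuals to the original observations.
sim.diag <- data.frame(sim,
                       .fitted = fitted(sim.model),
                       .resid  = residuals(sim.model))

# Each residual is the observed response minus its fitted value.
all.equal(sim.diag$.resid, sim.diag$Wellbeing - sim.diag$.fitted)
```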

### Exercise: Assessing Normality

Is it reasonable to assume the error in the well-being score follows a Normal distribution? Construct an appropriate graphic to justify your answer.

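```r
# Starter template: fill in the data, aesthetics, labels, and a geometry.
ggplot() +
  aes() +
  labs()
```

```r
# Completed plot: Normal probability plot of the residuals.
ggplot(data = gratitude.diag) +
  aes(sample = .resid) +
  labs(y = "Sample Quantiles",
       x = "Theoretical Quantiles") +
  geom_qq()
```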

#### Solution

If the errors follow a Normal distribution, we would expect a probability plot of the residuals to exhibit a linear trend. Examining the above probability plot of the residuals, we do see a linear trend. Therefore, it is reasonable to assume the errors follow a Normal distribution.
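The "linear trend" in a probability plot can also be summarized numerically. The sketch below is one possible supplement, not part of the tutorial's method: on simulated data (`sim`, with invented parameter values and Normal errors by construction), it uses base R's `qqnorm()` to pair each residual with its theoretical Normal quantile and then computes the correlation between the two.

```r
set.seed(223)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for the survey data; all parameter values are made up.
n <- 138
sim <- data.frame(Gratitude = sample(6:42, size = n, replace = TRUE))
sim$Wellbeing <- 20 + 0.85 * sim$Gratitude + rnorm(n, mean = 0, sd = 8)
sim.model <- lm(Wellbeing ~ 1 + Gratitude, data = sim)

# qqnorm() returns theoretical quantiles (x) and sample quantiles (y);
# plot.it = FALSE suppresses the graphic so only the numbers are kept.
qq <- qqnorm(residuals(sim.model), plot.it = FALSE)

# A correlation close to 1 corresponds to a roughly linear probability plot.
qq.cor <- cor(qq$x, qq$y)
qq.cor
```

The correlation is only a summary; the plot itself remains the better tool for spotting where a departure from linearity occurs (for example, in the tails).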

### Exercise: Assessing Homoskedasticity

Is it reasonable to assume the variability of the error in the well-being score is constant, regardless of the gratitude score? Use an appropriate graphic to justify your answer.

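```r
# Starter template: fill in the data, aesthetics, labels, and a geometry.
ggplot() +
  aes() +
  labs()
```

```r
# Completed plot: residuals against fitted values.
ggplot(data = gratitude.diag) +
  aes(y = .resid,
      x = .fitted) +
  labs(y = "Residuals",
       x = "Predicted Well-being Score") +
  geom_point()
```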

#### Solution

If the errors are consistent with the constant-variance condition, then we would expect that a plot of the residuals against fitted values would not exhibit any trends in the spread of the residuals as we move left-to-right across the graphic. Examining the plot above, the spread of the residuals remains fairly constant for all predicted well-being scores. Therefore, it is reasonable to assume the errors are consistent with this condition.
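One informal way to put a number on "the spread remains fairly constant" is to compare the standard deviation of the residuals for low and high fitted values. The sketch below does this on simulated data (`sim`, with invented parameter values and constant error variance by construction); splitting at the median is an arbitrary choice made for illustration, not a procedure from the tutorial.

```r
set.seed(223)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for the survey data; all parameter values are made up.
n <- 138
sim <- data.frame(Gratitude = sample(6:42, size = n, replace = TRUE))
sim$Wellbeing <- 20 + 0.85 * sim$Gratitude + rnorm(n, mean = 0, sd = 8)
sim.model <- lm(Wellbeing ~ 1 + Gratitude, data = sim)

fit.vals <- fitted(sim.model)
e <- residuals(sim.model)

# Split the observations into lower and upper halves of the fitted values.
upper <- fit.vals > median(fit.vals)

# Under constant variance, the two standard deviations should be similar,
# so their ratio should be near 1.
sd.ratio <- sd(e[upper]) / sd(e[!upper])
sd.ratio
```

A ratio far from 1 would correspond to the fan or funnel shape that signals a violation in the residual plot.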

### Exercise: Assessing Independence

As this survey was conducted online over a period of time, we do have a sense of ordering; we will assume the data is presented in the order it was obtained. Under this assumption, is there reason to believe the error in the well-being score for one individual is not independent of the error in the well-being score for any other individual? Explain.

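```r
# Starter template: fill in the data, aesthetics, labels, and geometries.
ggplot() +
  aes() +
  labs()
```

```r
# Completed plot: time-series plot of the residuals in the order presented.
ggplot(data = gratitude.diag) +
  aes(y = .resid,
      x = seq_along(.resid)) +
  labs(y = "Residuals",
       x = "Order in Which Data is Presented") +
  geom_point() +
  geom_line()
```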

#### Solution

Since we know the order in which the data was collected, constructing a time-series plot is reasonable. The time-series plot of the residuals shows no trend in the location or spread of the residuals over time. This is consistent with what we would expect if the errors in the response are independent of one another. Thinking through the context, since lockdown measures were in place, there is no reason that one person's responses would be influenced by another individual's responses. Therefore, there does not appear to be any reason why the error in the well-being for one individual would be associated with the error in the well-being of any other individual.
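A time-series plot looks for patterns between neighboring observations; the lag-1 correlation of the residuals is a simple numeric companion to that plot. The sketch below computes it on simulated data (`sim`, with invented parameter values and independent errors by construction). It is offered as an optional supplement under those assumptions rather than as the tutorial's method.

```r
set.seed(223)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for the survey data; all parameter values are made up.
n <- 138
sim <- data.frame(Gratitude = sample(6:42, size = n, replace = TRUE))
sim$Wellbeing <- 20 + 0.85 * sim$Gratitude + rnorm(n, mean = 0, sd = 8)
sim.model <- lm(Wellbeing ~ 1 + Gratitude, data = sim)

e <- residuals(sim.model)

# Correlation between each residual and the one immediately before it,
# taking the row order as the order of data collection.
lag1.cor <- cor(e[-1], e[-length(e)])
lag1.cor
```

A value near 0 is what independent errors would tend to produce; a value well away from 0 would suggest that consecutive responses are related.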

### Exercise: Assessing Mean 0

We have assumed that if there is a relationship between well-being and gratitude, it can be described linearly. Is this structural form reasonable, or is there concern that the deterministic portion of the model for the data generating process has been misspecified? Explain.

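```r
# Starter template: fill in the data, aesthetics, labels, and a geometry.
ggplot() +
  aes() +
  labs()
```

```r
# Completed plot: residuals against fitted values.
ggplot(data = gratitude.diag) +
  aes(y = .resid,
      x = .fitted) +
  labs(y = "Residuals",
       x = "Predicted Well-being Score") +
  geom_point()
```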

#### Solution

As there are no trends in the location of the residuals when plotted against the fitted values (which, with a single predictor, are a linear function of the gratitude score), the data is consistent with the errors having a mean of 0 for all values of the predictor. That is, it is reasonable to assume the deterministic portion of the model for the data generating process was correctly specified.
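"No trend in the location of the residuals" can be checked informally by averaging the residuals within bands of the fitted values. The sketch below does so on simulated data (`sim`, with invented parameter values and a truly linear mean by construction); the choice of four equal-width bands is arbitrary and made only for illustration.

```r
set.seed(223)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for the survey data; all parameter values are made up.
n <- 138
sim <- data.frame(Gratitude = sample(6:42, size = n, replace = TRUE))
sim$Wellbeing <- 20 + 0.85 * sim$Gratitude + rnorm(n, mean = 0, sd = 8)
sim.model <- lm(Wellbeing ~ 1 + Gratitude, data = sim)

e <- residuals(sim.model)

# Group the observations into four equal-width bands of fitted values.
bins <- cut(fitted(sim.model), breaks = 4)

# If the mean structure is correct, each band's average residual should be
# small relative to the overall spread of the residuals.
bin.means <- tapply(e, bins, mean)
bin.means
```

If the true relationship were curved, the band averages would tend to change sign systematically (for example, positive at the ends and negative in the middle).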

### Exercise: Modeling the Sampling Distribution

Based on your conclusions above, construct an appropriate 95% confidence interval for each parameter in the model. What conclusions can be drawn regarding the research question?

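```r
# Starter template: supply the model, confidence level, and assumed conditions.
estimate_parameters()
```

```r
# Completed call: 95% intervals, assuming constant variance and Normal errors.
estimate_parameters(gratitude.model,
                    confidence.level = 0.95,
                    assume.constant.variance = TRUE,
                    assume.normality = TRUE)
```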

#### Solution

We note that the 95% confidence interval for the slope contains only positive values. For each 1-unit increase in the gratitude score, the well-being score increases by between 0.67 and 1.05 units, on average. This suggests that individuals who are more grateful tend to have higher levels of well-being, on average.
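When the errors are assumed to be independent, Normal, and of constant variance, the classical interval for each parameter is the estimate plus or minus a t critical value times its standard error. The sketch below works through that arithmetic in base R on simulated data (`sim`, with invented parameter values) and compares it with `confint()`. Whether `estimate_parameters()` performs exactly this calculation under these settings is an assumption here, not something the tutorial states, and the numbers produced will not match the 0.67 to 1.05 interval from the real data.

```r
set.seed(223)  # arbitrary seed so the sketch is reproducible

# Hypothetical stand-in for the survey data; all parameter values are made up.
n <- 138
sim <- data.frame(Gratitude = sample(6:42, size = n, replace = TRUE))
sim$Wellbeing <- 20 + 0.85 * sim$Gratitude + rnorm(n, mean = 0, sd = 8)
sim.model <- lm(Wellbeing ~ 1 + Gratitude, data = sim)

# Coefficient table with columns "Estimate" and "Std. Error" (among others).
est <- coef(summary(sim.model))

# t critical value for a 95% interval with n - 2 residual degrees of freedom.
t.crit <- qt(0.975, df = df.residual(sim.model))

# Estimate plus or minus the critical value times the standard error.
manual.ci <- cbind(lower = est[, "Estimate"] - t.crit * est[, "Std. Error"],
                   upper = est[, "Estimate"] + t.crit * est[, "Std. Error"])
manual.ci

# Built-in helper for the same classical interval.
confint(sim.model, level = 0.95)
```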

reyesem/IntroAnalysis documentation built on March 29, 2025, 3:29 p.m.
