IntroAnalysis:::ma223_setup()
library(learnr)

knitr::opts_chunk$set(echo = FALSE)
learnr::tutorial_options(exercise.cap = "Exercise")

```{css, echo=FALSE} / Solution Functionality / .accordion > input[type="checkbox"] { position: absolute; left: -100vw; }

.accordion .content { overflow-y: hidden; height: 0; transition: height 0.3s ease; }

.accordion > input[type="checkbox"]:checked ~ .content { height: auto; overflow: visible; }

.accordion label { display: block; }

/ Styling /

.accordion { margin-bottom: 1em; }

.accordion > input[type="checkbox"]:checked ~ .content { padding: 15px; border: 1px solid #e8e8e8; border-top: 0; }

.accordion .handle { margin: 0; }

.accordion label { cursor: pointer; font-weight: normal; padding: 15px; }

.accordion label:hover, .accordion label:focus { background: #d8d8d8; }

/ Turning arrow / .accordion .handle label:before { font-family: 'fontawesome'; content: "\f054"; display: inline-block; margin-right: 10px; font-size: .58em; line-height: 1.556em; vertical-align: middle; }

.accordion > input[type="checkbox"]:checked ~ .handle label:before { content: "\f078"; }

.tipbox { padding: 1em 1em 1em 1em; border: 1px solid black; border-radius: 10px; box-shadow: 2px 2px #888888; }

.keyidea { border: 1px solid #54585A; background: #4F758B; color: white; }

.keyidea:before { content: "Key Idea: "; font-weight: bold; }

.tip { background: lightyellow; }

.tip:before { content: "Tip: "; font-weight: bold; }

.title{ color: #800000; background: none; }

.solution { background: #ebebeb; }

h2{ color: white; background-color: #800000; }

h3{ color: #800000; border-bottom: thick solid #696969; }

h4{ color: #800000; }

blockquote { font-size: inherit; padding: 0px 20px; }

## Background
Regular exercise is an important component of a healthy lifestyle, including mental health.  Exercise has become a suggested component of treatment for depression.  It is important to note that the [CDC considers a broad definition of "exercise,"](https://www.cdc.gov/physicalactivity/basics/adults/index.htm) which includes activities we might not typically consider.  In particular, the CDC defines "physical activity" as "anything that gets your body moving."  

> If you would like ideas on how to remain active, the Sports and Recreation Center staff offer lots of activities from IM leagues to exercise classes.

College students often have lofty goals at the beginning of a term, but then healthy lifestyle choices fall away as academic demands increase.  [Kroencke, Harari, and Gosling (2017)](https://osf.io/swfeg/) conducted a study with over 2000 undergraduate students enrolled in an online psychology course.  Students were asked to track their exercise over the course of the academic term.  A subset of the data is available (`exercise`).  We are particularly interested in examining the number of days students exercised during the first two weeks of the term (`exercise` variable).



## Setting up the Hypotheses
Suppose Student Affairs is interested in characterizing the exercise habits of students at the start of a term.  In particular, they are interested in determining if the study provides evidence that, on average, students exercise less than 7 days during the first two weeks of class.

### Exercise: Graphically Summarizing the Data
> Below is a graphic summarizing the data collected; explain why this graphic is a poor summary, and construct a more appropriate graphic for addressing the above research objective.

```r
exercise |>
  drop_na(exercise) |>
  summarise(x = mean(exercise)) |>
  ggplot(mapping = aes(y = x, x = '')) +
  geom_col(color = 'black', fill = 'grey75') +
  labs(y = 'Average Number of Days Students\nExercise in First Two Weeks',
       x = NULL)
ggplot() +
  aes() +
  labs()
ggplot(data = exercise) +
  aes(y = `exercise`,
      x = "") +
  labs(y = "Number of Days of Exercise",
       x = "") +
  geom_jitter()

A graphic should convey more than a table; in particular, the graphic should allow us to examine the distribution of the variable. In this case, we are missing any sense of the spread in the variable. We note that there are many potential solutions for reasonable graphics. As there are a limited number of potential values (0 - 14), but several hundred observations, we use a jitter plot to prevent overplotting.

Exercise: Hypotheses

State the null and alternative hypotheses that capture the research objective. Be sure to define the parameter(s) of interest.

Let $\mu$ represent the average number of days students exercise during the first two weeks of class. Then, we are interested in testing $$H_0: \mu \geq 7 \qquad \text{vs.} \qquad H_1: \mu < 7$$

Computing a P-Value

Exercise: Model for Data Generating Process

Using mathematical notation, state the model for the data generating process under each hypothesis.

Under the alternative hypothesis, there is no restriction placed on the parameter; therefore, the model for the data generating process is a function of the unknown parameter: $$(\text{Number of Days})_i = \mu + \varepsilon_i$$ Under the null hypothesis, we place a constraint on the parameter. In particular, we choose the value defined by the equality component of the null hypothesis. This results in a restricted model which is $$(\text{Number of Days})_i = 7 + \varepsilon_i$$
Hypotheses are statements about the parameters that appear in the model for the data generating process. The null hypothesis suggests a simpler model for the data generating process by constraining the parameters.

Exercise: Computing a P-Value

Assuming the data is representative of all undergraduates and that the amount of time one student spends exercising is independent of the amount of time any other student spends exercising, is there evidence that undergraduates exercise less than 7 days during the first two weeks, on average? Compute a p-value for addressing this question, and then use it to state your conclusion.

exercise.h1 = specify_mean_model()
exercise.h0 = specify_mean_model()

compare_models()
exercise.h1 = specify_mean_model(exercise ~ 1, data = exercise)
exercise.h0 = specify_mean_model(exercise ~ constant(7), data = exercise)

compare_models(exercise.h1, exercise.h0,
               alternative = 'less than')

With such a small p-value, the study provides strong evidence the average number of days students exercise over the first two weeks of class is less than 7. We actually estimate the mean to be less than 5 days over the first two weeks.


reyesem/IntroAnalysis documentation built on March 29, 2025, 3:29 p.m.