README.md
In reyesem/IntroAnalysis: Functions for introductory statistics using linear models

IntroAnalysis

Wrappers for estimating parameters and comparing nested models (conducting hypothesis tests) when working with models suitable for an introductory course on statistical analysis.

The IntroAnalysis package was constructed to support an introductory statistics course which takes a modeling-based perspective. As an example, consider making inference about a single mean (\mu) in a population. The classical approach is to provide formulas for a (1-\alpha) confidence interval

[\bar{x} \pm Z_{1-\alpha/2} \frac{s}{\sqrt{n}}]

or investigating the distribution of the standardized statistic

[\frac{\bar{x} - \mu_0}{s/\sqrt{n}}.]

Inference for a single population is then divorced from inference in simple linear regression.

Alternatively, a modeling-based perspective considers the model for the data generating process

[(\text{Response})_i = \mu + \varepsilon_i]

where we are willing to impose some conditions on the error terms. This allows all of the introductory course to be considered from the same perspective.

The IntroAnalysis package provides wrappers for estimating perameters and comparing nested models when working with linear and generalized linear models. These wrappers provide a single interface for performing statistical inference imposing the classical conditions or via bootstrapping.

Suppose we are interested in computing a 95% confidence interval (assuming the population follows a Normal distribution) for the mean (\mu). We assume the response variable Response is stored in the dataframe df. We first specify the model for the data generating process

mean.model = specify_mean_model(Response ~ 1, data = df)

We then estimate the parameters

estimate_parameters(
  mean.model, 
  confidence.level = 0.95,
  assume.identically.distributed = TRUE,
  assume.normality = TRUE)

The above code specifies the assumptions made for the classic 1-sample t-interval.

Now, suppose we are interested in testing the hypotheses

[H_0: \mu = 5 \qquad \text{vs.} \qquad H_1: \mu \neq 5]

but we are not willing to assume the population is Normally distributed. Then, we can perform a residual bootstrap. We have already specified the model for the data generating process above, but we need to specify the model under the null hypothesis

mean.model.h0 = specify_mean_model(Response ~ constant(5), data = df)

Notice that there are no parameters to estimate in the above model. Now, we proceed to performing the hypothesis test by comparing the models

compare_models(
  mean.model,
  mean.model.h0,
  assume.identically.distributed = TRUE,
  assume.normality = FALSE)

The code is similar to a classical t-test, emphasizing the same process but different conditions being imposed on the error term.

The package can be installed from Github.

library(devtools)
install_github("reyes/IntroAnalysis")

Once installed, we recommend starting with the Essential Inference using Linear Models vignette or the Coding Cookbook vignette.

reyesem/IntroAnalysis documentation built on March 29, 2025, 3:29 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

reyesem/IntroAnalysis
Functions for introductory statistics using linear models

README.md
In reyesem/IntroAnalysis: Functions for introductory statistics using linear models

IntroAnalysis

Overview

Example

Installation

R Package Documentation

Browse R Packages

We want your feedback!

reyesem/IntroAnalysis Functions for introductory statistics using linear models

README.md In reyesem/IntroAnalysis: Functions for introductory statistics using linear models

IntroAnalysis

Overview

Example

Installation

R Package Documentation

Browse R Packages

We want your feedback!

reyesem/IntroAnalysis
Functions for introductory statistics using linear models

README.md
In reyesem/IntroAnalysis: Functions for introductory statistics using linear models