View source: R/modeling_phrases.R
estimate_parameters (R Documentation)
Provides point estimates and confidence intervals for the parameters of a linear or generalized linear model via bootstrapping or classical theory.
## S3 method for class 'lm'
estimate_parameters(
mean.model,
confidence.level,
simulation.replications = 4999,
assume.constant.variance = TRUE,
assume.normality = FALSE,
construct = c("normal-2", "normal-1", "two-point mass"),
type = c("percentile", "BC", "bootstrap-t"),
...
)
## S3 method for class 'glm'
estimate_parameters(
mean.model,
confidence.level,
simulation.replications = 4999,
method = c("classical", "parametric", "case-resampling"),
type = c("percentile", "BC", "bootstrap-t"),
...
)
estimate_parameters(
mean.model,
confidence.level,
simulation.replications = 4999,
...
)
mean.model: lm or glm object defining the model for the mean response whose parameters are to be estimated.
confidence.level: scalar between 0 and 1 indicating the confidence level for all confidence intervals constructed. If missing (default), only point estimates are returned.
simulation.replications: scalar indicating the number of samples to draw from the model for the sampling distribution (default = 4999). This is either the number of bootstrap replications or the number of samples from the classical sampling distribution.
assume.constant.variance: boolean; if TRUE (default), the errors are assumed to have a common variance (homoskedasticity).
assume.normality: boolean; if TRUE, the errors are assumed to follow a normal distribution (default = FALSE).
construct: string defining the type of construct to use when generating from the distribution for the wild bootstrap; one of "normal-2" (default), "normal-1", or "two-point mass".
type: string defining the type of confidence interval to construct; one of "percentile" (default), "BC", or "bootstrap-t".
...: additional arguments to be passed to other methods.
method: string defining the methodology to employ; one of "classical" (default), "parametric", or "case-resampling".
This wrapper provides a single interface for estimating parameters under various conditions imposed on the model. As with summary(), point and interval estimates of the parameters are available; however, interval estimates can be constructed via bootstrapping or classical theory.
For linear models, the following approaches are implemented:
classical: if both homoskedasticity and normality are assumed, the sampling distribution of a standardized statistic is modeled by a t-distribution.
parametric bootstrap: if normality can be assumed but homoskedasticity cannot, a parametric bootstrap can be performed in which the variance of each observation is estimated by the square of the corresponding residual (similar to White's correction).
residual bootstrap: if homoskedasticity can be assumed, but normality cannot, a residual bootstrap is used to compute standard errors and confidence intervals.
wild bootstrap: if neither homoskedasticity nor normality is assumed, a wild bootstrap is used to compute standard errors and confidence intervals.
All methods make additional requirements regarding independence of the error terms and that the model has been correctly specified. Note: for parametric bootstrap assuming constant variance, use a generalized linear model approach.
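As a concrete illustration of the wild bootstrap described above, the base-R sketch below resamples a slope estimate using Rademacher weights (the "two-point mass" construct). The simulated data and variable names are for illustration only and are not part of the package.

```r
# Wild bootstrap for the slope of a simple linear model, using
# "two-point mass" (Rademacher) weights; base R only, simulated data.
set.seed(42)
n <- 50
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 * (1 + x))   # heteroskedastic errors
fit <- lm(y ~ x)
r <- resid(fit)
f <- fitted(fit)

B <- 999
boot.slopes <- replicate(B, {
  w <- sample(c(-1, 1), n, replace = TRUE)      # random signs
  y.star <- f + w * r                           # wild-bootstrap response
  coef(lm(y.star ~ x))[2]
})

quantile(boot.slopes, c(0.025, 0.975))          # percentile interval
```

Multiplying each residual by an independent random sign preserves its magnitude (and hence the observation-specific variance) while breaking any distributional assumption, which is why this scheme requires neither homoskedasticity nor normality.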
For generalized linear models, the following approaches are implemented:
classical: if the distributional family is assumed correct, large sample theory is used to justify modeling the sampling distribution of a standardized statistic using a standard normal distribution.
parametric bootstrap: the distributional family is assumed and a parametric bootstrap is performed to compute standard errors and confidence intervals.
nonparametric bootstrap: a case resampling bootstrap algorithm is used to estimate standard errors and confidence intervals.
All methods require observations to be independent of one another.
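The case-resampling (nonparametric) bootstrap for a generalized linear model can be sketched in base R as follows; the data are simulated for the example and the variable names are illustrative, not part of the package.

```r
# Case-resampling bootstrap for a logistic-regression slope; base R
# only, data simulated for the example.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))
dat <- data.frame(x = x, y = y)

B <- 499
boot.coefs <- replicate(B, {
  idx <- sample(n, replace = TRUE)      # resample whole rows (cases)
  coef(glm(y ~ x, family = binomial, data = dat[idx, ]))[2]
})

sd(boot.coefs)                          # bootstrap standard error
quantile(boot.coefs, c(0.025, 0.975))   # percentile interval
```

Because entire observations are resampled, this approach does not rely on the assumed distributional family being correct, only on the cases being independent.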
Confidence intervals constructed via bootstrapping can take on various forms.
The percentile interval is constructed by taking the empirical 100\alpha and 100(1-\alpha) percentiles from the bootstrap statistics. If \hat{F} is the empirical distribution function of the bootstrap values, then the 100(1-2\alpha)% percentile interval is given by

(\hat{F}^{-1}(\alpha), \hat{F}^{-1}(1-\alpha))
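In base R this amounts to a single call to quantile(); the helper name below is illustrative, not part of the package.

```r
# Percentile interval from a vector of bootstrap statistics: the
# empirical alpha and (1 - alpha) quantiles of hat(F).
percentile_ci <- function(boot.stats, confidence.level = 0.95) {
  alpha <- (1 - confidence.level) / 2
  quantile(boot.stats, c(alpha, 1 - alpha), names = FALSE)
}

set.seed(7)
b <- rnorm(4999, mean = 2, sd = 0.3)   # stand-in bootstrap statistics
percentile_ci(b)
```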
The bias-corrected (BC) interval corrects for median bias. It is given by

(\hat{F}^{-1}(\alpha_1), \hat{F}^{-1}(1-\alpha_2))

where

\alpha_1 = \Phi(2\hat{z}_0 + \Phi^{-1}(\alpha))
\alpha_2 = 1 - \Phi(2\hat{z}_0 + \Phi^{-1}(1-\alpha))
\hat{z}_0 = \Phi^{-1}(\hat{F}(\hat{\beta}))

and \hat{\beta} is the estimate from the original sample.
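The BC formulas translate directly into base R via pnorm() and qnorm(); the helper name below is illustrative, not part of the package.

```r
# Bias-corrected (BC) interval, following the formulas above.
bc_ci <- function(boot.stats, estimate, confidence.level = 0.95) {
  alpha  <- (1 - confidence.level) / 2
  z0     <- qnorm(mean(boot.stats <= estimate))   # hat(z)_0
  alpha1 <- pnorm(2 * z0 + qnorm(alpha))          # alpha_1
  alpha2 <- 1 - pnorm(2 * z0 + qnorm(1 - alpha))  # alpha_2
  quantile(boot.stats, c(alpha1, 1 - alpha2), names = FALSE)
}

set.seed(11)
b <- rnorm(4999, mean = 2, sd = 0.3)   # stand-in bootstrap statistics
bc_ci(b, estimate = 2)
```

When the bootstrap distribution is roughly symmetric about the original estimate, \hat{z}_0 is near zero and the BC interval collapses to the percentile interval.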
The bootstrap-t interval is based on the bootstrap distribution of

t^{b} = \frac{\hat{\beta}^{b} - \hat{\beta}}{\hat{\sigma}^{b}}

where \hat{\sigma} is the estimate of the standard error of \hat{\beta} and the superscript b denotes a bootstrap sample. Let \hat{G} be the empirical distribution function of the bootstrap standardized statistics given above. Then, the bootstrap-t interval is given by

(\hat{\beta} - \hat{\sigma}\hat{G}^{-1}(1-\alpha), \hat{\beta} - \hat{\sigma}\hat{G}^{-1}(\alpha))
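The bootstrap-t construction can be sketched in base R as below; the helper name and the stand-in bootstrap values are illustrative, not part of the package.

```r
# Bootstrap-t interval: pivot each bootstrap estimate by its own
# standard error, then invert the empirical distribution of t^b.
bootstrap_t_ci <- function(beta.hat, se.hat, boot.beta, boot.se,
                           confidence.level = 0.95) {
  alpha <- (1 - confidence.level) / 2
  t.b <- (boot.beta - beta.hat) / boot.se                 # t^b
  q <- quantile(t.b, c(1 - alpha, alpha), names = FALSE)  # G^{-1}
  beta.hat - se.hat * q                                   # (lower, upper)
}

set.seed(3)
boot.beta <- rnorm(4999, mean = 2, sd = 0.3)
boot.se   <- rep(0.3, 4999)   # constant SEs, for illustration only
bootstrap_t_ci(2, 0.3, boot.beta, boot.se)
```

Note the quantiles appear in reversed order: subtracting \hat{\sigma}\hat{G}^{-1}(1-\alpha) gives the lower limit and subtracting \hat{\sigma}\hat{G}^{-1}(\alpha) the upper.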
A data.frame containing a table of parameter estimates. The object has an additional attribute "Sampling Distribution", a matrix with simulation.replications rows and as many columns as parameters in mean.model. Each column contains a sample from the corresponding model of the sampling distribution; this is useful for plotting the modeled sampling distribution.
estimate_parameters(lm)
: Estimates for linear models.
estimate_parameters(glm)
: Estimates for generalized linear models.
fit <- lm(mpg ~ 1 + hp, data = mtcars)
# confidence intervals for linear model via a residual bootstrap
estimate_parameters(fit,
confidence.level = 0.95,
assume.constant.variance = TRUE,
assume.normality = FALSE)
# classical inference
estimate_parameters(fit,
confidence.level = 0.95,
assume.constant.variance = TRUE,
assume.normality = TRUE)