DoFCorrection: Degrees of Freedom
In EdSurvey: Analysis of NCES Education Survey and Assessment Data

Description

Calculates the degrees of freedom for a statistic (or of a contrast between two statistics) based on the jackknife and imputation variance estimates.

Usage

DoFCorrection(
  varEstA,
  varEstB = varEstA,
  varA,
  varB = varA,
  method = c("WS", "JR")
)

Arguments

`varEstA`	the `varEstInput` object returned from certain functions, such as `lm.sdf` (when `returnVarEstInputs=TRUE`). The statistic named in `varA` must be present in this object. See Examples.
`varEstB`	similar to the `varEstA` argument. If left blank, both are assumed to come from `varEstA`. When set, the degrees of freedom are for a contrast between `varA` and `varB`, and the `varB` values are taken from `varEstB`.
`varA`	a character string that names the statistic in the `varEstA` argument for which the degrees of freedom calculation is required.
`varB`	a character string that names the statistic in the `varEstB` argument for which a covariance is required. When `varB` is specified, the function returns the degrees of freedom for the contrast between `varA` and `varB`.
`method`	a character string that is either `WS` for the Welch-Satterthwaite formula or `JR` for the Johnson-Rust correction to the Welch-Satterthwaite formula.

Details

The calculation rests on the notion that statistics have little variance within strata and that some strata will contribute less than a full degree of freedom.

The function is not vectorized, so `varA` and `varB` must each contain exactly one variable name.

The method used to compute the degrees of freedom is described in the vignette titled Statistical Methods Used in EdSurvey, in the section “Estimation of Degrees of Freedom.”
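
The idea that a stratum can contribute less than a full degree of freedom can be illustrated with a minimal sketch. It assumes the common jackknife form of the Welch-Satterthwaite approximation, in which the degrees of freedom are the squared sum of squared replicate deviations divided by the sum of their fourth powers, and it assumes the Johnson-Rust adjustment multiplies that value by (3.16 - 2.77/sqrt(J)), where J is the number of replicates. The helpers `wsDoF` and `jrDoF` and the toy deviations are hypothetical and are not part of EdSurvey; the vignette remains the reference for the exact formula the package uses, including how plausible values are handled.

```r
# Hypothetical helper (not an EdSurvey function): Welch-Satterthwaite degrees
# of freedom from jackknife deviations
# d_j = (replicate estimate j) - (full-sample estimate).
wsDoF <- function(d) {
  sum(d^2)^2 / sum(d^4)
}

# Hypothetical helper (not an EdSurvey function): Johnson-Rust style
# adjustment, ASSUMING the multiplier (3.16 - 2.77 / sqrt(J)),
# where J is the number of jackknife replicates.
jrDoF <- function(d) {
  J <- length(d)
  (3.16 - 2.77 / sqrt(J)) * wsDoF(d)
}

# Made-up deviations in which every replicate contributes equally
dEqual <- rep(0.5, 4)
# Made-up deviations in which one replicate dominates the variance
dUneven <- c(2, 0.1, 0.1, 0.1)

wsDoF(dEqual)   # equals the number of replicates (4)
wsDoF(dUneven)  # only slightly above 1: the small terms add almost nothing
jrDoF(dUneven)  # the same value scaled by the assumed Johnson-Rust multiplier
```

When one replicate dominates, the small deviations barely change the ratio, which is the sense in which they contribute less than a full degree of freedom.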

Value

numeric; the estimated degrees of freedom

Author(s)

Paul Bailey

References

Johnson, E. G., & Rust, K. F. (1992). Population inferences and variance estimation for NAEP data. Journal of Educational Statistics, 17, 175–190.

Examples

## Not run: 
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package="NAEPprimer"))
lm1 <- lm.sdf(formula=composite ~ dsex + b017451, data=sdf, returnVarEstInputs=TRUE)
summary(lm1)
# this output agrees with summary of lm1 coefficient for dsex
DoFCorrection(lm1$varEstInputs,
              varA="dsexFemale",
              method="JR")
# second example, a covariance term requires more work
# first, estimate the covariance between two regression coefficients
# note that the variable names are parallel to what they are called in lm1 output
covFEveryDay <- varEstToCov(lm1$varEstInputs,
                            varA="dsexFemale",
                            varB="b017451Every day",
                            jkSumMultiplier=
                            EdSurvey:::getAttributes(data=sdf, attribute="jkSumMultiplier"))
# second, find the SE of the difference: sqrt(var(A) + var(B) - 2*cov(A,B))
se <- sqrt(lm1$coefmat["dsexFemale","se"]^2 + lm1$coefmat["b017451Every day","se"]^2 -
           2*covFEveryDay)
# third, calculate the t-statistic
tv <- (coef(lm1)["dsexFemale"] - coef(lm1)["b017451Every day"])/se
# fourth, calculate the p-value, which requires the estimated degrees of freedom
dofFEveryDay <- DoFCorrection(lm1$varEstInputs,
                              varA="dsexFemale",
                              varB="b017451Every day",
                              method="JR")
# finally, the p-value
2*(1-pt(abs(tv), df=dofFEveryDay))

## End(Not run)

EdSurvey documentation built on June 27, 2024, 5:10 p.m.