varEstToCov: Covariance Estimation
In EdSurvey: Analysis of NCES Education Survey and Assessment Data


Description

When the variance of a derived statistic (e.g., a difference) is required, the covariance between the two statistics must be calculated. This function uses the results generated by other functions (e.g., lm.sdf) to find the covariance between two statistics.
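To show why the covariance is needed, here is a minimal base R sketch of the standard error of a difference, SE(A - B) = sqrt(var(A) + var(B) - 2*cov(A, B)). The helper name `seOfDifference` is hypothetical and is not part of EdSurvey; the numbers are made up for illustration.

```r
# Hypothetical helper, not part of EdSurvey.
# seA, seB: standard errors of the two statistics
# covAB:    covariance between them (e.g., as returned by varEstToCov)
seOfDifference <- function(seA, seB, covAB) {
  sqrt(seA^2 + seB^2 - 2 * covAB)
}

# with zero covariance, the variances simply add
seUncorrelated <- seOfDifference(seA = 3, seB = 4, covAB = 0)

# a positive covariance shrinks the standard error of the difference
seCorrelated <- seOfDifference(seA = 3, seB = 4, covAB = 4.5)

print(c(uncorrelated = seUncorrelated, correlated = seCorrelated))
```

Ignoring the covariance term would overstate the standard error when the two statistics are positively correlated and understate it when they are negatively correlated.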

Usage

varEstToCov(
  varEstA,
  varEstB = varEstA,
  varA,
  varB = varA,
  jkSumMultiplier,
  returnComponents = FALSE
)

Arguments

`varEstA`	a list of two `data.frame`s returned by a function after the `returnVarEstInputs` argument was turned on. The statistic named in the `varA` argument must be present in each `data.frame`.
`varEstB`	a list of two `data.frame`s returned by a function after the `returnVarEstInputs` argument was turned on. The statistic named in the `varB` argument must be present in each `data.frame`. When `varEstB` is the same as `varEstA` (the default), the covariance is within one result.
`varA`	a character that names the statistic in the `varEstA` argument for which a covariance is required
`varB`	a character that names the statistic in the `varEstB` argument for which a covariance is required
`jkSumMultiplier`	when the jackknife variance estimation method (or the balanced repeated replication (BRR) method) multiplies the final jackknife variance estimate by a value, set `jkSumMultiplier` to that value. For an `edsurvey.data.frame` or a `light.edsurvey.data.frame`, the recommended value can be recovered with `EdSurvey::getAttributes(myData, "jkSumMultiplier")`.
`returnComponents`	set to `TRUE` to return the imputation variance separate from the sampling variance

Details

This function is not vectorized, so varA and varB must each contain exactly one variable name.

The method used to compute the covariance is in the vignette titled Statistical Methods Used in EdSurvey.

The method used to compute the degrees of freedom is in the vignette titled Statistical Methods Used in EdSurvey in the section “Estimation of Degrees of Freedom.”
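As a rough illustration of the sampling part of a jackknife covariance, the sketch below sums the cross-products of replicate deviations and scales them by a multiplier. This is the generic textbook form, written as an assumption; the helper `jkCovSketch` and the toy numbers are hypothetical, and the vignette remains the reference for what EdSurvey actually computes, including the imputation component that applies when plausible values are used.

```r
# Hypothetical helper, not part of EdSurvey: a generic jackknife covariance.
# estA, estB:      full-sample estimates of the two statistics
# repA, repB:      the same statistics re-estimated with each replicate weight
# jkSumMultiplier: constant applied to the sum over replicates
jkCovSketch <- function(estA, estB, repA, repB, jkSumMultiplier = 1) {
  stopifnot(length(repA) == length(repB))
  jkSumMultiplier * sum((repA - estA) * (repB - estB))
}

# toy replicate estimates (made-up numbers, for illustration only)
estA <- 2.0
estB <- -1.0
repA <- c(2.1, 1.9, 2.2, 1.8)
repB <- c(-1.2, -0.9, -1.1, -0.8)

covAB <- jkCovSketch(estA, estB, repA, repB)
# the covariance of a statistic with itself is its jackknife variance
varA <- jkCovSketch(estA, estA, repA, repA)
varB <- jkCovSketch(estB, estB, repB, repB)

print(c(covAB = covAB, varA = varA, varB = varB))
```

Passing the same statistic for both arguments recovers the usual jackknife variance, which is a quick sanity check on any covariance routine of this shape.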

Value

A numeric value: the jackknife covariance estimate. If returnComponents is TRUE, a vector of length three is returned instead, in which V is the variance estimate, Vsamp is the sampling component of the variance, and Vimp is the imputation component of the variance.

Author(s)

Paul Bailey

Examples

## Not run: 
# read in the example data (generated, not real student data)
sdf <- readNAEP(path=system.file("extdata/data", "M36NT2PM.dat", package = "NAEPprimer"))

# estimate a regression
lm1 <- lm.sdf(formula=composite ~ dsex + b017451, data=sdf, returnVarEstInputs=TRUE)
summary(lm1)
# estimate the covariance between two regression coefficients
# note that the variable names are parallel to what they are called in lm1 output
jkSumMultiplier <- EdSurvey::getAttributes(data=sdf, attribute="jkSumMultiplier")
covFEveryDay <- varEstToCov(varEstA=lm1$varEstInputs,
                            varA="dsexFemale",
                            varB="b017451Every day",
                            jkSumMultiplier=jkSumMultiplier)
# the estimated difference between the two coefficients
# note: unname prevents output from being named after the first coefficient
unname(coef(lm1)["dsexFemale"] - coef(lm1)["b017451Every day"])
# the standard error of the difference
# uses the formula SE(A-B) = sqrt(var(A) + var(B) - 2*cov(A,B))
sqrt(lm1$coefmat["dsexFemale", "se"]^2
     + lm1$coefmat["b017451Every day", "se"]^2
     - 2 * covFEveryDay)

## End(Not run)

EdSurvey documentation built on June 27, 2024, 5:10 p.m.