View source: R/ImputeRegression2.R
ImputeRegression2 | R Documentation |
Imputation of a single variable (y) by a regression model using a primary explanatory variable (x1) and a secondary explanatory variable (x2) for cases where the primary is missing.
ImputeRegression2(
data,
idName = names(data)[1],
strataName = NULL,
x1Name = names(data)[3],
x2Name = names(data)[4],
yName = names(data)[5],
method1 = "ordinary",
method2 = "ordinary",
limitModel = 2.5,
limitIterate = 4.5,
limitImpute = 50,
returnSameType = TRUE
)
ImputeRegression2NewNames(
...,
oldNames = c("yImputed", "Ntotal", "nImputedTotal", "AnImputedTotal", "BnImputedTotal",
"estimateTotal", "AyHat", "ByHat", "AestimateOrig", "cvTotal"),
newNames = c("estimate", "N", "nImputed", "AnImputed", "BnImputed", "estimate",
"AestimateYHat", "BestimateYHat", "y", "cv")
)
ImputeRegression2Tall(..., iD = TalliD())
ImputeRegression2TallSmall(
...,
iD = TalliD(),
keep = c("ID", "estimate", "cv", "nImputed")
)
ImputeRegression2Wide(
...,
addName = WideAddName(),
sep = WideSep(),
idNames = c("", "strata", ""),
addLast = FALSE
)
ImputeRegression2WideSmall(
...,
keep = c("id", "strata", "estimate", "cv", "nImputed"),
addName = WideAddName(),
sep = WideSep(),
idNames = c("", "strata", ""),
addLast = FALSE
)
data |
Input data set of class data.frame |
idName |
Name of id-variable(s) |
strataName |
Name of strata-variable. Single strata when NULL (default) |
x1Name |
Name of x1-variable |
x2Name |
Name of x2-variable |
yName |
Name of y-variable |
method1 |
The method (model and weight) coded as a string: "ordinary" (default), "ratio", "noconstant", "mean" or "ratioconstant". In addition, "ratio2" and "ratioconstant2" are alternatives where the weights are based on the other x-variable (x1<->x2). |
method2 |
Similar to method1 above. |
limitModel |
Studentized residuals limit. Above limit -> group 2. |
limitIterate |
Studentized residuals limit for iterative calculation of studentized residuals. |
limitImpute |
Studentized residuals limit. Above limit -> group 3. |
returnSameType |
When TRUE (default) and when the type of input y variable(s) is integer, the output type of yImputed/estimate/estimateTotal is also integer. Estimates/sums are then calculated from rounded imputed values. |
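The effect of returnSameType on totals can be illustrated with a small base R sketch. The values below are hypothetical and the code does not call the package; it only shows why a sum of rounded imputed values can differ from a rounded sum.

```r
# Hypothetical unrounded imputed values for an integer-valued y
yImputedRaw <- c(10.4, 20.4, 30.4)

# Round each imputed value first, then sum (as described for returnSameType = TRUE)
totalFromRounded <- sum(as.integer(round(yImputedRaw)))

# For comparison: sum the unrounded values, then round the total
roundedTotal <- as.integer(round(sum(yImputedRaw)))

print(c(totalFromRounded = totalFromRounded, roundedTotal = roundedTotal))
```

Summing the rounded values keeps the reported total equal to the sum of the integer micro data, at the cost of a small difference from the unrounded total.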
Imputations are initially performed by running method1 using x1 within each strata. Division into three groups is based on studentized residuals. Calculations of studentized residuals are performed by iteratively throwing out observations from the model fitting. Missing imputed values caused by missing x1-values are thereafter imputed by running method2 using x2 within each strata. Combined estimates of seRobust, seEstimate and cv are calculated.
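The two-round idea described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: it uses simulated data, a single stratum, an unweighted lm fit (comparable to "ordinary"), and one exclusion limit. The helper fitExcludingOutliers and all variable names are hypothetical.

```r
set.seed(1)
n <- 40
d <- data.frame(x1 = runif(n, 10, 100), x2 = runif(n, 10, 100))
d$y <- 2 * d$x1 + 0.5 * d$x2 + rnorm(n, sd = 5)
d$y[1] <- d$y[1] + 500   # one clearly deviating observation
d$y[2:6] <- NA           # y-values to be imputed
d$x1[2:3] <- NA          # x1 is missing for two of them

# Hypothetical helper: refit while any studentized residual exceeds the limit
fitExcludingOutliers <- function(formula, data, limit) {
  keep <- complete.cases(data[, all.vars(formula)])
  repeat {
    fit <- lm(formula, data = data[keep, ])
    r <- abs(rstudent(fit))
    out <- is.finite(r) & r > limit
    # Stop when nothing exceeds the limit or too few observations would remain
    if (!any(out) || sum(keep) - sum(out) < 3) break
    keep[which(keep)[out]] <- FALSE
  }
  list(fit = fit, used = keep)
}

limitIterate <- 4.5
resA <- fitExcludingOutliers(y ~ x1, d, limitIterate)  # round A: model on x1
resB <- fitExcludingOutliers(y ~ x2, d, limitIterate)  # round B: model on x2

needA <- is.na(d$y) & !is.na(d$x1)  # imputable in round A
needB <- is.na(d$y) & is.na(d$x1)   # x1 missing, so fall back to round B

d$yImputed <- d$y
d$yImputed[needA] <- predict(resA$fit, newdata = d[needA, ])
d$yImputed[needB] <- predict(resB$fit, newdata = d[needB, ])

print(d[1:6, ])
```

The sketch leaves out the division into three groups, the weighted method variants, and the standard error calculations. It only shows how x2 serves as a fallback where x1 is missing.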
Output of the alternative variants of the function is constructed similarly to the variants of ImputeRegression.
Output of ImputeRegression2 and ImputeRegression2NewNames (using the names after "or" below) is a list of three data sets. micro has as many rows as input, aggregates has one row for each strata and total has a single row. Variables from the two imputations are named using "A" and "B".
The individual variables (dropping "A" and "B") are:
micro
consists of the following elements:
id |
id from input |
x1 |
The input x1 variable |
x2 |
The input x2 variable |
strata |
The input strata variable (can be NULL) |
category123 |
The three imputation groups: representative (1), correct but not representative (2), wrong (3). |
yHat \emph{or estimateYHat} |
Fitted values |
yImputed \emph{or estimate} |
Imputed y-data |
rStud |
The final studentized residuals |
dffits |
The final DFFITS statistic |
hii |
The final leverages (diagonal elements of hat matrix) |
leaveOutResid |
The final outside-model residual |
aggregates
consists of the following elements:
N |
Number of observations in each strata |
nImputed |
Number of imputed observations in each strata |
estimate |
Total estimates from imputed data |
cv |
Coefficient of variation = seEstimate/estimate |
estimateYhat |
Total estimate based on model fits |
estimateOrig \emph{or y} |
Estimate based on original data with missing set to zero |
coef |
The final first model coefficient |
coefB |
The final second model coefficient or zeros when only one coefficient in model. |
n |
The final number of observations in model. |
sigmaHat |
The final square root of the estimated variance parameter |
seEstimate |
The final standard error estimate of the total estimate from imputed data |
seRobust |
Robust variant of seEstimate (experimental) |
total
consists of the following elements:
Ntotal \emph{or N} |
Number of observations |
nImputedTotal \emph{or nImputed} |
Total number of imputed observations |
estimateTotal \emph{or estimate} |
Total estimate for all strata |
cvTotal \emph{or cv} |
Total cv for all strata |
rateData <- KostraData("rateData") # Real Kostra data set
w <- rateData$data[, c(17,19,3,16,5)] # Data with id, strata, x1, x2 and y
w <- w[is.finite(w[,"Ny.kostragruppe"]), ] # Remove Longyearbyen
ImputeRegression2(w, strataName = names(w)[2]) # Works without combining strata
w[w[, "Ny.kostragruppe"] > 13, "Ny.kostragruppe"] <- 13 # Combine small strata
ImputeRegression2(w, strataName = names(w)[2]) # Ordinary regressions
ImputeRegression2(w, strataName = names(w)[2], x1Name = names(w)[4], method1 = "ratio") # x1=x2 and no imputation in round 2
ImputeRegression2(w, strataName = names(w)[2], method1 = "ratio2", method2 = "ratio") # ratio2 needed since x1=0
ImputeRegression2(w, strataName = names(w)[2], method1 = "ratioconstant2", method2 = "ratioconstant")
ImputeRegression2Tall(w, strataName = names(w)[2])
ImputeRegression2TallSmall(w, strataName = names(w)[2])
ImputeRegression2Wide(w, strataName = names(w)[2])
ImputeRegression2WideSmall(w, strataName = names(w)[2])