ImputeHistory: Imputation of a single variable (y) by the most recent...

View source: R/ImputeHistory.R

ImputeHistory R Documentation

Imputation of a single variable (y) by the most recent available historical value

Description

A single x variable is created so that each element is the most recent non-missing historical value. Missing y-values are imputed directly by this x without any model. Standard error estimates are based on the naive model where (y-x) is assumed to be pure error.
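The construction of x can be illustrated without the package. What follows is a minimal base R sketch, not the package's implementation: the helper `most_recent_value` and the toy data are hypothetical, and standard errors are left out.

```r
# Hypothetical helper (not part of the package): for each row, take the
# first non-missing value among the historical columns, which are assumed
# to be ordered with the most recent first.
most_recent_value <- function(hist) {
  hist <- as.matrix(hist)
  x <- rep(NA_real_, nrow(hist))
  for (j in seq_len(ncol(hist))) {
    fill <- is.na(x) & !is.na(hist[, j])  # rows still lacking a value
    x[fill] <- hist[fill, j]
  }
  x
}

# Toy data (made up): y is the current value, h1 the most recent history
toy <- data.frame(
  y  = c(10, NA, 7, NA),
  h1 = c(NA, 5, 6, NA),
  h2 = c(9, 4, NA, 3)
)

x <- most_recent_value(toy[, c("h1", "h2")])
yImputed <- ifelse(is.na(toy$y), x, toy$y)  # missing y replaced directly by x
```

In this toy case the second and fourth y-values are missing and are filled from h1 and h2 respectively.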

Usage

ImputeHistory(
  data,
  idName = names(data)[1],
  strataName = NULL,
  xName = names(data)[3],
  yName = names(data)[4],
  weightMethod = "ordinary",
  reverse = FALSE,
  returnSameType = TRUE,
  forceIdMatching = TRUE,
  ...
)

ImputeHistoryNewNames(...)

ImputeHistoryTall(..., iD = TalliD())

ImputeHistoryTallSmall(
  ...,
  iD = TalliD(),
  keep = c("ID", "estimate", "cv", "nImputed")
)

ImputeHistoryWide(
  ...,
  addName = WideAddName(),
  sep = WideSep(),
  idNames = c("", "strata", ""),
  addLast = FALSE
)

ImputeHistoryWideSmall(
  ...,
  keep = c("id", "strata", "estimate", "cv"),
  addName = WideAddName(),
  sep = WideSep(),
  idNames = c("", "strata", ""),
  addLast = FALSE
)

Arguments

data

Input data set of class data.frame

idName

Name of id-variable(s)

strataName

Name of strata variable. Single stratum when NULL (default)

xName

Name of variable(s) with historical y-values (most recent first). Can be set to NULL (see yName).

yName

Name of y-variable. When xName is NULL, yName is a vector of current and historical variables (most recent first).

weightMethod

The weight method for error calculations coded as a string: "ordinary" (default) or "ratio".

reverse

When TRUE, the most recent is last instead of first (see xName and yName). Default is FALSE.

returnSameType

When TRUE (default) and when the type of input y variable(s) is integer, the output type of yImputed/estimate/estimateTotal is also integer. Estimates/sums are then calculated from rounded imputed values.

forceIdMatching

When TRUE id matching in underlying GetData is forced (id as named list is not needed).

...

Used in wrappers. Can also be used to specify additional variable names that will be included in the output (micro).
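The interplay of xName, yName and reverse may be easier to see in code. The sketch below is an assumption about the ordering rules as described in the argument list, not the package's own code; `resolve_columns` is a hypothetical helper that normalises the arguments so that y is the current variable and x lists history with the most recent first.

```r
# Hypothetical helper (not part of the package)
resolve_columns <- function(xName, yName, reverse = FALSE) {
  if (is.null(xName)) {
    # yName holds current and historical variables together
    if (reverse) yName <- rev(yName)   # current variable was given last
    list(y = yName[1], x = yName[-1])
  } else {
    if (reverse) xName <- rev(xName)   # most recent was given last
    list(y = yName, x = xName)
  }
}

# Four ways of describing the same layout by column number
a  <- resolve_columns(c(5, 6, 3), 4)
b  <- resolve_columns(c(3, 6, 5), 4, reverse = TRUE)
c1 <- resolve_columns(NULL, c(4, 5, 6, 3))
d  <- resolve_columns(NULL, c(3, 6, 5, 4), reverse = TRUE)
```

All four calls resolve to column 4 as y and columns 5, 6, 3 as history, which mirrors the equivalent variants listed under Examples.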

Details

This function is related to ImputeRegression, and the structure and names of the output are similar. Note that missing values of x are allowed here. In cases where both x and y are missing, a warning will occur (zero is used in total estimates).
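As an illustration of the case where both x and y are missing, the following base R sketch uses made-up vectors and follows the category123 coding described under Value. It is an assumption about the logic, not the package's implementation, and it does not reproduce the warning.

```r
# Toy vectors (made up); x is the most recent historical value
y <- c(10, NA, 7, NA)
x <- c(9, 5, NA, NA)

yImputed <- ifelse(is.na(y), x, y)

# 1 = not imputed, 3 = imputed, 0 = missing (neither y nor x available)
category123 <- ifelse(!is.na(y), 1L, ifelse(!is.na(x), 3L, 0L))

# Units that are still missing contribute zero to the total
estimate <- sum(ifelse(is.na(yImputed), 0, yImputed))
```

Here the last unit has neither a current nor a historical value, so it falls in category 0 and adds nothing to the total.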

Value

Output

micro consists of the following elements:

id

id from input

x

The x variable created from input according to xName

y

The input y variable

strata

The input strata variable (can be NULL)

category123

Imputation groups: Not imputed (1), Imputed (3) and missing (0). Group 2 never occurs with this function.

yHat or estimateYHat

Fitted values

yImputed or estimate

Imputed y-data

rStud

The final studentized residuals

leaveOutResid

The final outside-model residual

varImputed

Name of origin variable

aggregates consists of the following elements:

N

Number of observations in each stratum

nImputed

Number of imputed observations in each stratum

estimate

Total estimates from imputed data

cv

Coefficient of variation = seEstimate/estimate

estimateYhat

Total estimate based on model fits

estimateOrig or y

Estimate based on original data with missing set to zero

n

The final number of observations in model.

sigmaHat

The final square root of the estimated variance parameter

seEstimate

The final standard error estimate of the total estimate from imputed data

total consists of the following elements:

Ntotal or N

Number of observations

nImputedTotal or nImputed

Total number of imputed observations

estimateTotal or estimate

Total estimate for all strata

cvTotal or cv

Total cv for all strata

Author(s)

Øyvind Langsrud

Examples


rateData <- KostraData("rateData")             # Real Kostra data set
w <- rateData$data[, c(17,19,16,5)]        # Data with id, strata, x and y

w <- w[is.finite(w[,"Ny.kostragruppe"]), ]       # Remove Longyearbyen
w[w[,"Ny.kostragruppe"]>13,"Ny.kostragruppe"]=13 # Combine small strata

# Create historical data by modifying the "original x-variable"
w2 <- cbind(w,
            x1 = 1.2 * w[, 3] * rep(c(NA, NA, 1, 1), 107),
            x2 = 1.1 * w[, 3] * rep(c(NA, 1), 214))

# Example with three historical variables - the last is complete
ImputeHistory(w2, strataName = names(w2)[2], xName = names(w2)[c(5, 6, 3)])
# Incomplete x and a warning is produced
ImputeHistory(w2, strataName = names(w2)[2], xName = names(w2)[c(5, 6)])
ImputeHistoryTall(w2, strataName = names(w2)[2], xName=names(w2)[c(5,6,3)])
ImputeHistoryTallSmall(w2, strataName = names(w2)[2], xName=names(w2)[c(5,6,3)])
ImputeHistoryWide(w2, strataName = names(w2)[2], xName=names(w2)[c(5,6,3)])
ImputeHistoryWideSmall(w2, strataName = names(w2)[2], xName=names(w2)[c(5,6,3)])

# Column numbers can be used instead of names.
# Four equivalent variants using reverse and xName=NULL
ImputeHistory(w2, strataName = 2, xName=c(5,6,3), yName=4)
ImputeHistory(w2, strataName = 2, xName=c(3,6,5), yName=4, reverse=TRUE)
ImputeHistory(w2, strataName = 2, xName=NULL, yName = c(4,5,6,3))
ImputeHistory(w2, strataName = 2, xName=NULL, yName = c(3,6,5,4), reverse=TRUE)


statisticsnorway/Kostra documentation built on Nov. 2, 2024, 6:40 p.m.