impute: Impute missing data


View source: R/coin_impute.R

Description

Imputes missing data in a data set using one of several methods (see imtype). Imputation can also be done by group, using grouping variables, i.e. columns of IndData whose names are prefixed with "Group_".

Usage

impute(
  COIN,
  imtype = NULL,
  dset = NULL,
  groupvar = NULL,
  EMaglev = NULL,
  out2 = "COIN"
)

Arguments

COIN

A COIN or a data frame

imtype

The type of imputation method. Either:

  • "agg_mean" (the mean of normalised indicators inside the aggregation group),

  • "agg_median" (the median of normalised indicators inside the aggregation group),

  • "ind_mean" (the mean of all the other units in the indicator),

  • "ind_median" (the median of all the other units in the indicator),

  • "indgroup_mean" (the mean of all the other units in the indicator, in the same group),

  • "indgroup_median" (the median of all the other units in the indicator, in the same group),

  • "EM" (expectation maximisation algorithm via the Amelia package, currently without bootstrapping),

  • "none" (no imputation, returns original data set)
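The difference between the indicator-wise and aggregation-group options can be sketched in base R. This is a minimal illustration on toy data, not COINr's implementation: the data frame `toy` and the helpers `impute_by_indicator()` and `impute_by_row()` are hypothetical names, the indicators are assumed to be already normalised, and all three columns are assumed to belong to one aggregation group.

```r
# Toy data: three units (rows), three indicators already normalised to [0, 1].
# All names here are hypothetical and not part of COINr.
toy <- data.frame(
  Ind1 = c(0.2, NA, 0.8),
  Ind2 = c(0.4, 0.6, NA),
  Ind3 = c(0.6, 0.9, 0.4)
)

# Indicator-wise (the idea behind "ind_mean" / "ind_median"):
# replace each NA with a statistic of the observed values in the same column.
impute_by_indicator <- function(df, f = mean) {
  df[] <- lapply(df, function(x) {
    x[is.na(x)] <- f(x, na.rm = TRUE)
    x
  })
  df
}

# Unit-wise (the idea behind "agg_mean" / "agg_median"):
# replace each NA with a statistic of the same unit's observed indicators,
# treating all columns as a single aggregation group (an assumption).
impute_by_row <- function(df, f = mean) {
  m <- as.matrix(df)
  row_stat <- apply(m, 1, f, na.rm = TRUE)
  idx <- which(is.na(m), arr.ind = TRUE)
  m[idx] <- row_stat[idx[, "row"]]
  as.data.frame(m)
}

by_ind <- impute_by_indicator(toy)          # column means fill the gaps
by_row <- impute_by_row(toy)                # row means fill the gaps
by_ind_median <- impute_by_indicator(toy, f = median)
```

The indicator-wise version borrows information from other units, while the unit-wise version borrows from the unit's other indicators, which is why the latter only makes sense on a common (normalised) scale.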

dset

The name of the data set in .$Data to impute, e.g. "Raw".

groupvar

The name of the column to use for by-group imputation. Only applies when imtype is "indgroup_mean" or "indgroup_median".
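As a rough sketch of what by-group imputation means, the base R snippet below fills each missing value with the mean of the other units in the same group. It is an illustration under stated assumptions rather than COINr's own code: the toy data frame, the column `Group_Region`, and the helper `impute_by_group()` are all hypothetical.

```r
# Toy data: four units in two groups; unit B has a missing value.
# Column and function names are hypothetical, not part of COINr.
toy <- data.frame(
  UnitCode = c("A", "B", "C", "D"),
  Group_Region = c("East", "East", "West", "West"),
  Ind1 = c(10, NA, 30, 50),
  stringsAsFactors = FALSE
)

# Replace each NA in x with a statistic (mean by default) of the observed
# values that share the same group label in g.
impute_by_group <- function(x, g, f = mean) {
  group_stat <- ave(x, g, FUN = function(v) f(v, na.rm = TRUE))
  ifelse(is.na(x), group_stat, x)
}

toy$Ind1 <- impute_by_group(toy$Ind1, toy$Group_Region)
```

In this sketch, a group with no observed values at all would still be left without a usable value (the mean of an empty set is NaN), so such cases would need a fallback.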

EMaglev

The aggregation level to use if imtype = "EM".

out2

Where to output the imputed data. If "COIN" (the default for COIN input), the result is stored as a new data set in .$Data$Imputed. If "df", the imputed data frame is returned directly.

Details

See online documentation for further details and examples.

Value

If out2 = "COIN" (the default for COIN input), the COIN with a new data set .$Data$Imputed. If out2 = "df", the imputed data frame.

Examples

# Assemble the COIN
ASEM <- assemble(IndData = ASEMIndData, IndMeta = ASEMIndMeta, AggMeta = ASEMAggMeta)
# Check how many missing data points are in the raw data set
sum(is.na(ASEM$Data$Raw))
# Impute data using the Asia/Europe group mean
DataImputed <- impute(ASEM, dset = "Raw", imtype = "indgroup_mean",
                      groupvar = "Group_EurAsia", out2 = "df")
# See how many missing data points remain in the imputed data
sum(is.na(DataImputed))
# Check that no missing data remain
stopifnot(sum(is.na(DataImputed)) == 0)

COINr documentation built on Nov. 30, 2021, 9:06 a.m.