impute: Impute using missForest


Description

This function lets the user specify which variables should be treated as factors and which variables should be left out of the imputation. It builds a data matrix from the remaining variables and imputes it using missForest. After imputation, it adds the drop columns back and relabels the factor variables.
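
The steps described above can be sketched in a few lines of R. This is only an illustration of the general idea, not the package's actual implementation: the toy data frame, its column names, and the intermediate objects (dropped, xmis, out) are hypothetical, and the missForest call is skipped if that package is not installed.

```r
# Hypothetical toy data; column names are illustrative only.
set.seed(1)
df <- data.frame(
  id    = sprintf("s%02d", 1:20),
  group = rep(c(1, 2), times = 10),
  x     = rnorm(20),
  y     = rnorm(20),
  stringsAsFactors = FALSE
)
df$x[c(3, 8)] <- NA
df$group[5]   <- NA

factors <- "group"  # numeric column to be treated as a factor
drop    <- "id"     # column to keep out of the imputation

# 1. Set the drop columns aside so they are not used in the imputation.
dropped <- df[, drop, drop = FALSE]
xmis    <- df[, setdiff(names(df), drop), drop = FALSE]

# 2. Convert the requested columns to factors.
xmis[factors] <- lapply(xmis[factors], factor)

# 3. Impute with missForest if it is installed.
if (requireNamespace("missForest", quietly = TRUE)) {
  fit  <- missForest::missForest(xmis, ntree = 500)
  ximp <- fit$ximp
} else {
  ximp <- xmis  # placeholder so the remaining steps still run
}

# 4. Add the drop columns back and restore the original column order.
out <- cbind(dropped, ximp)[, names(df)]
```

The drop columns pass through unchanged, so any NAs they contain remain in the result.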

Usage

impute(df, df_true = NULL, factors = NULL, drop = NULL, ntree = 500, ...)
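
A call following the signature above might look like the sketch below. The data frame and its column names are made up for illustration, and the example assumes that impute is exported by the impyoot package; the call is skipped if the package is not installed.

```r
# Hypothetical toy data; column names are illustrative only.
set.seed(42)
df <- data.frame(
  id    = sprintf("s%02d", 1:30),
  group = rep(c(1, 2, 3), times = 10),
  x     = rnorm(30),
  y     = rnorm(30),
  stringsAsFactors = FALSE
)
df$x[c(2, 11)] <- NA
df$y[7]        <- NA

if (requireNamespace("impyoot", quietly = TRUE)) {
  # Treat "group" as a factor and keep "id" out of the imputation.
  res <- impyoot::impute(df, factors = "group", drop = "id", ntree = 100)
  head(res$ximp)  # imputed data frame
  res$OOBerror    # estimated out-of-bag imputation error
}
```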

Arguments

df

the dataframe whose NAs we would like to impute.

df_true

optional complete version (no missing values) of the dataframe provided as the df argument.

drop

a character vector of variables that will not be included in the imputation. These variables are added back to the output dataframe, even though they are not included in the imputation.

ntree

number of trees to grow in each forest. Default is set to 500.

...

Other arguments to be passed to missForest imputation, besides ntree

factors

a character vector of variables that we would like to specify as factors when imputing. These factor variables can be numeric or character variables in the df provided.

Value

A list similar to the output of missForest, with the following elements:

ximp

a dataframe with the missing values imputed. The resulting dataframe will be the same size as the original df provided, with all of the NAs imputed. However, if there are NAs in the drop columns, these values will not be imputed.

OOBerror

estimated OOB (out-of-bag) imputation error. For the set of continuous variables in 'xmis' the NRMSE and for the set of categorical variables the proportion of falsely classified entries is returned. See the Details section of the missForest documentation for the exact definition of these error measures. If 'variablewise' is set to 'TRUE' then this will be a vector of length 'p' where 'p' is the number of variables and the entries will be the OOB error for each variable separately.

error

true imputation error. This is only available if 'xtrue' was supplied to missForest (presumably via the df_true argument here). The error measures are the same as for 'OOBerror'.
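
To make the two error measures concrete, here is a small base R sketch. It assumes the definitions given in the missForest documentation: the NRMSE is the root of the mean squared error over the missing continuous entries divided by the variance of the true values at those entries, and the proportion of falsely classified entries (PFC) is taken over the missing categorical entries. All vectors below are made-up toy values, and the object names are hypothetical.

```r
# Toy continuous variable: true values, values with NAs, imputed values.
x_true <- c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
x_mis  <- c(1.0, NA,  3.0, NA,  5.0, NA)
x_imp  <- c(1.0, 2.5, 3.0, 3.5, 5.0, 6.5)

# NRMSE over the entries that were missing.
mis_x <- is.na(x_mis)
nrmse <- sqrt(mean((x_imp[mis_x] - x_true[mis_x])^2) / var(x_true[mis_x]))

# Toy categorical variable: true labels, labels with NAs, imputed labels.
f_true <- c("a", "b", "a", "b", "a", "b")
f_mis  <- c("a", NA,  "a", NA,  NA,  "b")
f_imp  <- c("a", "b", "a", "a", "a", "b")

# PFC: share of the missing entries that were imputed with the wrong label.
mis_f <- is.na(f_mis)
pfc   <- mean(f_imp[mis_f] != f_true[mis_f])

nrmse  # 0.25
pfc    # 1/3
```

Note that 'OOBerror' is estimated from out-of-bag predictions and needs no true values, whereas 'error' requires the complete data; the sketch shows only the formulas applied when the true values are known.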


wfmueller29/impyoot documentation built on Dec. 23, 2021, 5:12 p.m.