View source: R/autotune_Amelia.R
autotune_Amelia | R Documentation |
The function uses EMB (Expectation-Maximization with Bootstrapping) to impute missing data. Its performance depends heavily on the data structure and the chosen parameters.
autotune_Amelia(
  df,
  col_type = NULL,
  percent_of_missing = NULL,
  col_0_1 = FALSE,
  parallel = TRUE,
  polytime = NULL,
  splinetime = NULL,
  intercs = FALSE,
  empir = NULL,
  verbose = FALSE,
  return_one = TRUE,
  m = 3,
  out_file = NULL
)
df |
data.frame. Data frame to impute, with column names and without the target column. |
col_type |
character vector. Vector containing column type names. |
percent_of_missing |
numeric vector. Vector containing the percentage of missing data in each column, for example c(0,1,0,0,11.3,..) |
col_0_1 |
Decides whether to add bonus columns indicating where imputation was done: 0 - value was in the dataset, 1 - value was imputed. Default FALSE. (Works only when one dataset is returned.) |
parallel |
If TRUE, parallel calculation is used. |
polytime |
Parameter passed to the amelia function. |
splinetime |
Parameter passed to the amelia function. |
intercs |
Parameter passed to the amelia function. |
empir |
Parameter passed to the amelia function; empir in Amelia == empir*nrow(df). If empir is not set, empir=nrow(df)*0.015. |
verbose |
If TRUE, the function prints to the console. |
return_one |
Decides whether one dataset or an amelia object is returned. |
m |
Number of datasets generated by amelia. If return_one=TRUE, the first dataset is returned. |
out_file |
Output log file location. If the file already exists, log messages are appended. If NULL, no log is produced. |
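The scaling described for empir can be sketched in base R. The helper name resolve_empir is hypothetical and not part of the package; it only mirrors the arithmetic stated in the argument description (a fraction multiplied by the number of rows, with 0.015 assumed as the fallback fraction), and the package's internal code may differ.

```r
# Hypothetical helper: mirrors the documented arithmetic only.
resolve_empir <- function(df, empir = NULL) {
  # Fall back to the documented default fraction when empir is not set.
  fraction <- if (is.null(empir)) 0.015 else empir
  fraction * nrow(df)
}

# Small made-up data frame with four rows.
df <- data.frame(x = c(1, NA, 3, 4), y = c("a", "b", NA, "d"))

default_value <- resolve_empir(df)       # 0.015 * number of rows
custom_value <- resolve_empir(df, 0.5)   # user-supplied fraction * number of rows
```

Because the value grows with nrow(df), the same fraction gives a larger value on larger data sets.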
Returns one data.frame with imputed values or an amelia object.
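To illustrate what col_0_1 = TRUE describes for a returned data.frame, the sketch below builds 0/1 indicator columns from the original data in base R. It does not call autotune_Amelia: the mean fill is only a stand-in for the imputation, and the "_imputed" suffix is an assumed naming scheme, so the real function's output columns may be named differently.

```r
original <- data.frame(d = c(1.5, NA, 3.0, NA), b = c(1, 2, NA, 4))

# Stand-in for an imputed data set (assumption: simple mean fill, not EMB).
imputed <- original
for (col in names(imputed)) {
  missing_rows <- is.na(imputed[[col]])
  imputed[[col]][missing_rows] <- mean(imputed[[col]], na.rm = TRUE)
}

# 1 where the value was imputed, 0 where it was already in the data set.
flags <- as.data.frame(lapply(original, function(x) as.integer(is.na(x))))
names(flags) <- paste0(names(original), "_imputed")  # hypothetical naming

result <- cbind(imputed, flags)
```

The indicators have to be computed from the data before imputation, since the imputed data set no longer contains any NA values.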
James Honaker, Gary King, Matthew Blackwell (2011).
James Honaker, Gary King, Matthew Blackwell (2011). Amelia II: A Program for Missing Data. Journal of Statistical Software, 45(7), 1-47. URL https://www.jstatsoft.org/v45/i07/.
{
  raw_data <- data.frame(
    a = as.factor(sample(c("red", "yellow", "blue", NA), 1000, replace = TRUE)),
    b = as.integer(1:1000),
    c = as.factor(sample(c("YES", "NO", NA), 1000, replace = TRUE)),
    d = runif(1000, 1, 10),
    e = as.factor(sample(c("YES", "NO"), 1000, replace = TRUE)),
    f = as.factor(sample(c("male", "female", "trans", "other", NA), 1000, replace = TRUE))
  )

  # Preparing col_type
  col_type <- c("factor", "integer", "factor", "numeric", "factor", "factor")

  # Percentage of missing values in each column
  percent_of_missing <- 1:6
  for (i in percent_of_missing) {
    percent_of_missing[i] <- 100 * (sum(is.na(raw_data[, i])) / nrow(raw_data))
  }

  imp_data <- autotune_Amelia(raw_data, col_type, percent_of_missing, parallel = FALSE)

  # Check that all missing values were imputed
  sum(is.na(imp_data)) == 0
  # TRUE
}
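The col_type and percent_of_missing vectors built by hand in the example can also be derived from the data itself. The following is a minimal base R sketch, assuming each column has a single relevant class; the small raw_data frame here is made up for illustration and the result is not passed to autotune_Amelia.

```r
raw_data <- data.frame(
  a = factor(c("red", NA, "blue", "red")),
  b = 1:4,
  d = c(1.5, 2.5, NA, NA)
)

# First class of each column, e.g. "factor", "integer", "numeric".
col_type <- unname(vapply(raw_data, function(x) class(x)[1], character(1)))

# Percentage of missing values per column: the mean of a logical NA mask times 100.
percent_of_missing <- unname(100 * colMeans(is.na(raw_data)))
```

Deriving both vectors this way keeps them in step with the column order of the data frame.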