Our package explores the pattern of missing values in a user's dataset, imputes the missing values using three simple methods, and compares the results of the different methods.
We found that the Amelia and vis_dat packages are similar, but they only visualize the missing data. We think this package is a better fit for users who do not have much experience in data wrangling.
Introduction
Explore the pattern of missing values in a dataset.
Function
vis_missing(df, colour="default", missing_val_char = NA)
Parameters:
Return:
dfm <- data.frame(x = c(1, 2, 3), y = c(0, 10, NaN))
vis_missing(dfm, "", NaN)
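To make "the pattern of missing values" concrete, the base R sketch below builds a logical matrix that marks each missing cell. It is not the package's implementation: `missing_pattern()` is a hypothetical helper written for illustration, and treating `missing_val_char` as either NA/NaN or a literal marker value is an assumption.

```r
# Hypothetical helper (not part of the package): TRUE marks a missing cell.
missing_pattern <- function(df, missing_val_char = NA) {
  if (is.na(missing_val_char)) {
    # is.na() is TRUE for both NA and NaN, so either marker is handled here
    pattern <- is.na(df)
  } else {
    # Any other marker (for example "?" or -999) is matched by value
    pattern <- !is.na(df) & df == missing_val_char
  }
  pattern
}

dfm <- data.frame(x = c(1, 2, 3), y = c(0, 10, NaN))
pattern <- missing_pattern(dfm, NaN)

# Number of missing values per column: 0 in x, 1 in y
colSums(pattern)
```

A plot such as the one vis_missing() produces can be thought of as a picture of this matrix, with one cell per row and column of the dataset.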
Introduction:
Impute the missing values in a specified column using one of three simple methods: complete case, mean imputation, or median imputation.
impute_missing(dfm, col, method, missing_val_char)
Parameters:
dfm: data frame
the original dataset with missing values
col: string
a column name
method: string
a method name, expected to be one of "CC" (complete case), "MIP" (mean imputation), or "DIP" (median imputation)
missing_val_char:
Return:
dfm <- data.frame(x = c(1, 2, 3), y = c(0, 10, NaN))
impute_missing(dfm, "y", "MIP", NaN)
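The three methods can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's code: `impute_sketch()` is a hypothetical name, and the mapping of "CC", "MIP", and "DIP" to complete case, mean, and median imputation is inferred from the order in which the methods are listed above.

```r
# Hypothetical sketch (not the package's impute_missing()).
impute_sketch <- function(df, col, method) {
  values <- df[[col]]
  missing <- is.na(values)  # TRUE for both NA and NaN

  if (method == "CC") {
    # Complete case: drop every row where the column is missing
    return(df[!missing, , drop = FALSE])
  }

  # Assumed mapping: MIP = mean imputation, DIP = median imputation
  fill <- switch(method,
    MIP = mean(values, na.rm = TRUE),
    DIP = median(values, na.rm = TRUE),
    stop("method must be one of \"CC\", \"MIP\", \"DIP\"")
  )
  df[[col]][missing] <- fill
  df
}

dfm <- data.frame(x = c(1, 2, 3), y = c(0, 10, NaN))
impute_sketch(dfm, "y", "MIP")$y  # 0 10 5
nrow(impute_sketch(dfm, "y", "CC"))  # 2
```

Complete case changes the number of rows, while mean and median imputation keep every row and fill the gap with a single summary value.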
Introduction:
Compare the results of different methods.
* This function calls impute_missing() once for each method and
returns a table of summary statistics for the specified feature
before and after imputation with each method.
compare_model()
Parameters:
df (data frame): -- the original dataset with missing values
Feature (str): -- name of a specified feature from the original dataset
containing missing values that need to be imputed.
methods (str or list): -- the methods that users want to compare (default: ["CC","IMP"])
Return:
dfm <- data.frame(x = c(1, 2, 3), y = c(0, 10, NaN))
compare_model(dfm, "y", "MIP", "NaN")
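As a rough picture of what such a comparison table might contain, the base R sketch below summarises one feature before imputation and after each method. Everything in it is an assumption made for illustration: `impute_values()` and `compare_sketch()` are hypothetical names, the choice of statistics (count, mean, median, standard deviation) is ours, and the "before" column is computed on the observed values only.

```r
# Hypothetical helper: impute a single numeric vector.
# Assumed mapping: CC = complete case, MIP = mean, DIP = median.
impute_values <- function(x, method) {
  switch(method,
    CC = x[!is.na(x)],
    MIP = replace(x, is.na(x), mean(x, na.rm = TRUE)),
    DIP = replace(x, is.na(x), median(x, na.rm = TRUE)),
    stop("unknown method: ", method)
  )
}

# Hypothetical sketch (not the package's compare_model()).
compare_sketch <- function(df, feature, methods = c("CC", "MIP")) {
  x <- df[[feature]]
  # "original" holds the observed values before any imputation
  imputed <- setNames(lapply(methods, function(m) impute_values(x, m)), methods)
  columns <- c(list(original = x[!is.na(x)]), imputed)
  # One column per version of the feature, one row per statistic
  sapply(columns, function(v) {
    c(n = length(v), mean = mean(v), median = median(v), sd = sd(v))
  })
}

dfm <- data.frame(x = c(1, 2, 3), y = c(0, 10, NaN))
result <- compare_sketch(dfm, "y", c("CC", "MIP", "DIP"))
result
```

In this small example the mean stays at 5 under every method, while the standard deviation shrinks after mean or median imputation because the filled-in value sits at the centre of the observed data.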