View source: R/plot_replace_missing.R
plot_replace_missing | R Documentation |
Plots the count of missing values for each variable of a data frame and optionally replaces those values.
A ggplot2 bar plot of numeric missing value counts is produced along with the option to replace the values. TODO: Be able to count/replace non-numeric missing values of a data frame.
If the argument 'replace_fun' is NULL, then only a bar chart showing the missing value count for each variable is returned.
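As a rough illustration of the counts the bar chart displays, the sketch below tallies missing values per column in base R. It is not the package's implementation: `count_missing` is a hypothetical helper, and it assumes that NA (including NaN) plus any value listed in `miss_values` counts as missing.

```r
# Hypothetical helper, not part of the package: count missing values per column.
# Assumes NA/NaN and anything listed in 'miss_values' are treated as missing.
count_missing <- function(df, variables = names(df), miss_values = NULL) {
  vapply(
    df[variables],
    function(x) sum(is.na(x) | x %in% miss_values),
    integer(1)
  )
}

# Toy data: 99 stands in for a custom missing-value marker.
toy <- data.frame(
  a = c(1, NA, 99, 4),
  b = c(NaN, 2, 3, NA),
  c = c(5, 6, 7, 8)
)

# Named integer vector of counts: a = 2, b = 2, c = 0
count_missing(toy, miss_values = 99)
```

A named vector like this is the kind of per-variable summary a bar chart of missing counts would be drawn from.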
plot_replace_missing(
  df,
  variables = NULL,
  replace_fun = NULL,
  miss_values = NULL,
  title = NULL,
  subtitle = NULL,
  center_titles = FALSE,
  x_title = "Variables",
  y_title = "Missing Counts",
  rot_x_tic_angle = 0,
  bar_fill = "gray",
  bar_color = "black",
  bar_alpha = 1,
  bar_lwd = 0.7,
  bar_width = NULL,
  y_limits = NULL,
  y_major_breaks = waiver(),
  do_coord_flip = FALSE,
  order_bars = NULL,
  bar_labels = FALSE,
  bar_label_sz = 6,
  bar_label_color = "black"
)
df |
The source data frame with numeric and character variables. |
variables |
A character vector of numeric variable names from 'df' to be included in the plot and possible value replacement. |
replace_fun |
A character string or function that sets the aggregate function for replacing missing values. Acceptable string values are "mean", "median", "locf" (last observation carried forward), and "nocb" (next observation carried backward). The parameter can also be a user-defined function that accepts a vector of non-missing values for a column (as determined by 'miss_values') and returns a single replacement value. See an example below. |
miss_values |
A vector of numeric and character values that, in addition to NA, are treated as missing values. |
title |
A string that sets the plot title. |
subtitle |
A string that sets the plot subtitle. |
center_titles |
A logical which, if TRUE, centers the plot title and subtitle. |
x_title |
A string that sets the x axis title. The default is "Variables". If NULL then the x axis title does not appear. |
y_title |
A string that sets the y axis title. The default is "Missing Counts". If NULL then the y axis title does not appear. |
rot_x_tic_angle |
A numeric that sets the angle of rotation for the x tic labels. When x tic labels are long, rotating them helps keep them from overlapping. |
bar_fill |
A string that sets the fill color for the bars. |
bar_color |
A string that sets the outline color for the bars. |
bar_alpha |
A numeric that sets the alpha component to 'bar_color'. |
bar_lwd |
A numeric that sets the outline thickness of the bars. |
bar_width |
A numeric that sets the width of the bars. |
y_limits |
A numeric 2 element vector that sets the minimum and maximum for the y axis. Use NA to refer to the existing minimum and maximum. |
y_major_breaks |
A numeric vector or function that defines the exact major tic locations along the y axis. |
do_coord_flip |
A logical which, if TRUE, flips the x and y axes. |
order_bars |
A string that orders the bars in a specific direction. Acceptable values are "asc" or "desc". |
bar_labels |
A logical which, if TRUE, labels each bar with its value. |
bar_label_sz |
A numeric that sets the size of the bar labels. |
bar_label_color |
A string that sets the color of the bar labels. |
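The replacement options named for 'replace_fun' can be sketched in base R. This is only an illustration under stated assumptions, not the package's code: `replace_with`, `locf`, and `midrange` are hypothetical names, missing means NA/NaN or a value in `miss_values`, and how the package treats leading missing values under "locf" is not documented here, so the sketch simply leaves them unchanged.

```r
# Hypothetical helpers, not the package's code. Both assume that NA/NaN and
# any value listed in 'miss_values' count as missing.

# Replace missing entries with one value computed from the non-missing ones.
replace_with <- function(x, fun, miss_values = NULL) {
  is_missing <- is.na(x) | x %in% miss_values
  x[is_missing] <- fun(x[!is_missing])
  x
}

# Last observation carried forward: each missing entry takes the most recent
# non-missing value. Leading missing entries are left unchanged (an assumption).
locf <- function(x, miss_values = NULL) {
  is_missing <- is.na(x) | x %in% miss_values
  last_seen <- cummax(ifelse(is_missing, 0L, seq_along(x)))
  fill <- last_seen > 0
  x[fill] <- x[last_seen[fill]]
  x
}

# A user-defined function in the shape 'replace_fun' describes: it receives
# the non-missing values and returns a single replacement value.
midrange <- function(v) (min(v) + max(v)) / 2

x <- c(2, NA, 99, 4, 12, NaN)
replace_with(x, mean, miss_values = 99)      # 2 6 6 4 12 6
replace_with(x, midrange, miss_values = 99)  # 2 7 7 4 12 7
locf(x, miss_values = 99)                    # 2 2 2 4 12 12
```

Applying the same carry-forward logic to the reversed vector, `rev(locf(rev(x), miss_values = 99))`, gives next observation carried backward.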
Returns a named list with:
"missing_plot" – a ggplot2 plot object where additional aesthetics may be added.
"replacement_df" – a data.table copy of 'df' with missing values replaced, if 'replace_fun' is not NULL.
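Because "missing_plot" is a ggplot2 object, it can be extended with the usual `+` syntax. The sketch below uses a stand-in list built by hand rather than a real call to the function; the `counts` data and the `result` list are assumptions made only so the snippet runs on its own.

```r
library(ggplot2)

# Stand-in for the returned list (an assumption for illustration); a real
# object would come from plot_replace_missing().
counts <- data.frame(variable = c("date", "leaves"), missing = c(2L, 5L))
result <- list(
  missing_plot = ggplot(counts, aes(x = variable, y = missing)) + geom_col()
)

# Add a caption and a different theme to the existing plot object.
styled <- result$missing_plot +
  labs(caption = "Stand-in data for illustration") +
  theme_minimal()
```

Printing `styled` draws the bar chart with the added caption and theme.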
library(ggplot2)
library(data.table)
library(mlbench)
library(RplotterPkg)

data("Soybean", package = "mlbench")

# Convert every column after the first to numeric
for(i in 2:ncol(Soybean)){
  Soybean[,i] <- as.numeric(Soybean[,i])
}
columns_of_interest <- colnames(Soybean)[2:ncol(Soybean)]

# Insert a mix of missing-value markers into 'date' and 'leaves'
Soybean$date[[3]] <- NA
Soybean$date[[4]] <- 99
Soybean$leaves[[4]] <- NA
Soybean$leaves[[5]] <- "N/A"
Soybean$leaves[[6]] <- "na"
Soybean$leaves[[7]] <- NA
Soybean$leaves[[8]] <- NaN

# User-defined replacement function: half the range of the values it receives
missing_val_fun <- function(x){
  xx <- as.numeric(x)
  return((max(xx) - min(xx))/2)
}

soybean_missing_lst <- RregressPkg::plot_replace_missing(
  df = Soybean,
  variables = columns_of_interest,
  replace_fun = missing_val_fun,
  miss_values = c("N/A", "na", 99),
  title = "Count of Missing Values",
  subtitle = "mlbench::Soybean data set",
  x_title = "Variable",
  y_title = "Count of Missing Values",
  bar_lwd = 0.6,
  bar_color = "white",
  bar_labels = TRUE,
  bar_label_sz = 3,
  do_coord_flip = TRUE,
  order_bars = "asc"
)