View source: R/e_data_complete_by_variable_subset.R
e_data_complete_by_variable_subset | R Documentation |
For missing data, determine which sets of variables result in the largest number of complete observations
e_data_complete_by_variable_subset(dat, var_list = NULL, var_resp = NULL)
dat | data, as a data.frame or tibble |
var_list | list of variables |
var_resp | |
out: a tibble with n_complete, n_var, var_names_print, and a list of variable names in var_names
# Generate missing values
dat_mtcars_miss_e <- dat_mtcars_e
prop_missing <- 0.10
# Sample the linear cell positions to set to NA (a proportion prop_missing of all cells)
n_missing <-
  sample.int(
    n    = prod(dim(dat_mtcars_miss_e))
  , size = round(prop_missing * prod(dim(dat_mtcars_miss_e)))
  )
# Map each sampled position to its (row, column) index
ind_missing <- expand.grid(1:dim(dat_mtcars_miss_e)[1], 1:dim(dat_mtcars_miss_e)[2])[n_missing, ]
for (i_row in seq_along(n_missing)) {
  dat_mtcars_miss_e[ind_missing[i_row, 1], ind_missing[i_row, 2]] <- NA
}
# Plot missing data
dat_mtcars_miss_e |> e_plot_missing()
out <- dat_mtcars_miss_e |> e_data_complete_by_variable_subset()
# Print table
out |> print(n = Inf, width = Inf)
# Print variable names from first row
out$var_names[1] |> unlist()
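
The idea behind the table can be illustrated with a minimal base R sketch. This is not the package's implementation: `count_complete_by_subset()` and the `toy` data frame are hypothetical names introduced only for illustration, and the sketch assumes that every non-empty subset of columns is evaluated, which is feasible only for a small number of variables. For each subset it counts the rows with no missing values using `complete.cases()`.

```r
# Hypothetical helper (not part of the package): for every non-empty subset
# of the columns in `dat`, count the rows that have no missing values.
count_complete_by_subset <- function(dat) {
  vars <- names(dat)
  # All subsets of size 1..p; the number of subsets grows as 2^p - 1
  subsets <- unlist(
    lapply(seq_along(vars), function(k) combn(vars, k, simplify = FALSE)),
    recursive = FALSE
  )
  out <- data.frame(
    n_complete = vapply(
      subsets,
      function(s) sum(complete.cases(dat[, s, drop = FALSE])),
      integer(1)
    ),
    n_var = lengths(subsets),
    var_names_print = vapply(subsets, paste, character(1), collapse = ", ")
  )
  # Largest subsets first, then most complete rows first
  out[order(-out$n_var, -out$n_complete), ]
}

# Small illustrative data set with one NA per column
toy <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, 2, 3, 4),
  c = c(1, 2, 3, NA)
)
res <- count_complete_by_subset(toy)
print(res)
```

Dropping a variable can only keep or increase the number of complete rows, so the table shows the trade-off between the number of variables retained and the number of observations available for analysis.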