compare_category() computes information for examining the relationship between categorical variables.
.data : a data.frame or a
... : one or more unquoted expressions separated by commas. You can treat variable names as if they were positions. Positive values select variables; negative values drop variables. These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing.
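The idea that column names behave like positions can be sketched in base R. This is only an illustration of the select/drop semantics described above, not the package's own selection code; the data frame `toy` and the helper vector `pos` are hypothetical.

```r
# Toy data frame (hypothetical, for illustration only)
toy <- data.frame(a = 1:2, b = c("x", "y"), c = c("u", "v"))

# Map each column name to its position, so a name can stand in for an index
pos <- setNames(seq_along(toy), names(toy))

# Positive positions select variables
selected <- names(toy)[c(pos["b"], pos["c"])]

# Negative positions drop variables
kept <- names(toy)[-pos["c"]]

selected
kept
```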
Understanding the relationships between categorical variables is an important part of EDA. compare_category() compares every pairwise combination of the categorical variables and returns a list-based object of class compare_category.
A list-based object of class compare_category. Each component holds the following information for examining the relationship between one pair of categorical variables.
var1 : factor. The levels of the first variable to compare. 'var1' stands for the name of the first variable being compared.
var2 : factor. The levels of the second variable to compare. 'var2' stands for the name of the second variable being compared.
n : integer. frequency by var1 and var2.
rate : double. relative frequency.
first_rate : double. relative frequency in first variable.
second_rate : double. relative frequency in second variable.
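To make the relationship between these columns concrete, here is a minimal base R sketch that builds a table of the same shape for a small made-up data frame. The data values are invented, and reading first_rate and second_rate as shares within each level of the first and second variable is an assumption for illustration; this is not the package's internal code.

```r
# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  smoking     = factor(c("No", "No", "Yes", "Yes", "No", "Yes")),
  death_event = factor(c("No", "Yes", "No", "Yes", "No", "No"))
)

# n: joint frequency of every pair of levels
tab <- as.data.frame(table(toy$smoking, toy$death_event),
                     responseName = "n")
names(tab)[1:2] <- c("smoking", "death_event")

# rate: relative frequency over the whole table
tab$rate <- tab$n / sum(tab$n)

# first_rate (assumed meaning): share within each level of the first variable
tab$first_rate <- tab$n / ave(tab$n, tab$smoking, FUN = sum)

# second_rate (assumed meaning): share within each level of the second variable
tab$second_rate <- tab$n / ave(tab$n, tab$death_event, FUN = sum)

tab
```

Under this reading, rate sums to 1 over the whole table, while first_rate sums to 1 within each level of the first variable.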
The attributes of the compare_category class are as follows.
variables : character. List of variables selected for comparison.
combination : matrix. It consists of pairs of variables to compare.
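One way to picture these two attributes is to enumerate the unordered pairs of a set of variable names with combn(). The sketch below is an assumption about the general idea, not the package's implementation: the variable names are hypothetical, and the one-pair-per-row layout of the matrix is a guess.

```r
# Hypothetical names of the selected categorical variables
variables <- c("sex", "smoking", "death_event")

# One row per unordered pair (the row layout is an assumption)
combination <- t(combn(variables, 2))
combination

# Labels in the "a vs b" style used for component names in the examples
pair_names <- paste(combination[, 1], combination[, 2], sep = " vs ")
pair_names
```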
summary.compare_category, print.compare_category, plot.compare_category.
# Generate data for the example
library(dlookr)
heartfailure2 <- heartfailure
heartfailure2[sample(seq(NROW(heartfailure2)), 5), "smoking"] <- NA
library(dplyr)
# Compare all categorical variables
all_var <- compare_category(heartfailure2)
# Print compare_category class objects
all_var
# Keep only the comparisons that involve the death_event variable
all_var %>%
  "["(grep("death_event", names(all_var)))
# Compare the two categorical variables
two_var <- compare_category(heartfailure2, smoking, death_event)
# Print compare_category class objects
two_var
# Filter out the cases where smoking is NA
two_var %>%
  "[["(1) %>%
  filter(!is.na(smoking))
# Summarize all cases: returns an invisible copy of the object
stat <- summary(all_var)
# Inspect the returned summary object
stat
# component of table
stat$table
# component of chi-square test
stat$chisq
# component of chi-square test
summary(all_var, "chisq")
# component of chi-square test (first, third case)
summary(all_var, "chisq", pos = c(1, 3))
# component of relative frequency table
summary(all_var, "relative")
# component of table without missing values
summary(all_var, "table", na.rm = TRUE)
# component of table including marginal values
margin <- summary(all_var, "table", marginal = TRUE)
margin
# component of chi-square test
summary(two_var, method = "chisq")
# verbose is FALSE
summary(all_var, "chisq", verbose = FALSE)
# Using pipes & dplyr -------------------------
# If you want to use dplyr, set verbose to FALSE
summary(all_var, "chisq", verbose = FALSE) %>%
  filter(p.value < 0.26)
# Extract component from list by index
summary(all_var, "table", na.rm = TRUE, verbose = FALSE) %>%
  "[["(1)
# Extract component from list by name
summary(all_var, "table", na.rm = TRUE, verbose = FALSE) %>%
  "[["("smoking vs death_event")
# plot all pairs of variables
plot(all_var)
# plot a pair of variables
plot(two_var)
# plot all pairs of variables, prompting before each plot
plot(all_var, prompt = TRUE)
# plot a pair of variables with horizontal axis labels
plot(two_var, las = 1)