compare_category.data.frame: Compare categorical variables

Description Usage Arguments Details Value Attributes of return object See Also Examples

Description

The compare_category() compute information to examine the relationship between categorical variables.

Usage

1
2
3
4
compare_category(.data, ...)

## S3 method for class 'data.frame'
compare_category(.data, ...)

Arguments

.data

a data.frame or a tbl_df.

...

one or more unquoted expressions separated by commas. You can treat variable names like they are positions. Positive values select variables; negative values to drop variables. These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing.

Details

It is important to understand the relationship between categorical variables in EDA. compare_category() compares relations by pair combination of all categorical variables. and return compare_category class that based list object.

Value

An object of the class as compare based list. The information to examine the relationship between categorical variables is as follows each components.

Attributes of return object

Attributes of compare_category class is as follows.

See Also

summary.compare_category, print.compare_category, plot.compare_category.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
# Generate data for the example
heartfailure2 <- heartfailure
heartfailure2[sample(seq(NROW(heartfailure2)), 5), "smoking"] <- NA

library(dplyr)

# Compare the all categorical variables
all_var <- compare_category(heartfailure2)

# Print compare_numeric class objects
all_var

# Compare the categorical variables that case of joint the death_event variable
all_var %>% 
  "["(grep("death_event", names(all_var)))

# Compare the two categorical variables
two_var <- compare_category(heartfailure2, smoking, death_event)

# Print compare_category class objects
two_var

# Filtering the case of smoking included NA 
two_var %>%
  "[["(1) %>% 
  filter(!is.na(smoking))

# Summary the all case : Return a invisible copy of an object.
stat <- summary(all_var)

# Summary by returned objects
stat

# component of table 
stat$table

# component of chi-square test 
stat$chisq

# component of chi-square test 
summary(all_var, "chisq")

# component of chi-square test (first, third case)
summary(all_var, "chisq", pos = c(1, 3))

# component of relative frequency table 
summary(all_var, "relative")

# component of table without missing values 
summary(all_var, "table", na.rm = TRUE)

# component of table include marginal value 
margin <- summary(all_var, "table", marginal = TRUE)
margin

# component of chi-square test 
summary(two_var, method = "chisq")

# verbose is FALSE 
summary(all_var, "chisq", verbose = FALSE)

#' # Using pipes & dplyr -------------------------
# If you want to use dplyr, set verbose to FALSE
summary(all_var, "chisq", verbose = FALSE) %>% 
  filter(p.value < 0.26)

# Extract component from list by index
summary(all_var, "table", na.rm = TRUE, verbose = FALSE) %>% 
  "[["(1)

# Extract component from list by name
summary(all_var, "table", na.rm = TRUE, verbose = FALSE) %>% 
  "[["("smoking vs death_event")

# plot all pair of variables
plot(all_var)

# plot a pair of variables
plot(two_var)

# plot all pair of variables by prompt
plot(all_var, prompt = TRUE)

# plot a pair of variables
plot(two_var, las = 1)

bit2r/kodlookr documentation built on Dec. 19, 2021, 9:49 a.m.