plot_correlate.data.frame: Visualize correlation plot of numerical data


Description

plot_correlate() draws a correlation plot to help find relationships between numerical variables.

Usage

plot_correlate(.data, ...)

## S3 method for class 'data.frame'
plot_correlate(.data, ..., method = c("pearson", "kendall", "spearman"))

## S3 method for class 'grouped_df'
plot_correlate(.data, ..., method = c("pearson", "kendall", "spearman"))

Arguments

.data

a data.frame or a tbl_df.

method

a character string indicating which correlation coefficient (or covariance) is to be computed. One of "pearson" (default), "kendall", or "spearman": can be abbreviated.
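
The wording of this argument matches the method argument of base R's stats::cor(). As a minimal sketch, assuming (not confirmed here) that the coefficients are computed the same way as cor() does, the three choices and the abbreviation rule look like this. The built-in mtcars data set and the helper resolve_method() are illustrative only and are not part of the package.

```r
# Base R only; mtcars ships with R and stands in for any numeric data.
x <- mtcars$mpg
y <- mtcars$wt

# The three coefficients that `method` can name.
r_pearson  <- cor(x, y, method = "pearson")
r_kendall  <- cor(x, y, method = "kendall")
r_spearman <- cor(x, y, method = "spearman")

# Abbreviations are resolved by partial matching: "spear" means "spearman".
r_abbrev <- cor(x, y, method = "spear")

# Hypothetical helper (not from the package) showing how a default of
# c("pearson", "kendall", "spearman") resolves to its first element.
resolve_method <- function(method = c("pearson", "kendall", "spearman")) {
  match.arg(method)
}

resolve_method()       # no argument: the first choice is used
resolve_method("ken")  # partial match against the allowed choices
```

Pearson measures linear association, while Kendall and Spearman are rank-based and less sensitive to outliers and skewed distributions.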

...

one or more unquoted expressions separated by commas. You can treat variable names as if they were positions. Positive values select variables; negative values drop variables. If the first expression is negative, plot_correlate() automatically starts with all variables. These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing.

See vignette("EDA") for an introduction to these concepts.
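
The "names as positions" convention described above can be tried out without the package: base R's subset() evaluates its select argument under a similar rule. The sketch below is an analogy only and does not show how plot_correlate() is implemented; the mtcars columns are stand-ins for your own variables.

```r
# Base R analogy: subset() also treats column names like positions.
df <- head(mtcars[, c("mpg", "cyl", "disp", "hp")])

# Positive (named) values select variables.
keep_named <- subset(df, select = c(mpg, hp))

# Negative values drop variables; selection starts from all columns.
drop_named <- subset(df, select = -c(mpg, hp))

# Plain positions work the same way as names.
keep_first  <- subset(df, select = 1)
drop_by_pos <- subset(df, select = c(-1, -3))

names(keep_named)
names(drop_named)
names(keep_first)
names(drop_by_pos)
```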

Details

The purpose of the visualization is to provide correlation information. Because a plot is drawn for each variable, specifying more than one variable in the ... argument draws one plot per specified variable.
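
To make "correlation information" concrete, the following base R sketch computes the kind of numbers such a plot would display: a correlation matrix over the numeric columns, the correlations of one variable of interest with the rest, and a per-group coefficient analogous to the grouped_df method. This illustrates the idea under those assumptions; it is not the package's internal code, and mtcars is used only because it ships with R.

```r
# Keep numeric columns only, since correlation is defined for numeric data.
num <- mtcars[vapply(mtcars, is.numeric, logical(1))]

# Pairwise correlation matrix of all numeric variables.
cor_all <- cor(num, method = "pearson")

# Correlations of a single variable of interest with every other variable.
cor_mpg <- cor_all["mpg", colnames(cor_all) != "mpg"]

# Per-group coefficient: split the rows by a grouping column first.
by_cyl <- lapply(split(num, num$cyl), function(g) cor(g$mpg, g$wt))

round(cor_mpg, 2)
names(by_cyl)
```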

See Also

plot_correlate.tbl_dbi, plot_outlier.data.frame.

Examples

# Visualize correlation plot of all numerical variables
plot_correlate(heartfailure)

# Select the variables to plot
plot_correlate(heartfailure, creatinine, sodium)
plot_correlate(heartfailure, -creatinine, -sodium)
plot_correlate(heartfailure, "creatinine", "sodium")
plot_correlate(heartfailure, 1)
plot_correlate(heartfailure, creatinine, sodium, method = "spearman")

# Using dplyr::grouped_df
library(dplyr)

gdata <- group_by(heartfailure, smoking, death_event)
plot_correlate(gdata, "creatinine")
plot_correlate(gdata)

# Using pipes ---------------------------------
# Visualize correlation plot of all numerical variables
heartfailure %>%
  plot_correlate()
# Positive values select variables
heartfailure %>%
  plot_correlate(creatinine, sodium)
# Negative values drop variables
heartfailure %>%
  plot_correlate(-creatinine, -sodium)
# Positional values select variables
heartfailure %>%
  plot_correlate(1)
# Negative positional values drop variables
heartfailure %>%
  plot_correlate(-1, -3, -5, -7)

# Using pipes & dplyr -------------------------
# Visualize correlation plot of 'creatinine' variable by 'smoking'
# and 'death_event' variables.
heartfailure %>%
  group_by(smoking, death_event) %>%
  plot_correlate(creatinine)

# Keep only the rows where the 'smoking' variable is "Yes",
# then visualize the correlation plot of the 'creatinine' variable
# by the 'hblood_pressure' and 'death_event' variables.
heartfailure %>%
  filter(smoking == "Yes") %>%
  group_by(hblood_pressure, death_event) %>%
  plot_correlate(creatinine)

bit2r/kodlookr documentation built on Dec. 19, 2021, 9:49 a.m.