tidy_corrm: Reshape data for correlation matrices to a tidy format


View source: R/tidy_corrm.R

Description

tidy_corrm() is a generic function that takes a dataset and reshapes it into a long-table format that can be plotted with ggcorrm().

Usage

tidy_corrm(data, ...)

## Default S3 method:
tidy_corrm(
  data,
  labels = NULL,
  rescale = c("as_is", "by_sd", "by_range"),
  corr_method = c("pearson", "kendall", "spearman"),
  corr_group = NULL,
  mutates = NULL,
  ...
)

Arguments

data

For tidy_corrm.default, a data.frame or matrix with the raw data used for the correlation plot. If a data.frame, all numeric variables are used as rows/columns of the correlation plot, while all other variables are appended to the reshaped dataset as additional columns.

...

Further arguments (currently ignored in tidy_corrm.default).

labels

(Optional) character vector or function. If a character, must contain labels for the names of all numeric columns that are used to replace the column names in the plot axis and text labels and must be of the same length as the number of numeric columns displayed in the plot. If a function, must take the original names of the numeric columns as an argument and return a character vector with the same length. Defaults to NULL (use original column names as labels).

rescale

Character string specifying the type of transformation performed on the numeric variables in the plot. The default "as_is" uses the unchanged raw values. "by_sd" centers the data around zero and scales them by their standard deviation. "by_range" rescales the range of the data to the interval from 0 to 1. Defaults to "as_is".

corr_method

Character string with the correlation method passed to stats::cor(). Used for the .corr variable appended to the tidy_corrm dataset and passed on to lotri_corrtext()/utri_corrtext() layers. Can be one of "pearson", "kendall" and "spearman". Defaults to "pearson".

corr_group

NULL or the name of a numeric variable in data. If a grouping variable is specified, .corr will be calculated separately for each of these groups (which may be useful for conditional coloring). Defaults to NULL.

mutates

(Optional) list of named quosures created with rlang::quos(). Can be any expressions that specify changes to the tidy_corrm dataset after reshaping, using regular dplyr::mutate() syntax. Defaults to NULL (no mutate operations on the raw data).

Details

tidy_corrm() is an S3 generic that reshapes raw data for a correlation plot to a long-table format that can be plotted with ggcorrm(). The default method takes a data.frame or matrix and creates a tibble with all pairwise combinations of the numeric variables in the dataset. The variables are labelled with their column names (or, alternatively, with a vector of new labels) in the order of their appearance in the raw data. All other variables are appended to the reshaped data.frame and can be accessed in the plots.

By default, the data enter the plot unchanged (rescale = "as_is"), but they can also be centered and scaled by their standard deviation (rescale = "by_sd") or rescaled to the range from 0 to 1 (rescale = "by_range").
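The two transformations can be sketched in base R as follows. This is only an illustration under the assumption that "by_sd" means centering on the mean and dividing by the standard deviation, and that "by_range" means min-max scaling. The helper names rescale_by_sd() and rescale_by_range() are hypothetical and not part of corrmorant, and the package's handling of missing values may differ.

```r
# Hypothetical helpers sketching the two rescaling options in base R;
# they are NOT corrmorant functions.
rescale_by_sd <- function(x) {
  # center on the mean, then divide by the standard deviation
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

rescale_by_range <- function(x) {
  # map the minimum to 0 and the maximum to 1
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

x <- c(2, 4, 6, 8, 10)
z <- rescale_by_sd(x)     # mean 0, standard deviation 1
r <- rescale_by_range(x)  # values between 0 and 1
```

Neither transformation changes the Pearson correlation between two variables, since both are positive linear rescalings. They mainly affect how the data are displayed in the panels.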

An additional variable called .corr with the bivariate correlation of the two variables (by default, Pearson correlation, see cor()) is appended to the dataset. This variable can, for example, be used to specify the colour or fill of geoms conditional on the strength of the correlation (see examples in ggcorrm()). If the correlations displayed with lotri_corrtext() or utri_corrtext() are separated by groups, it may make sense to also calculate .corr separately for these groups. In this case, a grouping variable for the calculation of .corr can be specified using corr_group.
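The difference between an overall and a group-wise correlation can be illustrated with stats::cor() alone. The sketch below uses the built-in mtcars data with the numeric column cyl as a stand-in grouping variable. It only mirrors the idea behind corr_group and is not how tidy_corrm() computes .corr internally.

```r
# Overall correlation between two variables of the built-in mtcars data
overall <- cor(mtcars$mpg, mtcars$wt, method = "pearson")

# Correlation computed separately within each level of a numeric
# grouping variable (cyl), similar in spirit to setting corr_group
by_group <- sapply(
  split(mtcars, mtcars$cyl),
  function(d) cor(d$mpg, d$wt, method = "pearson")
)

overall
by_group
```

The group-wise values can differ markedly from the overall value, which is why computing .corr per group can be useful for conditional coloring.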

In many cases, the columns of the data.frame used to construct the correlation matrix belong to different groups of variables. Because tidy_corrm() takes its input in wide-table format, this information usually cannot be included as an additional column in the raw data. There are two ways to add variable-specific information after reshaping: a) call tidy_corrm() directly and modify its output manually before passing it to ggcorrm(), or b) use the mutates argument to pass a list of named quosures created with rlang::quos(); these contain mutating operations in regular dplyr::mutate() syntax that are evaluated inside the reshaped dataset (see examples). For the standard column names of tidy_corrm objects, see the Value section.
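Option a) can be sketched in base R on a mock long-format table. The table below only imitates the var_x and var_y columns of a reshaped dataset, and the variable names are made up for illustration. In practice the table would be the output of tidy_corrm().

```r
# Mock long-format table standing in for the output of tidy_corrm();
# the variable names are hypothetical.
vars <- c("petiole_length", "petiole_width", "blade_length", "blade_width")
long <- expand.grid(var_x = vars, var_y = vars, stringsAsFactors = FALSE)

# Add variable-specific columns by hand after reshaping
long$organ     <- ifelse(substr(long$var_x, 1, 1) == "p", "petiole", "leaf")
long$dimension <- ifelse(grepl("width", long$var_x), "width", "length")

head(long)
```

The same two expressions, wrapped in rlang::quos(), are what option b) would pass through the mutates argument.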

Value

An object of class tidy_corrm (a tibble with structured correlation data) containing the following columns:

var_x

Name of the variable on the x-axis in the order of appearance in the raw data (ordered factor).

var_y

Name of the variable on the y-axis in the order of appearance in the raw data (ordered factor).

x

Data of the variable on the x-axis (numeric).

y

Data of the variable on the y-axis (numeric).

pos

Type of panel (character, "utri", "lotri" or "dia").

.corr

Correlation between x and y for the respective panel/group, calculated with cor() using the method specified by corr_method and optionally within the groups specified with corr_group (numeric).

corr_group

Grouping variable for .corr (1 for all observations if no groups are specified).

Additional columns

All other columns specified in the dataset and/or created via mutates.

See Also

ggcorrm(), corrmorant()

Examples

## Not run: 
if(interactive()){
   # general shape of the output
   corrdat <- tidy_corrm(drosera)
   head(corrdat)

   # relabeling variables
   corrdat1 <- tidy_corrm(drosera,
     labels = c("Some", "very", "nice", "labels"))
   head(corrdat1)

   # use of mutates argument (quosures are created with rlang::quos())
   corrdat2 <- tidy_corrm(
     drosera,
     mutates = rlang::quos(
       organ = ifelse(substr(var_x, 1, 1) == "p", "petiole", "leaf"),
       dimension = ifelse(grepl("width", var_x), "width", "length")
     )
   )
   head(corrdat2)
}

## End(Not run)

r-link/corrmorant documentation built on Jan. 10, 2021, 7:26 p.m.