correlation: Correlation matrices

correlation    R Documentation

Correlation matrices

Description

Compute the correlation matrix between all columns of a matrix or data frame.

Usage

correlation(x, ...)

Correlation(x, ...)

## S3 method for class 'formula'
correlation(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
correlation(
  x,
  y = NULL,
  use = "everything",
  method = c("pearson", "kendall", "spearman"),
  ...
)

is.Correlation(x)

is.correlation(x)

as.Correlation(x)

as.correlation(x)

## S3 method for class 'Correlation'
print(x, digits = 3, cutoff = 0, ...)

## S3 method for class 'Correlation'
summary(
  object,
  cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95),
  symbols = c(" ", ".", ",", "+", "*", "B"),
  ...
)

## S3 method for class 'summary.Correlation'
print(x, ...)

## S3 method for class 'Correlation'
plot(
  x,
  y = NULL,
  outline = TRUE,
  cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95),
  palette = rwb.colors,
  col = NULL,
  numbers = TRUE,
  digits = 2,
  type = c("full", "lower", "upper"),
  diag = (type == "full"),
  cex.lab = par("cex.lab"),
  cex = 0.75 * par("cex"),
  ...
)

## S3 method for class 'Correlation'
lines(
  x,
  choices = 1L:2L,
  col = par("col"),
  lty = 2,
  ar.length = 0.1,
  pos = NULL,
  cex = par("cex"),
  labels = rownames(x),
  ...
)

Arguments

x

A numeric vector, matrix or data frame (or any object for is.Correlation() or as.Correlation()).

...

Further arguments passed to functions.

formula

A formula with no response variable, referring only to numeric variables.

data

An optional data frame (or similar, see model.frame()) containing the variables in the formula. By default the variables are taken from environment(formula).

subset

An optional vector used to select rows (observations) of the data matrix x.

na.action

A function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options() and na.fail() is used if that is not set. The 'factory-fresh' default is na.omit().

y

NULL (default), or a vector, matrix or data frame with dimensions compatible with x for Correlation(). The default is equivalent to y = x, but more efficient.

use

An optional character string giving a method for computing correlations in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".

method

A character string indicating which correlation coefficient is to be computed: one of "pearson" (default), "kendall", or "spearman". The string can be abbreviated.

digits

Digits to print after the decimal separator.

cutoff

Correlation coefficients lower than this (in absolute value) are suppressed.

object

A 'Correlation' object.

cutpoints

The cut points to use for categories. Specify only positive values (absolute values of the correlation coefficients are summarized, and negative equivalents are computed automatically for the graph). Do not include 0 or 1 in the cutpoints.

symbols

The symbols to use to summarize the correlation matrix.

outline

Do we draw the outline of the ellipse?

palette

A function that can produce a palette of colors.

col

Color of the ellipse. If NULL (default), the colors will be computed using cutpoints and palette.

numbers

Do we print correlation values in the center of the ellipses?

type

Do we plot a complete matrix, or only lower or upper triangle?

diag

Do we plot items on the diagonal? They always have a correlation of one.

cex.lab

The expansion factor for labels.

cex

The expansion factor for text.

choices

The items to select.

lty

The line type to draw.

ar.length

The length of the arrow head.

pos

The position relative to arrows.

labels

The labels to draw near the arrows.

Value

Correlation() and as.Correlation() create a 'Correlation' object, while is.Correlation() tests for it.

There are print() and summary() methods for the 'Correlation' object. They differ in that summary() encodes the correlations symbolically (using symnum()), which makes large correlation matrices more readable.

The plot() method draws ellipses on a graph to represent the correlation matrix visually. This is essentially the plotcorr() function from package ellipse, with slightly different default arguments and with default cutpoints equivalent to those used in the summary() method.

Author(s)

Philippe Grosjean <phgrosjean@sciviews.org>, wrapping code from function plotcorr() in package ellipse for the plot.Correlation() method.

See Also

cov(), cov2cor(), cov.wt(), symnum(), plotcorr(); see also panel_cor()

Examples

# This is a simple correlation coefficient
cor(rnorm(10), runif(10))
Correlation(rnorm(10), runif(10))

# 'Correlation' objects allow better inspection of the correlation matrices
# than the output of the default R cor() function
(longley.cor <- Correlation(longley))
summary(longley.cor) # Synthetic view of the correlation matrix
plot(longley.cor)    # Graphical representation
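
# A rough base-R sketch of the symbolic encoding described for summary():
# symnum() applied to a plain cor() matrix, with the default cutpoints and
# symbols listed in Usage. This only illustrates the idea and is not
# necessarily how summary() is implemented (longley.base and longley.sym
# are names made up for this sketch)
longley.base <- cor(longley)
longley.sym <- symnum(longley.base,
  cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95), # No 0 or 1: added when corr = TRUE
  symbols = c(" ", ".", ",", "+", "*", "B"),
  corr = TRUE)
longley.sym # One symbol per coefficient, followed by a legend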

# Use of the formula interface
(mtcars.cor <- Correlation(~ mpg + cyl + disp + hp, data = mtcars,
  method = "spearman", na.action = "na.omit"))

mtcars.cor2 <- Correlation(mtcars, method = "spearman")
print(mtcars.cor2, cutoff = 0.6)
summary(mtcars.cor2)
plot(mtcars.cor2, type = "lower")
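
# A minimal base-R sketch of what the cutoff argument of print() is described
# as doing: hiding coefficients smaller than the cutoff in absolute value.
# This mimics the effect on a plain cor() matrix; the actual print() method
# may format its output differently (mtcars.base and mtcars.shown are
# hypothetical names used only here)
mtcars.base <- cor(mtcars, method = "spearman")
mtcars.shown <- round(mtcars.base, 3)
mtcars.shown[abs(mtcars.base) < 0.6] <- NA
print(mtcars.shown, na.print = "") # Suppressed coefficients appear blank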

mtcars.cor2["mpg", "cyl"] # Extract a correlation from the correlation matrix
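
# Sketch of the effect of the use argument when the data contain missing
# values, shown with base R cor(), which accepts the same strings. That
# Correlation() behaves identically is an assumption made here (aq and the
# aq.* objects are names made up for this sketch)
aq <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]
aq.all <- cor(aq)                                     # "everything": NAs propagate
aq.complete <- cor(aq, use = "complete.obs")          # Only rows without any NA
aq.pairwise <- cor(aq, use = "pairwise.complete.obs") # Complete pairs, column by column
aq.all
aq.complete
aq.pairwise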

SciViews/SciViews documentation built on Sept. 16, 2023, 10:26 p.m.