correlation | R Documentation |
Produces measures of association for all variables in a data frame with confidence intervals when available.
correlation(
data = NULL,
printClasses = FALSE,
progress = TRUE,
methodNum = "pearson",
methodOrd = "kendall",
methodNumOrd = "spearman",
methodNumNom = "eta",
methodNumBin = "pearson",
testChisq = "chisq",
ci = FALSE,
conf = 0.95,
R = 1000,
correct = FALSE,
reportIncomplete = TRUE,
na.action = "na.omit",
digits = 3,
pDigits = 4,
...
)
data |
A data frame. |
printClasses |
If |
progress |
If |
methodNum |
The method for the correlation for two numeric variables.
The default is "pearson". |
methodOrd |
The method for the correlation for two ordinal variables.
The default is "kendall". |
methodNumOrd |
The method for the correlation of a numeric and
an ordinal variable.
The default is "spearman". |
methodNumNom |
The method for the correlation of a numeric and a nominal variable. The default is "eta". |
methodNumBin |
The method for the correlation of a numeric and
a binary variable.
The default is "pearson". |
testChisq |
The method for the test of two nominal variables.
The default is "chisq". |
ci |
If |
conf |
The confidence level for confidence intervals. |
R |
The number of replications to use for bootstrap confidence intervals for applicable methods. |
correct |
Passed to |
reportIncomplete |
If |
na.action |
If |
digits |
The number of decimal places in the output of most statistics. |
pDigits |
The number of decimal places in the output for p-values. |
... |
Other arguments. |
It’s important that variables are assigned the correct class to get an appropriate measure of association. That is, factor variables should be of class "factor", not "character". Ordered factors should be ordered factors (and have their levels in the correct order!).
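As a minimal sketch of the class requirements above, the following base R code converts character columns to a factor and an ordered factor before any measure of association is computed. The data frame `Raw` and its values are hypothetical and used only for illustration.

```r
# Hypothetical raw data: the categorical columns start out as character.
Raw <- data.frame(
  Color  = c("Red", "Green", "Blue", "Red"),
  Rating = c("Low", "High", "Medium", "Low"),
  Length = c(0.29, 0.25, 0.40, 0.50),
  stringsAsFactors = FALSE
)

# Nominal variable: convert character to factor.
Raw$Color <- factor(Raw$Color)

# Ordinal variable: ordered factor with the levels stated explicitly;
# otherwise they would be sorted alphabetically (High, Low, Medium).
Raw$Rating <- factor(Raw$Rating,
                     levels  = c("Low", "Medium", "High"),
                     ordered = TRUE)

# Check the first class of each column before calling correlation().
sapply(Raw, function(x) class(x)[1])
```

Here `Color` should report as "factor", `Rating` as "ordered", and `Length` as "numeric".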
Date variables are treated as numeric.
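To illustrate what a numeric treatment of dates looks like, the sketch below shows the numeric representation base R gives a Date (days since 1970-01-01). That the function uses exactly this conversion internally is an assumption; the snippet only shows base R behavior.

```r
# A monthly sequence of dates, as in the example data for this function.
Start <- seq(as.Date("2024-01-01"), by = "month", length.out = 3)

# Base R stores a Date as the number of days since 1970-01-01,
# so converting to numeric gives an increasing series of day counts.
StartNum <- as.numeric(Start)
StartNum
```

Because the day counts increase with the calendar date, a later date behaves like a larger number in a correlation.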
The default measures of association tend to be of the "parametric" type, e.g. Pearson correlation where appropriate.
Nonparametric measures of association will be reported with the options methodNum = "spearman", methodNumNom = "epsilon", methodNumBin = "glass".
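A call using the nonparametric options might look like the sketch below. It assumes the function is provided by the rcompanion package (the page itself does not name the package) and reuses a few variables from the example data; the call is skipped if that package is not installed.

```r
# A subset of the variables from the example data for this function.
Length <- c(0.29, 0.25, NA, 0.40, 0.50, 0.57, 0.62, 0.88, 0.99, 0.90)
Color  <- factor(rep(c("Red", "Green", "Blue"), c(4, 4, 2)))
Answer <- factor(rep(c("Yes", "No", "Yes"), c(4, 3, 3)), levels = c("Yes", "No"))
Start  <- seq(as.Date("2024-01-01"), by = "month", length.out = 10)
Data   <- data.frame(Length, Color, Answer, Start)

# Assumption: correlation() lives in the rcompanion package.
if (requireNamespace("rcompanion", quietly = TRUE)) {
  # Request nonparametric measures in place of the defaults.
  Result <- rcompanion::correlation(Data,
                                    methodNum    = "spearman",
                                    methodNumNom = "epsilon",
                                    methodNumBin = "glass")
  print(Result)
}
```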
A data frame of variables, association statistics, p-values, and confidence intervals.
Salvatore Mangiafico, mangiafico@njaes.rutgers.edu
https://rcompanion.org/handbook/I_14.html
phi, spearmanRho, cramerV, freemanTheta, wilcoxonRG
Length = c(0.29, 0.25, NA, 0.40, 0.50, 0.57, 0.62, 0.88, 0.99, 0.90)
Rating = factor(ordered=TRUE, levels=c("Low", "Medium", "High"),
                x = rep(c("Low", "Medium", "High"), c(3,3,4)))
Color = factor(rep(c("Red", "Green", "Blue"), c(4,4,2)))
Flag = factor(rep(c(TRUE, FALSE, TRUE), c(5,4,1)))
Answer = factor(rep(c("Yes", "No", "Yes"), c(4,3,3)), levels=c("Yes", "No"))
Location = factor(rep(c("Home", "Away", "Other"), c(2,4,4)))
Distance = factor(ordered=TRUE, levels=c("Low", "Medium", "High"),
                  x = rep(c("Low", "Medium", "High"), c(5,2,3)))
Start = seq(as.Date("2024-01-01"), by = "month", length.out = 10)
Data = data.frame(Length, Rating, Color, Flag, Answer, Location, Distance, Start)

correlation(Data)