This function computes the Pearson, Spearman, or Kendall correlation for each specified variable in the data set and returns, for each one, the variables that are correlated with it. It also returns a short variable list from which the highly correlated variables have been removed.
listTopCorrelatedVariables(variableList,
                           data,
                           pvalue = 0.001,
                           corthreshold = 0.9,
                           method = c("pearson", "kendall", "spearman"))

variableList 
A data frame with two columns: the first must contain the names of the candidate variables, and the second their descriptions 
data 
A data frame where all variables are stored in different columns 
pvalue 
The maximum p-value, associated with the chosen correlation method, for a pair of variables to be considered correlated 
corthreshold 
The minimum correlation score, associated with the chosen correlation method, for a pair of variables to be considered correlated 
method 
Correlation method: Pearson product-moment ("pearson"), Spearman's rank ("spearman"), or Kendall rank ("kendall") 
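How pvalue and corthreshold might act together on a single pair of variables can be sketched with base R's cor.test. This is only an illustration: isCorrelatedPair is a hypothetical helper, not a function of this package, and the use of the absolute correlation and of inclusive comparisons are assumptions rather than documented behavior.

```r
# Hypothetical helper (not part of the package): a pair counts as correlated
# only if it passes both the correlation and the p-value threshold.
isCorrelatedPair <- function(x, y, pvalue = 0.001, corthreshold = 0.9,
                             method = c("pearson", "kendall", "spearman")) {
  method <- match.arg(method)
  # Rank-based methods may warn about ties; the warning is silenced here
  test <- suppressWarnings(stats::cor.test(x, y, method = method))
  # Assumption: the sign of the correlation is ignored
  abs(unname(test$estimate)) >= corthreshold && test$p.value <= pvalue
}

x <- 1:20
strongPair <- isCorrelatedPair(x, x^2)                   # monotone relation
weakPair   <- isCorrelatedPair(x, rep(c(1, -1), 10))     # alternating signal
```

With these inputs the first pair should pass both thresholds and the second should fail the correlation threshold.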
correlated.variables 
A data frame with two columns:

short.list 
A vector of variables that are not correlated with each other. For every correlated pair, only the variable that entered the correlation analysis first is kept 
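The "first variable wins" rule behind short.list can be sketched as a greedy pass over the columns. This is an illustrative approximation under stated assumptions, not the package's implementation: shortListSketch is a hypothetical name, and only the correlation threshold is applied here (the p-value check is left out to keep the sketch short).

```r
# Hypothetical sketch (not part of the package): walk through the columns in
# order and keep a variable only if it is not highly correlated with any
# variable that was already kept.
shortListSketch <- function(data, corthreshold = 0.9, method = "pearson") {
  # Assumption: the sign of the correlation is ignored
  corMatrix <- abs(stats::cor(data, method = method))
  kept <- character(0)
  for (name in colnames(data)) {
    if (length(kept) == 0 || all(corMatrix[name, kept] < corthreshold)) {
      kept <- c(kept, name)
    }
  }
  kept
}

toy <- data.frame(a = 1:20, b = (1:20)^2, c = rep(c(1, -1), 10))
keptVariables <- shortListSketch(toy)
```

In this toy data, b is dropped because it is highly correlated with a, which was seen first, while c is kept.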
Jose G. Tamez-Pena and Antonio Martinez-Torteya
## Not run:
# Start the graphics device driver to save all plots in a pdf format
pdf(file = "Example.pdf")
# Get the stage C prostate cancer data from the rpart package
library(rpart)
data(stagec)
# Split the stages into several columns
dataCancer <- cbind(stagec[,c(1:3,5:6)],
                    gleason4 = 1*(stagec[,7] == 4),
                    gleason5 = 1*(stagec[,7] == 5),
                    gleason6 = 1*(stagec[,7] == 6),
                    gleason7 = 1*(stagec[,7] == 7),
                    gleason8 = 1*(stagec[,7] == 8),
                    gleason910 = 1*(stagec[,7] >= 9),
                    eet = 1*(stagec[,4] == 2),
                    diploid = 1*(stagec[,8] == "diploid"),
                    tetraploid = 1*(stagec[,8] == "tetraploid"),
                    notAneuploid = 1-1*(stagec[,8] == "aneuploid"))
# Remove the incomplete cases
dataCancer <- dataCancer[complete.cases(dataCancer),]
# Load a pre-established data frame with the names and descriptions of all variables
data(cancerVarNames)
# Get the variables that have a correlation coefficient larger
# than 0.65 at a p-value of 0.05
cor <- listTopCorrelatedVariables(variableList = cancerVarNames,
                                  data = dataCancer,
                                  pvalue = 0.05,
                                  corthreshold = 0.65,
                                  method = "pearson")
# Shut down the graphics device driver
dev.off()
## End(Not run)
