View source: R/univariateRankVariables.R
This function reports the mean and standard deviation for each feature in a model, and ranks them according to a user-specified score. Additionally, it performs a Kolmogorov-Smirnov (KS) test on the raw and z-standardized data. It also reports the raw and z-standardized t-test score, the p-value of the Wilcoxon rank-sum test, the integrated discrimination improvement (IDI), the net reclassification improvement (NRI), the net residual improvement (NeRI), and the area under the ROC curve (AUC). Furthermore, it reports the z-value of the variable significance on the fitted model.
univariateRankVariables(variableList,
                        formula,
                        Outcome,
                        data,
                        categorizationType = c("Raw",
                                               "Categorical",
                                               "ZCategorical",
                                               "RawZCategorical",
                                               "RawTail",
                                               "RawZTail",
                                               "Tail",
                                               "RawRaw"),
                        type = c("LOGIT", "LM", "COX"),
                        rankingTest = c("zIDI",
                                        "zNRI",
                                        "IDI",
                                        "NRI",
                                        "NeRI",
                                        "Ztest",
                                        "AUC",
                                        "CStat",
                                        "Kendall"),
                        cateGroups = c(0.1, 0.9),
                        raw.dataFrame = NULL,
                        description = ".",
                        uniType = c("Binary", "Regression"),
                        FullAnalysis = TRUE,
                        acovariates = NULL,
                        timeOutcome = NULL
)

variableList 
A data frame with the candidate variables to be ranked 
formula 
An object of class 
Outcome 
The name of the column in 
data 
A data frame where all variables are stored in different columns 
categorizationType 
How variables will be analyzed: As given in 
type 
Fit type: Logistic ("LOGIT"), linear ("LM"), or Cox proportional hazards ("COX") 
rankingTest 
Variables will be ranked based on: The z-score of the IDI ("zIDI"), the z-score of the NRI ("zNRI"), the IDI ("IDI"), the NRI ("NRI"), the NeRI ("NeRI"), the z-score of the model fit ("Ztest"), the AUC ("AUC"), the Somers' rank correlation ("CStat"), or the Kendall rank correlation ("Kendall")
cateGroups 
A vector of percentiles to be used for the categorization procedure 
raw.dataFrame 
A data frame similar to 
description 
The name of the column in 
uniType 
Type of univariate analysis: Binary classification ("Binary") or regression ("Regression") 
FullAnalysis 
If FALSE, the function will only order the features according to their z-statistics from the linear model
acovariates
The list of covariates
timeOutcome
The name of the time-to-event feature
This function will create valid dummy categorical variables if, and only if, data has been z-standardized.
The p-values provided in cateGroups will be converted to their corresponding z-scores, which will then be used to create the categories.
If data that have not been z-standardized are used, the categorization analysis will return wrong results.
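The conversion described above can be sketched in base R. This is only an illustration and not the package's internal code: it assumes the percentiles in cateGroups are mapped to thresholds with qnorm, and the variables x, z, lowTail, and hiTail are hypothetical names.

```r
# Hypothetical raw variable and its z-standardized version
set.seed(1)
x <- rnorm(200, mean = 50, sd = 10)
z <- as.numeric(scale(x))

# Percentiles as given in cateGroups
cateGroups <- c(0.1, 0.9)

# Convert each percentile to the matching z-score of a standard normal
zThresholds <- qnorm(cateGroups)
print(round(zThresholds, 3))   # -1.282  1.282

# Dummy variables flagging the low and high tails (assumed categorization)
lowTail <- as.integer(z < zThresholds[1])
hiTail  <- as.integer(z > zThresholds[2])

# Applying the same thresholds to the raw data is meaningless:
# every raw value lies above the upper z threshold
rawHiTail <- as.integer(x > zThresholds[2])
print(c(zHiTail = sum(hiTail), rawHiTail = sum(rawHiTail)))
```

The last two lines show why the thresholds only make sense on z-standardized data: on the raw scale they no longer correspond to the requested percentiles.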
A sorted data frame. In the case of a binary classification analysis, the data frame will have the following columns:
Name 
Name of the raw variable or of the dummy variable if the data has been categorized 
parent 
Name of the raw variable from which the dummy variable was created 
descrip 
Description of the parent variable, as defined in 
cohortMean 
Mean value of the variable 
cohortStd 
Standard deviation of the variable 
cohortKSD 
D statistic of the KS test when comparing a normal distribution and the distribution of the variable 
cohortKSP 
Associated p-value to the
caseMean 
Mean value of cases (subjects with 
caseStd 
Standard deviation of cases 
caseKSD 
D statistic of the KS test when comparing a normal distribution and the distribution of the variable only for cases 
caseKSP 
Associated p-value to the
caseZKSD
D statistic of the KS test when comparing a normal distribution and the distribution of the z-standardized variable only for cases
caseZKSP
Associated p-value to the
controlMean 
Mean value of controls (subjects with 
controlStd 
Standard deviation of controls 
controlKSD 
D statistic of the KS test when comparing a normal distribution and the distribution of the variable only for controls 
controlKSP 
Associated p-value to the
controlZKSD
D statistic of the KS test when comparing a normal distribution and the distribution of the z-standardized variable only for controls
controlZKSP
Associated p-value to the
t.Rawvalue
Normal inverse p-value (z-value) of the t-test performed on
t.Zvalue
z-value of the t-test performed on
wilcox.Zvalue
z-value of the Wilcoxon rank-sum test performed on
ZGLM
z-value returned by the
zNRI
z-value returned by the
zIDI
z-value returned by the
zNeRI
z-value returned by the
ROCAUC 
Area under the ROC curve returned by the 
cStatCorr 
c index of Somers' rank correlation returned by the 
NRI 
NRI returned by the 
IDI 
IDI returned by the 
NeRI 
NeRI returned by the 
kendall.r 
Kendall τ rank correlation coefficient between the variable and the binary outcome 
kendall.p 
Associated p-value to the
TstudentRes.p
p-value of the improvement in residuals, as evaluated by the paired t-test
WilcoxRes.p
p-value of the improvement in residuals, as evaluated by the paired Wilcoxon rank-sum test
FRes.p
p-value of the improvement in residual variance, as evaluated by the F-test
caseN_Z_Low_Tail 
Number of cases in the low tail 
caseN_Z_Hi_Tail 
Number of cases in the top tail 
controlN_Z_Low_Tail 
Number of controls in the low tail 
controlN_Z_Hi_Tail 
Number of controls in the top tail 
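Several of the binary-classification columns above can be approximated with base R. The following is a minimal sketch under stated assumptions, not the function's actual implementation: the data are simulated, the helpers ksNormal and pToZ are hypothetical, the KS test compares each sample with a normal distribution that uses the sample's own mean and standard deviation, and p-values are converted to unsigned z-values with the two-sided normal quantile.

```r
set.seed(42)
# Hypothetical data: binary outcome (1 = case, 0 = control) and one feature
outcome <- rep(c(0, 1), each = 100)
feature <- rnorm(200, mean = ifelse(outcome == 1, 1, 0))

cases    <- feature[outcome == 1]
controls <- feature[outcome == 0]

# KS test of a sample against a normal with the sample's mean and sd (assumed)
ksNormal <- function(v) {
  ks <- ks.test(v, "pnorm", mean = mean(v), sd = sd(v))
  c(D = unname(ks$statistic), p = ks$p.value)
}

# Two-sided p-value to an unsigned z-value (assumed convention)
pToZ <- function(p) qnorm(p / 2, lower.tail = FALSE)

tTest <- t.test(cases, controls)
wTest <- wilcox.test(cases, controls)

# AUC from the Mann-Whitney statistic: W / (nCases * nControls)
auc <- unname(wTest$statistic) / (length(cases) * length(controls))

ksCohort <- ksNormal(feature)
summaryRow <- data.frame(
  cohortMean    = mean(feature),
  cohortStd     = sd(feature),
  cohortKSD     = ksCohort[["D"]],
  cohortKSP     = ksCohort[["p"]],
  caseMean      = mean(cases),
  controlMean   = mean(controls),
  t.Rawvalue    = pToZ(tTest$p.value),
  wilcox.Zvalue = pToZ(wTest$p.value),
  ROCAUC        = auc
)
print(summaryRow)
```

The column names mirror those listed above only for readability; the values returned by univariateRankVariables may differ in sign conventions and in how the tests are configured.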
In the case of regression analysis, the data frame will have the following columns:
Name 
Name of the raw variable or of the dummy variable if the data has been categorized 
parent 
Name of the raw variable from which the dummy variable was created 
descrip 
Description of the parent variable, as defined in 
cohortMean 
Mean value of the variable 
cohortStd 
Standard deviation of the variable 
cohortKSD 
D statistic of the KS test when comparing a normal distribution and the distribution of the variable 
cohortKSP 
Associated p-value to the
cohortZKSD
D statistic of the KS test when comparing a normal distribution and the distribution of the z-standardized variable
cohortZKSP
Associated p-value to the
ZGLM
z-value returned by the glm or Cox procedure for the z-standardized variable
zNRI
z-value returned by the
NeRI 
NeRI returned by the 
cStatCorr 
c index of Somers' rank correlation returned by the 
spearman.r 
Spearman ρ rank correlation coefficient between the variable and the outcome 
pearson.r 
Pearson r product-moment correlation coefficient between the variable and the outcome
kendall.r 
Kendall τ rank correlation coefficient between the variable and the outcome 
kendall.p 
Associated p-value to the
TstudentRes.p
p-value of the improvement in residuals, as evaluated by the paired t-test
WilcoxRes.p
p-value of the improvement in residuals, as evaluated by the paired Wilcoxon rank-sum test
FRes.p
p-value of the improvement in residual variance, as evaluated by the F-test
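The correlation and residual-improvement columns for the regression case can be illustrated with base R. This is a sketch under assumptions and not the package's code: the data are simulated, the baseline is taken to be an intercept-only linear model, and the residual tests compare absolute residuals of the baseline and of the model that includes the feature. Note that wilcox.test with paired = TRUE performs the Wilcoxon signed-rank test.

```r
set.seed(7)
# Hypothetical continuous outcome and one candidate feature
feature <- rnorm(150)
outcome <- 2 * feature + rnorm(150)

# Rank and product-moment correlations between the feature and the outcome
spearman.r <- cor(feature, outcome, method = "spearman")
pearson.r  <- cor(feature, outcome, method = "pearson")
kendall    <- cor.test(feature, outcome, method = "kendall")
kendall.r  <- unname(kendall$estimate)
kendall.p  <- kendall$p.value

# Residuals of the assumed baseline model and of the model with the feature
baseRes <- residuals(lm(outcome ~ 1))
fullRes <- residuals(lm(outcome ~ feature))

# Paired tests on absolute residuals: is the baseline error larger?
TstudentRes.p <- t.test(abs(baseRes), abs(fullRes),
                        paired = TRUE, alternative = "greater")$p.value
WilcoxRes.p   <- wilcox.test(abs(baseRes), abs(fullRes),
                             paired = TRUE, alternative = "greater")$p.value

# F-test on residual variances: is the baseline variance larger?
FRes.p <- var.test(baseRes, fullRes, alternative = "greater")$p.value

print(c(spearman.r = spearman.r, pearson.r = pearson.r, kendall.r = kendall.r))
print(c(TstudentRes.p = TstudentRes.p, WilcoxRes.p = WilcoxRes.p, FRes.p = FRes.p))
```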
Jose G. Tamez-Pena
Pencina, M. J., D'Agostino, R. B., & Vasan, R. S. (2008). Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Statistics in Medicine, 27(2), 157-172.