varimpPred: Variable Importance and Predictions
In khadijaaziz/icardaFIGSr: Subsetting using Focused Identification of the Germplasm Strategy (FIGS)


varimpPred calculates variable importance and makes predictions. It returns a list containing a data frame of variable importance scores, predictions (for regression models) or class probabilities (for classification models), and the corresponding plots.

varimpPred(
  newdata,
  y,
  positive,
  model,
  scale = FALSE,
  auc = FALSE,
  predict = FALSE,
  ...
)

`newdata`	object of class "data.frame" containing the test data.
`y`	character. Name of the target variable.
`positive`	character. The positive class for the target variable if y is a factor. Usually, it is the first level of the factor.
`model`	expression. The model object returned after training a model on training data.
`scale`	logical. If `TRUE`, scales the variable importance values to the range 0 to 100. Default: FALSE.
`auc`	logical. If `TRUE`, calculates the area under the ROC curve and returns the value. Default: FALSE.
`predict`	logical. If `TRUE`, calculates class probabilities and returns them as a data frame. Default: FALSE.
`...`	additional arguments to be passed to the `varImp` function in the package `caret`.

The importance measure for each variable is calculated based on the type of model.

For example, for linear models, the absolute value of the t-statistic of each parameter is used in the importance calculation.
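The linear-model case can be illustrated in base R without caret or icardaFIGSr. The sketch below is only an illustration: the mtcars data, the chosen predictors, and the variable names are hypothetical, and the min-max rescaling to 0 to 100 is an assumption about what scale = TRUE does rather than code taken from the package.

```r
# Hypothetical illustration on the built-in mtcars data; not icardaFIGSr code.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Coefficient table: one row per term, including a "t value" column.
coefs <- summary(fit)$coefficients

# Importance of each predictor = absolute t-statistic (intercept dropped).
importance <- abs(coefs[-1, "t value"])

# Assumed behaviour of scale = TRUE: min-max rescaling to the range 0 to 100.
scaled <- (importance - min(importance)) /
  (max(importance) - min(importance)) * 100

print(sort(importance, decreasing = TRUE))
print(sort(scaled, decreasing = TRUE))
```

Under this rescaling the most important predictor receives 100 and the least important receives 0, so scaled scores are only comparable within one model.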

For classification models, with the exception of classification trees, bagged trees and boosted trees, a variable importance score is calculated for each class. See varImp for details on model-specific metrics.

varimpPred can be used to obtain either variable importance metrics, predictions, class probabilities, or a combination of these.

For classification models with predict = TRUE, class probabilities and a ROC curve are given in the results.

For regression models with predict = TRUE, predictions and a plot of residuals versus predicted values are given.
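As a rough picture of the regression output, the following base R sketch computes predictions on held-out data and plots residuals against them. It swaps in lm and base graphics for the caret model and ggplot2 plot that varimpPred relies on, and the data split, seed, and variable names are hypothetical.

```r
# Hypothetical illustration on the built-in mtcars data; not icardaFIGSr code.
set.seed(1)
test_idx <- sample(nrow(mtcars), 10)
train <- mtcars[-test_idx, ]
test  <- mtcars[test_idx, ]

# Train on the training rows only, then predict on the held-out rows.
fit <- lm(mpg ~ wt + hp, data = train)
predicted <- predict(fit, newdata = test)

# Residual = observed value minus predicted value.
residuals_test <- test$mpg - predicted

# Residuals versus predicted values, with a reference line at zero.
plot(predicted, residuals_test,
     xlab = "Predicted", ylab = "Residual",
     main = "Residuals vs predicted (test data)")
abline(h = 0, lty = 2)
```

Points scattered evenly around the zero line suggest no obvious systematic error; a visible trend or funnel shape suggests the model misses some structure in the data.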

A list object with importance measures for variables in newdata, predictions for regression models, class probabilities for classification models, and corresponding plots.

newdata should be either the test data that remains after splitting the whole data set into training and test sets, or a new data set different from the one used to train the model.

If y is a factor, class probabilities are calculated for each class. If y is numeric, predicted values are calculated.

If predict = TRUE, a ROC curve is created when y is a factor, and a plot of residuals versus predicted values is created when y is numeric.
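For the classification case, the area under the ROC curve can be understood as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it with the rank-based (Mann-Whitney) formula in base R. This is a swapped-in technique for illustration, not the plotROC computation the function uses, and the probabilities, labels, and the helper auc_rank are hypothetical.

```r
# Hypothetical class probabilities for the positive class "R" and true labels.
prob_R <- c(0.90, 0.80, 0.65, 0.40, 0.35, 0.10)
truth  <- factor(c("R", "R", "S", "R", "S", "S"), levels = c("R", "S"))

# Rank-based AUC: fraction of (positive, negative) pairs in which the
# positive case has the higher score. Ties receive average ranks.
auc_rank <- function(score, truth, positive) {
  is_pos <- truth == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_value <- auc_rank(prob_R, truth, positive = "R")
print(auc_value)  # 8 of the 9 positive/negative pairs are ordered correctly
```

The choice of positive class matters here: passing "S" instead of "R" with the same scores would give the complementary value, which is why the positive argument should name the class whose probabilities are being scored.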

varimpPred relies on packages caret, ggplot2 and plotROC to perform the calculations and plotting.

Zakaria Kehel, Bancy Ngatia, Khadija Aziz, Zainab Azough

varImp, predict.train, ggplot, geom_roc, calc_auc

if(interactive()){
 # Load the package and caret (caret provides twoClassSummary)
 library(icardaFIGSr)
 library(caret)

 # Calculate variable importance for classification model
 data("septoriaDurumWC")
 knn.mod <- tuneTrain(data = septoriaDurumWC, y = 'ST_S', method = 'knn')
 testdata <- knn.mod$`Test Data`
 knn.varimp <- varimpPred(newdata = testdata, y = 'ST_S',
                          positive = 'R', model = knn.mod$Model)
 knn.varimp

 # Calculate variable importance and obtain class probabilities
 data("septoriaDurumWC")
 svm.mod <- tuneTrain(data = septoriaDurumWC, y = 'ST_S', method = 'svmLinear2',
                      predict = TRUE, positive = 'R', summary = twoClassSummary)
 testdata <- svm.mod$`Test Data`
 svm.varimp <- varimpPred(newdata = testdata, y = 'ST_S',
                          positive = 'R', model = svm.mod$Model,
                          auc = TRUE, predict = TRUE)
 svm.varimp

 # Obtain variable importance plot for only first 20 variables
 # with highest measure
 svm.varimp <- varimpPred(newdata = testdata, y = 'ST_S',
                          positive = 'R', model = svm.mod$Model,
                          auc = TRUE, predict = TRUE, top = 20)
 svm.varimp
}

khadijaaziz/icardaFIGSr documentation built on Dec. 21, 2021, 6:38 a.m.
