reportEquivalentVariables: Report the set of variables that will perform an equivalent...

Description

Given a model, this function reports a data frame with all the variables that may be interchanged in the model without affecting its classification performance. For each variable in the model, the function loops over all candidate variables and reports those whose substitution yields a zIDI (the z-score of the integrated discrimination improvement, IDI) equivalent to or better than that of the original model.
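To make the zIDI criterion concrete, the following base-R sketch computes an IDI and its z-score for a single swap of one predictor. It uses the standard textbook definition of the IDI (mean change in predicted probability among events minus the mean change among non-events); the data, the variable names x1, x2, x3, and the helper idi_z are hypothetical, and FRESA.CAD's internal computation may differ.

```r
# Illustrative IDI z-score for swapping one predictor (base R only).
# All names and data here are assumptions made for the example.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)        # candidate that closely tracks x1
x3 <- rnorm(n)
Class <- rbinom(n, 1, plogis(x1 + 0.5 * x3))
d <- data.frame(Class, x1, x2, x3)

# Predicted probabilities of the original model and of the model with x1 -> x2
p_orig <- fitted(glm(Class ~ x1 + x3, data = d, family = binomial))
p_swap <- fitted(glm(Class ~ x2 + x3, data = d, family = binomial))

# IDI = mean change among events - mean change among non-events;
# the z-score divides the IDI by the combined standard error of both means.
idi_z <- function(p_new, p_old, y) {
  d_event    <- (p_new - p_old)[y == 1]
  d_nonevent <- (p_new - p_old)[y == 0]
  idi <- mean(d_event) - mean(d_nonevent)
  se  <- sqrt(var(d_event) / length(d_event) +
              var(d_nonevent) / length(d_nonevent))
  c(IDI = idi, zIDI = idi / se)
}

res <- idi_z(p_swap, p_orig, d$Class)
print(res)
```

A zIDI close to zero suggests the swapped model discriminates about as well as the original one, which is the sense in which two variables can be called interchangeable.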

Usage

reportEquivalentVariables(object,
                          pvalue = 0.05,
                          data,
                          variableList,
                          Outcome = "Class",
                          timeOutcome = NULL,
                          type = c("LOGIT", "LM", "COX"),
                          description = ".",
                          method = "BH",
                          osize = 0,
                          fitFRESA = TRUE)

Arguments

object

An object of class lm, glm, or coxph containing the model to be analyzed

pvalue

The maximum p-value, associated with the IDI, allowed for a pair of variables to be considered equivalent

data

A data frame where all variables are stored in different columns

variableList

A data frame with two columns. The first column must hold the names of the candidate variables and the second column their descriptions

Outcome

The name of the column in data that stores the variable to be predicted by the model

timeOutcome

The name of the column in data that stores the time to event

type

Fit type: Logistic ("LOGIT"), linear ("LM"), or Cox proportional hazards ("COX")

description

The name of the column in variableList that stores the variable description

method

The method used by the p-value adjustment algorithm

osize

The number of features used for p-value adjustment

fitFRESA

If TRUE, the C++-based fitting method is used
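The method and osize arguments can be illustrated with base R's p.adjust. This is a sketch under two assumptions that the page does not confirm: that method takes the same names as p.adjust (for example "BH"), and that osize plays the role of the number of comparisons n. The p-values below are invented for the example.

```r
# Hypothetical unadjusted p-values for five candidate variables
p_raw <- c(0.001, 0.01, 0.03, 0.04, 0.2)

# Assumed feature count used for the adjustment (stands in for osize)
osize <- 20

# p.adjust requires n >= length(p), so never go below the observed count
n_tests <- max(length(p_raw), osize)

# Benjamini-Hochberg adjustment over the observed p-values only
p_bh_default <- p.adjust(p_raw, method = "BH")

# Same adjustment, but correcting as if n_tests features had been screened
p_bh_osize <- p.adjust(p_raw, method = "BH", n = n_tests)

print(rbind(raw = p_raw, BH = p_bh_default, BH_osize = p_bh_osize))
```

Under these assumptions, a larger feature count makes the adjustment more conservative, so fewer candidates would pass the pvalue threshold.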

Value

pvalueList

A list with all the unadjusted p-values of the equivalent features per model variable

equivalentMatrix

A data frame with three columns. The first column is the original variable of the model. The second column lists all variables that, if interchanged, will not statistically affect the performance of the model. The third column lists the corresponding z-scores of the IDI for each equivalent variable.

formulaList

A character vector with all the equivalent formulas

equivalentModel

A bagged model built from all the equivalent formulas. The model size is limited by the number of observations
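The arguments and return values described above might be put together as in the sketch below. The synthetic data, the column names Name and Description, and the choice of a glm fit are assumptions made for illustration; only the argument names and the returned elements come from this page. The call is skipped when FRESA.CAD is not installed, and the exact output depends on the installed version.

```r
# Synthetic data: x2 is a noisy copy of x1, so it is a natural swap candidate.
set.seed(42)
n  <- 300
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.2)
x3 <- rnorm(n)
x4 <- rnorm(n)                                   # pure noise
Class <- rbinom(n, 1, plogis(1.2 * x1 + 0.8 * x3))
dataset <- data.frame(Class, x1, x2, x3, x4)

# Two-column candidate table: variable names first, descriptions second.
variableList <- data.frame(
  Name        = c("x1", "x2", "x3", "x4"),
  Description = c("signal", "noisy copy of x1", "second signal", "noise"),
  stringsAsFactors = FALSE
)

# Model whose variables will be tested for interchangeable candidates.
model <- glm(Class ~ x1 + x3, data = dataset, family = binomial)

if (requireNamespace("FRESA.CAD", quietly = TRUE)) {
  eq <- FRESA.CAD::reportEquivalentVariables(
    object       = model,
    pvalue       = 0.05,
    data         = dataset,
    variableList = variableList,
    Outcome      = "Class",
    type         = "LOGIT",
    description  = "Description",
    method       = "BH"
  )
  print(eq$equivalentMatrix)   # original variable, equivalents, z-scores
  print(eq$formulaList)        # formulas of the equivalent models
}
```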

Author(s)

Jose G. Tamez-Pena


FRESA.CAD documentation built on Jan. 13, 2021, 3:39 p.m.