AutoScore_rank: AutoScore STEP(i): Rank variables with machine learning...
In AutoScore: An Interpretable Machine Learning-Based Automatic Clinical Score Generator

AutoScore_rank

R Documentation

AutoScore STEP(i): Rank variables with machine learning (AutoScore Module 1)

Description

Ranks candidate variables using machine learning: random forest by default, or AUC-based univariate models. This is STEP(i) of the AutoScore workflow (AutoScore Module 1).

Usage

AutoScore_rank(train_set, validation_set = NULL, method = "rf", ntree = 100)

Arguments

`train_set`	A processed `data.frame` containing the training data to be analyzed.
`validation_set`	A processed `data.frame` containing the validation data. Required only for AUC-based ranking (`method = "auc"`); defaults to `NULL`.
`method`	Ranking method. Options: 1. `"rf"` - random forest (default); 2. `"auc"` - AUC-based (requires a validation set). For `"auc"`, a univariate model is built for each variable on the training set, and variables are ranked by the AUC of the corresponding univariate model on the validation set (`validation_set`).
`ntree`	Number of trees in the random forest (default: 100).
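
The AUC-based option described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the data are synthetic, the predictor names (`strong`, `weak`, `noise`) and the helper `auc_rank` are made up for illustration, and logistic regression is assumed as the univariate model.

```r
# Illustrative sketch only; not the AutoScore implementation.
# Synthetic data: all column names except `label` are hypothetical.
set.seed(42)
make_data <- function(n) {
  strong <- rnorm(n)
  weak <- rnorm(n)
  noise <- rnorm(n)
  p <- plogis(2 * strong + 0.5 * weak)
  data.frame(label = rbinom(n, 1, p), strong = strong, weak = weak, noise = noise)
}
train_set <- make_data(400)
validation_set <- make_data(400)

# AUC computed with the rank-sum (Mann-Whitney) identity
auc_rank <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One univariate logistic model per variable, fitted on the training set
# and scored by its AUC on the validation set
predictors <- setdiff(names(train_set), "label")
auc_by_variable <- sapply(predictors, function(v) {
  fit <- glm(reformulate(v, response = "label"),
             data = train_set, family = binomial())
  pred <- predict(fit, newdata = validation_set, type = "response")
  auc_rank(pred, validation_set$label)
})

# Variables ordered from highest to lowest validation AUC
ranking_sketch <- sort(auc_by_variable, decreasing = TRUE)
print(ranking_sketch)
```

With the package itself, the corresponding call would be `AutoScore_rank(train_set, validation_set = validation_set, method = "auc")`.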

Details

The first step in the AutoScore framework is variable ranking. By default, random forest (RF), an ensemble machine learning algorithm, is used to identify the top-ranking predictors for subsequent score generation. This step corresponds to Module 1 in the AutoScore paper (Xie et al., 2020).
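
As a rough illustration of random-forest variable ranking, the sketch below fits a forest on synthetic data and sorts the predictors by mean decrease in Gini impurity. It assumes the `randomForest` package is installed; the data and the names `signal`, `noise_a`, `noise_b`, and `toy_train` are hypothetical, and the importance measure used inside `AutoScore_rank` may differ.

```r
# Illustrative sketch only; AutoScore_rank may differ internally.
library(randomForest)

set.seed(1)
n <- 500
signal <- rnorm(n)   # hypothetical informative predictor
noise_a <- rnorm(n)  # hypothetical uninformative predictors
noise_b <- rnorm(n)
# The outcome must be a factor for a classification forest
label <- factor(rbinom(n, 1, plogis(2 * signal)))
toy_train <- data.frame(label, signal, noise_a, noise_b)

# Fit the forest with the same number of trees as the documented default
rf_fit <- randomForest(label ~ ., data = toy_train, ntree = 100)

# type = 2 selects the node-impurity measure (mean decrease in Gini)
gini <- importance(rf_fit, type = 2)[, "MeanDecreaseGini"]

# Predictors ordered from most to least important
rf_ranking <- sort(gini, decreasing = TRUE)
print(rf_ranking)
```

Sorting in decreasing order puts the strongest predictors first, which is the form needed when the top-ranked variables are later selected for score generation.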

Value

Returns a vector containing the variables and their ranking, generated by the selected method (random forest by default, or AUC-based when `method = "auc"`).

References

Breiman L. Random Forests. Machine Learning 2001;45(1):5-32.
Xie F, Chakraborty B, Ong MEH, Goldstein BA, Liu N. AutoScore: A Machine Learning-Based Automatic Clinical Score Generator and Its Application to Mortality Prediction Using Electronic Health Records. JMIR Medical Informatics 2020;8(10):e21798.

Examples

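# See the AutoScore Guidebook for the whole 5-step workflow
data("sample_data")
# Rename the outcome column to "label"
names(sample_data)[names(sample_data) == "Mortality_inpatient"] <- "label"
# Rank variables with the default method (random forest), using 50 trees
ranking <- AutoScore_rank(sample_data, ntree = 50)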

AutoScore documentation built on Oct. 16, 2022, 1:06 a.m.