rf_reg.by_datasets: rf_reg.by_datasets

View source: R/ranger_crossRF_util.R

rf_reg.by_datasets R Documentation

rf_reg.by_datasets

Description

Runs standard random forests with out-of-bag (OOB) estimation for regression of c_category in each of the sub-datasets split by s_category, and applies each model to all the other datasets. The output includes accuracy, AUC and Kappa statistics.
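The split, train and cross-apply workflow described above can be sketched directly with ranger. This is a minimal illustration under stated assumptions, not the package's implementation: the toy data, the object names (`dat`, `meta`, `fits`, `mae`) and the choice of mean absolute error as the summary are all hypothetical.

```r
library(ranger)  # assumed to be installed

set.seed(1)
# Hypothetical toy data: 60 samples, 5 features, two datasets "A" and "B"
dat <- data.frame(matrix(rnorm(60 * 5), nrow = 60))
meta <- data.frame(group = factor(rep(c("A", "B"), each = 30)))
dat$age <- 10 + 3 * dat$X1 + rnorm(60)  # numeric response for regression

groups <- levels(meta$group)

# Train one regression forest per dataset; ranger keeps OOB predictions
fits <- lapply(groups, function(g) {
  ranger(age ~ ., data = dat[meta$group == g, ], num.trees = 500)
})
names(fits) <- groups

# Mean absolute error of each model (rows) on each dataset (columns).
# On its own training dataset a model is scored with its OOB predictions.
mae <- sapply(groups, function(test_g) {
  sapply(groups, function(train_g) {
    test_idx <- meta$group == test_g
    fit <- fits[[train_g]]
    pred <- if (train_g == test_g) {
      fit$predictions
    } else {
      predict(fit, data = dat[test_idx, ])$predictions
    }
    mean(abs(pred - dat$age[test_idx]))
  })
})
mae
```

The diagonal of `mae` holds the within-dataset OOB error and the off-diagonal entries show how well a model trained on one dataset transfers to another.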

Usage

rf_reg.by_datasets(
  df,
  metadata,
  s_category,
  c_category,
  nfolds = 5,
  rf_imp_pvalues = FALSE,
  verbose = FALSE,
  ntree = 500
)

Arguments

df

Training data: a data.frame.

metadata

Sample metadata with at least two columns.

s_category

A string naming a column in the sample metadata: a 'factor' that defines the sample grouping for data splitting.

c_category

A string naming a column in the sample metadata: a numeric variable used as the response for rf regression in each of the split datasets.

nfolds

The number of folds in the cross validation.

rf_imp_pvalues

A boolean value indicating whether to compute both the importance score and the p-value for each feature.

verbose

Show computation status and estimated runtime.

ntree

The number of trees.

Value

...

Author(s)

Shi Huang

See Also

ranger

Examples

# Simulated count table: 120 samples (rows) by 5 features (columns)
df <- data.frame(rbind(t(rmultinom(14, 14*5, c(.21,.6,.12,.38,.099))),
            t(rmultinom(16, 16*5, c(.001,.6,.42,.58,.299))),
            t(rmultinom(30, 30*5, c(.011,.6,.22,.28,.289))),
            t(rmultinom(30, 30*5, c(.091,.6,.32,.18,.209))),
            t(rmultinom(30, 30*5, c(.001,.6,.42,.58,.299)))))
# A second simulated table (not used in the call below)
df0 <- data.frame(t(rmultinom(120, 600, c(.001,.6,.2,.3,.299))))
# Sample metadata: one row per sample in df
metadata <- data.frame(f_s=factor(c(rep("A", 60), rep("B", 60))),
                       f_s1=factor(c(rep(TRUE, 60), rep(FALSE, 60))),
                       f_c=factor(c(rep("C", 30), rep("H", 30), rep("D", 30), rep("P", 30))),
                       age=c(1:60, 2:61)
                       )

# Split samples by 'f_s' and regress 'age' within each sub-dataset
reg_res <- rf_reg.by_datasets(df, metadata, nfolds=5, s_category='f_s', c_category='age')
reg_res

shihuang047/crossRanger documentation built on Feb. 7, 2023, 10:03 p.m.