fit_rfc: Hidden genome random forest classifier (rfc)
In c7rishi/hidgenclassifier: Functions for Bayesian hierarchical hidden genome classifier

fit_rfc

R Documentation

Hidden genome random forest classifier (rfc)

Description

Hidden genome random forest classifier (rfc)

Usage

fit_rfc(
  X,
  Y,
  backend = "ranger",
  tune = TRUE,
  mtry = NULL,
  n_mtry = 6,
  max.depth = c(0, 10^(-4:1)),
  num.trees = 1000,
  ...
)

fit_rf(
  X,
  Y,
  backend = "ranger",
  tune = TRUE,
  mtry = NULL,
  n_mtry = 6,
  max.depth = c(0, 10^(-4:1)),
  num.trees = 1000,
  ...
)

Arguments

`X`	design matrix with observations in rows and predictors in columns. For a typical hidden genome classifier, each row represents a tumor, and the columns hold binary 1/0 presence/absence indicators of raw variants, counts of mutations at specific genes, counts of mutations corresponding to specific mutation signatures, etc., possibly normalized by some function of the tumor's total mutation burden.
`Y`	character vector or factor denoting the cancer type of tumors whose mutation profiles are listed across the rows of `X`.
`backend`	Backend to use. Available options are "ranger" and "randomForest", corresponding to the respective R packages. NOTE: randomForest does not support sparseMatrix input, so the predictor matrix is coerced into an ordinary (dense) matrix. Using randomForest will therefore likely be more memory intensive, and hence slower, than ranger. NOTE: ranger and randomForest must be installed separately.
`tune`	logical. Tune the random forest hyperparameters? Only used if backend = "ranger". Defaults to TRUE. If TRUE, a list of models is trained with various mtry and num.trees parameters, and the fitted model with the minimum out-of-bag (OOB) prediction error is returned.
`...`	additional arguments passed to ranger::ranger or randomForest::randomForest (depending on backend).
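
The memory note under `backend` can be illustrated with a small sketch. It assumes the Matrix package is available and uses a simulated 1/0 indicator matrix; the dimensions and density are made up for illustration and are not taken from the package's data.

```r
# Illustration only: compare the memory footprint of a sparse 1/0 design
# matrix with its dense coercion (the form randomForest would need).
library(Matrix)

set.seed(1)
# Simulated presence/absence indicators: 1000 tumors x 2000 variants,
# about 1% of entries equal to 1 (hypothetical sizes).
X_sparse <- rsparsematrix(
  nrow = 1000,
  ncol = 2000,
  density = 0.01,
  rand.x = function(n) rep(1, n)
)

# Dense coercion stores every zero explicitly.
X_dense <- as.matrix(X_sparse)

size_sparse <- as.numeric(object.size(X_sparse))
size_dense <- as.numeric(object.size(X_dense))

print(format(object.size(X_sparse), units = "MB"))
print(format(object.size(X_dense), units = "MB"))
```

The gap grows with the number of columns, which is why a sparse-aware backend is usually the more practical choice for large variant design matrices.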

Details

Light wrapper around randomForest or ranger for use in hidden genome classification.
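
As a rough picture of what selecting a model by out-of-bag error can look like, here is a minimal sketch that calls ranger directly. It is not the package's internal code: the `iris` data, the `mtry_grid` values, and the number of trees are assumptions chosen only to keep the example small.

```r
# Hypothetical sketch: fit one ranger forest per candidate mtry and keep
# the fit with the smallest out-of-bag (OOB) prediction error.
stopifnot(requireNamespace("ranger", quietly = TRUE))

set.seed(42)
X <- as.matrix(iris[, 1:4]) # stand-in predictor matrix
Y <- iris$Species           # stand-in class labels (factor)

mtry_grid <- 1:4 # assumed candidate values, for illustration

fits <- lapply(mtry_grid, function(m) {
  ranger::ranger(x = X, y = Y, mtry = m, num.trees = 500)
})

# For classification forests, prediction.error is the OOB
# misclassification rate.
oob_err <- vapply(fits, function(f) f$prediction.error, numeric(1))
best_fit <- fits[[which.min(oob_err)]]

print(data.frame(mtry = mtry_grid, oob_error = oob_err))
```

Because the OOB error is computed from trees that did not see a given observation, this selection needs no separate validation split.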

Examples

data("impact")
top_v <- variant_screen_mi(
  maf = impact,
  variant_col = "Variant",
  cancer_col = "CANCER_SITE",
  sample_id_col = "patient_id",
  mi_rank_thresh = 50,
  return_prob_mi = FALSE
)
var_design <- extract_design(
  maf = impact,
  variant_col = "Variant",
  sample_id_col = "patient_id",
  variant_subset = top_v
)

canc_resp <- extract_cancer_response(
  maf = impact,
  cancer_col = "CANCER_SITE",
  sample_id_col = "patient_id"
)
pid <- names(canc_resp)
# create five stratified random folds
# based on the response cancer categories
set.seed(42)
folds <- data.table::data.table(
  resp = canc_resp
)[,
  foldid := sample(rep(1:5, length.out = .N)),
  by = resp
]$foldid

# 80%-20% stratified separation of training and
# test set tumors
idx_train <- pid[folds != 5]
idx_test <- pid[folds == 5]


## Not run: 
# train a classifier on the training set
# using only variants (will have low accuracy
# -- no meta-feature information used)
fit0 <- fit_rfc(
  X = var_design[idx_train, ],
  Y = canc_resp[idx_train],
  tune = FALSE
)

pred0 <- predict_rfc(
  fit = fit0,
  Xnew = var_design[idx_test, ]
)

## End(Not run)

c7rishi/hidgenclassifier documentation built on June 14, 2024, 11:10 a.m.
