fit_smlc: Hidden genome sparse multinomial logistic classifier (smlc)
In c7rishi/hidgenclassifier: Functions for Bayesian hierarchical hidden genome classifier

fit_smlc

R Documentation

Hidden genome sparse multinomial logistic classifier (smlc)

Description

Hidden genome sparse multinomial logistic classifier (smlc)

Usage

fit_smlc(X, Y, grouped = TRUE, alpha = 1, normalize_rows = NULL, ...)

fit_mlogit(X, Y, grouped = TRUE, alpha = 1, normalize_rows = NULL, ...)

Arguments

`X`	design matrix with observations in rows and predictors in columns. In a typical hidden genome classifier, each row represents a tumor, and the columns hold binary (1/0) presence/absence indicators of raw variants, counts of mutations in specific genes, counts of mutations attributed to specific mutation signatures, and so on, each possibly normalized by some function of the tumor's total mutation burden.
`Y`	character vector or factor denoting the cancer type of tumors whose mutation profiles are listed across the rows of `X`.
`grouped`	logical. Use group-lasso penalty instead of the ordinary lasso penalty? Defaults to TRUE.
`alpha`	the elastic-net mixing parameter, passed to `cv.glmnet`. Defaults to 1 (the lasso penalty).
`normalize_rows`	vector of the same length as `nrow(X)` to be used to normalize the rows of `X`. If NULL (default), no normalization is performed.
`...`	additional arguments passed to `cv.glmnet`.

Value

Returns a list containing the fitted cv.glmnet object, the original X and Y, the estimated intercept vector alpha, and the estimated regression coefficient matrix beta.

Note

The function is a light wrapper around cv.glmnet with family = "multinomial", and with type.multinomial = "grouped" if grouped = TRUE. cv.glmnet tunes the sparsity hyper-parameter by cross-validation. By default, fit_smlc uses 10-fold cross-validation, matching the cv.glmnet default (change this by supplying nfolds through ...). Unlike cv.glmnet, which assigns folds by a simple random partition, fit_smlc uses a stratified random partition based on the categories of Y. To override the stratified partition, supply foldid through the ... argument. In addition, fit_smlc sets maxit = 1e6 and trace.it = TRUE by default (the glmnet default is maxit = 1e5).
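
The stratified partition described in this note can be illustrated with a small base-R sketch. This is not the package's internal code, only one plausible way to build such a fold assignment; the toy labels in `y` and the variable names are assumptions made for illustration.

```r
# Toy response vector standing in for Y (labels are made up for illustration)
set.seed(1)
y <- factor(rep(c("breast", "lung", "colon"), times = c(50, 30, 20)))

nfolds <- 10

# Within each class, deal out fold labels 1..nfolds as evenly as possible
# and shuffle them, so every fold receives a similar share of each class.
foldid <- integer(length(y))
for (cls in levels(y)) {
  idx <- which(y == cls)
  foldid[idx] <- sample(rep(seq_len(nfolds), length.out = length(idx)))
}

# Rows are classes, columns are folds; counts within a row differ by at most 1.
table(y, foldid)

# Hypothetical usage: a custom assignment like this could be supplied
# through `...`, e.g. fit_smlc(X = X, Y = y, foldid = foldid)
```

A simple random partition, by contrast, can leave a rare cancer type absent from some folds, which is the situation a stratified assignment is meant to avoid.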

Examples

data("impact")
top_v <- variant_screen_mi(
  maf = impact,
  variant_col = "Variant",
  cancer_col = "CANCER_SITE",
  sample_id_col = "patient_id",
  mi_rank_thresh = 50,
  return_prob_mi = FALSE
)
var_design <- extract_design(
  maf = impact,
  variant_col = "Variant",
  sample_id_col = "patient_id",
  variant_subset = top_v
)

canc_resp <- extract_cancer_response(
  maf = impact,
  cancer_col = "CANCER_SITE",
  sample_id_col = "patient_id"
)
pid <- names(canc_resp)
# create five stratified random folds
# based on the response cancer categories
set.seed(42)
folds <- data.table::data.table(
  resp = canc_resp
)[,
  foldid := sample(rep(1:5, length.out = .N)),
  by = resp
]$foldid

# 80%-20% stratified separation of training and
# test set tumors
idx_train <- pid[folds != 5]
idx_test <- pid[folds == 5]

# train a classifier on the training set
# using only variants (will have low accuracy
# because no meta-feature information is used)
fit0 <- fit_mlogit(
  X = var_design[idx_train, ],
  Y = canc_resp[idx_train]
)

pred0 <- predict_mlogit(
  fit = fit0,
  Xnew = var_design[idx_test, ]
)

c7rishi/hidgenclassifier documentation built on June 14, 2024, 11:10 a.m.
