README.md

cdsrmodels

This package contains modeling and biomarker analysis functions created by the Cancer Data Science team.

Install

library(devtools)
devtools::install_github("broadinstitute/cdsr_models")

NOTE: If you have issues installing the WGCNA package, on which cdsr_models depends, updating to R >4.0 can fix them. If you don't want to update to R >4.0, contact William for a workaround.

The package can then be loaded by calling:

library(cdsrmodels)

Modeling functions

discrete_test

Compares binary features, such as lineage and mutation, by running a two-class comparison of the mean response between cell lines with the feature and those without it. Run on feature matrix X and response vector y.

cdsrmodels::discrete_test(X, y)
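
To make the idea of a two-class comparison concrete, here is a minimal base R sketch on toy data. It is not the implementation used by discrete_test: the feature names, the helper two_class, and the choice of a Welch t-test are assumptions made only for illustration.

```r
# Toy data: 20 cell lines and 3 binary features (hypothetical names).
set.seed(1)
n <- 20
X <- cbind(
  lineage_lung = rep(c(1, 0), each = 10),
  mut_KRAS     = rep(c(1, 0), times = 10),
  mut_TP53     = c(rep(1, 5), rep(0, 15))
)
rownames(X) <- paste0("cell_line_", seq_len(n))

# Simulated response: lines with lineage_lung have a lower response.
y <- rnorm(n) - 2 * X[, "lineage_lung"]
names(y) <- rownames(X)

# Hypothetical helper: difference in mean response between lines with
# and without one feature, plus a Welch t-test p-value (an assumption;
# the package may use a different test).
two_class <- function(feature, response) {
  with_feature    <- response[feature == 1]
  without_feature <- response[feature == 0]
  tt <- t.test(with_feature, without_feature)
  c(effect_size = mean(with_feature) - mean(without_feature),
    p_value     = tt$p.value)
}

# One row per feature, with columns effect_size and p_value.
res <- t(apply(X, 2, two_class, response = y))
print(res)
```

In this toy setup only lineage_lung was simulated with a real effect, so it should show the largest difference in means.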

lin_associations

Compares continuous features, such as gene expression, by calculating the correlation between the response and each feature. Run on feature matrix A, response vector y, and an optional matrix of confounders W. Other parameters can also be tuned and are explained in the function documentation.

cdsrmodels::lin_associations(A, y, W=NULL)
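
As a rough illustration of a per-feature correlation analysis, the base R sketch below correlates each column of a toy matrix with a simulated response. It is not how lin_associations is implemented: the gene names are hypothetical, the confounder matrix W is left out, and the use of Pearson correlation with cor.test is an assumption.

```r
# Toy data: 30 cell lines and 4 continuous features (hypothetical names).
set.seed(2)
n <- 30
A <- matrix(rnorm(n * 4), nrow = n,
            dimnames = list(paste0("cell_line_", seq_len(n)),
                            paste0("gene_", 1:4)))

# Simulated response driven by gene_1 plus noise.
y <- 1.5 * A[, "gene_1"] + rnorm(n, sd = 0.5)

# Pearson correlation of each feature column with the response.
rho <- cor(A, y)[, 1]

# Per-feature p-value from a standard correlation test.
p_val <- apply(A, 2, function(a) cor.test(a, y)$p.value)

# Collect the results and rank features by p-value.
res <- data.frame(feature = colnames(A), rho = rho, p_value = p_val)
res <- res[order(res$p_value), ]
print(res)
```

Because only gene_1 drives the simulated response, it should rank first in the sorted table.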

random_forest

Fits a random forest to a feature matrix X and a response vector y, returning estimates of variable importance for each feature as well as model-level statistics such as R-squared. Other parameters can also be tuned and are explained in the function documentation.

cdsrmodels::random_forest(X, y)

random_forest_gauss

Analogous to random_forest, but uses gausscov for feature selection in each fold.

cdsrmodels::random_forest_gauss(X, y)


broadinstitute/cdsr_models documentation built on Aug. 9, 2022, 10:36 a.m.