citrus.endpointRegress: Regress against an experimental endpoint

View source: R/citrus.model.R

citrus.endpointRegress    R Documentation

Regress against an experimental endpoint

Description

Regress cluster properties against an experimental endpoint of interest. Models are fit on the supplied features and constrained by regularization thresholds (glmnet and pamr) or by false discovery rate, FDR (sam). Stratifying features are returned along with their corresponding cluster IDs.

Usage

citrus.endpointRegress(modelType, citrus.foldFeatureSet, labels, family, ...)

Arguments

modelType

Method to be used for model fitting. Valid options are glmnet, pamr, and sam.

citrus.foldFeatureSet

A citrus.foldFeatureSet object.

labels

Vector of endpoint values for analyzed samples.

family

Family of model to fit. Valid options are classification and continuous.

...

Other parameters passed to model-fitting methods.
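
As a rough illustration of how labels and family pair up, the sketch below defines a hypothetical helper, chooseFamily, which is not part of citrus. It assumes that factor or character labels call for classification and numeric labels call for continuous; that mapping is inferred from the option names, not taken from the package source.

```r
# Hypothetical helper, NOT part of citrus: guess a `family` value from labels.
# Assumption: categorical labels -> "classification", numeric -> "continuous".
chooseFamily <- function(labels) {
  if (is.factor(labels) || is.character(labels)) {
    "classification"
  } else if (is.numeric(labels)) {
    "continuous"
  } else {
    stop("Unsupported label type: ", class(labels)[1])
  }
}

# Illustrative labels only
groupLabels <- factor(rep(c("Healthy", "Diseased"), each = 10))
doseLabels  <- c(0.1, 0.5, 1.2, 3.4)

chooseFamily(groupLabels)  # "classification"
chooseFamily(doseLabels)   # "continuous"
```

The returned string could then be passed as the family argument of citrus.endpointRegress.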

Details

If independent clusterings are run (i.e., citrus.clusterAndMapFolds is run with nFolds > 1), models are fit on the feature set calculated for each clustering fold, and final regularization thresholds are selected by predicting endpoint values for left-out samples whose data were mapped to the existing cluster space. If a single clustering is run (i.e., citrus.clusterAndMapFolds is run with nFolds = 1), cross-validation is used to select final regularization thresholds based on features derived from a clustering of all samples. Regardless of how the regularization thresholds are selected, the reported features come from the final model, which is constructed from all features and constrained by the identified optimal regularization thresholds.
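
The idea of choosing a regularization threshold from cross-validated error rates can be sketched in base R. This is a minimal illustration, not the citrus implementation: the vectors thresholds, cvError, and cvSE are hypothetical names holding made-up numbers, and the one-standard-error rule shown alongside the minimum is a common convention assumed here for illustration.

```r
# Illustrative values only; these are NOT output from citrus.
thresholds <- c(0.0, 0.5, 1.0, 1.5, 2.0, 2.5)        # candidate thresholds
cvError    <- c(0.30, 0.20, 0.10, 0.15, 0.25, 0.50)  # mean CV error rate
cvSE       <- c(0.05, 0.05, 0.06, 0.05, 0.06, 0.07)  # standard error of each mean

# Threshold with the lowest cross-validated error
minIndex <- which.min(cvError)

# One-standard-error rule (assumed convention): the most regularized
# threshold whose error is within one SE of the minimum error
errorBound <- cvError[minIndex] + cvSE[minIndex]
oneSEIndex <- max(which(cvError <= errorBound))

thresholds[minIndex]    # 1.0
thresholds[oneSEIndex]  # 1.5
```

A final model would then be refit on all features at the selected threshold, mirroring the last step described above.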

Value

A citrus.regression object with the following properties:

regularizationThresholds

Regularization thresholds used to constrain all constructed models. Not applicable for sam models.

foldModels

A citrus.endpointModel constructed from each independent fold feature set. NULL if nFolds = 1.

finalModel

A citrus.endpointModel constructed from features derived from the clustering of all samples together.

thresholdCVRates

Matrix containing the average error rate and standard error of models at each regularization threshold. The FDR is also reported where possible.

cvMinima

Values and indices of pre-selected cross-validation error-rate thresholds.

differentialFeatures

Non-zero model features and corresponding clusters from the finalModel constrained by cvMinima.

modelType

Type of model fit on data.

family

Family of regression model.

labels

Endpoint labels of analyzed samples.

Author(s)

Robert Bruggner

Examples

# Load the citrus package
library(citrus)

# Where the data lives
dataDirectory = file.path(system.file(package = "citrus"),"extdata","example1")

# Create list of files to be analyzed
fileList = data.frame("unstim"=list.files(dataDirectory,pattern="\\.fcs$"))

# Read the data
citrus.combinedFCSSet = citrus.readFCSSet(dataDirectory,fileList)

# List of columns to be used for clustering
clusteringColumns = c("Red","Blue")

# List disease group of each sample
labels = factor(rep(c("Healthy","Diseased"),each=10))

# Cluster data
citrus.foldClustering = citrus.clusterAndMapFolds(citrus.combinedFCSSet,clusteringColumns,labels,nFolds=4)

# Build abundance features
citrus.foldFeatureSet = citrus.calculateFoldFeatureSet(citrus.foldClustering,citrus.combinedFCSSet)

# Endpoint regress
citrus.regressionResult = citrus.endpointRegress(modelType="pamr",citrus.foldFeatureSet,labels,family="classification")

nolanlab/citrus documentation built on April 19, 2024, 6:49 p.m.