pathway_tControl: Calculate pathway-specific Student's t-scores from a null...

View source: R/superPC_pathway_tControl.R

pathway_tControl    R Documentation

Calculate pathway-specific Student's t-scores from a null distribution for supervised PCA

Description

Parametrically resample the response vector before model analysis. Then extract principal components (PCs) from the gene pathway, and return the test statistics associated with the first numPCs principal components at a set of threshold values based on the resampled values of the response.

Usage

pathway_tControl(
  pathway_vec,
  geneArray_df,
  response_mat,
  responseType = c("survival", "regression", "categorical"),
  n.threshold = 20,
  numPCs = 1,
  min.features = 3
)

Arguments

pathway_vec

A character vector of the measured -Omes in the chosen gene pathway. These should match a subset of the rownames of the gene array.

geneArray_df

A "tall" pathway data frame (p × N). Each subject or tissue sample is a column, and the rows are the -Ome measurements for that sample.

response_mat

A response matrix corresponding to responseType. For "regression" and "categorical", this will be an N × 1 matrix of response values. For "survival", this will be an N × 2 matrix with event times in the first column and the observed event indicator in the second.

responseType

A character string. Options are "survival", "regression", and "categorical".

n.threshold

The number of bins into which to split the feature scores in the fit object returned internally by the superpc.train function.

numPCs

The number of PCs to extract from the pathway.

min.features

The smallest number of genes allowed in each pathway. This argument must be kept constant across all calls to this function that use the same pathway list. Defaults to 3.
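
The shapes described in the arguments above can be illustrated with toy inputs. This is a minimal base-R sketch: the gene names, sample names, and response values are invented for illustration and do not come from the package's data sets.

```r
set.seed(1)  # reproducible toy values
p <- 6       # number of measured features (rows)
N <- 4       # number of samples (columns)

# "Tall" p x N data frame: one row per feature, one column per sample
geneArray_df <- as.data.frame(
  matrix(
    rnorm(p * N),
    nrow = p,
    dimnames = list(paste0("gene", seq_len(p)), paste0("sample", seq_len(N)))
  )
)

# The pathway vector should be a subset of the row names of the gene array
pathway_vec <- c("gene2", "gene4", "gene5")
all(pathway_vec %in% rownames(geneArray_df))

# Survival response: N x 2 matrix, event times first, event indicator second
survResp_mat <- cbind(time = c(12, 30, 7, 45), event = c(1, 0, 1, 1))

# Regression response: N x 1 matrix
regResp_mat <- matrix(c(2.1, 3.4, 1.8, 4.0), ncol = 1)
```

In each case the number of rows of the response matrix equals the number of columns (samples) of the gene array.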

Details

This is a wrapper function to call superpc.train and superpc.st after response parametric bootstrapping with the RandomControlSample suite of functions. This response sampling will act as a null distribution against which to compare the results from the pathway_tScores function.
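
The idea of a parametric null can be sketched in base R. The example below is a rough illustration only, not the package's internal code: it assumes a normal model for a regression response (the distribution actually used by RandomControlSample is not stated here), uses invented toy data, and omits the feature-score thresholding performed by superpc.train and superpc.st.

```r
set.seed(42)  # reproducible toy values

# Toy regression response for 8 samples (invented values)
response <- c(2.1, 3.4, 1.8, 4.0, 2.9, 3.3, 2.5, 3.8)

# Parametric resample: draw new responses from a normal distribution fitted
# to the observed response (ASSUMPTION: normal model, for illustration only)
nullResponse <- rnorm(
  length(response), mean = mean(response), sd = sd(response)
)

# Toy pathway data: 8 samples (rows) by 5 genes (columns)
genes_mat <- matrix(rnorm(8 * 5), nrow = 8)

# First principal component of the pathway
pc1 <- prcomp(genes_mat, center = TRUE, scale. = TRUE)$x[, 1]

# t-statistic of the PC against the resampled (null) response
fit <- lm(nullResponse ~ pc1)
tStat <- summary(fit)$coefficients["pc1", "t value"]
tStat
```

Because the resampled response carries no real association with the pathway, repeating this over many pathways gives a reference distribution of t-statistics under the null.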

This wrapper is designed to facilitate apply calls (in parallel or serially) of these two functions over a list of gene pathways. When numPCs is equal to 1, we recommend using a simplify-style apply variant, such as sapply (documented on the lapply help page) or parSapply (documented on the clusterApply help page), then transposing the resulting matrix.
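
The sapply-then-transpose pattern can be sketched as follows. To keep the example self-contained, mock_tControl is a hypothetical stand-in that returns random numbers in the same numPCs × n.threshold shape; it is not part of the package and computes no real t-statistics. The pathway names and gene names are also invented.

```r
# Hypothetical stand-in for pathway_tControl(): returns a numPCs x n.threshold
# matrix of random values, NOT real t-statistics
mock_tControl <- function(pathway_vec, n.threshold = 20, numPCs = 1) {
  matrix(rnorm(numPCs * n.threshold), nrow = numPCs, ncol = n.threshold)
}

# Toy pathway list (invented names)
pathways_ls <- list(
  pathA = c("gene1", "gene2", "gene3"),
  pathB = c("gene2", "gene4", "gene5", "gene6"),
  pathC = c("gene1", "gene5", "gene7")
)

# With numPCs = 1, sapply() simplifies the results to an
# n.threshold x (number of pathways) matrix; t() puts pathways in rows
tControl_mat <- t(sapply(pathways_ls, mock_tControl))
dim(tControl_mat)
```

After transposing, each row corresponds to one pathway and each column to one threshold level. For a parallel run, parSapply from the parallel package takes a cluster object as its first argument and could replace sapply here.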

Value

A matrix with numPCs rows and n.threshold columns. The matrix values are model t-statistics for each PC included (rows) at each threshold level (columns).

See Also

pathway_tScores; RandomControlSample; superpc.train; superpc.st

Examples

  # DO NOT CALL THIS FUNCTION DIRECTLY.
  # Use SuperPCA_pVals() instead

## Not run: 
  data("colon_pathwayCollection")
  data("colonSurv_df")

  colon_OmicsSurv <- CreateOmics(
    assayData_df = colonSurv_df[, -(2:3)],
    pathwayCollection_ls = colon_pathwayCollection,
    response = colonSurv_df[, 1:3],
    respType = "surv"
  )

  asthmaGenes_char <-
    getTrimPathwayCollection(colon_OmicsSurv)[["KEGG_ASTHMA"]]$IDs
  resp_mat <- matrix(
    c(getEventTime(colon_OmicsSurv), getEvent(colon_OmicsSurv)),
    ncol = 2
  )

  pathway_tControl(
    pathway_vec = asthmaGenes_char,
    geneArray_df = t(getAssay(colon_OmicsSurv)),
    response_mat = resp_mat,
    responseType = "survival"
  )

## End(Not run)


gabrielodom/pathwayPCA documentation built on July 10, 2023, 3:32 a.m.