PathwaytValues: Calculate pathway-specific Student's t-scores from a null...

View source: R/superPC_pathway_tValues.R

PathwaytValues    R Documentation

Calculate pathway-specific Student's t-scores from a null distribution or the true distribution for supervised PCA

Description

If we sample from the null distribution, first parametrically resample the response vector before model analysis (if we calculate Student's t-statistics from the true distribution instead, the response matrix is untouched). Then extract principal components (PCs) from the gene pathway, and return the test statistics associated with the first numPCs principal components at a set of threshold values based on the values of the parametrically resampled response (for the null distribution) or the response itself (for the true distribution).

Usage

PathwaytValues(
  pathway_vec,
  geneArray_df,
  response_mat,
  responseType = c("survival", "regression", "categorical"),
  control = FALSE,
  n.threshold = 20,
  numPCs = 1,
  min.features = 3
)

Arguments

pathway_vec

A character vector of the measured -Omes in the chosen gene pathway. These should match a subset of the rownames of the gene array.

geneArray_df

A "tall" pathway data frame (p × N). Each subject or tissue sample is a column, and the rows are the -Ome measurements for that sample.

response_mat

A response matrix corresponding to responseType. For "regression" and "categorical", this will be an N × 1 factor matrix of response values. For "survival", this will be an N × 2 matrix with event times in the first column and the observed event indicator in the second. You can create a factor matrix of a factor a with the command dim(a) <- c(k, 1), where k = length(a).

responseType

A character string. Options are "survival", "regression", and "categorical".

control

Should the responses be parametrically resampled to generate a control distribution? Defaults to FALSE.

n.threshold

The number of bins into which to split the feature scores in the fit object returned internally by the superpc.train function.

numPCs

The number of PCs to extract from the pathway.

min.features

What is the smallest number of genes allowed in each pathway? This argument must be kept constant across all calls to this function which use the same pathway list. Defaults to 3.
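
As a minimal sketch of the input shapes described above, the following base R code builds a toy "tall" data frame and the two kinds of response matrix. The gene names, sample names, and values are invented for illustration and are not part of pathwayPCA.

```r
# Toy "tall" gene array: p = 4 genes (rows) by N = 5 samples (columns).
set.seed(1)
geneArray_df <- as.data.frame(
  matrix(
    rnorm(20),
    nrow = 4,
    dimnames = list(paste0("gene", 1:4), paste0("sample", 1:5))
  )
)

# A pathway is a character vector matching a subset of the row names.
pathway_vec <- c("gene2", "gene4")
pathway_df <- geneArray_df[rownames(geneArray_df) %in% pathway_vec, ]

# Survival response: N x 2 matrix, event times first, event indicator second.
surv_mat <- cbind(
  time  = c(5.1, 3.2, 8.7, 1.4, 6.0),
  event = c(1, 0, 1, 1, 0)
)

# Categorical response: N x 1 factor matrix, built with dim<- as described
# in the response_mat argument.
a <- factor(c("case", "control", "case", "case", "control"))
dim(a) <- c(length(a), 1)
```

Note that the number of columns of the gene array and the number of rows of the response matrix are both N, so the samples must be in the same order in both objects.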

Details

This is a wrapper function to call superpc.train and superpc.st. This wrapper is designed to facilitate apply calls (in parallel or serially) of these two functions over a list of gene pathways. When numPCs is equal to 1, we recommend using a simplify-style apply variant, such as sapply (documented with lapply) or parSapply (documented with clusterApply), then transposing the resulting matrix.
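
The apply-then-transpose pattern recommended above can be sketched with base R alone. Here mock_tValues() is a hypothetical stand-in for PathwaytValues() with numPCs = 1, and the pathway list is invented; only the shapes of the results are meant to be illustrative.

```r
# Hypothetical stand-in for PathwaytValues() with numPCs = 1: returns a
# 1 x n.threshold matrix. The values are fake; only the shape matters here.
mock_tValues <- function(pathway_vec, n.threshold = 20) {
  matrix(length(pathway_vec) * seq_len(n.threshold), nrow = 1)
}

# Toy pathway list (names and gene IDs are invented for illustration).
pathways_ls <- list(
  pathwayA = c("gene1", "gene2", "gene3"),
  pathwayB = c("gene2", "gene4", "gene5", "gene6")
)

# sapply() simplifies the per-pathway results to an
# n.threshold x (number of pathways) matrix, so transpose it to get one
# row per pathway and one column per threshold.
tScores_mat <- t(sapply(pathways_ls, mock_tValues, n.threshold = 20))
dim(tScores_mat)
```

For a parallel run, the same call shape should carry over to parallel::parSapply(cl, pathways_ls, ...) with a cluster object cl, followed by the same transpose.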

If control = TRUE, the RandomControlSample suite of functions first parametrically bootstraps the response. This control response will be used to construct a null distribution against which to compare the results calculated with the original response values.
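
To illustrate the idea of a parametric bootstrap of the response, here is a minimal sketch for a continuous (regression) response under an assumed normal model. The helper resample_normal() is hypothetical and the data are invented; the RandomControlSample functions may use different distributions or options, particularly for survival and categorical responses.

```r
# Hypothetical helper (not the pathwayPCA implementation): parametric
# resampling of a continuous response under an assumed normal model.
resample_normal <- function(response_vec) {
  rnorm(
    n    = length(response_vec),
    mean = mean(response_vec),
    sd   = sd(response_vec)
  )
}

set.seed(123)
response_vec <- c(2.1, 3.4, 1.9, 4.8, 3.3, 2.7)  # invented values
control_vec  <- resample_normal(response_vec)

# The control response has the same length as the original, but its values
# are fresh draws from the fitted distribution, so any association with the
# gene measurements is broken.
length(control_vec) == length(response_vec)
```

Because the resampled values are generated independently of the assay data, test statistics computed from them approximate what would be seen if the pathway had no relationship with the response.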

Value

If control = TRUE, a matrix with numPCs rows and n.threshold columns. The matrix values are model t-statistics for each PC included (rows) at each threshold level (columns).

If control = FALSE, the same matrix as above is contained as the tscor element of a list (the first element). The other list elements are PCs_mat (the matrix of PCs) and loadings (the matrix of -Ome loadings corresponding to the PCs).
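
The columns of the t-statistic matrix correspond to threshold levels on the feature scores. As a rough sketch of how n.threshold and min.features could interact, the code below places evenly spaced thresholds over absolute feature scores and caps the largest one so that a few features always remain. The helper make_thresholds() and the score values are hypothetical, and this grid construction is an assumption for illustration rather than something this page confirms about superpc.train or superpc.st.

```r
# Hypothetical helper: evenly spaced thresholds over absolute feature scores.
# The largest threshold is capped at a quantile chosen so that roughly
# min.features features still pass it. This is an assumption about the
# internals, not the package's code.
make_thresholds <- function(scores, n.threshold = 20, min.features = 3) {
  p <- length(scores)
  lower <- min(abs(scores))
  upper <- quantile(abs(scores), probs = 1 - min.features / p, names = FALSE)
  seq(from = lower, to = upper, length.out = n.threshold)
}

set.seed(42)
scores <- rnorm(50)  # invented feature scores for a 50-gene pathway
thresholds <- make_thresholds(scores)

# Number of features retained at each threshold (non-increasing).
n_kept <- vapply(
  thresholds,
  function(th) sum(abs(scores) >= th),
  integer(1)
)
```

Under this construction the first threshold retains every feature, and each later threshold retains the same number or fewer, which is why a pathway needs at least min.features genes for every column of the matrix to be computable.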

See Also

pathway_tScores; pathway_tControl; RandomControlSample; superpc.train; superpc.st

Examples

  # DO NOT CALL THIS FUNCTION DIRECTLY.
  # Use SuperPCA_pVals() instead

## Not run: 
  data("colon_pathwayCollection")
  data("colonSurv_df")

  colon_OmicsSurv <- CreateOmics(
    assayData_df = colonSurv_df[, -(2:3)],
    pathwayCollection_ls = colon_pathwayCollection,
    response = colonSurv_df[, 1:3],
    respType = "surv"
  )

  asthmaGenes_char <-
    getTrimPathwayCollection(colon_OmicsSurv)[["KEGG_ASTHMA"]]$IDs
  resp_mat <- matrix(
    c(getEventTime(colon_OmicsSurv), getEvent(colon_OmicsSurv)),
    ncol = 2
  )

  PathwaytValues(
    pathway_vec = asthmaGenes_char,
    geneArray_df = t(getAssay(colon_OmicsSurv)),
    response_mat = resp_mat,
    responseType = "survival"
  )

  PathwaytValues(
    pathway_vec = asthmaGenes_char,
    geneArray_df = t(getAssay(colon_OmicsSurv)),
    response_mat = resp_mat,
    responseType = "survival",
    control = TRUE
  )

## End(Not run)


gabrielodom/pathwayPCA documentation built on July 10, 2023, 3:32 a.m.