pants: Pathway analysis via network smoothing (Pants)


View source: R/pants.R

Description

Pants algorithm to test if scores of features (i.e. analytes such as a gene, protein, or metabolite) in a pathway or those connected to the pathway in an interaction network are greater than randomized ones. Allows for testing group differences (limma_contrasts) with contrast.v & design; correlation (limma_cor) with design; or mediation (hitman) with exposure & covariates.

Usage

pants(object, Gmat, phenotype = NULL, type = c("contrasts",
  "correlation", "mediation"), contrast.v = NULL, design = NULL,
  exposure = NULL, covariates = NULL, ker = NULL, annot.df = NULL,
  ntop = 25, score_fcn = abs, nperm = 10^4 - 1,
  ret.pwy.dfs = FALSE, ret.null.mats = FALSE, min.nfeats = 3,
  ncores = 1, name = NA, seed = 0)

Arguments

object

Matrix-like data object containing log-ratios or log-expression values, with rows corresponding to features (e.g. genes) and columns to samples. Must have rownames that are non-duplicated and non-empty.

Gmat

Binary feature (e.g. gene) by pathway inclusion matrix, indicating which features are in which pathways.
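A minimal sketch of building such an inclusion matrix in base R, assuming pathways are available as a named list of feature IDs. The names pwy.lst and feats and the feature IDs are hypothetical, and pants may expect a particular matrix class that this sketch does not address.

```r
# Hypothetical pathways, each a character vector of feature IDs.
pwy.lst <- list(pwy1 = c("A", "B", "C"), pwy2 = c("C", "D"))
feats <- sort(unique(unlist(pwy.lst)))

# Rows are features, columns are pathways; 1 marks membership, 0 absence.
Gmat <- sapply(pwy.lst, function(p) as.integer(feats %in% p))
rownames(Gmat) <- feats
Gmat
```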

phenotype

Vector of sample characteristics (correlation: numeric; contrasts: character). Should be same length as ncol(object).

type

Type of ezlimma analysis per feature; must be one of "contrasts" (limma_contrasts), "correlation" (limma_cor), or "mediation" (hitman). You can specify just the initial letter.

contrast.v

Named vector of contrasts, passed to makeContrasts.

design

Design matrix of the experiment, with rows corresponding to samples and columns to coefficients to be estimated.
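As an illustration of how phenotype, design, and contrast.v might fit together for a two-group comparison, here is a sketch in base R. The group labels and the contrast name are hypothetical, and whether pants needs design when phenotype is supplied is not stated here, so treat the combination as an assumption.

```r
# Hypothetical character phenotype with one entry per sample (column of object).
pheno <- c("ctrl", "ctrl", "ctrl", "treat", "treat", "treat")
grp <- factor(pheno)

# Group-means design: one column per group, rows correspond to samples.
design <- model.matrix(~ 0 + grp)
colnames(design) <- levels(grp)

# Named vector of contrasts written in terms of the design's column names.
contrast.v <- c(treat_vs_ctrl = "treat - ctrl")
```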

exposure

Numeric vector or matrix of exposures. Ignored if type!="mediation".

covariates

Numeric vector with one element per sample or matrix-like object with rows corresponding to samples and columns to covariates to be adjusted for.

ker

Laplacian kernel matrix representing the interaction network.
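For intuition, the sketch below builds one common kernel derived from the graph Laplacian, the regularized Laplacian kernel (I + a*L)^-1, on a toy network. Whether pants expects exactly this form, and the value of a, are assumptions; the network and all object names are hypothetical.

```r
# Hypothetical undirected chain network A - B - C - D (symmetric adjacency).
A <- matrix(0, 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))
A["A", "B"] <- A["B", "A"] <- 1
A["B", "C"] <- A["C", "B"] <- 1
A["C", "D"] <- A["D", "C"] <- 1

# Graph Laplacian: degree matrix minus adjacency matrix.
L <- diag(rowSums(A)) - A

# Regularized Laplacian kernel; `a` controls how far scores spread (assumed value).
a <- 1
ker <- solve(diag(nrow(L)) + a * L)
dimnames(ker) <- dimnames(A)
round(ker, 3)
```

Each row of this kernel sums to one, so multiplying it by a vector of feature scores spreads each score over network neighbors without changing the total.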

annot.df

Table of feature annotations that are appended to feature statistics.

ntop

Number of top features that most impact a pathway to include.

score_fcn

Function that transforms the t-statistics from the contrasts into a non-negative value. Its input must be a vector of the same length as the number of elements in contrast.v (usually one). Its output must be a non-negative scalar. Ignored if type is "mediation".
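Two functions that satisfy this input and output requirement are sketched below. score_multi is a hypothetical example for the multi-contrast case, not a function supplied by the package.

```r
# One contrast: the default, abs, maps a single t-statistic to its magnitude.
score_one <- abs
score_one(-2.3)

# Several contrasts: collapse the vector of t-statistics to one
# non-negative scalar, here the largest magnitude (hypothetical choice).
score_multi <- function(tv) max(abs(tv))
score_multi(c(-2.3, 1.1, 0.4))
```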

nperm

Number of sample permutations to evaluate significance of pathways.

ret.pwy.dfs

Logical; return list of data frames written out to CSVs?

ret.null.mats

Logical; return matrices with null distributions for features and pathways?

min.nfeats

Minimum number of features (e.g. genes) needed in a gene set for testing.

ncores

Integer. If > 1, number of cores to use for parallel computing. You can detect how many are available for your system using detectCores.
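A small sketch of choosing a value for ncores with detectCores from the parallel package. Leaving one core free is a convention assumed here, not a requirement of pants.

```r
# detectCores() may return NA when the number of cores cannot be determined.
n.avail <- parallel::detectCores()

# Leave one core free; fall back to a single core if detection fails.
ncores <- if (is.na(n.avail)) 1L else max(1L, n.avail - 1L)
ncores
```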

name

Name for the folder and Excel file that get written. Set to NA to avoid writing output.

seed

Integer seed to set for reproducibility.

Details

Without mediation, the elements of phenotype are permuted, since this properly permutes the mapping between object and phenotype; the columns of object could equivalently be permuted. With mediation, because object is tested for its association with both phenotype and exposure, colnames(object) are permuted instead, which offers more available permutations.

Features in the kernel but not in the data are assigned a score of zero by default, for sparsity. Scores for features and pathways are compared to null scores, which are generated by permuting the columns of object and rerunning the algorithm. The resulting feature statistics are those returned in feature.stats.
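The comparison of an observed score to its null scores can be sketched as follows. The formulas are a common convention for permutation tests and are assumed here; they are not taken from the pants source, and the observed score and null scores are simulated stand-ins.

```r
set.seed(0)
nperm <- 999
obs.score <- 3.1                   # hypothetical observed score
null.scores <- abs(rnorm(nperm))   # stand-in for scores from permuted data

# Add 1 to numerator and denominator so the observed arrangement counts as
# one permutation and the p-value can never be exactly zero.
p <- (sum(null.scores >= obs.score) + 1) / (nperm + 1)

# One-sided z-score, so that larger is more significant.
z <- qnorm(p, lower.tail = FALSE)
c(p = p, z = z)
```

Under this convention the smallest attainable p-value is 1 / (nperm + 1), which is why significance estimated by permutation is bounded by the number of permutations.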

For makeCluster, the cluster type depends on the OS, which is tested in the body of the function using .Platform$OS.type.
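A sketch of selecting a cluster type from .Platform$OS.type. The mapping used here (fork-based on Unix-alikes, socket-based elsewhere) is an assumption about typical practice and may differ from what pants does internally.

```r
# Assumed mapping: "FORK" clusters are unavailable on Windows, so use "PSOCK" there.
cl.type <- if (.Platform$OS.type == "unix") "FORK" else "PSOCK"

# Start two workers, run a trivial job, and always release the workers.
cl <- parallel::makeCluster(2, type = cl.type)
res <- parallel::parLapply(cl, 1:4, function(i) i^2)
parallel::stopCluster(cl)
unlist(res)
```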

If !is.na(name), an Excel file with "_pants.xlsx" appended to name gets written out with links to CSVs containing the statistics and annotation of the features that most affect each pathway's score. The annotation (and possibly other statistics) comes from annot.df. Additionally, the CSVs indicate whether each feature is in the pathway, and include an impact column describing the impact of each feature on the pathway's score.

Since a pathway's score is calculated in pants, impact uses the feature statistics calculated in pants by comparison to permutations. The feature statistics from ezlimma and those from pants are nearly identical, though; the main difference is that pants feature significances are limited by the number of permutations, so they flatten near the extreme. The features with the largest-magnitude impact are selected and can be visualized with ezlimmaplot::plot_pwy. These features may increase or decrease a pathway's score.

Value

List of at least two data frames:

pwy.stats

A data frame with columns

nfeatures

number of features in the pathway.

score

only returned if ret.null.mats is TRUE; pathway score (larger is more significant) to compare to null.pwy.mat

z

pathway permutation z-score (larger is more significant)

p

pathway permutation p-value

FDR

pathway FDR calculated from p-values with p.adjust(p, method="BH")
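The FDR column corresponds to a Benjamini-Hochberg adjustment, which can be reproduced on any vector of p-values. The p-values below are hypothetical.

```r
# Hypothetical pathway permutation p-values.
p <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# Benjamini-Hochberg adjustment, as named in the FDR column description.
FDR <- p.adjust(p, method = "BH")
data.frame(p = p, FDR = FDR)
```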

feature.stats

A data frame with columns

score

without mediation, feature's score from applying score_fcn to moderated t-statistics; or with mediation, parametric z-score from hitman's mediation p-value.

z

feature non-parametric z-score (larger is more significant) from comparing score vs. this feature's scores in permutations (before smoothing)

p

feature's non-parametric permutation p-value

FDR

feature's non-parametric FDR from permutation p

If ret.pwy.dfs is TRUE:

pwy.dfs

List of data frames written out to CSVs

And if ret.null.mats is TRUE:

null.feature.mat

Matrix with features as rows and permutations as columns, where each element represents the score of that feature in that permutation

null.pwy.mat

Matrix with pathways as rows and permutations as columns, where each element represents the score of that pathway in that permutation

sample.perms

Matrix with samples as rows and permutations as columns, where each element represents the index of the sample simulated to represent the sample in the row in that permutation
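A matrix with the same layout as sample.perms can be sketched in base R as below. How pants handles the seed and draws permutations internally is not reproduced here; the sizes are hypothetical.

```r
set.seed(0)
nsamp <- 6   # hypothetical number of samples
nperm <- 4   # hypothetical number of permutations

# Each column is one permutation of the sample indices 1..nsamp, so element
# [i, j] is the index of the sample standing in for sample i in permutation j.
sample.perms <- replicate(nperm, sample(nsamp))
dim(sample.perms)
```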

Examples

# A workflow is described in the vignette; instructions to view the vignette are in the README.

jdreyf/PANTS documentation built on July 18, 2019, 10:12 a.m.