
View source: R/compareContrasts.R

ct.compareContrasts    R Documentation

Identify Replicated Signals in Pooled Screens Using Conditional Scoring

Description

This function identifies signals that are present in one or more screening experiment contrasts using a conditional strategy. Specifically, it identifies all significant signals (according to user definitions) in a set of provided results data.frames and returns a 'simplifiedResult' data.frame derived from the first provided contrast, with an appended logical column indicating whether there is evidence for signal replication in the other provided results data.frames.

Signals are considered replicated if they cross the specified stringent threshold (default: Q = 0.1) in one or more of the provided contrasts, and are similarly enriched or depleted at the relaxed threshold (default: P = 0.1) in all of the remaining contrasts. If a single contrast is provided, all signals crossing the stringent threshold are considered replicated.
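
The replication rule above can be illustrated with a small base R sketch. This is not the package's implementation: the matrices, the helper name 'is_replicated', and the use of '<=' at the thresholds are assumptions made for illustration, and only one direction (enrichment) is shown.

```r
# Toy illustration of the conditional rule (hypothetical values, not package code).
# Rows are targets, columns are contrasts; both matrices describe enrichment only.
genes <- c("gene1", "gene2", "gene3", "gene4")
q_enrich <- cbind(screenA = c(0.01, 0.50, 0.04, 0.90),
                  screenB = c(0.30, 0.60, 0.02, 0.95))
p_enrich <- cbind(screenA = c(0.001, 0.20, 0.004, 0.70),
                  screenB = c(0.050, 0.30, 0.001, 0.08))
rownames(q_enrich) <- rownames(p_enrich) <- genes

# Hypothetical helper: a signal replicates if it passes the stringent cutoff in
# at least one contrast and passes the relaxed cutoff in every other contrast.
is_replicated <- function(q, p, stringent = 0.1, relaxed = 0.1) {
  pass_stringent <- q <= stringent
  pass_relaxed <- p <= relaxed
  rowSums(pass_stringent) >= 1 &
    rowSums(pass_stringent | pass_relaxed) == ncol(q)
}

replicated <- is_replicated(q_enrich, p_enrich)
print(replicated)
```

In this toy input, gene1 is stringent in screenA and passes the relaxed cutoff in screenB, gene3 is stringent in both, and gene2 and gene4 never cross the stringent cutoff, so only gene1 and gene3 are flagged.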

Signals are compared across screens on the basis of ct.regularizeContrasts, so users must provide an identifier with which to standardize targets ('geneID' by default).

Usage

ct.compareContrasts(
  dflist,
  statistics = c("best.q", "best.p"),
  cutoffs = c(0.1, 0.1),
  same.dir = rep(TRUE, length(dflist)),
  return.stats = FALSE,
  nperm = 10000,
  ...
)

Arguments

dflist

A list of (possibly simplified) results data.frames produced by ct.generateResults.

statistics

Statistics to use to define congruence; may be a single value, but is internally coerced to a vector of length 2 in which the first value corresponds to the stringent cutoff and the second value to the relaxed cutoff. Each value must be 'best.p' or 'best.q'.

cutoffs

Numeric value(s) corresponding to the significance cutoff(s) used to define stringent and relaxed values of 'statistics'. Internally coerced to a vector of length 2.

same.dir

Logical vector of the same length as 'dflist' indicating whether replicating signals are expected to go in the same direction (e.g., enrich/deplete in their respective screens). For example, a 'dflist' of length 3 could be specified as c(TRUE, TRUE, FALSE), indicating that replicating signals should be enriched in both of the first two contrasts and depleted in the third to be considered replicated (or vice versa). Default is 'rep(TRUE, length(dflist))'.

return.stats

When TRUE, return the permutation-based significance of the overlap instead of the logical vector.

nperm

Numeric value indicating the number of permutations to perform when 'return.stats' is TRUE (default: 10000).

...

Other arguments to 'ct.simpleResult()', especially 'collapse'.
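
As a rough illustration of how 'same.dir' can be read, the base R sketch below expands a direction observed in a reference contrast into the direction expected in each contrast. The helper 'expected_direction' is hypothetical and is not part of gCrisprTools; it assumes that TRUE means "same direction as the reference" and FALSE means "opposite direction".

```r
# Hypothetical helper (not a gCrisprTools function): given the direction of a
# signal in a reference contrast, list the direction each contrast must show
# for the signal to count as replicated under a given 'same.dir'.
expected_direction <- function(reference_direction, same.dir) {
  flip <- c(enrich = "deplete", deplete = "enrich")
  ifelse(same.dir, reference_direction, flip[[reference_direction]])
}

same.dir <- c(TRUE, TRUE, FALSE)
if_enriched <- expected_direction("enrich", same.dir)
if_depleted <- expected_direction("deplete", same.dir)
print(if_enriched)
print(if_depleted)
```

With c(TRUE, TRUE, FALSE), an enriched reference signal must be enriched in the first two contrasts and depleted in the third, and a depleted reference signal must show the mirror-image pattern.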

Value

If 'return.stats' is 'FALSE', returns the first contrast as a 'simplifiedResult' data.frame, with a 'replicated' logical column indicating whether each signal replicates in all of the provided screens according to the specified logic.

If 'return.stats' is 'TRUE', returns a data.frame of permutation-based test statistics summarizing the evidence that signals replicate across the provided contrasts more often than expected by chance (for enriched signals, depleted signals, and both together).
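
The general idea of a permutation-based overlap test can be sketched in base R as follows. This is a generic illustration under simplifying assumptions (two contrasts, simulated hit labels, labels shuffled uniformly across targets); the null model actually used by ct.compareContrasts may differ.

```r
# Generic permutation test for hit overlap between two screens (toy data;
# not the package's implementation).
set.seed(1)
n_targets <- 1000
hits_a <- seq_len(n_targets) %in% 1:100               # 100 hits in screen A
hits_b <- seq_len(n_targets) %in% c(1:40, 501:560)    # 100 hits in screen B

# Observed number of targets called as hits in both screens.
observed <- sum(hits_a & hits_b)

# Null distribution: shuffle the hit labels of screen B and recount the overlap.
nperm <- 2000
null_overlap <- replicate(nperm, sum(hits_a & sample(hits_b)))

# One-sided empirical p-value with the usual +1 correction.
p_value <- (sum(null_overlap >= observed) + 1) / (nperm + 1)
print(observed)
print(p_value)
```

A larger 'nperm' gives a finer-grained p-value at the cost of run time, because the smallest attainable value is 1 / (nperm + 1).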

Author(s)

Russell Bainer

Examples

data('resultsDF')
summary(ct.compareContrasts(list(resultsDF, resultsDF[1:5000,]))$replicated)
ct.compareContrasts(list(resultsDF, resultsDF[1:5000,]), return.stats = TRUE)

RussBainer/gCrisprTools documentation built on Nov. 5, 2022, 2:35 p.m.