False discovery rate and power for PWM score distributions

Description

Computes score cutoffs for a PWM or a PCM, given score distributions as calculated with computeScoreDist(). Cutoffs can be computed for a given false discovery rate (FDR), for a given false negative rate (FNR), and for the optimal tradeoff between the two, in the sense that c * FDR = FNR for a factor c that the user may choose.

Usage

scoreDistCutoffs(scoreDist, n, m = 1, c = 1, cutoff = 0.01)

Arguments

scoreDist

A ProfileDist object, as computed by computeScoreDist()

n

The number of scores considered for the given PWM. If one sequence is considered and a score is computed for all overlapping windows of the same length as the PWM, this will be the length of the sequence, minus the PWM length plus 1. If scanning a sequence and its reverse complement too, this number must be further multiplied by two. The number forms the basis for the FDR, since this is a multiple testing problem.
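
The arithmetic described above can be sketched in base R. The helper name n_scores and its arguments are hypothetical and not part of the package; the sketch only illustrates how a value for n might be derived under the stated assumptions (one sequence, all overlapping windows scored, optionally both strands).

```r
# Hypothetical helper (not part of the package): number of scores obtained
# when a PWM is slid over every overlapping window of a single sequence.
n_scores <- function(seq_length, pwm_length, both_strands = FALSE) {
  stopifnot(pwm_length >= 1, seq_length >= pwm_length)
  # One score per window: sequence length minus PWM length plus 1
  windows <- seq_length - pwm_length + 1
  # Scanning the reverse complement as well doubles the number of scores
  if (both_strands) 2 * windows else windows
}

n_scores(1000, 10)                       # 991
n_scores(1000, 10, both_strands = TRUE)  # 1982
```

The resulting number would then be passed as the n argument of scoreDistCutoffs().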

m

The number of true positives assumed for computing the FNR.

c

A factor expressing how much more important the FDR is compared to the FNR, when computing the tradeoff cutoff that considers both FDR and FNR. See Rahmann et al. for details.

cutoff

The target rate used for both the FDR cutoff and the FNR cutoff, typically 0.01 or 0.05.

Value

A list with elements:

cutoffa

Score cutoff for FDR=cutoff

cutoffb

Score cutoff for FNR=cutoff

cutoffopt

Score cutoff for c*FDR = FNR

References

Rahmann, S., Mueller, T., and Vingron, M. (2003). On the power of profiles for transcription factor binding site detection. Stat Appl Genet Mol Biol 2, Article 7.

Examples

data(INR)
thedist <- computeScoreDist(regularizeMatrix(INR), 0.5)
scoreDistCutoffs(thedist, n=2000, cutoff=0.05)