FastScreenP: Using p-values to screen and filter the data set
In markvdwiel/ShrinkBayes: Bayesian analysis of high-dimensional omics data, either Gaussian or counts

FastScreenP

R Documentation

Using p-values to screen and filter the data set

Description

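Convenience function that returns the indices of features whose (adjusted) p-values fall below a threshold. Restricting the expensive computations to these features speeds them up when the data set is very large.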

Usage

FastScreenP(pvalues, method = "padjust", threshold = ifelse(method == "pvalue", 0.1, 0.5), adjmethod = "BH")

Arguments

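`pvalues`	Numeric vector of p-values.
`method`	Character string, either "padjust" (screen on adjusted p-values) or "pvalue" (screen on raw p-values). Default: "padjust".
`threshold`	Numeric. Upper threshold on the (adjusted) p-value for reporting indices. Default: 0.1 if `method = "pvalue"`, otherwise 0.5.
`adjmethod`	Character string. P-value adjustment method passed to `p.adjust`. Default: "BH".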

Details

This function is particularly convenient when you have access to a method that computes p-values very quickly. For features with large p-values, you may opt not to compute posteriors. Be aware that such filtering should take place only after all priors have been fit and fixed, to avoid biases. See the vignette for an example. `ScreenData` computes p-values for two-group and multi-group settings.
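As a rough illustration of a "very fast" p-value computation of the kind meant here, the following base-R sketch runs a row-wise Welch t-test for a two-group design. It is only a stand-in under stated assumptions: it is not the implementation of `ScreenData`, and the simulated matrix `y_demo` and the names `group_demo` and `pvals_demo` are hypothetical.

```r
# Hypothetical base-R stand-in for a fast two-group p-value computation.
# This is NOT ScreenData; it only illustrates the kind of input FastScreenP expects.
set.seed(1)
y_demo <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)  # 200 features, 6 samples
y_demo[1:20, 4:6] <- y_demo[1:20, 4:6] + 3              # shift the first 20 features in group 2
group_demo <- factor(c(1, 1, 1, 2, 2, 2))

# One Welch t-test per row (feature); returns a numeric vector of p-values
pvals_demo <- apply(y_demo, 1, function(row) t.test(row ~ group_demo)$p.value)
```

The resulting vector has one p-value per feature, in row order, which is the form taken by the `pvalues` argument.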

Value

Integer vector containing the indices of all features with (adjusted) p-values below the threshold.
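The screening step can be approximated in base R as shown below. This is a minimal sketch based only on the documented arguments, not the package source: the function name `screen_sketch` is hypothetical, and comparing with `<=` rather than `<` is an assumption.

```r
# Hypothetical re-implementation for illustration; not the ShrinkBayes source code.
screen_sketch <- function(pvalues, method = "padjust",
                          threshold = ifelse(method == "pvalue", 0.1, 0.5),
                          adjmethod = "BH") {
  # Screen on raw p-values, or on p-values adjusted with stats::p.adjust
  p <- if (method == "pvalue") pvalues else p.adjust(pvalues, method = adjmethod)
  # Indices of features passing the screen (use of <= is an assumption)
  which(p <= threshold)
}

pv <- c(0.001, 0.02, 0.2, 0.6, 0.9)
idx_adj <- screen_sketch(pv)                     # BH-adjusted p-values, threshold 0.5
idx_raw <- screen_sketch(pv, method = "pvalue")  # raw p-values, threshold 0.1
```

With these toy p-values, the adjusted screen keeps the first three features and the raw screen keeps the first two.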

Note

Use with CARE. Do not use it to filter data for ShrinkSeq or ShrinkGauss, because this would introduce a bias.

Author(s)

Mark A. van de Wiel

Examples

# Simulation adapted from limma. Simulate Gaussian data for 1000 probes and 6 samples
# Samples are in two groups
# First fifty probes are differentially expressed in second group
# Std deviations vary with prior df=4
sd <- 0.3*sqrt(4/rchisq(1000,df=4))
y <- matrix(rnorm(1000*6,sd=sd),1000,6)
rownames(y) <- paste("Feature",1:1000)
y[1:50,4:6] <- y[1:50,4:6] + 2

group <- factor(c(1,1,1,2,2,2))

# Perform a t-test for each of the 1000 rows
pvals <- ScreenData(y, "group", np = FALSE, ncpus = 2)

# Screen on the basis of FDR (FDR <= 0.5). Since this is an initial screen, it is best to be liberal.
whichin <- FastScreenP(pvals)

# Run the ShrinkSeq, ShrinkGauss and UpdatePrior functions on the ENTIRE data set:
# for these functions, computing time does not depend strongly on the number of features.
# For FitAllShrink(), use the selected set of features:
## Not run: 
fit <- FitAllShrink(form=form,dat=y[whichin,],...)

## End(Not run)

markvdwiel/ShrinkBayes documentation built on March 27, 2022, 7:47 p.m.