View source: R/FastDeltaWeighted.R
This function takes the raw read count matrix from an RNA-Seq experiment, with rows corresponding to genes and
columns corresponding to samples/libraries, and conducts differential expression analysis based on the robust superdelta2 method. Whether to use the total number of mapped read counts per library as sample weights (the W
parameter) is left to the user.
superdelta2(mydata, offset = 1, Grps, W = NULL, trim = .2,
            adjp.thresh = 0.05, prop = 1.0)
mydata |
A data matrix with rows being genes and columns being samples. Note that this argument requires RAW read counts from RNA-seq gene expression data, not pre-processed data. |
offset |
A value to be added when taking logarithm to avoid log of zero. Default value is 1. Commonly used values include 0.5, 1, 2, and 3. |
Grps |
A character vector of length equal to the total number of samples in the data matrix, indicating group labels. This object is used to create the design matrix for further computation. |
W |
Sample weights to be used. Default is NULL, which means unweighted (equally weighted). Users may want to use per-sample total read counts as the weights; this has been shown to yield a marginal gain in some cases. |
trim |
Trimming proportion when removing the most extreme values of between-group sum of squares. Default is 0.2. |
adjp.thresh |
Cutoff of adjusted p-value to define significance. Default is 0.05. |
prop |
The proportion of reference genes used in the main (second round) spherical trimming for bias correction. The default value is 1.0, which means using all genes for bias correction. Using a small proportion can improve the computational speed, at the cost of slightly less accurate bias correction. |
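As a rough illustration of the offset and Grps arguments, the sketch below log-transforms a small made-up count matrix and builds a design matrix from group labels. The toy data, the base-2 logarithm, and the use of model.matrix are assumptions made for illustration only; the internals of superdelta2 may differ.

```r
# Toy raw-count matrix: 3 genes (rows) x 6 samples (columns); values are made up.
counts <- matrix(c( 0,  5, 12,  3,  0,  7,
                   20, 18, 25, 40, 38, 45,
                    1,  0,  2,  9, 11,  8),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("s", 1:6)))

offset <- 1
# Adding the offset keeps zero counts finite after the log (base 2 is assumed here).
logexpr <- log2(counts + offset)

# One group label per sample, in column order, as required for Grps.
Grps <- c("A", "A", "B", "B", "C", "C")

# A one-hot design matrix with one column per group (an assumed construction).
design <- model.matrix(~ 0 + factor(Grps))
```

With offset = 1, a zero count maps to log2(1) = 0 rather than negative infinity, which is why a strictly positive offset is needed.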
If not NULL, W must be a vector of positive per-sample weights of length equal to the number of columns (samples) in the data matrix. The program will automatically normalize the weights so that they sum to 1. We recommend using the total number of mapped read counts as per-sample weights.
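A minimal sketch of preparing such weights in base R is given below. The count matrix is made up, and column sums of the count matrix are used as a stand-in for the total number of mapped reads per library, which is an assumption; the normalization step only mirrors what the function is described as doing internally.

```r
# Toy raw-count matrix: 4 genes (rows) x 3 samples (columns); values are made up.
counts <- matrix(c(10,  20, 30,
                    0,   5, 15,
                   40,  25, 35,
                   50, 150, 20),
                 nrow = 4, byrow = TRUE)

# Total counts per sample, used here as a proxy for mapped reads per library.
W <- colSums(counts)

# Requirements stated above: one positive weight per sample.
stopifnot(length(W) == ncol(counts), all(W > 0))

# The normalization described above: rescale so the weights sum to 1.
W.norm <- W / sum(W)
```

Because the normalization is described as automatic, the unnormalized vector W could be passed directly, for example as superdelta2(mydata = counts, Grps = Groups, W = W) with a suitable Groups vector.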
The superdelta2 function will return a list with the following objects:
SigGenes |
A named vector of the significant gene IDs. Typically the names of this vector are inherited from the row names of the input data matrix. |
Fstats |
A vector of length equal to the number of genes in the data matrix indicating the super-delta F-statistic for each gene. |
pvalues.F |
Raw p-values obtained from the super-delta F-statistics. |
padj.F |
Benjamini adjusted p-values obtained from the super-delta F-statistics. |
Log2FC |
Log2 fold-change of post-hoc pairwise t-tests. Technically this is equal to the numerator of Tukey style t-statistics. |
tstats |
A matrix of row dimension equal to the number of genes in the input data matrix, providing the super-delta post-hoc Tukey t-statistics. |
pvalues.t |
Raw p-values obtained from the super-delta Tukey t-statistics. |
padj.t |
Benjamini adjusted p-values obtained from the super-delta Tukey t-statistics. |
WBGRMS |
Estimated weighted between-group mean square, namely the top part of F-statistics. |
WWGRMS |
Estimated weighted within-group mean square, namely the bottom part of F-statistics. |
sigma2hat |
Estimated sigma square (variance of the error terms). |
Rtrim |
Saved R object containing the information of the main trim. |
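To make the relationship between the mean squares, the F-statistics, and the raw and adjusted p-values concrete, the sketch below computes them for an ordinary, unweighted one-way ANOVA on simulated data. It is not the super-delta procedure: there is no internal normalization, trimming, or weighting, and reading "Benjamini adjusted" as the Benjamini-Hochberg method is an assumption. The helper one_gene is hypothetical.

```r
set.seed(1)
# Simulated log-expression: 100 genes x 9 samples in 3 groups (made-up data).
Grps <- rep(c("A", "B", "C"), each = 3)
x <- matrix(rnorm(100 * 9), nrow = 100)
x[1:5, Grps == "C"] <- x[1:5, Grps == "C"] + 4  # shift 5 genes in group C

# Ordinary one-way ANOVA F-test for a single gene (hypothetical helper).
one_gene <- function(y, g) {
  g <- factor(g)
  k <- nlevels(g)
  n <- length(y)
  grp.mean <- tapply(y, g, mean)
  grp.n <- tapply(y, g, length)
  # Between-group mean square: the numerator of the F-statistic.
  bgms <- sum(grp.n * (grp.mean - mean(y))^2) / (k - 1)
  # Within-group mean square: the denominator of the F-statistic.
  wgms <- sum((y - grp.mean[as.character(g)])^2) / (n - k)
  Fstat <- bgms / wgms
  c(Fstat = Fstat, pvalue = pf(Fstat, k - 1, n - k, lower.tail = FALSE))
}

# One row per gene, with columns "Fstat" and "pvalue".
res <- t(apply(x, 1, one_gene, g = Grps))

# Multiple-testing adjustment (Benjamini-Hochberg assumed) and thresholding,
# analogous to the adjp.thresh argument.
padj <- p.adjust(res[, "pvalue"], method = "BH")
sig <- which(padj < 0.05)
```

In this analogy, bgms and wgms play the roles of WBGRMS and WWGRMS, and sig plays the role of SigGenes.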
The superdelta2 function implements a general-purpose differential gene expression analysis procedure that can be used in a multiple-group (>=3) setting. It inherits the basic idea of superdelta (see References for details), with a robust internal normalization and an asymptotically unbiased estimator of the “oracle” between-group difference. It applies a robust trimming procedure to the estimated between-group sum of squares and uses all of the within-group sum of squares to produce the superdelta F-statistics. In addition, superdelta2 is accompanied by a Tukey-style pairwise comparison that yields post-hoc t-statistics.
Yuhang Liu, Xing Qiu, Jinfeng Zhang, and Zihan Cui
Liu, Y., Zhang, J., & Qiu, X. (2017). Super-delta: a new differential gene expression analysis procedure with robust data normalization. BMC Bioinformatics, 18(1), 582.
SIM1, SIM2, SIM3
## Load the sample data
data(SampleData)
## Number of genes and samples
ngenes <- 5000; n1 <- n2 <- n3 <- 50; ns <- c(n1, n2, n3)
## Group labels, one per sample
Groups <- c(rep("A", n1), rep("B", n2), rep("C", n3))
mod1 <- superdelta2(mydata = SIM1, offset = 1, Grps = Groups)
mod2 <- superdelta2(mydata = SIM2, offset = 1, Grps = Groups)
mod3 <- superdelta2(mydata = SIM3, offset = 1, Grps = Groups)