QL.fit | R Documentation |
Fit a quasi-likelihood model to RNA-seq expression count data using the methods detailed in Lund, Nettleton, McCarthy, and Smyth (2012).
QL.fit(counts, design.list, test.mat=cbind(1L, 2:length(design.list)), log.offset=NULL, Model="NegBin", print.progress=TRUE, NBdisp="trend", bias.fold.tolerance=1.10, ...)
counts |
RNA-seq data matrix of integer expression counts. Each row contains observations from a single gene. Each column contains observations from a single experimental unit. |
design.list |
List of design matrices for the full model and reduced model(s). The first element of |
test.mat |
T by 2 matrix dictating which designs are to be compared, where T is the total number of desired hypothesis tests for each gene. Each row contains two integers, which provide the indices within |
log.offset |
A vector of log-scale, additive factors used to adjust estimated log-scale means for differences in library sizes across samples. Commonly used offsets include |
Model |
Must be one of "Poisson" or "NegBin", specifying use of a quasi-Poisson or quasi-negative binomial model, respectively. |
print.progress |
logical. If |
NBdisp |
Used only when |
bias.fold.tolerance |
A numerical value no smaller than 1. If the bias reduction of maximum likelihood estimates of (log) fold change is likely to result in a ratio of fold changes greater than this value, then bias reduction will be performed on such genes. Setting |
... |
arguments to be passed to the function |
list containing:
"LRT" |
matrix providing unadjusted likelihood ratio test statistics. Each column contains statistics from a single hypothesis test, applied separately to each gene. |
"LRT.Bart" |
The same as |
"phi.hat.dev" |
vector providing unshrunken, deviance-based estimates of quasi-dispersion (phi) for each gene. |
"phi.hat.pearson" |
vector providing unshrunken, Pearson estimates of quasi-dispersion (phi) for each gene. |
"mn.cnt" |
vector providing average count for each gene. |
"den.df" |
denominator degrees of freedom. Equal to the number of samples minus the number of fitted parameters in the full model, which is specified by the first element of |
"num.df" |
vector of numerator degrees of freedom for each test, computed as the difference in the number of fitted parameters between the full and reduced models for each test. |
"Model" |
Either "Poisson" or "NegBin", specifying which model (quasi-Poisson or quasi-negative binomial, respectively) was used. |
"nb.disp" |
Only appears when |
fitted.values |
matrix of fitted mean values |
coefficients |
matrix of estimated coefficients for each gene. Note that these are given on the log scale. (i.e. intercept coefficient reports log(average count) and non-intercept coefficients report estimated (and bias reduced, when appropriate) log fold-changes.) |
Originally authored by Steve Lund lundsp@gmail.com; later modified by Long Qu rtistician@gmail.com
Kosmidis and Firth (2009) "Bias reduction in exponential family nonlinear models" Biometrika, 96, 793–804.
Lund, Nettleton, McCarthy and Smyth (2012) "Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates" SAGMB, 11(5).
McCarthy, Chen and Smyth (2012) "Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation" Nucleic Acids Res. 40(10), 4288–97.
Cordeiro (1983) "Improved Likelihood Ratio Statistics for Generalized Linear Models" Journal of the Royal Statistical Society. Series B (Methodological), 45, 404–413.
QL.results
, NBDev
, PoisDev
, mockRNASeqData
data(mockRNASeqData) ### Create list of designs describing model under null and alternative hypotheses design.list=list( mockRNASeqData$design.matrix, matrix(1, nrow=rep(mockRNASeqData$nsamples)) ) ## Not run: ### Analyze using QL, QLShrink and QLSpline methods applied to quasi-Poisson model fit = with(mockRNASeqData, QL.fit(counts, design.list, log.offset=log(estimated.normalization), Model="Poisson", bias.fold.tolerance=Inf) ) results = QL.results(fit) ### How many significant genes at FDR=.05 from QLSpline method? apply(results$Q.values[[3]]<.05,2,sum) ### Indexes for Top 10 most significant genes from QLSpline method head(order(results$P.values[[3]]), 10) ## End(Not run) ## Not run: ### Analyze using QL, QLShrink and QLSpline methods ### applied to quasi-negative binomial model fit2 = with(mockRNASeqData, QL.fit(counts, design.list, log.offset=log(estimated.normalization), Model="NegBin") ) ########################################################## ## Note: 'NBdisp' typically will not be specified when ## ## calling QL.fit while analyzing real data. Providing ## ## numeric values for 'NBdisp' prevents neg binomial ## ## dispersions from being estimated from the data. ## ########################################################## results2<-QL.results(fit2) ### How many significant genes at FDR=.05 for QLSpline method? apply(results2$Q.values[[3]]<.05,2,sum) ### Indexes for Top 10 most significant genes from QLShrink method head(order(results2$P.values[[2]]), 10) ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.