lengthNorm.limma.createRmd: Generate a '.Rmd' file containing code to perform...
In csoneson/compcodeR: RNAseq data simulation, differential expression analysis and performance comparison of differential expression methods

View source: R/generateRmdCodeDiffExpPhylo.R

lengthNorm.limma.createRmd

R Documentation

Generate a `.Rmd` file containing code to perform differential expression analysis with length normalized counts + limma

Description

A function to generate code that can be run to perform differential expression analysis of RNAseq data (comparing two conditions) by applying a length normalizing transformation followed by differential expression analysis with limma. The code is written to a .Rmd file. This function is generally not called by the user; the main interface for performing differential expression analysis is the runDiffExp function.

Usage

lengthNorm.limma.createRmd(
  data.path,
  result.path,
  codefile,
  norm.method,
  extra.design.covariates = NULL,
  length.normalization = "RPKM",
  data.transformation = "log2",
  trend = FALSE,
  block.factor = NULL
)

Arguments

`data.path`	The path to a .rds file containing the `phyloCompData` object that will be used for the differential expression analysis.
`result.path`	The path to the file where the result object will be saved.
`codefile`	The path to the file where the code will be written.
`norm.method`	The between-sample normalization method used to compensate for varying library sizes and composition in the differential expression analysis. The normalization factors are calculated using the `calcNormFactors` function of the `edgeR` package. Possible values are `"TMM"`, `"RLE"`, `"upperquartile"` and `"none"`.
`extra.design.covariates`	A vector containing the names of extra control variables to be passed to the design matrix of `limma`. Each covariate must be a column of the `sample.annotations` data frame from the `phyloCompData` object, with a matching column name. A covariate can be a numeric vector or a factor. Note that the "condition" factor column is always included and should not be added here. See Details.
`length.normalization`	One of "none" (no length correction), "TPM", or "RPKM" (default). See Details.
`data.transformation`	One of "log2", "asin(sqrt)" or "sqrt". Data transformation to apply to the normalized data.
`trend`	Should an intensity trend be allowed for the prior variance? Defaults to `FALSE`.
`block.factor`	Name of the factor specifying a blocking variable, to be passed to the `duplicateCorrelation` function of the `limma` package. The factor must be a column of the `sample.annotations` data frame from the `phyloCompData` object. Defaults to `NULL` (no block structure).

Details

For more information about the methods and the interpretation of the parameters, see the limma package and the corresponding publications.

The length.matrix field of the phyloCompData object is used to normalize the counts, using one of the following formulas:

length.normalization="none" : CPM_{gi} = \frac{N_{gi} + 0.5}{NF_i \times \sum_{g} N_{gi} + 1} \times 10^6
length.normalization="TPM" : TPM_{gi} = \frac{(N_{gi} + 0.5) / L_{gi}}{NF_i \times \sum_{g} N_{gi}/L_{gi} + 1} \times 10^6
length.normalization="RPKM" : RPKM_{gi} = \frac{(N_{gi} + 0.5) / L_{gi}}{NF_i \times \sum_{g} N_{gi} + 1} \times 10^9

where N_{gi} is the count for gene g in sample i, L_{gi} is the length of gene g in sample i, and NF_i is the normalization factor for sample i, computed using calcNormFactors from the edgeR package.
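The three formulas above can be sketched in base R as follows. This is a minimal illustration, not the code that the function writes to the .Rmd file: the helper name `normalize_counts`, its arguments, and the toy matrices are hypothetical, and the normalization factors are simply set to 1 instead of being computed with calcNormFactors.

```r
# Hypothetical helper (not part of compcodeR) applying the documented formulas.
# counts, lengths: genes x samples matrices; norm.factors: one value per sample.
normalize_counts <- function(counts, lengths, norm.factors,
                             length.normalization = c("RPKM", "TPM", "none")) {
  length.normalization <- match.arg(length.normalization)
  stopifnot(all(dim(counts) == dim(lengths)),
            length(norm.factors) == ncol(counts))
  num <- counts + 0.5
  switch(length.normalization,
    none = {
      denom <- norm.factors * colSums(counts) + 1
      sweep(num, 2, denom, "/") * 1e6
    },
    TPM = {
      denom <- norm.factors * colSums(counts / lengths) + 1
      sweep(num / lengths, 2, denom, "/") * 1e6
    },
    RPKM = {
      denom <- norm.factors * colSums(counts) + 1
      sweep(num / lengths, 2, denom, "/") * 1e9
    })
}

# Toy data: 3 genes (rows) x 2 samples (columns)
counts  <- matrix(c(10, 20, 30, 5, 15, 25), nrow = 3)
lengths <- matrix(c(1000, 2000, 500, 1000, 2000, 500), nrow = 3)
nf      <- c(1, 1)  # assumption: stands in for factors from calcNormFactors

rpkm <- normalize_counts(counts, lengths, nf, "RPKM")
tpm  <- normalize_counts(counts, lengths, nf, "TPM")
cpm  <- normalize_counts(counts, lengths, nf, "none")
```

Each result has the same dimensions as the count matrix, with one normalized value per gene and sample.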

The function specified by the data.transformation is then applied to the normalized count matrix.

The "+0.5" and "+1" offsets are taken from Law et al. (2014) and are dropped from the normalization when the transformation is anything other than log2.

The "\times 10^6" and "\times 10^9" factors are omitted when the asin(sqrt) transformation is used, as asin is only defined for real numbers between -1 and 1.
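As a sketch of how these rules interact, the snippet below applies them to the RPKM case only. The helper `transform_rpkm` and the toy data are hypothetical and written for illustration; the code generated by the function may be organized differently.

```r
# Hypothetical helper (not part of compcodeR), RPKM case only.
transform_rpkm <- function(counts, lengths, norm.factors,
                           data.transformation = c("log2", "asin(sqrt)", "sqrt")) {
  data.transformation <- match.arg(data.transformation)
  # The "+0.5" and "+1" offsets are only used with the log2 transformation
  use.offsets  <- data.transformation == "log2"
  offset.num   <- if (use.offsets) 0.5 else 0
  offset.denom <- if (use.offsets) 1 else 0
  # The 10^9 factor is omitted for asin(sqrt), which needs values in [0, 1]
  scaling <- if (data.transformation == "asin(sqrt)") 1 else 1e9

  num   <- (counts + offset.num) / lengths
  denom <- norm.factors * colSums(counts) + offset.denom
  x <- sweep(num, 2, denom, "/") * scaling

  switch(data.transformation,
         "log2"       = log2(x),
         "asin(sqrt)" = asin(sqrt(x)),
         "sqrt"       = sqrt(x))
}

# Toy data: 3 genes (rows) x 2 samples (columns)
counts  <- matrix(c(10, 20, 30, 5, 15, 25), nrow = 3)
lengths <- matrix(c(1000, 2000, 500, 1000, 2000, 500), nrow = 3)
nf      <- c(1, 1)  # assumption: stands in for factors from calcNormFactors

log.rpkm  <- transform_rpkm(counts, lengths, nf, "log2")
asin.rpkm <- transform_rpkm(counts, lengths, nf, "asin(sqrt)")
sqrt.rpkm <- transform_rpkm(counts, lengths, nf, "sqrt")
```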

The design model used in lmFit is built from the "condition" column of the sample.annotations data frame of the phyloCompData object, together with all the covariates named in extra.design.covariates. For example, if extra.design.covariates = c("var1", "var2"), then sample.annotations must have two columns named "var1" and "var2", and the design formula in the lmFit function will be: ~ condition + var1 + var2.
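To make the design formula concrete, the following base R sketch builds a design matrix of the form ~ condition + var1 + var2 from a toy data frame. The data frame `sample.annot` is a hypothetical stand-in for sample.annotations, and building the formula with `reformulate` is an assumption made for illustration, not necessarily how the generated code does it.

```r
# Toy stand-in for the sample.annotations data frame of a phyloCompData object
sample.annot <- data.frame(
  condition = factor(rep(1:2, each = 5)),
  var1      = factor(rep(1:2, times = 5)),   # a factor covariate
  var2      = seq(-1, 1, length.out = 10)    # a numeric covariate
)
extra.design.covariates <- c("var1", "var2")

# Check that every requested covariate is a column of the annotations
stopifnot(all(extra.design.covariates %in% colnames(sample.annot)))

# Build ~ condition + var1 + var2 from the covariate names
design.formula <- reformulate(c("condition", extra.design.covariates))
design <- model.matrix(design.formula, data = sample.annot)
colnames(design)
```

The resulting matrix has one row per sample, an intercept column, one column for the condition effect, and one column per extra covariate (more for factors with more than two levels).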

Value

The function generates a .Rmd file containing the code for performing the differential expression analysis. This file can be executed using e.g. the knitr package.
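As a rough illustration of executing a .Rmd file with knitr, the sketch below writes a tiny placeholder file and knits it. The file names and contents are hypothetical stand-ins for a generated code file, and the knitting step is skipped if knitr is not installed.

```r
# Hypothetical stand-in for a generated code file
rmd.file <- file.path(tempdir(), "example_code.Rmd")
fence <- strrep("`", 3)  # chunk delimiter, built here to avoid nesting fences
writeLines(c(paste0(fence, "{r}"), "x <- 1 + 1", "x", fence), rmd.file)

# Execute the code chunks and write the result to a Markdown file
md.file <- NULL
if (requireNamespace("knitr", quietly = TRUE)) {
  md.file <- knitr::knit(rmd.file,
                         output = file.path(tempdir(), "example_code.md"),
                         quiet = TRUE)
}
```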

Author(s)

Charlotte Soneson, Paul Bastide, Mélina Gallopin

References

Smyth GK (2005): Limma: linear models for microarray data. In: 'Bioinformatics and Computational Biology Solutions using R and Bioconductor'. R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W. Huber (eds), Springer, New York, pages 397-420

Smyth, G. K., Michaud, J., and Scott, H. (2005). The use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics 21(9), 2067-2075.

Law, C.W., Chen, Y., Shi, W. et al. (2014) voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol 15, R29.

Musser, JM, Wagner, GP. (2015): Character trees from transcriptome data: Origin and individuation of morphological characters and the so‐called “species signal”. J. Exp. Zool. (Mol. Dev. Evol.) 324B: 588–604.

Examples

try(
  if (require(limma)) {
    tmpdir <- normalizePath(tempdir(), winslash = "/")
    ## Simulate data
    mydata.obj <- generateSyntheticData(dataset = "mydata", n.vars = 1000,
                                        samples.per.cond = 5, n.diffexp = 100,
                                        id.species = factor(1:10),
                                        lengths.relmeans = rpois(1000, 1000),
                                        lengths.dispersions = rgamma(1000, 1, 1),
                                        output.file = file.path(tmpdir, "mydata.rds"))
    ## Add covariates
    ## Model fitted is count.matrix ~ condition + test_factor + test_reg
    sample.annotations(mydata.obj)$test_factor <- factor(rep(1:2, each = 5))
    sample.annotations(mydata.obj)$test_reg <- rnorm(10, 0, 1)
    saveRDS(mydata.obj, file.path(tmpdir, "mydata.rds"))
    ## Diff Exp
    runDiffExp(data.file = file.path(tmpdir, "mydata.rds"),
               result.extent = "length.limma",
               Rmdfunction = "lengthNorm.limma.createRmd",
               output.directory = tmpdir, norm.method = "TMM",
               extra.design.covariates = c("test_factor", "test_reg"))
  }
)

csoneson/compcodeR documentation built on Dec. 23, 2024, 10:42 a.m.
