readscount2deseq: Initialize a DESeq2 object from a raw read count matrix.


View source: R/transcriptome.R

Description

Initialize a DESeq2 object from a raw read count matrix.

Usage

readscount2deseq(
  count_matrix_file,
  sampleFile,
  design,
  covariate = NULL,
  filter = NULL,
  rundeseq = T
)

Arguments

count_matrix_file

A multi-column file. The first column holds gene names (which must be unique); each remaining column holds the raw read counts of the genes in one sample.

Gene untrt_N61311  untrt_N052611 ... trt_N61311  trt_N052611 ...
GeneA  2 3 ... 10  20  ...
GeneB  2 3 ... 100  220  ...
GeneC  12 33 ... 10  20  ...
GeneD  222 301 ... 10  20  ...
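
As a minimal sketch of the expected layout, the base R code below writes a small tab-separated count matrix to a temporary file, reads it back, and checks that the gene names are unique. The tab separator, the temporary file, and the abbreviated set of four samples are assumptions made for illustration; how readscount2deseq itself parses the file is not shown here.

```r
# Write a tiny count matrix to a temporary file (values are illustrative only).
count_matrix_file <- tempfile(fileext = ".txt")
writeLines(c(
  "Gene\tuntrt_N61311\tuntrt_N052611\ttrt_N61311\ttrt_N052611",
  "GeneA\t2\t3\t10\t20",
  "GeneB\t2\t3\t100\t220",
  "GeneC\t12\t33\t10\t20",
  "GeneD\t222\t301\t10\t20"
), count_matrix_file)

# Read the file; check.names = FALSE keeps the sample names exactly as written.
raw <- read.table(count_matrix_file, header = TRUE, sep = "\t",
                  check.names = FALSE, stringsAsFactors = FALSE)

# Gene names (first column) must be unique.
stopifnot(anyDuplicated(raw[[1]]) == 0)

# Move gene names to row names and keep only the count columns.
counts <- as.matrix(raw[, -1, drop = FALSE])
rownames(counts) <- raw[[1]]

# Every sample column should hold numeric read counts.
stopifnot(is.numeric(counts))
```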

sampleFile

A file with at least two columns. The first column holds the sample names, just like the first column of salmon_file_list. The other columns are sample attributes; normally one of them gives the group each sample belongs to.

A simple example (conditions holds the group information):

Samp    conditions
untrt_N61311    untrt
untrt_N052611    untrt
untrt_N080611    untrt
untrt_N061011    untrt
trt_N61311    trt
trt_N052611    trt
trt_N080611    trt
trt_N061011    trt

Another example (the third column marks samples from two batches):

Samp    conditions  batch
untrt_N61311    untrt A
untrt_N052611    untrt A
untrt_N080611    untrt B
untrt_N061011    untrt B
trt_N61311    trt A
trt_N052611    trt A
trt_N080611    trt B
trt_N061011    trt B
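
Before calling the function, it can help to confirm that the sample names in the sample file match the sample columns of the count matrix. The sketch below does this in base R. The tab separator and the count_columns vector (a stand-in for the column names of a count matrix) are assumptions, and whether readscount2deseq performs a similar check or reordering internally is not stated in this page.

```r
# Write the two-batch example to a temporary file.
sampleFile <- tempfile(fileext = ".txt")
writeLines(c(
  "Samp\tconditions\tbatch",
  "untrt_N61311\tuntrt\tA",
  "untrt_N052611\tuntrt\tA",
  "untrt_N080611\tuntrt\tB",
  "untrt_N061011\tuntrt\tB",
  "trt_N61311\ttrt\tA",
  "trt_N052611\ttrt\tA",
  "trt_N080611\ttrt\tB",
  "trt_N061011\ttrt\tB"
), sampleFile)

# Sample names become row names; attributes are read as factors.
sample_info <- read.table(sampleFile, header = TRUE, row.names = 1,
                          sep = "\t", stringsAsFactors = TRUE)

# Hypothetical sample column names of the count matrix, in file order.
count_columns <- c("trt_N61311", "trt_N052611", "trt_N080611", "trt_N061011",
                   "untrt_N61311", "untrt_N052611", "untrt_N080611",
                   "untrt_N061011")

# Both files must describe the same samples.
stopifnot(setequal(count_columns, rownames(sample_info)))

# Reorder the sample table so its rows follow the count matrix columns.
sample_info <- sample_info[count_columns, , drop = FALSE]
```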

design

A column name from sampleFile, such as "conditions" in the examples. It is used as the group variable for DE tests. Currently only a simple design is allowed. To model multiple variables, it may help to combine them into a single super variable, as described in https://support.bioconductor.org/p/67600/#67612.
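
One common way to build such a super variable is to paste the levels of two attributes into a single grouping factor. The sketch below shows the idea in base R; the genotype column, the sample names, and the new column name group are hypothetical and used only for illustration.

```r
# Hypothetical sample table with two variables of interest.
sample_info <- data.frame(
  conditions = c("untrt", "untrt", "trt", "trt"),
  genotype   = c("WT", "KO", "WT", "KO"),
  row.names  = c("S1", "S2", "S3", "S4")
)

# Combine both variables into one factor with a level per combination.
sample_info$group <- factor(
  paste(sample_info$genotype, sample_info$conditions, sep = "_")
)

levels(sample_info$group)
```

After writing the extended table back to the sample file, the combined column name ("group" in this sketch) would be passed as design.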

covariate

Names of columns holding information that may act as covariates, such as batch effects or other sample information. Supply multiple covariates as a vector.

filter

Filters out genes with low read counts. By default, genes whose total read count is lower than half the number of samples are filtered out. Any number can be given here; the default is normally fine. DESeq2 also filters automatically. See https://www.bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html.
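
The default rule described above can be sketched in base R as follows. This only illustrates the rule as stated (keep genes whose total count is at least half the number of samples) and is not taken from the package source; the count values are made up.

```r
# Small illustrative count matrix: 4 genes, 4 samples.
counts <- matrix(
  c( 0,  0,  1,  0,
     2,  3, 10, 20,
     0,  1,  0,  0,
    12, 33, 10, 20),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"),
                  c("untrt_1", "untrt_2", "trt_1", "trt_2"))
)

# Default threshold: half the number of samples.
threshold <- ncol(counts) / 2

# Keep genes whose total read count reaches the threshold.
keep <- rowSums(counts) >= threshold
filtered <- counts[keep, , drop = FALSE]

rownames(filtered)
```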

rundeseq

Default TRUE. The function performs the analysis with DESeq and returns the analyzed DESeqDataSet object. If FALSE, it returns the DESeqDataSet object without running the analysis, so that one can run DESeq on it with customized parameters.
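
A rough sketch of the rundeseq = FALSE workflow is given below. Because the input files are not available here, a simulated dataset from DESeq2's makeExampleDESeqDataSet stands in for the object that readscount2deseq would return; the choice of fitType = "local" is only an example of a customized parameter, and the code is skipped when DESeq2 is not installed.

```r
has_deseq2 <- requireNamespace("DESeq2", quietly = TRUE)

if (has_deseq2) {
  # Stand-in for the unanalyzed object; with real files one would call, e.g.,
  # dds <- readscount2deseq(count_matrix_file, sampleFile, "conditions",
  #                         rundeseq = FALSE)
  set.seed(1)
  dds <- DESeq2::makeExampleDESeqDataSet(n = 500, m = 8)

  # Run DESeq with a customized parameter instead of the defaults.
  dds <- DESeq2::DESeq(dds, fitType = "local", quiet = TRUE)

  # Extract the differential expression results table.
  res <- DESeq2::results(dds)
}
```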

Value

A DESeqDataSet object.

Examples

dds <- readscount2deseq(count_matrix_file, sampleFile, "conditions")

Tong-Chen/YSX documentation built on Jan. 25, 2021, 2:49 a.m.