readDGE: Read and Merge a Set of Files Containing Count Data



Description

Reads and merges a set of text files containing gene expression counts.


Usage

readDGE(files, path=NULL, columns=c(1,2), group=NULL, labels=NULL, ...)



Arguments

files
character vector of filenames, or a data.frame of sample information containing a column called files.

path
character string giving the directory containing the files. Defaults to the current working directory.

columns
numeric vector stating which columns of the input files contain the gene names and counts respectively.

group
optional vector or factor indicating the experimental group to which each file belongs.

labels
character vector giving short names to associate with the files. Defaults to the file names.

...
other arguments are passed to read.delim.


Details

Each file is assumed to contain digital gene expression data for one genomic sample or count library, with gene identifiers in the first column and counts in the second column by default (see the columns argument). Gene identifiers are assumed to be unique within each file. The function creates a combined table of counts with rows for genes and columns for samples. A count of zero is entered for any gene that is absent from a given sample's file.
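The merge described above can be sketched in base R. This is an illustrative re-implementation under simple assumptions, not edgeR's actual code; the gene names, counts, and object names (sampleA, sampleB, tables) are made up for the example.

```r
# Two hypothetical per-sample tables: gene identifiers and counts
sampleA <- data.frame(gene = c("g1", "g2", "g3"), count = c(10, 0, 5),
                      stringsAsFactors = FALSE)
sampleB <- data.frame(gene = c("g2", "g4"), count = c(7, 3),
                      stringsAsFactors = FALSE)
tables <- list(A = sampleA, B = sampleB)

# The row set is the union of gene identifiers across all samples
genes <- unique(unlist(lapply(tables, function(tab) tab$gene)))

# Start from an all-zero matrix so that genes absent from a sample keep a zero
counts <- matrix(0, nrow = length(genes), ncol = length(tables),
                 dimnames = list(genes, names(tables)))

# Fill in the observed counts, one sample (column) at a time
for (s in names(tables)) {
  tab <- tables[[s]]
  counts[tab$gene, s] <- tab$count
}

print(counts)
```

Gene g4 appears only in sample B, so its entry for sample A stays at zero, and likewise for g1 and g3 in sample B.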

By default, the files are assumed to be tab-delimited and to contain column headings. Other file formats can be handled by adding arguments to be passed to read.delim. For example, use header=FALSE if there are no column headings and use sep="," to read a comma-separated file.
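As a hedged illustration of passing read.delim arguments through, the sketch below writes two small comma-separated files without column headings and reads them back. The file names, gene names, counts, and the scratch directory tmp are invented for the example, and the readDGE call runs only if edgeR is installed.

```r
# Create a scratch directory with two headerless CSV files (made-up data)
tmp <- tempfile("counts_")
dir.create(tmp)
writeLines(c("g1,10", "g2,4"), file.path(tmp, "s1.csv"))
writeLines(c("g2,7", "g3,1"), file.path(tmp, "s2.csv"))

if (requireNamespace("edgeR", quietly = TRUE)) {
  # header and sep are not readDGE arguments; they are forwarded to read.delim
  dge <- edgeR::readDGE(c("s1.csv", "s2.csv"), path = tmp,
                        labels = c("s1", "s2"),
                        header = FALSE, sep = ",")
  print(dge$counts)
}
```

Labels are given explicitly here so that the column names of the count matrix do not depend on the default.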

Instead of being a vector, the argument files can be a data.frame containing all the necessary sample information. In that case, the filenames and group identifiers can be given as columns files and group respectively, and the labels can be given as the row.names of the data.frame.
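A minimal sketch of the data.frame form follows, assuming two tab-delimited files with column headings. The file names, group names, and the objects tmp and targets are hypothetical, and the readDGE call is skipped when edgeR is not installed.

```r
# Write two small tab-delimited files with headers (made-up data)
tmp <- tempfile("counts_")
dir.create(tmp)
write.table(data.frame(gene = c("g1", "g2"), count = c(3, 8)),
            file.path(tmp, "ctrl.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)
write.table(data.frame(gene = c("g2", "g3"), count = c(5, 2)),
            file.path(tmp, "trt.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE)

# Sample sheet: files and group as columns, labels as row names
targets <- data.frame(files = c("ctrl.txt", "trt.txt"),
                      group = c("control", "treated"),
                      row.names = c("ctrl", "trt"),
                      stringsAsFactors = FALSE)

if (requireNamespace("edgeR", quietly = TRUE)) {
  dge <- edgeR::readDGE(targets, path = tmp)
  print(dge$samples)  # sample information, including the group column
  print(dge$counts)   # merged count matrix, one column per row of targets
}
```

Keeping the sample sheet in one data.frame means file names, groups, and labels stay aligned row by row rather than being passed as three separate vectors.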


Value

A DGEList object containing a matrix of counts, with a row for each unique tag found in the input files and a column for each input file.


Author(s)

Mark Robinson and Gordon Smyth

See Also

See read.delim for the other arguments that can be passed through ....

DGEList-class, DGEList.


Examples

#  Read all .txt files from current working directory

## Not run:
files <- dir(pattern="\\.txt$")
RG <- readDGE(files)
## End(Not run)

edgeR documentation built on March 18, 2018, 2:35 p.m.