easyRNASeq-easyRNASeq: easyRNASeq method

Description

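This function is a wrapper around the lower-level functionality of the package. It is the easiest way to get a count matrix from a set of read files. In outline, it reads the alignment files, fetches the annotation, summarizes the reads per feature, optionally converts the counts to RPKM, and returns the result in the requested output format.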

Usage

## S4 method for signature 'character'
easyRNASeq(
  filesDirectory = getwd(),
  organism = character(1),
  chr.sizes = c("auto"),
  readLength = integer(1),
  annotationMethod = c("biomaRt", "env", "gff", "gtf", "rda"),
  annotationFile = character(1),
  annotationObject = GRangesList(),
  format = c("bam", "aln"),
  gapped = FALSE,
  count = c("exons", "features", "genes", "islands", "transcripts"),
  outputFormat = c("matrix", "SummarizedExperiment", "DESeq", "edgeR", "RNAseq"),
  pattern = character(1),
  filenames = character(0),
  nbCore = 1,
  filter = srFilter(),
  type = "SolexaExport",
  chr.sel = c(),
  summarization = c("bestExons", "geneModels"),
  normalize = FALSE,
  max.gap = integer(1),
  min.cov = 1L,
  min.length = integer(1),
  plot = TRUE,
  conditions = c(),
  validity.check = TRUE,
  chr.map = data.frame(),
  ignoreWarnings = FALSE,
  silent = FALSE,
  ...
)

Arguments

filesDirectory

The directory where the files to be used are located. Defaults to the current directory.

organism

A character string describing the organism

chr.sizes

A vector or a list containing the chromosome sizes of the selected organism, or simply the string "auto". See details.

readLength

The read length in bp

annotationMethod

The method to fetch the annotation, one of "biomaRt","env","gff","gtf" or "rda". All methods but "biomaRt" and "env" require the annotationFile to be set. The "env" method requires the annotationObject to be set.

annotationFile

The location (full path) of the annotation file

annotationObject

A GRangesList object containing the annotation.

format

The format of the read files, one of "aln" or "bam". With "aln", all the types supported by the ShortRead package are available (see the type argument). As of version 1.3.5, it defaults to "bam".

gapped

Whether the provided BAM files contain gapped alignments.

count

The feature used to summarize the reads. One of 'exons','features','genes','islands' or 'transcripts'. See details.

outputFormat

By default, easyRNASeq returns a matrix. If one of "DESeq", "edgeR", "RNAseq" or "SummarizedExperiment" is provided, the corresponding object is returned.

pattern

The pattern of the file names to look for, e.g. "bam$".

filenames

The names, not the paths, of the files to use.
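As a rough illustration of how a pattern such as "bam$" selects file names within a directory, the following base R sketch uses list.files() on a temporary directory. The file names are made up, and easyRNASeq's internal file lookup may differ from this.

```r
# Hypothetical file names, created in a temporary directory for illustration
demo.dir <- file.path(tempdir(), "easyRNASeq-pattern-demo")
dir.create(demo.dir, showWarnings = FALSE)
demo.files <- c("ACACTG.bam", "ACTAGC.bam", "ACACTG.bam.bai", "notes.txt")
invisible(file.create(file.path(demo.dir, demo.files)))

# A regular expression anchored with "$" keeps the BAM files and
# skips their ".bam.bai" indexes
selected <- list.files(demo.dir, pattern = "\\.bam$")
print(selected)
```

The result holds file names only (no directory component), which is the form the filenames argument is described as expecting.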

nbCore

Defines how many CPU cores to use when computing the geneModels. Uses the default parallel library.

filter

The filter to be applied when loading the data using the "aln" format

type

The type of data when using the "aln" format. See the ShortRead library.

chr.sel

A vector of chromosome names to subset the final results.

summarization

A character defining which method to use when summarizing reads by genes. So far, only "geneModels" is available.

normalize

A boolean indicating whether to convert the returned counts to RPKM. Valid when the outputFormat is left undefined (i.e. when a matrix is returned) and when it is DESeq or edgeR. Note that it is not advised to normalize the data prior to DESeq or edgeR usage!
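For reference, RPKM (reads per kilobase of feature per million mapped reads) is conventionally computed as sketched below in base R with made-up numbers. The helper rpkm() is hypothetical and is not the package's implementation, which may, for instance, derive the library sizes differently.

```r
# Hypothetical helper: textbook RPKM, not easyRNASeq's own code
# counts:       matrix of raw counts (features x samples)
# feature.size: feature lengths in bp, one per row of counts
# lib.size:     number of mapped reads per sample (assumed here: column sums)
rpkm <- function(counts, feature.size, lib.size = colSums(counts)) {
  per.kb <- counts / (feature.size / 1e3)  # divide each row by its length in kb
  sweep(per.kb, 2, lib.size / 1e6, "/")    # divide each column by millions of reads
}

# Made-up counts for three features in two samples
counts <- matrix(c(100, 300, 600,
                   200, 200, 600),
                 nrow = 3,
                 dimnames = list(c("gA", "gB", "gC"), c("s1", "s2")))
feature.size <- c(gA = 1000, gB = 2000, gC = 3000)

rpkm.table <- rpkm(counts, feature.size)
print(rpkm.table)
```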

max.gap

When computing read islands, the maximal gap size allowed between two islands to merge them

min.cov

When computing read islands, the minimal coverage to take into account for calling an island

min.length

The minimal size an island should have to be kept
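The three island parameters (max.gap, min.cov and min.length) can be pictured with the following base R sketch on a toy per-base coverage vector. The helper call.islands() is hypothetical, and the order of the steps (threshold, then merge, then length filter) is an assumption made for illustration; the package's own island computation may differ.

```r
# Hypothetical helper illustrating min.cov, max.gap and min.length
call.islands <- function(cov, min.cov = 1L, max.gap = 0L, min.length = 1L) {
  runs <- rle(cov >= min.cov)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1L
  # keep the runs whose coverage reaches min.cov
  starts <- starts[runs$values]
  ends <- ends[runs$values]
  if (length(starts) > 1L) {
    # merge neighbouring islands separated by at most max.gap bases
    gaps <- starts[-1L] - ends[-length(ends)] - 1L
    keep.break <- gaps > max.gap
    starts <- starts[c(TRUE, keep.break)]
    ends <- ends[c(keep.break, TRUE)]
  }
  islands <- data.frame(start = starts, end = ends)
  # drop the islands shorter than min.length
  islands[islands$end - islands$start + 1L >= min.length, , drop = FALSE]
}

# Toy coverage: three covered stretches separated by gaps of 2 and 4 bases
cov <- c(0, 2, 3, 0, 0, 4, 4, 4, 0, 0, 0, 0, 1, 0)
islands <- call.islands(cov, min.cov = 1L, max.gap = 2L, min.length = 3L)
print(islands)
```

With these settings the first two stretches are merged across the 2-base gap, and the isolated single-base stretch is discarded by the length filter.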

plot

Whether or not to plot assessment graphs.

conditions

A vector of descriptors. Each sample must have a descriptor if you use the outputFormat DESeq or edgeR. The length of this vector must be equal to the number of samples. In addition, the vector should be named with the file names of the corresponding samples.
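A minimal base R sketch of such a named vector follows. The file names and condition labels are hypothetical placeholders.

```r
# Hypothetical file names; in practice these are the names of the read files
filenames <- c("ACACTG.bam", "ACTAGC.bam", "ATGGCT.bam", "TTGCGA.bam")

# One descriptor per sample, named with the corresponding file name
conditions <- c("A", "A", "B", "B")
names(conditions) <- filenames

# Sanity checks mirroring the requirements stated above
stopifnot(length(conditions) == length(filenames),
          setequal(names(conditions), filenames))
print(conditions)
```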

validity.check

Shall UCSC chromosome name convention be enforced? This is only supported for a set of organisms, which are Dmelanogaster, Hsapiens, Mmusculus and Rnorvegicus; otherwise the argument 'chr.map' can be used to complement it.

chr.map

A data.frame describing the mapping from the original chromosome names to the desired chromosome names. See details.
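To make the idea of such a mapping concrete, the base R sketch below applies a two-column data.frame (columns from and to, as in the Examples section) to a vector of chromosome names. How easyRNASeq treats names that are absent from the map is not stated here; leaving them unchanged is an assumption of this sketch.

```r
# Mapping from the original names ("from") to the desired names ("to")
chr.map <- data.frame(from = c("2L", "2R", "MT"),
                      to = c("chr2L", "chr2R", "chrMT"),
                      stringsAsFactors = FALSE)

# Hypothetical chromosome names as they might appear in an alignment file
original <- c("2L", "MT", "2R", "2L", "X")

# Rename the chromosomes present in the map; leave the others unchanged
idx <- match(original, chr.map$from)
renamed <- ifelse(is.na(idx), original, chr.map$to[idx])
print(renamed)
```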

ignoreWarnings

Set to TRUE (bad idea! They have a good reason to be there) if you do not want warning messages.

silent

Set to TRUE if you do not want messages to be printed out.

...

Additional arguments. See details.

Details

Value

Returns a count table (a matrix of m features x n samples). If the outputFormat option has been set, a corresponding object is returned: a RangedSummarizedExperiment, a DESeq:newCountDataset, an edgeR:DGEList or an RNAseq object.

Author(s)

Nicolas Delhomme

See Also

RNAseq RangedSummarizedExperiment edgeR:DGEList DESeq:newCountDataset ShortRead:readAligned

Examples

## Not run: 
library(easyRNASeq)
library(BSgenome.Dmelanogaster.UCSC.dm3)

# get the example data
tdir <- tutorialData()

# get an example annotation file
gAnnot.rda <- fetchData("gAnnot.rda")

# create a count table from 4 bam files;
# tdir is passed as a variable, not as the string "tdir"
count.table <- easyRNASeq(filesDirectory=tdir,
                          pattern="[ACGT]{6}\\.bam$",
                          format="bam",
                          readLength=36L,
                          organism="Dmelanogaster",
                          chr.sizes=seqlengths(Dmelanogaster),
                          annotationMethod="rda",
                          annotationFile=gAnnot.rda,
                          count="exons")

# an example of a chr.map
chr.map <- data.frame(from=c("2L","2R","MT"),
                      to=c("chr2L","chr2R","chrMT"))

# an example of a GRangesList annotation
grngs <- GRanges(seqnames=c("chr01","chr01","chr02"),
                 ranges=IRanges(start=c(10,30,100),
                                end=c(21,53,123)),
                 strand=c("+","+","-"),
                 transcript=c("trA1","trA2","trB"),
                 gene=c("gA","gA","gB"),
                 exon=c("e1","e2","e3"))

# split by chromosome to obtain a GRangesList
grngsList <- split(grngs, seqnames(grngs))

## End(Not run)

easyRNASeq documentation built on April 30, 2020, 2 a.m.