DEXSeqDataSet-class | R Documentation |
The DEXSeqDataSet
is a subclass of DESeqDataSet
,
specifically designed to adapt the DESeqDataSet
to test
for differences in exon usage.
DEXSeqDataSet( countData, sampleData,
design= ~ sample + exon + condition:exon,
featureID, groupID, featureRanges=NULL,
transcripts=NULL, alternativeCountData=NULL)
DEXSeqDataSetFromHTSeq(
countfiles, sampleData,
design= ~ sample + exon + condition:exon,
flattenedfile=NULL )
DEXSeqDataSetFromSE( SE,
design= ~ sample + exon + condition:exon )
countData |
A matrix of count data of non-negative integer values. The rows correspond to counts for each exon counting bin, the columns correspond to samples. Note that biological replicates should each get their own column, while the counts of technical replicates (i.e., several sequencing runs/lanes from the same sample) should be summed up into a single column |
alternativeCountData |
DEXSeq can be also used for test for differences in exon inclusion based on the number of reads supporting the inclusion of an exon and the number of reads supporting the exclusion of an exon. A matrix of count data of non-negative integer values.The rows correspond to exonic regions and the columns correspond to samples. This matrix should contain the number of exon-exon junction reads that skip each exon in each sample. If NULL, then the sum of the other exons belonging to the same gene is considered for testing (i.e. the normal DEXSeq approach). |
countfiles |
A character vector containing the path to the files that were originated with the script 'dexseq_count.py'. |
sampleData |
A |
design |
A formula which specifies the design of the experiment. It must specify an interaction term between a variable from the sampleData columns with the 'exon' variable. By default, the design will be '~ sample + exon + condition:exon'. This formula indicates the contrast between 'condition' and exon', i.e. differences in exon usage due to changes in the 'condition' variable. See the vignette for more examples of other designs. |
featureID |
A character vector of counting regions identifiers ordered according to the rows in countData. The identifiers names can be repeated between groups but not within groups. |
groupID |
A vector of group identifiers ordered according to its respective row in countData. It must reflect the sets of counting regions belonging to the same group, for example, exon bins in belonging to the same gene should have the same group identifier. |
featureRanges |
Optional. |
transcripts |
Optional. A |
flattenedfile |
A character vector containing the path to the flattened annotation file that was originated with the script 'dexseq_prepare_annotation.py'. |
SE |
A |
A DEXSeqDataSet object.
DEXSeqDataSetFromHTSeq
DEXSeqDataSetFromSE
## Not run:
#######################################
### From the output of the ###
### acconpaning python scripts ###
#######################################
inDir = system.file("extdata", package="pasilla", mustWork=TRUE)
flattenedfile = file.path(inDir, "Dmel.BDGP5.25.62.DEXSeq.chr.gff")
sampleData = data.frame(
condition = c( rep("treated", 3), rep("untreated", 4) ),
type = c("single", "paired", "paired", "single", "single", "paired", "paired") )
countFiles <- list.files(inDir, pattern="fb.txt")
rownames( sampleData ) <- countFiles
DEXSeqDataSetFromHTSeq(
countfiles=file.path( inDir, countFiles ),
sampleData = sampleData,
design = ~ sample + exon + type:exon + condition:exon,
flattenedfile=flattenedfile )
#######################################
### From GRanges derived objects ###
#######################################
library(GenomicRanges)
library(GenomicFeatures)
library(GenomicAlignments)
hse <- makeTxDbFromBiomart( biomart="ensembl",
dataset="hsapiens_gene_ensembl",
host="grch37.ensembl.org")
bamDir <- system.file(
"extdata", package="parathyroidSE", mustWork=TRUE )
fls <- list.files( bamDir, pattern="bam$", full=TRUE )
bamlst <- BamFileList(
fls, index=character(),
yieldSize=100000, obeyQname=TRUE )
exonicParts <- exonicParts( hse, linked.to.single.gene.only = TRUE )
SE <- summarizeOverlaps( exonicParts, bamlst,
mode="Union", singleEnd=FALSE,
ignore.strand=TRUE, inter.feature=FALSE, fragments=TRUE )
colData(SE)$condition <- c("A", "A", "B")
DEXSeqDataSetFromSE( SE,
design= ~ sample + exon + condition:exon )
#######################################
### From elementary data structures ###
#######################################
countData <- matrix( rpois(10000, 100), nrow=1000 )
sampleData <- data.frame(
condition=rep( c("untreated", "treated"), each=5 ) )
design <- formula( ~ sample + exon + condition:exon )
groupID <- rep(
paste0("gene", 1:10),
each= 100 )
featureID <- rep(
paste0("exon", 1:100),
times= 10 )
DEXSeqDataSet( countData, sampleData, design,
featureID, groupID )
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.