enrichedRegions: Find significantly enriched regions in sequencing...
In htSeqTools: Quality Control, Visualization and Processing for High-Throughput Sequencing data

Description Usage Arguments Details Value Methods Examples

Find regions with a significant accumulation of reads in a sequencing experiment.

enrichedRegions(sample1, sample2, regions, minReads=10, mappedreads,
pvalFilter=0.05, exact=FALSE, p.adjust.method='none', twoTailed=FALSE,
mc.cores=1)

`sample1`	Either the start and end of the sequences in sample 1 (`IRangesList`, `RangedData` or `IRanges` object), or a `list` with the sequences for all samples (`sample2` must be left missing in this case).
`sample2`	Same for sample 2. Can be left missing.
`regions`	If specified, the analysis is restricted to the regions indicated in `regions`, using the value columns featuring the read count for each sample. This is mutually exclusive with the default behaviour where `sample1` and/or `sample2` are provided.
`minReads`	The regions to be tested for enrichment are those with coverage greater than or equal to `minReads`. If `sample1` is a `list`, the overall coverage adding all samples is used. Otherwise, if `twoTailed` is FALSE, only the reads in sample 1 are counted. If `twoTailed` is TRUE, the sum of reads in samples 1 and 2 is counted.
`mappedreads`	Number of mapped reads for the sample. Has to be of class integer. Will be used to compute RPKM.
`pvalFilter`	Only regions with P-value below `pvalFilter` are reported as being enriched.
`exact`	If set to TRUE, an exact test is used whenever some expected cell counts are 5 or less (chi-square test based on permutations if `sample1` is a `list` object, Fisher's exact test otherwise), i.e. when the asymptotic chi-square/likelihood-ratio test calculations break down. Ignored if `sample2` is missing, as in this case calculations are always exact.
`p.adjust.method`	P-value adjustment method, passed on to `p.adjust`.
`twoTailed`	If set to FALSE, only regions with a higher concentration of reads in sample 1 than in sample 2 are reported. If set to TRUE, regions with higher concentration of sample 2 reads are also reported. Ignored if `sample2` is missing.
`mc.cores`	If `mc.cores` is greater than 1, computations are performed in parallel for each element in the `IRangesList` objects. Whenever possible the `mclapply` function is used, so exactly `mc.cores` cores are used. For some signatures `mclapply` cannot be used, in which case the `parallel` function from package `parallel` is used. Note: the latter option launches as many parallel processes as there are elements in `x`, which can place strong demands on the processor and memory.
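
To illustrate how `p.adjust.method` and `pvalFilter` relate, the sketch below adjusts a made-up vector of P-values with base R's `p.adjust` and then keeps those below the threshold. The P-values, and the assumption that adjustment happens before filtering, are illustrative only and are not taken from the package source.

```r
# Hypothetical per-region P-values (illustrative numbers only).
pvals <- c(0.001, 0.012, 0.020, 0.049, 0.200)

# 'none' (the default of enrichedRegions) leaves P-values unchanged;
# 'BH' applies the Benjamini-Hochberg adjustment.
adj_none <- p.adjust(pvals, method = "none")
adj_bh   <- p.adjust(pvals, method = "BH")

# Assumed filtering step: keep regions whose (adjusted) P-value is below pvalFilter.
pvalFilter <- 0.05
keep_none <- adj_none < pvalFilter
keep_bh   <- adj_bh < pvalFilter
```

With an adjustment method other than 'none', fewer regions typically pass the same `pvalFilter` threshold.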

The calculations depend on whether sample2 is missing or not.

Non-missing sample2. First, regions with coverage above minReads are selected. Second, the number of reads falling in the selected regions is computed for sample 1 and sample 2. Third, the counts are compared via a chi-square test (with Yates continuity correction), which takes into account the total number of sequences in each sample. Finally, statistically significant regions are selected and returned in RangedData or list objects.
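
The comparison step for a single region can be sketched with base R's `chisq.test`. The counts below are invented, and the 2x2 layout (reads inside versus outside the region, per sample) is an assumption about how the total number of sequences enters the test, not a description of the package internals.

```r
# Hypothetical counts: reads inside one candidate region, and the
# total number of reads in each sample (illustrative numbers only).
in_region <- c(sample1 = 60, sample2 = 20)
total     <- c(sample1 = 1000, sample2 = 1000)

# 2x2 table: rows are samples, columns are inside/outside the region.
tab <- cbind(inside = in_region, outside = total - in_region)

# correct = TRUE applies the Yates continuity correction.
res <- chisq.test(tab, correct = TRUE)
res$p.value

# Small expected counts are the situation in which the 'exact'
# argument would switch to Fisher's exact test instead.
small_expected <- any(res$expected <= 5)
```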

Missing sample2. First, regions with coverage above minReads are selected. Second, the number of reads in sample 1 falling in the selected regions is computed. Third, the proportion of reads in each region is tested for enrichment via a one-tailed Binomial exact test.
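
For a single region, a one-tailed exact binomial test can be sketched with base R's `binom.test`. The read counts and the null proportion `p_null` are assumptions made for illustration; the package may derive its null proportion differently.

```r
# Hypothetical numbers: 40 of 1000 reads in sample 1 fall in one region.
reads_in_region <- 40
total_reads     <- 1000

# Assumed null proportion of reads expected in this region
# (hypothetical value, e.g. the region's share of the covered genome).
p_null <- 0.01

# alternative = "greater" makes the test one-tailed: only an excess
# of reads in the region counts as evidence of enrichment.
res <- binom.test(reads_in_region, total_reads, p = p_null,
                  alternative = "greater")
res$p.value
```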

Object of class RangedData indicating the significantly enriched regions, the number of reads in each sample for those regions, the fold changes (adjusted considering the overall number of sequences in each sample) and the chi-square test P-values.
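
One plausible reading of a fold change "adjusted considering the overall number of sequences" is a ratio of per-sample read proportions. The sketch below uses invented counts and is an assumption about the formula, not the package's documented definition.

```r
# Hypothetical counts for one region (illustrative numbers only).
n1 <- 60;   n2 <- 20     # reads in the region, samples 1 and 2
N1 <- 2000; N2 <- 1000   # total reads in samples 1 and 2

# Unadjusted ratio ignores that sample 1 was sequenced twice as deeply.
raw_ratio <- n1 / n2

# Assumed adjusted fold change: ratio of the proportions of reads
# that fall in the region in each sample.
fold_change <- (n1 / N1) / (n2 / N2)
```

Here the raw ratio of 3 shrinks to about 1.5 once the difference in sequencing depth is taken into account.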

signature(sample1 = "missing", sample2 = "missing", regions = "RangedData"): ranges(regions) indicates the chromosome, start and end of genomic regions, while values(regions) should indicate the observed number of reads for each group in each region. enrichedRegions tests the null hypothesis that the proportion of reads in the region is equal across all groups via a likelihood-ratio test (or permutation-based chi-square for regions where the expected counts are below 5 for some group).
signature(sample1 = "list", sample2 = "missing", regions = "missing"): Each element in sample1 contains the read start/end of an individual sample. enrichedRegions identifies regions with high concentration of reads (across all samples) and then compares the counts across groups using a likelihood-ratio test (or permutation-based chi-square for regions where the expected counts are below 5 for some group).
signature(sample1 = "RangedData", sample2 = "RangedData", regions = "missing"): space(sample1) indicates the chromosome, start(sample1) and end(sample1) the start/end position of the reads. Similarly for sample2. enrichedRegions identifies regions with high concentration of reads (across all samples) and then compares the counts across groups using a likelihood-ratio test (or permutation-based chi-square for regions where the expected counts are below 5 for some group).
signature(sample1 = "RangedData", sample2 = "missing", regions = "missing"): space(sample1) indicates the chromosome, start(sample1) and end(sample1) the start/end position of the reads. enrichedRegions tests the null hypothesis that an unusually high proportion of reads has been observed in the region using an exact binomial test.
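
The likelihood-ratio comparison across several groups mentioned in the signatures above can be sketched in base R as a G-test on a samples-by-(inside, outside) table. The counts and the table layout are assumptions for illustration and may differ from what the package computes internally.

```r
# Hypothetical read counts in one region for three samples, and each
# sample's total number of reads (illustrative numbers only).
in_region <- c(30, 45, 80)
total     <- c(1000, 1000, 1000)

obs <- cbind(inside = in_region, outside = total - in_region)

# Expected counts under the null of equal proportions across samples.
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Likelihood-ratio (G) statistic, referred to a chi-square distribution.
# Note: this simple form assumes no observed count is zero.
G    <- 2 * sum(obs * log(obs / expected))
df   <- (nrow(obs) - 1) * (ncol(obs) - 1)
pval <- pchisq(G, df = df, lower.tail = FALSE)

# Regions with small expected counts are where a permutation-based
# chi-square test would be preferred over the asymptotic test.
needs_exact <- any(expected <= 5)
```

For the small-count case, base R offers `chisq.test(obs, simulate.p.value = TRUE)`, which computes a Monte Carlo P-value; whether the package uses this exact routine is not stated in this page.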

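library(htSeqTools)

set.seed(1)
# Sample 1: 1000 simulated reads of 39 bp on chr1, starts centred at 500
st <- round(rnorm(1000,500,100))
strand <- rep(c('+','-'),each=500)
space <- rep('chr1',length(st))
sample1 <- RangedData(IRanges(st,st+38),strand=strand,space=space)
# Sample 2: same layout, but read starts centred at 1000
st <- round(rnorm(1000,1000,100))
sample2 <- RangedData(IRanges(st,st+38),strand=strand,space=space)
# twoTailed=TRUE also reports regions enriched in sample 2
enrichedRegions(sample1,sample2,twoTailed=TRUE)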

htSeqTools documentation built on May 6, 2019, 3:39 a.m.
