# Calculate statistics for regions in the genome

### Description

For each region of interest or TSS, this routine interrogates probes or sequence
data for either a high level of absolute signal or a change in signal for some
specified contrast of interest. Regions can be the surroundings of TSSs, or can be
user-specified regions. The function determines whether the `start` and `end`
coordinates of `anno` should be used as region boundaries or as TSSs, depending on
whether `up` and `down` are `NULL` or are numbers.
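The choice between the two modes can be pictured with a small base R sketch. `region_window` is a hypothetical helper written for illustration, not a function of the package, and it assumes that the TSS of a feature is its `start` on the `+` strand and its `end` on the `-` strand.

```r
# Hypothetical helper (not part of the package): returns the window that
# would be examined for each feature under the stated assumptions.
region_window <- function(start, end, strand, up = NULL, down = NULL) {
  if (is.null(up) || is.null(down)) {
    # No offsets supplied: start and end are taken as region boundaries.
    return(data.frame(start = start, end = end))
  }
  # Offsets supplied: treat the feature as a TSS and build a window around it.
  plus <- strand == "+"
  tss <- ifelse(plus, start, end)
  data.frame(start = ifelse(plus, tss - up, tss - down),
             end   = ifelse(plus, tss + down, tss + up))
}

region_window(7500, 10000, "+")              # region mode
region_window(7500, 10000, "+", 2500, 2500)  # TSS mode
```

In region mode the coordinates are returned unchanged; in TSS mode the window extends `up` bases upstream and `down` bases downstream of the TSS, taking strand into account.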

### Usage

The ANY,data.frame method:

`blocksStats{ANY,data.frame}(x, anno, ...)`

The ANY,GRanges method:

`blocksStats{ANY,GRanges}(x, anno, up = NULL, down = NULL, ...)`

### Arguments

- x:
A `GRangesList`, `AffymetrixCelSet`, or `data.frame` of data, or a `character` vector of paths to BAM files.

- anno:
Either a `data.frame` or a `GRanges` giving the gene coordinates or regions of interest. If it is a `data.frame`, then the column names are (at least) `chr`, `name`, `start`, `end`. Column `strand` is also mandatory if `up` and `down` are `NULL`.

- seq.len:
If sequencing reads need to be extended, the fragment size to be used.

- p.anno:
A `data.frame` with (at least) columns `chr`, `position`, and `index`. This is an optional parameter of the `AffymetrixCelSet` method, because it can be automatically retrieved for such array data. The parameter is also optional if `mapping` is not `NULL`.

- mapping:
If a mapping with `annotationLookup` or `annotationBlocksLookup` has already been done, it can be passed in, which avoids unnecessary re-computing of the mapping list within `blocksStats`.

- chrs:
If `p.anno` is `NULL` and is retrieved from an ACP file, this vector gives the textual names of the chromosomes.

- log2.adj:
Whether to take $\log_2$ of array intensities.

- design:
A design matrix specifying the contrast to compute (i.e. the samples to use and what differences to take).

- up:
The number of bases upstream to consider in calculation of statistics. If not provided, the starts and ends in `anno` are used as region boundaries.

- down:
The number of bases downstream to consider in calculation of statistics. If not provided, the starts and ends in `anno` are used as region boundaries.

- lib.size:
A string that indicates what to use as the library size for each lane: the total lane count, the total count within regions specified by `anno`, or normalisation to a reference lane by the negative binomial quantile-to-quantile method. For total lane count use `"lane"`, for region sums use `"blocks"`, and for the normalisation use `"ref"`.

- robust:
Numeric. If it is 0, then a robust linear model is not fitted. If it is greater than 0, a robust linear model is used, and the number specifies the minimum number of probes a region has to have for statistics to be reported for that region.

- p.adj:
The method used to adjust p-values for multiple testing. Possible values are listed in `p.adjust`.

- Acutoff:
If `lib.size` is `"ref"`, this argument must be provided. Otherwise, it must not be provided. This parameter is a cutoff on the "A" values to take, before calculating the trimmed mean.

- verbose:
Logical; whether to output comments on the processing.

- ...:
Parameters described above that are not used in the function called, but are passed on to a private function that uses them in its processing.
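As an illustration of the `p.adj` argument, the sketch below calls base R's `p.adjust` directly. The p-values are invented for the example and are not output of `blocksStats`; how the package passes the method on internally is not shown here.

```r
# Illustrative p-values only; not produced by blocksStats.
p <- c(0.001, 0.02, 0.04, 0.30)

# Benjamini-Hochberg adjustment controls the false discovery rate.
bh <- p.adjust(p, method = "BH")

# Bonferroni adjustment is more conservative.
bonf <- p.adjust(p, method = "bonferroni")

# The full set of method names accepted by p.adjust.
p.adjust.methods
```

Any of the names in `p.adjust.methods` should be a sensible candidate value for `p.adj`.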

### Details

For array data, the statistics are determined either by a t-test or by a linear model. For sequencing data, the two groups are assumed to be from a negative binomial distribution, and an exact test is used.
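For the array case, a t-test on a contrast can be sketched in base R using the values from the Examples section. This is only an approximation of the idea, assuming a one-sample t-test on the per-probe contrast values within a window of 5000 to 10000; the package's internal computation may differ.

```r
# Values taken from the Examples section of this page.
intensities <- matrix(c(6.8, 6.5, 6.7, 6.7, 6.9,
                        8.8, 9.0, 9.1, 8.0, 8.9), ncol = 2,
                      dimnames = list(NULL, c("Normal", "Cancer")))
design <- matrix(c(-1, 1), dimnames = list(NULL, "Cancer-Normal"))
positions <- c(4000, 5100, 6000, 7000, 8000)

# Per-probe contrast: Cancer minus Normal.
contrast <- drop(intensities %*% design)

# Assumed window: 2500 bases either side of a TSS at 7500.
in_region <- positions >= 5000 & positions <= 10000

# One-sample t-test of the contrast values in the window against zero.
fit <- t.test(contrast[in_region])
fit$estimate
fit$p.value
```

Four of the five probes fall inside the window, so the test here is based on four contrast values.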

### Value

A `data.frame` with the same number of rows as there are features described
by `anno`, but with additional columns for the statistics calculated at each
feature.

### Author(s)

Mark Robinson

### See Also

`annotationLookup` and `annotationBlocksLookup`

### Examples

```
library(GenomicRanges)
library(Repitools)  # provides blocksStats

# Five probes measured in two samples
intensities <- matrix(c(6.8, 6.5, 6.7, 6.7, 6.9,
                        8.8, 9.0, 9.1, 8.0, 8.9), ncol = 2)
colnames(intensities) <- c("Normal", "Cancer")

# Contrast: Cancer minus Normal
d.matrix <- matrix(c(-1, 1))
colnames(d.matrix) <- "Cancer-Normal"

# Probe positions on the genome
probe.anno <- data.frame(chr = rep("chr1", 5),
                         position = c(4000, 5100, 6000, 7000, 8000),
                         index = 1:5)

# One gene; statistics use 2500 bases either side of its TSS
anno <- GRanges("chr1", IRanges(7500, 10000), '+', name = "Gene 1")
blocksStats(intensities, anno, 2500, 2500, probe.anno, log2.adj = FALSE, design = d.matrix)
```