intra-range-methods: Intra range transformations of an IRanges, IPos, Views,...

intra-range-methodsR Documentation

Intra range transformations of an IRanges, IPos, Views, RangesList, or MaskCollection object

Description

Range-based transformations are grouped in 2 categories:

  1. Intra range transformations (e.g. shift()) transform each range individually (and independently of the other ranges). They return an object parallel to the input object, that is, where the i-th range corresponds to the i-th range in the input. Those transformations are described below.

  2. Inter range transformations (e.g. reduce()) transform all the ranges together as a set to produce a new set of ranges. They return an object that is generally NOT parallel to the input object. Those transformations are described in the inter-range-methods man page (see ?`inter-range-methods`).

Except for threebands(), all the transformations described in this man page are endomorphisms that operate on a single "range-based" object, that is, they transform the ranges contained in the input object and return them in an object of the same class as the input object.

Usage

shift(x, shift=0L, use.names=TRUE)

narrow(x, start=NA, end=NA, width=NA, use.names=TRUE)

resize(x, width, fix="start", use.names=TRUE, ...)

flank(x, width, start=TRUE, both=FALSE, use.names=TRUE, ...)

promoters(x, upstream=2000, downstream=200, use.names=TRUE, ...)
terminators(x, upstream=2000, downstream=200, use.names=TRUE, ...)

reflect(x, bounds, use.names=TRUE)

restrict(x, start=NA, end=NA, keep.all.ranges=FALSE, use.names=TRUE)

threebands(x, start=NA, end=NA, width=NA)

Arguments

x

An IRanges, IPos, Views, RangesList, or MaskCollection object.

shift

An integer vector containing the shift information. Recycled as necessary so that each element corresponds to a range in x.

Can also be a list-like object parallel to x if x is a RangesList object.

use.names

TRUE or FALSE. Should names be preserved?

start, end

If x is an IRanges, IPos or Views object: A vector of integers for all functions except for flank. For restrict, the supplied start and end arguments must be vectors of integers, eventually with NAs, that specify the restriction interval(s). Recycled as necessary so that each element corresponds to a range in x. Same thing for narrow and threebands, except that here start and end must contain coordinates relative to the ranges in x. See the Details section below. For flank, start is a logical indicating whether x should be flanked at the start (TRUE) or the end (FALSE). Recycled as necessary so that each element corresponds to a range in x.

Can also be list-like objects parallel to x if x is a RangesList object.

width

If x is an IRanges, IPos or Views object: For narrow and threebands, a vector of integers, eventually with NAs. See the SEW (Start/End/Width) interface for the details (?solveUserSEW). For resize and flank, the width of the resized or flanking regions. Note that if both is TRUE, this is effectively doubled. Recycled as necessary so that each element corresponds to a range in x.

Can also be a list-like object parallel to x if x is a RangesList object.

fix

If x is an IRanges, IPos or Views object: A character vector or character-Rle of length 1 or length(x) containing the values "start", "end", and "center" denoting what to use as an anchor for each element in x.

Can also be a list-like object parallel to x if x is a RangesList object.

...

Additional arguments for methods.

both

If TRUE, extends the flanking region width positions into the range. The resulting range thus straddles the end point, with width positions on either side.

upstream, downstream

Vectors of non-NA non-negative integers. Recycled as necessary so that each element corresponds to a range in x. Can also be list-like objects parallel to x if x is a RangesList object.

upstream defines the number of nucleotides toward the 5' end and downstream defines the number toward the 3' end, relative to the transcription start site. Promoter regions are formed by merging the upstream and downstream ranges.

Default values for the upstream and downstream arguments of promoters() were chosen based on our current understanding of gene regulation. On average, promoter regions in the mammalian genome are 5000 bp upstream and downstream of the transcription start site. Note that the same default values are used in terminators() at the moment. However this could be revisited if the case is made to use values that reflect more closely the biology of terminator regions in mammalian genomes.

bounds

An IRanges object to serve as the reference bounds for the reflection, see below.

keep.all.ranges

TRUE or FALSE. Should ranges that don't overlap with the restriction interval(s) be kept? Note that "don't overlap" means that they end strictly before start - 1 or start strictly after end + 1. Ranges that end at start - 1 or start at end + 1 are always kept and their width is set to zero in the returned IRanges object.

Details

Unless specified otherwise, when x is a RangesList object, any transformation described here is equivalent to applying the transformation to each list element in x.

shift

shift shifts all the ranges in x by the amount specified by the shift argument.

narrow

narrow narrows the ranges in x i.e. each range in the returned IntegerRanges object is a subrange of the corresponding range in x. The supplied start/end/width values are solved by a call to solveUserSEW(width(x), start=start, end=end, width=width) and therefore must be compliant with the rules of the SEW (Start/End/Width) interface (see ?solveUserSEW for the details). Then each subrange is derived from the original range according to the solved start/end/width values for this range. Note that those solved values are interpreted relatively to the original range.

resize

resize resizes the ranges to the specified width where either the start, end, or center is used as an anchor.

flank

flank generates flanking ranges for each range in x. If start is TRUE for a given range, the flanking occurs at the start, otherwise the end. The widths of the flanks are given by the width parameter. The widths can be negative, in which case the flanking region is reversed so that it represents a prefix or suffix of the range in x. The flank operation is illustrated below for a call of the form flank(x, 3, TRUE), where x indicates a range in x and - indicates the resulting flanking region:

    ---xxxxxxx

If start were FALSE:

       xxxxxxx---

For negative width, i.e. flank(x, -3, FALSE), where * indicates the overlap between x and the result:

       xxxx***

If both is TRUE, then, for all ranges in x, the flanking regions are extended into (or out of, if width is negative) the range, so that the result straddles the given endpoint and has twice the width given by width. This is illustrated below for flank(x, 3, both=TRUE):

    ---***xxxx

promoters and terminators

promoters generates promoter ranges for each range in x relative to the transcription start site (TSS), where TSS is start(x). The promoter range is expanded around the TSS according to the upstream and downstream arguments. upstream represents the number of nucleotides in the 5' direction and downstream the number in the 3' direction. The full range is defined as, (start(x) - upstream) to (start(x) + downstream - 1). For documentation for using promoters on a GRanges object see ?`promoters,GenomicRanges-method` in the GenomicRanges package.

terminators is similar to promoters except that the generated ranges are relative to the transcription end sites (TES) returned by end(x).

reflect

reflect "reflects" or reverses each range in x relative to the corresponding range in bounds, which is recycled as necessary. Reflection preserves the width of a range, but shifts it such the distance from the left bound to the start of the range becomes the distance from the end of the range to the right bound. This is illustrated below, where x represents a range in x and [ and ] indicate the bounds:

      [..xxx.....]
      becomes
      [.....xxx..]

restrict

restrict restricts the ranges in x to the interval(s) specified by the start and end arguments.

threebands

threebands extends the capability of narrow by returning the 3 ranges objects associated to the narrowing operation. The returned value y is a list of 3 ranges objects named "left", "middle" and "right". The middle component is obtained by calling narrow with the same arguments (except that names are dropped). The left and right components are also instances of the same class as x and they contain what has been removed on the left and right sides (respectively) of the original ranges during the narrowing.

Note that original object x can be reconstructed from the left and right bands with punion(y$left, y$right, fill.gap=TRUE).

Author(s)

H. Pagès, M. Lawrence, and P. Aboyoun

See Also

  • inter-range-methods for inter range transformations.

  • The IRanges, IPos, Views, RangesList, and MaskCollection classes.

  • The intra-range-methods man page in the GenomicRanges package for intra range transformations of genomic ranges.

  • setops-methods for set operations on IRanges objects.

  • endoapply in the S4Vectors package.

Examples

## ---------------------------------------------------------------------
## shift()
## ---------------------------------------------------------------------

## On an IRanges object:
ir1 <- successiveIRanges(c(19, 5, 0, 8, 5))
ir1
shift(ir1, shift=-3)

## On an IRangesList object:
range1 <- IRanges(start=c(1, 2, 3), end=c(5, 2, 8))
range2 <- IRanges(start=c(15, 45, 20, 1), end=c(15, 100, 80, 5))
range3 <- IRanges(start=c(-2, 6, 7), width=c(8, 0, 0))  # with empty ranges
collection <- IRangesList(one=range1, range2, range3)
shift(collection, shift=5)  # same as endoapply(collection, shift, shift=5)

## Sanity check:
res1 <- shift(collection, shift=5)
res2 <- endoapply(collection, shift, shift=5)
stopifnot(identical(res1, res2))

## ---------------------------------------------------------------------
## narrow()
## ---------------------------------------------------------------------

## On an IRanges object:
ir2 <- ir1[width(ir1) != 0]
narrow(ir2, start=4, end=-2)
narrow(ir2, start=-4, end=-2)
narrow(ir2, end=5, width=3)
narrow(ir2, start=c(3, 4, 2, 3), end=c(12, 5, 7, 4))

## On an IRangesList object:
narrow(collection[-3], start=2)
narrow(collection[-3], end=-2)

## On a MaskCollection object:
mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
mymasks <- append(append(mask1, mask2), mask3)
mymasks
narrow(mymasks, start=8)

## ---------------------------------------------------------------------
## resize()
## ---------------------------------------------------------------------

## On an IRanges object:
resize(ir2, 200)
resize(ir2, 2, fix="end")

## On an IRangesList object:
resize(collection, width=200)

## ---------------------------------------------------------------------
## flank()
## ---------------------------------------------------------------------

## On an IRanges object:
ir3 <- IRanges(start=c(2,5,1), end=c(3,7,3))
flank(ir3, 2)
flank(ir3, 2, start=FALSE)
flank(ir3, 2, start=c(FALSE, TRUE, FALSE))
flank(ir3, c(2, -2, 2))
flank(ir3, 2, both = TRUE)
flank(ir3, 2, start=FALSE, both=TRUE)
flank(ir3, -2, start=FALSE, both=TRUE)

## On an IRangesList object:
flank(collection, width=10)

## ---------------------------------------------------------------------
## promoters()
## ---------------------------------------------------------------------

## On an IRanges object:
ir4 <- IRanges(start=10001:10004, end=12000)
promoters(ir4, upstream=800, downstream=0)
promoters(ir4, upstream=0, downstream=50)
promoters(ir4, upstream=800, downstream=50)

promoters(ir4, upstream=0, downstream=1)  # TSS

## On an IRangesList object:
promoters(collection, upstream=5, downstream=2)

## ---------------------------------------------------------------------
## reflect()
## ---------------------------------------------------------------------

## On an IRanges object:
bounds <- IRanges(start=c(0, 5, 3), end=c(10, 6, 9))
reflect(ir3, bounds)

## reflect() does not yet support IRangesList objects!

## ---------------------------------------------------------------------
## restrict()
## ---------------------------------------------------------------------

## On an IRanges object:
restrict(ir1, start=12, end=34)
restrict(ir1, start=20)
restrict(ir1, start=21)
restrict(ir1, start=21, keep.all.ranges=TRUE)

## On an IRangesList object:
restrict(collection, start=2, end=8)
restrict(collection, start=2, end=8, keep.all.ranges=TRUE)

## ---------------------------------------------------------------------
## threebands()
## ---------------------------------------------------------------------

## On an IRanges object:
z <- threebands(ir2, start=4, end=-2)
ir2b <- punion(z$left, z$right, fill.gap=TRUE)
stopifnot(identical(ir2, ir2b))
threebands(ir2, start=-5)

## threebands() does not support IRangesList objects.

Bioconductor/IRanges documentation built on Dec. 7, 2024, 1:19 p.m.