IRanges-constructor: The IRanges constructor and supporting functions

IRanges-constructorR Documentation

The IRanges constructor and supporting functions

Description

The IRanges function is a constructor that can be used to create IRanges instances.

solveUserSEW is a low-level utility function for solving a set of user-supplied start/end/width triplets.

Usage

## IRanges constructor:
IRanges(start=NULL, end=NULL, width=NULL, names=NULL, ...)

## Supporting functions (not for the end user):
solveUserSEW(refwidths, start=NA, end=NA, width=NA,
             rep.refwidths=FALSE,
             translate.negative.coord=TRUE,
             allow.nonnarrowing=FALSE)

Arguments

start, end, width

For IRanges: NULL or vector of integers.

For solveUserSEW: vector of integers (eventually with NAs).

names

A character vector or NULL.

...

Metadata columns to set on the IRanges object. All the metadata columns must be vector-like objects of the same length as the object to construct.

refwidths

Vector of non-NA non-negative integers containing the reference widths.

rep.refwidths

TRUE or FALSE. Use of rep.refwidths=TRUE is supported only when refwidths is of length 1.

translate.negative.coord, allow.nonnarrowing

TRUE or FALSE.

IRanges constructor

Return the IRanges object containing the ranges specified by start, end and width. Input falls into one of two categories:

Category 1

start, end and width are numeric vectors (or NULLs). If necessary they are recycled to the length of the longest (NULL arguments are filled with NAs). After this recycling, each row in the 3-column matrix obtained by binding those 3 vectors together is "solved" i.e. NAs are treated as unknown in the equation end = start + width - 1. Finally, the solved matrix is returned as an IRanges instance.

Category 2

The start argument is a logical vector or logical Rle object and IRanges(start) produces the same result as as(start, "IRanges"). Note that, in that case, the returned IRanges instance is guaranteed to be normal.

Note that the names argument is never recycled (to remain consistent with what `names<-` does on standard vectors).

Supporting functions

solveUserSEW(refwidths, start=NA, end=NA, width=NA, rep.refwidths=FALSE, translate.negative.coord=TRUE, allow.nonnarrowing=FALSE):

Use of rep.refwidths=TRUE is supported only when refwidths is of length 1. If rep.refwidths=FALSE (the default) then start, end and width are recycled to the length of refwidths (it's an error if one of them is longer than refwidths, or is of zero length while refwidths is not). If rep.refwidths=TRUE then refwidths is first replicated L times where L is the length of the longest of start, end and width. After this replication, start, end and width are recycled to the new length of refwidths (L) (it's an error if one of them is of zero length while L is != 0).

From now, refwidths, start, end and width are integer vectors of equal lengths. Each row in the 3-column matrix obtained by binding those 3 vectors together must contain at least one NA (otherwise an error is returned). Then each row is "solved" i.e. the 2 following transformations are performed (i is the indice of the row): (1) if translate.negative.coord is TRUE then a negative value of start[i] or end[i] is considered to be a -refwidths[i]-based coordinate so refwidths[i]+1 is added to it to make it 1-based; (2) the NAs in the row are treated as unknowns which values are deduced from the known values in the row and from refwidths[i].

The exact rules for (2) are the following. Rule (2a): if the row contains at least 2 NAs, then width[i] must be one of them (otherwise an error is returned), and if start[i] is one of them it is replaced by 1, and if end[i] is one of them it is replaced by refwidths[i], and finally width[i] is replaced by end[i] - start[i] + 1. Rule (2b): if the row contains only 1 NA, then it is replaced by the solution of the width[i] == end[i] - start[i] + 1 equation.

Finally, the set of solved rows is returned as an IRanges object of the same length as refwidths (after replication if rep.refwidths=TRUE).

Note that an error is raised if either (1) the set of user-supplied start/end/width values is invalid or (2) allow.nonnarrowing is FALSE and the ranges represented by the solved start/end/width values are not narrowing the ranges represented by the user-supplied start/end/width values.

Author(s)

Hervé Pagès

See Also

  • IRanges-class for the IRanges class.

  • narrow

Examples

## ---------------------------------------------------------------------
## A. USING THE IRanges() CONSTRUCTOR
## ---------------------------------------------------------------------
IRanges(start=11, end=rep.int(20, 5))
IRanges(start=11, width=rep.int(20, 5))
IRanges(-2, 20)  # only one range
IRanges(start=c(2, 0, NA), end=c(NA, NA, 14), width=11:0)
IRanges()  # IRanges instance of length zero
IRanges(names=character())

## With ranges specified as strings:
IRanges(c("11-20", "15-14", "-4--2"))

## With logical input:
x <- IRanges(c(FALSE, TRUE, TRUE, FALSE, TRUE))  # logical vector input
isNormal(x)  # TRUE
x <- IRanges(Rle(1:30) %% 5 <= 2)  # logical Rle input
isNormal(x)  # TRUE

## ---------------------------------------------------------------------
## B. USING solveUserSEW()
## ---------------------------------------------------------------------
refwidths <- c(5:3, 6:7)
refwidths

solveUserSEW(refwidths)
solveUserSEW(refwidths, start=4)
solveUserSEW(refwidths, end=3, width=2)
solveUserSEW(refwidths, start=-3)
solveUserSEW(refwidths, start=-3, width=2)
solveUserSEW(refwidths, end=-4)

## The start/end/width arguments are recycled:
solveUserSEW(refwidths, start=c(3, -4, NA), end=c(-2, NA))

## Using 'rep.refwidths=TRUE':
solveUserSEW(10, start=-(1:6), rep.refwidths=TRUE)
solveUserSEW(10, end=-(1:6), width=3, rep.refwidths=TRUE)

Bioconductor/IRanges documentation built on Nov. 17, 2024, 6:54 p.m.