MaskCollection-class: MaskCollection objects


Description

The MaskCollection class is a container for storing a collection of masks that can be used to mask regions in a sequence.

Details

In the context of the Biostrings package, a mask is a set of regions in a sequence that need to be excluded from some computation. For example, when calling alphabetFrequency or matchPattern on a chromosome sequence, you might want to exclude some regions like the centromere or the repeat regions. This can be achieved by putting one or several masks on the sequence before calling alphabetFrequency on it.

A MaskCollection object is a vector-like object that represents such a set of masks. Like standard R vectors, it has a "length", which is the number of masks it contains. Unlike standard R vectors, it also has a "width", which determines the length of the sequences it can be "put on". For example, a MaskCollection object of width 20000 can only be put on an XString object of 20000 letters.

Each mask in a MaskCollection object x is just a finite set of integers that are >= 1 and <= width(x). When "put on" a sequence, these integers indicate the positions of the letters to mask. Internally, each mask is represented by a NormalIRanges object.
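As a small illustration of the "finite set of integers" view, the sketch below builds one mask with the Mask constructor documented on this page and expands its ranges into the individual masked positions. The variable names and the expansion with Map and seq are illustrative choices, not part of the package.

```r
library(IRanges)

# One mask of width 20 covering positions 3-4 and 10-13.
m <- Mask(mask.width=20, start=c(3, 10), width=c(2, 4))

# The internal representation of the mask: a NormalIRanges object.
nir <- m[[1]]

# Expand each range into the positions it covers.
masked_positions <- unlist(Map(seq, start(nir), end(nir)))
masked_positions  # 3 4 10 11 12 13

# Every masked position lies between 1 and width(m).
all(masked_positions >= 1 & masked_positions <= width(m))
```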

Basic accessor methods

In the code snippets below, x is a MaskCollection object.

length(x): The number of masks in x.

width(x): The common width of all the masks in x. This determines the length of the sequences that x can be "put on".

active(x): A logical vector of the same length as x where each element indicates whether the corresponding mask is active or not.

names(x): NULL or a character vector of the same length as x.

desc(x): NULL or a character vector of the same length as x.

nir_list(x): A list of the same length as x, where each element is a NormalIRanges object representing a mask in x.
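The accessors above can be tried on a single mask. This is a minimal sketch; the object name x and the name and description strings are placeholders chosen for illustration.

```r
library(IRanges)

x <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))

length(x)    # 1: a single mask
width(x)     # 29: length of the sequences x can be put on
active(x)    # TRUE: masks returned by Mask() are active
names(x)     # NULL: masks returned by Mask() are unnamed
desc(x)      # NULL unless a description has been set
nir_list(x)  # list of length 1 holding a NormalIRanges object

# Names and descriptions can be set with the replacement forms.
names(x) <- "A"
desc(x) <- "first mask"
```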

Constructor

Mask(mask.width, start=NULL, end=NULL, width=NULL): Return a single mask (i.e. a MaskCollection object of length 1) of width mask.width (a single integer >= 1) and masking the ranges of positions specified by start, end and width. See the IRanges constructor (?IRanges) for how start, end and width can be specified. Note that the returned mask is active and unnamed.
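Because start, end and width follow the IRanges constructor conventions, any two of them are enough to specify the masked ranges. The sketch below, with illustrative variable names, describes the same two ranges (3-7 and 10-17) in three ways and checks that the results agree.

```r
library(IRanges)

# The same two masked ranges, specified three ways.
m1 <- Mask(mask.width=29, start=c(3, 10), width=c(5, 8))
m2 <- Mask(mask.width=29, start=c(3, 10), end=c(7, 17))
m3 <- Mask(mask.width=29, end=c(7, 17), width=c(5, 8))

# All three describe the same masked positions.
identical(start(m1[[1]]), start(m2[[1]]))
identical(end(m2[[1]]), end(m3[[1]]))
maskedwidth(m1)  # 13 masked positions (5 + 8)
```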

Other methods

In the code snippets below, x is a MaskCollection object.

isEmpty(x): Return a logical vector of the same length as x, indicating, for each mask in x, whether it's empty or not.

max(x): The greatest (or last, or rightmost) masked position for each mask. This is a numeric vector of the same length as x.

min(x): The smallest (or first, or leftmost) masked position for each mask. This is a numeric vector of the same length as x.

maskedwidth(x): The number of masked positions for each mask. This is an integer vector of the same length as x where all values are >= 0 and <= width(x).

maskedratio(x): The fraction of masked positions for each mask, that is, maskedwidth(x) / width(x).
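The per-mask summaries can be compared on a collection of two masks. This sketch reuses the first two masks from the Examples section; the values in the comments follow from the masked ranges (11-15, 25-26, 28-29 for the first mask and 3-7, 10-17, 27 for the second).

```r
library(IRanges)

mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
x <- append(mask1, mask2)

isEmpty(x)      # FALSE FALSE: both masks cover at least one position
min(x)          # 11 3: leftmost masked position of each mask
max(x)          # 29 27: rightmost masked position of each mask
maskedwidth(x)  # 9 14: number of masked positions
maskedratio(x)  # 9/29 and 14/29
```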

Subsetting and appending

In the code snippets below, x and values are MaskCollection objects.

x[i]: Return a new MaskCollection object made of the selected masks. Subscript i can be a numeric, logical or character vector.

x[[i, exact=TRUE]]: Extract the mask selected by i as a NormalIRanges object. Subscript i can be a single integer or a character string.

append(x, values, after=length(x)): Add masks in values to x.
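The subsetting forms and the after argument of append can be sketched as follows. The masks are those of the Examples section; the names "A", "B" and "C" and the variable y are illustrative.

```r
library(IRanges)

mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))

x <- append(append(mask1, mask2), mask3)
names(x) <- c("A", "B", "C")

x[2:3]          # MaskCollection holding masks B and C
x[c("C", "A")]  # select and reorder by name
x[-2]           # drop mask B
x[["B"]]        # NormalIRanges object for mask B
x[[2]]          # same mask, selected by position

# 'after' controls where the new masks are inserted.
y <- append(mask1, mask2, after=0)  # mask2 comes first
maskedwidth(y)  # 14 9
```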

Other methods

In the code snippets below, x is a MaskCollection object.

collapse(x): Return a MaskCollection object of length 1 obtained by collapsing all the active masks in x.
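Since collapse only takes active masks into account, deactivating a mask changes the result. The sketch below uses the three masks of the Examples section; the masked widths in the comments match the example output shown further down this page.

```r
library(IRanges)

mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))

x <- append(append(mask1, mask2), mask3)
names(x) <- c("A", "B", "C")

# All three masks are active: their union covers 19 positions.
all_active <- collapse(x)
maskedwidth(all_active)  # 19

# With mask B deactivated, only A and C contribute.
active(x)["B"] <- FALSE
without_b <- collapse(x)
maskedwidth(without_b)   # 11
```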

Author(s)

Hervé Pagès

See Also

NormalIRanges-class, read.Mask, MaskedXString-class, reverse, alphabetFrequency, matchPattern

Examples

  ## Making a MaskCollection object:
  mask1 <- Mask(mask.width=29, start=c(11, 25, 28), width=c(5, 2, 2))
  mask2 <- Mask(mask.width=29, start=c(3, 10, 27), width=c(5, 8, 1))
  mask3 <- Mask(mask.width=29, start=c(7, 12), width=c(2, 4))
  mymasks <- append(append(mask1, mask2), mask3)
  mymasks
  length(mymasks)
  width(mymasks)
  collapse(mymasks)

  ## Names and descriptions:
  names(mymasks) <- c("A", "B", "C")  # names should be short and unique...
  mymasks
  mymasks[c("C", "A")]  # ...to make subsetting by names easier
  desc(mymasks) <- c("you can be", "more verbose", "here")
  mymasks[-2]

  ## Activate/deactivate masks:
  active(mymasks)["B"] <- FALSE
  mymasks
  collapse(mymasks)
  active(mymasks) <- FALSE  # deactivate all masks
  mymasks
  active(mymasks)[-1] <- TRUE  # reactivate all masks except mask 1
  active(mymasks) <- !active(mymasks)  # toggle all masks

  ## Other advanced operations:
  mymasks[[2]]
  length(mymasks[[2]])
  mymasks[[2]][-3]
  append(mymasks[-2], gaps(mymasks[2]))

Example output

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

MaskCollection of length 3 and width 29
masks:
  maskedwidth maskedratio active
1           9   0.3103448   TRUE
2          14   0.4827586   TRUE
3           6   0.2068966   TRUE
all masks together:
  maskedwidth maskedratio
           19   0.6551724
[1] 3
[1] 29
MaskCollection of length 1 and width 29
masks:
  maskedwidth maskedratio active
1          19   0.6551724   TRUE
MaskCollection of length 3 and width 29
masks:
  maskedwidth maskedratio active names
1           9   0.3103448   TRUE     A
2          14   0.4827586   TRUE     B
3           6   0.2068966   TRUE     C
all masks together:
  maskedwidth maskedratio
           19   0.6551724
MaskCollection of length 2 and width 29
masks:
  maskedwidth maskedratio active names
1           6   0.2068966   TRUE     C
2           9   0.3103448   TRUE     A
all masks together:
  maskedwidth maskedratio
           11   0.3793103
MaskCollection of length 2 and width 29
masks:
  maskedwidth maskedratio active names       desc
1           9   0.3103448   TRUE     A you can be
2           6   0.2068966   TRUE     C       here
all masks together:
  maskedwidth maskedratio
           11   0.3793103
MaskCollection of length 3 and width 29
masks:
  maskedwidth maskedratio active names         desc
1           9   0.3103448   TRUE     A   you can be
2          14   0.4827586  FALSE     B more verbose
3           6   0.2068966   TRUE     C         here
all masks together:
  maskedwidth maskedratio
           19   0.6551724
all active masks together:
  maskedwidth maskedratio
           11   0.3793103
MaskCollection of length 1 and width 29
masks:
  maskedwidth maskedratio active
1          11   0.3793103   TRUE
MaskCollection of length 3 and width 29
masks:
  maskedwidth maskedratio active names         desc
1           9   0.3103448  FALSE     A   you can be
2          14   0.4827586  FALSE     B more verbose
3           6   0.2068966  FALSE     C         here
all masks together:
  maskedwidth maskedratio
           19   0.6551724
all active masks together:
  maskedwidth maskedratio
            0           0
NormalIRanges object with 3 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]         3         7         5
  [2]        10        17         8
  [3]        27        27         1
[1] 3
NormalIRanges object with 2 ranges and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]         3         7         5
  [2]        10        17         8
MaskCollection of length 3 and width 29
masks:
  maskedwidth maskedratio active names       desc
1           9   0.3103448   TRUE     A you can be
2           6   0.2068966  FALSE     C       here
3          15   0.5172414  FALSE                 
all masks together:
  maskedwidth maskedratio
           21   0.7241379
all active masks together:
  maskedwidth maskedratio
            9   0.3103448

IRanges documentation built on Dec. 14, 2020, 2 a.m.