BSgenome-utils: BSgenome utilities


Description

Utilities for BSgenome objects.

Usage

## S4 method for signature 'BSgenome'
matchPWM(pwm, subject, min.score = "80%", exclude = "",
       maskList = logical(0))
## S4 method for signature 'BSgenome'
countPWM(pwm, subject, min.score = "80%", exclude = "", 
       maskList = logical(0))
## S4 method for signature 'BSgenome'
vmatchPattern(pattern, subject, max.mismatch = 0, min.mismatch = 0,
            with.indels = FALSE, fixed = TRUE, algorithm = "auto",
            exclude = "", maskList = logical(0),  userMask =
               IRangesList(), invertUserMask = FALSE)
## S4 method for signature 'BSgenome'
vcountPattern(pattern, subject, max.mismatch = 0, min.mismatch = 0,
            with.indels = FALSE, fixed = TRUE, algorithm = "auto",
            exclude = "", maskList = logical(0),  userMask =
               IRangesList(), invertUserMask = FALSE)
## S4 method for signature 'BSgenome'
vmatchPDict(pdict, subject, max.mismatch = 0, min.mismatch = 0,
          fixed = TRUE, algorithm = "auto", verbose = FALSE,
          exclude = "", maskList = logical(0))
## S4 method for signature 'BSgenome'
vcountPDict(pdict, subject, max.mismatch = 0, min.mismatch = 0,
          fixed = TRUE, algorithm = "auto", collapse = FALSE,
          weight = 1L, verbose = FALSE, exclude = "", maskList = logical(0))

Arguments

pwm

A numeric matrix with row names A, C, G and T representing a Position Weight Matrix.

pattern

A DNAString object containing the pattern sequence.

pdict

A DNAStringSet object containing the pattern sequences.

subject

A BSgenome object containing the subject sequences.

min.score

The minimum score for counting a match. Can be given as a character string containing a percentage (e.g. "85%") of the highest possible score or as a single number.
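As a rough illustration of how a percentage threshold relates to an absolute score, the following base R sketch converts "80%" into a number for a small made-up matrix. It assumes that the "highest possible score" is the sum of the column maxima of the PWM. The helper name pct_to_score and the toy matrix are hypothetical and not part of the package.

```r
# Toy 4 x 3 position weight matrix (values invented for illustration).
pwm <- matrix(c(0.7, 0.1, 0.1, 0.1,
                0.1, 0.6, 0.2, 0.1,
                0.2, 0.2, 0.5, 0.1),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical helper: turn a threshold such as "80%" into an absolute score,
# assuming the highest possible score is the sum of the column maxima.
pct_to_score <- function(min.score, pwm) {
  if (is.numeric(min.score)) return(min.score)
  pct <- as.numeric(sub("%$", "", min.score))
  sum(apply(pwm, 2, max)) * pct / 100
}

pct_to_score("80%", pwm)  # 80% of the best attainable score
pct_to_score(1.2, pwm)    # numeric thresholds are used as given
```

Under this assumption, a percentage makes thresholds comparable across matrices of different widths, while a plain number fixes the cutoff on the matrix's own score scale.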

max.mismatch, min.mismatch

The maximum and minimum number of mismatching letters allowed (see ?`lowlevel-matching` for the details). If non-zero, an inexact matching algorithm is used.

with.indels

If TRUE then indels are allowed. In that case, min.mismatch must be 0 and max.mismatch is interpreted as the maximum "edit distance" allowed between any pattern and any of its matches (see ?`matchPattern` for the details).
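The difference between counting mismatching letters and measuring an edit distance can be sketched in base R. The strings and the helper n_mismatch below are made up for illustration, and adist() from the utils package stands in for the edit distance; this is not how the package computes matches internally.

```r
pattern <- "ACGTAC"
hit_sub <- "ACGAAC"   # same length, one substituted letter
hit_del <- "ACTAC"    # one letter deleted relative to the pattern

# Hypothetical helper: number of mismatching letters between two
# equal-length strings (substitutions only).
n_mismatch <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

n_mismatch(pattern, hit_sub)    # substitutions only
drop(adist(pattern, hit_del))   # edit distance, which also counts the deletion
```

With with.indels = FALSE only the first kind of difference is tolerated, so a candidate such as hit_del, which differs in length, cannot be compared letter by letter.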

fixed

If FALSE then IUPAC extended letters are interpreted as ambiguities (see ?`lowlevel-matching` for the details).

algorithm

For vmatchPattern and vcountPattern one of the following: "auto", "naive-exact", "naive-inexact", "boyer-moore", "shift-or", or "indels".

For vmatchPDict and vcountPDict one of the following: "auto", "naive-exact", "naive-inexact", "boyer-moore", or "shift-or".

collapse, weight

Ignored arguments.

verbose

TRUE or FALSE.

exclude

A character vector with strings that will be used to filter out chromosomes whose names match these strings.
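The filtering described above can be pictured with a small base R sketch. It assumes that each string in exclude is matched as a pattern anywhere in the chromosome name and that an empty string excludes nothing. The helper keep_seqnames and the chromosome names are hypothetical.

```r
# Hypothetical helper mimicking the documented filtering of chromosome names.
keep_seqnames <- function(seqnames, exclude = "") {
  exclude <- exclude[nzchar(exclude)]   # "" means: exclude nothing
  if (length(exclude) == 0L) return(seqnames)
  # TRUE for every name that matches at least one of the exclude strings
  drop <- Reduce(`|`, lapply(exclude, grepl, x = seqnames))
  seqnames[!drop]
}

chroms <- c("chrI", "chrII", "chrIII", "chrX", "chrM")
keep_seqnames(chroms)                          # nothing excluded
keep_seqnames(chroms, exclude = "M")           # drops "chrM"
keep_seqnames(chroms, exclude = c("X", "M"))   # drops "chrX" and "chrM"
```

If matching does work on substrings as assumed here, a short string such as "I" would remove several chromosomes at once, so it is worth choosing exclude strings that are specific to the sequences to be skipped.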

maskList

A named logical vector giving the preferred mask states for the BSgenome object. When the sequences are processed with the bsapply function, the masks are set to the states in this vector.

userMask

An IntegerRangesList, containing a mask to be applied to each chromosome. See bsapply.

invertUserMask

Whether the userMask should be inverted.

Value

A GRanges object for matchPWM with two elementMetadata columns: "score" (numeric), and "string" (DNAStringSet).

A GRanges object for vmatchPattern.

A GRanges object for vmatchPDict with one elementMetadata column: "index", which represents a mapping to a position in the original pattern dictionary.

A data.frame object for countPWM and vcountPattern with three columns: "seqname" (factor), "strand" (factor), and "count" (integer).

A DataFrame object for vcountPDict with four columns: "seqname" ('factor' Rle), "strand" ('factor' Rle), "index" (integer) and "count" ('integer' Rle). As with vmatchPDict the index column represents a mapping to a position in the original pattern dictionary.
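To show how the count tables might be summarised, the sketch below builds a toy data.frame with the three documented columns and totals the counts. The values are invented, and the layout of one row per sequence and strand is an assumption based on the column names.

```r
# Made-up counts in the documented layout (seqname, strand, count).
counts <- data.frame(
  seqname = factor(c("chrI", "chrI", "chrII", "chrII")),
  strand  = factor(c("+", "-", "+", "-")),
  count   = c(12L, 9L, 7L, 11L)
)

# Total matches per chromosome, summed over both strands.
per_chrom <- tapply(counts$count, counts$seqname, sum)
per_chrom

# Total matches across all sequences.
sum(counts$count)
```

The same base R summaries should apply to the real output of countPWM or vcountPattern, since it is an ordinary data.frame.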

Author(s)

P. Aboyoun

See Also

matchPWM, matchPattern, matchPDict, bsapply

Examples

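  library(BSgenome.Celegans.UCSC.ce2)
  data(HNF4alpha)  # example binding-site sequences (a DNAStringSet)

  # Scan the genome with a PWM built from the binding sites
  pwm <- PWM(HNF4alpha)
  matchPWM(pwm, Celegans)
  countPWM(pwm, Celegans)

  # Search for the consensus sequence; fixed = "subject" lets IUPAC
  # ambiguity letters in the pattern act as ambiguities
  pattern <- consensusString(HNF4alpha)
  vmatchPattern(pattern, Celegans, fixed = "subject")
  vcountPattern(pattern, Celegans, fixed = "subject")

  # Match the first 10 binding sites as a pattern dictionary
  vmatchPDict(HNF4alpha[1:10], Celegans)
  vcountPDict(HNF4alpha[1:10], Celegans)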

BSgenome documentation built on May 6, 2019, 2:29 a.m.