PairwiseAlignments-class: PairwiseAlignments, PairwiseAlignmentsSingleSubject, and...

Description Usage Arguments Details Object extraction methods General information methods Aligned sequence methods Subject position methods Numeric summary methods Subsetting methods Author(s) See Also Examples

Description

The PairwiseAlignments class is a container for storing a set of pairwise alignments.

The PairwiseAlignmentsSingleSubject class is a container for storing a set of pairwise alignments with a single subject.

The PairwiseAlignmentsSingleSubjectSummary class is a container for storing the summary of a set of pairwise alignments.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
## Constructors:
## When subject is missing, pattern must be of length 2
## S4 method for signature 'XString,XString'
PairwiseAlignments(pattern, subject,
  type = "global", substitutionMatrix = NULL, gapOpening = 0, gapExtension = 1)
## S4 method for signature 'XStringSet,missing'
PairwiseAlignments(pattern, subject,
  type = "global", substitutionMatrix = NULL, gapOpening = 0, gapExtension = 1)
## S4 method for signature 'character,character'
PairwiseAlignments(pattern, subject,
  type = "global", substitutionMatrix = NULL, gapOpening = 0, gapExtension = 1,
  baseClass = "BString")
## S4 method for signature 'character,missing'
PairwiseAlignments(pattern, subject,
  type = "global", substitutionMatrix = NULL, gapOpening = 0, gapExtension = 1,
  baseClass = "BString")

Arguments

pattern

a character vector of length 1 or 2, an XString, or an XStringSet object of length 1 or 2.

subject

a character vector of length 1 or an XString object.

type

type of alignment. One of "global", "local", "overlap", "global-local", and "local-global" where "global" = align whole strings with end gap penalties, "local" = align string fragments, "overlap" = align whole strings without end gap penalties, "global-local" = align whole strings in pattern with consecutive subsequence of subject, "local-global" = align consecutive subsequence of pattern with whole strings in subject.

substitutionMatrix

substitution matrix for the alignment. If NULL, the diagonal values and off-diagonal values are set to 0 and 1 respectively.

gapOpening

the cost for opening a gap in the alignment.

gapExtension

the incremental cost incurred along the length of the gap in the alignment.

baseClass

the base XString class to use in the alignment.

Details

Before we define the notion of alignment, we introduce the notion of "filled-with-gaps subsequence". A "filled-with-gaps subsequence" of a string string1 is obtained by inserting 0 or any number of gaps in a subsequence of s1. For example L-A–ND and A–N-D are "filled-with-gaps subsequences" of LAND. An alignment between two strings string1 and string2 results in two strings (align1 and align2) that have the same length and are "filled-with-gaps subsequences" of string1 and string2.

For example, this is an alignment between LAND and LEAVES:

1
2
3
    L-A
    LEA
  

An alignment can be seen as a compact representation of one set of basic operations that transforms string1 into align1. There are 3 different kinds of basic operations: "insertions" (gaps in align1), "deletions" (gaps in align2), "replacements". The above alignment represents the following basic operations:

1
2
3
4
5
6
    insert E at pos 2
    insert V at pos 4
    insert E at pos 5
    replace by S at pos 6 (N is replaced by S)
    delete at pos 7 (D is deleted)
  

Note that "insert X at pos i" means that all letters at a position >= i are moved 1 place to the right before X is actually inserted.

There are many possible alignments between two given strings string1 and string2 and a common problem is to find the one (or those ones) with the highest score, i.e. with the lower total cost in terms of basic operations.

Object extraction methods

In the code snippets below, x is a PairwiseAlignments object, except otherwise noted.

alignedPattern(x), alignedSubject(x): Extract the aligned patterns or subjects as an XStringSet object. The 2 objects returned by alignedPattern(x) and alignedSubject(x) are guaranteed to have the same shape (i.e. same length() and width()).

pattern(x), subject(x): Extract the aligned patterns or subjects as an AlignedXStringSet0 object.

summary(object, ...): Generates a summary for the PairwiseAlignments object.

General information methods

In the code snippets below, x is a PairwiseAlignments object, except otherwise noted.

alphabet(x): Equivalent to alphabet(unaligned(subject(x))).

length(x): The common length of alignedPattern(x) and alignedSubject(x). There is a method for PairwiseAlignmentsSingleSubjectSummary as well.

type(x): The type of the alignment ("global", "local", "overlap", "global-local", or "local-global"). There is a method for PairwiseAlignmentsSingleSubjectSummary as well.

Aligned sequence methods

In the code snippets below, x is a PairwiseAlignmentsSingleSubject object, except otherwise noted.

aligned(x, degap = FALSE, gapCode="-", endgapCode="-"): If degap = FALSE, "align" the alignments by returning an XStringSet object containing the aligned patterns without insertions. If degap = TRUE, returns aligned(pattern(x), degap=TRUE). The gapCode and endgapCode arguments denote the code in the appropriate alphabet to use for the internal and end gaps.

as.character(x): Equivalent to as.character(alignedPattern(x)).

as.matrix(x): Returns an "exploded" character matrix representation of aligned(x).

toString(x): Equivalent to toString(as.character(x)).

Subject position methods

In the code snippets below, x is a PairwiseAlignmentsSingleSubject object, except otherwise noted.

consensusMatrix(x, as.prob=FALSE, baseOnly=FALSE, gapCode="-", endgapCode="-") See 'consensusMatrix' for more information.

consensusString(x) See 'consensusString' for more information.

coverage(x, shift=0L, width=NULL, weight=1L) See 'coverage,PairwiseAlignmentsSingleSubject-method' for more information.

Views(subject, start=NULL, end=NULL, width=NULL, names=NULL): The XStringViews object that represents the pairwise alignments along unaligned(subject(subject)). The start and end arguments must be either NULL/NA or an integer vector of length 1 that denotes the offset from start(subject(subject)).

Numeric summary methods

In the code snippets below, x is a PairwiseAlignments object, except otherwise noted.

nchar(x): The nchar of the aligned(pattern(x)) and aligned(subject(x)). There is a method for PairwiseAlignmentsSingleSubjectSummary as well.

insertion(x): An CompressedIRangesList object containing the locations of the insertions from the perspective of the pattern.

deletion(x): An CompressedIRangesList object containing the locations of the deletions from the perspective of the pattern.

indel(x): An InDel object containing the locations of the insertions and deletions from the perspective of the pattern.

nindel(x): An InDel object containing the number of insertions and deletions.

score(x): The score of the alignment. There is a method for PairwiseAlignmentsSingleSubjectSummary as well.

Subsetting methods

x[i]: Returns a new PairwiseAlignments object made of the selected elements.

rep(x, times): Returns a new PairwiseAlignments object made of the repeated elements.

Author(s)

P. Aboyoun

See Also

pairwiseAlignment, writePairwiseAlignments, AlignedXStringSet-class, XString-class, XStringViews-class, align-utils, pid

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
PairwiseAlignments("-PA--W-HEAE", "HEAGAWGHE-E")

pattern <- AAStringSet(c("HLDNLKGTF", "HVDDMPNAKLLL"))
subject <- AAString("SHLDTEKMSMKLL")
pa1 <- pairwiseAlignment(pattern, subject, substitutionMatrix="BLOSUM50",
                         gapOpening=3, gapExtension=1)
pa1

alignedPattern(pa1)
alignedSubject(pa1)
stopifnot(identical(width(alignedPattern(pa1)),
                    width(alignedSubject(pa1))))

as.character(pa1)

aligned(pa1)
as.matrix(pa1)
nchar(pa1)
score(pa1)

Example output

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, basename, cbind, colMeans, colSums, colnames,
    dirname, do.call, duplicated, eval, evalq, get, grep, grepl,
    intersect, is.unsorted, lapply, lengths, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, rank, rbind,
    rowMeans, rowSums, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

Loading required package: IRanges
Loading required package: XVector

Attaching package: 'Biostrings'

The following object is masked from 'package:base':

    strsplit

Global PairwiseAlignments (1 of 1)
pattern: -PA--W-HEAE
subject: HEAGAWGHE-E
score: 4 
Global PairwiseAlignmentsSingleSubject (1 of 2)
pattern: -HLD----NLKGTF
subject: SHLDTEKMSMK-LL
score: 18 
  A AAStringSet instance of length 2
    width seq
[1]    14 -HLD----NLKGTF
[2]    15 -HVDD--MPNAKLLL
  A AAStringSet instance of length 2
    width seq
[1]    14 SHLDTEKMSMK-LL
[2]    15 SHLDTEKM-SMKLL-
[1] "-HLD----NLKGTF"  "-HVDD--MPNAKLLL"
  A AAStringSet instance of length 2
    width seq
[1]    13 -HLD----NLKTF
[2]    13 -HVDD--MNAKLL
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] "-"  "H"  "L"  "D"  "-"  "-"  "-"  "-"  "N"  "L"   "K"   "T"   "F"  
[2,] "-"  "H"  "V"  "D"  "D"  "-"  "-"  "M"  "N"  "A"   "K"   "L"   "L"  
[1] 13 13
[1] 18 24

Biostrings documentation built on Nov. 8, 2020, 11:12 p.m.