Percent Sequence Identity


Description

Calculates the percent sequence identity for a pairwise sequence alignment.

Usage

pid(x, type="PID1")

Arguments

x

a PairwiseAlignments object.

type

the type of percent sequence identity to compute. One of "PID1", "PID2", "PID3", or "PID4". See Details for more information.

Details

Because there is no universal definition of percent sequence identity, the pid function supports the following four definitions, selected with the type argument:

"PID1":

100 * (identical positions) / (aligned positions + internal gap positions)

"PID2":

100 * (identical positions) / (aligned positions)

"PID3":

100 * (identical positions) / (length of the shorter sequence)

"PID4":

100 * (identical positions) / (average length of the two sequences)
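To make the four denominators concrete, the following base R sketch applies the formulas above to two already-aligned strings. The helper pid_sketch is hypothetical and not part of Biostrings. It assumes that gaps are written as "-", that both strings have the same length, that every gap is internal, and that "aligned positions" means columns where neither string has a gap. It is meant as an illustration of the arithmetic, not as a description of how pid is implemented.

```r
# Hypothetical helper (not a Biostrings function): computes the four
# percent identity measures from two gapped strings of equal length.
pid_sketch <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]

  gap <- x == "-" | y == "-"          # columns containing a gap (assumed internal)
  n_aligned <- sum(!gap)              # columns where both strings have a letter
  n_identical <- sum(x == y & !gap)   # aligned columns with the same letter
  len_a <- sum(x != "-")              # ungapped length of the first sequence
  len_b <- sum(y != "-")              # ungapped length of the second sequence

  c(PID1 = 100 * n_identical / (n_aligned + sum(gap)),
    PID2 = 100 * n_identical / n_aligned,
    PID3 = 100 * n_identical / min(len_a, len_b),
    PID4 = 100 * n_identical / mean(c(len_a, len_b)))
}

res <- pid_sketch("ACGT-ACGT", "ACGTTAC-A")
round(res, 2)
```

In this toy alignment there are 6 identical positions, 7 aligned positions, 2 gap positions, and both sequences have 8 letters, so the same numerator is divided by 9, 7, 8, and 8 respectively.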

Value

A numeric vector containing the specified sequence identity measures.

Author(s)

P. Aboyoun

References

A. May, Percent Sequence Identity: The Need to Be Explicit, Structure 2004, 12(5):737.

G. Raghava and G. Barton, Quantification of the variation in percentage identity for protein sequence alignments, BMC Bioinformatics 2006, 7:415.

See Also

pairwiseAlignment, PairwiseAlignments-class, match-utils

Examples

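  # Load the package; newer Bioconductor releases provide pid() and
  # pairwiseAlignment() in the pwalign package instead.
  library(Biostrings)

  s1 <- DNAString("AGTATAGATGATAGAT")
  s2 <- DNAString("AGTAGATAGATGGATGATAGATA")

  # Global alignment with default scoring; pid() defaults to type = "PID1"
  palign1 <- pairwiseAlignment(s1, s2)
  palign1
  pid(palign1)

  # Same sequences aligned with a custom substitution matrix, reported as PID4
  palign2 <-
    pairwiseAlignment(s1, s2,
      substitutionMatrix =
      nucleotideSubstitutionMatrix(match = 2, mismatch = 10, baseOnly = TRUE))
  palign2
  pid(palign2, type = "PID4")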
