pairwise_alignment_sequence_identity: Calculate the percentage of pairwise sequence identity
In currocam/utilsHMMER: HMMERutils

pairwise_alignment_sequence_identity

R Documentation

Description

Calculate the percentage of sequence identity between pairs of sequences, using the chosen alignment type and definition of percent identity.

Usage

pairwise_alignment_sequence_identity(
  seqs,
  aln_type = "global",
  pid_type = "PID1"
)

Arguments

`seqs`	The sequences of interest, given either as a `Biostrings::AAStringSet` or as a named character vector that will be converted into one. If the sequences are not named, arbitrary names will be given.
`aln_type`	A character vector of length one giving the alignment type. Possible options are "global" (Needleman-Wunsch), "local" (Smith-Waterman) and "overlap".
`pid_type`	A character vector of length one giving the definition of percent sequence identity. Possible options are "PID1", "PID2", "PID3" and "PID4".

Value

A DataFrame in long format with the results.

Alignment types

global: align whole strings with end gap penalties (Needleman-Wunsch).
local: align string fragments (Smith-Waterman).
overlap: align whole strings without end gap penalties.
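
The difference between "global" and "overlap" comes down to whether gaps at the ends of the alignment are penalized. The following base R sketch scores one fixed, already gapped alignment both ways. The helper `score_alignment()` and its scoring scheme (match +1, mismatch -1, gap -2 per position) are hypothetical and chosen only for illustration; they are not part of this package, and the function itself relies on an alignment routine rather than on this code.

```r
# Hypothetical helper: score a pre-aligned pair of strings ("-" marks a gap).
# The scoring values are assumptions made for this illustration only.
score_alignment <- function(a_aln, b_aln, end_gaps_free = FALSE,
                            match_score = 1, mismatch_score = -1,
                            gap_score = -2) {
  x <- strsplit(a_aln, "")[[1]]
  y <- strsplit(b_aln, "")[[1]]
  stopifnot(length(x) == length(y))

  is_gap <- x == "-" | y == "-"
  col_score <- ifelse(is_gap, gap_score,
                      ifelse(x == y, match_score, mismatch_score))

  keep <- rep(TRUE, length(x))
  if (end_gaps_free && any(!is_gap)) {
    # End gaps are the gap columns before the first and after the last
    # column in which both sequences have a residue.
    aligned_cols <- which(!is_gap)
    pos <- seq_along(x)
    keep <- pos >= min(aligned_cols) & pos <= max(aligned_cols)
  }
  sum(col_score[keep])
}

a_aln <- "--ACGTAC"
b_aln <- "TTACGT--"

global_score  <- score_alignment(a_aln, b_aln, end_gaps_free = FALSE)
overlap_score <- score_alignment(a_aln, b_aln, end_gaps_free = TRUE)
```

With these assumed scores, the four end gap columns cost 8 points in the global setting and nothing in the overlap setting, so the same alignment scores -4 and 4 respectively.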

Percent sequence identity

PID1: 100 * (identical positions) / (aligned positions + internal gap positions).
PID2: 100 * (identical positions) / (aligned positions).
PID3: 100 * (identical positions) / (length of the shorter sequence).
PID4: 100 * (identical positions) / (average length of the two sequences).
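
To make the four definitions concrete, the sketch below applies them to a single pre-aligned pair in base R. The helper `pid_sketch()` and the two toy sequences are hypothetical and exist only for this illustration. It assumes that "aligned positions" means columns where both sequences have a residue and that "internal gap positions" means gap columns lying between the first and last such column.

```r
# Hypothetical helper: compute the four percent identity definitions
# from a pre-aligned pair of strings ("-" marks a gap).
pid_sketch <- function(a_aln, b_aln) {
  x <- strsplit(a_aln, "")[[1]]
  y <- strsplit(b_aln, "")[[1]]
  stopifnot(length(x) == length(y))

  is_gap  <- x == "-" | y == "-"
  aligned <- !is_gap
  stopifnot(any(aligned))

  n_identical <- sum(aligned & x == y)
  n_aligned   <- sum(aligned)

  # Internal gaps: gap columns strictly between the first and last
  # aligned column (assumption stated in the text above).
  aligned_cols <- which(aligned)
  pos <- seq_along(x)
  n_internal_gaps <- sum(is_gap & pos > min(aligned_cols) &
                           pos < max(aligned_cols))

  # Ungapped sequence lengths
  len_a <- sum(x != "-")
  len_b <- sum(y != "-")

  c(
    PID1 = 100 * n_identical / (n_aligned + n_internal_gaps),
    PID2 = 100 * n_identical / n_aligned,
    PID3 = 100 * n_identical / min(len_a, len_b),
    PID4 = 100 * n_identical / mean(c(len_a, len_b))
  )
}

# Toy alignment: 7 identical columns, 8 aligned columns,
# 2 internal gap columns and 1 trailing end gap.
pids <- pid_sketch("MKTAYIAK-QR", "MKT-YLAKGQ-")
```

For this toy pair the ungapped lengths are 10 and 9, so PID1 is 100 * 7 / 10 = 70, PID2 is 100 * 7 / 8 = 87.5, PID3 is 100 * 7 / 9 (about 77.8) and PID4 is 100 * 7 / 9.5 (about 73.7). The same alignment can therefore give noticeably different values depending on `pid_type`.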

Examples

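library(HMMERutils)

# Load the example phmmer results
data(phmmer_2abl)
# Compare hits 6 to 10 using overlap alignment and the PID2 definition
pairwise_alignment_sequence_identity(
    seqs = phmmer_2abl$hits.fullfasta[6:10],
    aln_type = "overlap",
    pid_type = "PID2"
)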

currocam/utilsHMMER documentation built on Feb. 19, 2023, 9:54 p.m.