pairwise_alignment_sequence_identity: Calculate the percentage of pairwise sequence identity

View source: R/pairwise_alignment_sequence_identity.R

pairwise_alignment_sequence_identity    R Documentation

Calculate the percentage of pairwise sequence identity

Description

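Aligns the sequences in seqs pairwise, using the chosen alignment type, and calculates the percent sequence identity of each pair according to the chosen definition.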

Usage

pairwise_alignment_sequence_identity(
  seqs,
  aln_type = "global",
  pid_type = "PID1",
  allow_parallelization = NULL
)

Arguments

seqs

The sequences of interest, given either as a Biostrings::AAStringSet or as a named character vector, which is converted into a Biostrings::AAStringSet.

aln_type

A character vector of length one specifying the alignment type. Possible options are "global" (Needleman-Wunsch), "local" (Smith-Waterman) and "overlap".

pid_type

A character vector of length one specifying the definition of percent sequence identity. Possible options are "PID1", "PID2", "PID3" and "PID4".

allow_parallelization

A character vector of length one, or NULL (the default). To speed up the process by aligning the sequences in parallel, set it to "multisession" or "multicore".
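As a minimal sketch of the named-character-vector form of seqs described above: the sequences and names below are made up for illustration, and the call is guarded because it assumes the function is exported by an installed taxa2hmmer package, which this page does not state explicitly.

```r
# Made-up toy protein sequences; the names identify each sequence
seqs <- c(
  seq_a = "MKTAYIAKQR",
  seq_b = "MKTAYLAKQR",
  seq_c = "MSTAYIAKHR"
)

# Base-R checks that the input is a named character vector
stopifnot(is.character(seqs), !is.null(names(seqs)))

# Assumption: the function is exported by the taxa2hmmer package
if (requireNamespace("taxa2hmmer", quietly = TRUE)) {
  identity_tbl <- taxa2hmmer::pairwise_alignment_sequence_identity(
    seqs = seqs,
    aln_type = "global",
    pid_type = "PID1"
  )
}
```

Leaving allow_parallelization at its default of NULL keeps the sketch sequential.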

Value

A DataFrame of subclass pairwise_sequence_identity, which has associated S3 methods.

Visualize

The plot method can be called on the result to draw either a histogram or a default heatmap. Refer to the "Examples" section.

Alignment types

  • global: align whole strings with end gap penalties.

  • local: align string fragments.

  • overlap: align whole strings without end gap penalties.
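The difference between the three alignment types can be sketched with a toy dynamic-programming scorer in base R. This is an illustration only, not the package's implementation: align_score is a hypothetical helper, and the scoring scheme (match = 1, mismatch = -1, linear gap = -1) is an assumption chosen to keep the arithmetic easy to follow.

```r
# Hypothetical helper: best alignment score under a toy scoring scheme.
align_score <- function(a, b, type = c("global", "local", "overlap"),
                        match_score = 1, mismatch = -1, gap = -1) {
  type <- match.arg(type)
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  n <- length(x)
  m <- length(y)
  S <- matrix(0, n + 1, m + 1)
  if (type == "global") {
    # End gaps are penalized: leading gaps cost as much as internal ones
    S[, 1] <- gap * (0:n)
    S[1, ] <- gap * (0:m)
  }
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      s <- if (x[i] == y[j]) match_score else mismatch
      best <- max(S[i, j] + s, S[i, j + 1] + gap, S[i + 1, j] + gap)
      # A local alignment may restart anywhere, so scores never drop below 0
      S[i + 1, j + 1] <- if (type == "local") max(best, 0) else best
    }
  }
  switch(type,
    global  = S[n + 1, m + 1],               # whole strings, end gaps count
    local   = max(S),                        # best-scoring fragment pair
    overlap = max(S[n + 1, ], S[, m + 1])    # trailing end gaps are free
  )
}

align_score("ACGT", "CG", "global")   # 0: two matches minus two end gaps
align_score("ACGT", "CG", "local")    # 2: only the fragment "CG" is aligned
align_score("ACGT", "CG", "overlap")  # 2: the end gaps are not penalized
```

With "ACGT" against "CG", only the global alignment pays for the unmatched "A" and "T", which is why its score is lower.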

Percent sequence identity

  • PID1: 100 * (identical positions) / (aligned positions + internal gap positions).

  • PID2: 100 * (identical positions) / (aligned positions).

  • PID3: 100 * (identical positions) / (length of the shorter sequence).

  • PID4: 100 * (identical positions) / (average length of the two sequences).
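The four definitions above differ only in the denominator, which a short worked example makes concrete. The helper percent_identity and all counts below are hypothetical, introduced purely for illustration; the package derives these quantities from the actual alignments.

```r
# Hypothetical helper (not part of the package): applies the four
# definitions to counts taken from a single pairwise alignment.
percent_identity <- function(n_identical, n_aligned, n_internal_gaps,
                             len1, len2) {
  c(
    PID1 = 100 * n_identical / (n_aligned + n_internal_gaps),
    PID2 = 100 * n_identical / n_aligned,
    PID3 = 100 * n_identical / min(len1, len2),
    PID4 = 100 * n_identical / mean(c(len1, len2))
  )
}

# Made-up counts: 40 identical residues over 50 aligned positions,
# 10 internal gap positions, and sequence lengths of 55 and 105.
pid <- percent_identity(n_identical = 40, n_aligned = 50,
                        n_internal_gaps = 10, len1 = 55, len2 = 105)
print(round(pid, 2))
```

For these counts PID2 is the largest value (80) and PID4 the smallest (50), so the choice of pid_type can change the reported identity of the same alignment considerably.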

Examples

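## Not run: 
# Read the protein sequences from a FASTA file
fasta <- Biostrings::readAAStringSet("fasta.fa")

# Compute pairwise identities with "overlap" alignments and the PID2
# definition, running the alignments in parallel
pairwise.per <- pairwise_alignment_sequence_identity(
  seqs = fasta,
  aln_type = "overlap",
  pid_type = "PID2",
  allow_parallelization = "multisession"
)

# Plot with the default type, then as a heatmap
plot(pairwise.per)
plot(pairwise.per, type = "heatmap")

## End(Not run)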


currocam/taxa2hmmer documentation built on April 10, 2022, 11:02 a.m.