MultiplePairwiseAlignmentsToOneSubject: Separately align multiple pattern sequences to one subject

View source: R/MultiplePairwiseAlignmentsToOneSubject.R

MultiplePairwiseAlignmentsToOneSubject R Documentation

Separately align multiple pattern sequences to one subject

Description

This function visualises where multiple patterns align on one subject. It aligns each pattern separately with Biostrings::pairwiseAlignment(), obtains the individual alignment boundaries and converts the results to a ggplot object. The function fails if an alignment induces a gap in the subject and at least two pattern alignments overlap at that gap. Setting type = "local-global" avoids indels in the subject; type = "local" cuts both subject and pattern to their best-matching segments.
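The per-pattern step described above can be sketched as follows. This is only an illustration of obtaining alignment boundaries with Biostrings, not the package's actual implementation; the object names `subj`, `pats` and `bounds` are made up for the example. In newer Bioconductor releases the pairwise alignment functions are provided by the pwalign package, so the calls may need to be adapted there.

```r
library(Biostrings)

subj <- DNAString("AAAACCCCTTTTGGGGAACCTTCC")
pats <- DNAStringSet(c(pat1 = "TTCC", pat2 = "CCCC", pat3 = "TTTT"))

# Align each pattern on its own and record where it lands on the subject.
bounds <- do.call(rbind, lapply(names(pats), function(nm) {
  aln <- pairwiseAlignment(pats[[nm]], subj, type = "global-local")
  data.frame(
    pattern = nm,
    start = start(subject(aln)),  # first aligned subject position
    end = end(subject(aln))       # last aligned subject position
  )
}))

print(bounds)
```

A table of this shape (one row per pattern with start and end on the subject) is the kind of input a plotting step would need.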

Usage

MultiplePairwiseAlignmentsToOneSubject(
  subject,
  patterns,
  type = c("global-local", "global", "local", "overlap", "local-global"),
  max_mismatch = NA,
  order_patterns = F,
  fix_subject_indels = F,
  rm_indel_inducing_pattern = F,
  seq_type = NULL,
  return_max_mismatch_info_only = F,
  matches_to_subject_and_pattern = list(c(T, T), c(F, T), c(F, F)),
  compare_seq_df_long_args = list(seq_original = NULL, match_symbol = ".", change_pattern
    = T, pattern_mismatch_as = "base", change_ref = T, ref_mismatch_as = "base",
    insertion_as = "base"),
  pairwiseAlignment_args = list(),
  algnmt_plot_args = list(add_length_suffix = T, pattern_lim_size = 2, verbose = F),
  order_subject_ranges = F
)

Arguments

subject

a named character vector or named DNAStringSet holding exactly one subject sequence (a DNAStringSet can carry a name, a DNAString cannot)

patterns

a named character vector or named DNAStringSet of patterns to align to the subject sequence

type

the alignment type passed to Biostrings::pairwiseAlignment(); not every type may work well with this function, for example when the alignment ranges on the subject overlap

max_mismatch

only use patterns that have at most this number of mismatches with the subject

order_patterns

order patterns by increasing alignment start position

fix_subject_indels

if indels overlap and patterns share subject ranges, cut the respective patterns to avoid indels

seq_type

set the sequence type to AA or NT if necessary; if NULL, the function attempts to guess the type

return_max_mismatch_info_only

only return information on mismatches of patterns with the subject; in this case no alignment is calculated

order_subject_ranges

Value

a list

Examples

## Not run: 
s <- stats::setNames("AAAACCCCTTTTGGGGAACCTTCC", "sub")
s <- Biostrings::DNAStringSet(s)
p <- stats::setNames(c("TTCC", "CCCC", "TTTT", "GGGG", "AAAA"), c("pat1", "pat2", "pat3", "pat4", "pat5"))
p <- Biostrings::DNAStringSet(p)
als <- igsc::MultiplePairwiseAlignmentsToOneSubject(subject = s, patterns = p)
als_ordered <- igsc::MultiplePairwiseAlignmentsToOneSubject(subject = s, patterns = p, order_patterns = TRUE)

## End(Not run)
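
The ordering by start position and the overlap condition mentioned in the Description can be illustrated in base R. This is a minimal sketch, not the package's code: the `ranges` data frame holds hypothetical alignment boundaries, and the column names are assumptions made for the example.

```r
# Hypothetical alignment boundaries on the subject, one row per pattern.
ranges <- data.frame(
  pattern = c("pat1", "pat2", "pat3"),
  start = c(21, 5, 7),
  end = c(24, 8, 12),
  stringsAsFactors = FALSE
)

# Order patterns by increasing start position.
ranges <- ranges[order(ranges$start), ]

# After sorting, a range overlaps an earlier one if it starts at or
# before the largest end position seen so far.
prev_max_end <- c(-Inf, head(cummax(ranges$end), -1))
ranges$overlaps_previous <- ranges$start <= prev_max_end

print(ranges)
```

Patterns flagged in `overlaps_previous` share subject positions with an earlier pattern, which is the situation in which a gap in the subject becomes problematic.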

Close-your-eyes/igsc documentation built on Jan. 28, 2024, 10:28 p.m.