alignSeq: Align multiple sequences

View source: R/alignSeq.R

Description

Perform multiple sequence alignment using one of three methods and output results to the console or as a pdf file. One may perform the alignment of all amino acid or nucleotide sequences in a single sample. Alternatively, one may search for a given sequence within a list of samples using an edit distance threshold.

Usage

alignSeq(list, sample = NULL, sequence = NULL, editDistance = 15,
  output = "console", type = "nucleotide", method = "ClustalOmega")

Arguments

list

A list of data frames consisting of antigen receptor sequences imported by the LymphoSeq function readImmunoSeq.

sample

A character vector indicating the name of the sample in the productive sequence list.

sequence

A character vector of one or more amino acid or nucleotide CDR3 sequences to search.

editDistance

An integer giving the maximum edit distance allowed between the search sequence and a matching sequence. Sequences whose edit distance is less than or equal to this value are included. See details below.

output

A character vector indicating where the multiple sequence alignment should be printed. Options include "console" or "pdf". If "pdf" is selected, the file is saved to the working directory. For "pdf" to work, TeXshade must be installed. Refer to the Bioconductor package msa installation instructions for more details.

type

A character vector indicating whether "aminoAcid" or "nucleotide" sequences should be aligned. If "aminoAcid" is specified, run productiveSeq first.

method

A character vector indicating the multiple sequence alignment method to be used. Refer to the Bioconductor msa package for more details. Options include "ClustalW", "ClustalOmega", and "Muscle".

Details

Edit distance is a way of quantifying how dissimilar two sequences are to one another by counting the minimum number of operations required to transform one sequence into the other. For example, an edit distance of 0 means the sequences are identical and an edit distance of 1 indicates that the sequences differ by a single amino acid or nucleotide.
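
To make the threshold concrete, the sketch below computes edit distances with base R's adist function (from the utils package) and keeps only the candidates within a chosen cutoff. The sequences and the variable names (query, candidates, max.distance) are made up for illustration, and this page does not state how alignSeq computes edit distance internally, so treat this as a demonstration of the concept rather than of the package's implementation.

```r
# Hypothetical amino acid sequences, chosen only for illustration
query <- "CASSLGQ"
candidates <- c("CASSLGQ", "CASSLGR", "CASSPGQYF")

# adist() returns a matrix of generalized Levenshtein distances;
# with one query, it has a single row with one column per candidate
distances <- as.vector(utils::adist(query, candidates))

# Keep candidates whose distance is less than or equal to the cutoff,
# mirroring the role the editDistance argument plays as an upper bound
max.distance <- 1
matches <- candidates[distances <= max.distance]

print(data.frame(candidate = candidates, distance = distances))
print(matches)
```

Here the identical sequence has distance 0, the single substitution has distance 1, and the third candidate (one substitution plus two insertions) has distance 3, so only the first two pass a cutoff of 1.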

Value

Performs a multiple sequence alignment and prints it to the console or saves a pdf to the working directory.

See Also

If you have trouble saving pdf files, refer to the Bioconductor package msa for installation instructions: http://bioconductor.org/packages/release/bioc/vignettes/msa/inst/doc/msa.pdf

Examples

file.path <- system.file("extdata", "IGH_sequencing", package = "LymphoSeq")

file.list <- readImmunoSeq(path = file.path)

productive.nt <- productiveSeq(file.list = file.list, aggregate = "nucleotide")

alignSeq(list = productive.nt, sample = "IGH_MVQ92552A_BL", type = "nucleotide", 
         method = "ClustalW", output = "console")

davidcoffey/LymphoSeq documentation built on Dec. 31, 2019, 9:52 p.m.