alakazam: The Alakazam package

alakazam    R Documentation

The Alakazam package

Description

alakazam is a member of the Immcantation framework of tools and serves five main purposes:

  • Providing core functionality for other R packages in Immcantation. This includes common tasks such as file I/O, basic DNA sequence manipulation, and interacting with V(D)J segment and gene annotations.

  • Providing an R interface for interacting with the output of the pRESTO and Change-O tool suites.

  • Performing clonal abundance and diversity analysis on lymphocyte repertoires.

  • Performing lineage reconstruction on clonal populations of immunoglobulin (Ig) sequences.

  • Performing physicochemical property analyses of lymphocyte receptor sequences.

For additional details regarding the use of the alakazam package see the vignettes:
browseVignettes("alakazam")

File I/O

  • readChangeoDb: Input Change-O style files.

  • writeChangeoDb: Output Change-O style files.
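
As a minimal sketch of the two I/O functions above, the following round-trips a table through a temporary file. It assumes alakazam is installed and uses the ExampleDb data set bundled with the package; the temporary file name is only for illustration.

```r
library(alakazam)

# ExampleDb is the small example data set shipped with alakazam.
data(ExampleDb)

# Write the table to a temporary tab-delimited file, then read it back.
db_file <- tempfile(fileext = ".tsv")
writeChangeoDb(ExampleDb, db_file)
db <- readChangeoDb(db_file)

# The round trip should preserve the number of records.
nrow(db) == nrow(ExampleDb)
```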

Sequence cleaning

  • maskSeqEnds: Mask ragged ends.

  • maskSeqGaps: Mask gap characters.

  • collapseDuplicates: Remove duplicate sequences.
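
Gap masking can be illustrated on two toy sequences. This is a small sketch assuming alakazam is installed; the input strings are made up for the example.

```r
library(alakazam)

# Toy sequences containing the two gap characters "-" and ".".
seqs <- c("ATG-C", "CC..C")

# Replace every gap character with "N".
masked <- maskSeqGaps(seqs, mask_char = "N")
masked
```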

Lineage reconstruction

  • makeChangeoClone: Clean sequences for lineage reconstruction.

  • buildPhylipLineage: Perform lineage reconstruction of Ig sequences.

Lineage topology analysis

  • tableEdges: Tabulate annotation relationships over edges.

  • testEdges: Significance testing of annotation edges.

  • testMRCA: Significance testing of MRCA annotations.

  • summarizeSubtrees: Various summary statistics for subtrees.

  • plotSubtrees: Plot distributions of summary statistics for a population of trees.

Diversity analysis

  • countClones: Calculate clonal abundance.

  • estimateAbundance: Bootstrap clonal abundance curves.

  • alphaDiversity: Generate clonal alpha diversity curves.

  • plotAbundanceCurve: Plot clone size distribution as a rank-abundance curve.

  • plotDiversityCurve: Plot clonal diversity curves.

  • plotDiversityTest: Plot test results at a given Hill diversity index.
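
The diversity functions above are typically chained: count clones, estimate a diversity curve, then plot it. The sketch below assumes alakazam is installed and that the bundled ExampleDb data set carries the sample_id and clone_id columns used by recent releases; the number of bootstrap realizations is kept small only to make the example quick.

```r
library(alakazam)
data(ExampleDb)

# Clone sizes (counts and frequencies) within each sample.
clones <- countClones(ExampleDb, group = "sample_id")
head(clones)

# Bootstrapped alpha diversity curve over diversity orders q = 0, 1, ..., 4.
curve <- alphaDiversity(ExampleDb, group = "sample_id",
                        min_q = 0, max_q = 4, step_q = 1, nboot = 100)

# Plot diversity as a function of q, one line per sample.
plotDiversityCurve(curve, legend_title = "Sample")
```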

Ig and TCR sequence annotation

  • countGenes: Calculate Ig and TCR allele, gene and family usage.

  • extractVRegion: Extract CDR and FWR subsequences.

  • getAllele: Get V(D)J allele names.

  • getGene: Get V(D)J gene names.

  • getFamily: Get V(D)J family names.

  • junctionAlignment: Junction alignment properties.
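
The name-parsing helpers above can be tried on plain strings. This sketch assumes alakazam is installed; the allele calls are illustrative IMGT-style names, and the countGenes call assumes ExampleDb has the v_call and sample_id columns of recent releases.

```r
library(alakazam)

# Two illustrative IMGT-style V segment allele calls.
calls <- c("IGHV1-69*01", "IGHV3-23*04")

alleles  <- getAllele(calls)   # full allele names
genes    <- getGene(calls)     # "IGHV1-69" "IGHV3-23"
families <- getFamily(calls)   # "IGHV1" "IGHV3"

# Gene-level V segment usage per sample in the bundled example data.
data(ExampleDb)
usage <- countGenes(ExampleDb, gene = "v_call", groups = "sample_id", mode = "gene")
head(usage)
```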

Sequence distance calculation

  • seqDist: Calculate Hamming distance between two sequences.

  • seqEqual: Test two sequences for equivalence.

  • pairwiseDist: Calculate a matrix of pairwise Hamming distances for a set of sequences.

  • pairwiseEqual: Calculate a logical matrix of pairwise equivalence for a set of sequences.
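
A short sketch of the distance functions on toy sequences, assuming alakazam is installed and the default DNA distance matrix is used. The sequences are invented for the example.

```r
library(alakazam)

# Hamming distance between two equal-length sequences: one mismatch.
d <- seqDist("ATGGC", "ATGGG")   # 1

# Equivalence tests on the same kind of input.
same <- seqEqual("ATGGC", "ATGGC")   # TRUE
diff <- seqEqual("ATGGC", "ATGGG")   # FALSE

# Pairwise matrices for a named set of sequences.
seqs <- c(A = "ATGGC", B = "ATGGG", C = "ATGGG", D = "AT--C")
dist_mat  <- pairwiseDist(seqs)
equal_mat <- pairwiseEqual(seqs)
dist_mat
```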

Amino acid properties

  • translateDNA: Translate DNA sequences to amino acid sequences.

  • aminoAcidProperties: Calculate various physicochemical properties of amino acid sequences.

  • countPatterns: Count patterns in sequences.
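
Translation and property calculation might be used as follows. This is a sketch assuming alakazam is installed and that ExampleDb has a junction column holding nucleotide sequences; the "cdr3" label and the ten-row subset are arbitrary choices for the example.

```r
library(alakazam)

# Translate a nine-nucleotide sequence (three codons: ACT, GAC, TCG).
aa <- translateDNA("ACTGACTCG")   # "TDS"

# Physicochemical properties of the junction region for a few records.
# nt = TRUE translates the nucleotide input first; trim = TRUE drops the
# first and last codon so that only the CDR3 is scored.
data(ExampleDb)
props <- aminoAcidProperties(ExampleDb[1:10, ], seq = "junction",
                             nt = TRUE, trim = TRUE, label = "cdr3")

# New columns are prefixed with the label, for example cdr3_aa_gravy.
grep("^cdr3_", names(props), value = TRUE)
```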

References

  1. Vander Heiden JA, Yaari G, et al. pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics. 2014;30(13):1930-2.

  2. Stern JNH, Yaari G, Vander Heiden JA, et al. B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes. Sci Transl Med. 2014;6(248):248ra107.

  3. Wu Y-CB, et al. Influence of seasonal exposure to grass pollen on local and peripheral blood IgE repertoires in patients with allergic rhinitis. J Allergy Clin Immunol. 2014;134(3):604-12.

  4. Gupta NT, Vander Heiden JA, et al. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics. 2015;31(20):3356-8.


alakazam documentation built on Sept. 30, 2023, 9:07 a.m.