doBatchCloneAnalysis: Perform phylogenetic inference & tree analysis in a batch

View source: R/doBatchCloneAnalysis.R

doBatchCloneAnalysis    R Documentation

Perform phylogenetic inference & tree analysis in a batch

Description

This function loops through clones, performs IgPhyML phylogenetic tree inference and arborescence tree construction for each, and calculates tree metrics, as detailed in the vignette of this package.

Usage

doBatchCloneAnalysis(
  inputDF,
  outputFolder,
  species = "Human",
  minCloneSize = 3,
  sequence_column = "Sequence",
  cloneID_column = "CloneID",
  label_column = "Seq_ID",
  colourColumn = "Subclass",
  IGHVgeneandallele_column = "V.GENE.and.allele",
  germlineSet = NULL,
  plotFormat = "png",
  phyloTreeType = "simple",
  phyloTreeOptions = NULL,
  makeArboTree = TRUE,
  useTempDir = TRUE
)

Arguments

inputDF

input data.frame containing repertoire data.

outputFolder

character, path for the folder where output files are to be stored.

species

character, species. For now only "Human" is supported and accepted.

minCloneSize

numeric, minimum clone size to be considered for tree inference. (default: 3)

sequence_column

character, column name in inputDF which holds the sequences. (default: "Sequence")

cloneID_column

character, column name in inputDF which holds the clone IDs. (default: "CloneID")

label_column

character, column name in inputDF which holds the sequence identifiers. (default: "Seq_ID")

colourColumn

character, column name in inputDF which is to be categorised by colours in plots. (default: "Subclass")

IGHVgeneandallele_column

character, column name in inputDF which holds annotated germline V gene names. (default: "V.GENE.and.allele")

germlineSet

Optional, filepath pointing to a FASTA file containing germline sequences to be used. (default: NULL)

plotFormat

Either 'png' or 'pdf' (default: 'png')

phyloTreeType

"simple", "dnapars" or "igphyml". "simple" constructs a neighbour-joining tree; "dnapars" constructs a maximum parsimony tree using the PHYLIP "dnapars" program (a local installation of the program is required); "igphyml" constructs a maximum likelihood tree that takes into account mutational hotspot contexts in immunoglobulins, but is the most time-consuming. (default: "simple", as given in Usage)

phyloTreeOptions

A list to be fed as the 'parameter' entry in the treeConstruction argument of the cloneLineage function (see documentation of the cloneLineage function). If NULL, default settings will be used. (default: NULL)

makeArboTree

logical, should the arborescence tree be calculated? (default: TRUE)

useTempDir

logical. If TRUE, a temporary directory is generated and results are written there. (default: TRUE)
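The following is a minimal sketch of how an input table might be checked before it is passed to doBatchCloneAnalysis. It uses base R only. The toy sequences and annotations are made up, the column names are simply the defaults listed in Usage, and treating minCloneSize as an inclusive threshold is an assumption, not something this page confirms. The final call is left commented out because it needs the package, real repertoire data and, for "dnapars", a local PHYLIP installation.

```r
# Toy repertoire table using the default column names from Usage.
# All values are invented for illustration only.
inputDF <- data.frame(
  Seq_ID   = paste0("seq", 1:6),
  CloneID  = c("c1", "c1", "c1", "c2", "c2", "c3"),
  Sequence = c("ACGTACGT", "ACGTACGA", "ACGTTCGA",
               "GGGTACGT", "GGGTACGA", "TTTTACGT"),
  Subclass = c("IgM", "IgG1", "IgG1", "IgM", "IgA1", "IgM"),
  V.GENE.and.allele = c(rep("IGHV1-2*02", 3), rep("IGHV3-23*01", 2),
                        "IGHV4-34*01"),
  stringsAsFactors = FALSE
)

# Check that every column named by the *_column arguments is present.
requiredColumns <- c("Sequence", "CloneID", "Seq_ID", "Subclass",
                     "V.GENE.and.allele")
missingColumns <- setdiff(requiredColumns, colnames(inputDF))
stopifnot(length(missingColumns) == 0)

# Preview which clones would pass the size filter.
# Assumption: clones with at least minCloneSize sequences are kept.
minCloneSize <- 3
cloneSizes <- table(inputDF$CloneID)
eligibleClones <- names(cloneSizes)[cloneSizes >= minCloneSize]

# "dnapars" needs a local PHYLIP installation; fall back to "simple"
# when no dnapars executable is found on the PATH.
phyloTreeType <- if (nzchar(Sys.which("dnapars"))) "dnapars" else "simple"

# Not run: requires the BrepPhylo package and real sequence data.
# results <- doBatchCloneAnalysis(
#   inputDF,
#   outputFolder  = tempdir(),
#   minCloneSize  = minCloneSize,
#   phyloTreeType = phyloTreeType
# )
```

With this toy table only clone "c1" reaches three sequences, so under the stated assumption it is the only clone that would be considered for tree inference.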

Value

A list with one element per clone, named by clone ID. Each element is itself a list with two elements:

distances

data.frame with the distance from germline calculated from the lineage trees for each sequence in the clone. See the vignette for details.

csr_events

data.frame listing all class-switching events and the estimated distance from germline at which each event takes place.
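Because the return value is a nested list keyed by clone ID, it is often convenient to stack the per-clone tables into one data.frame each. The sketch below does this on a mock result that only mimics the documented shape (a list of clones, each holding `distances` and `csr_events`). The column names inside the mock tables (`Seq_ID`, `distFromGermline`, `from`, `to`) and the helper `collectElement` are hypothetical and are not taken from the package.

```r
# Mock result with the documented two-level structure.
# Column names inside the tables are hypothetical placeholders.
results <- list(
  c1 = list(
    distances  = data.frame(Seq_ID = c("seq1", "seq2"),
                            distFromGermline = c(0.01, 0.04)),
    csr_events = data.frame(from = "IgM", to = "IgG1",
                            distFromGermline = 0.02)
  ),
  c2 = list(
    distances  = data.frame(Seq_ID = c("seq4", "seq5", "seq7"),
                            distFromGermline = c(0.00, 0.03, 0.05)),
    csr_events = data.frame(from = "IgM", to = "IgA1",
                            distFromGermline = 0.01)
  )
)

# Hypothetical helper: stack one element ("distances" or "csr_events")
# across all clones, adding the clone ID as the first column.
collectElement <- function(results, element) {
  tables <- lapply(names(results), function(cloneID) {
    tb <- results[[cloneID]][[element]]
    # Skip clones where the table is absent or empty.
    if (is.null(tb) || nrow(tb) == 0) return(NULL)
    data.frame(CloneID = cloneID, tb, stringsAsFactors = FALSE)
  })
  tables <- Filter(Negate(is.null), tables)
  do.call(rbind, tables)
}

allDistances <- collectElement(results, "distances")
allCSR       <- collectElement(results, "csr_events")
```

Applied to a real result, the same helper would work as long as the tables of a given element share the same columns across clones, which is an assumption to verify against the actual output.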


Fraternalilab/BrepPhylo documentation built on Jan. 3, 2025, 10:03 a.m.