fullRepertoire: Simulates full heavy chain antibody repertoires for either...
In AbSim: Time Resolved Simulations of Antibody Repertoires


Simulates full heavy chain antibody repertoires for either humans or mice.

fullRepertoire(max.seq.num, max.timer, SHM.method, baseline.mut,
  SHM.branch.prob, SHM.branch.param, SHM.nuc.prob, species, VDJ.branch.prob,
  proportion.sampled, sample.time, max.tree.num, chain.type, vdj.model,
  vdj.insertion.mean, vdj.insertion.stdv)
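
A call with every argument of the usage above filled in might look like the following minimal sketch. The values are illustrative assumptions, not package defaults. The sketch also assumes that `SHM.branch.prob` names the distribution and `SHM.branch.param` supplies its value(s). The simulation runs only if AbSim is installed.

```r
# Illustrative argument values (assumptions, not package defaults).
args <- list(
  max.seq.num        = 50,          # stop once 50 tips exist ...
  max.timer          = 150,         # ... or after 150 time steps
  SHM.method         = "poisson",   # uniform per-site mutation
  baseline.mut       = 0.0008,
  SHM.branch.prob    = "identical", # assumed: distribution name
  SHM.branch.param   = 0.05,        # assumed: value for that distribution
  SHM.nuc.prob       = 15 / 350,
  species            = "mus",
  VDJ.branch.prob    = 0.5,
  proportion.sampled = 1,
  sample.time        = 50,
  max.tree.num       = 3,
  chain.type         = "heavy",
  vdj.model          = "naive",
  vdj.insertion.mean = 4,
  vdj.insertion.stdv = 2
)

# Run the simulation only when the AbSim package is available.
if (requireNamespace("AbSim", quietly = TRUE)) {
  repertoire <- do.call(AbSim::fullRepertoire, args)
  str(repertoire, max.level = 1)
}
```

Building the argument list first makes it easy to check that all sixteen arguments are supplied before a potentially slow simulation starts.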

`max.seq.num`	The maximum number of tips allowed at the end of the simulation. The simulation will end when either this or the max.timer is reached. Note - this function does not take clonal frequency into account. This parameter resembles the species richness, or the measure of unique sequences in the repertoire.
`max.timer`	The maximum number of time steps allowed during the simulation. The simulation will end when either this or the max.seq.num is reached.
`SHM.method`	The mode of SHM speciation events. Options are "poisson", "data", "motif", "wrc", and "all". Specifying either "poisson" or "naive" results in mutations that can occur anywhere in the heavy chain region, with each nucleotide having an equal probability of a mutation event. Specifying "data" focuses mutation events during SHM on the CDRs (based on IMGT) and increases the probability of transitions (while decreasing the probability of transversions). Specifying "motif" causes neighbor-dependent mutations based on a mutational matrix from high-throughput sequencing data sets (Yaari et al., Frontiers in Immunology, 2013). "wrc" allows only the WRC mutational hotspots to be included (where W equals A or T and R equals A or G). Specifying "all" uses all four types of mutations during SHM branching events, where the weights for each can be specified in the "SHM.nuc.prob" parameter.
`baseline.mut`	Specifies the probability (gamma) for each nucleotide to be mutated between speciation events. These mutations do not cause any branching events. This parameter gives each site a probability to be mutated (in all current sequences) at each time step. Currently these are only Poisson distributed, but future releases will allow for other mutation methods.
`SHM.branch.prob`	Describes the distribution of the probability of undergoing SHM events. This parameter is responsible for describing how likely each sequence is to undergo branching events in the phylogeny. The following options are possible: "identical", "uniform", "exponential" ("exp"), "lognormal" ("lognorm"), "normal" ("norm").
`SHM.branch.param`	Specifies the probability for a given sequence to undergo SHM events (thus, branching events). This parameter corresponds to the distribution specified in "SHM.branch.prob". For "identical", only one value should be supplied. For "uniform", a vector of length 3 should be specified, corresponding to n, min, max respectively (stats::runif(n, min = 0, max = 1)). For "exponential", a single value controlling the rate parameter (from stats::rexp()) should be supplied. For "lognorm", a vector of length two should be supplied, with the first value corresponding to meanlog and the second corresponding to sdlog (from stats::rlnorm). Similarly, for the "normal" distribution, two values corresponding to the mean and standard deviation (respectively) should be supplied.
`SHM.nuc.prob`	Specifies the rate at which nucleotides change during speciation (SHM) events. This parameter depends on the type of mutation specified by SHM.method. For both "poisson" and "data", the input value determines the probability for each site to mutate (the whole sequence for "poisson" and the CDRs for "data"). For either "motif" or "wrc", the number of mutations per speciation event should be specified. Note that these are not probabilities, but the number of mutations that can occur (if the mutation is present in the sequence). If "all" is specified, the input should be a vector where the first element controls the poisson style mutations, second controls the "data", third controls the "motif" and fourth controls the "wrc".
`species`	Either "mus" for C57BL/6 germline genes or "hum" for human germline genes. These genes were taken from IMGT. When more than one allele was present for a given gene, the first was used.
`VDJ.branch.prob`	The probability of a new VDJ recombination event occurring. For the singleLineage function this will result in a branching event at the site of the unmutated germline. For the fullRepertoire function, this will cause a new tree to begin.
`proportion.sampled`	Value between 0 and 1 specifying the proportion of sequences to be sampled at each time point. Specifying 1 indicates that all sequences will be recovered at each time point, whereas 0.5 will sample half of the sequences.
`sample.time`	Integer array indicating the time points at which sampling events should occur.
`max.tree.num`	Integer value describing the maximum number of trees allowed to generate the core sequences of the repertoire. Each of these trees is started by an independent VDJ recombination event.
`chain.type`	String determining whether heavy or light chains should be simulated. Either "heavy" for heavy chains or "light" for light chains. Heavy chains will have V-D-J recombination, whereas light chains will have only V-J recombination.
`vdj.model`	Specifies the model used to simulate V-D-J recombination. Can be either "naive" or "data". "naive" is chain independent and does not differentiate between species. To rely on the default "experimental" options, this should be "data" and the parameter vdj.insertion.mean should be "default". This allows for different mean additions at the VD and JD junctions, which also differ depending on species.
`vdj.insertion.mean`	Integer value describing the mean number of nucleotides to be inserted during simulated V-D-J recombination events. If "default" is entered, the mean will be normally distribut
`vdj.insertion.stdv`	Integer value describing the standard deviation corresponding to insertions of V-D-J recombination. No "default" parameter currently supported but will be updated with future experimental data. This should be a number if using a custom distribution for V-D-J recombination events, but can be "default" if using the "naive" vdj.model or the "data", with vdj.insertion.mean set to "default".
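
The distribution families listed for the SHM branching arguments map onto base R random generators. The sketch below shows one way the supplied values could be turned into per-sequence branching probabilities. `draw_branch_prob` is a hypothetical helper written for illustration, not AbSim's internal code. Clamping to [0, 1] and ignoring the `n` element of the "uniform" vector are assumptions of this sketch.

```r
# Hypothetical helper (not part of AbSim): draw one branching probability
# per sequence from the named distribution family.
draw_branch_prob <- function(dist, param, n_seq) {
  p <- switch(dist,
    identical   = rep(param[1], n_seq),
    # "uniform" takes c(n, min, max); n is ignored here in favour of n_seq.
    uniform     = stats::runif(n_seq, min = param[2], max = param[3]),
    exponential = ,
    exp         = stats::rexp(n_seq, rate = param[1]),
    lognormal   = ,
    lognorm     = stats::rlnorm(n_seq, meanlog = param[1], sdlog = param[2]),
    normal      = ,
    norm        = stats::rnorm(n_seq, mean = param[1], sd = param[2]),
    stop("unknown distribution: ", dist)
  )
  # Assumption: clamp draws so that every value is a valid probability.
  pmin(pmax(p, 0), 1)
}

set.seed(1)
draw_branch_prob("identical", 0.05, n_seq = 4)
draw_branch_prob("uniform", c(4, 0.01, 0.10), n_seq = 4)
draw_branch_prob("lognorm", c(-3, 0.5), n_seq = 4)
```

With "identical", every sequence shares the same branching probability. The other families give each sequence its own draw, so some lineages branch more readily than others.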

Returns a nested list. output[[1]][[1]] is an array of the simulated sequences, and output[[2]][[1]] is an array of names corresponding to each sequence. For example, output[[2]][[1]][1] is the name of the sequence corresponding to output[[1]][[1]][1]. The simulated tree for these sequences is found in output[[3]][[1]]. The length of the output list is determined by the number of sampling points. Thus, if you have two sampling points, output[[4]][[1]] would be a character array holding the sequences recovered at the first sampling point, with output[[5]][[1]] as a character array holding the corresponding names. The sequences recovered at the second sampling point would then be stored at output[[6]][[1]], with the names at output[[7]][[1]]. This nested list was designed for full antibody repertoire simulations and thus may seem unintuitive for the single lineage function. The first sequence and name correspond to the germline sequence that served as the root of the tree. See the vignette for a comprehensive example.
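
The index layout described above can be sketched with a mock object. All sequences and names below are made up, and `get_sample` is a hypothetical accessor rather than part of AbSim. It assumes that sampling point k stores its sequences at position 2 + 2k and its names at the following position, as the description implies for two sampling points.

```r
# Mock object shaped like the described return value for one tree and
# two sampling points (sequences and names are invented for illustration).
output <- list(
  list(c("ACGT", "ACGA", "ACGC")),        # [[1]] final sequences
  list(c("germline", "seq_1", "seq_2")),  # [[2]] names of the final sequences
  list(NULL),                             # [[3]] placeholder for the tree
  list(c("ACGT", "ACGA")),                # [[4]] sequences, sampling point 1
  list(c("germline", "seq_1")),           # [[5]] names, sampling point 1
  list(c("ACGT", "ACGA", "ACGC")),        # [[6]] sequences, sampling point 2
  list(c("germline", "seq_1", "seq_2"))   # [[7]] names, sampling point 2
)

# Hypothetical accessor: named sequences for sampling point k of one tree.
get_sample <- function(output, k, tree = 1) {
  seq_idx <- 2 + 2 * k
  stats::setNames(output[[seq_idx]][[tree]], output[[seq_idx + 1]][[tree]])
}

# Final sequences paired with their names.
final <- stats::setNames(output[[1]][[1]], output[[2]][[1]])
first_sample <- get_sample(output, k = 1)
second_sample <- get_sample(output, k = 2)
```

Pairing each sequence vector with its name vector through `stats::setNames` avoids carrying two parallel indices around in downstream code.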

singleLineage

AbSim documentation built on May 2, 2019, 5:08 a.m.