sim.conthistory: Simulate a continuous stochastic character map

View source: R/make_Contsimmap.R

sim.conthistory    R Documentation

Simulate a continuous stochastic character map

Description

This function creates a continuous stochastic character map (class "contsimmap") by simulating and mapping evolving continuous characters on a given phylogeny.

Usage

sim.conthistory(
  tree,
  ntraits = 1,
  traits = paste0("trait_", seq_len(ntraits)),
  nobs = NULL,
  nsims = 100,
  res = 100,
  X0 = NULL,
  Xsig2 = NULL,
  Ysig2 = NULL,
  mu = NULL,
  verbose = FALSE
)

Arguments

tree

A phylogeny or list of phylogenies with or without mapped discrete characters (classes "phylo", "multiPhylo", "simmap", or "multiSimmap"). Lists of phylogenies with differing discrete character histories are allowed, but lists with differing topologies are currently not supported! I plan to implement "multiContsimmap"-type objects for this purpose in the future.

ntraits

The number of traits to simulate.

traits

A character vector specifying the names of each trait. The <i>th unnamed trait defaults to "trait_<i>".

nobs

A numeric vector specifying the number of observations to simulate per tip/node. Entries are assigned to tips/nodes if named according to tree$tip.label/tree$node.label (node labels default to their numeric index if not provided in tree$node.label). Unnamed/unassigned entries are recycled to "fill in" values for any tips lacking input (but not internal nodes) in order of increasing node index. If no unnamed/unassigned entries are available, tips and internal nodes default to 1 and 0 observations, respectively. To specify differing numbers of observations for each phylogeny/simulation, format nobs instead as a list of vectors or a matrix with each column/column name corresponding to a different node (see the Recycling Behavior section for more details on recycling behavior across phylogenies/simulations).

nsims

The number of simulations to perform. Each simulation will correspond to a separate phylogeny in tree, which is recycled as needed. For example, providing a "multiSimmap" object of length 3 for tree and specifying nsims=7 will result in 7 simulations on phylogenies 1, 2, 3, 1, 2, 3, and 1, in that order.
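The recycling of phylogenies across simulations in the example above can be reproduced with base R's rep_len(). This is only an illustration of the indexing described here, not the package's internal code; the variable names are made up for the example.

```r
# Illustration only: which phylogeny each simulation uses when
# `tree` holds 3 phylogenies and nsims = 7.
ntrees <- 3
nsims <- 7

# Cycle through the phylogeny indices until nsims entries are produced.
tree_index <- rep_len(seq_len(ntrees), nsims)
print(tree_index)
```

The resulting index vector is 1, 2, 3, 1, 2, 3, 1, matching the order given above.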

res

Controls the approximate number of timepoints at which to sample trait values across the entire height of the phylogeny (i.e., from root to last-surviving tip). Higher values result in more densely-sampled character histories but take longer to simulate and use more computer memory.

X0

A numeric vector specifying the starting trait values at the root of the phylogeny. Entries are assigned to traits if named according to traits. Unnamed/unassigned entries are recycled to "fill in" values for any traits lacking input in the same order as given in traits. If no unnamed/unassigned entries are available, defaults to 0. To specify differing starting trait values for each phylogeny/simulation, format X0 instead as a list of vectors or a matrix with each column/column name corresponding to a different trait (see the Recycling Behavior section for more details on recycling behavior across phylogenies/simulations).
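The match-then-fill rule described for X0 can be sketched in base R as follows. The helper fill_by_name() is hypothetical and written only for this illustration; it is not a function of the package and may differ from the actual implementation in edge cases.

```r
# Hypothetical helper (not part of the package): assign named entries of
# `x` to `traits`, then recycle the unnamed/unassigned entries over the
# traits still lacking input, falling back to `default` if none exist.
fill_by_name <- function(x, traits, default = 0) {
  out <- setNames(rep(NA_real_, length(traits)), traits)
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  nms[is.na(nms)] <- ""

  # Step 1: entries whose names match a trait go to that trait.
  matched <- nms %in% traits
  if (any(matched)) out[nms[matched]] <- x[matched]

  # Step 2: remaining entries are recycled, in the order of `traits`.
  lacking <- is.na(out)
  spare <- unname(x[!matched])
  if (any(lacking)) {
    fill <- if (length(spare) > 0) spare else default
    out[lacking] <- rep_len(fill, sum(lacking))
  }
  out
}

traits <- c("trait_1", "trait_2", "trait_3", "trait_4")
X0 <- c(trait_3 = 5, 1, 2)
filled <- fill_by_name(X0, traits)
print(filled)
```

In this sketch, trait_3 takes its named value of 5, while the unnamed values 1 and 2 are recycled over trait_1, trait_2, and trait_4, giving 1, 2, and 1.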

Xsig2

A list of numeric matrices/vectors specifying the evolutionary rate matrices for each discrete character state mapped onto the phylogeny. Evolutionary rate matrices are symmetric matrices with diagonal and off-diagonal entries corresponding to evolutionary rates and covariances, respectively. Vectors are assumed to be diagonal matrices (i.e., no evolutionary covariance among traits). Matrix/vector entries are assigned to pairs of traits based on associated row/column names (plain names in the case of vectors), which are matched to traits. When possible, unspecified matrix entries (NA entries) and row/column names are automatically set to make inputted matrices symmetric. Any remaining unnamed/unassigned rows/columns are recycled as a block-diagonal matrix to "fill in" values for traits completely lacking input in the same order as given in traits. All remaining unspecified rates and covariances default to 1 and 0, respectively. Specifying an invalid variance-covariance matrix (either asymmetric or not positive semidefinite) results in an error. List entries are assigned to discrete character states if named according to the state names given in tree. Any unnamed/unassigned list entries are recycled to "fill in" values for states lacking input in alphabetical order. If no unnamed/unassigned list entries are available, defaults to the identity matrix (i.e., all rates of 1 with no covariance). To specify differing evolutionary rate matrices for each phylogeny/simulation, format Xsig2 instead as a list of lists or a matrix-shaped list with each column/column name corresponding to a different discrete character state (see the Recycling Behavior section for more details on recycling behavior across phylogenies/simulations).
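To check ahead of time whether a rate matrix meets the two validity conditions named above (symmetric and positive semidefinite), something like the following base R sketch can be used. The function is_valid_vcv() and its tolerance are assumptions made for illustration; the package's own validation may use different checks or tolerances.

```r
# Hypothetical check (not the package's own validation code).
is_valid_vcv <- function(m, tol = 1e-8) {
  # Treat a plain vector as the diagonal of a matrix.
  if (!is.matrix(m)) m <- diag(m, nrow = length(m))
  # Symmetry is checked on the values only, ignoring dimnames.
  if (!isSymmetric(unname(m))) return(FALSE)
  # Positive semidefinite: no eigenvalue below zero (up to tolerance).
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev >= -tol)
}

ok_diag <- is_valid_vcv(c(1, 2))                         # rates only
ok_full <- is_valid_vcv(matrix(c(1, 0.5, 0.5, 1), 2, 2)) # with covariance
bad_psd <- is_valid_vcv(matrix(c(1, 2, 2, 1), 2, 2))     # eigenvalues 3, -1
bad_sym <- is_valid_vcv(matrix(c(1, 0.5, 0.2, 1), 2, 2)) # asymmetric
print(c(ok_diag, ok_full, bad_psd, bad_sym))
```

The first two inputs pass; the third fails because a covariance of 2 is too large for rates of 1, and the fourth fails the symmetry check.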

Ysig2

A list of numeric matrices/vectors specifying the intraspecific and/or measurement error for each tip/node in the phylogeny. These are symmetric matrices with diagonal and off-diagonal entries corresponding to the variances and covariances, respectively, of trait measurements for a particular tip/node. Vectors are assumed to be diagonal matrices (i.e., no covariance among trait measurements within a given node). Matrix/vector entries are assigned to pairs of traits based on associated row/column names (plain names in the case of vectors), which are matched to traits. When possible, unspecified matrix entries (NA entries) and row/column names are automatically set to make inputted matrices symmetric. Any remaining unnamed/unassigned rows/columns are recycled as a block-diagonal matrix to "fill in" values for traits completely lacking input in the same order as given in traits. All remaining unspecified matrix entries default to 0. Specifying an invalid variance-covariance matrix (either asymmetric or not positive semidefinite) results in an error. List entries are assigned to tips/nodes if named according to tree$tip.label/tree$node.label (node labels default to their numeric index if not provided in tree$node.label). Any unnamed/unassigned list entries are recycled to "fill in" values for nodes/tips lacking input in order of increasing node index. If no unnamed/unassigned list entries are available, defaults to a matrix of 0s. To specify differing intraspecific/measurement errors for each phylogeny/simulation, format Ysig2 instead as a list of lists or a matrix-shaped list with each column/column name corresponding to a different tip/node (see the Recycling Behavior section for more details on recycling behavior across phylogenies/simulations).

verbose

Recycling Behavior

Each parameter input system (nobs, X0, Xsig2, Ysig2, and mu) has certain idiosyncrasies, but they all follow a similar philosophy. There are three steps:

  1. Match labeled inputs to appropriate nodes/traits/states.

  2. "Fill in" for nodes/traits/states lacking inputs with unlabeled inputs if they exist (recycling the unlabeled inputs as needed) and default values otherwise.

  3. If multiple parameter values are provided, different parameter values will be applied to each phylogeny in tree, recycling as necessary. This creates some interesting quirks when the number of parameter values provided exceeds the number of phylogenies. For example, if tree contains 3 phylogenies and 4 lists of matrices are specified for Xsig2, then simulations on the 2nd and 3rd phylogenies will use the 2nd and 3rd Xsig2 lists, respectively, while simulations on the 1st phylogeny will alternate between the 1st and 4th Xsig2 lists.
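The quirk in step 3 can be reproduced with a few lines of base R. This is only a sketch of the outcome described above (4 parameter sets, 3 phylogenies), assuming 9 simulations and assuming the alternation on the 1st phylogeny starts with the 1st parameter set; it is not the package's internal code, and all variable names are made up for the example.

```r
# Illustration only: 4 parameter sets spread over 3 phylogenies.
ntrees <- 3
nparams <- 4
nsims <- 9  # assumed number of simulations for this example

# Parameter sets are dealt out to phylogenies in turn, so the 4th set
# wraps around and lands on the 1st phylogeny.
param_sets_by_tree <- split(seq_len(nparams),
                            rep_len(seq_len(ntrees), nparams))

# Each simulation uses one phylogeny; simulations on the same phylogeny
# cycle through that phylogeny's parameter sets.
tree_index <- rep_len(seq_len(ntrees), nsims)
param_index <- integer(nsims)
for (j in seq_len(ntrees)) {
  sims_j <- which(tree_index == j)
  param_index[sims_j] <- rep_len(param_sets_by_tree[[j]], length(sims_j))
}
print(data.frame(sim = seq_len(nsims), tree = tree_index, param = param_index))
```

Under these assumptions, simulations 1, 4, and 7 (all on the 1st phylogeny) use parameter sets 1, 4, and 1, while every simulation on the 2nd and 3rd phylogenies uses sets 2 and 3, respectively.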

I tried to make warning messages informative enough to make potentially unexpected recycling behaviors apparent. Users specifying complicated simulations involving varying parameter values can always double-check their inputs were interpreted correctly using get.param.info().


bstaggmartin/contSimmap documentation built on Jan. 26, 2024, 2:09 p.m.