# R/data.R

#' 100 simulated data matrices
#'
#' The 100 matrices simulated by Congreve & Lamsdell (2016) under a
#' heterogeneous Markov-k model on the \link{clReferenceTree} topology,
#' with all branches sharing an equal length.
#'
#' @format
#'  - `clPhyDat`: A list with 100 entries, each comprising a `phyDat` object
#'  of 55 characters for 22 taxa.
#'  - `clMatrices`: A list with 100 entries, each comprising a list of character
#'  tokens for each simulated character, as read from raw nexus files using
#'  `ape::read.nexus.data`.  The four dummy 'characters' have been removed.
#'
#' @encoding UTF-8
#' @references
#' - Congreve, C. R. & Lamsdell, J. C. (2016). Implied weighting and its
#'   utility in palaeontological datasets: a study using modelled phylogenetic
#'   matrices. _Palaeontology_ 59(3), 447--465. \doi{10.1111/pala.12236}.
#' - Congreve, C. R. & Lamsdell, J. C. (2016). Data from: Implied weighting and
#'   its utility in palaeontological datasets: a study using modelled
#'   phylogenetic matrices. Dryad Digital Repository.
#'   \doi{10.5061/dryad.7dq0j}.
#'
#' @source \doi{10.5061/dryad.7dq0j}
"clPhyDat"

#' @rdname clPhyDat
"clMatrices"

#' Tree topology for matrix simulation
#'
#' The tree topology used by Congreve & Lamsdell (2016) to generate the
#' matrices in \code{\link{clMatrices}}.
#'
#' @format A single phylogenetic tree saved as an object of class \code{phylo}.
#'
#' @references
#' - Congreve, C. R. & Lamsdell, J. C. (2016). Implied weighting and its
#'   utility in palaeontological datasets: a study using modelled phylogenetic
#'   matrices. _Palaeontology_ 59(3), 447--465. \doi{10.1111/pala.12236}.
#' - Congreve, C. R. & Lamsdell, J. C. (2016). Data from: Implied weighting and
#'   its utility in palaeontological datasets: a study using modelled
#'   phylogenetic matrices. Dryad Digital Repository.
#'   \doi{10.5061/dryad.7dq0j}.
#'
#' @examples
#' data(clReferenceTree)
#' if (requireNamespace("ape", quietly = TRUE)) plot(clReferenceTree)
#'
#' @source Congreve & Lamsdell (2016).
"clReferenceTree"

#' Congreve and Lamsdell tree distances
#'
#' Distances of the Congreve & Lamsdell trees from the generative tree.
#'
#' For each of the 100 matrices generated by Congreve & Lamsdell (2016), I conducted
#' phylogenetic analysis under different methods:
#'
#' \describe{
#' \item{`Mkv`: }{using the Markov-k model in MrBayes;}
#' \item{`eq`: }{using equal weights in TNT;}
#' \item{`k1`, `k2`, `k3`, `k5`, `kX`: }{using implied weights in TNT,
#'   with the concavity constant (_k_) set to 1, 2, 3, 5, or 10;}
#' \item{`kC`: }{by taking the strict *c*onsensus of all trees recovered by implied
#' weights parsimony analysis under the _k_ values 2, 3, 5 and 10 (but not 1).}
#' }
#'
#' For each analysis, I recorded the strict consensus of all optimal trees, and
#' also the consensus of trees that were suboptimal by a specified degree.
#'
#' I then calculated, of the total number of quartets or partitions that were
#' resolved in the reference tree, how many were the *s*ame or *d*ifferent in
#' the tree that resulted from the phylogenetic analysis, and how many were
#' not resolved in this tree (*r2*).
#'
#' The data object contains a list whose elements are named after the methods,
#' as listed above.
#'
#' Each list entry is a three-dimensional array, whose dimensions are:
#' \enumerate{
#'
#'  \item  The suboptimality of the tree.  Different measures of node
#'      support are employed:
#'
#'          * `Mkv`: Posterior probabilities, at 2.5\% intervals (50\%, 52.5\%, ...,
#'            97.5\%, 100\%).
#'
#'          * `Brem`: Bremer supports. Under equal weights, the consensus of all
#'            trees that are 0, 1, ..., 19, 20 steps less optimal than the optimal
#'            tree; under implied weights, the consensus of all trees that are
#'            `0.73^(19:0)` less optimal than the optimal tree.
#'
#'          * `Boot`: Bootstrap supports (symmetric resampling, _p_ = 0.33).
#'
#'          * `Jack`: Jackknife supports (_p_ = 0.36).
#'
#'            `Boot` and `Jack` results are reported both as the `freq`uency of
#'            splits among replicates, and using the `gc` (Groups Present /
#'            Contradicted) measure (Goloboff _et al_. 2003). Frequency columns
#'            correspond to 100\%, 97.5\%, 95\%, ..., 0\% support; `gc` columns
#'            correspond to 100\%, 95\%, ..., 0\% present, then 5\%, 10\%, ...,
#'            100\% contradicted.
#'
#'   \item  Counts of the condition of each quartet or partition:
#'
#'       * `Q`: The total number of quartets defined on 22 taxa.
#'
#'       * `N`: The total number of partitions present, counting each tree separately.
#'
#'       * `P1`: The number of partitions in tree 1 (the reconstructed tree).
#'
#'       * `P2`: The number of partitions in tree 2 (the generative tree).
#'
#'       * `s`: The number of quartets or partitions resolved identically in
#'              each tree.
#'
#'       * `d`: The number of quartets resolved differently in each tree.
#'
#'       * `d1`: The number of partitions resolved in tree 1, but contradicted by
#'               tree 2.
#'
#'       * `d2`: The number of partitions resolved in tree 2, but contradicted by
#'               tree 1.
#'
#'       * `r1`: The number of partitions or quartets resolved in tree 1 that are
#'               neither present in nor contradicted by tree 2.
#'
#'       * `r2`: The number of partitions or quartets resolved in tree 2 that are
#'               neither present in nor contradicted by tree 1.
#'
#'       * `u`: The number of quartets that are not resolved in either tree.
#'
#'   \item  The number of the matrix, from 1 to 100.
#' }
#'
#' @seealso [clMatrices], [clReferenceTree].
#'
#' @references
#'   Goloboff, P. A., J. S. Farris, M. Källersjö, B. Oxelman, M. J. Ramírez, and
#'    C. A. Szumik. 2003. Improvements to resampling measures of group support.
#'    _Cladistics_ 19, 324--332. \doi{10.1016/S0748-3007(03)00060-4}.
#' @source Congreve, C. R. & Lamsdell, J. C. (2016). Implied weighting and its
#'   utility in palaeontological datasets: a study using modelled phylogenetic
#'    matrices. _Palaeontology_ 59(3), 447--465. \doi{10.1111/pala.12236}.
#'
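#' @examples
#' # A minimal sketch using made-up numbers, NOT the packaged data. It mimics
#' # the three-dimensional layout described above (suboptimality x count x
#' # matrix) to show how the proportion of reference-tree quartets recovered
#' # by each analysis might be computed. The object `toy`, its dimension
#' # names and its values are hypothetical; the real arrays may be labelled
#' # differently.
#' counts <- c("Q", "s", "d", "r1", "r2", "u")
#' toy <- array(0, dim = c(2, length(counts), 3),
#'              dimnames = list(c("0", "1"), counts, NULL))
#' toy[, "Q", ] <- choose(22, 4)  # Number of quartets defined on 22 taxa
#' toy[, "s", ] <- c(6000, 5000, 6100, 5200, 5900, 4800)
#' toy[, "d", ] <- c(1000, 500, 900, 400, 1100, 600)
#' # Assumption: the reference tree resolves every quartet, so r1 = u = 0 and
#' # whatever is neither the same nor different is unresolved (r2).
#' toy[, "r2", ] <- toy[, "Q", ] - toy[, "s", ] - toy[, "d", ]
#'
#' # Of the quartets resolved in the reference tree, the share resolved
#' # identically at each suboptimality level (rows) for each matrix (columns).
#' propSame <- toy[, "s", ] / (toy[, "s", ] + toy[, "d", ] + toy[, "r2", ])
#' propSame
#'
#' # Support thresholds as described for the first dimension. Encoding
#' # 'contradicted' GC values as negative numbers is an illustrative choice.
#' mkvThresholds <- seq(50, 100, by = 2.5)   # Posterior probabilities (%)
#' freqThresholds <- seq(100, 0, by = -2.5)  # Split frequencies (%)
#' gcThresholds <- seq(100, -100, by = -5)   # GC: present down to contradicted
#'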
#' @name clResults
NULL

#' @rdname clResults
"clBremQuartets"

#' @rdname clResults
"clBremPartitions"

#' @rdname clResults
"clMkvPartitions"

#' @rdname clResults
"clMkvQuartets"

#' @rdname clResults
"clBootFreqPartitions"

#' @rdname clResults
"clBootFreqQuartets"

#' @rdname clResults
"clJackFreqPartitions"

#' @rdname clResults
"clJackFreqQuartets"

#' @rdname clResults
"clBootGcPartitions"

#' @rdname clResults
"clBootGcQuartets"

#' @rdname clResults
"clJackGcPartitions"

#' @rdname clResults
"clJackGcQuartets"

#' Consistency indices
#'
#' Consistency indices of Congreve & Lamsdell datasets.
#'
#' @rdname congreveLamsdellMatrices
"clCI"

#' Default colours for analyses.
"clColours"
# ms609/CongreveLamsdell2016 documentation built on March 5, 2024, 9:28 a.m.