p2c2m.complete: Execute the complete P2C2M pipeline via a single command
In P2C2M: Posterior Predictive Checks of Coalescent Models

This function executes the complete P2C2M pipeline from beginning to end.

p2c2m.complete(path = "/home/user/Desktop/", xml.file = "beast.xml", 
    descr.stats = "LCWT,NDC", beast.vers = "1.8", 
    single.allele = c("O"), num.reps = 1, use.sorted = FALSE, 
    use.mpi = FALSE, save.metadata = FALSE, verbose = FALSE, 
    dbg = FALSE)

`path`	the absolute file path to the input directory, specified as a double-quoted string; if `"/home/user/Desktop/"` (the default), then the desktop itself is considered the input directory.
`xml.file`	the name of the BEAUTi-generated and XML-formatted input file, specified as a double-quoted string. The default is `"beast.xml"`.
`descr.stats`	the name(s) of the summary statistic(s) to be applied, specified as a double-quoted string. If multiple statistics are specified, they must be separated by commas. Four summary statistics are currently available: `"COAL"` and `"LCWT"` (both Rannala & Yang 2003), `"GSI"` (Cummings et al. 2008), and `"NDC"` (Maddison 1997). The default is `"LCWT,NDC"`.
`beast.vers`	the version of *BEAST (Heled and Drummond 2010) used to perform the species tree inference, specified as a double-quoted string. Data parsers are located in the subdirectory exec/. Currently, the following parsers are available: `"1.7"` and `"1.8"`. The default is `"1.8"`.
`single.allele`	the name of a species that is represented by only a single allele, specified as a character vector. This setting is useful when defining an outgroup, because a species defined this way does not contribute to the calculation of the summary statistic `"GSI"`. The default is `c("O")`.
`num.reps`	the number of simulation replicates to be conducted, specified as an integer. The default is `1` (i.e., no replication).
`use.sorted`	a logical specifying if the summary statistics generated from the posterior and from the posterior predictive distribution are to be ranked by magnitude prior to the calculation of the differences and the formation of the test distribution. The default is `FALSE`. This argument is only EXPERIMENTAL and should not be selected by regular users.
`use.mpi`	a logical specifying if P2C2M utilizes multiple computer CPUs (if such exist on the system) in order to speed up the calculations. Computations are then executed as parallel processes. The default is `FALSE`.
`save.metadata`	a logical specifying if P2C2M saves the metadata of the analysis to the output variable. The default is `FALSE`.
`verbose`	a logical specifying if P2C2M prints status information to the screen. The default is `FALSE`.
`dbg`	a logical specifying if P2C2M is to be run in a debug mode. If `TRUE`, then only the first 5 percent of input trees are analyzed and information useful for debugging is printed to the screen. Argument `dbg = TRUE` must be set in combination with argument `verbose = TRUE`. The default is `FALSE`. This argument is intended for developers and should not be selected by regular users.
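The constraints stated in the argument descriptions above (statistic names joined by commas, and `dbg = TRUE` only together with `verbose = TRUE`) can be checked before a run is started. The following is a minimal sketch in base R; the helper `build_p2c2m_args` and the variable `run_args` are hypothetical names introduced here for illustration and are not part of P2C2M.

```r
# Hypothetical helper (not part of P2C2M): assemble and check a subset of
# the arguments before they are handed to p2c2m.complete().
build_p2c2m_args <- function(path, xml.file, stats = c("LCWT", "NDC"),
                             verbose = FALSE, dbg = FALSE) {
  # The four statistic names listed in the description of 'descr.stats'
  available <- c("COAL", "GSI", "LCWT", "NDC")
  unknown <- setdiff(stats, available)
  if (length(unknown) > 0) {
    stop("Unknown summary statistic(s): ", paste(unknown, collapse = ", "))
  }
  # 'dbg = TRUE' must be set in combination with 'verbose = TRUE'
  if (dbg && !verbose) {
    stop("dbg = TRUE requires verbose = TRUE")
  }
  list(path = path,
       xml.file = xml.file,
       # Multiple statistics are passed as one comma-separated string
       descr.stats = paste(stats, collapse = ","),
       verbose = verbose,
       dbg = dbg)
}

run_args <- build_p2c2m_args("/home/user/Desktop/", "beast.xml",
                             stats = c("GSI", "NDC"))
print(run_args$descr.stats)
```

Assuming P2C2M is loaded, the resulting list could then be passed on with `do.call(p2c2m.complete, run_args)`; arguments that are not in the list keep their defaults.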

The results of a P2C2M run comprise test statistics, measures of data dispersion, and deviations marked at several quantile levels (analogous to P-values under different alpha levels in a parametric simulation). These are reported for each gene under study and for the sum across all genes.

Michael Gruenstaeudl, Noah Reid

Maintainer: Michael Gruenstaeudl <gruenstaeudl.1@osu.edu>

Cummings, M.P., Neel, M.C. and Shaw, K.L. (2008) A genealogical approach to quantifying lineage divergence. Evolution, 62, 2411–2422.

Gruenstaeudl, M., Reid, N.M., Wheeler, G.R. and Carstens, B.C., submitted. Posterior Predictive Checks of Coalescent Models: P2C2M, an R package.

Heled, J. and Drummond, A.J. (2010) Bayesian inference of species trees from multilocus data. Molecular Biology and Evolution, 27, 570–580.

Maddison, W.P. (1997) Gene trees in species trees. Systematic Biology, 46, 523–536.

Rannala, B. and Yang, Z. (2003) Bayes estimation of species divergence times and ancestral population sizes using DNA sequences from multiple loci. Genetics, 164, 1645–1656.

## Example of the minimal data requirements to run P2C2M

# The absolute path to the input directory is set
inPath <- system.file("extdata", "sim.E.003.small/", package="P2C2M")

# The name of the xml-file generated by BEAUTi and located in 
# "inPath" is set
inFile <- "sim.E.003.small.xml"

# Posterior predictive simulations with a setting of 2 simulation 
# replicates are performed
sim.E.003.small <- p2c2m.complete(inPath, inFile, num.reps=2, save.metadata=TRUE)