NANUQ | R Documentation |
Apply the NANUQ algorithm of Allman, Baños, and Rhodes (2019) to infer a hybridization network from a collection of gene trees, under the level-1 network multispecies coalescent (NMSC) model.
NANUQ(
genedata,
outfile = "NANUQdist",
omit = FALSE,
epsilon = 0,
alpha = 0.05,
beta = 0.95,
taxanames = NULL,
plot = TRUE
)
genedata |
gene tree data that may be supplied in any of 3 forms:
|
outfile |
a character string giving an output file name stub for
saving a |
omit |
|
epsilon |
minimum for branch lengths to be treated as non-zero; ignored if gene tree data given as quartet table |
alpha |
a value or vector of significance levels for judging p-values testing a null hypothesis of no hybridization vs. an alternative of hybridization, for each quartet; a smaller value applies a less conservative test for a tree (more trees), hence a stricter requirement for deciding in favor of hybridization (fewer reticulations) |
beta |
a value or vector of significance levels for judging p-values testing
a null hypothesis of a star tree (polytomy) for each quartet vs. an alternative of anything else; a smaller value applies a less conservative
test for a star tree (more polytomies), hence a stricter requirement for deciding in favor of a resolved tree or network;
if vectors, |
taxanames |
if |
plot |
|
This function
1. counts displayed quartets across gene trees to form quartet count concordance factors (qcCFs),
2. applies appropriate hypothesis tests to judge qcCFs as representing putative hybridization, resolved trees, or unresolved (star) trees, using alpha and beta as significance levels,
3. produces a simplex plot showing results of the hypothesis tests for all qcCFs, and
4. computes the appropriate NANUQ distance table, writing it to a file.
The distance table file can then be opened in the external software SplitsTree (Huson and Bryant 2006; recommended) or within R using the package phangorn to obtain a circular split system under the Neighbor-Net algorithm, which is then depicted as a splits graph. The splits graph should be interpreted via the theory of Allman, Baños, and Rhodes (2019) to infer the level-1 species network, or to conclude that the data do not arise from the NMSC on such a network.
If alpha and beta are vectors, they must have the same length k. Then the i-th entries are paired to produce k plots and k output files. This is equivalent to k calls to NANUQ with scalar values of alpha and beta.
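The pairing of vector entries can be sketched as a loop over scalar calls. This is only an illustration under stated assumptions: the values in `alphas` and `betas` are arbitrary, `pairs` and `results` are hypothetical names, and the call to `NANUQ` is guarded so the snippet still runs where MSCquartets is not installed.

```r
# Arbitrary illustrative significance levels (not recommendations).
alphas <- c(0.01, 0.05)
betas  <- c(0.95, 0.99)
stopifnot(length(alphas) == length(betas))  # both vectors must have length k

# Entry i of alphas is paired with entry i of betas.
pairs <- data.frame(alpha = alphas, beta = betas)

results <- list()
if (requireNamespace("MSCquartets", quietly = TRUE)) {
  data("pTableYeastRokas", package = "MSCquartets")
  for (i in seq_len(nrow(pairs))) {
    # One scalar call per pair; outfile = NULL and plot = FALSE
    # suppress file output and plotting for this sketch.
    results[[i]] <- MSCquartets::NANUQ(pTableYeastRokas,
                                       alpha = pairs$alpha[i],
                                       beta = pairs$beta[i],
                                       outfile = NULL, plot = FALSE)
  }
}
```

Each element of `results` then corresponds to one (alpha, beta) pair, mirroring what a single call with vector arguments is described as producing.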
A call of NANUQ with genedata given as a table previously output from NANUQ is equivalent to a call of NANUQdist. If genedata is a table previously output from quartetTableResolved which lacks columns of p-values for hypothesis tests, these will be appended to the table output by NANUQ.
If plots are produced, each point represents an empirical quartet concordance factor, color-coded to represent test results.
In general, alpha should be chosen to be small and beta to be large so that most quartets are interpreted as resolved trees.
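How alpha and beta jointly sort quartets can be illustrated with a small base-R sketch. Everything here is hypothetical: the p-values are made-up numbers, `classify_quartets` is not a package function, and both the order of the checks (star tree first) and the strictness of the inequalities are assumptions made for illustration rather than a statement of the package's internals.

```r
# Hypothetical p-values for four quartets (made-up numbers, not real data).
p_tree <- c(0.60, 0.001, 0.30, 0.02)    # null: quartet tree, no hybridization
p_star <- c(0.10, 0.0001, 0.99, 0.40)   # null: star tree (polytomy)

# Hypothetical helper, not part of MSCquartets.
classify_quartets <- function(p_tree, p_star, alpha = 0.05, beta = 0.95) {
  # Assumed precedence: failing to reject the star tree is checked first,
  # then rejection of the tree null is read as putative hybridization.
  ifelse(p_star > beta, "star",
         ifelse(p_tree < alpha, "hybridization", "tree"))
}

default_calls  <- classify_quartets(p_tree, p_star)
stricter_calls <- classify_quartets(p_tree, p_star, alpha = 0.01)
```

Under these assumptions, lowering alpha from 0.05 to 0.01 moves the fourth quartet from "hybridization" to "tree", which matches the description of a smaller alpha yielding fewer reticulations.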
Usually, an initial call to NANUQ will not give a good analysis, as values of alpha and beta are likely to need some adjustment based on inspecting the data. Saving the table returned by NANUQ preserves the results of the time-consuming computation of qcCFs, along with p-values, for input to further calls of NANUQ with new choices of alpha and beta.
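A minimal sketch of this save-and-reuse workflow follows. It assumes the returned list has a `$pTable` component as described under the return value, uses base R `saveRDS`/`readRDS` as one possible way to store the table, and picks the new significance levels in `new_levels` arbitrarily; the names `first`, `saved`, and `second` are hypothetical.

```r
# Arbitrary new significance levels to try on the second pass.
new_levels <- list(alpha = 0.01, beta = 0.95)

if (requireNamespace("MSCquartets", quietly = TRUE)) {
  library(MSCquartets)
  data(pTableYeastRokas)

  # First pass: computes the distance table and returns the table of p-values.
  first <- NANUQ(pTableYeastRokas, alpha = 0.05, beta = 0.95,
                 outfile = NULL, plot = FALSE)

  # Store the table so quartets need not be re-tallied in a later session.
  tmp <- tempfile(fileext = ".rds")
  saveRDS(first$pTable, tmp)

  # Later: reload the table and rerun with the adjusted levels.
  saved  <- readRDS(tmp)
  second <- NANUQ(saved, alpha = new_levels$alpha, beta = new_levels$beta,
                  outfile = NULL, plot = FALSE)
}
```

Because the second call starts from a table that already holds qcCFs and p-values, only the distance computation is repeated.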
See the documentation for quartetNetworkDist for an explanation of a small, rarely noticeable, stochastic element of the algorithm.
For data sets of many gene trees, user time may be reduced by using parallel code for counting displayed quartets. See quartetTableParallel, where example commands are given.
a table $pTable of quartets and p-values for judging fit to the MSC on quartet trees, and a distance table $dist, or list of distance tables, giving NANUQ distance (returned invisibly); the table can be used as input to NANUQ or NANUQdist with new choices of alpha and beta, without re-tallying quartets on gene trees; the distance table is to be used as input to NeighborNet.
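Allman ES, Baños H, Rhodes JA (2019). "NANUQ: a method for inferring species networks from gene trees under the coalescent model." Algorithms for Molecular Biology, 14, 24.
Huson DH, Bryant D (2006). "Application of Phylogenetic Networks in Evolutionary Studies." Molecular Biology and Evolution, 23(2), 254-267.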
quartetTable, quartetTableParallel, quartetTableDominant, quartetTreeTestInd, quartetStarTestInd, NANUQdist, quartetTestPlot, pvalHist, quartetNetworkDist
library(MSCquartets)
data(pTableYeastRokas)
out <- NANUQ(pTableYeastRokas, alpha = .05, beta = .95, outfile = NULL)
# Specifying an outfile would write the distance table to it for opening in SplitsTree.
# Alternately, to use the phangorn implementation of NeighborNet
# within R, enter the following additional lines:
library(phangorn)  # provides neighborNet()
nn <- neighborNet(out$dist)
plot(nn, "2D")