overfitSC | R Documentation |
Testing the robustness of search.conv
(Castiglione et al. 2019b) results to sampling effects and
phylogenetic uncertainty.
overfitSC(RR,y.list,phylo.list,s=0.25,
nodes=NULL,state=NULL,declust=FALSE,
aces=NULL,x1=NULL,aces.x1=NULL,cov=NULL,rootV=NULL, clus=0.5)
RR |
an object produced by |
y.list |
a list of multivariate phenotypes related to the phylogenetic
trees provided as |
phylo.list |
a list of phylogenetic trees. The phylogenies in
|
s |
the percentage of tips to be cut off. It is set at 25% by default.
If |
nodes |
the argument |
state |
the argument |
declust |
the argument |
aces |
if used to produce the |
x1 |
the additional predictor to be specified if the RR object has been
created using an additional predictor (i.e. multiple version of
|
aces.x1 |
a named vector of ancestral character values at nodes for
|
cov |
if used to produce the |
rootV |
if used to produce the |
clus |
the proportion of clusters to be used in parallel computing. To
run the single-threaded version of |
Methods using a large number of parameters risk being overfit. This
usually translates into poor fit with data and trees other than those
originally used. With RRphylo methods this risk is usually very low.
However, the user can assess how robust the results of search.conv are by
running resampleTree and overfitSC. The former subsamples the tree according
to the s parameter (the proportion of tips to be removed from the tree) and
alters tree topology by means of swapONE. Once a list of new phylogenetic
trees (phylo.list) is generated, if the shape data fed to search.conv were
reduced (e.g. via SVD), data reduction must be recomputed for each tree, thus
obtaining a list of multivariate phenotypes matched to the phylogenetic trees
(y.list). Finally, overfitSC performs RRphylo and search.conv on each new
pair of tree and data. In this way, both the potential for overfit and
phylogenetic uncertainty are accounted for at once.
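The effect of the s parameter and the need to re-match the data to each new tree can be sketched in base R. This is a simplified illustration, not the code used by resampleTree: the tip labels and the phenotype matrix are made-up stand-ins, tips are dropped purely at random, and rounding s times the number of tips is an assumption about how the count of removed tips is obtained.

```r
# Illustrative stand-ins for a tree's tip labels and a phenotype matrix
# (hypothetical data, not taken from RRphylo).
set.seed(42)
tip_labels <- paste0("sp", 1:20)
y <- matrix(rnorm(20 * 3), nrow = 20,
            dimnames = list(tip_labels, paste0("PC", 1:3)))

s <- 0.25                                # proportion of tips to remove
n_drop <- round(s * length(tip_labels))  # assumed rounding rule; 5 tips here
dropped <- sample(tip_labels, n_drop)    # tips removed at random
kept <- setdiff(tip_labels, dropped)     # tips retained after subsampling

# Phenotypes must be matched to the retained tips before any
# data reduction is repeated on the subsampled dataset.
y_sub <- y[kept, , drop = FALSE]
dim(y_sub)
```

In the real workflow each subsampled tree also has its topology altered, and the matched data are passed through data reduction again, as in the example at the end of this page.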
Otherwise, a list of alternative phylogenies can be supplied to overfitSC.
In this case subsampling and swapping arguments are ignored, and robustness
testing is performed on the alternative topologies as they are. If a clade
has to be tested in search.conv, the function scans each alternative topology
searching for the corresponding clade. If the species within such clade on
the alternative topology differ by more than 10% from the species within the
clade in the original tree, the identity of the clade is considered disrupted
and the test is not performed.
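The 10% rule can be illustrated with a small base R sketch. The page does not state how the difference between the two species sets is measured, so the definition used here (species not shared by both clades, as a fraction of all species found in either) is an assumption, and clade_mismatch and clade_preserved are hypothetical helpers, not RRphylo functions.

```r
# Hypothetical helpers: one plausible reading of the clade-identity check.
clade_mismatch <- function(original, alternative) {
  shared <- intersect(original, alternative)
  # share of species that are NOT common to both clades (assumed measure)
  1 - length(shared) / length(union(original, alternative))
}

clade_preserved <- function(original, alternative, tol = 0.1) {
  clade_mismatch(original, alternative) <= tol
}

orig_clade <- paste0("sp", 1:10)
alt_same   <- paste0("sp", 1:10)               # identical composition
alt_plus1  <- paste0("sp", 1:11)               # one extra species
alt_mixed  <- paste0("sp", c(1:8, 11, 12))     # two lost, two gained

clade_preserved(orig_clade, alt_same)    # no mismatch
clade_preserved(orig_clade, alt_plus1)   # mismatch of 1/11, within tolerance
clade_preserved(orig_clade, alt_mixed)   # mismatch of 4/12, clade disrupted
```

Under this reading, only trees for which the check passes would contribute to the proportion of tested trees reported in the output.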
The function returns a 'RRphyloList' object containing:
$RR.list a 'RRphyloList' including the results of each RRphylo performed
within overfitSC
$SCnode.list a 'RRphyloList' including the results of each search.conv -
clade condition performed within overfitSC
$SCstate.list a 'RRphyloList' including the results of each search.conv -
state condition performed within overfitSC
$conv.results a list including results for search.conv performed under clade
and state conditions. If a node pair is specified within conv.args, the
$clade object contains the percentage of simulations producing significant
p-values for convergence between the clades, and the proportion of tested
trees (i.e. where the clades identity was preserved; always 1 if no
phylo.list is supplied). If a state vector is supplied within conv.args, the
object $state contains the percentage of simulations producing significant
p-values for convergence within (single state) or between states (multiple
states).
The output always has an attribute "Call" which returns an unevaluated call to the function.
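The two summaries reported in $conv.results$clade can be pictured with a toy calculation in base R. The p-values below are invented for illustration only, NA marks a tree where the clade was not tested, and the 0.05 threshold is an assumption; none of this reproduces the function's internal code.

```r
# Made-up p-values from ten hypothetical simulations; NA = clade not tested.
p_values <- c(0.01, 0.03, NA, 0.20, 0.04, 0.002, NA, 0.06, 0.01, 0.03)
alpha <- 0.05                      # assumed significance threshold

tested <- !is.na(p_values)
prop_tested <- mean(tested)                            # share of trees tested
prop_significant <- mean(p_values[tested] <= alpha)    # share significant among tested

c(tested = prop_tested, significant = prop_significant)
```

A high share of significant simulations alongside a high share of tested trees suggests the convergence signal does not depend on a particular sample of tips or a particular topology.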
Silvia Castiglione, Giorgia Girardi, Carmela Serio
Castiglione, S., Serio, C., Tamagnini, D., Melchionna, M., Mondanaro, A., Di Febbraro, M., Profico, A., Piras, P., Barattolo, F., & Raia, P. (2019b). A new, fast method to search for morphological convergence with shape data. PLoS ONE, 14, e0226949. https://doi.org/10.1371/journal.pone.0226949
search.conv vignette; overfit vignette; Alternative-trees vignette
## Not run:
require(RRphylo)
require(phytools)
require(Morpho)
require(ape)
cc<- 2/parallel::detectCores()
DataFelids$treefel->treefel
DataFelids$statefel->statefel
DataFelids$landfel->feldata
# perform data reduction via Procrustes superimposition (in this case) and RRphylo
procSym(feldata)->pcafel
pcafel$PCscores->PCscoresfel
RRphylo(treefel,PCscoresfel,clus=cc)->RRfelids
# apply search.conv under clade and state conditions
search.conv(RR=RRfelids, y=PCscoresfel, min.dim=5, min.dist="time38", clus=cc)->sc.clade.time
search.conv(tree=treefel, y=PCscoresfel, state=statefel, declust=TRUE, clus=cc)->sc.state
# select converging clades returned in sc.clade.time
felnods<-rbind(c(85,155),c(85,145))
## overfitSC routine
# generate a list of subsampled and swapped phylogenies to test for search.conv
# robustness. Use as reference tree the phylogeny returned by RRphylo.
# Set the nodes and the categories under testing as arguments of
# resampleTree so that it retains at least 5 species in each
# clade/state.
treefel.list<-resampleTree(RRfelids$tree,s=0.15,nodes=unique(c(felnods)),categories=statefel,
nsim=15,swap.si=0.1,swap.si2=0.1)
# match the original data with each subsampled-swapped phylogeny in treefel.list
# and repeat data reduction
y.list<-lapply(treefel.list,function(k){
treedataMatch(k,feldata)[[1]]->ynew
procSym(ynew)$PCscores
})
# test for robustness of search.conv results by overfitSC
oSC<-overfitSC(RR=RRfelids,phylo.list=treefel.list,y.list=y.list,
nodes = felnods,state=statefel,clus=cc)
## End(Not run)