EvoWeaver-PPPreds | R Documentation |
EvoWeaver
incorporates four classes of prediction, each with multiple
methods and algorithms. Phylogenetic Profiling (PP) methods examine conservation
of gain/loss events within orthology groups using phylogenetic profiles
constructed from presence/absence patterns.
predict.EvoWeaver
currently supports eight PP methods:
'Jaccard'
'Hamming'
'MutualInformation'
'PAPV'
'CorrGL'
'ProfDCA'
'Behdenna'
'GainLoss'
Most PP methods are compatible with an EvoWeaver
object initialized
with any input type. See EvoWeaver
for more information on input data types.
All of these methods use presence/absence (PA) profiles, which are binary vectors such that 1 implies the corresponding genome has that particular gene, and 0 implies the genome does not have that particular gene.
Methods Hamming
and Jaccard
use Hamming and Jaccard distance
(respectively) of PA profiles to determine overall score.
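As a rough illustration of these two measures, the sketch below computes the Hamming distance and the Jaccard index of two PA profiles in plain Python. The function names and example profiles are hypothetical, and how EvoWeaver converts these quantities into its final score is not shown here.

```python
def hamming_distance(a, b):
    """Count the genomes in which the two PA profiles disagree."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def jaccard_index(a, b):
    """Shared presences divided by genomes in which either gene is present."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumption: two all-absent profiles are given an index of 0.
    return both / either if either else 0.0


# Hypothetical PA profiles over eight genomes (1 = present, 0 = absent).
gene_a = [1, 1, 0, 1, 0, 0, 1, 1]
gene_b = [1, 1, 0, 0, 0, 0, 1, 1]

print(hamming_distance(gene_a, gene_b))   # 1
print(jaccard_index(gene_a, gene_b))      # 0.8
```

The Jaccard distance is one minus the index (0.2 in this example).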
MutualInformation
uses mutual information of PA profiles to determine
score, employing a weighting scheme such that 11
and 00
give
positive information, and 10
and 01
give negative information.
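One possible reading of this weighting scheme is sketched below in Python: each joint state contributes its usual mutual information term, multiplied by +1 for concordant states (11, 00) and by -1 for discordant states (10, 01). The exact weighting used by EvoWeaver may differ; the function name and this sign convention are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def signed_mutual_information(a, b):
    """Mutual information with discordant joint states sign-flipped (assumed scheme)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))      # counts of (x, y) pairs actually observed
    count_a = Counter(a)
    count_b = Counter(b)
    score = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = count_a[x] / n
        p_y = count_b[y] / n
        term = p_xy * log2(p_xy / (p_x * p_y))
        sign = 1 if x == y else -1  # 11/00 are added, 10/01 are subtracted
        score += sign * term
    return score


print(signed_mutual_information([1, 0, 1, 0], [1, 0, 1, 0]))  # 1.0
print(signed_mutual_information([1, 0, 1, 0], [0, 1, 0, 1]))  # -1.0
```

Under this convention, identical profiles score positively and perfectly complementary profiles score negatively, whereas unweighted mutual information would give both the same value.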
PAPV
calculates a p-value for PA profiles using Fisher's Exact Test. The returned score is provided as 1-p_value
so that larger scores indicate more significance, and smaller scores indicate less significance. This rescaling is consistent with the other similarity metrics in EvoWeaver
. This score can be combined with Jaccard
, Hamming
, or MutualInformation
to weight raw scores by statistical significance.
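The 1-p_value rescaling can be illustrated with a small standard-library Python sketch that builds the 2x2 contingency table of two PA profiles and applies a two-sided Fisher's Exact Test. Whether EvoWeaver uses a one-sided or two-sided test is not stated above, so the two-sided choice, like the function names, is an assumption.

```python
from math import comb


def fisher_exact_two_sided(n11, n10, n01, n00):
    """Two-sided Fisher's Exact Test p-value for a 2x2 table of counts."""
    row1 = n11 + n10          # genomes containing gene A
    col1 = n11 + n01          # genomes containing gene B
    n = n11 + n10 + n01 + n00

    def table_prob(k):
        # Hypergeometric probability of k co-presences with fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    p_obs = table_prob(n11)
    # Sum every table that is no more probable than the observed one.
    p = sum(table_prob(k) for k in range(lo, hi + 1)
            if table_prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def one_minus_p_score(a, b):
    """Return 1 - p so that larger values indicate stronger association."""
    pairs = list(zip(a, b))
    n11 = pairs.count((1, 1))
    n10 = pairs.count((1, 0))
    n01 = pairs.count((0, 1))
    n00 = pairs.count((0, 0))
    return 1.0 - fisher_exact_two_sided(n11, n10, n01, n00)


profile = [1, 1, 1, 1, 0, 0, 0, 0]
print(one_minus_p_score(profile, profile))  # about 0.971 (p = 2/70)
```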
ProfDCA
uses the direct coupling analysis algorithm introduced by
Weigt et al. (2009) to determine direct information between PA profiles.
This approach has been validated on PA profiles in Fukunaga and Iwasaki (2022),
though the implementation in EvoWeaver
replaces the persistent contrastive divergence method with the algorithm from
Lokhov et al. (2018) for increased speed and exact solutions. Note that this algorithm is still extremely slow relative to the other methods despite the aforementioned runtime improvements.
Behdenna
implements the method detailed in Behdenna et al. (2016) to
find statistically significant interactions using co-occurrence of gain/loss
events mapped to ancestral states on a species tree. This method requires
a species tree as input. If the EvoWeaver
object is initialized with dendrogram
objects, SuperTree
will be used to infer a species tree.
GainLoss
uses a similar method to Behdenna
. This method uses Fitch Parsimony to infer where events were gained or lost on a species tree, and then looks for distance between these gain/loss events. Unlike Behdenna
, this method takes into account the types of events (e.g., gain/gain and loss/loss are treated differently from gain/loss). This method requires
a species tree as input. If the EvoWeaver
object is initialized with dendrogram
objects, SuperTree
will be used to infer a species tree.
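The event-mapping step can be sketched in Python as follows: Fitch parsimony assigns a presence/absence state to every internal node of a binary species tree, and each edge whose endpoints differ is labelled as a gain or a loss. The tree encoding, the function names, and the tie-breaking rule (ambiguous nodes default to absence) are assumptions for illustration; the subsequent step of measuring distance between events is not shown.

```python
def fitch_states(tree, root, profile):
    """Return one most-parsimonious 0/1 state for every node of a binary tree.

    tree maps each internal node to its list of children; leaves are the
    nodes that do not appear as keys. profile maps each leaf to 0 or 1.
    """
    candidate = {}

    def up(node):
        # Bottom-up pass: intersect child sets, or take the union if disjoint.
        if node not in tree:
            candidate[node] = {profile[node]}
        else:
            child_sets = [up(child) for child in tree[node]]
            common = set.intersection(*child_sets)
            candidate[node] = common if common else set.union(*child_sets)
        return candidate[node]

    up(root)
    state = {}

    def down(node, parent_state):
        # Top-down pass: keep the parent's state when allowed.
        options = candidate[node]
        # Assumption: ties are broken toward absence (0).
        state[node] = parent_state if parent_state in options else min(options)
        for child in tree.get(node, []):
            down(child, state[node])

    down(root, None)
    return state


def gain_loss_events(tree, root, profile):
    """Label each edge (keyed by its child node) where the state changes."""
    state = fitch_states(tree, root, profile)
    events = {}
    for parent, children in tree.items():
        for child in children:
            if state[parent] != state[child]:
                events[child] = "gain" if state[child] == 1 else "loss"
    return events


# Hypothetical four-genome species tree: ((A, B), (C, D)).
tree = {"root": ["n1", "n2"], "n1": ["A", "B"], "n2": ["C", "D"]}
gene_x = {"A": 1, "B": 1, "C": 0, "D": 0}
gene_y = {"A": 1, "B": 1, "C": 1, "D": 0}

print(gain_loss_events(tree, "root", gene_x))  # {'n1': 'gain'}
print(gain_loss_events(tree, "root", gene_y))  # {'D': 'loss'}
```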
CorrGL
infers where events were gained or lost on a species tree as in method
GainLoss
, then uses Pearson's correlation coefficient weighted by p-value to infer similarity.
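The correlation step might look like the Python sketch below, which assumes each gene's inferred events are encoded as one value per branch of the species tree (+1 for a gain, -1 for a loss, 0 for no event). This encoding and the function name are assumptions, and the p-value weighting is omitted because its exact form is not described here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length numeric vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a gene with constant events scores 0
    return cov / sqrt(var_x * var_y)


# Hypothetical per-branch event vectors for two genes on six branches.
events_x = [1, 0, 0, -1, 0, 1]
events_y = [1, 0, 0, -1, 0, 0]

print(pearson(events_x, events_y))
```

Genes that are gained and lost on the same branches yield a coefficient near 1, while genes whose gains coincide with the other's losses yield a coefficient near -1.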
None.
Aidan Lakshman ahl27@pitt.edu
Behdenna, A., et al., Testing for Independence between Evolutionary Processes. Systematic Biology, 2016. 65(5): p. 812-823.
Date, S.V. and E.M. Marcotte, Discovery of uncharacterized cellular systems by genome-wide analysis of functional linkages. Nature Biotechnology, 2003. 21(9): p. 1055-1062.
Fukunaga, T. and W. Iwasaki, Inverse Potts model improves accuracy of phylogenetic profiling. Bioinformatics, 2022.
Lokhov, A.Y., et al., Optimal structure and parameter learning of Ising models. Science Advances, 2018. 4(3): p. e1700791.
Pellegrini, M., et al., Assigning protein function by comparative genome analysis: Protein phylogenetic profiles. Proceedings of the National Academy of Sciences, 1999. 96(8): p. 4285-4288.
Weigt, M., et al., Identification of direct residue contacts in protein-protein interaction by message passing. Proceedings of the National Academy of Sciences, 2009. 106(1): p. 67-72.
EvoWeaver
predict.EvoWeaver
EvoWeaver Phylogenetic Structure Predictors
EvoWeaver Gene Organization Predictors
EvoWeaver Sequence-Level Predictors