ExoLabel is much much faster and does a better job cleaning up when aborted early
ExoLabel now has fewer arguments
ExoLabel to increase computational speed and decrease disk usage.
ExoLabel will no longer crash if given relative paths.
EstimateExoLabel to reflect new changes.
ExoLabel will no longer brick R during sorts on large files.
ExoLabel reports more progress during some lengthy processing sections when verbose=TRUE
ExoLabel now allows an inflation argument to control application of inflation
predict.EvoWeaver now supports returning p-values separately from raw score for some algorithms.
EvoWeaver algorithms that support it has been fixed
RandForest function added to train random forest models
RandForest and DecisionTree objects
DecisionTree objects to plot and coerce to dendrogram
subset.dendrogram
EvoWeaver:
predict.EvoWeaver now returns a data.frame by default
Method arguments are updated to match their names in the associated EvoWeaver manuscript
PhylogeneticProfiling, PhylogeneticStructure, GeneOrganization, SequenceLevel for predict.EvoWeaver
ExoLabel for better status printing
FastLabelOOM function to find communities in graphs/networks on disk space.
PrepareSeqs function, beginning the process of deprecating PairSummaries in favor of more cohesive and user friendly functions.
PhyloDistance causing Method='JRF' to return similarity rather than the distance
TreeDistance.EvoWeaver resulting in an inconsistent calculation of score when using TreeMethods='JRF'
ProtWeaver and ProtWeb have been renamed to EvoWeaver and EvoWeb, respectively
EvoWeaver
EvoWeaver
SelectByK and vignette
PAPV.ProtWeaver to calculate p-values for presence/absence profiles.
ContextTree now uses MirrorTree with species tree correction and p/a overlap correction
predict.ProtWeaver now supports multiple algorithms at once (ex. predict(ew, Method=c("Jaccard", "Hamming")))
ProtWeaver and associated methods has been updated to match recent updates.
FastQFromSRR function added as a convenience wrapper for the SRAtoolkit function fastq-dump.
SuperTree now works directly with dist objects, providing better performance and scaling
simMat objects
NVDC.ProtWeaver
DNAseqs=FALSE argument
MakeBlastDb function to create a BLAST database from R, plus associated documentation updates
ProtWeaver methods
predict.ProtWeaver no longer returns using invisible (this was annoying and unnecessary)
MutualInformation.ProtWeaver removed to allow for parallelization
MirrorTree.ProtWeaver now works correctly with MTCorrection="speciestree"
CorrGL.ProtWeaver now uses Fisher's Exact Test for p-values rather than the R value of Spearman correlation
ProtWeaver almost entirely uses dist objects rather than matrix, saving significantly on memory
Cophenetic function implemented internally
.Call('cophenetic') from DECIPHER to SynExtend to avoid potential namespace issues
BiocCheck::BiocCheck()
BiocCheck
MoransI
ProtWeaver objects
ProtWeaver has new attribute speciesTree, can be initialized with a dendrogram object
SpeciesTree to get species tree from a ProtWeaver object (or compute one, if it doesn't exist)
dendrapply implementation (overloads stats::dendrapply)
HungarianAlgorithm for optimal solving of the linear assignment problem (O(n^3) complexity)
Ancestral.ProtWeaver algorithm for calculating coevolution from correlated residue changes
Ancestral.ProtWeaver
GRF method to be called CI (for Clustering Information Distance)
Method="CI" in PhyloDistance now calculates an approximate p-value using simulated data from Smith (2020)
NVDT using gene sequence Natural Vector with Dinucleotide and Trinucleotide frequency
.Call() not using PACKAGE="SynExtend"
ColocMoran, uses Coloc with MoransI to correct for phylogenetic signal
TranscripMI, uses mutual information of transcriptional direction
MoransI to calculate Moran's I for a set of spatially distributed signals
ShuffleC now supports reproducibility using R's set.seed
ShuffleC now supports sampling with replacement, performance is around 2.25x faster than sample
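As a rough illustration of what seeded sampling with and without replacement involves, here is a minimal Python sketch. It is not the package's C implementation: the function name shuffle_sample and its parameters are hypothetical, and the partial Fisher-Yates shuffle is an assumed technique chosen only for illustration.

```python
import random


def shuffle_sample(items, size=None, replace=False, seed=None):
    """Hypothetical helper: draw `size` elements from `items`.

    A fixed `seed` makes the draw reproducible, which is the behaviour
    the changelog describes for set.seed support.
    """
    rng = random.Random(seed)  # private generator, so the seed alone determines the draw
    pool = list(items)
    n = len(pool)
    size = n if size is None else size
    if replace:
        # With replacement: every draw is an independent uniform index.
        return [pool[rng.randrange(n)] for _ in range(size)]
    if size > n:
        raise ValueError("cannot draw more items than available without replacement")
    # Without replacement: partial Fisher-Yates shuffle, stopping after `size` swaps.
    for i in range(size):
        j = rng.randrange(i, n)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:size]
```

Calling shuffle_sample(range(10), seed=1) twice returns the same permutation, while replace=True allows a sample larger than the input.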
TreeDistance predictor for ProtWeaver, incorporating all tree distance metrics; these metrics are bundled due to some backend optimizations that improve performance
PhyloDistance
MirrorTree predictor to solve memory problems and increase accuracy
sample()
HammingGL changed to CorrGL, now uses Pearson's R weighted by p-value
ShuffleC function to replicate sample functionality with 2-6x speedup
GainLoss now uses bootstrapping to estimate a p-value
PhyloDistance function
rapply instead of dendrapply to avoid stack overflow issues due to R recursion
RFDist function to calculate Robinson-Foulds Distance
GeneralizedRF to make the distance between 0 and 1
GeneralizedRF function to calculate information-theoretic Generalized Robinson-Foulds distance between two dendrograms.
GeneralizedRF metric
GeneralizedRF
DPhyloStatistic function to calculate the D-statistic for a binary state against a phylogeny following Fritz and Purvis (2009).
DPhyloStatistic
GainLoss
ProtWeaver methods Behdenna and GainLoss can now infer a species tree when possible
Jaccard and Hamming methods to use C implementations for distance calculation
HammingGL method to calculate Hamming distance of gain/loss events
ProtWeaver methods relating to subsetting
man pages
flatdendrapply, function was already included in SynExtend
ProtWeaver
SelectByK, function can work as intended, but is still too conservative at false positive removal.
flatdendrapply for more options to apply functions to dendrograms. Function is used in SuperTree.
SuperTree to construct a species tree from a set of gene trees.
SuperTreeEx for SuperTree and flatdendrapply examples.
SelectByK function argument ClusterSelect switched to ClusterScalar. Cluster number selection now performed by fitting sum of total within cluster sum of squares to a right hyperbola and taking the ceiling of the half-max. Scalar allows a user to pick different tolerances to select more or fewer clusters. Plotting behavior updated.
simMat class now supports empty indexing (s[])
simMat class now supports logical accession (s[c(T,F,T),])
SelectByK that allows for quick removal of false positive predicted pairs based on a relatively simple k-means approach. Function is currently designed for use on the single genome-to-genome pairwise comparison, and not on an all-vs-all many genomes scale, though it may provide acceptable results on that scale.
simMat class for dist-like similarity matrices that can be manipulated like base matrices
ProtWeaver internals
simMat objects whenever possible to decrease memory footprint
ContextTree and ProfDCA require matrices internally
ProtWeb objects now inherit from simMat
ProtWeb.show and ProtWeb.print now display predictions in a more natural way
GetProtWebData() deprecated; ProtWeb now inherits as.matrix.simMat and as.data.frame.simMat
simMat class
GetProtWebData documentation page reworked into ProtWeb documentation file.
Method='Hamming' introduced in SynExtend 1.9.9
Method='Hamming'
ProtWeaver to make individual files more manageable
predict.ProtWeaver
BlockReconciliation now returns an object of class PairSummaries.
src/ (originally added by mistake)
ResidueMI.ProtWeaver
predict.ProtWeaver now correctly labels rows/columns with gene names, not numbers
predict.ProtWeaver now correctly handles Subset arguments
predict.ProtWeaver(..., Subset=3) will correctly predict for all pairs involving gene 3 (or for any gene x, as long as Subset is a length 1 character or integer vector).
ProtWeaver
ProtWeaver
GenRearrScen, improves consistency and output formatting
ProtWeaver methods using dendrogram objects
ProtWeaver now correctly guards against non-bifurcating dendrograms in methods that expect it
ProtWeaver class to predict functional association of genes from COGs or gene trees. This implements many algorithms commonly used in the literature, such as MirrorTree and Inverse Potts Models.
predict(ProtWeaverObject) returns a ProtWeb class with information on predicted associations.
BlastSeqs to run BLAST queries on sequences stored as an XStringSet or FASTA file.
ExtractBy function. Methods and inputs simplified and adjusted, and significant improvements to speed.
NucleotideOverlaps to now correctly register hits in genes with a large degree of overlap with the immediately preceding gene.
BlockExpansion where contigs with zero features could cause an error in expansion attempts.
BlockReconciliation now allows for setting either block size or mean PID for reconciliation precedence.
BlockReconciliation.
BlockExpansion cases corrected for zero added rows.
BlockExpansion and BlockReconciliation functions.
DECIPHER's ScoreAlignment function.
PairSummaries function.
BlockExpansion function.
PairSummaries handles default translation tables and GFF derived gene calls.
PairSummaries.
OffSetsAllowed argument now defaults to FALSE. This argument may be dropped in the future in favor of a more complex function post-summary.
SequenceSimilarity
SubSetPairs that allows for easy trimming of predicted pairs based on conflicting predictions and / or prediction statistics.
EstimageGenomeRearrangements that generates rearrangement scenarios of large scale genomic events using the double cut and join model.
SequenceSimilarity and made improvements to runtime in DisjointSet.
PairSummaries where features facing on different strands had their score computed incorrectly.
PairSummaries.
PairSummaries function and minor changes to NucleotideOverlaps, ExtractBy, and FindSets. Adjustments to the model that PairSummaries calls on to predict PIDs.
ExtractBy function has been added. Allows extraction of feature sequences into XStringSets organized by a PairSummaries object or the single linkage clusters implied by pairings within the PairSummaries objects.
DisjointSet function added to extract single linkage clusters from a PairSummaries object.
PairSummaries now computes 4-mer distance between predicted pairs.
PairSummaries now returns a column titled Adjacent that provides the number of directly adjacent neighbor pairs to a predicted pair. Gap filling code adjusted.
FindSets has been added and performs single linkage clustering on a pairs list as represented by vectors of integers using the Union-Find algorithm. Long term this function will have a larger wrapper function for user ease of access but will remain exposed.
NucleotideOverlap now passes its GeneCalls object forward, allowing PairSummaries to forego inclusion of that object as an argument.
PairSummaries now allows users to fill in specific matching gaps in blocks of predicted pairs with the arguments AllowGaps and OffSetsAllowed.
PairSummaries and NucleotideOverlap.
PairSummaries adjusted.
AcceptContigNames, but ensuring that the correct contigs in GeneCalls objects are matched to the appropriate contigs in Synteny objects is then the user's responsibility.
PairSummaries now translates sequences based on transl_table attributes provided by gene calls
PairSummaries now uses a generic model for predicting PID
gffToDataFrame now parses out the transl_table attribute
NucleotideOverlap
PairSummaries - can now take in objects of class Genes built by the DECIPHER function FindGenes()
SynExtend submitted to Bioconductor
gffToDataFrame
NucleotideOverlap
PairSummaries
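The FindSets and DisjointSet entries above describe single linkage clustering of a pairs list with the Union-Find algorithm. A minimal Python sketch of that idea follows. It is an illustration under assumptions (0-based integer node IDs, a hypothetical function name single_linkage_clusters), not the package's implementation.

```python
def single_linkage_clusters(n, pairs):
    """Hypothetical helper: group nodes 0..n-1 into clusters.

    Two nodes share a cluster when a chain of pairs connects them,
    which is what single linkage over a pairs list amounts to.
    """
    parent = list(range(n))  # each node starts as its own root
    size = [1] * n           # tree sizes, used for union by size

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return  # already in the same cluster
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra  # attach the smaller tree under the larger one
        size[ra] += size[rb]

    for a, b in pairs:
        union(a, b)

    clusters = {}
    for x in range(n):
        clusters.setdefault(find(x), []).append(x)
    return sorted(clusters.values())
```

With six nodes and the pairs (0, 1), (1, 2) and (3, 4), the sketch groups nodes 0 to 2 together, nodes 3 and 4 together, and leaves node 5 on its own.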