View source: R/PairSummaries.R
PairSummaries | R Documentation |
Takes in a “LinkedPairs” object and gene calls, and returns a data.frame of paired features.
PairSummaries(SyntenyLinks,
              DBPATH,
              PIDs = FALSE,
              Score = FALSE,
              IgnoreDefaultStringSet = FALSE,
              Verbose = FALSE,
              Model = "Generic",
              DefaultTranslationTable = "11",
              AcceptContigNames = TRUE,
              OffSetsAllowed = NULL,
              Storage = 1,
              ...)
SyntenyLinks |
A |
DBPATH |
A SQLite connection object or a character string specifying the path to the database file constructed from DECIPHER's |
PIDs |
Logical indicating whether to provide a PID for each pair. If |
Score |
Logical indicating whether to provide a length normalized score with DECIPHER's |
IgnoreDefaultStringSet |
Logical indicating alignment type preferences. If |
Verbose |
Logical indicating whether or not to display a progress bar and print the time difference upon completion. |
Model |
A character string specifying a model to use to predict PIDs without performing an alignment. By default this argument is “Generic”, specifying a generic PID prediction model based on PIDs computed from a randomly selected set of genomes. Currently no other models are included. Users may also supply their own model of type “glm”, in the form of an RData file, if they so desire. This model will need to take in some, or all, of the columns of statistics per pair that PairSummaries supplies. |
DefaultTranslationTable |
A character used to set the default translation table for |
AcceptContigNames |
Match names of contigs between gene calls object and synteny object. Where relevant, the first white space and everything following are removed from contig names. If |
OffSetsAllowed |
Defaults to |
Storage |
Numeric indicating the approximate size a user wishes to allow for holding |
... |
Arguments to be passed to |
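The Model argument above accepts a user-supplied model of type “glm” stored in an RData file. The following is a minimal sketch of fitting and saving such a model, not the package's own procedure. The training data are invented, the choice of predictors (Consensus and TetDist) and of the quasibinomial family is an assumption, and the object name fit is hypothetical: this page does not state which object name or which columns PairSummaries expects.

```r
# Invented training data: two of the per-pair statistics that
# PairSummaries reports, plus an observed PID on an assumed 0 to 1 scale.
set.seed(1)
train <- data.frame(Consensus = runif(50),
                    TetDist = runif(50, min = 0, max = 0.1))
train$PID <- plogis(-1 + 3 * train$Consensus - 10 * train$TetDist +
                      rnorm(50, sd = 0.2))

# Assumption: a quasibinomial GLM is a reasonable choice for a
# response bounded between 0 and 1.
fit <- glm(PID ~ Consensus + TetDist,
           family = quasibinomial(link = "logit"),
           data = train)

# Save the fitted model to an RData file (hypothetical object name 'fit').
model_file <- tempfile(fileext = ".RData")
save(fit, file = model_file)

# Predict a PID for a new, invented pair.
pred <- predict(fit,
                newdata = data.frame(Consensus = 0.9, TetDist = 0.02),
                type = "response")
```

Whether the saved file can be passed directly as Model depends on details not given here, so check the package source before relying on this layout.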
The LinkedPairs object generated by NucleotideOverlap is a container for raw data that describes possible orthologous relationships; however, ultimate assignment of orthology is up to user discretion. PairSummaries generates a clear table with relevant statistics for a user to work with as they choose. The option to align all pairs, though onerous, can allow users to apply a hard threshold to predictions by PID, while built-in models can allow more expedient thresholding from predicted PIDs.
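Applying a hard threshold by PID amounts to an ordinary row subset of the returned data.frame. The sketch below uses a small invented table in place of real PairSummaries output. The cutoff value is hypothetical, and the 0 to 1 scale for PID is an assumption.

```r
# Toy stand-in for PairSummaries() output; all values are invented.
pairs <- data.frame(p1 = c("1_1_10", "1_1_11", "1_2_3"),
                    p2 = c("2_1_12", "2_1_15", "2_3_7"),
                    PID = c(0.92, 0.41, 0.67),
                    stringsAsFactors = FALSE)

# Hypothetical cutoff; choose a value appropriate to your data.
pid_cutoff <- 0.6

# Keep only the pairs whose PID meets the cutoff.
kept <- pairs[pairs$PID >= pid_cutoff, ]
```

The same subset could be written against a “PredictedPID” column when PIDs are predicted from a model rather than computed by alignment.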
A data.frame of class “data.frame” and “PairSummaries” of paired genes that are connected by syntenic hits. Contains columns describing the k-mers that link the pair. Columns “p1” and “p2” give the location ids of the genes in the pair in the form “DatabaseIdentifier_ContigIdentifier_GeneIdentifier”. “ExactMatch” provides an integer representing the exact number of nucleotides contained in the linking k-mers. “TotalKmers” provides an integer describing the number of distinct k-mers linking the pair. “MaxKmer” provides an integer describing the largest k-mer that links the pair. A column titled “Consensus” provides a value between 0 and 1 indicating whether the k-mers that link a pair of features are in the same position in each feature, with 1 indicating they are in exactly the same position and 0 indicating they are in as different a position as is possible. The “Adjacent” column provides an integer value ranging between 0 and 2 denoting whether a feature pair's direct neighbors are also paired. Gap-filled pairs neither have neighbors nor are included as neighbors. The “TetDist” column provides the Euclidean distance between the frequencies of oligonucleotides of size 4 in the two sequences of a predicted pair. “PIDType” provides a character vector with values of “NT” where either of the pair indicates it is not a translatable sequence or “AA” where both sequences are translatable. If users choose to perform pairwise alignments there will be a “PID” column providing a numeric describing the percent identity between the two sequences. If users choose to predict PIDs using their own, or a provided model, a “PredictedPID” column will be provided.
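The “p1” and “p2” identifiers can be split back into their three components with base R. This is a minimal sketch on invented identifiers. It assumes each component is an integer index and that no component itself contains an underscore, which this page does not guarantee.

```r
# Invented identifiers in the documented
# "DatabaseIdentifier_ContigIdentifier_GeneIdentifier" form.
ids <- c("1_1_10", "1_2_3")

# Split on the literal underscore and stack the pieces into a matrix,
# one row per identifier.
parts <- do.call(rbind, strsplit(ids, "_", fixed = TRUE))

# Column names here are hypothetical labels chosen for readability.
parsed <- data.frame(Database = as.integer(parts[, 1]),
                     Contig = as.integer(parts[, 2]),
                     Gene = as.integer(parts[, 3]))
```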
Nicholas Cooley npc19@pitt.edu
FindSynteny, Synteny-class, NucleotideOverlap
# this function will be deprecated soon,
# please see the new SummarizePairs() function.
DBPATH <- system.file("extdata",
                      "Endosymbionts_v02.sqlite",
                      package = "SynExtend")
data("Endosymbionts_LinkedFeatures", package = "SynExtend")
Pairs <- PairSummaries(SyntenyLinks = Endosymbionts_LinkedFeatures,
                       PIDs = FALSE,
                       DBPATH = DBPATH,
                       Verbose = TRUE)