SummarizePairs: Provide summaries of hypothetical orthologs.

SummarizePairs    R Documentation

Provide summaries of hypothetical orthologs.

Description

Given a LinkedPairs object and a DECIPHER database, return a data.frame of summarized genomic feature pairs. SummarizePairs collects all the linked genomic features in the supplied LinkedPairs-class object and returns descriptions of the alignments of those features.

Usage

SummarizePairs(SynExtendObject,
               DataBase01,
               IncludeIndexSearch = TRUE,
               AlignmentFun = "AlignPairs",
               RetainAnchors = TRUE,
               DefaultTranslationTable = "11",
               KmerSize = 5,
               IgnoreDefaultStringSet = FALSE,
               Verbose = FALSE,
               ShowPlot = FALSE,
               Processors = 1,
               Storage = 2,
               IndexParams = list("K" = 6),
               SearchParams = list("perPatternLimit" = 1),
               ...)

Arguments

SynExtendObject

An object of class LinkedPairs-class.

DataBase01

A character string pointing to a SQLite database, or a connection to a DECIPHER database.

IncludeIndexSearch

A logical determining whether to include SearchIndex results in the initial inference.

AlignmentFun

A character string specifying a DECIPHER alignment function. Currently only AlignProfiles and AlignPairs are supported.

RetainAnchors

A logical that only affects AlignPairs. If TRUE, the k-mer hits supplied by FindSynteny are provided as alignment anchors.

DefaultTranslationTable

A character vector of length 1 identifying the translation table to use if one is not supplied in the GeneCalls attribute.

KmerSize

An integer specifying the k-mer size used when computing the k-mer distance between sequences.

IgnoreDefaultStringSet

Logical indicating whether to ignore the default string set and handle all sequences in nucleotide space.

Verbose

Logical indicating whether or not to display a progress bar and print the time difference upon completion.

ShowPlot

Logical indicating whether or not to provide a plot of features collected by the function. Currently not implemented.

Processors

An integer value indicating how many processors to supply to AlignPairs. Supplying NULL will cause detection and use of all available cores.

Storage

A soft memory limit, in Gb, on how much sequence data from the database is retained in memory while running.

IndexParams

Arguments to be passed to IndexSeqs.

SearchParams

Arguments to be passed to SearchIndex.

...

Additional arguments to pass to interior functions. Currently not implemented.
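
The list-valued arguments (IndexParams, SearchParams) and the NULL behavior of Processors described above can be illustrated with a small base R sketch. This is not SynExtend's implementation: resolve_processors and build_index are hypothetical helpers written only for this illustration, and build_index is a stand-in for a function such as IndexSeqs that accepts named tuning arguments.

```r
# Hypothetical helper: mimic "NULL means use all available cores".
resolve_processors <- function(Processors = 1) {
  if (is.null(Processors)) {
    n <- parallel::detectCores()
    # detectCores() can return NA when the count is unknown
    if (is.na(n)) 1L else as.integer(n)
  } else {
    as.integer(Processors)
  }
}

# Hypothetical stand-in for a function that takes named tuning arguments.
build_index <- function(x, K = 8, ...) {
  list(n = length(x), K = K, extra = list(...))
}

# A named list in the same shape as the IndexParams default.
IndexParams <- list("K" = 6)

# do.call() spreads the list entries out as named arguments.
idx <- do.call(build_index,
               c(list(x = c("ACGT", "TTGA")),
                 IndexParams))
idx$K
resolve_processors(2)
```

Passing tuning options as a named list keeps the outer function's signature short while still allowing any argument of the inner function to be set.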

Details

SummarizePairs collects features describing each linked feature pair. These include an alignment PID, an alignment score, a k-mer distance, a consensus score for the linking hits (whether or not linking hits are in similar places in each feature), and a few other features.
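
To make two of these features concrete, the following base R sketch computes a percent identity from a pair of already-aligned strings and a simple k-mer distance between two unaligned strings. The exact formulas used by SummarizePairs are not stated here, so both definitions are assumptions chosen for illustration, and pid, kmers_of, and kmer_distance are hypothetical names.

```r
# Percent identity over aligned columns; "-" marks a gap.
# Assumption: columns where both sequences have a gap are dropped,
# and the denominator is the number of remaining columns.
pid <- function(a, b) {
  x <- strsplit(a, "", fixed = TRUE)[[1]]
  y <- strsplit(b, "", fixed = TRUE)[[1]]
  stopifnot(length(x) == length(y))
  keep <- !(x == "-" & y == "-")
  matches <- x == y & x != "-"
  sum(matches & keep) / sum(keep)
}

# All overlapping k-mers of a single string.
kmers_of <- function(s, k) {
  n <- nchar(s)
  if (n < k) return(character(0))
  starts <- seq_len(n - k + 1)
  substring(s, starts, starts + k - 1)
}

# Assumption: distance = 1 - (shared k-mer count / k-mer count of the
# shorter sequence), where shared counts respect multiplicity.
kmer_distance <- function(a, b, k = 5) {
  ka <- kmers_of(a, k)
  kb <- kmers_of(b, k)
  if (length(ka) == 0 || length(kb) == 0) return(NA_real_)
  lev <- union(ka, kb)
  ca <- as.integer(table(factor(ka, levels = lev)))
  cb <- as.integer(table(factor(kb, levels = lev)))
  1 - sum(pmin(ca, cb)) / min(length(ka), length(kb))
}

pid("ACGT-A", "ACGTTA")
kmer_distance("ACGTAC", "ACGTTT", k = 3)
```

A k-mer distance needs no alignment, which makes it a cheap complement to the alignment-based PID and score.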

Value

An object of class PairSummaries.

Author(s)

Nicholas Cooley npc19@pitt.edu

See Also

PrepareSeqs, NucleotideOverlap, FindSynteny, LinkedPairs-class

Examples

library(RSQLite)
DBPATH <- system.file("extdata",
                      "Endosymbionts_v02.sqlite",
                      package = "SynExtend")
# Work on a temporary copy so the packaged database is not modified
tmp <- tempfile()
file.copy(from = DBPATH,
          to = tmp)
DBCONN <- dbConnect(SQLite(), tmp)

data("Endosymbionts_LinkedFeatures", package = "SynExtend")
PrepareSeqs(SynExtendObject = Endosymbionts_LinkedFeatures,
            DataBase01 = DBCONN,
            Verbose = TRUE)
SummarizedPairs <- SummarizePairs(SynExtendObject = Endosymbionts_LinkedFeatures,
                                  DataBase01 = DBCONN,
                                  Verbose = TRUE)
dbDisconnect(DBCONN)
unlink(tmp)
                           

npcooley/SynExtend documentation built on Dec. 20, 2024, 4:03 p.m.