NucleotideOverlap: Tabulating Pairs of Genomic Sequences

NucleotideOverlap R Documentation

Tabulating Pairs of Genomic Sequences

Description

A function for concisely tabulating where genomic features are connected by syntenic hits.

Usage

NucleotideOverlap(SyntenyObject,
                  GeneCalls,
                  LimitIndex = FALSE,
                  AcceptContigNames = TRUE,
                  Verbose = FALSE)

Arguments

SyntenyObject

An object of class “Synteny” built by the FindSynteny function in the DECIPHER package.

GeneCalls

A named list of objects of class “DFrame” built by gffToDataFrame, objects of class “GRanges” imported with rtracklayer::import, or objects of class “Genes” created by the DECIPHER function FindGenes. “DFrame” objects built by gffToDataFrame can be used directly, and “Genes” objects generated by FindGenes behave equivalently to them. “GRanges” objects may also be used, with limited functionality: supplying a “GRanges” object forces all alignments to be nucleotide alignments and forces LimitIndex to TRUE.

LimitIndex

Logical indicating whether to limit which indices in the synteny object are queried. FALSE by default; when TRUE, only the first sequence of each selected identifier is used. LimitIndex can be used to skip analysis of plasmids or to query only a single chromosome.

AcceptContigNames

Logical indicating whether to match contig names between the gene calls object and the synteny object. Where relevant, the first white space and everything following it are removed from contig names. If TRUE, NucleotideOverlap assumes that the contigs at each position in the synteny object and the “GeneCalls” object are in the same order. Automatically set to TRUE when “GeneCalls” are of class “GRanges”.

Verbose

Logical indicating whether or not to display a progress bar and print the time difference upon completion.
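
The contig-name trimming described under AcceptContigNames can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's internal code: the helper trim_contig_name and the contig names below are hypothetical and exist only for illustration.

```r
# Hypothetical helper: drop the first white space and everything after it.
# This mirrors the behaviour described for AcceptContigNames; it is not
# SynExtend's own implementation.
trim_contig_name <- function(x) {
  sub("\\s.*$", "", x)
}

# Invented contig names, in the style of FASTA headers and GFF sequence ids.
synteny_names <- c("NC_000001.1 Example organism chromosome, complete genome",
                   "NC_000002.1 Example organism plasmid pX")
genecall_names <- c("NC_000001.1", "NC_000002.1")

# After trimming, the two sets of names can be compared directly.
trimmed <- trim_contig_name(synteny_names)

# Position of each gene-call contig among the trimmed synteny names.
matched <- match(genecall_names, trimmed)
```

Trimming at the first white space means a long FASTA-style header and a short accession-only identifier reduce to the same string, which is what makes name-based matching possible in this sketch.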

Details

Builds a matrix of lists that contain information about linked pairs of genomic features.

Value

An object of class “LinkedPairs”, which is fundamentally a list in the form of a matrix. The lower triangle is populated with matrices that contain all k-mer hits from the “Synteny” object that link features from the “GeneCalls” object. The upper triangle is populated with matrices that summarize those hits by feature. The diagonal is populated with named vectors of contig lengths, much like in the “Synteny” object. The “LinkedPairs” object also carries a “GeneCalls” attribute that holds the user-supplied features in a slightly trimmed-down form, so gene calls need to be supplied only once and do not have to be passed again to the “PairSummaries” function.
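
The "list in the form of a matrix" layout can be mocked up in base R to show where each kind of element sits. This is a sketch only: every value is invented, the column names are placeholders rather than the columns SynExtend actually produces, and the object is a plain list matrix without the “LinkedPairs” class or the “GeneCalls” attribute.

```r
# A base-R mock of the layout for three genomes; all contents are invented.
genomes <- c("1", "2", "3")
mock <- matrix(vector("list", 9),
               nrow = 3, ncol = 3,
               dimnames = list(genomes, genomes))

# Diagonal: a named vector of contig lengths for each genome.
for (i in seq_along(genomes)) {
  mock[[i, i]] <- c(contig1 = 1000L * i)
}

for (i in 1:2) {
  for (j in (i + 1):3) {
    # Lower triangle (row > column): one row per individual hit.
    # Placeholder columns; four made-up hits per genome pair.
    mock[[j, i]] <- matrix(0L, nrow = 4, ncol = 2,
                           dimnames = list(NULL, c("FeatureA", "FeatureB")))
    # Upper triangle (row < column): the same hits summarized per feature
    # pair. Placeholder columns; two made-up feature pairs per genome pair.
    mock[[i, j]] <- matrix(0L, nrow = 2, ncol = 2,
                           dimnames = list(NULL, c("FeatureA", "FeatureB")))
  }
}

# Which cells hold matrices? Off-diagonal cells do, diagonal cells do not.
is_matrix <- matrix(vapply(mock, is.matrix, logical(1)), nrow = 3)
```

In this mock, mock[[2, 1]] holds the individual hits between genomes 1 and 2, mock[[1, 2]] holds their per-feature summary, and mock[[1, 1]] holds the contig lengths for genome 1.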

Author(s)

Nicholas Cooley npc19@pitt.edu

See Also

FindSynteny, Synteny-class

Examples

data("Endosymbionts_GeneCalls", package = "SynExtend")
data("Endosymbionts_Synteny", package = "SynExtend")

Links <- NucleotideOverlap(SyntenyObject = Endosymbionts_Synteny,
                           GeneCalls = Endosymbionts_GeneCalls,
                           LimitIndex = FALSE,
                           Verbose = TRUE)

npcooley/SynExtend documentation built on Nov. 15, 2024, 3:02 p.m.