Description
To assess the quality of a set of (predicted) genes for a genome, evidence must first be mapped to that genome.
Next, each gene must be categorized based on the strength of the evidence for or against it. Class Assessment
furnishes objects that store the information needed to assess a set of genes for a genome and also provides
functions for viewing and visualizing assessment information. Specifically, class Assessment objects use proteomic hits
and evolutionarily conserved start and stop codons as evidence to determine the correctness of each gene in a given set.
DataMap Objects
Objects of class Assessment and subclass DataMap are used to store the mapping of proteomics and evolutionary
conservation evidence to the genome of interest (the central genome). They are generated by the function MapAssessmentData,
and they have a list structure containing the following elements:
StrainID: Equal to strainID if it was specified; otherwise ""
Species: Equal to speciesName if it was specified; otherwise ""
GenomeLength: Length of the central genome
StopsByFrame: Where the stops are in each frame, used to bound open reading frames in downstream functions
N-TermProteomics: Logical describing whether or not the proteomics hits are from N-terminal proteomics
FwdProtHits: Proteomic hit information that maps to the three forward frames of the central genome
RevProtHits: Proteomic hit information that maps to the three reverse frames of the central genome
FwdCoverage: Coverage of the forward strand of the central genome
FwdConStarts: Start codon conservation of the forward strand of the central genome
FwdConStops: Stop codon conservation of the forward strand of the central genome
RevCoverage: Coverage of the reverse strand of the central genome
RevConStarts: Start codon conservation of the reverse strand of the central genome
RevConStops: Stop codon conservation of the reverse strand of the central genome
NumRelatedGenomes: Final number of related genomes that were mapped to the central genome
HasProteomics: Logical describing whether or not proteomics evidence has been mapped to the central genome
HasConservation: Logical describing whether or not evolutionary conservation evidence has been mapped to the central genome
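As a rough illustration of how the list structure above can be inspected, the sketch below builds a stand-in object by hand. Real DataMap objects come from MapAssessmentData; here only a subset of the documented elements is filled in, every value is made up, and the order of the class vector is an assumption.

```r
# Hypothetical stand-in for a DataMap object (real ones come from MapAssessmentData).
# Only some documented elements are included, and all values are invented.
mapObj <- structure(
  list(
    StrainID = "",
    Species = "",
    GenomeLength = 1000L,
    NumRelatedGenomes = 12L,
    HasProteomics = TRUE,
    HasConservation = FALSE
  ),
  class = c("Assessment", "DataMap")  # class order is an assumption
)

# Collect the two logical flags and report which kinds of evidence were mapped.
evidence <- c(proteomics = mapObj$HasProteomics,
              conservation = mapObj$HasConservation)
available <- names(evidence)[evidence]
cat("Evidence mapped:", paste(available, collapse = ", "), "\n")
```

Checking HasProteomics and HasConservation first is a cheap way to avoid reading hit or conservation elements that were never populated.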
Results Objects
Objects of class Assessment and subclass Results are used to store how correct a set of genes is for a given genome.
The function AssessGenes generates Results objects from a DataMap object and information on a set of genes
for the genome corresponding to that DataMap object. Results objects have a list structure containing the following
elements:
StrainID: Equal to the strainID of the corresponding DataMap object
Species: Equal to the speciesName of the corresponding DataMap object
GenomeLength: Length of the genome
GeneLeftPos: Left positions of the given set of genes (in forward strand terms)
GeneRightPos: Right positions of the given set of genes (in forward strand terms)
GeneStrand: Strand information of the given set of genes ("+" or "-")
GeneSource: The source of the given set of genes
NumGenes: Number of genes given
N_CS-_PE+_ORFs: Data for open reading frames with no gene start but with proteomics evidence
N_CS<_PE+_ORFs: Data for open reading frames with no gene start but with proteomics evidence and at least one valid evolutionarily conserved start
CategoryAssignments: A character vector that stores the category assignment for each of the given genes, in the same order as the gene information (please see below for a list of all possible categories, their descriptions, and their character string codes)
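Because the per-gene elements share one gene order, they can be lined up side by side. The sketch below does this with a hand-built stand-in for a Results object. Real objects come from AssessGenes; the values here are invented, the class order is an assumption, and the length calculation assumes that left and right positions are inclusive.

```r
# Hypothetical stand-in for a Results object (real ones come from AssessGenes).
# All values are invented for illustration.
resObj <- structure(
  list(
    GenomeLength = 5000L,
    GeneLeftPos = c(101L, 900L, 2500L),
    GeneRightPos = c(700L, 1499L, 3099L),
    GeneStrand = c("+", "-", "+"),
    NumGenes = 3L,
    CategoryAssignments = c("Y CS+ PE+", "Y CS- PE-", "Y CS+ PE+")
  ),
  class = c("Assessment", "Results")  # class order is an assumption
)

# One row per gene; the vectors are documented as sharing the same gene order.
genes <- data.frame(
  left = resObj$GeneLeftPos,
  right = resObj$GeneRightPos,
  strand = resObj$GeneStrand,
  category = resObj$CategoryAssignments,
  stringsAsFactors = FALSE
)

# Gene length, assuming both end positions are inclusive.
genes$length <- genes$right - genes$left + 1L

# Count how many genes fall into each category.
categoryCounts <- table(genes$category)
print(categoryCounts)
```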
Gene Categories
The CategoryAssignments vector in Results objects describes how the proteomics evidence and evolutionarily
conserved start/stop codon evidence support or contradict the corresponding set of genes. In the vector, each gene is assigned
a character string code that has the following format: "Y CS[_] PE[_]". The first part, "Y", signifies that this ORF
contains a predicted gene. The second part, "CS[_]", describes how the conserved start(s) line up with the given gene
start. The third part, "PE[_]", describes how the proteomics hits line up with the given gene start.
Y CS+ PE+: There is a good conserved start aligned with the gene start with protein evidence downstream.
Y CS+ PE-: There is a good conserved start aligned with the gene start without protein evidence downstream.
Y CS- PE+: There is no good conserved start aligned with the predicted start, and there is protein evidence downstream of the gene start.
Y CS- PE-: There is no good conserved start aligned with the predicted start, and there is no protein evidence downstream of the gene start.
Y CS! PE-: There are either multiple good conserved stops in the middle of the gene, or the most downstream, good conserved stop is followed by a good conserved start. There is no protein evidence downstream of the gene start.
Y CS! PE+: The most downstream, good conserved stop is followed by a good conserved start, and there is protein evidence downstream of the gene start.
Y CS< PE!: The protein evidence disagrees with/is upstream of the gene start, and there is a good conserved start upstream of the protein evidence.
Y CS- PE!: The protein evidence disagrees with/is upstream of the gene start, and there is no good conserved start upstream of the protein evidence.
Y CS> PE+: The best conserved starts are downstream of the predicted start, and there is protein evidence downstream of the gene start.
Y CS> PE-: The best conserved starts are downstream of the predicted start, and there is no protein evidence downstream of the gene start.
Y CS< PE+: At least one of the best conserved starts is upstream of the predicted start, and there is protein evidence downstream of the gene start.
Y CS< PE-: At least one of the best conserved starts is upstream of the predicted start, and there is no protein evidence downstream of the gene start.
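The "Y CS[_] PE[_]" format can be split into its conserved-start and proteomics symbols with a regular expression. The helper below, parseCategory, is a hypothetical name and is not part of the package; it is a minimal sketch that assumes the codes are written exactly as listed above, with single spaces between the three parts.

```r
# Hypothetical helper (not part of the package): split category codes such as
# "Y CS< PE!" into their CS and PE symbols. Codes that do not match get NA.
parseCategory <- function(codes) {
  # CS takes one of - + ! < >, and PE takes one of - + !, per the list above.
  pattern <- "^Y CS([-+!<>]) PE([-+!])$"
  ok <- grepl(pattern, codes)
  data.frame(
    code = codes,
    CS = ifelse(ok, sub(pattern, "\\1", codes), NA_character_),
    PE = ifelse(ok, sub(pattern, "\\2", codes), NA_character_),
    stringsAsFactors = FALSE
  )
}

parsed <- parseCategory(c("Y CS+ PE+", "Y CS< PE!", "Y CS> PE-"))
print(parsed)
```

Splitting the code this way makes it straightforward to group genes by only one kind of evidence, for example all genes whose proteomics symbol is "!".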
S3 Methods
as.matrix.Assessment (only works with objects of class Results)
print.Assessment
plot.Assessment
mosaicplot.Assessment (only works with objects of class Results)
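The note that some methods only work with Results objects can be pictured with ordinary S3 dispatch: a method registered for class Assessment can check the subclass before proceeding. The sketch below uses a hypothetical generic, summarizeGenes, so that it does not mask the package's own methods. It is not the package's implementation, and the stand-in objects and their class order are assumptions.

```r
# Hypothetical generic and method, illustrating a subclass check inside a
# method that dispatches on class Assessment. Not the package's own code.
summarizeGenes <- function(x, ...) UseMethod("summarizeGenes")

summarizeGenes.Assessment <- function(x, ...) {
  if (!inherits(x, "Results")) {
    stop("this method only works with objects of subclass Results")
  }
  table(x$CategoryAssignments)
}

# Hand-built stand-ins with invented values; class order is an assumption.
resObj <- structure(list(CategoryAssignments = c("Y CS+ PE+", "Y CS- PE-")),
                    class = c("Assessment", "Results"))
mapObj <- structure(list(GenomeLength = 1000L),
                    class = c("Assessment", "DataMap"))

# Works for the Results stand-in, fails for the DataMap stand-in.
counts <- summarizeGenes(resObj)
failed <- inherits(try(summarizeGenes(mapObj), silent = TRUE), "try-error")
```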