CompareAssessmentResults: Compare Assessment Results

View source: R/CompareAssessmentResults.R

Description

Compare two objects of class Assessment, subclass Results, to determine how their gene sets and the corresponding category assignments vary.

Usage

CompareAssessmentResults(obj1,
                         obj2,
                         printSummary = TRUE,
                         returnDetails = FALSE)

Arguments

obj1, obj2

Objects of class Assessment and subclass Results to compare against each other. Alternatively, either obj1 or obj2 (or both) can be a two-element character vector that specifies one such object from AssessORFData. The first element of the vector should be the strain identifier, and the second element should be the gene source identifier. Both objects should have been generated from the same mapping object.

printSummary

Logical indicating whether or not to print out a summary of the comparison analysis.

returnDetails

Logical indicating whether or not to return a list of details from the comparison analysis. See the next section for what items are in the list.

Details

Since the same mapping object (an object of class Assessment and subclass DataMap) can be used to assess multiple sets of genes for a genome, it is meaningful to compare how those gene sets and their category assignments from AssessGenes vary from one another. To make describing this function easier, let us assume that one set of genes consists of the complete set of predictions made by a gene-finding program on a particular strain's genome and that the other set of genes consists of the complete set of predictions from a second gene-finding program.

When gene-finding programs predict genes for a genome, they decide which regions of the genome code for proteins. There is (usually) only one option for the stop codon that ends a particular coding region, but there are typically multiple options available for the start codon that marks the beginning of a region. It is therefore useful to find out which (general) coding regions the two programs agree on by determining which stops are found in both sets of predicted genes. From there, the starts each program picked for those shared coding regions can be compared to see whether they agree or not. If the same start is chosen by both programs for a particular shared stop / coding region, then that is an example of a gene predicted by both programs. If the starts chosen by the two programs for a particular shared stop / coding region are different, then that is an example of both programs agreeing that that particular region of the genome codes for protein but disagreeing on where in the genome that region starts. In that case, it is informative to see what category was assigned to each start, especially if one start has evolutionary conservation and the other does not.
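The stop-then-start comparison described above can be illustrated with a small base R sketch. This is not the package's implementation: the gene coordinates are made up, the data frames genes1 and genes2 are hypothetical stand-ins for the two gene sets, and the sketch assumes a single strand with each stop appearing at most once per set.

```r
# Toy gene sets: one row per predicted gene, identified by its start and
# stop positions. All coordinates are invented for illustration.
genes1 <- data.frame(start = c(100, 500, 900, 1500),
                     stop  = c(400, 800, 1300, 1900))
genes2 <- data.frame(start = c(100, 560, 900, 2100),
                     stop  = c(400, 800, 1300, 2600))

# Shared coding regions: stops that appear in both gene sets.
# After the merge, start1 and start2 hold the start chosen in each set.
shared <- merge(genes1, genes2, by = "stop", suffixes = c("1", "2"))

# Within the shared regions, check whether the two starts agree.
sameStart <- shared$start1 == shared$start2

numSharedStops <- nrow(shared)    # coding regions both sets agree on
numSharedGenes <- sum(sameStart)  # same stop and same start
numDiffStarts  <- sum(!sameStart) # same stop, different start

cat("Shared coding regions (stops):", numSharedStops, "\n")
cat("Shared genes (same start and stop):", numSharedGenes, "\n")
cat("Shared stop, different start:", numDiffStarts, "\n")
```

With these toy coordinates, three stops are shared; two of those regions have matching starts and one does not.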

This function compares the set of genes and corresponding category assignments in the object specified by obj1 (object 1) to the set of genes and corresponding category assignments in the object specified by obj2 (object 2). It then reports the results of the comparison analysis in the format specified by the logical parameters printSummary and returnDetails.

If printSummary is true, the function prints out the following information: the number of shared coding regions (i.e., the number of stops in both gene sets), the number of shared genes (i.e., the number of times both a start and its corresponding stop are found in both sets), and the number of instances where a stop is found in both gene sets but the corresponding starts in each set disagree. For the shared stop - different start set, the function also prints the number of instances where the start from one object has conservation evidence while the corresponding start in the other object does not.

If returnDetails is true, the function returns an 11-item list. Each item of the list is described below. The contents of the object 1 and object 2 gene vectors correspond to the ordering of the genes inside object 1 or object 2, respectively. For the category assignment matrix for the shared stop - different start set, it is possible for the gene in object 1 to be assigned to the same category as the corresponding gene from object 2, and the table reflects that.
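To make the idea of a category assignment matrix concrete, here is a minimal base R sketch that cross-tabulates categories for a shared stop - different start set. It is only an illustration: the labels "conserved" and "notConserved" are hypothetical placeholders rather than the category names AssessORF assigns, and the vectors cat1 and cat2 are invented data, not items from the returned list.

```r
# Hypothetical category labels for five shared-stop, different-start genes.
# Element i of cat1 and element i of cat2 describe the same coding region.
catLevels <- c("conserved", "notConserved")
cat1 <- factor(c("conserved", "notConserved", "conserved",
                 "notConserved", "conserved"), levels = catLevels)
cat2 <- factor(c("notConserved", "notConserved", "conserved",
                 "conserved", "notConserved"), levels = catLevels)

# Rows: category of the start in object 1.
# Columns: category of the start in object 2.
catTable <- table(object1 = cat1, object2 = cat2)
print(catTable)

# Diagonal cells: both starts were assigned the same category.
numSameCategory <- sum(diag(catTable))

# Off-diagonal cells: conservation evidence for one start but not the other.
numOnlyObj1 <- catTable["conserved", "notConserved"]
numOnlyObj2 <- catTable["notConserved", "conserved"]
```

The diagonal of such a table is where the gene from object 1 and the corresponding gene from object 2 fall into the same category, which is why the diagonal can be non-zero even though the starts differ.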

Please ensure that obj1 and obj2 come from the same strain / mapping object. The function will do its best to make sure the identifying information for obj1 and obj2 match.

printSummary and returnDetails cannot both be FALSE.

Value

If returnDetails is true, the function returns an 11-item list. Otherwise, the function invisibly returns object 1.

See Also

Assessment-class, AssessGenes

Examples

## Example showing how to use the function with the AssessORFData package:

## Not run: 
compare1 <- CompareAssessmentResults(obj1 = c("MGAS5005", "Prodigal"),
                                     obj2 = c("MGAS5005", "GeneMarkS2"),
                                     printSummary = TRUE,
                                     returnDetails = TRUE)

## End(Not run)

resObj1 <- readRDS(system.file("extdata",
                               "MGAS5005_PreSaved_ResultsObj_Prodigal.rds",
                               package = "AssessORF"))
                               
resObj2 <- readRDS(system.file("extdata",
                               "MGAS5005_PreSaved_ResultsObj_GeneMarkS2.rds",
                               package = "AssessORF"))
                               
compare2 <- CompareAssessmentResults(obj1 = resObj1,
                                     obj2 = resObj2,
                                     printSummary = TRUE,
                                     returnDetails = TRUE)

DRK248/AssessORF documentation built on Jan. 30, 2020, 7:05 p.m.