getANNOTATION: General annotation function



Description

Takes a vector of identifiers and an annotation table, looks up each identifier in the table, and retrieves the corresponding annotation. Only the first occurrence of each identifier in the annotation table is considered.

Usage

getANNOTATION(identifier, annot, diagnose = FALSE, identifierCol = 1, annotationCol = 15, noAnnotationSymbol = NA, noAnnotationProvidedSymbol = "---", sep = " /// ")

Arguments

identifier

vector containing identifiers to be annotated.

annot

annotation table (data frame) where each row is a record and each column is an annotation field.

diagnose

logical. If TRUE, three logical vectors used for diagnostic purposes are returned in addition to the annotation. If FALSE (default), only the annotation is returned.

identifierCol

column in annotation table where the provided identifiers are to be looked up.

annotationCol

column in annotation table containing the desired annotation.

noAnnotationSymbol

character string to be used in output list 'annotation' if no annotation is found or provided.

noAnnotationProvidedSymbol

character string used in the annotation table to indicate missing annotation.

sep

character string used in the annotation table to separate multiple annotations of a single identifier.

Details

The annotation is returned as elements of list 'annotation'. If a single annotation is given for a particular identifier, the corresponding element of 'annotation' has length 1. If multiple annotations are provided for a single identifier (i.e. a character string in which 'sep' separates the individual annotations), the string is split and the resulting vector is returned as the corresponding element of list 'annotation'.
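The lookup-and-split behaviour described above can be approximated in base R. The following is only a minimal sketch, not the package's actual implementation: the vectors 'ids', 'annot' and 'query' are hypothetical stand-ins for the identifier column, the annotation column and the input identifiers.

```r
# Hypothetical stand-ins for two columns of an annotation table
ids   <- c("probe_1", "probe_2", "probe_2")
annot <- c("sequence_1", "sequence_2a /// sequence_2c", "ignored_duplicate")

# Identifiers to annotate
query <- c("probe_2", "probe_1")

# match() returns the position of the first occurrence only,
# so the duplicate entry for probe_2 is never consulted
hits <- annot[match(query, ids)]

# fixed = TRUE treats the separator as a literal string, not a regular expression
result <- strsplit(hits, " /// ", fixed = TRUE)
print(result)
```

The first element of 'result' has length 2 because the annotation string contained the separator, while the second has length 1.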

Value

annotation

list of length 'length(identifier)', the 'i'-th element of which contains the annotation for 'identifier[i]'.

empty

logical vector of length 'length(identifier)'. 'empty[i]' is TRUE if 'identifier[i]' is empty or NA.

noentry

logical vector of length 'length(identifier)'. 'noentry[i]' is TRUE if 'identifier[i]' cannot be found in 'annot[,identifierCol]'.

noannotation

logical vector of length 'length(identifier)'. 'noannotation[i]' is TRUE if the annotation found for 'identifier[i]' equals 'noAnnotationProvidedSymbol'.
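To illustrate how the three diagnostic vectors differ, here is a rough base R sketch that derives comparable flags by hand. It is an assumption about the logic, written to be consistent with the descriptions above, and not the package's own code; 'ids', 'annot' and 'missingSymbol' are hypothetical names.

```r
# Hypothetical input identifiers and annotation columns
identifier    <- c("probe_2", "probe_4", "probe_100", NA)
ids           <- c("probe_1", "probe_2", "probe_4")
annot         <- c("seq_1", "seq_2", "---")
missingSymbol <- "---"  # plays the role of noAnnotationProvidedSymbol

# TRUE where the identifier itself is empty or NA
empty <- is.na(identifier) | identifier == ""

# Position of the first matching row, NA if there is none
pos <- match(identifier, ids)

# TRUE where a non-empty identifier has no row in the table
noentry <- !empty & is.na(pos)

# TRUE where a row was found but it only holds the missing-annotation symbol
found <- annot[pos]
noannotation <- !is.na(found) & found == missingSymbol

print(rbind(empty, noentry, noannotation))
```

Under these assumptions each identifier that fails to yield an annotation is flagged by exactly one of the three vectors, which is what makes them useful for tracing the cause of a failure.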

Note

Use getMULTIANNOTATION if the identifiers occur on more than one line in the annotation table.

Author(s)

Alexandre Kuhn

References

Kuhn et al. Cross-species and cross-platform gene expression studies with the Bioconductor-compliant R package 'annotationTools'. BMC Bioinformatics, 9:26 (2008)

See Also

getMULTIANNOTATION

Examples

##example annotation table
annotation <- cbind(
  gene     = c('gene_1a, gene_1b', 'gene_2', 'gene_3', 'gene_4'),
  probe    = c('probe_1', 'probe_2', 'probe_3', 'probe_4'),
  sequence = c('sequence_1', 'sequence_2a, sequence_2c', 'sequence_3', '')
)
print(annotation)

##get sequences for probe_2, probe_3, probe_4, probe_100 and a missing (NA) identifier
myProbes<-c('probe_2','probe_3','probe_4','probe_100',NA)
getANNOTATION(myProbes,annotation,identifierCol=2,annotationCol=3,noAnnotationProvidedSymbol='',sep=', ')

##track origin of annotation failure for the last 3 probes
getANNOTATION(myProbes,annotation,identifierCol=2,annotationCol=3,noAnnotationProvidedSymbol='',sep=', ',diagnose=TRUE)

Example output

     gene               probe     sequence                  
[1,] "gene_1a, gene_1b" "probe_1" "sequence_1"              
[2,] "gene_2"           "probe_2" "sequence_2a, sequence_2c"
[3,] "gene_3"           "probe_3" "sequence_3"              
[4,] "gene_4"           "probe_4" ""                        
Warning: one or more empty identifers in input
Warning: one or more identifers not found in annotation
[[1]]
[1] "sequence_2a" "sequence_2c"

[[2]]
[1] "sequence_3"

[[3]]
[1] NA

[[4]]
[1] NA

[[5]]
[1] NA

Warning message:
In getANNOTATION(myProbes, annotation, identifierCol = 2, annotationCol = 3,  :
  One or more identifers with no annotation provided
Warning: one or more empty identifers in input
Warning: one or more identifers not found in annotation
[[1]]
[[1]][[1]]
[1] "sequence_2a" "sequence_2c"

[[1]][[2]]
[1] "sequence_3"

[[1]][[3]]
[1] NA

[[1]][[4]]
[1] NA

[[1]][[5]]
[1] NA


[[2]]
[1] FALSE FALSE FALSE FALSE  TRUE

[[3]]
[1] FALSE FALSE FALSE  TRUE FALSE

[[4]]
[1] FALSE FALSE  TRUE FALSE FALSE

Warning message:
In getANNOTATION(myProbes, annotation, identifierCol = 2, annotationCol = 3,  :
  One or more identifers with no annotation provided

annotationTools documentation built on Nov. 8, 2020, 6:58 p.m.