Takes a vector of gene IDs (or identifiers of another type) and an annotation table, and looks up each gene ID in the table to retrieve the corresponding probe set identifiers. A gene ID can occur multiple times (i.e. on multiple lines) in the annotation table.
getPROBESET(geneid, annot, uniqueID = FALSE, diagnose = FALSE, idCol = 19, noPSsymbol = NA, noPSprovidedSymbol = "---")
geneid | character vector containing the gene IDs.
annot | annotation table (data frame) where each row is a record and each column is an annotation field.
uniqueID | logical. If TRUE, only probe set IDs annotated with a single gene ID are returned. If FALSE, probe set IDs annotated with multiple gene IDs are returned too.
diagnose | logical. If TRUE, 3 (logical) vectors used for diagnostic purpose are returned in addition to the annotation. If FALSE (default) only the annotation is returned.
idCol | column in annotation table containing the gene identifiers.
noPSsymbol | character string to be used in output list 'ps' if no probe set ID is found or provided in the annotation table.
noPSprovidedSymbol | character string used in annotation table and indicating missing probe set ID.
This function can be used with Affymetrix annotation files (e.g. 'HG-U133_Plus_2_annot.csv'). It retrieves the probe set IDs corresponding to particular gene identifiers. By default, the function takes gene IDs, but any type of identifier (e.g. gene symbol) can be used (set 'idCol' accordingly).
Probe set IDs are returned as elements of list 'ps'. If multiple probe set IDs are found for 'geneid[i]', a vector containing all of them is returned as the 'i'-th element of list 'ps'.
The default values for 'idCol', 'noPSsymbol', and 'noPSprovidedSymbol' are chosen to suit the format of Affymetrix annotation files. However, options can be set to look up any annotation table, provided the probe set identifiers are in the first column.
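The lookup described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper `lookup_probesets`, the toy table, and its identifiers are hypothetical, and the sketch assumes that multiple genes in one annotation field are separated by " /// ", as in Affymetrix CSV files.

```r
# Toy annotation table (hypothetical data): probe set IDs in column 1,
# gene symbols in column 2. " /// " is assumed to separate multiple genes
# and "---" marks a missing probe set ID.
annot <- data.frame(
  ProbeSetID = c("ps_1", "ps_2", "ps_3", "ps_4", "---"),
  GeneSymbol = c("GENE_A", "GENE_A /// GENE_B", "GENE_B",
                 "GENE_C /// GENE_D", "GENE_E"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the documented behaviour of getPROBESET.
lookup_probesets <- function(geneid, annot, idCol, uniqueID = FALSE,
                             noPSsymbol = NA, noPSprovidedSymbol = "---") {
  # Split each annotation field into its individual identifiers.
  ids_per_row <- strsplit(annot[[idCol]], " /// ", fixed = TRUE)
  lapply(geneid, function(g) {
    # Empty or missing input: nothing to look up.
    if (is.na(g) || g == "") return(noPSsymbol)
    # Rows whose identifier field mentions g.
    hit <- vapply(ids_per_row, function(ids) g %in% ids, logical(1))
    # With uniqueID, keep only rows annotated with a single identifier.
    if (uniqueID) hit <- hit & lengths(ids_per_row) == 1
    # Probe set IDs are taken from the first column.
    ps <- annot[[1]][hit]
    ps <- ps[ps != noPSprovidedSymbol]
    if (length(ps) == 0) noPSsymbol else ps
  })
}

ps <- lookup_probesets(c("GENE_A", "GENE_C", NA, "GENE_X"), annot, idCol = 2)
ps_unique <- lookup_probesets(c("GENE_A", "GENE_C"), annot, idCol = 2,
                              uniqueID = TRUE)
```

In this toy table, 'GENE_A' maps to two probe sets, while 'GENE_C' only appears in a multi-gene field and therefore yields no probe set when `uniqueID = TRUE`.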
ps | list of length 'length(geneid)', the 'i'-th element of which contains the probe set IDs for 'geneid[i]'.
empty | logical vector of length 'length(geneid)'. 'empty[i]' is TRUE if 'geneid[i]' is empty or NA.
noentry | logical vector of length 'length(geneid)'. 'noentry[i]' is TRUE if 'geneid[i]' cannot be found in column 'idCol' (default is column 19) of the annotation table.
noid | logical vector of length 'length(geneid)'. 'noid[i]' is TRUE if 'ps[i]==noPSprovidedSymbol' is TRUE.
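As a rough illustration of what the three diagnostic vectors mean, the following base R sketch derives comparable flags for a toy table. The data and the way the flags are computed are assumptions made for illustration only; the package may compute them differently.

```r
# Hypothetical toy data: column 1 holds probe set IDs, column 2 gene symbols.
# "---" plays the role of 'noPSprovidedSymbol'.
annot <- data.frame(
  ProbeSetID = c("ps_1", "---"),
  GeneSymbol = c("GENE_A", "GENE_E"),
  stringsAsFactors = FALSE
)
geneid <- c("GENE_A", NA, "GENE_X", "GENE_E")

# empty: the input identifier itself is NA or an empty string.
empty <- is.na(geneid) | !nzchar(geneid)

# noentry: a non-empty identifier that does not occur in the ID column.
noentry <- !empty & !(geneid %in% annot$GeneSymbol)

# noid: the identifier occurs, but the table only offers the
# missing-value marker instead of a probe set ID.
noid <- vapply(geneid, function(g) {
  ps <- annot$ProbeSetID[annot$GeneSymbol %in% g]
  length(ps) > 0 && all(ps == "---")
}, logical(1), USE.NAMES = FALSE)
```

Here each failing input triggers exactly one flag: the NA input sets 'empty', 'GENE_X' sets 'noentry', and 'GENE_E' sets 'noid'.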
getMULTIANNOTATION provides a more flexible solution that can be used with arbitrary annotation tables.
Alexandre Kuhn
Kuhn et al. Cross-species and cross-platform gene expression studies with the Bioconductor-compliant R package 'annotationTools'. BMC Bioinformatics, 9:26 (2008)
##example Affymetrix annotation file and its location
annotationFile<-system.file('extdata','HG-U133_Plus_2_annot_part.csv',package='annotationTools')
##load annotation file
annotation<-read.csv(annotationFile,colClasses='character',comment.char='#')
##genes of interest
myGenes<-c('DDR1','GUCA1A','HSPA6',NA,'XYZ')
##column 15 in annotation contains gene symbols
colnames(annotation)
##find probe sets probing for particular genes
getPROBESET(myGenes,annotation,idCol=15)
##find probe sets probing only for the genes of interest (i.e. with unique annotation)
getPROBESET(myGenes,annotation,idCol=15,uniqueID=TRUE)
##track origin of annotation failure for the last 2 gene IDs
getPROBESET(myGenes,annotation,idCol=15,diagnose=TRUE)
[1] "Probe.Set.ID" "GeneChip.Array"
[3] "Species.Scientific.Name" "Annotation.Date"
[5] "Sequence.Type" "Sequence.Source"
[7] "Transcript.ID.Array.Design." "Target.Description"
[9] "Representative.Public.ID" "Archival.UniGene.Cluster"
[11] "UniGene.ID" "Genome.Version"
[13] "Alignments" "Gene.Title"
[15] "Gene.Symbol" "Chromosomal.Location"
[17] "Unigene.Cluster.Type" "Ensembl"
[19] "Entrez.Gene" "SwissProt"
[21] "EC" "OMIM"
[23] "RefSeq.Protein.ID" "RefSeq.Transcript.ID"
[25] "FlyBase" "AGI"
[27] "WormBase" "MGI.Name"
[29] "RGD.Name" "SGD.accession.number"
[31] "Gene.Ontology.Biological.Process" "Gene.Ontology.Cellular.Component"
[33] "Gene.Ontology.Molecular.Function" "Pathway"
[35] "Protein.Families" "Protein.Domains"
[37] "InterPro" "Trans.Membrane"
[39] "QTL" "Annotation.Description"
[41] "Annotation.Transcript.Cluster" "Transcript.Assignments"
[43] "Annotation.Notes"
[[1]]
[1] "1007_s_at" "207169_x_at"
[[2]]
[1] "1255_g_at"
[[3]]
[1] "117_at"
[[4]]
[1] NA
[[5]]
[1] NA
Warning messages:
1: In getPROBESET(myGenes, annotation, idCol = 15) :
one or more empty gene ID in input
2: In getPROBESET(myGenes, annotation, idCol = 15) :
one or more gene ID not found in annotation
[[1]]
[1] "1007_s_at" "207169_x_at"
[[2]]
[1] "1255_g_at"
[[3]]
[1] NA
[[4]]
[1] NA
[[5]]
[1] NA
Warning messages:
1: In getPROBESET(myGenes, annotation, idCol = 15, uniqueID = TRUE) :
one or more empty gene ID in input
2: In getPROBESET(myGenes, annotation, idCol = 15, uniqueID = TRUE) :
one or more gene ID not found in annotation
[[1]]
[[1]][[1]]
[1] "1007_s_at" "207169_x_at"
[[1]][[2]]
[1] "1255_g_at"
[[1]][[3]]
[1] "117_at"
[[1]][[4]]
[1] NA
[[1]][[5]]
[1] NA
[[2]]
[1] FALSE FALSE FALSE TRUE FALSE
[[3]]
[1] FALSE FALSE FALSE FALSE TRUE
[[4]]
[1] FALSE FALSE FALSE FALSE FALSE
Warning messages:
1: In getPROBESET(myGenes, annotation, idCol = 15, diagnose = TRUE) :
one or more empty gene ID in input
2: In getPROBESET(myGenes, annotation, idCol = 15, diagnose = TRUE) :
one or more gene ID not found in annotation
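The list returned with `diagnose=TRUE` is unnamed, so it can help to label its elements before inspecting them. The sketch below re-types the printed result by hand (it does not call the package) and assumes the element order given in the Value section: 'ps', 'empty', 'noentry', 'noid'. The names `res`, `status` and `summary_table` are illustrative.

```r
# Result list re-typed by hand from the printed output above.
res <- list(
  list(c("1007_s_at", "207169_x_at"), "1255_g_at", "117_at", NA, NA),
  c(FALSE, FALSE, FALSE, TRUE, FALSE),
  c(FALSE, FALSE, FALSE, FALSE, TRUE),
  c(FALSE, FALSE, FALSE, FALSE, FALSE)
)
myGenes <- c("DDR1", "GUCA1A", "HSPA6", NA, "XYZ")

# Assumed element order, following the Value section.
names(res) <- c("ps", "empty", "noentry", "noid")

# Translate the three diagnostic flags into one status label per gene.
status <- ifelse(res$empty, "empty input",
          ifelse(res$noentry, "not in annotation",
          ifelse(res$noid, "no probe set ID provided", "ok")))

# One row per input gene: number of probe sets found and failure reason.
summary_table <- data.frame(
  gene = myGenes,
  n_probesets = vapply(res$ps, function(p) sum(!is.na(p)), integer(1)),
  status = status,
  stringsAsFactors = FALSE
)
print(summary_table)
```

The resulting table shows at a glance that the NA input failed because it was empty, whereas 'XYZ' failed because it does not occur in the annotation.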