prefix2dataset: Table mapping Ensembl gene identifier prefixes to BioMart...



The species corresponding to an Ensembl gene identifier can typically be identified from the prefix of the identifier (e.g. ENSBTAG corresponds to Bos taurus). This table maps each known unique prefix to the corresponding species.
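As a minimal sketch of the lookup described above, the snippet below strips the trailing digits from an identifier and matches the remainder against the prefix column. The small `prefix_table` excerpt is copied from rows of this table; the helpers `extract_prefix()` and `species_from_id()` are hypothetical names introduced for illustration, and the digit-stripping rule is an assumption rather than the package's documented extraction method.

```r
# Small excerpt of the table, copied from rows shown in the example output.
prefix_table <- data.frame(
  dataset = c("btaurus_gene_ensembl", "hsapiens_gene_ensembl",
              "mmusculus_gene_ensembl"),
  species = c("Bos taurus", "Homo sapiens", "Mus musculus"),
  prefix  = c("ENSBTAG", "ENSG", "ENSMUSG"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop the trailing run of digits to obtain the prefix.
# (Assumption: the prefix is everything before the final numeric part.)
extract_prefix <- function(gene_id) {
  sub("[0-9]+$", "", gene_id)
}

# Hypothetical helper: return the species for a gene identifier,
# or NA if the prefix is not present in the table.
species_from_id <- function(gene_id, table = prefix_table) {
  table$species[match(extract_prefix(gene_id), table$prefix)]
}

species_from_id("ENSBTAG00000000005")  # "Bos taurus"
species_from_id("ENSG00000000003")     # "Homo sapiens"
```

Exact matching with `match()` is used instead of partial matching so that a short prefix such as "ENSG" cannot be confused with longer prefixes that begin with the same letters.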




C. elegans, D. melanogaster, and S. cerevisiae have atypical patterns and prefixes in their Ensembl gene identifiers. However, the automatically extracted prefixes for C. elegans and D. melanogaster, "WBGene" and "FBgn" respectively, can be used as such to identify datasets from those species. On the other hand, prefixes used for S. cerevisiae include "YHR", "YAL", and many others. Consequently, expression data from S. cerevisiae is identified without referring to the "prefix2dataset" table; instead, such datasets are identified if the first gene identifier in the dataset starts with "Y".
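The rule described above could be sketched as follows: try the prefix lookup first, then fall back to the "starts with Y" check for S. cerevisiae. The function name `dataset_from_first_id()`, the `known` excerpt, and the digit-stripping prefix rule are assumptions made for illustration, not the package's actual code; "YAL001C" is used only as an example of a yeast-style identifier.

```r
# Excerpt of prefix-to-dataset pairs, copied from rows of the table.
known <- data.frame(
  dataset = c("celegans_gene_ensembl", "dmelanogaster_gene_ensembl",
              "btaurus_gene_ensembl"),
  prefix  = c("WBGene", "FBgn", "ENSBTAG"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: guess the BioMart dataset from the first gene
# identifier of an expression dataset.
dataset_from_first_id <- function(gene_ids, table = known) {
  first_id <- gene_ids[1]
  # Assumption: the prefix is the identifier without its trailing digits.
  prefix <- sub("[0-9]+$", "", first_id)
  hit <- match(prefix, table$prefix)
  if (!is.na(hit)) {
    return(table$dataset[hit])
  }
  # Fallback for S. cerevisiae: no single prefix, so test the first letter.
  if (startsWith(first_id, "Y")) {
    return("scerevisiae_gene_ensembl")
  }
  NA_character_
}

dataset_from_first_id(c("FBgn0000003", "FBgn0000008"))  # D. melanogaster dataset
dataset_from_first_id("WBGene00000001")                 # C. elegans dataset
dataset_from_first_id("YAL001C")                        # S. cerevisiae dataset
```

Because only the first identifier is inspected, this sketch assumes that all identifiers in a dataset come from the same species.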


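A data frame with 69 rows and 4 columns. Each row refers to one dataset in the Ensembl BioMart. The columns are dataset (the BioMart dataset name), species (the species name), prefix (the gene identifier prefix), and sample (one sample gene identifier from that dataset).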


The method stored in the toolkit.R script was used to query the Ensembl BioMart server and build this table.



Example output

'data.frame':	69 obs. of  4 variables:
 $ dataset: chr  "amelanoleuca_gene_ensembl" "aplatyrhynchos_gene_ensembl" "acarolinensis_gene_ensembl" "amexicanus_gene_ensembl" ...
 $ species: chr  "Ailuropoda melanoleuca" "Anas platyrhynchos" "Anolis carolinensis" "Astyanax mexicanus" ...
 $ prefix : chr  "ENSAMEG" "ENSAPLG" "ENSACAG" "ENSAMXG" ...
 $ sample : chr  "ENSAMEG00000000001" "ENSAPLG00000000001" "ENSACAG00000000002" "ENSAMXG00000000001" ...
                          dataset                    species   prefix
44      amelanoleuca_gene_ensembl     Ailuropoda melanoleuca  ENSAMEG
64    aplatyrhynchos_gene_ensembl         Anas platyrhynchos  ENSAPLG
49     acarolinensis_gene_ensembl        Anolis carolinensis  ENSACAG
18        amexicanus_gene_ensembl         Astyanax mexicanus  ENSAMXG
68           btaurus_gene_ensembl                 Bos taurus  ENSBTAG
14          celegans_gene_ensembl     Caenorhabditis elegans   WBGene
11          cjacchus_gene_ensembl         Callithrix jacchus  ENSCJAG
69       cfamiliaris_gene_ensembl           Canis familiaris  ENSCAFG
2         cporcellus_gene_ensembl            Cavia porcellus  ENSCPOG
15          csabaeus_gene_ensembl        Chlorocebus sabaeus  ENSCSAG
6         choffmanni_gene_ensembl        Choloepus hoffmanni  ENSCHOG
24     cintestinalis_gene_ensembl         Ciona intestinalis  ENSCING
7          csavignyi_gene_ensembl             Ciona savignyi ENSCSAVG
41            drerio_gene_ensembl                Danio rerio  ENSDARG
28     dnovemcinctus_gene_ensembl       Dasypus novemcinctus  ENSDNOG
59            dordii_gene_ensembl            Dipodomys ordii  ENSDORG
53     dmelanogaster_gene_ensembl    Drosophila melanogaster     FBgn
23         etelfairi_gene_ensembl          Echinops telfairi  ENSETEG
38         ecaballus_gene_ensembl             Equus caballus  ENSECAG
20        eeuropaeus_gene_ensembl        Erinaceus europaeus  ENSEEUG
8             fcatus_gene_ensembl                Felis catus  ENSFCAG
21       falbicollis_gene_ensembl        Ficedula albicollis  ENSFALG
63           gmorhua_gene_ensembl               Gadus morhua  ENSGMOG
36           ggallus_gene_ensembl              Gallus gallus  ENSGALG
3         gaculeatus_gene_ensembl     Gasterosteus aculeatus  ENSGACG
57          ggorilla_gene_ensembl            Gorilla gorilla  ENSGGOG
32          hsapiens_gene_ensembl               Homo sapiens     ENSG
5  itridecemlineatus_gene_ensembl Ictidomys tridecemlineatus  ENSSTOG
42        lchalumnae_gene_ensembl        Latimeria chalumnae  ENSLACG
55         loculatus_gene_ensembl       Lepisosteus oculatus  ENSLOCG
4          lafricana_gene_ensembl         Loxodonta africana  ENSLAFG
45          mmulatta_gene_ensembl             Macaca mulatta  ENSMMUG
67          meugenii_gene_ensembl           Macropus eugenii  ENSMEUG
62        mgallopavo_gene_ensembl        Meleagris gallopavo  ENSMGAG
54          mmurinus_gene_ensembl         Microcebus murinus  ENSMICG
48        mdomestica_gene_ensembl      Monodelphis domestica  ENSMODG
61         mmusculus_gene_ensembl               Mus musculus  ENSMUSG
34             mfuro_gene_ensembl      Mustela putorius furo  ENSMPUG
31        mlucifugus_gene_ensembl           Myotis lucifugus  ENSMLUG
25       nleucogenys_gene_ensembl        Nomascus leucogenys  ENSNLEG
58         oprinceps_gene_ensembl          Ochotona princeps  ENSOPRG
16        oniloticus_gene_ensembl      Oreochromis niloticus  ENSONIG
1          oanatinus_gene_ensembl   Ornithorhynchus anatinus  ENSOANG
27        ocuniculus_gene_ensembl      Oryctolagus cuniculus  ENSOCUG
56          olatipes_gene_ensembl            Oryzias latipes  ENSORLG
52        ogarnettii_gene_ensembl         Otolemur garnettii  ENSOGAG
60            oaries_gene_ensembl                 Ovis aries  ENSOARG
22      ptroglodytes_gene_ensembl            Pan troglodytes  ENSPTRG
47           panubis_gene_ensembl               Papio anubis  ENSPANG
10         psinensis_gene_ensembl        Pelodiscus sinensis  ENSPSIG
19          pmarinus_gene_ensembl         Petromyzon marinus  ENSPMAG
33          pformosa_gene_ensembl           Poecilia formosa  ENSPFOG
39           pabelii_gene_ensembl               Pongo abelii  ENSPPYG
29         pcapensis_gene_ensembl          Procavia capensis  ENSPCAG
46         pvampyrus_gene_ensembl          Pteropus vampyrus  ENSPVAG
9        rnorvegicus_gene_ensembl          Rattus norvegicus  ENSRNOG
13       scerevisiae_gene_ensembl   Saccharomyces cerevisiae         
66         sharrisii_gene_ensembl       Sarcophilus harrisii  ENSSHAG
65          saraneus_gene_ensembl              Sorex araneus  ENSSARG
26           sscrofa_gene_ensembl                 Sus scrofa  ENSSSCG
30          tguttata_gene_ensembl        Taeniopygia guttata  ENSTGUG
17         trubripes_gene_ensembl          Takifugu rubripes  ENSTRUG
51         tsyrichta_gene_ensembl           Tarsius syrichta  ENSTSYG
43     tnigroviridis_gene_ensembl     Tetraodon nigroviridis  ENSTNIG
35        tbelangeri_gene_ensembl           Tupaia belangeri  ENSTBEG
12        ttruncatus_gene_ensembl         Tursiops truncatus  ENSTTRG
50            vpacos_gene_ensembl              Vicugna pacos  ENSVPAG
37       xtropicalis_gene_ensembl         Xenopus tropicalis  ENSXETG
40        xmaculatus_gene_ensembl      Xiphophorus maculatus  ENSXMAG
                sample
44  ENSAMEG00000000001
64  ENSAPLG00000000001
49  ENSACAG00000000002
18  ENSAMXG00000000001
68  ENSBTAG00000000005
14      WBGene00000001
11  ENSCJAG00000000001
69  ENSCAFG00000000001
2   ENSCPOG00000000004
15  ENSCSAG00000000001
6   ENSCHOG00000000002
24  ENSCING00000000010
7  ENSCSAVG00000000001
41  ENSDARG00000000001
28  ENSDNOG00000000001
59  ENSDORG00000000002
53         FBgn0000003
23  ENSETEG00000000001
38  ENSECAG00000000001
20  ENSEEUG00000000001
8   ENSFCAG00000000001
21  ENSFALG00000000001
63  ENSGMOG00000000001
36  ENSGALG00000000003
3   ENSGACG00000000002
57  ENSGGOG00000000002
32     ENSG00000000003
5   ENSSTOG00000000001
42  ENSLACG00000000001
55  ENSLOCG00000000001
4   ENSLAFG00000000001
45  ENSMMUG00000000001
67  ENSMEUG00000000001
62  ENSMGAG00000000001
54  ENSMICG00000000002
48  ENSMODG00000000001
61  ENSMUSG00000000001
34  ENSMPUG00000000002
31  ENSMLUG00000000002
25  ENSNLEG00000000001
58  ENSOPRG00000000002
16  ENSONIG00000000001
1   ENSOANG00000000110
27  ENSOCUG00000000001
56  ENSORLG00000000001
52  ENSOGAG00000000002
60  ENSOARG00000000001
22  ENSPTRG00000000001
47  ENSPANG00000000001
10  ENSPSIG00000000001
19  ENSPMAG00000000001
33  ENSPFOG00000000001
39  ENSPPYG00000000001
29  ENSPCAG00000000002
46  ENSPVAG00000000002
9   ENSRNOG00000000001
13            15S_rRNA
66  ENSSHAG00000000001
65  ENSSARG00000000002
26  ENSSSCG00000000001
30  ENSTGUG00000000001
17  ENSTRUG00000000001
51  ENSTSYG00000000002
43  ENSTNIG00000000002
35  ENSTBEG00000000002
12  ENSTTRG00000000002
50  ENSVPAG00000000002
37  ENSXETG00000000002
40  ENSXMAG00000000002

