Convert ensembl ids to HGNC gene ids


Description

Retrieve the HGNC gene IDs corresponding to a list of Ensembl gene IDs. Note that this will not find every ID listed on ensembl.org, because the lookup uses BioMart, which appears to be incomplete. This affects only a small minority of genes, so the function should be useful for most applications. This was the case at the time of writing; BioMart is likely to be updated at some point.

Usage

ENS.to.GENE(ens, dir = NULL, build = NULL, name.dups = FALSE,
  name.missing = TRUE, ...)

Arguments

ens

character, a vector of Ensembl gene ids, of the form ENSG00xxxxxxxxx

dir

character, the directory in which to download gene and cytoband information. If left as NULL, then depending on the value of getOption("save.annot.in.current"), the annotation will either be saved in the working directory to speed up subsequent lookups, or deleted after use.

build

character, "hg18" or "hg19" (or 36/37), selecting which reference build to retrieve. When build is NULL, the default is to use the build from the current ChipInfo annotation.

name.dups

logical, if TRUE then duplicates will have a suffix appended to force the list to be unique (e.g., so it would be usable as rownames, or in a lookup table). Otherwise duplicate entries will simply appear in the list multiple times.

name.missing

logical, if TRUE then missing values will be named as MISSING_n (n=1 to # of missing), ensuring a valid unique name if the results are to be used as rownames, etc. If FALSE then these will be left as NA.

...

further arguments to get.gene.annot()

Value

Returns a vector of HGNC gene ids corresponding to the 'ens' Ensembl ids entered. If name.missing=TRUE, any ids not found will be returned as MISSING_n (n=1 to # of missing); if name.missing is FALSE, they will be set to NA. Similarly, if duplicates are found and name.dups is TRUE, each will have a suffix _n appended; otherwise their names will be left as is.
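
The naming rules for missing and duplicated results can be illustrated with a small base R sketch. The helper label.results below is hypothetical and is not part of the package. In particular, which copies of a duplicate receive a suffix, and where the numbering starts, are assumptions made for illustration rather than the documented behaviour of ENS.to.GENE.

```r
# Hypothetical helper, NOT the package's implementation: it mimics the
# documented naming options on a result vector that has already been looked up.
label.results <- function(genes, name.dups = FALSE, name.missing = TRUE) {
  miss <- is.na(genes)
  if (name.missing && any(miss)) {
    # missing entries become MISSING_1, MISSING_2, ...
    genes[miss] <- paste0("MISSING_", seq_len(sum(miss)))
  }
  if (name.dups) {
    # assumption: every copy of a repeated id gets a running suffix _1, _2, ...
    dup <- !is.na(genes) &
      (duplicated(genes) | duplicated(genes, fromLast = TRUE))
    if (any(dup)) {
      idx <- ave(seq_along(genes[dup]), genes[dup], FUN = seq_along)
      genes[dup] <- paste0(genes[dup], "_", idx)
    }
  }
  genes
}

# made-up lookup result with two missing ids and one repeated gene
res <- c("HLA-B", NA, "FUT2", "HLA-B", NA)
label.results(res)                        # NAs renamed, duplicates kept
label.results(res, name.dups = TRUE)      # all entries unique, usable as rownames
label.results(res, name.missing = FALSE)  # NAs left as NA
```

With name.dups and name.missing both TRUE, every element of the result is a unique, non-NA string, which is what makes the output safe to use as rownames or as keys in a lookup table.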

Author(s)

Nicholas Cooper nick.cooper@cimr.cam.ac.uk

See Also

GENE.to.ENS, rs.to.id, id.to.rs; eg2sym, sym2eg from package 'gage'

Examples

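# work in a temporary directory so any saved annotation files end up there
setwd(tempdir())
# convert a vector of Ensembl gene ids to HGNC gene ids
ENS.ids <- c("ENSG00000183214", "ENSG00000163599", "ENSG00000175354", "ENSG00000134460")
ENS.to.GENE(ENS.ids)
# round trip: gene ids -> Ensembl ids -> gene ids
gene.ids <- c("HLA-B","IFIH1","fake_gene!","FUT2")
ENS.to.GENE(GENE.to.ENS(gene.ids)) # lookup fails for the fake id, gives warning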
