idMap: Mapping between gene ID types
In lgeistlinger/EnrichmentBrowser: Seamless navigation through combined results of set-based and network-based enrichment analysis

Description

Functionality to map between common gene ID types such as ENSEMBL and ENTREZ for gene expression datasets, gene sets, and gene regulatory networks.

Usage

idMap(
  obj,
  org = NA,
  from = "ENSEMBL",
  to = "ENTREZID",
  multi.to = "first",
  multi.from = "first"
)

idTypes(org)

Arguments

`obj`	The object for which gene IDs should be mapped. Supported options include: (1) Gene expression dataset: an object of class `SummarizedExperiment`, whose names are expected to be of the gene ID type given in argument `from`. (2) Gene sets: either a list of gene sets (character vectors of gene IDs) or a `GeneSetCollection` storing all gene sets. (3) Gene regulatory network: a 3-column character matrix, where the 1st column holds the IDs of the regulating genes, the 2nd column the IDs of the regulated genes, and the 3rd column the regulation effect ('+' for activation, '-' for inhibition).
`org`	Character. Organism in KEGG three-letter code, e.g. ‘hsa’ for ‘Homo sapiens’. See references.
`from`	Character. Gene ID type to map from. Corresponds to the gene ID type of argument `obj`. Defaults to `ENSEMBL`.
`to`	Character. Gene ID type to map to, i.e. the gene ID type with which `obj` should be updated. If `obj` is an expression dataset of class `SummarizedExperiment`, `to` can also be the name of a column in the `rowData` slot, which allows user-defined mappings in which conflicts have been manually resolved. Defaults to `ENTREZID`.
`multi.to`	How to resolve 1:many mappings, i.e. multiple to.IDs for a single from.ID? This is passed on to the `multiVals` argument of `mapIds` and can thus take several pre-defined values, or a user-defined function. In either case, exactly one to.ID must be returned for each from.ID. Default is `"first"`, which returns the first to.ID mapped onto the respective from.ID.
`multi.from`	How to resolve many:1 mappings, i.e. multiple from.IDs mapping to the same to.ID? Only applicable if `obj` is an expression dataset of class `SummarizedExperiment`. Pre-defined options: 'first' (default) returns the first from.ID for each to.ID with multiple from.IDs; 'minp' selects the from.ID with the minimum p-value (according to the `rowData` column `PVAL` of `obj`); 'maxfc' selects the from.ID with the maximum absolute log2 fold change (according to the `rowData` column `FC` of `obj`). A user-defined function can also be supplied for custom behavior. It is applied in each case where there are multiple from.IDs for a single to.ID, and accordingly takes the arguments `ids` and `obj`. The argument `ids` holds the multiple from.IDs from which a single ID should be chosen, e.g. via information available in argument `obj`. See examples for a case where IDs are selected based on a user-defined `rowData` column.
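
To make the two resolution directions concrete, here is a minimal base-R sketch of what `"first"` means for 1:many and many:1 mappings. The toy table `map` and its IDs are hypothetical, and the code only illustrates the idea; it is not the package's internal implementation, which may differ.

```r
# Toy mapping table (hypothetical IDs): one row per from.ID/to.ID pair
map <- data.frame(
  from = c("f1", "f1", "f2", "f3"),
  to   = c("t1", "t2", "t3", "t3"),
  stringsAsFactors = FALSE
)

# 1:many: keep the first to.ID listed for each from.ID
# (the idea behind multi.to = "first")
one_to <- map[!duplicated(map$from), ]

# many:1: keep the first from.ID for each to.ID
# (the idea behind multi.from = "first")
one_from <- one_to[!duplicated(one_to$to), ]

one_from
```

In this sketch `f1` keeps `t1` (its first target), and `t3` keeps `f2` (its first source), so each remaining row is a 1:1 mapping.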

Details

The function `idTypes` lists the valid values that the arguments `from` and `to` can take. These correspond to the names of the available gene ID types for the mapping.
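
As a hedged illustration of how this list might be used, the sketch below validates a `from` or `to` value before any mapping is attempted. The helper `check_id_type` and the hard-coded `valid` vector are hypothetical stand-ins; in practice `valid` would be the result of `idTypes(org)`, e.g. `idTypes("hsa")`.

```r
# Hypothetical stand-in for the result of idTypes("hsa");
# the real vector depends on the installed annotation package
valid <- c("ENSEMBL", "ENTREZID", "SYMBOL")

# Hypothetical helper: stop early if an ID type is not supported
check_id_type <- function(type, valid) {
  if (!type %in% valid) {
    stop("Unsupported gene ID type: ", type,
         ". Valid types: ", paste(valid, collapse = ", "))
  }
  invisible(type)
}

# A supported type passes silently
check_id_type("SYMBOL", valid)

# An unsupported type raises an error, caught here for demonstration
ok <- tryCatch(
  { check_id_type("NOT_A_TYPE", valid); TRUE },
  error = function(e) FALSE
)
ok
```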

Value

idTypes: A character vector listing the available gene ID types for the mapping.

idMap: An object of the same class as the input argument obj, i.e. a SummarizedExperiment if provided an expression dataset, a list of character vectors or a GeneSetCollection if provided gene sets, and a character matrix if provided a gene regulatory network.
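
The "same class in, same class out" behavior can be sketched in base R for the two plain-R input types (gene set lists and network matrices). The lookup vector, the helper names `map_sets` and `map_grn`, and the choice to drop unmapped IDs are all assumptions of this sketch, not a description of how `idMap` is implemented.

```r
# Hypothetical lookup: names are from.IDs, values are to.IDs
lookup <- c(a = "1", b = "2", c = "3")

# Gene sets: a list of character vectors goes in, a list comes out
map_sets <- function(sets, lookup) {
  lapply(sets, function(s) {
    ids <- unname(lookup[s])
    ids[!is.na(ids)]  # assumption of this sketch: unmapped IDs are dropped
  })
}
gs <- list(s1 = c("a", "b"), s2 = c("b", "c", "x"))
gs_mapped <- map_sets(gs, lookup)

# Gene regulatory network: a 3-column character matrix goes in,
# a 3-column character matrix comes out
map_grn <- function(grn, lookup) {
  grn[, 1] <- lookup[grn[, 1]]
  grn[, 2] <- lookup[grn[, 2]]
  # assumption of this sketch: drop edges with an unmapped endpoint
  grn[!is.na(grn[, 1]) & !is.na(grn[, 2]), , drop = FALSE]
}
grn <- cbind(FROM = c("a", "b"), TO = c("b", "c"), TYPE = c("+", "-"))
grn_mapped <- map_grn(grn, lookup)

gs_mapped
grn_mapped
```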

Author(s)

Ludwig Geistlinger

References

KEGG Organism code http://www.genome.jp/kegg/catalog/org_list.html

Examples


    library(EnrichmentBrowser)

    # (1) ID mapping for gene expression datasets
    # create an expression dataset with 3 genes and 3 samples
    se <- makeExampleData("SE", nfeat = 3, nsmpl = 3)
    names(se) <- paste0("ENSG00000000", c("003", "005", "419"))
    idMap(se, org = "hsa")

    # user-defined mapping
    rowData(se)$MYID <- c("g1", "g1", "g2")
    idMap(se, to = "MYID")    

    # data-driven resolving of many:1 mappings
    
    ## e.g. select from.ID with lowest p-value
    pcol <- configEBrowser("PVAL.COL")
    rowData(se)[[pcol]] <- c(0.001, 0.32, 0.15)
    idMap(se, to = "MYID", multi.from = "minp") 
   
    ## ... or using a customized function
    maxScore <- function(ids, se)
    {
         scores <- rowData(se)[ids, "SCORE"]
         ind <- which.max(scores)
         return(ids[ind])
    }
    rowData(se)$SCORE <- c(125.7, 33.4, 58.6)
    idMap(se, to = "MYID", multi.from = maxScore) 
           
    # (2) ID mapping for gene sets 
    # create two gene sets containing 3 genes each 
    s2 <- paste0("ENSG00000", c("012048", "139618", "141510"))
    gs <- list(s1 = names(se), s2 = s2)
    idMap(gs, org = "hsa", from = "ENSEMBL", to = "SYMBOL")    

    # (3) ID mapping for gene regulatory networks
    grn <- cbind(FROM = gs$s1, TO = gs$s2, TYPE = rep("+", 3))
    idMap(grn, org = "hsa", from = "ENSEMBL", to = "ENTREZID")
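
The custom `multi.from` function in example (1) is described as being applied once for each to.ID that has several from.IDs. The base-R sketch below mimics that per-group call with a plain data frame standing in for `rowData(se)`. The row names `e1` to `e3` and the grouping via `split` are assumptions made for illustration; the package's internal dispatch may differ.

```r
# Stand-in for rowData(se): row names are from.IDs (hypothetical values)
rd <- data.frame(
  MYID  = c("g1", "g1", "g2"),
  SCORE = c(125.7, 33.4, 58.6),
  row.names = c("e1", "e2", "e3"),
  stringsAsFactors = FALSE
)

# Same shape as a user-defined multi.from function: takes the competing
# from.IDs and the object, and returns exactly one of the IDs
maxScore <- function(ids, obj) {
  scores <- obj[ids, "SCORE"]
  ids[which.max(scores)]
}

# Group the from.IDs by their to.ID, then resolve each group to one ID
groups <- split(rownames(rd), rd$MYID)
chosen <- vapply(groups, maxScore, character(1), obj = rd)
chosen
```

Here `g1` has two candidates and keeps the one with the higher score, while `g2` has a single candidate that is returned unchanged.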

lgeistlinger/EnrichmentBrowser documentation built on June 14, 2025, 6:36 p.m.