changeDataId: Change the data IDs of input omics data matrix

View source: R/mapping.utilities.R

changeDataId    R Documentation

Change the data IDs of input omics data matrix

Description

This function changes the IDs of input omics data from one type to another. It returns a data matrix with row names changed to the specified output ID type. To change a vector of input IDs to another type, use the changeIds function.

Usage

changeDataId(
  data.input.id = NULL,
  input.type = NULL,
  output.type = NULL,
  sum.method = "sum",
  org = "hsa",
  mol.type = NULL,
  id.mapping.table = NULL,
  SBGNview.data.folder = "./SBGNview.tmp.data"
)

Arguments

data.input.id

A matrix. Input omics data. Rows are genes or compounds, columns are measurements. Row names are the original IDs that need to be transformed.

input.type

A character string. The type of input IDs. Please check data('mapped.ids') for supported types.

output.type

A character string. The type of output IDs. Please check data('mapped.ids') for supported types.

sum.method

A character string. Default: "sum". In some cases multiple input IDs are mapped to one output ID, and a single value must then be derived from their values. This parameter names a function that derives a single numeric value from a vector of numeric values (e.g. 'sum', 'max', 'min', 'mean'); a User Defined Function (UDF) can also be used.
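The collapsing step described above can be pictured with base R alone. This is a minimal sketch, not SBGNview's internal code: the matrix `expr`, the mapping vector `out.id`, and the helper `collapse.rows` are hypothetical names introduced only for illustration.

```r
# Hypothetical data: three input IDs, two of which map to the same output ID.
expr <- matrix(c(1, 2, 3,
                 4, 5, 6,
                 7, 8, 9),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("idA", "idB", "idC"), c("s1", "s2", "s3")))
out.id <- c(idA = "X", idB = "X", idC = "Y")  # assumed input -> output mapping

# Illustrative helper (not part of SBGNview): apply a summary function,
# column by column, to each group of rows that share an output ID.
collapse.rows <- function(mat, ids, fun = "sum") {
  fun <- match.fun(fun)  # accepts a function name or a function object
  groups <- split(seq_len(nrow(mat)), ids[rownames(mat)])
  t(vapply(groups,
           function(i) apply(mat[i, , drop = FALSE], 2, fun),
           numeric(ncol(mat))))
}

collapsed.sum <- collapse.rows(expr, out.id, "sum")
# A user-defined function works the same way as a built-in one.
collapsed.udf <- collapse.rows(expr, out.id, function(x) max(abs(x)))
```

Rows `idA` and `idB` end up as a single row `X`, so the result has two rows instead of three.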

org

A character string. Default: "hsa". The species that the omics data come from. 'changeDataId' uses pathview to map between some gene ID types, and pathview needs the species to determine the gene annotation package; see '?geneannot.map' for details. The value can be a two-letter abbreviation of the organism name, a KEGG species code, or the common species name. For human, for example, it can be 'Hs', 'hsa' or 'human' (case insensitive). For all potential values check: data(bods); bods.
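The species table mentioned above can be inspected as follows. This is a small sketch that assumes pathview is installed; the variable `org.table` is a hypothetical name, and the guard only keeps the snippet from failing when the package is missing.

```r
# Load pathview's species table if the package is available.
org.table <- NULL
if (requireNamespace("pathview", quietly = TRUE)) {
  data(bods, package = "pathview")
  org.table <- bods
  print(head(org.table))  # first few supported species
}
```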

mol.type

A character string. Either 'cpd' or 'gene' – the type of input omics data.

id.mapping.table

A matrix. Mapping table between input.type and output.type. This matrix should have two columns for input.type and output.type, respectively. Column names should be the values of parameters 'input.type' and 'output.type'. See example section for an example.
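A mapping table of the shape described above could be built as follows. This is only a sketch: `id.map` is a hypothetical variable name, and the ID type names are taken from the Examples section.

```r
# ID type names; they must match the 'input.type' and 'output.type'
# arguments passed to changeDataId.
input.type <- "ENTREZID"
output.type <- "SYMBOL"

# Two-column character matrix: one row per input/output ID pair.
id.map <- cbind(c("7157", "1032"),
                c("TP53", "CDKN2D"))
colnames(id.map) <- c(input.type, output.type)
id.map
```

The Examples section passes an equivalent two-column data.frame with the same column names.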

SBGNview.data.folder

A character string. Default: "./SBGNview.tmp.data". The path to a folder that will hold downloaded ID mapping files and pathway information data files.

Details

This function maps between various gene/compound ID types.

1. Map other ID types to glyph IDs in SBGN-ML files of the pathwayCommons and MetaCyc databases: use output.type = 'pathwayCommons' or output.type = 'metacyc.SBGN'. Please check data('mapped.ids') for supported input ID types.

2. Map between other ID types:

2.1 ID type pairs that can be mapped by pathview: SBGNview currently uses pathview to do this mapping. Please check the pathview functions 'geneannot.map' and 'cpdidmap' for more details.

2.2 Other ID type pairs

In this case, users need to provide id.mapping.table.

Value

A matrix whose row names are IDs of 'output.type'. Note that the number of rows may differ from that of the input matrix, because multiple input IDs can be collapsed into a single output ID and a single input ID can map to multiple output IDs.
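To see how a one-to-many mapping can increase the number of rows, consider the base R sketch below. It illustrates the idea only and assumes that the values of an input ID are repeated under each output ID it maps to; `dat`, `map`, and `expanded` are hypothetical names, and this is not SBGNview's internal code.

```r
# Hypothetical input: two IDs, two samples.
dat <- matrix(c(1, 2,
                3, 4),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("in1", "in2"), c("s1", "s2")))

# Assumed mapping: in1 maps to two output IDs, in2 to one.
map <- data.frame(input = c("in1", "in1", "in2"),
                  output = c("outA", "outB", "outC"),
                  stringsAsFactors = FALSE)

# One row per mapping pair; the values of in1 appear under outA and outB.
expanded <- dat[map$input, , drop = FALSE]
rownames(expanded) <- map$output
```

Here the input has two rows and the result has three.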

Examples

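# Change gene IDs from Entrez IDs to gene symbols with a user-supplied mapping table
library(SBGNview)   # provides changeDataId and the mapped.ids data set
library(pathview)   # provides the gse16873.d example data
data(mapped.ids)
data('gse16873.d')
# Keep two genes; row names are Entrez gene IDs
gene.data = gse16873.d[c('7157','1032'),]
# Column names must match the values of input.type and output.type
mapping.table = data.frame(ENTREZID = c('7157','1032'),
                           SYMBOL = c('TP53','CDKN2D'),
                           stringsAsFactors = FALSE)
# Row names of the result are gene symbols
new.dt = changeDataId(data.input.id = gene.data,
                      output.type = 'SYMBOL',
                      input.type = 'ENTREZID',
                      mol.type = 'gene',
                      id.mapping.table = mapping.table)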

datapplab/SBGNview documentation built on June 20, 2022, 9:55 p.m.