map.datasets: Function to map a list of datasets through EntrezGene IDs in...


Description

This function maps a list of datasets through EntrezGene IDs in order to get the union of the genes.

Usage

map.datasets(datas, annots, do.mapping = FALSE,
  mapping.coln = "EntrezGene.ID", mapping, verbose = FALSE)

Arguments

datas

List of gene expression matrices, with samples in rows and probes in columns, and with dimnames properly defined.

annots

List of annotation matrices, each with at least one column named "EntrezGene.ID", and with dimnames properly defined.

do.mapping

TRUE if the mapping through EntrezGene IDs must be performed (when several probes map to the same gene, the probe with the highest variance is kept), FALSE otherwise.

mapping.coln

Name of the column containing the biological annotation to be used to map the different datasets, default is "EntrezGene.ID".

mapping

Matrix with columns "EntrezGene.ID" and "probe.x" used to force the mapping such that the probes of platform x are not selected based on their variance.

verbose

TRUE to print informative messages, FALSE otherwise.
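
The input shapes described above can be sketched with toy data. This is only an illustration in base R: every probe name, sample name, and EntrezGene ID below is invented, and the extra "probe" annotation column is an assumption, not something the function requires.

```r
# Toy inputs for illustration only; all names and values are made up.
probes.a <- c("pa1", "pa2", "pa3")
probes.b <- c("pb1", "pb2")

# Expression matrices: samples in rows, probes in columns, dimnames set.
data.a <- matrix(c(1, 2, 3, 4,  2, 6, 1, 9,  5, 5, 6, 4), nrow = 4,
  dimnames = list(paste0("A", 1:4), probes.a))
data.b <- matrix(c(0.1, -0.2, 0.3,  0.5, 0.4, -0.6), nrow = 3,
  dimnames = list(paste0("B", 1:3), probes.b))

# Annotation matrices: one row per probe, with an "EntrezGene.ID" column.
annot.a <- cbind("probe" = probes.a, "EntrezGene.ID" = c("10", "10", "20"))
rownames(annot.a) <- probes.a
annot.b <- cbind("probe" = probes.b, "EntrezGene.ID" = c("20", "30"))
rownames(annot.b) <- probes.b

# Named lists, with the same names and order in both.
datas <- list("A" = data.a, "B" = data.b)
annots <- list("A" = annot.a, "B" = annot.b)

# Sanity check: the probes in each dataset match its annotation rows.
for (nm in names(datas)) {
  stopifnot(identical(colnames(datas[[nm]]), rownames(annots[[nm]])))
}
```

Lists built this way follow the shapes given in the argument descriptions; whether such tiny inputs are sufficient for a real call has not been checked here.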

Details

When several probes represent the same EntrezGene ID, the probe with the highest variance is selected, unless mapping is specified. When an EntrezGene ID does not exist in a specific dataset, NA values are introduced for that gene.
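
The selection rule can be sketched in base R as follows. This is a minimal illustration of the idea, not the genefu implementation; the probe names, gene IDs, and the variable names (`expr`, `gene.ids`, `best.probe`, `mapped`) are all hypothetical.

```r
# Illustrative sketch only; not the code used inside map.datasets.
# Hypothetical dataset: 5 samples (rows) x 4 probes (columns).
expr <- cbind(p1 = c(1, 2, 3, 4, 5),
              p2 = c(1, 5, 9, 13, 17),
              p3 = c(2, 2, 3, 3, 2),
              p4 = c(0, 1, 0, 1, 0))
rownames(expr) <- paste0("s", 1:5)

# Hypothetical annotation: p1 and p2 share gene "10"; p4 has no gene ID.
gene.ids <- c(p1 = "10", p2 = "10", p3 = "20", p4 = NA)

# Variance of each probe across samples.
probe.var <- apply(expr, 2, var, na.rm = TRUE)

# Keep annotated probes only, then pick the most variable probe per gene.
annotated <- names(gene.ids)[!is.na(gene.ids)]
best.probe <- sapply(split(annotated, gene.ids[annotated]),
  function(p) p[which.max(probe.var[p])])

# Build a gene-level matrix over a union of genes; gene "30" is absent
# from this dataset, so its column stays filled with NA.
all.genes <- c("10", "20", "30")
mapped <- matrix(NA_real_, nrow = nrow(expr), ncol = length(all.genes),
  dimnames = list(rownames(expr), all.genes))
found <- intersect(all.genes, names(best.probe))
mapped[, found] <- expr[, best.probe[found]]

print(best.probe)
print(mapped)
```

Here gene "10" is represented by p2 because its variance is larger than that of p1, and the column for gene "30" contains only NA.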

Value

datas

List of datasets (gene expression matrices)

annots

List of annotations (annotation matrices)

Author(s)

Benjamin Haibe-Kains

Examples

## load the genefu package, which provides map.datasets and the example data
library(genefu)
## load VDX dataset
data(vdxs)
## load NKI dataset
data(nkis)
## reduce datasets
ginter <- intersect(annot.vdxs[ ,"EntrezGene.ID"], annot.nkis[ ,"EntrezGene.ID"])
ginter <- ginter[!is.na(ginter)][1:30]
myx <- unique(c(match(ginter, annot.vdxs[ ,"EntrezGene.ID"]),
  sample(x=1:nrow(annot.vdxs), size=20)))
data2.vdxs <- data.vdxs[ ,myx]
annot2.vdxs <- annot.vdxs[myx, ]
myx <- unique(c(match(ginter, annot.nkis[ ,"EntrezGene.ID"]),
  sample(x=1:nrow(annot.nkis), size=20)))
data2.nkis <- data.nkis[ ,myx]
annot2.nkis <- annot.nkis[myx, ]
## mapping of datasets
datas <- list("VDX"=data2.vdxs,"NKI"=data2.nkis)
annots <- list("VDX"=annot2.vdxs, "NKI"=annot2.nkis)
datas.mapped <- map.datasets(datas=datas, annots=annots, do.mapping=TRUE)
str(datas.mapped, max.level=2)

Example output

List of 2
 $ datas :List of 2
  ..$ VDX: num [1:150, 1:68] 10.5 10.8 10 11.2 11.6 ...
  .. ..- attr(*, "dimnames")=List of 2
  ..$ NKI: num [1:150, 1:68] 0.053 -0.387 0.004 -0.035 0.082 -0.082 -0.162 -0.397 -0.093 -0.135 ...
  .. ..- attr(*, "dimnames")=List of 2
 $ annots:List of 2
  ..$ VDX:'data.frame':	68 obs. of  7 variables:
  ..$ NKI:'data.frame':	68 obs. of  7 variables:

genefu documentation built on Nov. 1, 2018, 2:25 a.m.