prepareData: Combining Two Studies into an Expression Set
In OrderedList: Similarities of Ordered Gene Lists

Description Usage Arguments Details Value Author(s) References See Also Examples

The function prepares a collection of two expression sets (ExpressionSet) and/or Affy batches (AffyBatch) to be passed on to the main function OrderedList. For each data set, one has to specify the variable in the corresponding phenodata from which the grouping into two distinct classes is done. The data sets are then merged into one ExpressionSet together with the rearranged phenodata. If the studies were done on different platforms but a subset of genes can be mapped from one chip to the other, this information can be provided via the mapping argument.

Please note that both data sets have to be pre-processed beforehand, either together or independent of each other. In addition, the gene expression values have to be on an additive scale, that is logarithmic or log-like scale.

1	prepareData(eset1, eset2, mapping = NULL)

`eset1`	The main inputs are the distinct studies. Each study is stored in a named list, which has five elements: `data`, `name`, `var`, `out` and `paired`, see details below.
`eset2`	Same as `eset2` for the second data set.
`mapping`	Data frame containing one named vector for each study. The vectors are comprised of probe IDs that fit to the rownames of the corresponding expression set. For each study, the IDs are ordered identically. For example, the kth row of `mapping` provides the label of the kth gene in each single study. If all studies were done on the same chip, no mapping is needed (default).

Each study has to be stored in a list with five elements:

`data`	Object of class `ExpressionSet` or `AffyBatch`.
`name`	Character string with comparison label.
`var`	Character string with phenodata variable. Based on this variable, the samples for the two-sample testing will be extracted.
`out`	Vector of two character strings with the levels of `var` that define the two clinical classes. The order of the two levels must be identical for all studies. Ideally, the first entry corresponds to the bad and the second one to the good outcome level.
`paired`	Logical - `TRUE` if samples are paired (e.g. two measurements per patients) or `FALSE` if all samples are independent of each other. If data are paired, the paired samples need to be in (whatever) successive order. Thus, the first sample of one condition must match to the first sample of the second condition and so on.

An object of class ExpressionSet containing the joint data sets with appropriate phenodata.

Stefanie Scheid

Yang X, Bentink S, Scheid S, and Spang R (2006): Similarities of ordered gene lists, to appear in Journal of Bioinformatics and Computational Biology.

OL.data, OrderedList

data(OL.data)

### 'map' contains the appropriate mapping between 'breast' and 'prostate' IDs.
### Let's first concatenate two studies.
A <- prepareData(
                 list(data=OL.data$prostate,name="prostate",var="outcome",out=c("Rec","NRec"),paired=FALSE),
                 list(data=OL.data$breast,name="breast",var="Risk",out=c("high","low"),paired=FALSE),
                 mapping=OL.data$map
                 )

### We might want to examine the first 100 probes only.
B <- prepareData(
                 list(data=OL.data$prostate,name="prostate",var="outcome",out=c("Rec","NRec"),paired=FALSE),
                 list(data=OL.data$breast,name="breast",var="Risk",out=c("high","low"),paired=FALSE),
                 mapping=OL.data$map[1:100,]
                 )

Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colMeans, colSums, colnames, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, lengths, mapply, match, mget, order, paste, pmax, pmax.int,
    pmin, pmin.int, rank, rbind, rowMeans, rowSums, rownames, sapply,
    setdiff, sort, table, tapply, union, unique, unsplit, which,
    which.max, which.min

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: twilight
Loading required package: splines