loadMAdata: Load and preprocess microarray data


View source: R/loadMAdata.r


Description

Loads, preprocesses and annotates microarray data to be further used by downstream functions in the piano package.


Usage

loadMAdata(datadir = getwd(), setup = "setup.txt", dataNorm,
  platform = "NULL", annotation, normalization = "plier",
  filter = TRUE, verbose = TRUE, ...)



Arguments

datadir: character string giving the directory in which to look for the data. Defaults to getwd().

setup: character string giving the name of the file containing the experimental setup, or an object of class data.frame or similar containing the experimental setup. Defaults to "setup.txt"; see details below for more information.

dataNorm: character string giving the name of the normalized data, or an object of class data.frame or similar containing the normalized data. Only to be used if the user wishes to start with normalized data rather than CEL files.

platform: character string giving the name of the platform, can be either "yeast2" or NULL. See details below for more information.

annotation: character string giving the name of the annotation file, or an object of class data.frame or similar containing the annotation information. The annotation should consist of the columns Gene name, Chromosome and Chromosome location. Not required if platform="yeast2".

normalization: character string giving the normalization method, can be either "plier", "rma" or "mas5". Defaults to "plier".

filter: should the data be filtered? If TRUE, probes not present in the annotation will be discarded. Defaults to TRUE.

verbose: should verbose output be printed? Defaults to TRUE.

...: additional arguments to be passed to ReadAffy.


Details

This function requires at least two inputs: (1) data, either CEL files in the directory specified by datadir or normalized data specified by dataNorm, and (2) experimental setup specified by setup.

The setup should be either a tab-delimited text file with column headers or a data.frame. The first column should contain the names of the CEL files or the column names used for the normalized data. Be sure to use names that are valid as column names, e.g. avoid names starting with numbers. Additional columns should assign attributes in some category to each array. (For an example, run the example below and look at the object myArrayData$setup.)
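The setup layout described above can be sketched in base R as follows. This is a minimal illustration, not taken from the package: the array names, the attribute columns (oxygen, limitation) and the header of the first column are hypothetical, and make.names() is used here only as a convenient check that the names are valid column names.

```r
# Hypothetical array names and attributes; replace with your own.
setup <- data.frame(
  array      = c("aerobic_Clim_1", "aerobic_Clim_2",
                 "anaerobic_Clim_1", "anaerobic_Clim_2"),
  oxygen     = c("aerobic", "aerobic", "anaerobic", "anaerobic"),
  limitation = c("Clim", "Clim", "Clim", "Clim"),
  stringsAsFactors = FALSE
)

# A name is a valid column name if make.names() leaves it unchanged
# (a name starting with a number would be altered, e.g. "1_a" -> "X1_a").
namesValid <- make.names(setup$array) == setup$array
stopifnot(all(namesValid))

# Write a tab-delimited text file with column headers.
setupFile <- file.path(tempdir(), "setup.txt")
write.table(setup, file = setupFile, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back, using the first column as row names, to inspect it.
setupCheck <- read.delim(setupFile, row.names = 1, stringsAsFactors = FALSE)
```

The resulting file name can then be given in the setup argument, with datadir pointing at the directory that contains it.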

The piano package is customized for yeast 2.0 arrays, and annotation works automatically if the cdfName of the arrays equals Yeast_2. If using normalized yeast 2.0 data as input, the user needs to set the argument platform="yeast2" to tell the function to use yeast annotation. If a platform other than yeast 2.0 is used, set platform=NULL (default) and supply appropriate annotation through the argument annotation. Note that the cdfName overrides platform, so platform can still be left as NULL for yeast 2.0 CEL files. Note also that annotation overrides platform, so an alternative annotation for yeast can be used simply by supplying it in annotation.

The annotation should have the column headers Gene name, Chromosome and Chromosome location. The Gene name is used in the heatmap in diffExp, and the Chromosome and Chromosome location are used by the polarPlot. The rownames (or first column if using a text file) should contain the probe IDs. If using a text file, the first column should have the header probeID or similar. The filtering step discards all probes not listed in the annotation.
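As a rough sketch of the annotation layout and of what the filtering step amounts to, the base R code below builds a small annotation data.frame and keeps only the probes listed in it. The probe IDs, gene names, chromosome values and expression values are made up for illustration, and the subsetting shown is an assumption about the effect of filtering, not the package's actual implementation.

```r
# Hypothetical annotation: probe IDs as row names, the three required columns.
# check.names = FALSE keeps the spaces in the column headers.
annotation <- data.frame(
  "Gene name"           = c("GENE1", "GENE2"),
  "Chromosome"          = c(1, 2),
  "Chromosome location" = c(1000, 25000),
  row.names        = c("probe_1", "probe_2"),
  check.names      = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical normalized data: probes in rows, arrays in columns.
normData <- data.frame(
  sample_1 = c(7.1, 8.4, 6.9),
  sample_2 = c(7.3, 8.1, 7.0),
  row.names = c("probe_1", "probe_2", "probe_3")
)

# Keep only probes that are listed in the annotation;
# probe_3 has no annotation entry and is dropped.
filtered <- normData[rownames(normData) %in% rownames(annotation), , drop = FALSE]
```

A data.frame built this way could be supplied directly in the annotation argument instead of a file name.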

Normalization is performed on all CEL file data using one of the Affymetrix methods: PLIER ("plier") as implemented by justPlier, the RMA (Robust Multi-Array Average) expression measure ("rma") as implemented by rma, or the MAS 5.0 expression measure ("mas5") as implemented by mas5.

It is possible to pass additional arguments to ReadAffy, e.g. cdfname as this might be required for some types of CEL files.
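A call that selects a normalization method and passes cdfname on to ReadAffy might look like the sketch below. The helper checkNormalization and the directory celDir are hypothetical and only serve the illustration; the call to loadMAdata is skipped unless CEL files are actually found and the piano package is installed.

```r
# Hypothetical helper: validate the method name before calling loadMAdata().
checkNormalization <- function(method = c("plier", "rma", "mas5")) {
  match.arg(method)
}

# Hypothetical directory expected to hold the CEL files and setup.txt.
celDir   <- file.path(tempdir(), "cel_files")
celFiles <- list.files(celDir, pattern = "\\.cel$", ignore.case = TRUE)

# Only run when there is data to load and piano is available.
if (length(celFiles) > 0 && requireNamespace("piano", quietly = TRUE)) {
  myArrayData <- piano::loadMAdata(
    datadir       = celDir,
    setup         = "setup.txt",
    normalization = checkNormalization("rma"),
    cdfname       = "Yeast_2"  # not a loadMAdata argument; passed on to ReadAffy via ...
  )
}
```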


Value

An ArrayData object (which is essentially a list) with the following elements:

dataRaw: raw data as an AffyBatch object

dataNorm: data.frame containing normalized expression values

setup: data.frame containing experimental setup

annotation: data.frame containing annotation

Depending on input arguments the ArrayData object may not include dataRaw and/or annotation.


Author(s)

Leif Varemo piano.rpkg@gmail.com and Intawat Nookaew piano.rpkg@gmail.com


References

Gautier, L., Cope, L., Bolstad, B. M., and Irizarry, R. A. affy - analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 20, 3, 307-315 (2004).

See Also

piano, runQC, diffExp, ReadAffy, expresso, justPlier, yeast2.db


Examples

  library(piano)

  # Get path to example data and setup files:
  dataPath <- system.file("extdata", package="piano")

  # Load normalized data:
  myArrayData <- loadMAdata(datadir=dataPath, dataNorm="norm_data.txt.gz", platform="yeast2")

  # Print to look at details:
  myArrayData

piano documentation built on Nov. 8, 2020, 6:27 p.m.