loadGPR: Importing raw data from gpr files.

Description

Constructs an EListRaw object from a set of gpr files containing ProtoArray data or other protein microarray data.

Usage

loadGPR(gpr.path = NULL, targets.path = NULL, array.type = NULL, 
 aggregation = "none", array.columns = list(E = "F635 Median",
 Eb = "B635 Median"),
 array.annotation = c("Block", "Column", "Row", "Description", "Name", "ID"),
 description = NULL, description.features = NULL, description.discard = NULL)

Arguments

gpr.path

string indicating the path to a folder containing gpr files (mandatory).

targets.path

string indicating the path to targets file (see limma, mandatory).

array.type

string indicating the microarray type of the imported gpr files. Duplicate aggregation is performed only for ProtoArrays. The possible options are "ProtoArray", "HuProt" and "other" (mandatory).

aggregation

string indicating which type of ProtoArray spot duplicate aggregation should be performed. If "min" is chosen, the value for the corresponding feature will be the minimum of both duplicate values. If "mean" is chosen, the arithmetic mean will be computed. If "none" is chosen, no aggregation will be performed. The default, as given in the Usage section, is "none" (optional).

array.columns

list containing the column names for foreground intensities (E) and background intensities (Eb) in the gpr files that is passed to limma's "read.maimages" function (optional).

array.annotation

string vector containing further mandatory column names that are passed to limma (optional).

description

string indicating the name of an alternative column that specifies whether a spot is a feature, a control or to be discarded. It is intended for gpr files that do not provide the column "Description" (optional).

description.features

string containing a regular expression identifying feature spots. Mandatory when description has been defined.

description.discard

string containing a regular expression identifying spots to be discarded (e.g., empty spots). Mandatory when description has been defined.
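As a rough illustration of the targets.path argument described above, the following sketch writes a small tab-delimited targets table and reads it back. The column names ArrayID, FileName and Group, the file names and the output location are assumptions made for this example only; the targets template shipped with PAA remains the reference for the expected layout.

```r
# Hypothetical minimal targets table. The column names (ArrayID, FileName,
# Group) and the file names are assumptions for illustration only.
targets <- data.frame(
  ArrayID  = c("AD1", "CO13"),
  FileName = c("array_AD1.gpr", "array_CO13.gpr"),
  Group    = c("AD", "CO"),
  stringsAsFactors = FALSE
)

# Write the table as a tab-delimited text file ...
targets.file <- file.path(tempdir(), "my_targets.txt")
write.table(targets, file = targets.file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# ... and read it back to confirm that it parses as intended.
check <- read.delim(targets.file, stringsAsFactors = FALSE)
print(check)
```

The path stored in targets.file could then be passed as targets.path, provided that the listed gpr files actually exist in the folder given as gpr.path.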

Details

This function is partially a wrapper to limma's function read.maimages() featuring optional duplicate aggregation for ProtoArray data. Paths to a targets file and to a folder containing gpr files (all gpr files in that folder that are listed in the targets file will be read) are mandatory. The folder "R_HOME/library/PAA/extdata" contains an exemplary targets file that can be used as a template.

If array.type (also mandatory) is set to "ProtoArray", duplicate spots can be aggregated. The corresponding method ("min", "mean" or "none") can be specified via the argument aggregation. As another ProtoArray-specific feature, control spot data and information will be stored in additional components of the returned object (see below).

Arguments array.columns and array.annotation define the columns where read.maimages() will find foreground and background intensity values as well as other important columns. For array.annotation the default columns "Block", "Column", "Row", "Description", "Name" and "ID" are mandatory.
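To make the "min", "mean" and "none" options for duplicate aggregation more concrete, here is a minimal base R sketch on toy data. The helper aggregate_duplicates() and the toy intensity matrix are hypothetical and are not part of PAA; the sketch only illustrates the idea of collapsing duplicate spots and does not reproduce loadGPR()'s internal implementation.

```r
# Toy data: three features, each spotted twice, measured on two arrays.
ids <- c("featA", "featA", "featB", "featB", "featC", "featC")
E <- matrix(c(100, 120, 50, 40, 900, 880,
              110,  90, 60, 70, 850, 910),
            ncol = 2, dimnames = list(ids, c("array1", "array2")))

# Hypothetical helper (not part of PAA): collapse duplicate rows per feature.
aggregate_duplicates <- function(E, ids, method = c("none", "min", "mean")) {
  method <- match.arg(method)
  if (method == "none") return(E)          # leave all spots untouched
  fun <- if (method == "min") min else mean
  # Group row indices by feature ID, keeping the original feature order.
  groups <- split(seq_along(ids), factor(ids, levels = unique(ids)))
  # Apply the summary function per array (column) within each group.
  t(sapply(groups, function(idx) apply(E[idx, , drop = FALSE], 2, fun)))
}

E.min  <- aggregate_duplicates(E, ids, "min")
E.mean <- aggregate_duplicates(E, ids, "mean")
print(E.min)
print(E.mean)
```

With "min" each feature keeps the lower of its two duplicate values per array, which is the more conservative choice when one duplicate is inflated by an artifact; "mean" averages the two values instead.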

If the gpr files for ProtoArrays do not provide the column "Description", a makeshift column will be constructed automatically from the column "Name". For other microarrays the arguments description, description.features and description.discard can be used to provide the mandatory information (see the second example in the Examples section).
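The way two regular expressions can partition spots may be sketched in base R as follows. The spot names are invented for illustration, and treating every spot that matches neither pattern as a control is an assumption of this sketch, not a documented rule of loadGPR(); only the patterns "^Hs~" and "Empty" are taken from the example in this documentation.

```r
# Hypothetical spot names, as they might appear in a "Name" column.
spot.names <- c("Hs~BC000001", "Hs~NM_000546", "Empty", "HumanIgG", "Empty")

# Patterns as used for description.features and description.discard
# in the example of this documentation.
is.feature <- grepl("^Hs~", spot.names)
is.discard <- grepl("Empty", spot.names)

# Assumption of this sketch: spots matching neither pattern are controls.
spot.class <- ifelse(is.discard, "discard",
                     ifelse(is.feature, "feature", "control"))
print(data.frame(Name = spot.names, Class = spot.class))
```

Testing the patterns with grepl() on the actual "Name" column before calling loadGPR() is a cheap way to check that they select the intended spots.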

Value

An extended object of class EListRaw (see the documentation of limma for details) is returned. If array.type is set to "ProtoArray", the object provides additional components for control spot data: C, Cb and cgenes, which are analogous to the probe spot data E, Eb and genes. Moreover, the returned object always provides the additional component array.type indicating the type of the imported protein microarray data (e.g., "ProtoArray").

Note

Don't forget to check column names in your gpr files. They may differ from the default settings of loadGPR() and should be renamed to the default column names (see also the exemplary gpr files accompanying PAA as a reference for the default column names). At worst, important columns in your gpr files may be completely missing and should be added in order to provide all information needed by PAA.
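One possible way to check the column names of a gpr file before import is sketched below in base R. To stay self-contained, the sketch first writes a tiny mock file in the ATF layout used by gpr files (a version line, a line giving the number of header records and data columns, the header records, then the column titles); the mock content and the variable names are invented for illustration. The list of required columns combines the defaults of array.annotation and array.columns shown in the Usage section.

```r
# Build a tiny mock gpr (ATF) file so that the sketch is self-contained.
gpr.file <- tempfile(fileext = ".gpr")
writeLines(c(
  "ATF\t1.0",
  "2\t7",                        # 2 header records, 7 data columns
  "\"Type=GenePix Results 3\"",
  "\"Wavelengths=635\"",
  paste(c("Block", "Column", "Row", "Name", "ID",
          "F635 Median", "B635 Median"), collapse = "\t"),
  paste(c(1, 1, 1, "Hs~BC000001", "BC000001", 1200, 85), collapse = "\t")
), gpr.file)

# The second line holds the number of header records and of data columns.
n.header <- as.integer(strsplit(readLines(gpr.file, n = 2)[2], "\t")[[1]][1])

# The column titles follow the two ATF lines and the header records.
header.line <- readLines(gpr.file, n = n.header + 3)[n.header + 3]
gpr.columns <- gsub("\"", "", strsplit(header.line, "\t")[[1]])

# Default columns expected by loadGPR() according to the Usage section.
required <- c("Block", "Column", "Row", "Description", "Name", "ID",
              "F635 Median", "B635 Median")
missing.columns <- setdiff(required, gpr.columns)
print(missing.columns)
```

For the mock file only "Description" is reported as missing, which is the situation addressed by the arguments description, description.features and description.discard.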

Note that if array.type is not "ProtoArray", no aggregation will be performed and no control components will be added to the returned object of class EListRaw.

Author(s)

Michael Turewicz, michael.turewicz@rub.de

References

The package limma by Gordon Smyth et al. can be downloaded from Bioconductor (http://www.bioconductor.org/).

Smyth, G. K. (2005). Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions using R and Bioconductor, R. Gentleman, V. Carey, S. Dudoit, R. Irizarry, W. Huber (eds.), Springer, New York, pages 397-420.

Examples

gpr <- system.file("extdata", package="PAA") 
targets <- list.files(system.file("extdata", package="PAA"),
 pattern = "dummy_targets", full.names=TRUE)   
elist <- loadGPR(gpr.path=gpr, targets.path=targets, array.type="ProtoArray")

# Example showing how to use the arguments description, description.features and
# description.discard in order to construct a makeshift column 'Description'
# for gpr files without this column. Please see also the exemplary gpr files
# coming with PAA.  
targets2 <- list.files(system.file("extdata", package="PAA"),
 pattern = "dummy_no_descr_targets", full.names=TRUE)
elist2 <- loadGPR(gpr.path=gpr, targets.path=targets2, array.type="other",
 description="Name", description.features="^Hs~", description.discard="Empty") 

Example output

Loading required package: Rcpp
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM734833_PA41992_-_AD1.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM734834_PA41994_-_AD2.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM734835_PA42006_-AD3.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM734836_PA42005_-_AD4.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM734837_PA41957_-_AD5.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM735203_PA42023_-_CO13.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM735204_PA42025_-_CO14.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM735205_PA42026_-_CO15.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM735206_PA42028_-_CO16.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM735207_PA42029_-_CO17.gpr 
No aggregation performed.
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM734833_PA41992_-_AD1_no_descr.gpr 
Read /usr/local/lib/R/site-library/PAA/extdata/dummy_GSM735203_PA42023_-_CO13_no_descr.gpr 
No aggregation performed.
Warning message:
In loadGPR(gpr.path = gpr, targets.path = targets2, array.type = "other",  :
  The following columns were not imported: Description. If one of these columns is mandatory (see the loadGPR() documentation to check whether a column is mandatory) this may cause serious errors.

PAA documentation built on Nov. 8, 2020, 8:30 p.m.