dataPreproc: Data preprocessing
In RPPanalyzer: Reads, Annotates, and Normalizes Reverse Phase Protein Array Data


Description

Function for importing, normalizing, and quality-checking data prior to the actual analysis. The preprocessing steps include subtraction of dilution series intercepts and FCF normalization. In addition, quality-check plots are generated, covering the dilution series and the BLANK measurements.

Usage

  dataPreproc(dataDir=getwd(), blocks=12, spot="aushon",
  exportNo=3, correct="both", remove_flagged=NULL)

Arguments

`dataDir`	directory of gpr files, slidedescription.txt and sampledescription.txt, default is the current working directory
`blocks`	see `blocksperarray` in `read.Data`, default is 12
`spot`	see `spotter` in `read.Data`, default is "aushon"
`exportNo`	see `exportNo` in `correctDilinterc`, integer of 1-4 defining the linear fit to be used (1: constant, 2: antibody, 3: antibody + slide, 4: antibody + slide + sample), default is 3
`correct`	"both" applies `correctDilinterc` to all measurements, including FCF. "none" does not use this BG correction at all. "noFCF" applies `correctDilinterc` to all measurements except FCF. The default is "both".
`remove_flagged`	Either NULL or an integer. If an integer, the function looks into the column `Flags` of the gpr file and removes samples with a flag value less than or equal to `-remove_flagged` from the data tables.
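
The `remove_flagged` rule can be illustrated on toy data. The following is a minimal base-R sketch, not the package's own code: the data frame `spots`, its values, and the threshold of 75 are made up for illustration, and only the comparison "flag less than or equal to `-remove_flagged`" is taken from the description above.

```r
# Toy stand-in for the Flags column of a gpr file (all values are invented).
spots <- data.frame(
  ID    = c("s1", "s2", "s3", "s4"),
  Flags = c(0, -50, -100, -75),
  stringsAsFactors = FALSE
)

remove_flagged <- 75  # hypothetical threshold chosen for this example

# Drop every spot whose flag is less than or equal to -remove_flagged,
# i.e. keep only flags strictly greater than -75.
kept <- spots[spots$Flags > -remove_flagged, ]
print(kept)
```

With this threshold, a flag of exactly -75 is removed as well, because the comparison is inclusive.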

Value

A list of 4 elements is returned.

`rawdat`	list of 4 raw data elements (`expression` and `background` matrices, `arraydescription` and `sampledescription` data frames) according to `read.Data`
`cordat`	list of 4 elements like `rawdat`, with `expression` data corrected for dilution intercepts. If negative values result, the absolute minimum + 1 is added. The `expression` data contain no NAs and are reduced to the `measurement` sample type. `background` is not corrected for intercepts, as it is not used here. If `correct` is "noFCF", the FCF measurements stay as in `rawdat`. If `correct` is "none", the measurements stay as in `rawdat`.
`normdat`	list of 4 elements like `cordat`, with `expression` holding the dilution intercept corrected (for `correct` "both" or "noFCF") and FCF normalized foreground data. The neglected background data are renamed to `dummy` here and should not be used.
`DIR`	directory for storing the generated outputs
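
The shift applied to negative intercept-corrected values in `cordat` can be sketched as follows. This is an assumption-laden illustration in base R: the vector `corrected` is invented, and whether the package applies the shift per slide, per column, or over the whole matrix is not stated here, so the sketch simply treats one vector.

```r
# Invented intercept-corrected signals; two of them ended up negative.
corrected <- c(120, -30, 45, -5)

# If any value is negative, add the absolute minimum plus 1 to all values,
# so that the smallest value becomes exactly 1.
if (min(corrected) < 0) {
  shifted <- corrected + abs(min(corrected)) + 1
} else {
  shifted <- corrected
}
print(shifted)
```

The shift keeps all differences between values unchanged and makes every value strictly positive.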

All output files are stored in an analysis folder labeled by the date of analysis. The txt files Dataexpression and Databackground result from write.Data and store the raw data.

The pdf files getIntercepts_Output and anovaIntercepts_Output result from correctDilinterc. getIntercepts_Output shows the derived intercepts and smoothing splines of the dilution series, depending on the dilSeriesID column in sampledescription.txt and the slide/pad/incubationRun/spottingRun columns of the arraydescription matrix. anovaIntercepts_Output.pdf results from the ANOVA in correctDilinterc, which compares different linear models of the dilution series intercepts. The barplot displays the residual sum of squares (RSS) of the individual model fits and helps to choose the appropriate exportNo parameter. As RSS decreases, the model fits better.

Finally, three pdf files for quality checking are returned. QC_dilutioncurve_raw.pdf plots target and blank (2nd antibody only) signals from serially diluted control samples of the raw RPPA data set, see plotQC. QC_targetVSblank_normed.pdf plots blank signals vs. target specific signals of dilution intercept corrected and FCF normalized RPPA data, see plotMeasurementsQC. QC_qqPlot_normed.pdf contains qq-plots of dilution intercept corrected and FCF normalized RPPA data, see plotqq.
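
To make the RSS comparison behind the exportNo choice more concrete, here is a small base-R sketch. It does not reproduce correctDilinterc: the data frame `d`, its simulated intercepts, and the model formulas are assumptions that merely mirror the four options (constant, antibody, antibody + slide, antibody + slide + sample) with ordinary `lm` fits.

```r
set.seed(1)

# Invented design: 3 antibodies x 2 slides x 2 samples.
d <- expand.grid(
  antibody = c("A", "B", "C"),
  slide    = c("s1", "s2"),
  sample   = c("x", "y")
)

# Simulated intercepts that depend on antibody and slide, plus noise.
d$intercept <- 100 + 20 * as.integer(d$antibody) +
  5 * as.integer(d$slide) + rnorm(nrow(d))

# Four nested linear models, numbered like the exportNo options 1 to 4.
fits <- list(
  constant              = lm(intercept ~ 1, data = d),
  antibody              = lm(intercept ~ antibody, data = d),
  antibody_slide        = lm(intercept ~ antibody + slide, data = d),
  antibody_slide_sample = lm(intercept ~ antibody + slide + sample, data = d)
)

# For lm fits, deviance() returns the residual sum of squares.
rss <- sapply(fits, deviance)
print(rss)
```

Because the models are nested, the RSS can never increase from one model to the next, so it is the size of each drop that is informative: a model is worth its extra terms only if it reduces the RSS noticeably. Calling `barplot(rss)` on this vector would give a plot in the spirit of the one described above.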

Author(s)

Silvia von der Heyde

Examples

## Not run: 
library(RPPanalyzer)



# get output list 
dataDir<-system.file("extdata",package="RPPanalyzer")
res<-dataPreproc(dataDir=dataDir,blocks=12,spot="aushon",exportNo=4,correct="both")

# get individual elements
# raw data
rawdat<-res$rawdat
# dilution intercept corrected data
cordat<-res$cordat
# dilution intercept corrected and FCF normalized data
normdat<-res$normdat
# output directory
DIR<-res$DIR


## End(Not run)

RPPanalyzer documentation built on May 29, 2024, 5:43 a.m.