dataPreproc: Data preprocessing

dataPreproc  R Documentation

Data preprocessing

Description

Function for the import, normalization and quality checks of data prior to the actual analysis. The preprocessing steps include subtraction of dilution series intercepts and FCF normalization. Additionally, plots for quality checks are generated, including dilution series and BLANK measurements.

Usage

  dataPreproc(dataDir=getwd(), blocks=12, spot="aushon",
  exportNo=3, correct="both", remove_flagged=NULL)

Arguments

dataDir

directory of gpr files, slidedescription.txt and sampledescription.txt, default is the current working directory

blocks

see blocksperarray in read.Data, default is 12

spot

see spotter in read.Data, default is "aushon"

exportNo

see exportNo in correctDilinterc, integer of 1-4 defining the linear fit to be used (1: constant, 2: antibody, 3: antibody + slide, 4: antibody + slide + sample), default is 3

correct

"both" applies correctDilinterc to all measurements, including FCF. "none" does not apply this correction at all. "noFCF" applies correctDilinterc to all measurements except FCF. The default is "both".

remove_flagged

Either NULL or an integer. If an integer, the function looks into the Flags column of the gpr file and removes samples with a flag value less than or equal to -remove_flagged from the data tables.
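The remove_flagged rule can be illustrated with a small base R sketch. This is not the package's internal code: the spots data frame, its F635 column and the threshold of 75 are hypothetical, and only the comparison against -remove_flagged is taken from the argument description above.

```r
# Hypothetical spot table; "Flags" mimics the gpr column of that name.
spots <- data.frame(
  ID    = c("s1", "s2", "s3", "s4"),
  F635  = c(1200, 950, 40, 15),      # made-up foreground signals
  Flags = c(0, -50, -75, -100),      # made-up flag values
  stringsAsFactors = FALSE
)

remove_flagged <- 75  # hypothetical threshold; NULL would keep everything

if (is.null(remove_flagged)) {
  kept <- spots
} else {
  # Drop spots whose flag is less than or equal to -remove_flagged,
  # i.e. keep only flags strictly greater than -75 here.
  kept <- spots[spots$Flags > -remove_flagged, , drop = FALSE]
}

print(kept$ID)  # "s1" "s2"
```

With this threshold the spots flagged -75 and -100 are removed, while the spot flagged -50 is kept.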

Value

A list of 4 elements is returned.

rawdat

list of 4 raw data elements (expression and background matrices, arraydescription and sampledescription data frames) according to read.Data

cordat

list of 4 elements like rawdat, with expression data corrected for dilution intercepts. If the correction produces negative values, the absolute minimum + 1 is added. The expression data contain no NAs and are reduced to the measurement sample type. The background data are not corrected for intercepts, as they are not used here. If correct is "noFCF", the FCF measurements stay as in rawdat. If correct is "none", all measurements stay as in rawdat.

normdat

list of 4 elements like cordat, with foreground expression data that are dilution intercept corrected (correct "both" or "noFCF") and FCF normalized. The unused background data are renamed to dummy here and should not be used.

DIR

directory for storing the generated outputs
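The handling of negative values described for cordat can be sketched in base R. The matrix below is invented, and the sketch assumes the shift is applied over the whole expression matrix at once; the description does not say whether the package works per matrix, per antibody or per slide.

```r
# Hypothetical intercept-corrected expression matrix (samples x antibodies).
corrected <- matrix(
  c(120, -35, 8,    # ab1
     60,  -4, 210), # ab2
  nrow = 3,
  dimnames = list(c("s1", "s2", "s3"), c("ab1", "ab2"))
)

min_val <- min(corrected, na.rm = TRUE)

# Assumption: if any value is negative, add |minimum| + 1 to every entry,
# so that the smallest value becomes exactly 1.
if (min_val < 0) {
  shifted <- corrected + abs(min_val) + 1
} else {
  shifted <- corrected
}

print(shifted)
```

Adding one on top of the absolute minimum keeps every value strictly positive, which matters if the data are log-transformed later.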

All output files are stored in an analysis folder labeled by the date of analysis. The txt files Dataexpression and Databackground result from write.Data and store the raw data.

The pdf files getIntercepts_Output and anovaIntercepts_Output result from correctDilinterc. getIntercepts_Output shows the derived intercepts and smoothing splines of the dilution series, depending on the dilSeriesID column in sampledescription.txt and the slide/pad/incubationRun/spottingRun columns of the arraydescription matrix. anovaIntercepts_Output.pdf results from the ANOVA in correctDilinterc, which compares different linear models of the dilution series intercepts. Its barplot displays the residual sum of squares (RSS) of the individual model fits and helps to choose the appropriate exportNo parameter: a lower RSS indicates a better fit.

Finally, three pdf files for quality checking are returned. QC_dilutioncurve_raw.pdf plots target and blank (2nd antibody only) signals from serially diluted control samples of the raw RPPA data set, see plotQC. QC_targetVSblank_normed.pdf plots blank signals vs. target specific signals of dilution intercept corrected and FCF normalized RPPA data, see plotMeasurementsQC. QC_qqPlot_normed.pdf contains qq-plots of dilution intercept corrected and FCF normalized RPPA data, see plotqq.
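The RSS comparison behind the choice of exportNo can be illustrated with a base R sketch on simulated data. The four model formulas follow the exportNo description (constant, antibody, antibody + slide, antibody + slide + sample), but the data frame, its column names and the effect sizes are invented, and the actual fitting code in correctDilinterc may differ.

```r
set.seed(1)

# Hypothetical dilution intercepts for 3 antibodies x 2 slides x 2 samples.
design <- expand.grid(
  antibody = factor(c("ab1", "ab2", "ab3")),
  slide    = factor(c("sl1", "sl2")),
  sample   = factor(c("ctrlA", "ctrlB"))
)
design$intercept <- 100 +
  20 * as.integer(design$antibody) +   # strong antibody effect
   5 * as.integer(design$slide) +      # weaker slide effect
  rnorm(nrow(design), sd = 2)          # noise, no sample effect

# Nested linear models, named after the corresponding exportNo value.
fits <- list(
  "1" = lm(intercept ~ 1, data = design),
  "2" = lm(intercept ~ antibody, data = design),
  "3" = lm(intercept ~ antibody + slide, data = design),
  "4" = lm(intercept ~ antibody + slide + sample, data = design)
)

# For lm objects, deviance() returns the residual sum of squares.
rss <- sapply(fits, deviance)
print(round(rss, 2))

# Optional: barplot(rss, xlab = "exportNo", ylab = "RSS")
```

Because the models are nested, the RSS can only stay equal or decrease from model 1 to model 4. In practice one would look for the point where adding a further term no longer reduces the RSS noticeably.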

Author(s)

Silvia von der Heyde

Examples

## Not run: 
library(RPPanalyzer)

# get output list
dataDir <- system.file("extdata", package = "RPPanalyzer")
res <- dataPreproc(dataDir = dataDir, blocks = 12, spot = "aushon",
                   exportNo = 4, correct = "both")

# get individual elements
# raw data
rawdat <- res$rawdat
# dilution intercept corrected data
cordat <- res$cordat
# dilution intercept corrected and FCF normalized data
normdat <- res$normdat
# output directory
DIR <- res$DIR

## End(Not run)

RPPanalyzer documentation built on Aug. 28, 2023, 5:07 p.m.