biascorrection: Correct PCR-Bias in Quantitative DNA Methylation Analyses.

View source: R/biascorrection.R

biascorrection    R Documentation

Description

This function implements the algorithms described by Moskalev et al. in their article 'Correction of PCR-bias in quantitative DNA methylation studies by means of cubic polynomial regression', published in 2011 in Nucleic Acids Research, Oxford University Press (doi: 10.1093/nar/gkr213).
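As a rough illustration of the cubic polynomial regression idea named in the article title, the following sketch fits a cubic curve to made-up calibration data with base R and inverts it numerically to correct a single observed value. The data, the variable names, and the use of lm() and uniroot() are assumptions chosen for illustration; this is not the package's internal implementation.

```r
# Hypothetical calibration data: true methylation (%) vs. observed methylation (%).
true_meth <- c(0, 12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100)
observed <- c(1, 6, 12, 19, 28, 39, 53, 72, 98)

# Fit observed = a*x^3 + b*x^2 + c*x + d by ordinary least squares.
fit <- lm(observed ~ poly(true_meth, degree = 3, raw = TRUE))

# Predicted observed value for a given true methylation level.
predict_observed <- function(x) {
  as.numeric(predict(fit, newdata = data.frame(true_meth = x)))
}

# Correct one observed value by solving predict_observed(x) = y on [0, 100].
correct_value <- function(y) {
  uniroot(function(x) predict_observed(x) - y, interval = c(0, 100))$root
}

corrected <- correct_value(28)
```

In this toy data set an observed value of 28 was measured at a true methylation of 50 %, so the corrected value should land close to 50.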

Usage

biascorrection(
  experimental,
  calibration,
  samplelocusname,
  minmax = FALSE,
  correct_method = "best",
  selection_method = "SSE",
  type = 1,
  csvdir = paste0(tempdir(), "/csvdir/"),
  plotdir = paste0(tempdir(), "/plotdir/"),
  logfilename = paste0(tempdir(), "/log.txt"),
  plot_height = 5,
  plot_width = 7.5,
  plot_textsize = 16,
  seed = 1234,
  parallel = TRUE
)

Arguments

experimental

A character string. Path to the file containing the raw methylation values of the samples under investigation.

calibration

A character string. In type 1 data (one locus in many samples, e.g. pyrosequencing data): Path to the file containing the raw methylation values of the calibration samples. In type 2 data (many loci in one sample, e.g. next-generation sequencing data or microarray data): Path to the folder that contains at least 4 calibration files (one file per calibration step). Please refer to the FAQ for more detailed information on the specific file requirements (https://raw.githubusercontent.com/kapsner/PCRBiasCorrection/master/FAQ.md).

samplelocusname

A character string. In type 1 data: locus name - name of the gene locus under investigation. In type 2 data: sample name - name of the sample under investigation.

minmax

A logical indicating which equations are used for the bias correction (default: FALSE). If TRUE, the equations that include the respective minima and maxima of the provided data are used.

correct_method

A character string. Method used to correct the PCR bias of the samples under investigation. One of "best" (default), "hyperbolic" or "cubic". If the method is set to "best" (short: "b"), the algorithm automatically determines the best-fitting type of regression for each CpG site based on selection_method (by default: sum of squared errors, SSE, https://en.wikipedia.org/wiki/Residual_sum_of_squares). If the method is set to "hyperbolic" (short: "h") or "cubic" (short: "c"), the PCR bias correction of all samples under investigation is performed with the hyperbolic or the cubic regression, respectively.

selection_method

A character string. The method used to select the regression algorithm to correct the respective CpG site. This is by default the sum of squared errors ("SSE"). The second option is "RelError", which selects the regression method based on the theoretical relative error after correction. This metric is calculated by correcting the calibration data with both the hyperbolic regression and the cubic regression and then using the corrected values as input data to calculate the goodness-of-fit metrics.
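To make the SSE-based selection more concrete, here is a minimal sketch that compares two sets of fitted values against the same observed calibration values and picks the model with the smaller sum of squared errors. All numbers and names (sse, fitted_hyperbolic, fitted_cubic) are hypothetical, and the package's own selection code may differ in detail.

```r
# Sum of squared errors between observed values and fitted values.
sse <- function(observed, fitted) {
  sum((observed - fitted)^2)
}

# Hypothetical observed calibration values for one CpG site and the
# fitted values of the two candidate models.
observed <- c(1, 12, 28, 53, 98)
fitted_hyperbolic <- c(0, 13, 29, 55, 100)
fitted_cubic <- c(1, 12, 27, 54, 98)

sse_values <- c(
  hyperbolic = sse(observed, fitted_hyperbolic),
  cubic = sse(observed, fitted_cubic)
)

# Pick the model with the smaller SSE for this CpG site.
best_model <- names(which.min(sse_values))
```

With these made-up values the cubic fit has the smaller SSE (2 versus 11), so it would be the one selected for this site.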

type

A single integer. Type of data to be corrected: either 1 (one locus in many samples, e.g. pyrosequencing data) or 2 (many loci in one sample, e.g. next-generation sequencing data or microarray data).

csvdir

A character string. Directory to store the resulting tables (default = paste0(tempdir(), "/csvdir/")). CAUTION: This directory will be newly created on every call of the function - any preexisting files will be deleted without a warning.

plotdir

A character string. Directory to store the resulting plots (default = paste0(tempdir(), "/plotdir/")). CAUTION: This directory will be newly created on every call of the function - any preexisting files will be deleted without a warning.
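Because both output directories are recreated on every call, one cautious pattern is to point them at fresh, run-specific folders. The sketch below builds such paths with base R and checks that nothing would be overwritten; the helper has_files and the folder names are hypothetical and not part of the package.

```r
# Build dedicated output folders under a fresh, run-specific directory.
# tempfile() returns a path that does not exist yet.
run_dir <- tempfile("biascorrection_run_")
csvdir <- paste0(run_dir, "/csv/")
plotdir <- paste0(run_dir, "/plots/")

# TRUE if the directory exists and already contains files that would be lost.
has_files <- function(dir) {
  dir.exists(dir) && length(list.files(dir)) > 0
}

# Refuse to proceed if either target folder already holds files.
stopifnot(!has_files(csvdir), !has_files(plotdir))
```

The resulting paths could then be passed on as csvdir = csvdir and plotdir = plotdir.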

logfilename

A character string. Path to a file to save the log messages (default = paste0(tempdir(), "/log.txt")).

plot_height

An integer value. The height (unit: inches) of the resulting plots (default: 5).

plot_width

A numeric value. The width (unit: inches) of the resulting plots (default: 7.5).

plot_textsize

An integer value. The text size of the resulting plots (default: 16).

seed

An integer value. The seed used when solving the unknowns in the hyperbolic regression equation and the cubic regression equation. Important for reproducibility (default: 1234).
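As a generic illustration of why a fixed seed matters for reproducibility, the following sketch shows that any step relying on random numbers yields identical results when the same seed is set beforehand. The function draw_start_values is hypothetical; its random draw merely stands in for a randomised step and is not taken from the package.

```r
# Hypothetical stand-in for a step that depends on random numbers.
draw_start_values <- function(seed) {
  set.seed(seed)
  runif(3, min = -1, max = 1)
}

# The same seed reproduces the same values on every call.
first <- draw_start_values(1234)
second <- draw_start_values(1234)
same_result <- identical(first, second)
```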

parallel

A logical. If TRUE (the default), 'future::plan("multicore")' (on unix systems) or 'future::plan("multisession")' (on non-unix systems) is initialized before running the code.
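The platform-dependent choice described above can be sketched as follows. The helper choose_strategy is hypothetical, and detecting the platform via .Platform$OS.type is an assumption; the package may determine it differently.

```r
# Pick the name of a 'future' strategy depending on the operating system type.
choose_strategy <- function(os_type = .Platform$OS.type) {
  if (os_type == "unix") "multicore" else "multisession"
}

strategy <- choose_strategy()

# With the 'future' package installed, the plan could then be set with:
# future::plan(strategy)
```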

Value

This function is a wrapper around all of the functions included in 'rBiasCorrection'. When executed, it performs the whole workflow of bias correction and writes the resulting csv files and plots, as well as a log file, to the local file system (the respective directories can be specified with the function arguments). The return value is TRUE if the correction of PCR measurement biases succeeds. If the correction fails, an error message is returned.
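In a script, the outcome of such a call can be checked defensively. The sketch below wraps an arbitrary step in tryCatch() and reduces the outcome to TRUE or FALSE; run_step is a hypothetical helper, and whether a failure surfaces as a thrown error or as a returned message is not specified here, so both cases are treated as failure.

```r
# Run a step and report TRUE only if it returns TRUE without raising an error.
run_step <- function(step) {
  tryCatch(
    isTRUE(step()),
    error = function(e) {
      message("Step failed: ", conditionMessage(e))
      FALSE
    }
  )
}

# A step that succeeds and a step that raises an error.
ok <- run_step(function() TRUE)
failed <- run_step(function() stop("calibration file not found"))
```

With the package loaded, the step could be a function that wraps the biascorrection() call with the desired arguments.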

Examples


# Paths of the temporary input files
experimental <- paste0(tempdir(), "/experimental_data.csv")
calibration <- paste0(tempdir(), "/calibration_data.csv")

# Write the example data sets shipped with the package to these files
data.table::fwrite(
  rBiasCorrection::example.data_experimental$dat,
  experimental
)
data.table::fwrite(
  rBiasCorrection::example.data_calibration$dat,
  calibration
)

# Run the whole bias correction workflow for the locus "BRAF" (type 1 data)
results <- biascorrection(
  experimental = experimental,
  calibration = calibration,
  samplelocusname = "BRAF",
  parallel = FALSE
)



rBiasCorrection documentation built on June 21, 2022, 1:05 a.m.