qc_idats: Compute quality-control of methylation array from Illumina...


View source: R/qc_idats.R

Description

Compute quality control of Illumina methylation arrays using an rmarkdown template.

Usage

qc_idats(
  csv_file,
  data_directory,
  array = "EPIC",
  annotation = "ilm10b4.hg19",
  cohort_name = "COHORT",
  output_file = paste(cohort_name, array, "QC.html", sep = "_"),
  output_directory = ".",
  filter_snps = TRUE,
  filter_non_cpg = TRUE,
  filter_xy = TRUE,
  filter_multihit = TRUE,
  filter_beads = TRUE,
  population = NULL,
  bead_cutoff = 0.05,
  detection_pvalues = 0.01,
  filter_callrate = TRUE,
  callrate_samples = 0.99,
  callrate_probes = 1,
  gender_threshold = -2,
  gender_colname = NULL,
  norm_background = "oob",
  norm_dye = "RELIC",
  norm_quantile = "quantile1",
  cell_tissue = NULL,
  pca = TRUE,
  pca_vars = c("Sample_Plate", "Sentrix_ID"),
  pca_threshold = 2,
  max_labels = 15,
  title = paste(array, "Array Quality-Control"),
  author_name = "Unknown",
  author_affiliation = NULL,
  author_email = NULL,
  encoding = "UTF-8",
  ...
)
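
A minimal call might look like the sketch below. The sample sheet path, data directory, cohort name and author name are hypothetical placeholders, and the sketch assumes that qc_idats is exported by the dmapaq package. The call is guarded so that it only runs when the package is installed and the sample sheet exists.

```r
# Hypothetical inputs: replace with your own sample sheet and IDAT directory.
qc_args <- list(
  csv_file = "sample_sheet.csv",   # hypothetical path to the sample sheet
  data_directory = "idats",        # hypothetical directory holding the data
  array = "EPIC",
  cohort_name = "MYCOHORT",        # hypothetical cohort name
  output_directory = tempdir(),
  author_name = "Jane Doe"         # hypothetical author
)

# Only render the report when the package and the input file are available.
if (requireNamespace("dmapaq", quietly = TRUE) && file.exists(qc_args$csv_file)) {
  do.call(dmapaq::qc_idats, qc_args)
}
```

All other arguments keep the defaults listed in the usage above.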

Arguments

csv_file

A character. The path to a CSV file, i.e., a sample sheet describing the data.

data_directory

A character. The path to the data directory.

array

A character. The array name, i.e., "EPIC" or "450k".

annotation

A character. The name and version of the annotation package to be used.

cohort_name

A character. The name of the studied cohort / population.

output_file

A character. The name of the HTML file produced. Default is paste(cohort_name, array, "QC.html", sep = "_").
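
As an illustration, the default file name from the usage can be evaluated by hand with the default values of cohort_name and array:

```r
# Default values taken from the usage section.
cohort_name <- "COHORT"
array <- "EPIC"

# Same expression as the default of output_file.
output_file <- paste(cohort_name, array, "QC.html", sep = "_")
print(output_file)
# [1] "COHORT_EPIC_QC.html"
```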

output_directory

A character. The path to the output directory.

filter_snps

A logical. Should the probes in which the probed CpG falls near a SNP (according to Zhou et al., 2016) be removed? Default is TRUE.

filter_non_cpg

A logical. Should the non-cg probes be removed? Default is TRUE.

filter_xy

A logical. Should the probes from X and Y chromosomes be removed? Default is TRUE.

filter_multihit

A logical. Should the probes which align to multiple locations (according to Nordlund et al., 2013) be removed? Default is TRUE.

filter_beads

A logical. Should the probes with a beadcount less than 3 be removed? Default is TRUE.

population

A character. The name of the population (ethnicity) to be used. Default is NULL for none.

bead_cutoff

A numeric. The threshold for beadcount. Default is 0.05.

detection_pvalues

A numeric. The detection p-value threshold above which values are considered as missing. Default is 0.01.

filter_callrate

A logical. Should the data be filtered based on call rate metric? Default is TRUE.

callrate_samples

A numeric. The call rate threshold for samples, under which samples are excluded. Default is 0.99.

callrate_probes

A numeric. The call rate threshold for probes, under which probes are excluded. Default is 1.
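
The interplay of detection_pvalues, callrate_samples and callrate_probes can be sketched on a toy matrix. This is only an illustration of the documented thresholds, not the package's implementation: the matrix, the object names, and the choice to compute the sample and probe call rates independently on the same matrix are assumptions.

```r
# Toy detection p-value matrix (hypothetical): rows are probes, columns are samples.
detection_p <- matrix(
  c(0.001, 0.020, 0.001,
    0.001, 0.001, 0.001,
    0.001, 0.300, 0.001,
    0.001, 0.001, 0.001),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cg", 1:4), paste0("S", 1:3))
)

# Documented defaults.
detection_pvalues <- 0.01
callrate_samples <- 0.99
callrate_probes <- 1

# Values with a detection p-value above the threshold count as missing.
called <- detection_p <= detection_pvalues

# Call rate: proportion of non-missing values per sample and per probe.
sample_callrate <- colMeans(called)
probe_callrate <- rowMeans(called)

# Samples and probes under their thresholds are excluded.
excluded_samples <- names(sample_callrate)[sample_callrate < callrate_samples]
excluded_probes <- names(probe_callrate)[probe_callrate < callrate_probes]
```

With callrate_probes = 1, a single missing value is enough to exclude a probe, which is why "cg1" and "cg3" are dropped in this toy example.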

gender_threshold

A numeric. The threshold value used to discriminate gender based on sex chromosome methylation. Default is -2.

gender_colname

A character. The name of the column containing the gender in the file provided in csv_file. Default is NULL.

norm_background

A character. Optional method to estimate background normal distribution parameters. This must be one of "oob", "est" or "neg". Default is "oob".

norm_dye

A character. Dye bias correction, "mean": correction based on averaged red/green ratio; or "RELIC": correction with RELIC method; or "none": no dye bias correction. Default is "RELIC".

norm_quantile

A character. The quantile normalisation to be used. This should be one of "quantile1", "quantile2", or "quantile3". Default is "quantile1".

cell_tissue

A character. The cell tissue to be used for cell composition estimation, using a reference panel (i.e., "blood" or "cordblood") or a mathematical deconvolution. Default is NULL.

pca

A logical. Whether or not a PCA should be performed on the dataset. Default is TRUE.

pca_vars

A vector(character). Variables to be used with factorial planes. Default is c("Sample_Plate", "Sentrix_ID").

pca_threshold

A numeric. The multiple of the interquartile range, measured from the lower and upper quartiles, beyond which a sample is defined as an outlier. Default is 2.
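
One reading of this rule, applied to the scores of a single principal component, is sketched below. The helper flag_outliers and the toy scores are hypothetical, and the package may apply the rule differently (for example, across several components).

```r
# Hypothetical helper: flag scores lying more than `pca_threshold` times the
# interquartile range below the lower quartile or above the upper quartile.
flag_outliers <- function(scores, pca_threshold = 2) {
  q <- stats::quantile(scores, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  scores < q[1] - pca_threshold * iqr | scores > q[2] + pca_threshold * iqr
}

# Toy scores on one principal component; the last sample lies far from the rest.
pc1 <- c(-1.0, -0.5, 0.0, 0.5, 1.0, 10.0)
outlier_index <- which(flag_outliers(pc1))
```

A larger pca_threshold widens the accepted range and flags fewer samples.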

max_labels

A numeric. The maximum number of labels to show on plots. Default is 15.

title

A character. The report's title. Default is paste(array, "Array Quality-Control").

author_name

A character. The author's name to be printed in the report. Default is "Unknown".

author_affiliation

A character. The affiliation to be printed in the report. Default is NULL.

author_email

A character. The email to be printed in the report. Default is NULL.

encoding

A character. The encoding to be used for the html report. Default is "UTF-8".

...

Parameters to pass to rmarkdown::render().


omicsr/dmapaq documentation built on Oct. 13, 2021, 1:08 p.m.