qtlSetOption: qtlSetOption

qtlSetOption    R Documentation

qtlSetOption

Description

Change global options for methQTL calculation

Usage

qtlSetOption(
  rnbeads.options = NULL,
  meth.data.type = "idat.dir",
  geno.data.type = "plink",
  rnbeads.report = "temp",
  rnbeads.qc = FALSE,
  hdf5dump = FALSE,
  hardy.weinberg.p = 0.001,
  db.snp.ref = NULL,
  minor.allele.frequency = 0.05,
  missing.values.samples = 0.05,
  plink.geno = 0.1,
  impute.geno.data = FALSE,
  n.prin.comp = NULL,
  plink.path = NULL,
  fast.qtl.path = NULL,
  bgzip.path = NULL,
  tabix.path = NULL,
  correlation.type = "pearson",
  cluster.cor.threshold = 0.25,
  standard.deviation.gauss = 250,
  absolute.distance.cutoff = 5e+05,
  linear.model.type = "classical.linear",
  representative.cpg.computation = "row.medians",
  meth.qtl.type = "oneVSall",
  max.cpgs = 40000,
  cluster.architecture = "sge",
  cluster.config = c(h_vmem = "5G", mem_free = "5G"),
  n.permutations = 1000,
  compute.cor.blocks = TRUE,
  recode.allele.frequencies = FALSE,
  vcftools.path = NULL,
  imputation.user.token = NULL,
  imputation.reference.panel = "apps@hrc-r1.1",
  imputation.phasing.method = "shapeit",
  imputation.population = "eur"
)

Arguments

rnbeads.options

Path to an XML file specifying the RnBeads options used for data import. The default options are suitable for Illumina BeadArray data sets.

meth.data.type

Type of DNA methylation data used. Choices are listed in rnb.execute.import.

geno.data.type

The type of genotyping data to be imported. Can be either 'plink' (for '.bed', '.bim', and '.fam' files, or for '.dos' and '.txt' files) or 'idat' (for raw IDAT files).

rnbeads.report

Path to an existing directory in which the preprocessing report of RnBeads is to be stored. Defaults to a temporary directory.

rnbeads.qc

Flag indicating if the quality control module of RnBeads is to be executed.

hdf5dump

Flag indicating if large matrices are to be stored on disk rather than in main memory using the HDF5Array package.

hardy.weinberg.p

P-value threshold used to exclude markers that do not follow the Hardy-Weinberg equilibrium, as implemented in PLINK.

db.snp.ref

Path to a locally stored version of dbSNP [3]. If this option is specified, the reference allele is determined from this file instead of from the allele frequencies of the dataset. This circumvents problems with some imputation methods. If NULL (default), recoding will not be performed.

minor.allele.frequency

Threshold for the minor allele frequency of the SNPs to be used in the analysis.

missing.values.samples

Threshold specifying how many missing values per SNP are allowed across the samples to be included in the analysis.

plink.geno

Threshold for missing values per SNP
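
The thresholds minor.allele.frequency, missing.values.samples and plink.geno all act on per-SNP summary statistics. As a rough illustration only, and not the code used by the package or by PLINK, the following sketch computes the minor allele frequency and the fraction of missing calls per SNP on a small made-up genotype matrix and applies cutoffs of the same size as the documented defaults. The matrix, the 0/1/2 coding and all variable names are hypothetical.

```r
# Hypothetical toy genotype matrix: rows are SNPs, columns are samples.
# Entries count copies of the alternative allele (0, 1, 2); NA is a missing call.
geno <- rbind(
  snp1 = c(0, 1, 2, 1, 0, 1, 0, 2, 1, 0),
  snp2 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  snp3 = c(0, 1, NA, NA, 1, 0, 2, 1, 0, 0)
)

# Frequency of the alternative allele per SNP, ignoring missing calls
alt.freq <- rowMeans(geno, na.rm = TRUE) / 2
# The minor allele frequency is the smaller of the two allele frequencies
maf <- pmin(alt.freq, 1 - alt.freq)
# Fraction of samples with a missing call, per SNP
missing.frac <- rowMeans(is.na(geno))

# Assumption: the MAF cutoff is a lower bound, the missingness cutoff an upper bound
keep <- maf >= 0.05 & missing.frac <= 0.05
keep
```

In this toy data only snp1 passes: snp2 is monomorphic and snp3 has too many missing calls.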

impute.geno.data

Flag indicating if imputation of genotyping data is to be performed using the Michigan imputation server (https://imputationserver.sph.umich.edu/index.html) [2].

n.prin.comp

Number of principal components of the genetic data to be used as covariates in the methQTL calling. NULL means that no adjustment is conducted.
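
To make the role of n.prin.comp more concrete, the sketch below derives sample-level principal component scores from a simulated genotype matrix with base R's prcomp and keeps the first n.prin.comp of them as covariates. It is a generic illustration under assumed inputs (random genotypes, samples in rows), not the procedure the package itself runs.

```r
set.seed(1)
# Hypothetical genotype matrix: 20 samples (rows) x 50 SNPs (columns), coded 0/1/2
geno <- matrix(sample(0:2, 20 * 50, replace = TRUE), nrow = 20)

n.prin.comp <- 2
# PCA across samples; prcomp centers the columns by default
pca <- prcomp(geno)
# Scores of each sample on the first n.prin.comp components
covariates <- pca$x[, seq_len(n.prin.comp), drop = FALSE]
dim(covariates)
```

Each row of covariates belongs to one sample and could be added to a regression model alongside the genotype term.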

plink.path

Path to an installation of PLINK (also comes with the package)

fast.qtl.path

Path to an installation of fastQTL (comes with the package for Linux)

bgzip.path

Path to an installation of BGZIP (comes with the package for Linux)

tabix.path

Path to an installation of TABIX (comes with the package for Linux)

correlation.type

The type of correlation to be used. Please note that for type='pearson' (default) the more efficient implementation of correlation in the bigstatsr package is used. Further available options are 'spearman' and 'kendall'.

cluster.cor.threshold

Threshold for the correlation of CpG methylation states above which two CpGs are considered connected in the distance graph used to compute the correlation clustering.

standard.deviation.gauss

Standard deviation of the Gaussian distribution used to weight the correlation between two CpGs according to their distance.

absolute.distance.cutoff

Distance cutoff beyond which a CpG correlation is no longer considered.
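
One plausible reading of standard.deviation.gauss and absolute.distance.cutoff is sketched below: the correlation between two CpGs is scaled by a Gaussian kernel of their genomic distance and set to zero beyond the cutoff. This is an assumption made for illustration, not the package's code. The function name weight.correlation and the normalisation (weight 1 at distance 0) are hypothetical.

```r
# Hypothetical helper: down-weight a correlation by the distance between two CpGs
weight.correlation <- function(cor.value, distance, sd.gauss = 250, cutoff = 5e5) {
  # Gaussian kernel, scaled so that the weight equals 1 at distance 0
  w <- exp(-distance^2 / (2 * sd.gauss^2))
  # Pairs farther apart than the cutoff are not considered at all
  w[distance > cutoff] <- 0
  cor.value * w
}

weighted <- weight.correlation(cor.value = 0.8, distance = c(0, 250, 1000, 6e5))
weighted
```

With a standard deviation of 250, the weight is already close to zero at a distance of 1000, so only nearby CpGs remain strongly connected.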

linear.model.type

Linear model type to be used. Can be either "categorical.anova" or "classical.linear". If meth.qtl.type='fastQTL', this option is automatically set to 'fastQTL'. See callMethQTLBlock for more information.

representative.cpg.computation

Option specifying how reference CpGs per correlation block are to be computed. Available options are "row.medians" for the site that is the row median across the samples within the correlation block (for ties a random selection is performed), "mean.center" for an artificial site in the geometric center of the block with the average methylation level, or "best.all" for the CpG with the best p-value across all of the CpGs in the correlation block.

meth.qtl.type

Option specifying how a methQTL interaction is computed. Since the package is based on correlation blocks, a single correlation block can be associated with either one SNP (meth.qtl.type='oneVSall'), with multiple SNPs (meth.qtl.type='allVSall'), or each correlation block can once be positively and once negatively correlated with a SNP genotype (meth.qtl.type='twoVSall'). Additionally, we provide the option to use FastQTL [1] as a methQTL mapping tool (option 'fastQTL').

max.cpgs

Maximum number of CpGs used in the computation (used to save memory). 40,000 is a reasonable default for machines with ~128GB of main memory. Should be smaller for smaller machines and larger for larger ones.

cluster.architecture

The type of HPC cluster architecture present. Currently supported are 'sge' and 'slurm'.

cluster.config

Resource parameters needed to set up an SGE or SLURM cluster job. Includes h_vmem and mem_free for SGE and clock.limit and mem.size for SLURM. An example configuration for SLURM would be c("clock.limit"="1-0","mem.size"="10G") for 1 day of running time (format days-hours) and 10 GB of maximum memory usage. Additionally, 'n.cpus' can be specified as the SLURM option cpus-per-task.
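
As a usage sketch, assuming the MAGAR package is installed and attached, the SLURM configuration described here could be set as follows. The value for n.cpus is an arbitrary illustrative choice, and the two options are set in separate calls on the assumption that the configuration names are checked against the currently selected architecture.

```r
library(MAGAR)

# Select SLURM first, then pass the SLURM-specific resource parameters
qtlSetOption(cluster.architecture = "slurm")
qtlSetOption(
  cluster.config = c(
    "clock.limit" = "1-0",  # 1 day, 0 hours of run time
    "mem.size"    = "10G",  # 10 GB of maximum memory
    "n.cpus"      = "4"     # illustrative value for cpus-per-task
  )
)

# Inspect what is currently stored
qtlGetOption("cluster.config")
```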

n.permutations

The number of permutations used to correct the p-values for multiple testing. See (http://fastqtl.sourceforge.net/) for further information.

compute.cor.blocks

Flag indicating if correlation blocks are to be called. If FALSE, each CpG is considered separately.

recode.allele.frequencies

Flag indicating if the reference allele is to be redefined according to the frequencies found in the cohort investigated.

vcftools.path

Path to the installation of VCFtools. The vcf-sort function must be available in this folder.

imputation.user.token

The user token that is required for authorization with the Michigan imputation server. Please have a look at https://imputationserver.sph.umich.edu, create a user account and request a user token for access in your user profile.

imputation.reference.panel

The reference panel used for imputation. Please see https://imputationserver.readthedocs.io/en/latest/reference-panels/ for further information on which panels are supported by the Michigan imputation server.

imputation.phasing.method

The phasing method employed by the Michigan imputation server. See https://imputationserver.readthedocs.io/en/latest/api/ for further information.

imputation.population

The population for the phasing method required by the Michigan imputation server. See https://imputationserver.readthedocs.io/en/latest/api/ for further information.
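
The imputation-related options are typically set together. The following is a minimal sketch, assuming the MAGAR package is installed and attached. The token string is a placeholder that must be replaced by a token from your own user profile, and the remaining values simply repeat the documented defaults. Other options such as vcftools.path may also be needed before imputation can actually run.

```r
library(MAGAR)

qtlSetOption(
  impute.geno.data           = TRUE,
  imputation.user.token      = "YOUR-TOKEN-HERE",  # placeholder, not a real token
  imputation.reference.panel = "apps@hrc-r1.1",    # documented default
  imputation.phasing.method  = "shapeit",          # documented default
  imputation.population      = "eur"               # documented default
)

# Check that the flag was stored
qtlGetOption("impute.geno.data")
```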

Value

None

Author(s)

Michael Scherer

References

1. Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T., & Delaneau, O. (2016). Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics, 32(10), 1479–1485. https://doi.org/10.1093/bioinformatics/btv722

2. Das S, Forer L, Schönherr S, Sidore C, Locke AE, et al. (2016). Next-generation genotype imputation service and methods. Nature Genetics 48, 1284–1287, https://doi.org/10.1038/ng.3656

3. Sherry, S. T. et al. (2001). dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311, https://doi.org/10.1093/nar/29.1.308

Examples

qtlGetOption("rnbeads.report")
qtlSetOption(rnbeads.report=getwd())
qtlGetOption("rnbeads.report")

MPIIComputationalEpigenetics/MAGAR documentation built on Dec. 6, 2024, 2:30 p.m.