methyvim: Differential Methylation Statistics with Variable Importance...


View source: R/methyvim.R

Description

Computes the Targeted Minimum Loss Estimate of a specified statistical target parameter, formally defined within models from causal inference. The variable importance measures currently supported are the Average Treatment Effect (ATE) and a Nonparametric Variable Importance Measure (NPVI, formally defined by Chambaz, Neuvial, and van der Laan <doi:10.1214/12-EJS703>).

Usage

methyvim(
  data_grs,
  var_int,
  vim = c("ate", "rr", "npvi"),
  type = c("Beta", "Mval"),
  filter = c("limma"),
  filter_cutoff = 0.05,
  window_bp = 1000,
  corr_max = 0.75,
  obs_per_covar = 20,
  sites_comp = NULL,
  parallel = TRUE,
  future_param = NULL,
  bppar_type = NULL,
  return_ic = FALSE,
  shrink_ic = FALSE,
  tmle_type = c("glm", "sl"),
  tmle_args = list(
    g_lib = c("SL.mean", "SL.glm", "SL.bayesglm", "SL.gam"),
    Q_lib = c("SL.mean", "SL.glm", "SL.gam", "SL.earth"),
    cv_folds = 5,
    npvi_cutoff = 0.25,
    npvi_descr = NULL
  ),
  tmle_backend = c("tmle", "drtmle", "tmle.npvi")
)

Arguments

data_grs

An object of class GenomicRatioSet, containing standard data structures for DNA Methylation experiments. Consult the documentation of minfi to construct such objects.

var_int

A numeric vector containing subject-level measurements of the variable of interest. The length of this vector must match the number of subjects exactly. If vim is set to "ate" or "rr", the variable of interest is treated as an exposure and must be binary. For target parameters that assess continuous treatment effects, the variable need not be binary.
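
The constraints on var_int can be checked before calling methyvim. The following is a minimal sketch, not part of the package: check_var_int is a hypothetical helper, and the 0/1 coding of a binary exposure is an assumption.

```r
# Hypothetical pre-flight check; not a methyvim function.
# Assumes a binary exposure is coded as 0/1.
check_var_int <- function(var_int, n_subj, vim = "ate") {
  # The vector must be numeric, with one entry per subject.
  stopifnot(is.numeric(var_int), length(var_int) == n_subj)
  # ATE and RR treat the variable as a binary exposure.
  if (vim %in% c("ate", "rr")) {
    stopifnot(all(var_int %in% c(0, 1)))
  }
  invisible(TRUE)
}

check_var_int(c(0, 1, 1, 0), n_subj = 4, vim = "ate")
```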

vim

Character indicating the variable importance measure to be used in the estimation procedure. Currently supported options are the ATE for discretized exposures and the NPVI for continuous exposures. The ATE and RR are appropriate when the scientific question concerns the effect of an exposure on methylation, while the NPVI (and other continuous treatment parameters) ought to be used when the effect of methylation on an outcome is sought.

type

Character indicating the particular measure of DNA methylation to be used as the observed data in the estimation procedure, either Beta values or M-values. The data are accessed via getBeta or getM.
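
For orientation, Beta values and M-values are usually related by a base-2 logit transform. The sketch below illustrates that standard relationship in base R; beta_to_m and m_to_beta are hypothetical helper names, and this is not code taken from methyvim or minfi.

```r
# Illustrative helpers (hypothetical names) for the usual relationship
# between Beta values (in (0, 1)) and M-values (on the real line).
beta_to_m <- function(beta) {
  log2(beta / (1 - beta))
}

m_to_beta <- function(m) {
  2^m / (2^m + 1)
}

beta <- c(0.1, 0.5, 0.9)
m <- beta_to_m(beta)   # a Beta value of 0.5 maps to an M-value of 0
m_to_beta(m)           # recovers the original Beta values
```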

filter

Character indicating the model to be implemented when screening the data_grs object for CpG sites. The only currently supported option is "limma".

filter_cutoff

Numeric indicating the p-value cutoff that defines which sites pass through the filter.

window_bp

Numeric indicating the maximum genomic distance (in base pairs) between two sites for them to be considered neighboring sites.

corr_max

Numeric indicating the maximum correlation that a neighboring site can have with the target site.
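
The interplay of window_bp and corr_max can be sketched as follows. This is an illustration under stated assumptions, not the package's internal implementation: find_neighbors is a hypothetical function, all sites are assumed to lie on one chromosome, and the use of absolute correlation is an assumption.

```r
# Hypothetical helper; not part of methyvim.
# pos:    genomic positions of the sites (same chromosome assumed)
# meth:   matrix of methylation values, rows = subjects, columns = sites
# target: column index of the target site
find_neighbors <- function(pos, meth, target,
                           window_bp = 1000, corr_max = 0.75) {
  # Sites within window_bp base pairs of the target, excluding the target.
  dist_bp <- abs(pos - pos[target])
  in_window <- which(dist_bp <= window_bp & seq_along(pos) != target)
  if (length(in_window) == 0) {
    return(integer(0))
  }
  # Keep only neighbors whose (absolute) correlation with the target
  # does not exceed corr_max.
  corr <- abs(cor(meth[, in_window, drop = FALSE], meth[, target]))
  in_window[as.vector(corr) <= corr_max]
}

set.seed(1)
pos <- c(100, 600, 1050, 5000)
base <- rnorm(50)
# Site 2 is nearly a copy of site 1; sites 3 and 4 are independent noise.
meth <- cbind(base, base + rnorm(50, sd = 0.01), rnorm(50), rnorm(50))
find_neighbors(pos, meth, target = 1)
```

In this toy setup, site 2 is inside the window but too highly correlated with the target, and site 4 lies outside the window, so only site 3 qualifies as a neighbor.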

obs_per_covar

Numeric indicating the number of observations needed for each covariate included in W for downstream analysis. This ensures that there are enough data to control for the covariates.
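
One plausible reading of this argument is that it caps the number of covariates at the number of observations divided by obs_per_covar. The arithmetic below is a sketch of that reading only; the variable names are hypothetical and the exact rule used internally may differ.

```r
# Hypothetical arithmetic; not methyvim's internal code.
n_obs <- 210          # number of subjects (made-up value)
obs_per_covar <- 20   # default from the usage section
max_covars <- floor(n_obs / obs_per_covar)
max_covars
```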

sites_comp

A numeric indicating the maximum number of sites for which a variable importance measure is to be estimated post-screening. This is not typically useful in scientific settings, but may be useful when a large number of CpG sites pass the initial screening phase.

parallel

Logical indicating whether parallelization ought to be used. See the documentation of set_parallel for more information, as this argument is passed directly to that internal function.

future_param

Character indicating the type of parallelization to be used from the list available via the future package. See the documentation for set_parallel for more information, as this argument is passed directly to that internal function.

bppar_type

Character specifying the type of backend to be used for parallelization via BiocParallel. See the documentation for set_parallel for more information, as this argument is passed directly to that internal function.

return_ic

Logical indicating whether an influence curve estimate should be returned for each site that passed through the filter.

shrink_ic

Logical indicating whether limma should be applied to reduce the variance in the influence curve-based estimates requested via return_ic.

tmle_type

Character indicating the general class of regression models to be used in fitting the propensity score and outcome regressions. This is generally a shorthand and is overridden by tmle_args if that argument is changed from its default values.

tmle_args

List giving several key arguments to be passed to one of tmle or tmle.npvi, depending on the particular variable importance measure specified. This overrides tmle_type, which itself provides sensible defaults. Consider changing this away from the default settings only if you have sufficient experience with the software and the underlying theory of Targeted Learning. For more information, consult the documentation of the tmle and tmle.npvi packages.

tmle_backend

A character indicating the package to be used in the estimation procedure. The user should only set this parameter if they have sufficient familiarity with the backend packages used for estimation. Current choices include tmle, drtmle, and tmle.npvi.

Value

An object of class methytmle with all unique slots filled in, including the indices of CpG sites that pass screening, the clusters of neighboring CpG sites, and a matrix of results from the estimation procedure for the given variable importance measure. Optionally, estimates of the propensity score and outcome regressions, as well as the original data rotated into influence curve space, may be returned if requested.

Examples

library(methyvimData)
suppressMessages(library(SummarizedExperiment))
data(grsExample)
var_int <- colData(grsExample)[, 1]
# TMLE procedure for the ATE parameter over M-values with Limma filtering
methyvim_out_ate <- suppressWarnings(
  methyvim(
    data_grs = grsExample, sites_comp = 1, var_int = var_int,
    vim = "ate", type = "Mval", filter = "limma", filter_cutoff = 0.05,
    parallel = FALSE, tmle_type = "sl"
  )
)

nhejazi/methyvim documentation built on April 30, 2020, 7:14 p.m.