biomarkertmle: Biomarker Evaluation with Targeted Minimum Loss Estimation of...


View source: R/biotmle.R

Description

Computes the causal target parameter, the average treatment effect (ATE), defined as the difference between mean biomarker expression under treatment and mean biomarker expression under no treatment, using targeted minimum loss estimation (TMLE).
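
As a hedged sketch in the notation of the g_lib and Q_lib arguments (A for the binary exposure, W for baseline covariates, Y for the expression of a single biomarker), the average treatment effect is conventionally written as below. The symbol \psi_0 is introduced here for illustration only and does not appear in the package documentation; reading the quantity as a causal contrast relies on the usual identification assumptions (no unmeasured confounding, positivity).

```latex
\psi_0 = \mathbb{E}_W\left[\, \mathbb{E}[Y \mid A = 1, W] - \mathbb{E}[Y \mid A = 0, W] \,\right]
```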

Usage

biomarkertmle(
  se,
  varInt,
  normalized = TRUE,
  ngscounts = FALSE,
  bppar_type = BiocParallel::MulticoreParam(),
  bppar_debug = FALSE,
  cv_folds = 1,
  g_lib = c("SL.mean", "SL.glm", "SL.bayesglm"),
  Q_lib = c("SL.mean", "SL.bayesglm", "SL.earth", "SL.ranger"),
  ...
)

Arguments

se

A SummarizedExperiment containing microarray expression or next-generation sequencing data in the assays slot and a matrix of phenotype-level data in the colData slot.

varInt

A numeric indicating the column of the design matrix corresponding to the treatment or outcome of interest (in the colData slot of the SummarizedExperiment argument "se").

normalized

A logical indicating whether the data in the assays slot of the input SummarizedExperiment object have been normalized externally. The default is TRUE, with the expectation that an appropriate normalization method has already been applied. If set to FALSE, median normalization is performed for microarray data.

ngscounts

A logical indicating whether the data are counts generated from a next-generation sequencing experiment (e.g., RNA-seq). The default setting assumes continuous expression measures as generated by microarray platforms.

bppar_type

A parallelization option specified by BiocParallel. Consult the manual page for BiocParallelParam for possible types and their descriptions. The default for this argument is MulticoreParam, for multicore evaluation.
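
A minimal sketch of three backends that BiocParallel provides and that could be supplied as bppar_type. The worker counts and variable names are illustrative assumptions, not recommendations from the package.

```r
library(BiocParallel)

# Serial evaluation: no parallel workers, simplest to reason about.
serial_param <- SerialParam()

# Forked multicore evaluation with an explicit worker count
# (forking is not available on Windows).
multicore_param <- MulticoreParam(workers = 2)

# Socket-based cluster of separate R processes; also usable on Windows.
snow_param <- SnowParam(workers = 2)

# bpnworkers() reports how many workers a backend is configured to use.
bpnworkers(serial_param)
```

Any of these objects could then be passed as bppar_type in the call shown under Usage.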

bppar_debug

A logical indicating whether to bypass BiocParallel for debugging. Setting this argument to TRUE replaces the call to bplapply with a call to lapply, which significantly reduces the overhead of debugging. Note that invoking this option overrides all other parallelization arguments.
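
The switch described for bppar_debug can be pictured with a small sketch. The names fit_one and run_rows and the toy data are hypothetical and are not the package's internal code; the sketch only shows that lapply and bplapply can be swapped over the same inputs.

```r
# Hypothetical per-biomarker worker; stands in for the real estimation step.
fit_one <- function(y) mean(y)

# Toy stand-in for rows of an expression matrix, one element per biomarker.
expr_rows <- list(b1 = c(1, 2, 3), b2 = c(4, 5, 6))

# Iterate over biomarkers, using plain lapply() when debugging and
# BiocParallel::bplapply() otherwise.
run_rows <- function(rows, debug = FALSE) {
  if (debug) {
    lapply(rows, fit_one)
  } else {
    BiocParallel::bplapply(
      rows, fit_one,
      BPPARAM = BiocParallel::SerialParam()
    )
  }
}

# Debug path: no BiocParallel machinery is involved.
debug_result <- run_rows(expr_rows, debug = TRUE)
```

With debug = TRUE, errors surface directly in the calling R session rather than inside a worker, which is what makes tools such as traceback() and browser() practical.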

cv_folds

A numeric scalar indicating how many folds to use in performing targeted minimum loss estimation. Cross-validated estimates have been shown to allow relaxation of certain theoretical conditions and to accommodate the construction of more conservative variance estimates.

g_lib

A character vector specifying the library of machine learning algorithms for use in fitting the propensity score P(A = a | W).

Q_lib

A character vector specifying the library of machine learning algorithms for use in fitting the outcome regression E[Y | A,W].
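
Before passing a library to g_lib or Q_lib, it can help to confirm that each name resolves to a SuperLearner wrapper and that the package behind it is installed. The sketch below is one way to check this; the variable names are illustrative, and the mapping of wrappers to backend packages reflects the SuperLearner wrappers as commonly documented, not anything stated on this page.

```r
library(SuperLearner)

# Default outcome-regression library from the Usage section.
Q_lib <- c("SL.mean", "SL.bayesglm", "SL.earth", "SL.ranger")

# Each name should resolve to a wrapper function once SuperLearner is attached.
wrappers_found <- vapply(Q_lib, exists, logical(1), mode = "function")

# Several wrappers call out to other packages at fit time, so a wrapper
# can exist even though its backend is not installed.
backend_pkgs <- c(SL.bayesglm = "arm", SL.earth = "earth", SL.ranger = "ranger")
backends_installed <- vapply(
  backend_pkgs, requireNamespace, logical(1), quietly = TRUE
)

wrappers_found
backends_installed
```

Calling listWrappers(what = "SL") prints the full set of prediction wrappers available in the installed SuperLearner version.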

...

Additional arguments to be passed to drtmle in computing the targeted minimum loss estimator of the average treatment effect.

Value

S4 object of class biotmle, inheriting from SummarizedExperiment, with additional slots tmleOut and call, among others, containing TML estimates of the ATE of exposure on biomarker expression.

Examples

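# Load data-manipulation tools, the example data, and the Bioconductor container
library(dplyr)
library(biotmleData)
library(SuperLearner)
library(SummarizedExperiment)
data(illuminaData)

# Dichotomize age at its median so that it enters as a binary covariate
colData(illuminaData) <- colData(illuminaData) %>%
  data.frame() %>%
  mutate(age = as.numeric(age > median(age))) %>%
  DataFrame()
# Column index of the exposure of interest (benzene) in colData
benz_idx <- which(names(colData(illuminaData)) %in% "benzene")

# Estimate the ATE of benzene on the first two biomarkers, run serially
# with small learner libraries to keep the example fast
biomarkerTMLEout <- biomarkertmle(
  se = illuminaData[1:2, ],
  varInt = benz_idx,
  bppar_type = BiocParallel::SerialParam(),
  g_lib = c("SL.mean", "SL.glm"),
  Q_lib = c("SL.mean", "SL.glm")
)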

nhejazi/biotmle documentation built on Oct. 15, 2021, 5:46 p.m.