hima: High-dimensional Mediation Analysis


View source: R/hima.R

Description

hima is used to estimate and test high-dimensional mediation effects.

Usage

hima(
  X,
  Y,
  M,
  COV.XM = NULL,
  COV.MY = COV.XM,
  family = c("gaussian", "binomial"),
  penalty = c("MCP", "SCAD", "lasso"),
  topN = NULL,
  parallel = FALSE,
  ncore = 1,
  verbose = FALSE,
  ...
)

Arguments

X

a vector of exposure.

Y

a vector of outcome. Can be either continuous or binary (0-1).

M

a data.frame or matrix of high-dimensional mediators. Rows represent samples, columns represent variables.

COV.XM

a data.frame or matrix of covariates used when testing the association M ~ X. Covariates specified here are not penalized. Default = NULL. If the covariates contain mixed data types, make sure all categorical variables are formatted as factors.

COV.MY

a data.frame or matrix of covariates used when testing the association Y ~ M. Covariates specified here are not penalized. If not specified, the same set of covariates as for M ~ X is used. Using different sets of covariates for the two models is allowed, but should be done with care.

family

either 'gaussian' or 'binomial', depending on the data type of the outcome (Y). See ncvreg.

penalty

the penalty to be applied to the model. Either 'MCP' (the default), 'SCAD', or 'lasso'. See ncvreg.

topN

an integer specifying the number of top markers to retain from sure independence screening. Default = NULL. If NULL, topN will be either ceiling(n/log(n)) if family = 'gaussian', or ceiling(n/(2*log(n))) if family = 'binomial', where n is the sample size. If the number of mediators is smaller than topN (pre-specified or calculated), all mediators will be included in the test (i.e. low-dimensional scenario).

parallel

logical. Enable parallel computing? Default = FALSE.

ncore

number of cores to use for parallel computing. Only used when parallel = TRUE. Default = 1.

verbose

logical. Should the function be verbose? Default = FALSE.

...

other arguments passed to ncvreg.
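The default for topN described above is simple arithmetic on the sample size. The following base R sketch evaluates the documented formulas for a given n. The helper name default_topN is hypothetical and is not part of the HIMA package; it only reproduces the formulas stated in the topN description.

```r
# Hypothetical helper (not exported by HIMA): evaluates the default topN
# formulas exactly as stated in the argument description above.
default_topN <- function(n, family = c("gaussian", "binomial")) {
  family <- match.arg(family)
  if (family == "gaussian") {
    ceiling(n / log(n))        # continuous outcome
  } else {
    ceiling(n / (2 * log(n)))  # binary outcome
  }
}

default_topN(200, "gaussian")  # 38
default_topN(200, "binomial")  # 19
```

With n = 200, as in the Examples section, screening would therefore keep 38 mediators for a continuous outcome and 19 for a binary outcome under these formulas.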

Value

A data.frame containing mediation testing results of selected mediators.

References

Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PubMed PMID: 27357171.

Examples

n <- 200  # sample size
p <- 200  # the number of mediators

# the regression coefficients alpha (exposure --> mediators)
alpha <- rep(0, p) 

# the regression coefficients beta (mediators --> outcome)
beta1 <- rep(0, p) # for continuous outcome
beta2 <- rep(0, p) # for binary outcome

# the first four markers are true mediators
alpha[1:4] <- c(0.45,0.5,0.6,0.7)
beta1[1:4] <- c(0.55,0.6,0.65,0.7)
beta2[1:4] <- c(1.45,1.5,1.55,1.6)

# these are not true mediators
alpha[7:8] <- 0.5
beta1[5:6] <- 0.8
beta2[5:6] <- 1.7

# Generate simulation data
simdat_cont <- simHIMA(n, p, alpha, beta1, seed = 1029)
simdat_bin <- simHIMA(n, p, alpha, beta2, binaryOutcome = TRUE, seed = 1029)

# Run HIMA with MCP penalty by default
# When Y is continuous (default)
hima.fit <- hima(simdat_cont$X, simdat_cont$Y, simdat_cont$M, verbose = TRUE) 
hima.fit

# When Y is binary (should specify family)
hima.logistic.fit <- hima(simdat_bin$X, simdat_bin$Y, simdat_bin$M,
                          family = "binomial", verbose = TRUE)
hima.logistic.fit
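The COV.XM and COV.MY descriptions ask that categorical covariates be formatted as factors. The base R sketch below shows one way to prepare such a data.frame before passing it to hima. The covariate names and values are made up for illustration, and the assumption that rows should follow the same sample order as X, Y and M is not stated in this page.

```r
# Hypothetical covariates for 6 samples (illustrative values only)
cov <- data.frame(
  age    = c(34, 51, 47, 29, 62, 55),
  sex    = c("F", "M", "F", "M", "F", "M"),
  smoker = c("never", "former", "current", "never", "never", "former"),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor; numeric columns are left as-is
is_chr <- vapply(cov, is.character, logical(1))
cov[is_chr] <- lapply(cov[is_chr], factor)

# Inspect the resulting column classes
sapply(cov, class)
```

A data.frame prepared this way could then be supplied through the COV.XM argument (and, if the outcome model needs different covariates, through COV.MY).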

HIMA documentation built on May 15, 2021, 9:06 a.m.