hima2: Advanced High-dimensional Mediation Analysis


hima2 R Documentation

Advanced High-dimensional Mediation Analysis

Description

hima2 is an upgraded version of hima for estimating and testing high-dimensional mediation effects.

Usage

hima2(
  formula,
  data.pheno,
  data.M,
  outcome.family = c("gaussian", "binomial", "survival", "quantile"),
  mediator.family = c("gaussian", "negbin", "compositional"),
  penalty = c("DBlasso", "MCP", "SCAD", "lasso"),
  topN = NULL,
  scale = TRUE,
  verbose = FALSE,
  ...
)

Arguments

formula

an object of class formula: a symbolic description of the overall effect model to be fitted, i.e., outcome ~ exposure + covariates. The exposure is the variable of interest and must be listed as the first variable on the right-hand side of the formula. The same covariates will be used in screening and penalized regression.
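As an illustration of the ordering requirement, the following base R sketch extracts the right-hand-side terms of a formula and treats the first one as the exposure. The variable names follow the examples on this page. How hima2 parses the formula internally is not documented here, so this is only an assumed equivalent.

```r
# Illustrative only: inspect a formula with base R tools.
f <- Outcome ~ Treatment + Sex + Age

rhs <- attr(terms(f), "term.labels")  # right-hand-side terms, in the order written
exposure   <- rhs[1]                  # the first term is treated as the exposure
covariates <- rhs[-1]                 # the remaining terms are covariates

exposure
covariates
```

Writing Outcome ~ Sex + Treatment + Age instead would make Sex the first term, so the order of the right-hand side matters.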

data.pheno

a data frame containing exposure and covariates that are listed in the right hand side of the formula. The variable names must match those listed in formula. By default hima2 will scale data.pheno.

data.M

a data.frame or matrix of high-dimensional mediators. Rows represent samples, columns represent variables. By default hima2 will scale data.M.
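A minimal sketch of preparing a mediator matrix in the expected orientation, using toy data. It assumes that the rows of data.M are expected to be in the same sample order as the rows of data.pheno. This page does not state whether hima2 matches samples by name or by position, so the name check below is a precaution rather than a documented requirement.

```r
# Toy data: 3 mediators measured on 5 samples, stored as mediators-by-samples.
M_by_mediator <- matrix(seq_len(15), nrow = 3,
                        dimnames = list(paste0("M", 1:3), paste0("S", 1:5)))

# Transpose so that rows are samples and columns are mediators.
M <- t(M_by_mediator)

# Toy phenotype data for the same 5 samples.
pheno <- data.frame(Treatment = c(0, 1, 0, 1, 1),
                    row.names = paste0("S", 1:5))

# Sanity checks before fitting: same number of samples, same order.
stopifnot(nrow(M) == nrow(pheno))
stopifnot(identical(rownames(M), rownames(pheno)))
```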

outcome.family

either 'gaussian' (default, for normally distributed continuous outcome), 'binomial' (for binary outcome), 'survival' (for time-to-event outcome), or 'quantile' (for quantile mediation analysis).

mediator.family

either 'gaussian' (default, for continuous mediators), 'negbin' (i.e., negative binomial, for RNA-seq data as mediators), or 'compositional' (for microbiome data as mediators), depending on the data type of high-dimensional mediators (data.M).

penalty

the penalty to be applied to the model. Either 'DBlasso' (de-biased LASSO, default), 'MCP', 'SCAD', or 'lasso'. Note that survival HIMA and microbiome HIMA can only be performed with 'DBlasso', whereas quantile HIMA cannot be performed with 'DBlasso'.
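The restrictions above can be sketched as a small pre-flight check. The helper check_penalty is hypothetical and not part of the package. It also assumes that "microbiome HIMA" corresponds to mediator.family = "compositional", as the mediator.family description suggests.

```r
# Hypothetical helper (not part of HIMA): encodes the penalty restrictions
# described in the 'penalty' argument.
check_penalty <- function(outcome.family, mediator.family, penalty) {
  # Survival and compositional (microbiome) analyses require DBlasso.
  needs_dblasso <- outcome.family == "survival" ||
    mediator.family == "compositional"
  if (needs_dblasso && penalty != "DBlasso") {
    return(FALSE)
  }
  # Quantile analysis cannot use DBlasso.
  if (outcome.family == "quantile" && penalty == "DBlasso") {
    return(FALSE)
  }
  TRUE
}

check_penalty("survival", "gaussian", "MCP")           # FALSE
check_penalty("quantile", "gaussian", "MCP")           # TRUE
check_penalty("gaussian", "compositional", "DBlasso")  # TRUE
```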

topN

an integer specifying the number of top markers retained from sure independence screening. Default = NULL. If NULL, topN will be ceiling(2 * n/log(n)), where n is the sample size. If the sample size is greater than topN (pre-specified or calculated), all mediators will be included in the test (i.e., a low-dimensional scenario).
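To give a feel for the default, the formula ceiling(2 * n/log(n)) can be evaluated directly in base R. The function name default_topN is hypothetical and used only for this illustration.

```r
# Default screening size as described above: ceiling(2 * n / log(n)),
# where log() is the natural logarithm.
default_topN <- function(n) ceiling(2 * n / log(n))

default_topN(100)  # 200 / 4.605 = 43.4, rounded up to 44
default_topN(300)  # 600 / 5.704 = 105.2, rounded up to 106
```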

scale

logical. Should the function scale the data (exposure, mediators, and covariates)? Default = TRUE.

verbose

logical. Should the function be verbose and show its progress? Default = FALSE.

...

other arguments.

Value

A data.frame containing mediation testing results of selected mediators.

References

1. Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PMID: 27357171; PMCID: PMC5048064

2. Zhang H, Zheng Y, Hou L, Zheng C, Liu L. Mediation Analysis for Survival Data with High-Dimensional Mediators. Bioinformatics. 2021. DOI: 10.1093/bioinformatics/btab564. PMID: 34343267; PMCID: PMC8570823

3. Zhang H, Chen J, Feng Y, Wang C, Li H, Liu L. Mediation effect selection in high-dimensional and compositional microbiome data. Stat Med. 2021. DOI: 10.1002/sim.8808. PMID: 33205470; PMCID: PMC7855955

4. Zhang H, Chen J, Li Z, Liu L. Testing for mediation effect with application to human microbiome data. Stat Biosci. 2021. DOI: 10.1007/s12561-019-09253-3. PMID: 34093887; PMCID: PMC8177450

5. Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, Xie K, Liu L. HIMA2: high-dimensional mediation analysis and its application in epigenome-wide DNA methylation data. BMC Bioinformatics. 2022. DOI: 10.1186/s12859-022-04748-1. PMID: 35879655; PMCID: PMC9310002

6. Zhang H, Hong X, Zheng Y, Hou L, Zheng C, Wang X, Liu L. High-Dimensional Quantile Mediation Analysis with Application to a Birth Cohort Study of Mother–Newborn Pairs. Bioinformatics. 2024. DOI: 10.1093/bioinformatics/btae055. PMID: 38290773; PMCID: PMC10873903

Examples

## Not run: 
# Note: In the following examples, M1, M2, and M3 are true mediators.
data(himaDat)

# Example 1 (continuous outcome):
head(himaDat$Example1$PhenoData)

e1 <- hima2(Outcome ~ Treatment + Sex + Age, 
      data.pheno = himaDat$Example1$PhenoData, 
      data.M = himaDat$Example1$Mediator,
      outcome.family = "gaussian",
      mediator.family = "gaussian",
      penalty = "DBlasso",
      scale = FALSE) # Disabled only for example data
e1
attributes(e1)$variable.labels

# Example 2 (binary outcome): 
head(himaDat$Example2$PhenoData)

e2 <- hima2(Disease ~ Treatment + Sex + Age, 
      data.pheno = himaDat$Example2$PhenoData, 
      data.M = himaDat$Example2$Mediator,
      outcome.family = "binomial",
      mediator.family = "gaussian",
      penalty = "DBlasso",
      scale = FALSE) # Disabled only for example data
e2
attributes(e2)$variable.labels

# Example 3 (time-to-event outcome): 
head(himaDat$Example3$PhenoData)

e3 <- hima2(Surv(Status, Time) ~ Treatment + Sex + Age, 
      data.pheno = himaDat$Example3$PhenoData, 
      data.M = himaDat$Example3$Mediator,
      outcome.family = "survival",
      mediator.family = "gaussian",
      penalty = "DBlasso",
      scale = FALSE) # Disabled only for example data
e3
attributes(e3)$variable.labels

# Example 4 (compositional data as mediator, e.g., microbiome): 
head(himaDat$Example4$PhenoData)

e4 <- hima2(Outcome ~ Treatment + Sex + Age, 
      data.pheno = himaDat$Example4$PhenoData, 
      data.M = himaDat$Example4$Mediator,
      outcome.family = "gaussian",
      mediator.family = "compositional",
      penalty = "DBlasso",
      scale = FALSE) # Disabled only for example data
e4
attributes(e4)$variable.labels

# Example 5 (quantile mediation analysis):
head(himaDat$Example5$PhenoData)

# Note that the function will prompt for the quantile level.
e5 <- hima2(Outcome ~ Treatment + Sex + Age, 
      data.pheno = himaDat$Example5$PhenoData, 
      data.M = himaDat$Example5$Mediator,
      outcome.family = "quantile",
      mediator.family = "gaussian",
      penalty = "MCP", # Quantile HIMA does not support DBlasso
      scale = FALSE, # Disabled only for example data
      tau = c(0.3, 0.5, 0.7)) # Specify multiple quantile levels
e5
attributes(e5)$variable.labels

## End(Not run)
                  

YinanZheng/HMA documentation built on April 23, 2024, 4:55 a.m.