knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(LUCIDus)
The LUCIDus package is aiming to provide researchers in the genetic epidemiology community with an integrative tool in R to obtain a joint estimation of latent or unknown clusters/subgroups with multi-omics data and phenotypic traits.
This package is an implementation for the novel statistical method proposed in the research paper "A Latent Unknown Clustering Integrating Multi-Omics Data (LUCID) with Phenotypic Traits^[https://doi.org/10.1093/bioinformatics/btz667]" published by the Bioinformatics. LUCID improves the subtype classification which leads to better diagnostics as well as prognostics and could be the potential solution for efficient targeted treatments and successful personalized medicine.
Four main functions, including est_lucid
, boot_lucid
, sem_lucid
, and tune_lucid
, are currently available for model fitting and feature selection. The model outputs can be summarized and visualized using summary_lucid
and plot_lucid
respectively. Predictions could be made with pred_lucid
.
Here are the descriptions of LUCIDus functions:
| Function | Description |
|:-----:|:---------------------|
| est_lucid()
| Estimates latent clusters using multi-omics data with/without the outcome of interest, and producing an IntClust object; missing values in biomarker data (Z) are allowed |
| boot_lucid()
| Latent cluster estimation through bootstrapping for statistical inference |
| sem_lucid()
| Latent cluster estimation with supplemented E-M algorithm for statistical inference |
| tune_lucid()
| Grid search for tuning parameters using parallel computing to determine an optimal choice of three tuning parameters with minimum model BIC |
| summary_lucid()
| Summarizes the results of integrative clustering based on an IntClust
object |
| plot_lucid()
| Produces a Sankey diagram for the results of integrative clustering based on an IntClust
object |
| pred_lucid()
| Predicts latent clusters and outcomes with an IntClust
object and new data |
| def_initial()
| Defines initial values of model parameters in est_lucid()
, sem_lucid()
, and tune_lucid()
fitting |
| def_tune()
| Defines selection options and tuning parameters in est_lucid()
, sem_lucid()
, and tune_lucid()
fitting |
| def_tol()
| Defines tolerance settings in est_lucid()
, sem_lucid()
, and tune_lucid()
fitting |
For a simulated data with 10 genetic features (5 causal) and 4 biomarkers (2 causal)
est_lucid()
^[Parameter initials and stopping criteria can be specified using def_initial()
and def_tol()
]set.seed(10) IntClusFit <- est_lucid(G=G1,Z=Z1,Y=Y1,K=2,family="binary",Pred=TRUE)
summary_lucid()
summary_lucid(IntClusFit)
plot_lucid()
plot_lucid(IntClusFit)
tune_lucid()
GridSearch <- tune_lucid(G=G1, Z=Z1, Y=Y1, K=2, Family="binary", USEY = TRUE, NoCores = 2, LRho_g = 0.008, URho_g = 0.012, NoRho_g = 3, LRho_z_invcov = 0.04, URho_z_invcov = 0.06, NoRho_z_invcov = 3, LRho_z_covmu = 90, URho_z_covmu = 100, NoRho_z_covmu = 3)
GridSearch$Results
GridSearch$Optimal
est_lucid()
are defined by def_tune()
]set.seed(0.01*0.05*95) IntClusFit <- est_lucid(G=G1,Z=Z1,Y=Y1,K=2,family="binary",Pred=TRUE, tunepar = def_tune(Select_G=TRUE,Select_Z=TRUE, Rho_G=0.01,Rho_Z_InvCov=0.05,Rho_Z_CovMu=95))
summary_lucid(IntClusFit)$No0G; summary_lucid(IntClusFit)$No0Z colnames(G1)[summary_lucid(IntClusFit)$select_G]; colnames(Z1)[summary_lucid(IntClusFit)$select_Z]
if(!all(summary_lucid(IntClusFit)$select_G==FALSE)){ G_select <- G1[,summary_lucid(IntClusFit)$select_G] } if(!all(summary_lucid(IntClusFit)$select_Z==FALSE)){ Z_select <- Z1[,summary_lucid(IntClusFit)$select_Z] }
set.seed(10) IntClusFitFinal <- est_lucid(G=G_select,Z=Z_select,Y=Y1,K=2,family="binary",Pred=TRUE)
plot_lucid(IntClusFitFinal)
# Re-run the model with covariates in the G->X path set.seed(10) IntClusCoFit <- est_lucid(G=G1,CoG=CoG,Z=Z1,Y=Y1,K=2,family="binary",Pred=TRUE)
# Check important model outputs summary_lucid(IntClusCoFit) # Visualize the results plot_lucid(IntClusCoFit)
# Re-run feature selection with covariates in both paths set.seed(10) IntClusCoFit <- est_lucid(G=G1,CoG=CoG,Z=Z1,Y=Y1,CoY=CoY,K=2,family="binary",Pred=TRUE, initial=def_initial(), itr_tol=def_tol(), tunepar = def_tune(Select_G=TRUE,Select_Z=TRUE, Rho_G=0.02,Rho_Z_InvCov=0.1,Rho_Z_CovMu=93)) # Re-fit with selected features with covariates set.seed(10) IntClusCoFitFinal <- est_lucid(G=G_select,CoG=CoG,Z=Z_select,Y=Y1,CoY=CoY,K=2, family="binary",Pred=TRUE) # Visualize the results plot_lucid(IntClusCoFitFinal)
set.seed(10) BootCIs <- boot_lucid(G = G1, CoG = CoG, Z = Z1, Y = Y1, CoY = CoY, useY = TRUE, family = "binary", K = 2, R=50) BootCIs$Results
set.seed(10) sem_lucid(G=G2,Z=Z2,Y=Y2,useY=TRUE,K=2,Pred=TRUE,family="normal",Get_SE=TRUE, itr_tol = def_tol(tol_b = 1e-02, tol_m = 1e-02, tol_s = 1e-02, tol_g = 1e-02))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.