The LUCIDus R package is an integrative tool for jointly estimating latent or unknown clusters/subgroups from multi-omics data and phenotypic traits. The package implements the statistical method proposed in the research paper “A Latent Unknown Clustering Integrating Multi-Omics Data (LUCID) with Phenotypic Traits”, published in Bioinformatics:
Cheng Peng, Jun Wang, Isaac Asante, Stan Louie, Ran Jin, Lida Chatzi, Graham Casey, Duncan C Thomas, David V Conti, A Latent Unknown Clustering Integrating Multi-Omics Data (LUCID) with Phenotypic Traits, Bioinformatics, Volume 36, Issue 3, 1 February 2020, Pages 842–850, https://doi.org/10.1093/bioinformatics/btz667
You can install the released version of LUCIDus from CRAN directly with:
install.packages("LUCIDus")
Alternatively, it can be installed from GitHub using the following code:
install.packages("devtools")
devtools::install_github("USCbiostats/LUCIDus")
library(LUCIDus)
Three functions, including est_lucid(), boot_lucid(), and tune_lucid(), are currently available for model fitting and selection. The model outputs can be summarized and visualized using summary_lucid() and plot_lucid() respectively. Predictions could be made with pred_lucid().
est_lucid()
Estimates latent clusters from multi-omics data. Missing values in the biomarker data are allowed, and information from the outcome of interest can be integrated.
For a test dataset with 10 genetic features (5 causal) and 4 biomarkers (2 causal):
set.seed(10)
IntClusFit <- est_lucid(G=G1,Z=Z1,Y=Y1,K=2,family="binary",Pred=TRUE)
summary_lucid()
summary_lucid(IntClusFit)
plot_lucid()
plot_lucid(IntClusFit)
IntClusCoFit <- est_lucid(G=G1,CoG=CoG,Z=Z1,Y=Y1,K=2,family="binary",Pred=TRUE)
summary_lucid(IntClusCoFit)
plot_lucid(IntClusCoFit)
boot_lucid()
Bootstrap method to obtain standard errors (SEs) for LUCID parameter estimates
set.seed(10)
boot_lucid(G = G1, CoG = CoG, Z = Z1, Y = Y1, CoY = CoY, useY = TRUE, family = "binary", K = 2, R=500)
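To make the idea of a bootstrap standard error concrete, here is a minimal base R sketch on simulated data. It is only an illustration of the general technique (resample with replacement, re-estimate, take the standard deviation of the estimates). It does not use LUCIDus, and it is not a description of how boot_lucid() is implemented internally. The data vector `x` and the choice of the mean as the estimated parameter are assumptions made for the example.

```r
# Generic nonparametric bootstrap in base R (illustrative; not LUCIDus internals)
set.seed(10)
x <- rnorm(200, mean = 1, sd = 2)    # hypothetical simulated data

R <- 500                              # number of bootstrap replicates
boot_est <- replicate(R, {
  idx <- sample.int(length(x), replace = TRUE)  # resample observations with replacement
  mean(x[idx])                                  # re-estimate the parameter on the resample
})

boot_se <- sd(boot_est)                   # bootstrap standard error of the mean
analytic_se <- sd(x) / sqrt(length(x))    # closed-form SE of the mean, for comparison
c(bootstrap = boot_se, analytic = analytic_se)
```

For a simple statistic such as the mean, the two values should be close. For a latent cluster model no closed-form SE is generally available, which is why a resampling approach is useful there.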
tune_lucid()
Grid search for tuning parameters using parallel computing
# Best run on a server or HPC cluster
set.seed(10)
GridSearch <- tune_lucid(G=G1, Z=Z1, Y=Y1, K=2, Family="binary", USEY = TRUE,
LRho_g = 0.008, URho_g = 0.012, NoRho_g = 3,
LRho_z_invcov = 0.04, URho_z_invcov = 0.06, NoRho_z_invcov = 3,
LRho_z_covmu = 90, URho_z_covmu = 110, NoRho_z_covmu = 2)
GridSearch$Results
GridSearch$Optimal
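The lower bound, upper bound, and count arguments in the call above describe a three-dimensional grid of penalty values. The base R sketch below shows what such a grid could look like, assuming the values are equally spaced between the bounds. That spacing is an assumption for illustration, not something confirmed here about tune_lucid(), and the column names are hypothetical labels borrowed from the def_tune() arguments.

```r
# Hypothetical reconstruction of the search grid (assumes equally spaced values)
rho_g        <- seq(0.008, 0.012, length.out = 3)  # from LRho_g, URho_g, NoRho_g
rho_z_invcov <- seq(0.04, 0.06, length.out = 3)    # from LRho_z_invcov, URho_z_invcov, NoRho_z_invcov
rho_z_covmu  <- seq(90, 110, length.out = 2)       # from LRho_z_covmu, URho_z_covmu, NoRho_z_covmu

# Every combination of the three penalties is one candidate model
grid <- expand.grid(Rho_G = rho_g,
                    Rho_Z_InvCov = rho_z_invcov,
                    Rho_Z_CovMu = rho_z_covmu)
nrow(grid)   # 3 * 3 * 2 = 18 combinations
head(grid)
```

Under this assumption the search fits 18 candidate models, which is why running it in parallel on a server is recommended. The values Rho_G=0.01, Rho_Z_InvCov=0.06, and Rho_Z_CovMu=90 all lie on this grid.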
Run LUCID with best tuning parameters and select informative features
set.seed(10)
IntClusFit <- est_lucid(G=G1,Z=Z1,Y=Y1,K=2,family="binary",Pred=TRUE,
tunepar = def_tune(Select_G=TRUE,Select_Z=TRUE,
Rho_G=0.01,Rho_Z_InvCov=0.06,Rho_Z_CovMu=90))
# Identify selected features
summary_lucid(IntClusFit)$No0G; summary_lucid(IntClusFit)$No0Z
colnames(G1)[summary_lucid(IntClusFit)$select_G]; colnames(Z1)[summary_lucid(IntClusFit)$select_Z]
# Select the features
# drop=FALSE keeps the result two-dimensional even when only one feature is selected
if(any(summary_lucid(IntClusFit)$select_G)){
G_select <- G1[,summary_lucid(IntClusFit)$select_G,drop=FALSE]
}
if(any(summary_lucid(IntClusFit)$select_Z)){
Z_select <- Z1[,summary_lucid(IntClusFit)$select_Z,drop=FALSE]
}
set.seed(10)
IntClusFitFinal <- est_lucid(G=G_select,Z=Z_select,Y=Y1,K=2,family="binary",Pred=TRUE)
plot_lucid(IntClusFitFinal)
IntClusCoFit <- est_lucid(G=G1,CoG=CoG,Z=Z1,Y=Y1,K=2,family="binary",Pred=TRUE,
initial=def_initial(), itr_tol=def_tol(),
tunepar = def_tune(Select_G=TRUE,Select_Z=TRUE,Rho_G=0.02,Rho_Z_InvCov=0.1,Rho_Z_CovMu=93))
summary_lucid(IntClusCoFit)
IntClusCoFitFinal <- est_lucid(G=G_select,CoG=CoG,Z=Z_select,Y=Y1,K=2,family="binary",Pred=TRUE)
plot_lucid(IntClusCoFitFinal)
For more details, see the documentation for each function in the R package.
The current version is 1.0.0.
For the available versions, see the Releases page of this repository.
This project is licensed under the GPL-2 License.