knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(CrIMMix)
library(dplyr)
library(stringr)
library(ggplot2)
This vignette shows how to use the CrIMMix package. The package groups several R packages that perform integrative analysis of multi-omics datasets. It can also simulate data, so that existing packages can be compared.
The number of clusters, the number of samples per cluster, and the length of the signals can all be tuned. The user then chooses the distribution of the data: normal, binary, or beta (values between 0 and 1).
Simulation under a normal distribution can represent, for instance, gene expression data or raw copy-number signals. For each cluster, the user chooses the mean and the standard deviation of the distribution (mean and sd in params), which can vary from one cluster to another. The user also chooses the proportion of variables affected by this cluster-specific distribution. Finally, the level of background noise can be adjusted.
set.seed(34)
c_1 <- simulateY(nclust = 4, n_byClust = 20, J = 100, flavor = "normal",
                 params = list(c(mean = 3, sd = 1)), prop = 0.1, noise = 1)
heatmap(c_1$data)
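To make the role of each parameter concrete, here is a minimal base-R sketch of this kind of simulation. It is not the code behind simulateY: the function name simulate_normal_sketch, the way informative variables are assigned to clusters, and the use of noise as the background standard deviation are all assumptions made for illustration.

```r
# Hypothetical sketch; NOT the CrIMMix implementation of simulateY().
simulate_normal_sketch <- function(nclust = 4, n_by_clust = 20, J = 100,
                                   mean = 3, sd = 1, prop = 0.1, noise = 1) {
  n <- nclust * n_by_clust
  clusters <- rep(seq_len(nclust), each = n_by_clust)

  # Background: centred Gaussian noise (assumption: `noise` is its sd).
  dat <- matrix(rnorm(n * J, mean = 0, sd = noise), nrow = n, ncol = J)

  # Assumption: each cluster gets its own disjoint set of informative variables.
  n_signal <- round(prop * J)
  stopifnot(n_signal >= 1, nclust * n_signal <= J)
  shuffled <- sample(J)
  positive <- split(shuffled[seq_len(nclust * n_signal)],
                    rep(seq_len(nclust), each = n_signal))

  # Overwrite the informative block of each cluster with the shifted distribution.
  for (k in seq_len(nclust)) {
    rows <- which(clusters == k)
    cols <- positive[[k]]
    dat[rows, cols] <- rnorm(length(rows) * length(cols), mean = mean, sd = sd)
  }
  list(data = dat, true.clusters = clusters, positive = positive)
}

set.seed(34)
sim <- simulate_normal_sketch()
dim(sim$data)  # samples in rows, variables in columns
```

In this sketch, raising prop widens the informative blocks, while raising noise relative to mean makes the clusters harder to separate.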
Simulation under a binary distribution can represent, for instance, mutations in tumor samples. For each cluster, the user chooses the mutation rate, which can vary from one cluster to another. The user also chooses the proportion of variables affected by this cluster-specific rate. Finally, the background noise can be adjusted: random mutations are added at a proportion equal to the noise value.
set.seed(34)
c_2 <- simulateY(nclust = 4, n_byClust = 20, J = 100, flavor = "binary",
                 params = list(c(p = 0.7)), prop = 0.1, noise = 0.1)
heatmap(c_2$data)
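The interplay between the mutation rate and the background noise can be illustrated with a short base-R sketch. It only mimics the description above and is not taken from simulateY: the variable names and the contiguous block layout of the affected variables are assumptions.

```r
# Hypothetical sketch of a binary simulation; NOT the CrIMMix implementation.
set.seed(34)
nclust <- 4; n_by_clust <- 20; J <- 100
p <- 0.7; prop <- 0.1; noise <- 0.1

clusters <- rep(seq_len(nclust), each = n_by_clust)
n_signal <- round(prop * J)

# Per-entry mutation probabilities, initialised at the background rate.
prob <- matrix(noise, nrow = length(clusters), ncol = J)
for (k in seq_len(nclust)) {
  # Assumption: cluster k affects a contiguous block of n_signal variables.
  cols <- (k - 1) * n_signal + seq_len(n_signal)
  prob[clusters == k, cols] <- p
}

# One Bernoulli draw per entry; rbinom() is vectorised over `prob`.
mut <- matrix(rbinom(length(prob), size = 1, prob = prob), nrow = nrow(prob))

# Observed mutation rate inside the block of cluster 1, and overall.
mean(mut[clusters == 1, seq_len(n_signal)])
mean(mut)
```

Building the probability matrix first keeps the signal and the noise in one place, so a single call to rbinom generates the whole dataset.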
Simulation with the beta flavor can represent, for instance, methylation data, with values between 0 and 1. The user can also choose the proportion of variables that are hypermethylated or hypomethylated within the same subgroup. Finally, the level of background noise can be adjusted.
set.seed(34)
params_beta <- list(c(mean1 = -2, mean2 = 2, sd1 = 0.5, sd2 = 0.5))
c_3 <- simulateY(nclust = 4, n_byClust = 20, J = 500, flavor = "beta",
                 params = params_beta, prop = 0.2, noise = 1)
heatmap(c_3$data)
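One common way to obtain methylation-like values in the interval (0, 1) is to draw Gaussian values and map them through the logistic function. The sketch below assumes that mean1/sd1 and mean2/sd2 can be read as hypo- and hypermethylated components on the logit scale. This reading is a guess based on the parameter names, not a description of how simulateY uses them.

```r
# Hypothetical sketch: logit-normal draws as a stand-in for methylation values.
# Assumption: means and standard deviations are given on the logit scale.
set.seed(34)
hypo  <- plogis(rnorm(1000, mean = -2, sd = 0.5))  # concentrated near 0
hyper <- plogis(rnorm(1000, mean =  2, sd = 0.5))  # concentrated near 1

range(c(hypo, hyper))                               # all values lie strictly in (0, 1)
c(hypo = median(hypo), hyper = median(hyper))

hist(c(hypo, hyper), breaks = 40,
     main = "Two-component methylation-like values", xlab = "value")
```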
We provide a wrapper called IntMultiOmics to run several methods for multi-omics data simply. The method is selected with the method argument of IntMultiOmics. Arguments specific to each method can also be passed; if none are specified, the default parameters of that method are used. Below, we give examples for Mocluster, RGCCA, and iCluster. For each method, we compute the ARI (Adjusted Rand Index).
data <- list(c_1$data, c_2$data, c_3$data)
An example of how to run the Mocluster method @meng2015mocluster is given below:
res_Mocluster <- IntMultiOmics(data, K=4, method="Mocluster", k=c(0.2, 0.2, 0.4))
Then, we run RGCCA @tenenhaus2011regularized.
res_RGCCA <- IntMultiOmics(data, K=4, method="RGCCA")
Finally, we try iCluster @mo2013pattern.
res_icluster <- IntMultiOmics(data, K=4, method="iCluster", type=c("gaussian", "binomial", "gaussian"), lambda=c(0.01, 0.01, 0.005))
adjustedRIComputing(res_Mocluster,c_1$true.clusters)
adjustedRIComputing(res_RGCCA,c_1$true.clusters)
adjustedRIComputing(res_icluster,c_1$true.clusters)
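For readers unfamiliar with the ARI, the following base-R sketch computes it from the contingency table of two partitions. The function adjusted_rand_index is written here for illustration only; it is not the code behind adjustedRIComputing, whose internals are not shown in this vignette.

```r
# Illustrative implementation of the Adjusted Rand Index (Hubert and Arabie form).
adjusted_rand_index <- function(x, y) {
  stopifnot(length(x) == length(y))
  tab <- table(x, y)                       # contingency table of the two partitions
  n_pairs <- function(m) sum(choose(m, 2)) # number of sample pairs within groups

  sum_ij <- n_pairs(tab)                   # pairs grouped together in both partitions
  sum_a  <- n_pairs(rowSums(tab))          # pairs grouped together in x
  sum_b  <- n_pairs(colSums(tab))          # pairs grouped together in y

  expected  <- sum_a * sum_b / choose(sum(tab), 2)  # value expected by chance
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

labels_true <- rep(1:4, each = 20)
adjusted_rand_index(labels_true, labels_true)                # identical partitions
adjusted_rand_index(labels_true, c(2, 3, 4, 1)[labels_true]) # same partition, labels permuted
adjusted_rand_index(labels_true, rep(1:4, times = 20))       # unrelated partition
```

The index equals 1 for a perfect match regardless of how the clusters are labelled, and stays near 0 when the agreement is no better than chance.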
trueDat1 <- c_1$positive %>% unlist %>% unique
trueDat2 <- c_2$positive %>% unlist %>% unique
trueDat3 <- c_3$positive %>% unlist %>% unique
truth <- list(trueDat1, trueDat2, trueDat3)
roc_moclust <- roc_eval(truth = truth, fit = res_Mocluster$fit, method = "Mocluster")
plot_roc_eval(roc_moclust)
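The idea behind a ROC evaluation of variable selection can be sketched in base R: rank the variables by an importance score, then count how many truly informative variables are recovered as the cut-off moves down the ranking. The helper roc_points and the toy scores below are hypothetical and do not reproduce what roc_eval computes internally.

```r
# Hypothetical sketch of ROC points for variable selection; NOT roc_eval().
roc_points <- function(scores, positives) {
  is_pos <- seq_along(scores) %in% positives
  ord <- order(scores, decreasing = TRUE)  # most important variables first
  tp <- cumsum(is_pos[ord])                # true positives at each cut-off
  fp <- cumsum(!is_pos[ord])               # false positives at each cut-off
  data.frame(fpr = c(0, fp / sum(!is_pos)),
             tpr = c(0, tp / sum(is_pos)))
}

set.seed(34)
positives <- 1:10                          # toy set of truly informative variables
scores <- abs(rnorm(100))                  # toy importance scores
scores[positives] <- scores[positives] + 2

roc <- roc_points(scores, positives)

# Area under the curve by the trapezoidal rule.
auc <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
auc

plot(roc$fpr, roc$tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)                      # reference line for random ranking
```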