knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
First, we load the 'GFM' package and the real data, which can be downloaded from the URL in the code below. The data are stored in '.Rdata' format and include a gene expression matrix 'X' with 3460 rows (cells) and 2000 columns (genes); a vector 'group' assigning each column to one of two variable-type groups; a vector 'type' giving the variable type of each group ('gaussian' and 'poisson'); and a vector 'y' holding the cell clusters annotated by experts. We compare the performance of 'GFM' and 'LFM' in downstream clustering analysis, using the annotated clusters 'y' as the benchmark.
githubURL <- "https://github.com/feiyoung/GFM/blob/main/vignettes_data/Brain76.Rdata?raw=true"
download.file(githubURL, "Brain76.Rdata", mode = 'wb')
Then we load the data into R.
load("Brain76.Rdata")
XList <- list(X[, group == 1], X[, group == 2])
types <- type
str(XList)
library("GFM")
# load("vignettes_data\\Brain76.Rdata")
# ls() # check the variables
set.seed(2023) # set a random seed for reproducibility
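To illustrate the input layout built above, here is a minimal sketch on simulated data. It assumes that each element of the list holds the columns of a single variable type and that the type vector is ordered to match the list. The names 'X_toy', 'group_toy', 'type_toy' and 'XList_toy' are hypothetical and are not part of the Brain76 data.

```r
# Toy illustration only: simulated data, not the Brain76 data set.
set.seed(1)
n <- 5                                           # number of cells (rows)
X_toy <- cbind(
  matrix(rnorm(n * 3), n, 3),                    # 3 continuous (Gaussian) columns
  matrix(rpois(n * 2, lambda = 2), n, 2)         # 2 count (Poisson) columns
)
group_toy <- c(rep(1, 3), rep(2, 2))             # group label for each column
type_toy <- c("gaussian", "poisson")             # assumed: one type per group, in order

# Split the columns by group, mirroring the XList construction above
XList_toy <- list(X_toy[, group_toy == 1], X_toy[, group_toy == 2])
str(XList_toy)
```

Every matrix in the list keeps the same rows (cells); only the columns differ between groups.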
We fit the GFM model using the 'gfm' function.
q <- 15
system.time(
  gfm1 <- gfm(XList, types, q = q, verbose = TRUE)
)
We cluster the cells using the factors extracted by GFM and compute the adjusted Rand index (ARI) against the expert-annotated cluster labels.
hH <- gfm1$hH
library(mclust)
set.seed(1)
gmm1 <- Mclust(hH, G = 7)
ARI_gfm <- adjustedRandIndex(gmm1$classification, y)
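To make the ARI concrete, the following is a minimal base-R sketch of the standard (Hubert-Arabie) adjusted Rand index computed from a contingency table. The helper 'ari' is hypothetical and is shown for illustration only; it is not the 'mclust' implementation, and it does not handle the degenerate case where both partitions have a single cluster.

```r
# Hypothetical helper: adjusted Rand index of two label vectors, base R only.
ari <- function(a, b) {
  tab <- table(a, b)                        # contingency table of the two partitions
  n <- sum(tab)
  sum_ij <- sum(choose(tab, 2))             # pairs placed together in both partitions
  sum_a <- sum(choose(rowSums(tab), 2))     # pairs placed together in partition a
  sum_b <- sum(choose(colSums(tab), 2))     # pairs placed together in partition b
  expected <- sum_a * sum_b / choose(n, 2)  # expected agreement under random labeling
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

truth <- c(1, 1, 1, 2, 2, 2)
ari(truth, c(2, 2, 2, 1, 1, 1))  # same partition with swapped labels: ARI is 1
ari(truth, c(1, 1, 2, 2, 3, 3))  # partial agreement: ARI is below 1
```

The ARI does not depend on how the clusters are numbered, so the cluster labels returned by a mixture model do not need to be matched to the annotation before the comparison.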
We fit a linear factor model (LFM) using the same number of factors.
fac <- Factorm(X, q = 15)
hH_lfm <- fac$hH
set.seed(1)
gmm2 <- Mclust(hH_lfm, G = 7)
ARI_lfm <- adjustedRandIndex(gmm2$classification, y)
We compare the ARIs of the two methods with a bar plot.
library(ggplot2)
df1 <- data.frame(ARI = c(ARI_gfm, ARI_lfm),
                  Method = factor(c('GFM', "LFM")))
ggplot(data = df1, aes(x = Method, y = ARI, fill = Method)) +
  geom_bar(position = "dodge", stat = "identity", width = 0.5)