An R package to predict unmeasured metabolites
Metabolites are used in studies of many cancer types. However, it is very hard to measure every metabolite in every patient. This tool predicts unmeasured metabolites from the measured ones.
You can install mpred from GitHub using the following commands:
install.packages("devtools")
library(devtools)
devtools::install_github("czang97/mpred")
Metabolite data from 7 studies are bundled with the mpred package.
library(mpred)  # attach the package so the bundled datasets are visible
RC12 <- RC12
RC18 <- RC18
RC12_sampleinfo <- RC12_sampleinfo
RC18_sampleinfo <- RC18_sampleinfo
If you don't have a hold-out dataset, that is, you are working with a single dataset, use prepare_data. Then split the dataset into a training set and a test set.
library(dplyr)  # provides sample_frac() and setdiff() for data frames
RC12_tumor <- prepare_data(df = RC12, df_sampleinfo = RC12_sampleinfo, df_sample_name = "SAMPLE_NAME", type = "tumor")
# Tidy dataset
t_RC12_tumor <- df_tidy(RC12_tumor, standardize = "z")
# Get metabolite ids (same helper as in the hold-out workflow)
m_id_sort <- get_m_id_vector(RC12_tumor)
# Separate dataset into train set and test set
set.seed(58)
t_df_tumor_train <- sample_frac(t_RC12_tumor, 0.7)  # train
t_df_tumor_test <- setdiff(t_RC12_tumor, t_df_tumor_train)  # test
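The 70/30 split above can also be done with base R by sampling row indices. This is a minimal sketch on a toy data frame; the names toy, toy_train and toy_test and the simulated values are hypothetical and are not part of mpred.

```r
# Toy data frame standing in for a tidy metabolite table (hypothetical values)
set.seed(58)
toy <- data.frame(
  sample_id = paste0("S", 1:10),
  m1 = rnorm(10),
  m2 = rnorm(10)
)

# Draw 70% of the row indices for training; the remaining rows form the test set
train_idx <- sample(nrow(toy), size = round(0.7 * nrow(toy)))
toy_train <- toy[train_idx, ]
toy_test <- toy[-train_idx, ]
```

Splitting by row index keeps the two sets disjoint by position, even if two samples happen to have identical values.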
If you have a hold-out dataset, for example you train on your first dataset and test on your second, and the two datasets come from different studies, you can use the function subset_data.
# Named `matched` rather than `list` to avoid masking base::list
matched <- subset_data(df1 = RC12, df1_sampleinfo = RC12_sampleinfo, df2 = RC18, df2_sampleinfo = RC18_sampleinfo, df1_sample_name = "SAMPLE_NAME", df2_sample_name = "SAMPLE_NAME")
RC12_tumor_match <- matched[[1]]
RC18_tumor_match <- matched[[2]]
# Tidy dataset
t_RC12_tumor <- df_tidy(RC12_tumor_match, standardize = "z")
t_RC18_tumor <- df_tidy(RC18_tumor_match, standardize = "z")
# Get metabolite ids
m_id_sort <- get_m_id_vector(RC12_tumor_match)
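To illustrate what standardize = "z" plausibly refers to, here is a small base R sketch of z-score standardization, where each column is centered to mean 0 and scaled to standard deviation 1. It assumes a per-metabolite z-score; how df_tidy actually standardizes is not confirmed here, and the toy data are hypothetical.

```r
# Toy table: rows are samples, columns are metabolites (hypothetical values)
toy <- data.frame(
  m1 = c(1, 2, 3, 4, 5),
  m2 = c(10, 20, 30, 40, 50)
)

# scale() subtracts each column's mean and divides by its standard deviation
toy_z <- as.data.frame(scale(toy))

# After standardization every column has mean 0 and standard deviation 1
col_means <- colMeans(toy_z)
col_sds <- apply(toy_z, 2, sd)
```

Putting metabolites on a common scale matters for LASSO because the penalty acts on coefficient size, which depends on the units of each predictor.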
Run the LASSO model for each metabolite and evaluate the model fit. The output is a vector of MSE values and a vector of r2 values, one entry per metabolite.
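# Preallocate one MSE and one r2 value per metabolite
MSE <- numeric(length(m_id_sort))
r2 <- numeric(length(m_id_sort))
# With a hold-out dataset, pass t_RC12_tumor as the training set and t_RC18_tumor as the test set
for (i in seq_along(m_id_sort)) {
  set.seed(1)
  MSE[i] <- LASSO_model2_fit(t_df_tumor_train, t_df_tumor_test, m_id_sort, i, eval = "mse")
  r2[i] <- LASSO_model2_fit(t_df_tumor_train, t_df_tumor_test, m_id_sort, i, eval = "r2")
}
# Extract the coefficients of the LASSO models
# (assumption: the first argument is the tidy data the models are fit on)
coef_list <- LASSO_coef(t_df_tumor_train, m_id_sort)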
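For readers who want to see the idea behind this step, the sketch below fits a LASSO regression for a single simulated target with the glmnet package and computes the test-set MSE and r2 by hand. It is an illustration under assumptions, not mpred's implementation: whether LASSO_model2_fit uses glmnet, cross-validation, or lambda.min is not stated in this README, and all data and names here are hypothetical.

```r
library(glmnet)

set.seed(1)
n <- 100  # samples
p <- 10   # measured metabolites used as predictors

# Simulated predictor matrix and a target that depends on two predictors
x <- matrix(rnorm(n * p), nrow = n, ncol = p,
            dimnames = list(NULL, paste0("m", 1:p)))
y <- 2 * x[, 1] - x[, 2] + rnorm(n, sd = 0.5)

# 70/30 train/test split by row index
train_idx <- sample(n, 70)

# alpha = 1 selects the LASSO penalty; lambda is chosen by cross-validation
cv_fit <- cv.glmnet(x[train_idx, ], y[train_idx], alpha = 1)

# Predict the held-out samples at the lambda with the lowest CV error
pred <- as.numeric(predict(cv_fit, newx = x[-train_idx, ], s = "lambda.min"))
y_test <- y[-train_idx]

# Mean squared error and coefficient of determination on the test set
mse <- mean((y_test - pred)^2)
r2 <- 1 - sum((y_test - pred)^2) / sum((y_test - mean(y_test))^2)

# Coefficients at the selected lambda; irrelevant predictors shrink toward zero
coefs <- coef(cv_fit, s = "lambda.min")
```

Repeating this for each metabolite in turn, with the remaining metabolites as predictors, gives one MSE and one r2 value per metabolite.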
You can also plot predicted versus actual values.
# Build one predicted-versus-actual plot per metabolite
plot_list <- plot_actual_vs_fit(t_df_tumor_train, t_df_tumor_test, m_id_sort)
# Save all plots to a single PDF
pdf("plot_list.pdf")
for (i in seq_along(m_id_sort)) {
  print(plot_list[[i]])
}
dev.off()