An R package to predict unmeasured metabolites: Metabolites are used in studies of different cancer types. However, it is very hard to measure all the metabolites in patients. This tool helps you predict unmeasured metabolites from the measured ones.
You can install mpred from GitHub using the following commands:
install.packages("devtools")
library(devtools)
devtools::install_github("czang97/mpred")
Metabolite data from 7 studies are hosted in the mpred package. For example, to load two of the datasets and their sample information:
RC12 <- RC12
RC18 <- RC18
RC12_sampleinfo <- RC12_sampleinfo
RC18_sampleinfo <- RC18_sampleinfo
If you don't have a hold-out dataset and are working with only one dataset, use prepare_data. Then split your dataset into a training set and a test set.
library(dplyr) # provides sample_frac() and setdiff() for data frames
RC12_tumor <- prepare_data(df = RC12, df_sampleinfo = RC12_sampleinfo, df_sample_name = "SAMPLE_NAME", type = "tumor")
# Tidy dataset
t_df_tumor <- df_tidy(RC12_tumor, standardize = "z")
# Get metabolite ids (needed later for model fitting)
m_id_sort <- get_m_id_vector(RC12_tumor)
# Separate dataset into training set and test set
set.seed(58)
t_df_tumor_train <- sample_frac(t_df_tumor, 0.7) # train
t_df_tumor_test <- setdiff(t_df_tumor, t_df_tumor_train) # test
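The split above relies on `sample_frac()` and `setdiff()` from dplyr. Because `setdiff()` compares whole rows, any exact duplicate rows would be dropped from the test set. A minimal base-R sketch of an index-based 70/30 split that avoids this is shown below. The data frame `toy_df` and its columns are made up for illustration and are not part of mpred.

```r
# Toy data standing in for a tidied metabolite table (hypothetical values).
set.seed(58)
toy_df <- data.frame(
  m1 = rnorm(10),
  m2 = rnorm(10),
  m3 = rnorm(10)
)

# Draw 70% of the row indices for training; the remaining rows form the test set.
n_train   <- round(0.7 * nrow(toy_df))
train_idx <- sample(seq_len(nrow(toy_df)), size = n_train)

toy_train <- toy_df[train_idx, , drop = FALSE]
toy_test  <- toy_df[-train_idx, , drop = FALSE]
```

Every row ends up in exactly one of the two sets, regardless of whether rows repeat.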
If you have a hold-out dataset (say you are training on your first dataset and testing on your second, and the two datasets come from different studies), you can use the function subset_data:
# Named `matched` rather than `list` to avoid masking base::list()
matched <- subset_data(df1 = RC12, df1_sampleinfo = RC12_sampleinfo, df2 = RC18, df2_sampleinfo = RC18_sampleinfo, df1_sample_name = "SAMPLE_NAME", df2_sample_name = "SAMPLE_NAME")
RC12_tumor_match <- matched[[1]]
RC18_tumor_match <- matched[[2]]
# Tidy dataset
t_RC12_tumor <- df_tidy( RC12_tumor_match, standardize = "z")
t_RC18_tumor <- df_tidy( RC18_tumor_match, standardize = "z")
# get metabolites id
m_id_sort <- get_m_id_vector(RC12_tumor_match)
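The `standardize = "z"` argument presumably requests z-score standardization. How `df_tidy` implements it (for example, whether it scales per metabolite or per sample) is not stated here, so the following is only a sketch of the general technique in base R, assuming one column per metabolite. The matrix `toy_mat` is hypothetical.

```r
# Toy matrix: rows are samples, columns are metabolites (hypothetical values).
toy_mat <- matrix(
  c(1, 2, 3, 4,
    10, 20, 30, 40),
  ncol = 2,
  dimnames = list(NULL, c("m1", "m2"))
)

# scale() centers each column to mean 0 and rescales it to standard deviation 1.
toy_z <- scale(toy_mat, center = TRUE, scale = TRUE)

colMeans(toy_z)      # approximately 0 for every column
apply(toy_z, 2, sd)  # approximately 1 for every column
```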
Run the LASSO model and evaluate the model fit. This outputs a vector of MSE values and a vector of r2 values, one entry per metabolite:
MSE <- numeric(length(m_id_sort)) # initialize a vector for MSE
r2 <- numeric(length(m_id_sort)) # initialize a vector for r2
for (i in seq_along(m_id_sort)) {
  set.seed(1)
  # Fit on the training set and evaluate on the test set for the i-th metabolite
  MSE[i] <- LASSO_model2_fit(t_df_tumor_train, t_df_tumor_test, m_id_sort, i, eval = "mse")
  r2[i] <- LASSO_model2_fit(t_df_tumor_train, t_df_tumor_test, m_id_sort, i, eval = "r2")
}
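For reference, the two evaluation metrics named above can be sketched with their standard definitions in base R. This is not taken from the mpred source: the helper names `mse` and `r_squared` are hypothetical, and `LASSO_model2_fit` may compute r2 differently (for example, as a squared correlation).

```r
# Mean squared error between observed and predicted values.
mse <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

# Coefficient of determination: 1 - SS_residual / SS_total.
r_squared <- function(actual, predicted) {
  ss_res <- sum((actual - predicted)^2)
  ss_tot <- sum((actual - mean(actual))^2)
  1 - ss_res / ss_tot
}

# Toy values for a single metabolite (hypothetical).
actual    <- c(1.0, 2.0, 3.0, 4.0)
predicted <- c(1.1, 1.9, 3.2, 3.8)

mse(actual, predicted)        # 0.025
r_squared(actual, predicted)  # 0.98
```

A lower MSE and an r2 closer to 1 both indicate that the predictions track the measured values more closely.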
# Extract the coefficients of the LASSO model
# `df` is the tidied data frame the model is fitted on
coef_list <- LASSO_coef(df, m_id_sort)
You can also plot predicted versus actual values:
# Build one plot per metabolite from the training and test sets
plot_list <- plot_actual_vs_fit(t_df_tumor_train, t_df_tumor_test, m_id_sort)
pdf("plot_list.pdf") # save all plots in one PDF
for (i in seq_along(m_id_sort)) {
  print(plot_list[[i]])
}
dev.off()
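To illustrate what a predicted-versus-actual plot conveys, here is a minimal base-graphics sketch with toy values. It does not reproduce the output of `plot_actual_vs_fit`, whose plotting details are not described here; the vectors and the output file name are hypothetical.

```r
# Toy values for a single metabolite (hypothetical).
actual    <- c(1.0, 2.0, 3.0, 4.0)
predicted <- c(1.1, 1.9, 3.2, 3.8)

# Write the plot to a temporary PDF file.
out_file <- file.path(tempdir(), "actual_vs_fit_sketch.pdf")
pdf(out_file)
plot(actual, predicted,
     xlab = "Actual", ylab = "Predicted",
     main = "Toy metabolite (hypothetical)")
abline(a = 0, b = 1, lty = 2)  # dashed identity line: perfect predictions fall on it
invisible(dev.off())
```

Points close to the dashed line indicate predictions that agree with the measured values.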