knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(HLSGUtils)
The wgs_adno_partitioner function creates data partitions from ADNI datasets.
To run a mixed-effects model on ADNI data, we use three data sources, passed through the wgs_path, pca_path, and adni_path arguments.
We also specify the variables to use from each data source through function arguments:
- clinical_variables: variables in the ADNI dataset
- pca_components: columns holding the PCA components
After setting the input data paths, we set the number of partitions and the directory where the partitions are saved.
wgs_adno_partitioner(
  partition_number = 300,
  clinical_variables = c("MMSE", "GENDER", "AGE", "MMSE.bl", "PTEDUCAT", "PTID", "VISID"),
  pca_components = c("PC1", "PC2", "PC3"),
  wgs_path = "/data/WGScompletedQC_phenoed.raw",
  pca_path = "/data/ADNIprunedpostQCPCA.eigenvec",
  adni_path = "/data/adnimerge.RData",
  partitions_save_path = "/data/ADNI/data_partitions/"
)
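How wgs_adno_partitioner divides the data internally is not described here. As a rough illustration only, assuming the partitions are formed by splitting the genotype columns into roughly equal groups, the idea can be sketched in base R. The helper split_into_partitions and the column names are hypothetical and are not part of HLSGUtils.

```r
# Hypothetical helper (not part of HLSGUtils): split a vector of column
# names into n roughly equal, contiguous groups. Assumes n <= length(columns).
split_into_partitions <- function(columns, n) {
  # Assign each column an integer group id between 1 and n, in order.
  group_id <- ceiling(seq_along(columns) * n / length(columns))
  unname(split(columns, group_id))
}

snp_columns <- paste0("snp_", 1:10)  # made-up column names
partitions <- split_into_partitions(snp_columns, 3)
lengths(partitions)
```

Each element of partitions could then be combined with the clinical and PCA columns and written to its own file.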
We need an R script that runs on each core to execute the LMER modelling in parallel.
The function_to_Rscript function generates an R script from a function in a package or from a local source file. The script is designed to run independently on each core.
To create the script, we need to set:
- script_name: the name of the created script.
- function_name: the name of a function in a package or the path to a function's source file.
- packages: the packages that are loaded in the script.
- arguments: the function's input arguments.
- arguments_class: a vector of argument types (character, integer, numeric).

function_to_Rscript(
  script_name = "/scripts/Parallel_Modeling.R",
  function_from_package = "lmer_modeling",
  packages = c("HLSGUtils"),
  arguments = c("data_path", "simulation_name", "formula", "save_model_path"),
  arguments_class = c("character", "character", "character", "character")
)
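The contents of the generated script are not shown here. As a minimal sketch, assuming the script receives its inputs as command-line strings and converts them according to arguments_class, the conversion step might look like the following. The helper cast_arguments and the n_iter and tolerance arguments are hypothetical and serve only to illustrate the non-character types.

```r
# Hypothetical sketch of argument handling inside a generated script:
# convert raw command-line strings to their declared classes.
cast_arguments <- function(values, arg_names, arg_classes) {
  stopifnot(length(values) == length(arg_names),
            length(arg_names) == length(arg_classes))
  casted <- Map(function(value, cls) {
    switch(cls,
      character = as.character(value),
      integer   = as.integer(value),
      numeric   = as.numeric(value),
      stop("unsupported class: ", cls)
    )
  }, values, arg_classes)
  setNames(casted, arg_names)
}

# In a real script the values would come from commandArgs(trailingOnly = TRUE).
raw_values <- c("/data/part_1.rds", "full_model", "3", "0.5")
script_args <- cast_arguments(
  raw_values,
  arg_names   = c("data_path", "simulation_name", "n_iter", "tolerance"),
  arg_classes = c("character", "character", "integer", "numeric")
)
str(script_args)
```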
The lmer_modeling function runs an LMER model on each data partition. It needs:
- data_path: the path to the partition data.
- formula: an lmer formula that contains a random-effect term.
- simulation_name: used in the file name of the saved model.
- save_model_path: the directory where the model output is saved.

Finally, we use parallel_rscripts to run the script on parallel cores. The main arguments of the function are:
- rscript_path: path to the script that is run concurrently.
- args: the script's input arguments.
- used_memory_treshold: upper bound on the memory usage percentage.
- used_cpu_treshold: upper bound on the CPU usage percentage.

library(HLSGUtils)

# `lmer_modeling` input arguments
partitions_files <- list.files("/data/data_partitions/", full.names = TRUE)
save_model_path <- "/data/models/"
# The formula is wrapped in single quotes so it is passed to the script as one argument
formula <- paste0("'", "MMSE~GENDER+AGE+MMSE.bl+PC1+PC2+PC3+PTEDUCAT+copy_number+(1|PTID)+(1|VISID)", "'")
simulation_name <- "full_model"

parallel_rscripts(
  rscript_path = "/scripts/Parallel_Modeling.R",
  args = list(
    data_path = partitions_files,
    simulation_name = simulation_name,
    formula = formula,
    save_model_path = save_model_path
  ),
  used_memory_treshold = 80,
  used_cpu_treshold = 80,
  sleep_time = 10
)
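The formula above is wrapped in an extra pair of single quotes, presumably because characters such as ( and | are shell metacharacters and would otherwise break the command line. How parallel_rscripts builds its commands is not documented here. The sketch below only illustrates the idea of producing one Rscript call per partition file; the helper build_commands and the partition file names are hypothetical.

```r
# Hypothetical sketch (not the HLSGUtils implementation): build one
# Rscript command line per partition file.
build_commands <- function(rscript_path, args) {
  # paste() recycles the length-one arguments against the vector of files.
  do.call(paste, c(list("Rscript", rscript_path), unname(args)))
}

partition_files <- c("/data/data_partitions/part_1.rds",
                     "/data/data_partitions/part_2.rds")  # made-up file names
quoted_formula <- "'MMSE~AGE+(1|PTID)'"

commands <- build_commands(
  "/scripts/Parallel_Modeling.R",
  list(data_path = partition_files,
       simulation_name = "full_model",
       formula = quoted_formula,
       save_model_path = "/data/models/")
)
print(commands)
```

Each resulting string could be launched with system(), subject to whatever memory and CPU checks the scheduler applies.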
To aggregate the model coefficients, use the aggregate_coefficients function.
aggregate_coefficients(
  save_model_directory = "/data/models/",
  model_names_pattern = "full_model",
  save_model_path = "/data/aggregated_models/full_model.rds"
)
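The format of the saved model output is not specified here. As a rough sketch, assuming each partition's result is an .rds file holding a data frame of coefficients, aggregation could amount to reading every file that matches the name pattern and stacking the tables. The helper combine_coefficients and the demonstration tables are hypothetical.

```r
# Hypothetical sketch (not the HLSGUtils implementation): stack the
# per-partition coefficient tables into one data frame.
combine_coefficients <- function(directory, pattern) {
  files <- list.files(directory, pattern = pattern, full.names = TRUE)
  tables <- lapply(files, readRDS)
  do.call(rbind, tables)
}

# Demonstration with two made-up coefficient tables in a temporary directory.
demo_dir <- file.path(tempdir(), "models_demo")
dir.create(demo_dir, showWarnings = FALSE)
saveRDS(data.frame(term = "AGE", estimate = -0.10),
        file.path(demo_dir, "full_model_1.rds"))
saveRDS(data.frame(term = "AGE", estimate = -0.12),
        file.path(demo_dir, "full_model_2.rds"))

combined <- combine_coefficients(demo_dir, "full_model")
print(combined)
```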