knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Configuring a workstation to run mldash requires installing three major components: R, Python, and Java. mldash runs tidymodels natively in R, but it can also run Python (e.g. Prophet) and Java (e.g. Weka) models, so this guide outlines the steps to build a working environment.
Depending on your operating system, you can use various system tools or package managers to ease the installation process.
At the end of this guide, you should have a workstation that meets the following goals:
Note: these directions have been tested on macOS 12.6 (Monterey, on an M1 MacBook with an ARM processor) and Red Hat Enterprise Linux (RHEL) 9. They have not been tested on other platforms.
You can download and install the R binary from the CRAN (Comprehensive R Archive Network) home page. Next, you'll need to install RStudio as your integrated development environment.
In an RStudio session, you can install required libraries in the console window:
install.packages("glmnet")
install.packages("brulee")
install.packages("fastDummies")
install.packages("kknn")
install.packages("plsmod")
install.packages("remotes")
install.packages("baguette")
install.packages("libcoin")
install.packages("earth")
install.packages("dbarts")
install.packages("xgboost")
install.packages("forecast")
install.packages("modeltime")
To install the mixOmics library, you first need to install the BiocManager package.
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("mixOmics")
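If you re-run this setup often, it may be convenient to install only the packages that are not already present. The following is a minimal sketch of that idea, not part of mldash itself: `missing_packages()` is a hypothetical helper name, and the package vector simply repeats the CRAN packages listed above.

```r
# CRAN packages named in the install commands above.
required <- c("glmnet", "brulee", "fastDummies", "kknn", "plsmod", "remotes",
              "baguette", "libcoin", "earth", "dbarts", "xgboost",
              "forecast", "modeltime")

# Hypothetical helper: which of `pkgs` are absent from the local library?
missing_packages <- function(pkgs,
                             installed = rownames(utils::installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(required)

# Only install from an interactive session so the sketch is safe to source.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

The `installed` argument is exposed only so the helper can be tried against a made-up library listing without touching your real one.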
Installing R dependencies is very similar:
install.packages(c("baguette", "libcoin", "earth", "dbarts", "xgboost",
                   "forecast", "modeltime", "glmnet", "brulee",
                   "fastDummies", "kknn", "plsmod"))
install.packages(c("BiocManager"), type = "binary")
install.packages(c("poissonreg", "pscl", "ranger", "kernlab", "mda",
                   "discrim", "sda", "sparsediscrim", "klaR", "LiblineaR",
                   "naivebayes", "rules", "MASS", "baguette"),
                 type = "binary")
install.packages(c("dbarts", "discrim", "earth", "fastDummies", "glmnet",
                   "keras", "kernlab", "kknn", "klaR", "mda", "mgcv",
                   "mixOmics", "naivebayes", "nnet", "parsnip", "plsmod",
                   "poissonreg", "randomForest", "ranger", "rpart",
                   "rstanarm", "rules", "sda", "sparsediscrim", "stats",
                   "xgboost", "xrf"),
                 type = "binary")
Many of the models require Python, which is executed using the `reticulate` package. I have personally found installing and configuring Python to be frustrating, especially on a Mac M1. However, as of this writing, the following works on my system. First, install these packages from GitHub to ensure you have the latest versions.
remotes::install_github(sprintf("rstudio/%s", c("reticulate", "tensorflow", "keras", "torch")))
If you have previously installed Miniconda, it is helpful to start from a clean slate.
reticulate::miniconda_uninstall()
We can then install Miniconda using the following command:
reticulate::install_miniconda()
Once installed, we can create a conda environment:
reticulate::conda_create("mldash")
And then make it active (not sure if it is necessary to do this for all three packages, but it doesn't hurt):
reticulate::use_condaenv("mldash")
tensorflow::use_condaenv("mldash")
keras::use_condaenv("mldash")
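Before going further, it can be worth confirming that the `mldash` environment is actually visible to `reticulate`. The sketch below assumes `reticulate` is installed and relies on `reticulate::conda_list()`, which returns a data frame with a `name` column; `conda_env_exists()` is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: is `name` among the conda environments in `envs`?
# `envs` is expected to be a data frame with a `name` column.
conda_env_exists <- function(name, envs) {
  is.data.frame(envs) && name %in% envs$name
}

# Guarded so that sourcing this file never fails when conda is absent.
if (interactive() && requireNamespace("reticulate", quietly = TRUE)) {
  envs <- tryCatch(reticulate::conda_list(), error = function(e) NULL)
  if (conda_env_exists("mldash", envs)) {
    message("Found the mldash conda environment.")
  } else {
    message("No mldash environment found; try reticulate::conda_create().")
  }
}
```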
Although there are utility functions to install `keras`, `tensorflow`, and `torch` from their respective packages, I found them to not always work as expected. The `conda_install` function will ensure the Python packages are installed into the correct environment. Note that as of this writing, `pytorch` still does not have a Mac M1 native version, so some predictive models will not work on that platform.
reticulate::conda_install("mldash", c("jupyterlab", "pandas", "statsmodels", "scipy", "scikit-learn", "matplotlib", "seaborn", "numpy", "pytorch", "tensorflow"))
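One way to check the result is to ask Python whether each module can be imported. Keep in mind that a few conda package names differ from their import names (`scikit-learn` imports as `sklearn`, `pytorch` as `torch`). This is only a rough sketch: `missing_modules()` is a hypothetical helper, and the check itself uses `reticulate::py_module_available()`.

```r
# Conda package name -> Python import name for the packages installed above.
import_names <- c(
  jupyterlab     = "jupyterlab",
  pandas         = "pandas",
  statsmodels    = "statsmodels",
  scipy          = "scipy",
  `scikit-learn` = "sklearn",
  matplotlib     = "matplotlib",
  seaborn        = "seaborn",
  numpy          = "numpy",
  pytorch        = "torch",
  tensorflow     = "tensorflow"
)

# Hypothetical helper: return the package names whose module fails the check.
# `is_available` is any function taking a module name and returning TRUE/FALSE.
missing_modules <- function(modules, is_available) {
  ok <- vapply(modules, is_available, logical(1))
  names(modules)[!ok]
}

# Only query Python from an interactive session with reticulate installed.
if (interactive() && requireNamespace("reticulate", quietly = TRUE)) {
  print(missing_modules(import_names, reticulate::py_module_available))
}
```

An empty result suggests every module is importable. On an M1 Mac, `pytorch` may well show up as missing, for the reason noted above.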
Lastly, ensure that `reticulate` uses the correct Python by setting the `RETICULATE_PYTHON` environment variable (this can also be put in your `.Renviron` file to be used across sessions, though I avoid doing that so I can use different Python paths for different projects).
Sys.setenv("RETICULATE_PYTHON" = "~/miniforge3/envs/mldash/bin/python")
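Because a mistyped path tends to fail later and less clearly, it may help to confirm the binary exists before setting the variable. The following is a small sketch under that assumption; `set_reticulate_python()` is a hypothetical wrapper, not a `reticulate` function, and it expands `~` with `path.expand()` before checking.

```r
# Hypothetical wrapper: set RETICULATE_PYTHON only if the file really exists.
# Returns TRUE (invisibly) on success and FALSE with a warning otherwise.
set_reticulate_python <- function(path) {
  full <- path.expand(path)
  if (!file.exists(full)) {
    warning("Python binary not found at: ", full)
    return(invisible(FALSE))
  }
  Sys.setenv(RETICULATE_PYTHON = full)
  invisible(TRUE)
}
```

For example, `set_reticulate_python("~/miniforge3/envs/mldash/bin/python")` would set the variable only when that file is present on your machine.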
On both macOS and RHEL, Python 3.x is pre-installed (as either python or python3), but you will need many additional Python modules. You can get the Mambaforge installer from [Mambaforge's GitHub page](https://github.com/conda-forge/miniforge#mambaforge).
Now you can use it to install the many model library dependencies.
```{sh mac_mamba, eval=FALSE}
mamba activate mldash
mamba install -y jupyterlab
mamba install -y pandas
mamba install -y statsmodels
mamba install -y scipy
mamba install -y scikit-learn
mamba install -y matplotlib
mamba install -y seaborn
mamba install -y numpy
mamba install -y pytorch
mamba install -y tensorflow
```

As in the previous method, set an environment variable for your Mamba-Python binary:

```r
Sys.setenv("RETICULATE_PYTHON" = "~/mambaforge/envs/mldash/bin/python")
```
You will need to install a Java version 8 development kit compiled for an ARM processor. One that I found to work is from Azul.
Follow the install directions and explicitly set an environment variable for JAVA_HOME, otherwise R will try to use the system's default pre-installed Java virtual machine.
Sys.setenv(JAVA_HOME='/Library/Java/JavaVirtualMachines/zulu-8.jdk/Contents/Home/jre/')
On Linux, you can use the regular yum package manager to install OpenJDK version 8 from the command line.
$ sudo yum install -y java-1.8.0-openjdk
Unlike with macOS, you shouldn't need to set a JAVA_HOME since Java is normally part of the system $PATH variable and normally linked to the correct version. Of course, if you have multiple Java versions, then go ahead and specify JAVA_HOME for safety.
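To see which Java R is likely to pick up on either platform, you can look at `JAVA_HOME` first and fall back to the `PATH`. This is only an illustrative sketch for macOS and Linux: `find_java()` is a hypothetical helper, and the lookup order is an assumption rather than a description of how any particular R package resolves Java.

```r
# Hypothetical helper: prefer JAVA_HOME/bin/java, else whatever is on the PATH.
# Returns "" when JAVA_HOME is unset and no java is found on the PATH.
find_java <- function(java_home = Sys.getenv("JAVA_HOME")) {
  if (nzchar(java_home)) {
    file.path(java_home, "bin", "java")
  } else {
    unname(Sys.which("java"))
  }
}

java <- find_java()
if (nzchar(java) && file.exists(java)) {
  # `java -version` writes to stderr, so capture both output streams.
  print(system2(java, "-version", stdout = TRUE, stderr = TRUE))
} else {
  message("No Java executable found; check JAVA_HOME or your PATH.")
}
```

If the reported version is not 1.8, setting `JAVA_HOME` explicitly, as described above, is probably the simplest fix.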
Now that you have R, Python, and Java installed, you can start Running Predictive Models.