inst/code_paper/README-reproduce.md

Reproduction guidelines

Below we describe how to reproduce the code and figure examples in the paper "shapr: Explaining Machine Learning Models with Conditional Shapley Values in R and Python".

The instructions assume a Linux-like environment and that the commands are executed from the directory containing this file. Execution from other locations or on different operating systems requires adjustments.

The code has been tested on Ubuntu 20.04.6 LTS with R 4.4.1 and Python 3.12.7 installed.
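To compare a local setup with the tested environment, a small check like the one below can help. It is only a sketch: the helper name `report_version` is hypothetical, and it assumes that `Rscript` and `python3` are the executable names on your system.

```shell
# Print the first line of a tool's version output, or note that it is missing.
# report_version is a hypothetical helper, not part of shapr or shaprpy.
report_version() {
  # $1 is the executable name, e.g. Rscript or python3
  if command -v "$1" >/dev/null 2>&1; then
    # Some tools print their version to stderr, so merge both streams.
    "$1" --version 2>&1 | head -n 1
  else
    echo "$1: not found on PATH"
  fi
}

report_version Rscript
report_version python3
```

Versions that differ from those listed above may still work, but they have not been tested.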

Folder content

The folder contains the following files:

Code files

Output files and folders

R

To reproduce the R examples and figures, make sure you have installed the shapr package and its required packages, as well as the following packages from CRAN: xgboost, ctree, future, progressr and patchwork.

Then, from the command line, run

Rscript -e "knitr::spin('code_R.R')"

This will generate the file code_R.html, which contains the code from code_R.R together with its output, and write the figures to the paper_figures and html_figures folders.

Note 1: The HTML file shows only the code displayed in the paper, together with its output. Additional code used to mildly customize and save the figures is included in the code_R.R file and executed by knitr::spin(), but it is not shown in the HTML file.

Note 2: The R_prep_data_and_model.R script generates the data and model files used by code_R.R. This has already been done, and the resulting files are included in the data_and_models folder to ensure reproducibility across a broader range of environments. It is therefore not necessary to run this script; it is included for complete reproducibility.
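Once knitr::spin() has finished, a quick check such as the following can confirm that the expected outputs exist in the current directory. This is a minimal sketch; `check_outputs` is a hypothetical helper and not part of the repository.

```shell
# Report which of the given files or folders exist.
# check_outputs is a hypothetical helper, not part of the shapr repository.
check_outputs() {
  any_missing=0
  for path in "$@"; do
    if [ -e "$path" ]; then
      echo "found:   $path"
    else
      echo "missing: $path"
      any_missing=1
    fi
  done
  # Return non-zero if at least one path was missing.
  return "$any_missing"
}

check_outputs code_R.html paper_figures html_figures \
  || echo "Some outputs are missing; inspect the knitr::spin() output for errors."
```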

Python

To reproduce the Python examples, make sure you have installed the shaprpy Python library and its required packages (in addition to the shapr R package).

To simplify reproducibility, we have created a simple bash script that executes the Python code in a manner similar to how the knitr::spin() function operates for the R code. The bash script requires the jupytext, nbconvert and session_info libraries to run. They can be installed with pip as follows:

pip install jupytext nbconvert session_info

Then, from the command line, run

bash code_py_to_html.sh

This will generate the file code_py.html containing the code from code_py.py accompanied by its output and basic session information.
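The contents of code_py_to_html.sh are not reproduced in this file. For orientation only, a script-to-HTML pipeline built on the standard jupytext and nbconvert command-line tools could look like the sketch below. The function names `run` and `py_to_html` and the `DRY_RUN` variable are assumptions made for illustration, and the actual script may work differently.

```shell
# Hypothetical sketch of a script-to-HTML pipeline; NOT the actual code_py_to_html.sh.

# Run a command, or only print it when DRY_RUN is set (useful for inspection).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

py_to_html() {
  # $1: path to a Python script, e.g. code_py.py
  script="$1"
  notebook="${script%.py}.ipynb"
  # Convert the Python script to a Jupyter notebook next to the script.
  run jupytext --to notebook "$script"
  # Execute the notebook and render it, with its output, as HTML.
  run jupyter nbconvert --to html --execute "$notebook"
}

# Example (prints the two commands without running them):
#   DRY_RUN=1; py_to_html code_py.py
```

Setting `DRY_RUN` shows which commands would be executed without requiring jupytext or nbconvert to be installed.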



NorskRegnesentral/shapr documentation built on June 15, 2025, 6:18 a.m.