```r
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
options(pkgdown.max_print = Inf, width = 1000)
library(cramR)
library(data.table)
library(glmnet)
library(caret)
```
The Cram package provides a unified framework for:

- **Cram Policy** (`cram_policy()`): Learn and evaluate individualized binary treatment rules using Cram. Supports flexible models, including causal forests and custom learners. Common examples include deciding whether to treat a patient, send a discount offer, or provide financial aid based on estimated benefit.
- **Cram ML** (`cram_ml()`): Learn and evaluate standard machine learning models using Cram. It estimates the expected loss at the population level, giving you a reliable measure of how well the final model is likely to generalize to new data. Supports flexible training via caret or custom learners, and allows evaluation with user-defined loss metrics. Ideal for classification, regression, and other predictive tasks.
- **Cram Bandit** (`cram_bandit()`): Perform on-policy evaluation of contextual bandit algorithms using Cram. Supports both real data and simulation environments with built-in policies.
This vignette walks through these three core modules.
For reproducible use cases, see the example script provided in the Cram GitHub repository.
## `cram_policy()`: Binary Policy Learning & Evaluation

```r
generate_data <- function(n) {
  X <- data.table(
    binary = rbinom(n, 1, 0.5),
    discrete = sample(1:5, n, replace = TRUE),
    continuous = rnorm(n)
  )
  D <- rbinom(n, 1, 0.5)
  treatment_effect <- ifelse(X$binary == 1 & X$discrete <= 2, 1,
                             ifelse(X$binary == 0 & X$discrete >= 4, -1, 0.1))
  Y <- D * (treatment_effect + rnorm(n)) + (1 - D) * rnorm(n)
  list(X = X, D = D, Y = Y)
}

set.seed(123)
data <- generate_data(1000)
X <- data$X; D <- data$D; Y <- data$Y
```
### `cram_policy()` with causal forest

```r
res <- cram_policy(
  X, D, Y,
  batch = 20,
  model_type = "causal_forest",
  learner_type = NULL,
  baseline_policy = as.list(rep(0, nrow(X))),
  alpha = 0.05
)
print(res)
```
To use `caret`, choose a classification method that outputs probabilities, i.e. set `classProbs = TRUE` in `trainControl()`. See the following example with a random forest classifier:

```r
model_params <- list(
  formula = Y ~ .,
  caret_params = list(
    method = "rf",
    trControl = trainControl(method = "none", classProbs = TRUE)
  )
)
```

Also note that all data inputs need to be numeric: if `Y` is categorical, it should contain numeric values representing the class of each observation. There is no need to use type `factor` with `cram_policy()`.
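To show where `model_params` fits into a full call, here is a minimal, hypothetical sketch. The binary recoding of `Y` and the `model_type = "s_learner"` value are illustration-only assumptions (check `?cram_policy` for the model types and learner types your version supports); the remaining arguments mirror the causal forest call above.

```r
# Hypothetical sketch (not from this vignette): plugging the caret-based
# model_params into cram_policy(). A numeric 0/1 outcome is used because the
# caret learner above is a classifier. The model_type value is an assumption;
# see ?cram_policy for the options supported by your installed version.
Y_bin <- as.numeric(Y > 0)  # illustrative binary recoding of the simulated outcome

res_caret <- cram_policy(
  X, D, Y_bin,
  batch = 20,
  model_type = "s_learner",      # assumed value
  learner_type = "caret",
  model_params = model_params,   # defined above
  baseline_policy = as.list(rep(0, nrow(X))),
  alpha = 0.05
)
print(res_caret)
```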
### `cram_policy()` with custom learners

Set `model_params` to `NULL` and specify `custom_fit` and `custom_predict`.

```r
custom_fit <- function(X, Y, D, n_folds = 5) {
  treated <- which(D == 1); control <- which(D == 0)
  # Ridge regressions fit separately on treated and control units
  m1 <- cv.glmnet(as.matrix(X[treated, ]), Y[treated], alpha = 0, nfolds = n_folds)
  m0 <- cv.glmnet(as.matrix(X[control, ]), Y[control], alpha = 0, nfolds = n_folds)
  # Imputed individual treatment effects
  tau1 <- predict(m1, as.matrix(X[control, ]), s = "lambda.min") - Y[control]
  tau0 <- Y[treated] - predict(m0, as.matrix(X[treated, ]), s = "lambda.min")
  tau <- c(tau0, tau1); X_all <- rbind(X[treated, ], X[control, ])
  # Final model regresses the imputed effects on the covariates
  final_model <- cv.glmnet(as.matrix(X_all), tau, alpha = 0)
  final_model
}

custom_predict <- function(model, X, D) {
  as.numeric(predict(model, as.matrix(X), s = "lambda.min") > 0)
}

res <- cram_policy(
  X, D, Y,
  batch = 20,
  model_type = NULL,
  custom_fit = custom_fit,
  custom_predict = custom_predict
)
print(res)
```
## `cram_ml()`: ML Learning & Evaluation

Specify `formula` and `caret_params` as you would for the popular `caret::train()`, and set an individual-level loss under `loss_name`.

```r
set.seed(42)
data_df <- data.frame(
  x1 = rnorm(100),
  x2 = rnorm(100),
  x3 = rnorm(100),
  Y = rnorm(100)
)

caret_params <- list(
  method = "lm",
  trControl = trainControl(method = "none")
)

res <- cram_ml(
  data = data_df,
  formula = Y ~ .,
  batch = 5,
  loss_name = "se",
  caret_params = caret_params
)
print(res)
```
All data inputs to `cram_ml()` need to be numeric: if `Y` is categorical, it should contain numeric values representing the class of each observation. There is no need to use type `factor` with `cram_ml()`.
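For instance, if your outcome currently lives in a factor column, a minimal sketch of recoding it to numeric class labels before calling `cram_ml()` could look like this (the column name `Y` and the levels are purely illustrative):

```r
# Recode a factor outcome to numeric class codes 0, 1, 2, ... (illustrative)
df <- data.frame(x1 = rnorm(10),
                 Y = factor(sample(c("no", "yes"), 10, replace = TRUE)))
df$Y <- as.integer(df$Y) - 1L  # "no" -> 0, "yes" -> 1 (alphabetical level order)
str(df$Y)
```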
In this case, the model outputs hard predictions (labels, e.g. 0, 1, 2), and the metric used is classification accuracy, i.e. the proportion of correctly predicted labels. Set:

- `loss_name = "accuracy"`
- `classProbs = FALSE` (the default) in `trainControl()`
- `classify = TRUE` in `cram_ml()`
```r
set.seed(42)

# Generate binary classification dataset
X_data <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
Y_data <- rbinom(nrow(X_data), 1, 0.5)
data_df <- data.frame(X_data, Y = Y_data)

# Define caret parameters: predict labels (default behavior)
caret_params_rf <- list(
  method = "rf",
  trControl = trainControl(method = "none")
)

# Run CRAM ML with accuracy as loss
result <- cram_ml(
  data = data_df,
  formula = Y ~ .,
  batch = 5,
  loss_name = "accuracy",
  caret_params = caret_params_rf,
  classify = TRUE
)
print(result)
```
In this setup, the model outputs class probabilities, and the loss is evaluated using logarithmic loss (`logloss`), a standard metric for probabilistic classification. Set:

- `loss_name = "logloss"`
- `classProbs = TRUE` in `trainControl()`
- `classify = TRUE` in `cram_ml()`
```r
set.seed(42)

# Generate binary classification dataset
X_data <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
Y_data <- rbinom(nrow(X_data), 1, 0.5)
data_df <- data.frame(X_data, Y = Y_data)

# Define caret parameters for probability output
caret_params_rf_probs <- list(
  method = "rf",
  trControl = trainControl(method = "none", classProbs = TRUE)
)

# Run CRAM ML with logloss as the evaluation loss
result <- cram_ml(
  data = data_df,
  formula = Y ~ .,
  batch = 5,
  loss_name = "logloss",
  caret_params = caret_params_rf_probs,
  classify = TRUE
)
print(result)
```
In addition to using built-in learners via `caret`, `cram_ml()` also supports fully custom model workflows. You can specify your own:

- model training function (`custom_fit`)
- prediction function (`custom_predict`)
- loss function (`custom_loss`)

See the vignette "Cram ML" for more details.
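For orientation, below is a minimal, hypothetical sketch of such a workflow. The argument names come from the list above, but the signatures assumed here (`custom_fit(data)`, `custom_predict(model, data)`, `custom_loss(predictions, data)`) and the omission of `formula`/`loss_name` are assumptions; verify them against the "Cram ML" vignette before use.

```r
# Hypothetical sketch of a fully custom cram_ml() workflow; the signatures
# assumed below should be checked against the "Cram ML" vignette.
custom_fit <- function(data) {
  lm(Y ~ ., data = data)              # fit a linear model on the provided data
}
custom_predict <- function(model, data) {
  predict(model, newdata = data)      # return one prediction per row
}
custom_loss <- function(predictions, data) {
  (predictions - data$Y)^2            # individual squared-error losses
}

res_custom <- cram_ml(
  data = data_df,                     # data frame defined earlier
  batch = 5,
  custom_fit = custom_fit,
  custom_predict = custom_predict,
  custom_loss = custom_loss
)
print(res_custom)
```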
## `cram_bandit()`: Contextual Bandits for On-policy Statistical Evaluation

Specify:

- `pi`: An array of shape (T × B, T, K) or (T × B, T), where:
  - $T$ is the number of learning steps (or policy updates)
  - $B$ is the batch size
  - $K$ is the number of arms
  - $T \times B$ is the total number of contexts

  In the natural 3D version, `pi[j, t, a]` gives the probability that the policy $\hat{\pi}_t$ assigns arm `a` to context $X_j$. In the 2D version, we keep only the probabilities assigned to the chosen arm $A_j$ for each context $X_j$ in the historical data, not the probabilities for all arms under each context (see the sketch after the example below).
- `arm`: A vector of length $T \times B$ indicating which arm was selected in each context.
- `reward`: A vector of observed rewards of length $T \times B$.
- `batch`: (optional) Integer batch size $B$. Default is 1.
- `alpha`: Significance level for confidence intervals.
```r
set.seed(42)
T <- 100; K <- 4

# Policy probabilities: pi[j, t, ] is the probability vector over the K arms
# assigned by policy update t to context j (normalized to sum to 1)
pi <- array(runif(T * T * K, 0.1, 1), dim = c(T, T, K))
for (t in 1:T) for (j in 1:T) pi[j, t, ] <- pi[j, t, ] / sum(pi[j, t, ])

arm <- sample(1:K, T, replace = TRUE)
reward <- rnorm(T, 1, 0.5)

res <- cram_bandit(pi, arm, reward, batch = 1, alpha = 0.05)
print(res)
```
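If you start from the full 3D array, the 2D input form described above can be derived by keeping, for each context, only the probability of the arm that was actually chosen. A minimal sketch, assuming `cram_bandit()` accepts the 2D matrix form as documented:

```r
# Build the 2D version of pi: for each context j and policy update t, keep
# only the probability assigned to the arm that was actually chosen (arm[j]).
pi_2d <- matrix(NA_real_, nrow = dim(pi)[1], ncol = dim(pi)[2])
for (j in seq_len(dim(pi)[1])) {
  pi_2d[j, ] <- pi[j, , arm[j]]
}

res_2d <- cram_bandit(pi_2d, arm, reward, batch = 1, alpha = 0.05)
print(res_2d)
```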
- `cram_policy()`: Learn and evaluate a binary policy.
- `cram_ml()`: Learn and evaluate ML models.
- `cram_bandit()`: Cramming contextual bandits for on-policy statistical evaluation.

```r
# Remove any autograph-generated temporary Python files left in tempdir()
autograph_files <- list.files(tempdir(), pattern = "^__autograph_generated_file.*\\.py$", full.names = TRUE)
if (length(autograph_files) > 0) {
  try(unlink(autograph_files, recursive = TRUE, force = TRUE), silent = TRUE)
}
```