This vignette provides a comprehensive guide to using kerasnip to define sequential Keras models within the tidymodels ecosystem. kerasnip bridges the gap between the imperative, layer-by-layer construction of Keras models and the declarative, specification-based approach of tidymodels.
Here, we will focus on create_keras_sequential_spec(), which is ideal for models where layers form a plain stack, with each layer having exactly one input tensor and one output tensor.
```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = reticulate::py_module_available("keras")
)

# Suppress verbose Keras output for the vignette
options(keras.fit_verbose = 0)
set.seed(123)
```

We'll start by loading the necessary packages:

```r
library(kerasnip)
library(tidymodels)
library(keras3)
```
## create_keras_sequential_spec()

A Sequential model in Keras is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor. kerasnip's create_keras_sequential_spec() function is designed to define such models in a tidymodels-compatible way.
Instead of building the model layer-by-layer imperatively, you define a named, ordered list of R functions called layer_blocks. Each layer_block function takes a Keras model object as its first argument and returns the modified model. kerasnip then uses these blocks to construct the full Keras Sequential model.
For models with more complex, non-linear topologies (e.g., multiple inputs/outputs, residual connections, or multi-branch models), you should use create_keras_functional_spec().
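To make the distinction concrete, the snippet below shows a tiny residual model written with the plain keras3 functional API (this is ordinary Keras code, not a kerasnip spec). The layer_add() step receives two input tensors, which is exactly the kind of topology a plain sequential stack, and therefore create_keras_sequential_spec(), cannot express.

```r
# Plain keras3 functional API (not a kerasnip spec): a tiny residual block.
# layer_add() combines two tensors, so this model is not a plain stack.
inputs <- keras_input(shape = 16)
hidden <- inputs |> layer_dense(units = 16, activation = "relu")
outputs <- layer_add(list(inputs, hidden)) |> layer_dense(units = 1)
residual_model <- keras_model(inputs = inputs, outputs = outputs)
```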
## kerasnip Sequential Model Specification

Let's define a simple sequential model with three dense layers.
First, we define our layer_blocks:
```r
# The first block must initialize the model. `input_shape`
# is passed automatically.
input_block <- function(model, input_shape) {
  keras_model_sequential(input_shape = input_shape)
}

# A reusable block for hidden layers. `units` will become a tunable parameter.
hidden_block <- function(model, units = 32, activation = "relu") {
  model |>
    layer_dense(units = units, activation = activation)
}

# The output block. `num_classes` is passed automatically for classification.
output_block <- function(model, num_classes, activation = "softmax") {
  model |>
    layer_dense(units = num_classes, activation = activation)
}
```
Now, we use create_keras_sequential_spec() to generate our parsnip model specification function. We'll name our model my_simple_mlp.
```r
create_keras_sequential_spec(
  model_name = "my_simple_mlp",
  layer_blocks = list(
    input = input_block,
    hidden_1 = hidden_block,
    hidden_2 = hidden_block,
    output = output_block
  ),
  mode = "classification"
)
```
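This call registers a new specification function, my_simple_mlp(), in your session. As a quick sketch (not part of the original code), we can create an instance of it; the argument names below assume the `<block name>_<argument>` prefixing convention used by the CNN example later in this vignette (e.g. conv1_filters), together with the compile_* arguments shown there.

```r
# A minimal sketch: instantiate the generated spec.
# `hidden_1_units` is assumed to map to the `units` argument of the
# `hidden_1` block, following the <block>_<argument> naming convention.
mlp_spec <- my_simple_mlp(
  hidden_1_units = 64,
  hidden_2_units = 32,
  compile_loss = "categorical_crossentropy",
  compile_optimizer = "adam"
)

mlp_spec
```

Any of these arguments could instead be set to tune() placeholders when the specification is used inside a tuning workflow.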
## compile_keras_grid()

In the original Keras guide, a common workflow is to incrementally add layers and call summary() to inspect the architecture. With kerasnip, the model is defined declaratively, so we can't inspect it layer by layer in the same way.
However, kerasnip provides a powerful equivalent: compile_keras_grid(). This function checks if your layer_blocks define a valid Keras model and returns the compiled model structure, all without running a full training cycle. This is perfect for debugging your architecture.
Let's see this in action with a CNN architecture:
```r
# Define CNN layer blocks
cnn_input_block <- function(model, input_shape) {
  keras_model_sequential(input_shape = input_shape)
}

cnn_conv_block <- function(
  model,
  filters = 32,
  kernel_size = 3,
  activation = "relu"
) {
  model |>
    layer_conv_2d(
      filters = filters,
      kernel_size = kernel_size,
      activation = activation
    )
}

cnn_pool_block <- function(model, pool_size = 2) {
  model |>
    layer_max_pooling_2d(pool_size = pool_size)
}

cnn_flatten_block <- function(model) {
  model |>
    layer_flatten()
}

cnn_output_block <- function(model, num_classes, activation = "softmax") {
  model |>
    layer_dense(units = num_classes, activation = activation)
}

# Create the kerasnip spec function
create_keras_sequential_spec(
  model_name = "my_cnn",
  layer_blocks = list(
    input = cnn_input_block,
    conv1 = cnn_conv_block,
    pool1 = cnn_pool_block,
    flatten = cnn_flatten_block,
    output = cnn_output_block
  ),
  mode = "classification"
)

# Create a spec instance for a 28x28x1 image
cnn_spec <- my_cnn(
  conv1_filters = 32,
  conv1_kernel_size = 5,
  compile_loss = "categorical_crossentropy",
  compile_optimizer = "adam"
)

# Prepare dummy data with the correct shape.
# We create a list of 28x28x1 arrays.
x_dummy_list <- lapply(
  1:10,
  function(i) array(runif(28 * 28 * 1), dim = c(28, 28, 1))
)
x_dummy_df <- tibble::tibble(x = x_dummy_list)

y_dummy <- factor(sample(0:9, 10, replace = TRUE), levels = 0:9)
y_dummy_df <- tibble::tibble(y = y_dummy)

# Use compile_keras_grid to get the model summary
compilation_results <- compile_keras_grid(
  spec = cnn_spec,
  grid = tibble::tibble(),
  x = x_dummy_df,
  y = y_dummy_df
)

# Print the summary
compilation_results |>
  select(compiled_model) |>
  pull() |>
  pluck(1) |>
  summary()
```
```r
compilation_results |>
  select(compiled_model) |>
  pull() |>
  pluck(1) |>
  plot(show_shapes = TRUE)
```
*(Figure: a plot of the compiled model architecture showing the layer shapes.)*
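Because compile_keras_grid() accepts a grid of hyperparameters, it can also be used to validate several candidate architectures at once before committing to a full tuning run. The sketch below (not from the original vignette) reuses cnn_spec and the dummy data; it assumes the returned tibble contains one row per grid candidate, with any compilation problems surfaced alongside the compiled models.

```r
# A minimal sketch: check that several candidate architectures compile.
# Column names follow the <block>_<argument> convention used above.
candidate_grid <- tidyr::crossing(
  conv1_filters = c(16, 32),
  conv1_kernel_size = c(3, 5)
)

grid_results <- compile_keras_grid(
  spec = cnn_spec,
  grid = candidate_grid,
  x = x_dummy_df,
  y = y_dummy_df
)

grid_results
```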