sd2R is an R package that provides a native, GPU-accelerated Stable Diffusion pipeline by wrapping the C++ implementation from stable-diffusion.cpp and using ggmlR as the tensor backend.
sd2R exposes a high-level R interface for text-to-image and image-to-image generation, while all heavy computation (tokenization, text encoding, denoising, sampling, VAE, model loading) is implemented in C++. It supports the SD 1.x, SD 2.x, SDXL, and Flux model families and targets local inference on Linux with Vulkan-enabled AMD GPUs (with automatic CPU fallback via ggml), without relying on external Python processes or web APIs.
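As a quick orientation, a typical text-to-image call might look like the following. This is a hedged sketch: `sd_ctx()`, `sd_generate()`, and `vram_gb` are the entry points and parameter named later in this README, but the `prompt`, `width`, and `height` argument names are illustrative assumptions.

```r
library(sd2R)

# Load a model; vram_gb overrides VRAM auto-detection (see sd_ctx()).
ctx <- sd_ctx("model.safetensors", vram_gb = 16)

# Single entry point for all generation modes; the strategy (direct,
# tiled sampling, or highres fix) is picked from resolution and VRAM.
img <- sd_generate(ctx, prompt = "a cat in space",
                   width = 768, height = 768)
```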
Flux without Python:
R → sd2R → ggmlR → ggml → Vulkan → GPU
- Core C++ pipeline (`src/sd/`): tokenizers, text encoders (CLIP, Mistral, Qwen, UMT5), diffusion UNet/MMDiT denoiser, samplers, VAE encoder/decoder, and model loading for `.safetensors` and `.gguf` weights.
- Links against the ggmlR headers (`LinkingTo`) and `libggml.a`, reusing the same GGML/Vulkan stack that also powers llamaR and other ggmlR-based packages.
- `sd_generate()`: single entry point for all generation modes. Automatically selects the optimal strategy (direct, tiled sampling, or highres fix) based on output resolution and available VRAM (`vram_gb` parameter in `sd_ctx()`). Users don't need to think about tiling at all.
- `verbose = FALSE` by default: no console output unless explicitly enabled.
- Cross-platform build system, with `configure`/`configure.win` generating Makevars from templates.
- Set `vram_gb` in `sd_ctx()` to override VRAM auto-detection.
- `sd_generate_multi_gpu()` distributes prompts across Vulkan GPUs via callr, one process per GPU, with progress reporting.
- `device_layout` parameter in `sd_ctx()` distributes sub-models across multiple Vulkan GPUs within a single process. Presets: `"mono"` (all on one GPU), `"split_encoders"` (CLIP/T5 on GPU 1, diffusion + VAE on GPU 0), `"split_vae"` (CLIP/T5 + VAE on GPU 1, diffusion on GPU 0), `"encoders_cpu"` (text encoders on CPU). Manual override via `diffusion_gpu`, `clip_gpu`, `vae_gpu`.
- Profiling via `sd_profile_start()` / `sd_profile_stop()` / `sd_profile_summary()`. Tracks model loading, text encoding (with CLIP/T5 breakdown), sampling, and VAE decode/encode stages.
- VAE encoding enabled with `vae_decode_only = FALSE` in the context.
- `vae_mode = "auto"` (default) queries free GPU memory before VAE decode and enables tiling only when estimated peak usage exceeds available VRAM (with a 50 MB safety reserve). Falls back to a pixel-area threshold (`vae_auto_threshold`) when the Vulkan memory query is unavailable (CPU backend, no GPU).
- Per-axis relative tile sizing (`vae_tile_rel_x`, `vae_tile_rel_y`) for non-square aspect ratios.
- `sd_system_info()` reports GGML/Vulkan capabilities as detected by ggmlR at build time.
- `sd_pipeline()` + `sd_node()` for composable, sequential multi-step workflows (txt2img → upscale → img2img → save). Pipelines are serializable to JSON via `sd_save_pipeline()` / `sd_load_pipeline()`.

```r
pipe <- sd_pipeline(
  sd_node("txt2img", prompt = "a cat in space", width = 512, height = 512),
  sd_node("upscale", factor = 2),
  sd_node("img2img", strength = 0.3),
  sd_node("save", path = "output.png")
)

# Save / load as JSON
sd_save_pipeline(pipe, "my_pipeline.json")
pipe <- sd_load_pipeline("my_pipeline.json")

# Run
ctx <- sd_ctx("model.safetensors")
sd_run_pipeline(pipe, ctx, upscaler_ctx = upscaler)
```
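The profiling hooks listed above can wrap any generation call. A minimal hedged sketch, reusing a context from `sd_ctx()`; the `prompt` argument name is an illustrative assumption:

```r
library(sd2R)

ctx <- sd_ctx("model.safetensors")

sd_profile_start()
img <- sd_generate(ctx, prompt = "a lighthouse at dusk")
sd_profile_stop()

# Summarizes per-stage timings: model loading, text encoding
# (with CLIP/T5 breakdown), sampling, and VAE decode/encode.
sd_profile_summary()
```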
- `src/sd2R_interface.cpp` defines a thin bridge between R and the C API in `stable-diffusion.h`, returning `XPtr` objects with custom finalizers for correct lifetime management of `sd_ctx_t` and `upscaler_ctx_t`.
- `configure` / `configure.win` generate Makevars from `.in` templates, resolving ggmlR paths, OpenMP, and Vulkan at configure time. A per-target `-include r_ggml_compat.h` is applied only to `sd/*.cpp` sources to avoid macro conflicts with system headers.
- `DESCRIPTION` declares Rcpp and ggmlR in `LinkingTo`, and `NAMESPACE` is generated via roxygen2 with `useDynLib` and Rcpp imports.
- `.onLoad()` initializes logging and registers constant values that mirror the underlying C++ enums using 0-based indices.
- Quiet by default (`verbose = FALSE`), with known third-party compiler warnings suppressed (`-Winconsistent-missing-override`, deprecated `codecvt`).

Installation:

```r
# Install ggmlR first (if not already installed)
remotes::install_github("Zabis13/ggmlR")

# Install sd2R
remotes::install_github("Zabis13/sd2R")
```
During installation, the configure script automatically downloads tokenizer vocabulary files (~128 MB total) from GitHub Releases. This requires curl or wget.
If you don't have internet access during installation, download the vocabulary files manually and place them into src/sd/ before building:
```sh
# Download from https://github.com/Zabis13/sd2R/releases/tag/assets
# Files: vocab.hpp, vocab_mistral.hpp, vocab_qwen.hpp, vocab_umt5.hpp
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_mistral.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_qwen.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_umt5.hpp -P src/sd/

R CMD INSTALL .
```
Requirements:

- curl or wget (for downloading vocabulary files during installation)
- libvulkan-dev + glslc (Linux) or the Vulkan SDK (Windows)

Benchmarks below use CLIP-L + T5-XXL text encoders and the VAE, with `sample_steps = 10`:
| Test | AMD RX 9070 (16 GB) | Tesla P100 (16 GB) | 2x Tesla T4 (16 GB) |
|---|---|---|---|
| 1. 768x768 direct | 44.2 s | 94.0 s | 133.1 s |
| 2. 1024x1024 tiled VAE | 163.6 s | 151.4 s | 243.6 s |
| 3. 2048x1024 highres fix | 309.7 s | 312.5 s | 492.2 s |
| 4. img2img 768x768 direct | 29.6 s | 51.0 s | 73.5 s |
| 5. 1024x1024 direct | 163.0 s | 152.2 s | 243.3 s |
| 6. Multi-GPU 4 prompts | -- | -- | 284.9 s (4 img) |
With CLIP-L + T5-XXL (Q5_K_M) text encoders and the VAE, `sample_steps = 25`:
| Test | AMD RX 9070 (16 GB) | 2x Tesla T4 (16 GB) |
|---|---|---|
| 768x768 direct | 110.8 s | -- |
| 1024x1024 direct | -- | 553.1 s |
| | SD 1.5 | Flux Q4_K_S |
|---|---|---|
| Diffusion params | ~860 MB | ~6.5 GB |
| Text encoders | CLIP ~240 MB | CLIP-L + T5-XXL ~3.9 GB |
| Sampling per step (768x768) | ~0.1–0.3 s | ~3.9 s |
| Architecture | UNet | MMDiT (57 blocks) |
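Flux's roughly 10 GB of combined weights is where the multi-GPU `device_layout` presets become useful. A hedged sketch using the presets and override parameters documented in the feature list; the model file name is illustrative:

```r
library(sd2R)

# Preset: CLIP/T5 text encoders on GPU 1, diffusion + VAE on GPU 0.
ctx <- sd_ctx("flux-model-q4_k_s.gguf", device_layout = "split_encoders")

# Equivalent manual placement via the documented override parameters.
ctx <- sd_ctx("flux-model-q4_k_s.gguf",
              diffusion_gpu = 0, clip_gpu = 1, vae_gpu = 1)
```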
For a live, runnable demo see the Kaggle notebook: Stable Diffusion in R (ggmlR + Vulkan GPU).
License: MIT