sd2R

sd2R is an R package that provides a native, GPU-accelerated Stable Diffusion pipeline by wrapping the C++ implementation from stable-diffusion.cpp and using ggmlR as the tensor backend.

Overview

sd2R exposes a high-level R interface for text-to-image and image-to-image generation, while all heavy computation (tokenization, encoders, denoiser, sampler, VAE, model loading) is implemented in C++. It supports the SD 1.x, SD 2.x, SDXL, and Flux model families, and targets local inference on Linux with Vulkan-enabled AMD GPUs (with automatic CPU fallback via ggml), without relying on external Python or web APIs.

Architecture

Flux without Python:

R  →  sd2R  →  ggmlR  →  ggml  →  Vulkan  →  GPU

Key Features

Pipeline Example

pipe <- sd_pipeline(
  sd_node("txt2img", prompt = "a cat in space", width = 512, height = 512),
  sd_node("upscale", factor = 2),
  sd_node("img2img", strength = 0.3),
  sd_node("save", path = "output.png")
)

# Save / load as JSON
sd_save_pipeline(pipe, "my_pipeline.json")
pipe <- sd_load_pipeline("my_pipeline.json")

# Run
ctx <- sd_ctx("model.safetensors")
# upscaler_ctx: a separately created upscaler context, used by the "upscale" node
sd_run_pipeline(pipe, ctx, upscaler_ctx = upscaler)
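Since `sd_pipeline()` takes nodes as individual arguments, a pipeline can also be assembled from a precomputed list with base R's `do.call()`. A sketch, assuming only the variadic `sd_pipeline(...)` signature shown above:

```r
# Build the node list programmatically, then splice it into sd_pipeline().
steps <- list(
  sd_node("txt2img", prompt = "a cat in space", width = 512, height = 512),
  sd_node("save", path = "output.png")
)
pipe <- do.call(sd_pipeline, steps)
```

This is convenient when the node sequence is generated from configuration rather than written out by hand.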

Implementation Details

CRAN Readiness

Installation

# Install ggmlR first (if not already installed)
remotes::install_github("Zabis13/ggmlR")

# Install sd2R
remotes::install_github("Zabis13/sd2R")

During installation, the configure script automatically downloads tokenizer vocabulary files (~128 MB total) from GitHub Releases. This requires curl or wget.

Offline / Manual Installation

If you don't have internet access during installation, download the vocabulary files manually and place them into src/sd/ before building:

# Download from https://github.com/Zabis13/sd2R/releases/tag/assets
# Files: vocab.hpp, vocab_mistral.hpp, vocab_qwen.hpp, vocab_umt5.hpp

wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_mistral.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_qwen.hpp -P src/sd/
wget https://github.com/Zabis13/sd2R/releases/download/assets/vocab_umt5.hpp -P src/sd/

R CMD INSTALL .
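If the build fails, a quick way to confirm the headers landed in the right place is to check for them from R before installing. This helper is plain base R and assumes nothing beyond the file names and `src/sd/` location listed above:

```r
# Return the subset of the four tokenizer vocabulary headers missing from `dir`.
missing_vocab <- function(dir = "src/sd") {
  vocab_files <- c("vocab.hpp", "vocab_mistral.hpp",
                   "vocab_qwen.hpp", "vocab_umt5.hpp")
  vocab_files[!file.exists(file.path(dir, vocab_files))]
}

# Before R CMD INSTALL: abort early with a clear message if anything is missing
# if (length(missing_vocab()) > 0)
#   stop("Missing vocab files: ", paste(missing_vocab(), collapse = ", "))
```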

System Requirements

Benchmarks

FLUX.1-dev Q4_K_S — 10 steps

CLIP-L + T5-XXL text encoders, VAE. sample_steps = 10.

| Test | AMD RX 9070 (16 GB) | Tesla P100 (16 GB) | 2x Tesla T4 (16 GB) |
|---|---|---|---|
| 1. 768x768 direct | 44.2 s | 94.0 s | 133.1 s |
| 2. 1024x1024 tiled VAE | 163.6 s | 151.4 s | 243.6 s |
| 3. 2048x1024 highres fix | 309.7 s | 312.5 s | 492.2 s |
| 4. img2img 768x768 direct | 29.6 s | 51.0 s | 73.5 s |
| 5. 1024x1024 direct | 163.0 s | 152.2 s | 243.3 s |
| 6. Multi-GPU 4 prompts | -- | -- | 284.9 s (4 img) |

FLUX.1-dev Q4_K_S — 25 steps

CLIP-L + T5-XXL (Q5_K_M) text encoders, VAE. sample_steps = 25.

| Test | AMD RX 9070 (16 GB) | 2x Tesla T4 (16 GB) |
|---|---|---|
| 768x768 direct | 110.8 s | -- |
| 1024x1024 direct | -- | 553.1 s |

Model size comparison

| | SD 1.5 | Flux Q4_K_S |
|---|---|---|
| Diffusion params | ~860 MB | ~6.5 GB |
| Text encoders | CLIP ~240 MB | CLIP-L + T5-XXL ~3.9 GB |
| Sampling per step (768x768) | ~0.1–0.3 s | ~3.9 s |
| Architecture | UNet | MMDiT (57 blocks) |
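The per-step figure roughly accounts for the end-to-end times in the benchmark tables. For example, for the 10-step 768x768 Flux run on the RX 9070 (all numbers taken from the tables above):

```r
# ~3.9 s per sampling step over 10 steps, against the 44.2 s total,
# leaves roughly 5 s for text encoding and VAE decode.
sampling <- 10 * 3.9          # ~39 s of denoising
overhead <- 44.2 - sampling   # ~5.2 s for encoders + VAE
```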

Examples

For a live, runnable demo, see the Kaggle notebook: Stable Diffusion in R (ggmlR + Vulkan GPU).

See Also

License

MIT


sd2R documentation built on March 30, 2026, 5:08 p.m.