sd_ctx: Create a Stable Diffusion context

View source: R/pipeline.R

Description

Loads a model and creates a context for image generation.

Usage

sd_ctx(
  model_path = NULL,
  vae_path = NULL,
  taesd_path = NULL,
  clip_l_path = NULL,
  clip_g_path = NULL,
  t5xxl_path = NULL,
  llm_path = NULL,
  diffusion_model_path = NULL,
  control_net_path = NULL,
  n_threads = 0L,
  wtype = SD_TYPE$COUNT,
  tensor_type_rules = NULL,
  vae_decode_only = TRUE,
  free_params_immediately = FALSE,
  keep_clip_on_cpu = FALSE,
  keep_vae_on_cpu = FALSE,
  offload_params_to_cpu = FALSE,
  max_vram = 0,
  stream_layers = FALSE,
  enable_mmap = FALSE,
  vae_conv_direct = TRUE,
  diffusion_conv_direct = FALSE,
  diffusion_flash_attn = TRUE,
  rng_type = RNG_TYPE$CUDA,
  prediction = NULL,
  lora_apply_mode = LORA_APPLY_MODE$AUTO,
  model_type = "sd1",
  vram_gb = NULL,
  device_layout = "mono",
  diffusion_gpu = -1L,
  clip_gpu = -1L,
  vae_gpu = -1L,
  meta_backend = FALSE,
  verbose = FALSE
)

Arguments

model_path

Path to the model file (safetensors, gguf, or checkpoint)

vae_path

Optional path to a separate VAE model

taesd_path

Optional path to TAESD model for preview

clip_l_path

Optional path to CLIP-L model

clip_g_path

Optional path to CLIP-G model

t5xxl_path

Optional path to T5-XXL model

llm_path

Optional path to an LLM text encoder (Qwen3 / Mistral-Small). Required for models that use an LLM conditioner, e.g. FLUX.2 Klein (Qwen3), FLUX.2 (Mistral-Small), Z-Image and Qwen-Image. Loaded into the text_encoders.llm slot.

diffusion_model_path

Optional path to separate diffusion model

control_net_path

Optional path to ControlNet model

n_threads

Number of CPU threads (0 = auto-detect)

wtype

Weight type for quantization (see SD_TYPE)

tensor_type_rules

Optional per-component weight type override, as a comma-separated string of pattern=type rules. Each pattern is a regex matched against tensor names; the first match wins. Use this to load specific model components at a different precision than wtype. Examples:

  • "first_stage_model=f16" — load VAE at F16

  • "first_stage_model=f16,model.diffusion_model=q8_0" — VAE F16, UNet Q8_0

Type names match ggml type names ("f16", "f32", "q8_0", etc.).
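
The rule string can be assembled from a named character vector instead of being typed by hand. The following is a minimal sketch in base R. Both make_tensor_type_rules() and resolve_type() are hypothetical helpers, not part of the package. resolve_type() only imitates the documented "first match wins" behaviour using R's own regex engine, which may differ from the engine the library uses, and the tensor names are made up for illustration.

```r
# Hypothetical helper: build "pattern=type,pattern=type" from a named vector.
make_tensor_type_rules <- function(rules) {
  stopifnot(is.character(rules), !is.null(names(rules)), all(nzchar(names(rules))))
  paste(names(rules), rules, sep = "=", collapse = ",")
}

# Hypothetical helper: imitate "first match wins" for a single tensor name.
# Returns `default` when no pattern matches (i.e. wtype would apply).
resolve_type <- function(tensor_name, rules, default = NA_character_) {
  for (i in seq_along(rules)) {
    if (grepl(names(rules)[i], tensor_name)) {
      return(unname(rules[i]))
    }
  }
  default
}

# Rules are checked in order, so put the most specific pattern first.
rules <- c("first_stage_model" = "f16", "model.diffusion_model" = "q8_0")
rule_string <- make_tensor_type_rules(rules)

# Illustrative tensor names (assumed, not read from a real checkpoint).
vae_type  <- resolve_type("first_stage_model.decoder.conv_in.weight", rules)
unet_type <- resolve_type("model.diffusion_model.input_blocks.0.0.weight", rules)
```

The resulting rule_string is what would be passed as tensor_type_rules. Because each pattern is a regex, an unescaped "." matches any character.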

vae_decode_only

If TRUE, only load VAE decoder (saves memory)

free_params_immediately

If TRUE, free model parameters after the first computation. The context can then be used for a single generation only; subsequent calls will crash. Set to TRUE only when you need to save memory and will not reuse the context. Default is FALSE.

keep_clip_on_cpu

Keep CLIP model on CPU even when using GPU

keep_vae_on_cpu

Keep VAE on CPU even when using GPU

offload_params_to_cpu

Keep model weights in CPU RAM and stream them to the GPU on demand during compute (default FALSE). Lowers VRAM usage at the cost of CPU<->GPU transfers each step. Use when the model does not fit in GPU memory.

max_vram

GiB budget for graph-cut segmented parameter offload (default 0 = disabled). A positive value caps GPU memory used by the compute graph; -1 means "auto" (free VRAM minus ~1 GiB). Required for stream_layers to take effect.

stream_layers

Enable residency + prefetch streaming of layers on top of max_vram (default FALSE). Has no effect unless max_vram is set (a non-zero budget); automatically disabled otherwise.
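
The interaction between max_vram and stream_layers described above can be sketched as a small base R function. resolve_offload() is a hypothetical helper written for illustration, not package code. The free_vram_gib argument is assumed to be supplied by the caller, the 1 GiB headroom is the approximate figure quoted above, and clamping the budget at zero is an assumption of this sketch.

```r
# Hypothetical helper mirroring the documented rules; not package code.
# free_vram_gib: free GPU memory in GiB, assumed to be known by the caller.
resolve_offload <- function(max_vram = 0, stream_layers = FALSE,
                            free_vram_gib = NA_real_) {
  budget <- if (max_vram == -1) {
    # "auto": free VRAM minus roughly 1 GiB of headroom (clamp is assumed)
    max(free_vram_gib - 1, 0)
  } else {
    max_vram
  }
  enabled <- !is.na(budget) && budget > 0
  list(
    budget_gib = if (enabled) budget else 0,
    # streaming only applies on top of an active budget
    stream_layers = isTRUE(stream_layers) && enabled
  )
}

auto_plan <- resolve_offload(max_vram = -1, stream_layers = TRUE, free_vram_gib = 8)
no_budget <- resolve_offload(max_vram = 0, stream_layers = TRUE)
```

In this sketch, requesting stream_layers without a budget (no_budget) leaves streaming off, matching the "automatically disabled otherwise" rule.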

enable_mmap

Memory-map model weights from disk instead of reading them into a malloc'd buffer (default FALSE). Lowers RAM footprint for large models (e.g. Flux); pages are loaded on demand by the OS and shared across processes. Ignored for zip-archived weights. May slow the first generation slightly as pages fault in.

vae_conv_direct

Use direct Conv2d implementation in VAE (default TRUE). Faster on GPU; skips im2col and uses direct convolution kernels.

diffusion_conv_direct

Use direct Conv2d in diffusion model (default FALSE).

diffusion_flash_attn

Enable flash attention for diffusion model (default TRUE). Set to FALSE if you experience issues with specific GPU drivers or backends.

rng_type

RNG type (see RNG_TYPE)

prediction

Prediction type override (see PREDICTION), NULL = auto

lora_apply_mode

LoRA application mode (see LORA_APPLY_MODE)

model_type

Model architecture hint: "sd1", "sd2", "sdxl", "flux", "flux2", "sd3", or "auto". Used by sd_generate to determine native resolution and tile sizes. With "auto", the type is detected first from a sibling config.json and then from the filename (GGUF-metadata detection is a future hook); if detection cannot decide, an error with a hint is raised. Default "sd1".

vram_gb

Override available VRAM in GB. When set, disables auto-detection and uses this value for strategy routing. Default NULL (auto-detect from Vulkan device).

device_layout

GPU layout preset for multi-GPU systems. One of:

"mono"

All models on one GPU (default).

"split_encoders"

Text encoders (CLIP/T5) on GPU 1, diffusion + VAE on GPU 0.

"split_vae"

Text encoders + VAE on GPU 1, diffusion on GPU 0. Maximizes VRAM for diffusion.

"encoders_cpu"

Text encoders on CPU, diffusion + VAE on GPU. Saves GPU memory at the cost of slower text encoding.

Ignored when diffusion_gpu, clip_gpu, or vae_gpu are explicitly set (>= 0).
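
The presets and the override rule can be pictured as a lookup table. The sketch below is base R only, and layout_devices() is a hypothetical helper, not part of the package. It assumes that "one GPU" means device 0, uses NA to stand for the CPU, and ignores the SD_VK_DEVICE environment variable for simplicity.

```r
# Hypothetical lookup of the documented presets; NA stands for "CPU".
layout_devices <- function(device_layout = "mono",
                           diffusion_gpu = -1L, clip_gpu = -1L, vae_gpu = -1L) {
  presets <- list(
    mono           = c(diffusion = 0L, clip = 0L, vae = 0L),  # device 0 assumed
    split_encoders = c(diffusion = 0L, clip = 1L, vae = 0L),
    split_vae      = c(diffusion = 0L, clip = 1L, vae = 1L),
    encoders_cpu   = c(diffusion = 0L, clip = NA_integer_, vae = 0L)
  )
  device_layout <- match.arg(device_layout, names(presets))

  if (any(c(diffusion_gpu, clip_gpu, vae_gpu) >= 0L)) {
    # An explicit index overrides the preset entirely; unset components
    # fall back to the diffusion device (device 0 assumed when unset).
    dif <- if (diffusion_gpu >= 0L) as.integer(diffusion_gpu) else 0L
    return(c(
      diffusion = dif,
      clip = if (clip_gpu >= 0L) as.integer(clip_gpu) else dif,
      vae  = if (vae_gpu >= 0L) as.integer(vae_gpu) else dif
    ))
  }
  presets[[device_layout]]
}

preset_only   <- layout_devices("split_vae")
with_override <- layout_devices("split_vae", clip_gpu = 2L)
```

Here with_override shows the preset being dropped as soon as one index is set: the VAE falls back to the diffusion device instead of staying on GPU 1.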

diffusion_gpu

Vulkan GPU device index for the diffusion model. Default -1 (use SD_VK_DEVICE env or device 0). Overrides device_layout.

clip_gpu

Vulkan GPU device index for CLIP/T5 text encoders. Default -1 (same device as diffusion). Overrides device_layout.

vae_gpu

Vulkan GPU device index for VAE encoder/decoder. Default -1 (same device as diffusion). Overrides device_layout.

meta_backend

Logical flag to run the diffusion model through the ggml meta backend ("second path", multi-GPU tensor split across all available GPUs). Requires meta-backend support compiled in at install time (ggmlR >= 0.7.8 exporting ggml_backend_meta_device); if the build lacks it, a warning is emitted and the normal single-backend path is used. Default FALSE keeps existing behaviour unchanged. Distinct from diffusion_gpu/vae_gpu (per-component placement) and sd_generate_multi_gpu() (per-prompt batch parallelism).

verbose

If TRUE, print model loading progress and sampling steps. Default FALSE.

Value

An external pointer to the SD context (class "sd_ctx") with attributes model_type, vae_decode_only, vram_gb, vram_total_gb, and vram_device.

Examples

## Not run: 
ctx <- sd_ctx("model.safetensors")
imgs <- sd_txt2img(ctx, "a cat sitting on a chair")
sd_save_image(imgs[[1]], "cat.png")

## End(Not run)
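
A further sketch for low-VRAM setups, using only arguments documented above. It assumes the package is installed under the name sd2R and that "model.safetensors" is a placeholder path, as in the example above. The guard makes the snippet a no-op when either is missing.

```r
model_file <- "model.safetensors"  # hypothetical path, as in the example above

# Run only if the package (name assumed to be sd2R) and the file are present.
if (requireNamespace("sd2R", quietly = TRUE) && file.exists(model_file)) {
  ctx <- sd2R::sd_ctx(
    model_path    = model_file,
    max_vram      = -1,    # "auto" budget: free VRAM minus ~1 GiB
    stream_layers = TRUE,  # takes effect only because max_vram is non-zero
    enable_mmap   = TRUE,  # map weights from disk to lower RAM use
    verbose       = TRUE
  )
  # Attributes listed under Value
  print(attr(ctx, "model_type"))
  print(attr(ctx, "vram_gb"))
}
```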

sd2R documentation built on June 19, 2026, 9:08 a.m.