sd_ctx: Create a Stable Diffusion context
In sd2R: Stable Diffusion Image Generation

sd_ctx

R Documentation

Create a Stable Diffusion context

Description

Loads a model and creates a context for image generation.

Usage

sd_ctx(
  model_path = NULL,
  vae_path = NULL,
  taesd_path = NULL,
  clip_l_path = NULL,
  clip_g_path = NULL,
  t5xxl_path = NULL,
  llm_path = NULL,
  diffusion_model_path = NULL,
  control_net_path = NULL,
  n_threads = 0L,
  wtype = SD_TYPE$COUNT,
  tensor_type_rules = NULL,
  vae_decode_only = TRUE,
  free_params_immediately = FALSE,
  keep_clip_on_cpu = FALSE,
  keep_vae_on_cpu = FALSE,
  offload_params_to_cpu = FALSE,
  max_vram = 0,
  stream_layers = FALSE,
  enable_mmap = FALSE,
  vae_conv_direct = TRUE,
  diffusion_conv_direct = FALSE,
  diffusion_flash_attn = TRUE,
  rng_type = RNG_TYPE$CUDA,
  prediction = NULL,
  lora_apply_mode = LORA_APPLY_MODE$AUTO,
  model_type = "sd1",
  vram_gb = NULL,
  device_layout = "mono",
  diffusion_gpu = -1L,
  clip_gpu = -1L,
  vae_gpu = -1L,
  meta_backend = FALSE,
  verbose = FALSE
)

Arguments

`model_path`	Path to the model file (safetensors, gguf, or checkpoint)
`vae_path`	Optional path to a separate VAE model
`taesd_path`	Optional path to TAESD model for preview
`clip_l_path`	Optional path to CLIP-L model
`clip_g_path`	Optional path to CLIP-G model
`t5xxl_path`	Optional path to T5-XXL model
`llm_path`	Optional path to an LLM text encoder (Qwen3 / Mistral-Small). Required for models that use an LLM conditioner, e.g. FLUX.2 Klein (Qwen3), FLUX.2 (Mistral-Small), Z-Image and Qwen-Image. Loaded into the `text_encoders.llm` slot.
`diffusion_model_path`	Optional path to separate diffusion model
`control_net_path`	Optional path to ControlNet model
`n_threads`	Number of CPU threads (0 = auto-detect)
`wtype`	Weight type for quantization (see `SD_TYPE`)
`tensor_type_rules`	Optional per-component weight type override, given as a comma-separated string of `pattern=type` rules. Each pattern is a regex matched against tensor names; the first match wins. Use this to load specific model components at a different precision than `wtype`. Examples: `"first_stage_model=f16"` loads the VAE at F16; `"first_stage_model=f16,model.diffusion_model=q8_0"` loads the VAE at F16 and the UNet at Q8_0. Type names match ggml type names (`"f16"`, `"f32"`, `"q8_0"`, etc.).
`vae_decode_only`	If TRUE, only load VAE decoder (saves memory)
`free_params_immediately`	Free model params after first computation. If TRUE, the context can be used for a single generation only; subsequent calls will crash. Set to TRUE only when you need to save memory and will not reuse the context. Default is FALSE.
`keep_clip_on_cpu`	Keep CLIP model on CPU even when using GPU
`keep_vae_on_cpu`	Keep VAE on CPU even when using GPU
`offload_params_to_cpu`	Keep model weights in CPU RAM and stream them to the GPU on demand during compute (default FALSE). Lowers VRAM usage at the cost of CPU<->GPU transfers each step. Use when the model does not fit in GPU memory.
`max_vram`	GiB budget for graph-cut segmented parameter offload (default 0 = disabled). A positive value caps GPU memory used by the compute graph; `-1` means "auto" (free VRAM minus ~1 GiB). Required for `stream_layers` to take effect.
`stream_layers`	Enable residency + prefetch streaming of layers on top of `max_vram` (default FALSE). Has no effect unless `max_vram` is set (a non-zero budget); automatically disabled otherwise.
`enable_mmap`	Memory-map model weights from disk instead of reading them into a malloc'd buffer (default FALSE). Lowers RAM footprint for large models (e.g. Flux); pages are loaded on demand by the OS and shared across processes. Ignored for zip-archived weights. May slow the first generation slightly as pages fault in.
`vae_conv_direct`	Use direct Conv2d implementation in VAE (default TRUE). Faster on GPU; skips im2col and uses direct convolution kernels.
`diffusion_conv_direct`	Use direct Conv2d in diffusion model (default FALSE).
`diffusion_flash_attn`	Enable flash attention for diffusion model (default TRUE). Set to FALSE if you experience issues with specific GPU drivers or backends.
`rng_type`	RNG type (see `RNG_TYPE`)
`prediction`	Prediction type override (see `PREDICTION`), NULL = auto
`lora_apply_mode`	LoRA application mode (see `LORA_APPLY_MODE`)
`model_type`	Model architecture hint: `"sd1"`, `"sd2"`, `"sdxl"`, `"flux"`, `"flux2"`, `"sd3"`, or `"auto"`. Used by `sd_generate` to determine native resolution and tile sizes. With `"auto"`, the type is detected first from a sibling `config.json`, then from the filename (GGUF-metadata detection is a future hook); if detection cannot decide, it raises an error with a hint. Default `"sd1"`.
`vram_gb`	Override available VRAM in GB. When set, disables auto-detection and uses this value for strategy routing. Default `NULL` (auto-detect from Vulkan device).
`device_layout`	GPU layout preset for multi-GPU systems. One of: `"mono"` (default): all models on one GPU; `"split_encoders"`: text encoders (CLIP/T5) on GPU 1, diffusion + VAE on GPU 0; `"split_vae"`: text encoders + VAE on GPU 1, diffusion on GPU 0, which maximizes VRAM for diffusion; `"encoders_cpu"`: text encoders on CPU, diffusion + VAE on GPU, which saves GPU memory at the cost of slower text encoding. Ignored when any of `diffusion_gpu`, `clip_gpu`, or `vae_gpu` is explicitly set (>= 0).
`diffusion_gpu`	Vulkan GPU device index for the diffusion model. Default `-1` (use `SD_VK_DEVICE` env or device 0). Overrides `device_layout`.
`clip_gpu`	Vulkan GPU device index for CLIP/T5 text encoders. Default `-1` (same device as diffusion). Overrides `device_layout`.
`vae_gpu`	Vulkan GPU device index for VAE encoder/decoder. Default `-1` (same device as diffusion). Overrides `device_layout`.
`meta_backend`	Logical flag to run the diffusion model through the ggml meta backend ("second path", multi-GPU tensor split across all available GPUs). Requires meta-backend support compiled in at install time (ggmlR >= 0.7.8 exporting `ggml_backend_meta_device`); if the build lacks it, a warning is emitted and the normal single-backend path is used. Default `FALSE` keeps existing behaviour unchanged. Distinct from `diffusion_gpu`/`vae_gpu` (per-component placement) and `sd_generate_multi_gpu()` (per-prompt batch parallelism).
`verbose`	If `TRUE`, print model loading progress and sampling steps. Default `FALSE`.
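
The `tensor_type_rules` string can be assembled from a named character vector rather than typed by hand. The sketch below is base R only and is not part of sd2R: `make_tensor_type_rules()` and `resolve_tensor_type()` are hypothetical helpers, and the tensor names are illustrative. The second helper only mimics the documented "first match wins" behaviour; the package's actual matching code may differ.

```r
# Hypothetical helper (not part of sd2R): collapse a named character vector
# of pattern = type pairs into a "pattern=type,pattern=type" string.
make_tensor_type_rules <- function(rules) {
  stopifnot(is.character(rules), !is.null(names(rules)), all(nzchar(names(rules))))
  paste0(names(rules), "=", unname(rules), collapse = ",")
}

# Hypothetical illustration of "first match wins": return the type of the
# first rule whose regex matches the tensor name, otherwise `default`.
resolve_tensor_type <- function(tensor_name, rules, default = NA_character_) {
  for (i in seq_along(rules)) {
    if (grepl(names(rules)[i], tensor_name)) {
      return(unname(rules[i]))
    }
  }
  default
}

# Rule order matters: earlier entries take precedence.
rules <- c(first_stage_model = "f16", "model.diffusion_model" = "q8_0")
rule_string <- make_tensor_type_rules(rules)
print(rule_string)

# Illustrative tensor names (assumed, not taken from a real checkpoint).
print(resolve_tensor_type("first_stage_model.decoder.conv_in.weight", rules))
print(resolve_tensor_type("model.diffusion_model.out.2.weight", rules))

# With sd2R installed and a real model file, the string would be passed as:
# ctx <- sd_ctx("model.safetensors", tensor_type_rules = rule_string)
```

Keeping the rules in a named vector makes their order explicit, which matters because only the first matching pattern applies.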

Value

An external pointer to the SD context (class `"sd_ctx"`) with attributes `model_type`, `vae_decode_only`, `vram_gb`, `vram_total_gb`, and `vram_device`.
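
The attributes listed above can be read with base R's `attr()`. The sketch below uses a stand-in object so that it runs without a model file; a real context is an external pointer, and the attribute values and types shown here are made up for illustration. `sd_ctx_info()` is a hypothetical helper, not an sd2R function.

```r
# Stand-in object for illustration only: a real context is an external
# pointer returned by sd_ctx(); the attribute values below are invented.
mock_ctx <- structure(
  list(),
  class = "sd_ctx",
  model_type = "sd1",
  vae_decode_only = TRUE,
  vram_gb = 8,
  vram_total_gb = 8,
  vram_device = 0L
)

# Hypothetical helper: collect the documented attributes into a named list.
sd_ctx_info <- function(ctx) {
  stopifnot(inherits(ctx, "sd_ctx"))
  keys <- c("model_type", "vae_decode_only", "vram_gb",
            "vram_total_gb", "vram_device")
  stats::setNames(lapply(keys, function(k) attr(ctx, k, exact = TRUE)), keys)
}

info <- sd_ctx_info(mock_ctx)
str(info)
```

The same `attr()` calls should apply to a context returned by `sd_ctx()`, assuming the attributes are set as described in this section.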

Examples

## Not run: 
ctx <- sd_ctx("model.safetensors")
imgs <- sd_txt2img(ctx, "a cat sitting on a chair")
sd_save_image(imgs[[1]], "cat.png")

## End(Not run)
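
The interaction between `max_vram` and `stream_layers` described under Arguments can be sketched as follows. This is a minimal illustration under stated assumptions: `effective_stream_layers()` is a hypothetical helper that restates the documented rule and is not part of sd2R, `"model.safetensors"` is a placeholder path, and the final `sd_ctx()` call is left commented out because it needs the package and a real model file.

```r
# Hypothetical helper (not part of sd2R): restates the documented rule that
# stream_layers only takes effect when max_vram is a non-zero budget.
effective_stream_layers <- function(max_vram = 0, stream_layers = FALSE) {
  isTRUE(stream_layers) && max_vram != 0
}

# Argument list for a low-VRAM context, using only documented arguments.
low_vram_args <- list(
  model_path = "model.safetensors",  # placeholder path
  max_vram = -1,                     # "auto": free VRAM minus ~1 GiB
  stream_layers = TRUE,
  keep_clip_on_cpu = TRUE
)

# Streaming is active here because max_vram is non-zero.
print(effective_stream_layers(low_vram_args$max_vram, low_vram_args$stream_layers))

# With the default max_vram = 0, stream_layers = TRUE would have no effect.
print(effective_stream_layers(0, TRUE))

# With sd2R installed and a real model file, the context would be created with:
# ctx <- do.call(sd_ctx, low_vram_args)
```

Building the arguments as a list first makes it easy to check option combinations before the model is loaded.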

sd2R documentation built on June 19, 2026, 9:08 a.m.