dot-estimate_vae_vram: Estimate peak VAE VRAM usage in bytes

.estimate_vae_vram    R Documentation

Estimate peak VAE VRAM usage in bytes

Description

Analytic upper bound on the peak compute-buffer size of the VAE decoder. The peak occurs in the ResNet block that runs at full pixel resolution (W x H) with the decoder's base channel width. The per-pixel cost of one such tensor is derived from the architecture (base channels x dtype bytes); the only empirical constant is live_tensors, the number of full-resolution tensors that ggml's graph allocator keeps alive simultaneously. That value is calibrated against an observed Flux failure: a 2048x1024 decode requested 19238223904 bytes, i.e. 19238223904 / (2048*1024) ~= 9174 B/px, and 9174 / (128 ch * 4 B) ~= 17.9 live full-res tensors. The value is rounded up to 18 so that the estimate errs on the high side (tiling should engage rather than the decode running out of memory).
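
The calibration arithmetic above can be reproduced in a few lines of R. This is only a check of the quoted numbers, not code from the package; the variable names are illustrative. The exact per-pixel figure comes out marginally below the rounded value in the text, which does not change the result of rounding up.

```r
# Observed allocation request for the 2048x1024 Flux decode (from the description).
requested_bytes <- 19238223904
pixels <- 2048 * 1024

# Architecture-derived cost of one full-resolution tensor, per pixel:
# 128 base channels x 4 bytes per element (both taken from the description).
bytes_per_px_per_tensor <- 128 * 4

bytes_per_px  <- requested_bytes / pixels               # about 9173.5 B/px
live_observed <- bytes_per_px / bytes_per_px_per_tensor # about 17.92 tensors
live_tensors  <- ceiling(live_observed)                 # rounded up to 18

print(c(bytes_per_px = bytes_per_px,
        live_observed = live_observed,
        live_tensors = live_tensors))
```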

Usage

.estimate_vae_vram(width, height, model_type = "sd1", batch = 1L)

Arguments

width

Image width in pixels

height

Image height in pixels

model_type

Model type string ("sd1", "sd2", "sdxl", "flux", etc.)

batch

Batch size (default 1)

Value

Estimated peak VRAM in bytes
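
Examples

The following is a minimal sketch of the formula described above, not the package source. The function estimate_vae_vram_sketch is hypothetical. It assumes 128 base channels and 4-byte elements for every model type, 18 live tensors, and linear scaling with batch size; the real function may choose these values per model_type.

```r
# Hypothetical stand-in for the estimator; NOT the package implementation.
# Assumptions: fixed base_channels/dtype_bytes, live_tensors = 18,
# and a peak that scales linearly with batch.
estimate_vae_vram_sketch <- function(width, height, batch = 1L,
                                     base_channels = 128, dtype_bytes = 4,
                                     live_tensors = 18) {
  # Convert to double first: R integers are 32-bit, so an all-integer
  # product of this size would overflow to NA.
  as.numeric(width) * height * base_channels * dtype_bytes * live_tensors * batch
}

est      <- estimate_vae_vram_sketch(2048L, 1024L)
observed <- 19238223904   # bytes requested in the observed Flux failure

est >= observed   # the sketch over-estimates the observed request
est / 1024^3      # estimate expressed in GiB
```

Under these assumptions the 2048x1024 case evaluates to 18 GiB, slightly above the observed request. That is the intended direction of the error for an upper bound.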


sd2R documentation built on June 19, 2026, 9:08 a.m.