| .estimate_vae_vram | R Documentation |
Analytic upper bound on the peak compute-buffer size of the VAE decoder.
The peak occurs in the ResNet block that runs at full pixel resolution
(W x H) with the decoder's base channel width. The per-pixel cost is
derived from the architecture (base channels x dtype bytes); the only
empirical constant is live_tensors, the number of such full-resolution
tensors that ggml's graph allocator keeps alive simultaneously. That value
is calibrated against an observed Flux failure: a 2048x1024 decode
requested 19238223904 bytes, i.e. 19238223904 / (2048*1024) ~= 9174
B/px, and 9174 / (128 ch * 4 B) ~= 17.9 live full-res tensors. We round
up to 18 for a safe over-estimate, so that tiling engages instead of the
decode running out of memory.
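The calibration arithmetic above can be reproduced in a few lines of R. This is only a sketch: `sketch_vae_peak_bytes` is a hypothetical helper, not the package's implementation, and the closed form it uses (pixels x base channels x dtype bytes x live tensors, scaled linearly by batch) is an assumption inferred from the description.

```r
# Figures taken from the observed Flux failure described above.
observed_bytes <- 19238223904   # bytes requested by the failing decode
width          <- 2048
height         <- 1024
base_channels  <- 128           # decoder base channel width
dtype_bytes    <- 4             # 4 bytes per element

# Per-pixel cost and the implied number of live full-resolution tensors.
bytes_per_pixel <- observed_bytes / (width * height)
live_observed   <- bytes_per_pixel / (base_channels * dtype_bytes)
live_tensors    <- ceiling(live_observed)   # round up: 18

# Hypothetical helper. ASSUMPTION: the bound is the plain product below
# and scales linearly with batch; the real function may differ.
sketch_vae_peak_bytes <- function(width, height, base_channels = 128,
                                  dtype_bytes = 4, live_tensors = 18,
                                  batch = 1) {
  # as.numeric() keeps the product in double precision, avoiding
  # integer overflow when callers pass integer dimensions.
  as.numeric(width) * height * base_channels * dtype_bytes *
    live_tensors * batch
}

bound <- sketch_vae_peak_bytes(width, height)

# The rounded-up bound should cover the observed request.
bound >= observed_bytes
```

Rounding 17.9 up to 18 leaves the sketched bound slightly above the observed request, which is the intended direction for an estimate that decides whether to tile.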
.estimate_vae_vram(width, height, model_type = "sd1", batch = 1L)
| width | Image width in pixels |
| height | Image height in pixels |
| model_type | Model type string ("sd1", "sd2", "sdxl", "flux", etc.) |
| batch | Batch size (default 1) |
Estimated peak VRAM in bytes