## Problem
Wan 2.2 I2V on a 12 GB VRAM / 16 GB RAM rig. Pipeline happily renders a 121-frame clip in ~18 minutes. You change the starter image, queue the next job, go make coffee, come back — and the ETA says 50 minutes. No error. No warning.
The log shows this exact fingerprint, every single time (!!):
```
loaded partially; 0.00 MB usable, 0.00 MB loaded,
8475.46 MB offloaded, 1131.60 MB buffer reserved
```
That 1131.60 MB buffer is deterministic: same machine, same workflow, same number. It's not a memory-pressure accident. It's ComfyUI's partial-load heuristic locking into a pathological partition where, with 0.00 MB loaded, all 8475.46 MB of the model streams from system RAM. Per-step time goes from ~70 s to ~210 s. The bug also contaminates queued runs until ComfyUI is restarted. Cancel + re-queue doesn't help: same inputs, same bad math.
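If you'd rather not eyeball the console, the fingerprint is easy to detect mechanically. The sketch below is only an illustration: the regex is written against the log excerpt quoted above (not against ComfyUI's source), and `is_trapped` with its `max_loaded_mb` threshold is a hypothetical heuristic, not anything ComfyUI provides.

```python
import re

# Pattern derived from the log excerpt above. The message can wrap across
# lines, so any whitespace (including newlines) is allowed between fields.
PARTIAL_LOAD = re.compile(
    r"loaded partially;\s*"
    r"(?P<usable>[\d.]+) MB usable,\s*"
    r"(?P<loaded>[\d.]+) MB loaded,\s*"
    r"(?P<offloaded>[\d.]+) MB offloaded,\s*"
    r"(?P<buffer>[\d.]+) MB buffer reserved"
)

def parse_partial_loads(log_text):
    """Return one dict of floats (in MB) per partial-load message found."""
    return [
        {name: float(value) for name, value in match.groupdict().items()}
        for match in PARTIAL_LOAD.finditer(log_text)
    ]

def is_trapped(entry, max_loaded_mb=1.0):
    """Hypothetical check: (almost) nothing resident on the GPU, model offloaded."""
    return entry["loaded"] <= max_loaded_mb and entry["offloaded"] > 0

# The exact excerpt from this post, line wrap included.
SAMPLE = """loaded partially; 0.00 MB usable, 0.00 MB loaded,
8475.46 MB offloaded, 1131.60 MB buffer reserved"""

if __name__ == "__main__":
    for entry in parse_partial_loads(SAMPLE):
        print(entry, "TRAPPED" if is_trapped(entry) else "ok")
```

Point it at a saved console log and bail out of the queue as soon as a trapped entry shows up, instead of finding out 50 minutes later.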
## Fix A — the resize workaround
Earlier approach: force ImageResizeKJv2 down to a small target (e.g., 512×768). This does get sampling moving, but at a cost:
1) Per-step ~197 s
2) Job total ~45 min
3) Buffer 232 MB, 6.5 GB offloaded — still mostly RAM-streaming, just slightly less tragic
Result: it works, barely.
## Fix B — match the starter to what Wan actually wants
The real fix isn't the resize target. It's the input image dimensions. Feed Wan a landscape starter near the SDXL-native bucket (we use 1344×768), and the allocator lands in a completely different partition.
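A preflight check on the starter's dimensions can catch a portrait input before it reaches the queue. This is a minimal sketch in plain Python: the 1344×768 target comes from this post, while `is_landscape` and `center_crop_box` are hypothetical helper names, and center-cropping is just one assumed way to get a starter into shape.

```python
TARGET_W, TARGET_H = 1344, 768  # landscape size used in this post

def is_landscape(width, height):
    """True when the image is wider than it is tall."""
    return width > height

def center_crop_box(width, height, target_w=TARGET_W, target_h=TARGET_H):
    """Largest centered box with the target aspect ratio.

    Returns (left, top, right, bottom) in pixels.
    """
    target_ratio = target_w / target_h
    if width / height > target_ratio:
        # Source is wider than the target shape: trim the sides.
        new_w = round(height * target_ratio)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is taller (or an exact match): trim top and bottom.
    new_h = round(width / target_ratio)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)

if __name__ == "__main__":
    for size in [(1344, 768), (1920, 1080), (768, 1344)]:
        shape = "landscape" if is_landscape(*size) else "portrait"
        print(size, shape, center_crop_box(*size))
```

The returned tuple uses the same (left, upper, right, lower) order as Pillow's `Image.crop`, so it can be passed straight in and followed by a resize to 1344×768. Cropping a portrait image this way throws away most of the frame, so picking or generating a landscape starter in the first place is usually the better move.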
## Evidence
| Metric | Fix A (portrait input) | Fix B (1344×768 input) |
|---|---|---|
| Buffer reserved (MB) | 232 | 70.36 |
| Loaded (MB) | 1953 | 5801 |
| Offloaded (MB) | 6522 | 2674 |
| Per-step time (s) | 197 | 68–72 |
| Job total (min) | ~45 | ~18–20 |
Fix B is about **2.7–2.9× faster per step** (197 s vs. 68–72 s) and roughly **2.25–2.5× faster end-to-end** (~45 min vs. ~18–20 min), and the fingerprint 5801/2674/70.36 is identical across back-to-back warm jobs. Feed it one portrait starter and the 1131 trap snaps shut again, instantly, reproducibly.
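The numbers in the table can be cross-checked with a few lines of arithmetic. The sketch below only uses figures from this post; the helper names are made up for illustration, and "resident fraction" is simply loaded / (loaded + offloaded).

```python
# Figures copied from the Evidence table (MB, seconds, minutes).
FIX_A = {"loaded": 1953, "offloaded": 6522, "step_s": (197, 197), "total_min": (45, 45)}
FIX_B = {"loaded": 5801, "offloaded": 2674, "step_s": (68, 72), "total_min": (18, 20)}

def resident_fraction(run):
    """Share of the model weights that stayed on the GPU."""
    return run["loaded"] / (run["loaded"] + run["offloaded"])

def speedup_range(slow, fast, key):
    """(min, max) speedup of `fast` over `slow` for a (low, high) metric."""
    slow_lo, slow_hi = slow[key]
    fast_lo, fast_hi = fast[key]
    return (slow_lo / fast_hi, slow_hi / fast_lo)

if __name__ == "__main__":
    for name, run in [("Fix A", FIX_A), ("Fix B", FIX_B)]:
        print(f"{name}: {resident_fraction(run):.1%} of weights resident")
    lo, hi = speedup_range(FIX_A, FIX_B, "step_s")
    print(f"per-step speedup: {lo:.2f}x to {hi:.2f}x")
    lo, hi = speedup_range(FIX_A, FIX_B, "total_min")
    print(f"end-to-end speedup: {lo:.2f}x to {hi:.2f}x")
```

Both runs sum to the same ~8475 MB of weights, matching the offloaded total in the trap log. What changes is how much of it sits on the GPU: about 23% under Fix A versus about 68% under Fix B.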
Takeaway: if your Wan workflow suddenly takes forever, don't tune the sampler. Check your starter image dimensions. One badly-shaped input can derail your whole evening. Landscape near native = smiles. Portrait = tears.

