TeaCache and DeepCache Now Available on Diffucore

I'm still rocking a GPU released in 2021 on an architecture that dates back to 2018, so my RTX 2060 12GB has already had a long run. Faster image generation has started to feel like a dream, and with GPU and VRAM prices continuing to rise, that dream feels farther away than ever.

For the past week, I have been looking for ways to optimize generation speed in my own inference engine, Diffucore, and its UI, Diffucore UI. Along the way, I came across two interesting approaches: TeaCache and DeepCache.

TeaCache is a cache-based acceleration method that skips redundant computation by reusing previously computed results when the model's behavior changes little from one step to the next. In practice, this can reduce inference time with minimal quality loss, especially when the threshold is tuned carefully.
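To make the idea concrete, here is a minimal Python sketch of threshold-based step skipping in the spirit of TeaCache. It is not Diffucore's code or the reference implementation: the class name `TeaCacheSketch`, the helper `rel_l1`, the toy `model`, and the `signals` values are all hypothetical, and the sketch leaves out refinements such as rescaling the distance estimate. It only shows the core loop: accumulate how much a cheap per-step signal has changed, reuse the cached residual while that total stays under the threshold, and recompute once it crosses it.

```python
def rel_l1(current, previous):
    """Relative L1 change between two equally sized lists of floats."""
    num = sum(abs(c - p) for c, p in zip(current, previous))
    den = sum(abs(p) for p in previous) or 1e-8
    return num / den


class TeaCacheSketch:
    """Hypothetical illustration of threshold-based step skipping."""

    def __init__(self, threshold):
        self.threshold = threshold      # e.g. 0.3 or 0.4
        self.accumulated = 0.0          # change accumulated since last full pass
        self.prev_signal = None         # cheap per-step indicator from last step
        self.cached_residual = None     # (output - input) from last full pass

    def step(self, x, signal, expensive_model):
        recompute = True
        if self.prev_signal is not None and self.cached_residual is not None:
            self.accumulated += rel_l1(signal, self.prev_signal)
            recompute = self.accumulated >= self.threshold
        self.prev_signal = signal

        if recompute:
            out = expensive_model(x)
            self.cached_residual = [o - xi for o, xi in zip(out, x)]
            self.accumulated = 0.0
            return out, True
        # Skip the expensive call and reuse the cached residual instead.
        return [xi + r for xi, r in zip(x, self.cached_residual)], False


calls = []

def model(x):
    # Toy stand-in for the expensive denoiser (assumption for this demo).
    calls.append(1)
    return [2.0 * v for v in x]

cache = TeaCacheSketch(threshold=0.3)
signals = [[1.0], [1.1], [1.2], [1.3], [1.4]]  # slowly drifting toy signal
flags = [cache.step([1.0], s, model)[1] for s in signals]
print(flags, len(calls))
```

A higher threshold lets more change accumulate before a full pass is forced, which is why raising it trades quality for speed.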

DeepCache takes a different approach. Instead of caching only small parts of the computation, it reuses deeper hidden representations from earlier steps to avoid repeating expensive work. This can lead to larger speed gains, though the exact tradeoff depends on the model and settings.
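The sketch below illustrates the DeepCache idea on a toy model. Everything in it is a hypothetical stand-in rather than Diffucore's implementation: the class name `DeepCacheSketch`, the three callables, and the `interval` parameter are assumptions, and I am assuming that a setting like "DeepCache at 2" corresponds to a cache interval of 2. The shallow layers run on every step, while the expensive deep blocks run only once per interval and their output is reused in between.

```python
class DeepCacheSketch:
    """Hypothetical illustration of reusing deep features across steps."""

    def __init__(self, shallow_down, deep_blocks, shallow_up, interval):
        self.shallow_down = shallow_down  # cheap outer encoder layers
        self.deep_blocks = deep_blocks    # expensive inner layers
        self.shallow_up = shallow_up      # cheap outer decoder layers
        self.interval = interval          # run the deep blocks every N steps
        self.cached_deep = None
        self.deep_calls = 0

    def step(self, step_index, x):
        skip = self.shallow_down(x)       # always computed
        if self.cached_deep is None or step_index % self.interval == 0:
            # Full pass: refresh the cached deep features.
            self.cached_deep = self.deep_blocks(skip)
            self.deep_calls += 1
        # Cached steps combine fresh shallow features with stale deep ones.
        return self.shallow_up(skip, self.cached_deep)


# Toy callables operating on plain floats (assumptions for this demo).
unet = DeepCacheSketch(
    shallow_down=lambda x: x + 1.0,
    deep_blocks=lambda h: h * 2.0,
    shallow_up=lambda skip, deep: skip + deep,
    interval=2,
)

x = 0.5
for i in range(32):
    x = 0.9 * unet.step(i, x) / 3.0

print(unet.deep_calls)
```

With an interval of 2 over 32 steps, the deep blocks run on half of the steps. The real speedup is smaller than 2x because the shallow layers still run every time.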

The results have been promising.

With Anima, enabling TeaCache at 0.3, together with secant_anneal and a beta scheduler, improved inference speed by 8% on my RTX 2060 when generating at 1024×1536 with 32 steps, with almost no visible quality degradation compared to running without TeaCache.

[Image comparison] Top: No TeaCache. Bottom: TeaCache 0.3.

If you want even more speed, TeaCache at 0.4 can deliver around 25% faster generation, while still preserving about 95% of the output quality.

[Image comparison] Top: TeaCache 0.4. Bottom: No TeaCache.

With SDXL, using DeepCache at 2 increased inference speed by 43% with almost no quality loss, apart from the same seed producing a slightly different image.

[Image comparison] Top: No DeepCache. Bottom: DeepCache 2.
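A note on reading percentages like these: "X% faster" can mean either more images per unit of time or less time per image, and the two numbers differ. The small Python helper below shows both conversions from a pair of wall-clock timings. The function names and the 100 s and 80 s timings are hypothetical placeholders for illustration, not measurements from my tests.

```python
def throughput_gain_pct(baseline_s, optimized_s):
    """Percent more work done per unit of time."""
    return (baseline_s / optimized_s - 1.0) * 100.0


def time_saved_pct(baseline_s, optimized_s):
    """Percent less wall-clock time per image."""
    return (1.0 - optimized_s / baseline_s) * 100.0


# Hypothetical timings in seconds, chosen only to show the difference.
baseline_s, optimized_s = 100.0, 80.0
print(round(throughput_gain_pct(baseline_s, optimized_s), 1))
print(round(time_saved_pct(baseline_s, optimized_s), 1))
```

The same pair of timings gives a 25% throughput gain but only a 20% reduction in time per image, so it helps to say which one a figure refers to.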

These are not just incremental improvements. They are a real step toward making high-quality image generation faster and more accessible on modest hardware.

I'm not sure how best to interpret the speed numbers in this post, so I'd like to invite you to try Diffucore UI yourself and feel the difference.
