# ERNIE Image Turbo — Nunchaku W4A4 Quantized Inference
---
### Introduction
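This project adds **W4A4 quantized inference** (4-bit weights and 4-bit activations) for [ERNIE Image Turbo](https://huggingface.co/baidu/ERNIE-Image-Turbo) to [**Nunchaku**](https://github.com/nunchaku-ai/nunchaku). In the reference benchmark in the Performance section (single A800, 1024×1024, 8 steps), it runs about 1.74x faster than the BF16 baseline. It also reduces GPU memory use, with minimal quality loss.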
Built on [Nunchaku](https://github.com/nunchaku-ai/nunchaku). We gratefully acknowledge their excellent work on efficient diffusion model inference.
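To make "W4A4" concrete, here is a toy sketch of symmetric 4-bit quantization in plain Python. It is only an illustration of the idea of storing values as 4-bit integer codes plus a scale. It does not reproduce Nunchaku's SVDQuant kernels, and the function names `quantize_int4` and `dequantize_int4` are hypothetical.

```python
def quantize_int4(values):
    """Toy symmetric per-tensor quantization of floats to signed 4-bit codes."""
    max_abs = max(abs(v) for v in values)
    # Signed 4-bit integers span [-8, 7]; dividing by 7 keeps the range symmetric.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int4(codes, scale):
    """Map 4-bit codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.70, -0.33, 0.12, 0.0]
codes, scale = quantize_int4(weights)
restored = dequantize_int4(codes, scale)

print(codes)  # [7, -3, 1, 0]
print(max(abs(w - r) for w, r in zip(weights, restored)))  # rounding error, at most scale / 2
```

Each code needs 4 bits instead of the 16 bits of a BF16 value, which is where the memory saving comes from. The price is the rounding error printed on the last line.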
### Installation
```bash
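# This fork adds ERNIE Image support to Nunchaku
git clone https://github.com/Hzj199/nunchaku.git
cd nunchaku
git submodule update --init --recursive
pip install build
# --no-isolation builds against the packages already installed in this
# environment, so install PyTorch and the other build dependencies first
NUNCHAKU_BUILD_WHEELS=1 python -m build --wheel --no-isolation
pip install dist/nunchaku-*.whl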
```
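After installing the wheel, a quick import check can confirm that the packages used in the Quick Start are visible to your interpreter. This is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical and not part of Nunchaku.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["torch", "diffusers", "nunchaku"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages can be located. It does not verify that the compiled CUDA extension matches your GPU or PyTorch version.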
### Quick Start
```python
import torch
from diffusers.pipelines.ernie_image.pipeline_ernie_image import ErnieImagePipeline

from nunchaku import NunchakuErnieImageTransformer2DModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detect: "int4" or "fp4", depending on your GPU
rank = 64

# Load the W4A4-quantized transformer weights from the Hugging Face repo.
transformer = NunchakuErnieImageTransformer2DModel.from_pretrained(
    f"ZJMuYun97/ERNIE-Image-Nunchaku/svdq-{precision}_r{rank}-ernie-image.safetensors",
    torch_dtype=torch.bfloat16,
    device="cuda",
)

# Build the pipeline around the quantized transformer.
# Passing None for pe / pe_tokenizer skips loading those components.
pipe = ErnieImagePipeline.from_pretrained(
    "baidu/ERNIE-Image-Turbo",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    pe=None,
    pe_tokenizer=None,
).to("cuda")  # move the rest of the pipeline to the GPU as well

image = pipe(
    prompt="a cute orange cat sitting on a sunlit windowsill",
    height=1024,
    width=1024,
    num_inference_steps=8,
    guidance_scale=1.0,
    generator=torch.Generator().manual_seed(42),  # fixed seed for reproducible output
).images[0]
image.save("ernie-image.png")
```
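The checkpoint path in the Quick Start is assembled from the precision and the rank. The sketch below wraps that naming pattern in a small helper that rejects unexpected precision values early. `checkpoint_path` is a hypothetical helper, not part of Nunchaku. Only `rank = 64` appears in this README, so whether other ranks are published is an open assumption.

```python
SUPPORTED_PRECISIONS = ("int4", "fp4")


def checkpoint_path(precision, rank=64, repo="ZJMuYun97/ERNIE-Image-Nunchaku"):
    """Build the checkpoint path following the naming pattern used in the Quick Start."""
    if precision not in SUPPORTED_PRECISIONS:
        raise ValueError(
            f"Unsupported precision {precision!r}; expected one of {SUPPORTED_PRECISIONS}"
        )
    return f"{repo}/svdq-{precision}_r{rank}-ernie-image.safetensors"


print(checkpoint_path("int4"))
# ZJMuYun97/ERNIE-Image-Nunchaku/svdq-int4_r64-ernie-image.safetensors
```

In the Quick Start you would pass the result of `get_precision()` as the first argument instead of a hard-coded string.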
### Performance (Reference)
Tested on a single A800 GPU, 1024×1024 resolution, 8 inference steps:
| Model | Avg Latency | Speedup |
|-------|-------------|---------|
| Original BF16 | 4.89s | 1.0x |
| **Nunchaku W4A4** | **2.81s** | **1.74x** |
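The speedup column is the BF16 latency divided by the W4A4 latency. If you want to reproduce the comparison on your own hardware, the sketch below shows one possible timing harness. The README does not describe how the reference numbers were measured, so the warmup count, run count, and the helper names `average_latency` and `speedup` are all assumptions.

```python
import time


def average_latency(fn, warmup=1, runs=5, sync=None):
    """Average wall-clock seconds per call of `fn`, after `warmup` untimed calls.

    `sync` is an optional callable run before the timer starts and stops.
    For GPU work, pass `torch.cuda.synchronize` so queued kernels are included.
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / runs


def speedup(baseline_seconds, optimized_seconds):
    """Ratio of baseline latency to optimized latency."""
    return baseline_seconds / optimized_seconds


# Check against the table above: 4.89 s (BF16) vs 2.81 s (W4A4).
print(round(speedup(4.89, 2.81), 2))  # 1.74
```

To time the Quick Start pipeline, `fn` could be a zero-argument function that calls `pipe(...)` with the same arguments as above.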
### Notes
- Only `batch_size=1` is supported, which matches the typical inference use case.
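Because only one image can be generated per call, several prompts have to be processed one after another. The following is a minimal sketch of such a loop. `generate_sequentially` and the `generate_one` callback are hypothetical names, and the per-prompt seed scheme is just one reasonable choice.

```python
def generate_sequentially(prompts, generate_one, base_seed=42):
    """Call `generate_one(prompt, seed)` once per prompt and collect the results.

    Each prompt gets its own seed (base_seed + index) so runs are reproducible.
    """
    results = []
    for index, prompt in enumerate(prompts):
        results.append(generate_one(prompt, base_seed + index))
    return results


# With the `pipe` object from the Quick Start, `generate_one` could look like:
#
#   def generate_one(prompt, seed):
#       return pipe(
#           prompt=prompt,
#           height=1024, width=1024,
#           num_inference_steps=8,
#           guidance_scale=1.0,
#           generator=torch.Generator().manual_seed(seed),
#       ).images[0]
```

Keeping the pipeline call behind a callback lets the loop be tested without a GPU.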