
svdq-int4_r64-ernie-image

Name: svdq-int4_r64-ernie-image
Rating: 5 (6 reviews)
Author: huangzj1997306

Updated: May 20, 2026

Tags: base model, svdq

Variants available: 1
File: svdq-int4_r64-ernie-image.safetensors (nf4 SafeTensor, 4-bit normalized, 4.53 GB)
Verified: 2 months ago

Type: Checkpoint Trained
Reviews: Positive (6)
Published: May 20, 2026
Base Model: Ernie
Hash (AutoV2): E9F86F31B1

Creator: huangzj1997306 (joined Apr 6, 2023)

License:

# ERNIE Image Turbo — Nunchaku W4A4 Quantized Inference


---

### Introduction

This project adds **W4A4 quantized inference** (4-bit weights and 4-bit activations) support for [ERNIE Image Turbo](https://huggingface.co/baidu/ERNIE-Image-Turbo) to [**Nunchaku**](https://github.com/nunchaku-ai/nunchaku). It lowers inference latency (1.74x faster than BF16 in the reference benchmark on this page) and reduces memory use, with minimal quality loss.

Built on [Nunchaku](https://github.com/nunchaku-ai/nunchaku). We gratefully acknowledge their excellent work on efficient diffusion model inference.


### Installation

```bash
# This fork adds ERNIE Image support to Nunchaku
git clone https://github.com/Hzj199/nunchaku.git
cd nunchaku
git submodule update --init --recursive

pip install build
# --no-isolation builds in the current environment, so PyTorch must already be installed there
NUNCHAKU_BUILD_WHEELS=1 python -m build --wheel --no-isolation
pip install dist/nunchaku-*.whl
```
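
After installing the wheel, a quick import check can catch a broken environment before any model weights are downloaded. The following is a minimal sketch, not part of the fork: `is_installed` is a hypothetical helper name, and it only confirms that each package can be located, not that the CUDA kernels were built correctly.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be located on the import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Packages imported by the Quick Start script.
    for name in ("torch", "diffusers", "nunchaku"):
        status = "found" if is_installed(name) else "MISSING"
        print(f"{name}: {status}")
```

If `nunchaku` is reported as missing, the wheel was most likely installed into a different Python environment than the one running the check.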

### Quick Start

```python
import torch
from diffusers.pipelines.ernie_image.pipeline_ernie_image import ErnieImagePipeline
from nunchaku import NunchakuErnieImageTransformer2DModel
from nunchaku.utils import get_precision

precision = get_precision()  # auto-detect: "int4" or "fp4"
rank = 64  # matches the "r64" in the checkpoint file name

# Load the W4A4-quantized transformer.
transformer = NunchakuErnieImageTransformer2DModel.from_pretrained(
    f"ZJMuYun97/ERNIE-Image-Nunchaku/svdq-{precision}_r{rank}-ernie-image.safetensors",
    torch_dtype=torch.bfloat16,
    device="cuda",
)

# Build the pipeline around the quantized transformer.
# pe and pe_tokenizer are passed as None, so those components are not loaded.
pipe = ErnieImagePipeline.from_pretrained(
    "baidu/ERNIE-Image-Turbo",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    pe=None, pe_tokenizer=None,
)

# Generate a single 1024x1024 image in 8 steps with a fixed seed.
image = pipe(
    prompt="a cute orange cat sitting on a sunlit windowsill",
    height=1024, width=1024,
    num_inference_steps=8,
    guidance_scale=1.0,
    generator=torch.Generator().manual_seed(42),
).images[0]
image.save("ernie-image.png")
```
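
The checkpoint path in the Quick Start is assembled from the detected precision and the rank. The sketch below isolates that naming pattern so a wrong value fails early. `checkpoint_path` and the two constants are hypothetical names introduced here for illustration. Only the `int4`, rank 64 file is listed on this page, so whether other combinations exist in the repository is an assumption to verify.

```python
REPO_ID = "ZJMuYun97/ERNIE-Image-Nunchaku"  # repository used in the Quick Start
KNOWN_PRECISIONS = ("int4", "fp4")  # the two values the Quick Start comment mentions


def checkpoint_path(precision: str, rank: int = 64) -> str:
    """Build the checkpoint path string following the Quick Start naming pattern."""
    if precision not in KNOWN_PRECISIONS:
        raise ValueError(f"unexpected precision {precision!r}; expected one of {KNOWN_PRECISIONS}")
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    return f"{REPO_ID}/svdq-{precision}_r{rank}-ernie-image.safetensors"


print(checkpoint_path("int4"))
# ZJMuYun97/ERNIE-Image-Nunchaku/svdq-int4_r64-ernie-image.safetensors
```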

### Performance (Reference)

Tested on a single A800 GPU, 1024×1024 resolution, 8 inference steps:

| Model | Avg Latency | Speedup |
|-------|-------------|---------|
| Original BF16 | 4.89s | 1.0x |
| **Nunchaku W4A4** | **2.81s** | **1.74x** |
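
The speedup column is the ratio of the two average latencies. The sketch below shows one way such numbers could be collected and how the ratio is computed. It is not the author's benchmark script: `average_latency` and `speedup` are hypothetical helpers, and the warmup and run counts are arbitrary defaults.

```python
import time
from statistics import mean
from typing import Callable


def average_latency(run_once: Callable[[], object], warmup: int = 1, runs: int = 5) -> float:
    """Average wall-clock seconds per call of run_once, after untimed warmup calls."""
    for _ in range(warmup):
        run_once()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # For GPU work, run_once should finish all device work before returning
        # (for example by calling torch.cuda.synchronize()).
        run_once()
        timings.append(time.perf_counter() - start)
    return mean(timings)


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Speedup factor of the optimized run relative to the baseline."""
    return baseline_s / optimized_s


# Ratio of the two latencies reported in the table.
print(f"{speedup(4.89, 2.81):.2f}x")  # 1.74x
```

In practice `run_once` would wrap a single `pipe(...)` call with the settings from the Quick Start.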

### Notes

- Only `batch_size=1` is supported, which matches the typical inference use case.
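
Because batching is not supported, several prompts have to be processed one call at a time. The following is a minimal sketch of that loop. `generate_one_by_one` is a hypothetical helper, and the per-prompt seed offset is an arbitrary choice for reproducibility, not something the fork requires.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def generate_one_by_one(
    generate: Callable[[str, int], T],
    prompts: Sequence[str],
    base_seed: int = 42,
) -> List[T]:
    """Call generate(prompt, seed) once per prompt instead of passing a batch."""
    results = []
    for offset, prompt in enumerate(prompts):
        results.append(generate(prompt, base_seed + offset))
    return results


# With the Quick Start pipeline, generate could be defined as (assumes pipe and torch exist):
# def generate(prompt, seed):
#     return pipe(
#         prompt=prompt, height=1024, width=1024,
#         num_inference_steps=8, guidance_scale=1.0,
#         generator=torch.Generator().manual_seed(seed),
#     ).images[0]

if __name__ == "__main__":
    # Stand-in generator so the loop can be exercised without a GPU.
    def fake(prompt, seed):
        return f"{prompt} (seed {seed})"

    print(generate_one_by_one(fake, ["a cat", "a dog"]))
```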