
SD1.5 Direct Preference Optimization - DPO


Updated: Dec 23, 2023

base model

basemodel dpo

Download (1.99 GB)

Verified: 2 years ago

SafeTensor

This checkpoint recommends a VAE; download one and place it in the VAE folder.

Details

Type	Checkpoint Trained
Stats	636 1
Reviews	Positive (37)
Published	Dec 22, 2023
Base Model	SD 1.5
Training	Steps: 2,000
Usage Tips	Clip Skip: 1
Hash	AutoV2 D294A157D5

3 Files


pyn

License:

CreativeML Open RAIL-M Addendum

Not my model; it comes from the Hugging Face repos linked below. This is an excellent merge model, particularly in the middle blocks. Try it yourself: take your favorite model and block merge this one in at about 10% for the input blocks and 20% for the middle block, then adjust from there.
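For anyone who wants to see what a block merge at those ratios amounts to, here is a minimal sketch, not the exact procedure of any particular merge tool. It assumes both checkpoints use the original SD 1.5 key layout, where U-Net weights live under `model.diffusion_model.input_blocks.`, `middle_block.` and `output_blocks.`. The helper names (`block_alpha`, `block_merge`) are made up for illustration, and the 0% output-block ratio is my assumption, since only the input and middle ratios are given above. Plain floats stand in for tensors so the example runs without any dependencies.

```python
UNET_PREFIX = "model.diffusion_model."

# Share of the DPO model mixed into each U-Net section.
# Input and middle follow the suggestion above; output at 0.0 is an assumption.
ALPHAS = {
    "input_blocks.": 0.10,
    "middle_block.": 0.20,
    "output_blocks.": 0.0,
}


def block_alpha(key, alphas):
    """Return the merge ratio for one state-dict key (0.0 keeps the base weight)."""
    if not key.startswith(UNET_PREFIX):
        return 0.0  # text encoder, VAE, etc. are left untouched
    rest = key[len(UNET_PREFIX):]
    for block_prefix, alpha in alphas.items():
        if rest.startswith(block_prefix):
            return alpha
    return 0.0  # e.g. time_embed and the final out layer


def block_merge(base, other, alphas):
    """Weighted sum per block: (1 - alpha) * base + alpha * other."""
    merged = {}
    for key, weight in base.items():
        alpha = block_alpha(key, alphas)
        if alpha == 0.0 or key not in other:
            merged[key] = weight
        else:
            merged[key] = (1.0 - alpha) * weight + alpha * other[key]
    return merged


# Toy state dicts: floats stand in for weight tensors.
favorite = {
    "model.diffusion_model.input_blocks.0.0.weight": 1.0,
    "model.diffusion_model.middle_block.0.in_layers.2.weight": 1.0,
    "model.diffusion_model.output_blocks.0.0.in_layers.2.weight": 1.0,
    "first_stage_model.decoder.conv_in.weight": 1.0,
}
dpo = {key: 2.0 for key in favorite}

merged = block_merge(favorite, dpo, ALPHAS)
```

The same arithmetic works on torch tensors, so with real files the two dicts could come from `safetensors.torch.load_file` and the result could be written back with `safetensors.torch.save_file`. Most merge UIs expose finer per-layer sliders; this sketch only splits by the three top-level sections.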

Original U-Net: https://huggingface.co/mhdang/dpo-sd1.5-text2image-v1

bdsqlsz's release: https://huggingface.co/bdsqlsz/dpo-sd-text2image-v1-fp16

bdsqlsz released the SDXL model here: https://civitai.com/models/237681/dpo-sdxl-fp16 but us poor 1.5 users were left in the dark ages.

I had to do some hacking to get the fp32 version, so you will have to bring your own VAE.

Diffusion Model Alignment Using Direct Preference Optimization

Direct Preference Optimization (DPO) for text-to-image diffusion models is a method for aligning diffusion models with human preferences by optimizing directly on human comparison data. For details, see the paper Diffusion Model Alignment Using Direct Preference Optimization.
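To make "optimizing directly on human comparison data" a bit more concrete, here is a rough sketch of a per-pair loss in the spirit of the paper's objective, as I understand it. It takes the squared noise-prediction errors of the model being trained and of a frozen reference model, on both the preferred ("winner") and rejected ("loser") image of a pair, and rewards the trained model for improving more on the winner than on the loser. The function name, the scalar inputs, and folding any timestep weighting into a single `beta` are simplifying assumptions; this is not the authors' training code.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def diffusion_dpo_loss(err_w_model, err_w_ref, err_l_model, err_l_ref, beta):
    """Per-pair preference loss from squared noise-prediction errors.

    err_w_* : error on the preferred image (trained model / frozen reference)
    err_l_* : error on the rejected image (trained model / frozen reference)
    beta    : strength of the preference signal (assumed to absorb any
              timestep weighting)
    """
    # How much worse the trained model is on the winner than on the loser ...
    model_diff = err_w_model - err_l_model
    # ... compared with the same gap for the reference model.
    ref_diff = err_w_ref - err_l_ref
    # -log(sigmoid(-beta * (model_diff - ref_diff))) written as a softplus.
    return softplus(beta * (model_diff - ref_diff))


# When the trained model equals the reference, the loss is log(2).
neutral = diffusion_dpo_loss(1.0, 1.0, 1.0, 1.0, beta=1.0)

# Lower error on the winner and higher error on the loser reduces the loss.
improved = diffusion_dpo_loss(0.5, 1.0, 1.5, 1.0, beta=1.0)
```

In actual training the errors would be mean squared differences between the true and predicted noise at a sampled timestep, computed on batches of image pairs from the preference dataset.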

The SD1.5 model is fine-tuned from stable-diffusion-v1-5 on the offline human preference dataset pickapic_v2.

The SDXL model is fine-tuned from stable-diffusion-xl-base-1.0 on the same offline human preference dataset, pickapic_v2.