Pickapicv2 dataset
https://huggingface.co/datasets/yuvalkirstain/pickapic_v2
A 10k-image subset was collected from the Pickapicv2 training split, keeping only images with an HPSv2 score above 2.8.
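The score filter described above can be sketched roughly as follows. This is only an illustration, not the code used to build the subset. It assumes each row is a dict with the columns `caption`, `image_0_url`, `image_1_url`, `label_0` and `label_1`, that ties carry equal labels, and that the score is computed on the preferred image. `score_fn` is a hypothetical stand-in for whatever HPSv2 scoring call is used.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

# Threshold quoted in the notes; its scale depends on the scorer used.
SCORE_THRESHOLD = 2.8


def pick_pair(row: Dict) -> Optional[Tuple[str, str]]:
    """Return (preferred_url, rejected_url), or None when the row is a tie.

    Column names are assumptions about the dataset schema.
    """
    if row["label_0"] == row["label_1"]:
        return None
    if row["label_0"] > row["label_1"]:
        return row["image_0_url"], row["image_1_url"]
    return row["image_1_url"], row["image_0_url"]


def filter_rows(
    rows: Iterable[Dict],
    score_fn: Callable[[str, str], float],
    threshold: float = SCORE_THRESHOLD,
) -> Iterator[Dict]:
    """Yield pairs whose preferred image scores above the threshold.

    score_fn(image_url, caption) is a hypothetical scorer supplied by the caller.
    """
    for row in rows:
        pair = pick_pair(row)
        if pair is None:
            continue  # skip ties, there is no clear high/low image
        preferred, rejected = pair
        if score_fn(preferred, row["caption"]) > threshold:
            yield {"caption": row["caption"], "high": preferred, "low": rejected}
```

Passing the scorer in as a function keeps the sketch independent of any particular HPSv2 package and makes it easy to try a different threshold.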
Download script
https://github.com/lrzjason/DownloadPickAPic/tree/main
Asynchronous download
Can be paused at any time
Includes an HPSv2 score filter
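One way to combine asynchronous downloading with the ability to stop at any time is to record every finished URL in a small state file and skip those URLs on the next run. The sketch below shows that idea with the standard library only. It is not taken from the linked script: `download_all`, the state file format, and the `fetch` coroutine (which would do the actual HTTP request and save the image) are all hypothetical.

```python
import asyncio
import json
from pathlib import Path
from typing import Awaitable, Callable, Iterable, Set


def load_done(state_file: Path) -> Set[str]:
    """Read the set of already-downloaded URLs, or an empty set on first run."""
    if state_file.exists():
        return set(json.loads(state_file.read_text()))
    return set()


def save_done(state_file: Path, done: Set[str]) -> None:
    """Persist the finished URLs so an interrupted run can pick up later."""
    state_file.write_text(json.dumps(sorted(done)))


async def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[None]],
    state_file: Path,
    max_concurrency: int = 8,
) -> Set[str]:
    """Fetch every URL not yet recorded in state_file, a few at a time."""
    done = load_done(state_file)
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url: str) -> None:
        async with semaphore:  # cap the number of downloads in flight
            await fetch(url)
        done.add(url)
        # Rewriting the file after each item is simple but slow for huge
        # lists; batching the writes would be a reasonable refinement.
        save_done(state_file, done)

    pending = [u for u in urls if u not in done]
    await asyncio.gather(*(worker(u) for u in pending))
    return done
```

Running `asyncio.run(download_all(urls, fetch, Path("done.json")))` again after stopping the process would only request the URLs that are missing from `done.json`.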
Filtering
After downloading the images, I applied the following filtering steps:
Manually removed low-quality images
Corrected the captions
Adjusted the high and low image selection
pickapicv2_filtered_4k (6.21 GB)
https://mega.nz/file/fgsxhbIa#QSNcjVxm4vY2f68PyOzmlIMHQCQOe93EyyFK1rmRkEc
Content
A few NSFW images
Text on signs
Wide compositions
etc.
Training
https://github.com/lrzjason/sliders-image
This modified version takes a captions folder and uses it when training with image pairs.
Only the SDXL part was modified.
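To illustrate how a captions folder can be matched to image pairs, here is a minimal sketch. The folder layout is an assumption and may not match what the repository expects: a `high` folder, a `low` folder, and a `captions` folder of `.txt` files, with the three files of one sample sharing the same file stem. The function names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List, Set

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def index_by_stem(folder: Path, suffixes: Set[str]) -> Dict[str, Path]:
    """Map file stem -> path for files in folder with an accepted suffix."""
    return {
        p.stem: p
        for p in sorted(folder.iterdir())
        if p.suffix.lower() in suffixes
    }


def build_triplets(high_dir: Path, low_dir: Path, captions_dir: Path) -> List[dict]:
    """Collect (high image, low image, caption) samples that share a stem.

    Samples missing any of the three files are skipped (assumed layout).
    """
    high = index_by_stem(high_dir, IMAGE_SUFFIXES)
    low = index_by_stem(low_dir, IMAGE_SUFFIXES)
    captions = index_by_stem(captions_dir, {".txt"})
    shared = sorted(high.keys() & low.keys() & captions.keys())
    return [
        {
            "high": high[stem],
            "low": low[stem],
            "caption": captions[stem].read_text(encoding="utf-8").strip(),
        }
        for stem in shared
    ]
```

Matching on the file stem keeps the pairing explicit, so a missing caption or image drops that sample instead of shifting the alignment of the remaining ones.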