Collected a 10k subset from the pickapicv2 training split, filtered by hpsv2 score (above 2.8).
Can be paused at any time
Includes the hpsv2 filter
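A minimal sketch of how a pausable score filter could look. It is not the actual download script: `filter_resumable`, `score_fn`, and the JSON state file are hypothetical names, and the hpsv2 scoring call is left to the caller. Only the 2.8 threshold and the 10k cap come from the notes above.

```python
import json
from pathlib import Path


def filter_resumable(records, score_fn, state_path, threshold=2.8, limit=10_000):
    """Keep records scoring above `threshold`, saving progress after every item.

    `score_fn` is a caller-supplied stand-in for the hpsv2 scorer (assumption).
    """
    state_file = Path(state_path)
    # Resume from the saved state if an earlier run was interrupted.
    if state_file.exists():
        state = json.loads(state_file.read_text())
    else:
        state = {"next_index": 0, "kept": []}

    for i in range(state["next_index"], len(records)):
        if len(state["kept"]) >= limit:
            break
        if score_fn(records[i]) > threshold:
            state["kept"].append(i)
        state["next_index"] = i + 1
        # Persist after each record so the run can be stopped at any point.
        state_file.write_text(json.dumps(state))

    return [records[i] for i in state["kept"]]
```

Running the function again with the same state file continues from the last saved index instead of rescoring everything.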
After downloading the images, I applied the following filtering:
Manually filtered out low-quality images
Corrected the captions
Adjusted the high and low selection
pickapicv2_filtered_4k (6.21 GB)
a few NSFW
text on sign
This modified version reads captions from a captions folder to train with image pairs.
Only the SDXL part was modified.
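A rough sketch of how captions from a folder might be matched to image pairs. The layout is an assumption, not the script's documented format: one directory of preferred ("high") images, one of rejected ("low") images, and a captions directory with one `.txt` file per pair, all sharing a file stem. `load_pairs` is a hypothetical helper.

```python
from pathlib import Path


def load_pairs(high_dir, low_dir, captions_dir):
    """Return (high_image, low_image, caption) triples matched by file stem.

    Assumed layout (hypothetical): high/0001.png, low/0001.png, captions/0001.txt
    """
    low_by_stem = {p.stem: p for p in Path(low_dir).iterdir() if p.is_file()}
    triples = []
    for high in sorted(Path(high_dir).iterdir()):
        caption_file = Path(captions_dir) / f"{high.stem}.txt"
        # Skip entries that lack a matching low image or caption file.
        if high.is_file() and high.stem in low_by_stem and caption_file.exists():
            caption = caption_file.read_text(encoding="utf-8").strip()
            triples.append((high, low_by_stem[high.stem], caption))
    return triples
```

Pairs with a missing caption or a missing low image are skipped silently here; a real loader may prefer to report them.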