Just testing some ideas with Flux :)
For V4, use pretty much the same settings as V3, with 6-10 steps.
For V3, use Euler with the Beta or Simple scheduler (DPM++ 2M SGM Uniform seems fine too), at 8-10 steps.
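If you script your generations, the settings above can be kept in one place. This is just a rough sketch: the names `RECOMMENDED` and `steps_in_range` are made up for illustration, and the sampler/scheduler split is my reading of the notes above, not something tied to any particular UI.

```python
# Recommended settings from the notes above, as plain data.
# The (sampler, scheduler) pairs are an assumed reading of the notes;
# V4 is assumed to reuse the V3 samplers ("pretty much as V3").
V3_SAMPLERS = [
    ("Euler", "Beta"),
    ("Euler", "Simple"),
    ("DPM++ 2M", "SGM Uniform"),
]

RECOMMENDED = {
    "V3": {"samplers": V3_SAMPLERS, "steps": (8, 10)},
    "V4": {"samplers": V3_SAMPLERS, "steps": (6, 10)},
}


def steps_in_range(version: str, steps: int) -> bool:
    """Return True if `steps` falls inside the suggested range for `version`."""
    low, high = RECOMMENDED[version]["steps"]
    return low <= steps <= high
```

For example, `steps_in_range("V4", 6)` is true, while `steps_in_range("V3", 6)` is false because V3 wants at least 8 steps.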
This is the UNet only, so you also need to download and load the text encoders from https://huggingface.co/lllyasviel/flux_text_encoders/tree/main
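If you prefer to grab the text encoders from a script instead of the browser, something like this should work, assuming you have the `huggingface_hub` package installed. The function name `download_text_encoders` and the target folder are placeholders, so point it at wherever your UI expects text encoders.

```python
from pathlib import Path

# Repository linked in the note above.
REPO_ID = "lllyasviel/flux_text_encoders"


def download_text_encoders(target_dir: str) -> str:
    """Download every file in the text encoder repo into `target_dir`.

    Hypothetical helper; requires `pip install huggingface_hub`.
    """
    # Imported lazily so the module loads even without the package installed.
    from huggingface_hub import snapshot_download

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # snapshot_download fetches the whole repo and returns the local folder path.
    return snapshot_download(repo_id=REPO_ID, local_dir=str(target))
```

Calling `download_text_encoders("path/to/your/text_encoders")` pulls the whole repository, which can be a big download, so fetch individual files from the page instead if you only need some of them.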
If you use ComfyUI, this node works for NF4: https://github.com/DenkingOfficial/ComfyUI_UNet_bitsandbytes_NF4