Sharing the workflow I used to make the comparison I posted on Reddit. It's quite demanding to run it all in one go. It works on a 4090 and might work on other GPUs, IDK. You NEED to "unload the models" after each generation, or else (at least on my PC) it fills up the VRAM and gets stuck.
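For anyone curious what "unloading" buys you outside the workflow itself: this is a minimal sketch of the same idea in plain PyTorch, assuming you were scripting the comparison instead of using the ComfyUI nodes. The `load_model` and `model(prompt)` calls are placeholders, not real APIs from any of these repos.

```python
# Rough sketch only: load one model at a time, generate, then free VRAM
# before the next model. load_model / model(prompt) are hypothetical stand-ins.
import gc
import torch

def generate_and_unload(load_model, prompt):
    model = load_model()            # load a single model's weights onto the GPU
    image = model(prompt)           # run one generation (placeholder call)
    del model                       # drop the reference to the weights
    gc.collect()                    # make sure Python actually releases them
    torch.cuda.empty_cache()        # return the freed memory to the driver
    return image
```

The ComfyUI "unload models" step does the equivalent for you; skipping it means every model's weights stay resident and the VRAM eventually runs out.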
Lumina is the most demanding: even with flash_attn installed, I only get 2.5 it/s. It also doesn't accept a lot of resolutions.
I won't write up the install steps right now; check each model's respective GitHub page. But if anyone needs help, I can assist in the comments.
I updated the file with a more polished version of the workflow. Now you can toggle which models you want to generate with and which ones you don't.