Stats | 413 |
Reviews | (40) |
Published | Dec 20, 2024 |
Hash | AutoV2 D2E7423A12 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Check out my SDXL Models:
Stylized model: RAYBURN
Realistic model: RAYMNANTS
Painterly model: RAYCTIFIER
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Introducing RAYFLUX1.0
RAYFLUX is my attempt at bringing the learnings from my SDXL models to Flux1.D. It has been trained on some of the same images RAYMNANTS was trained on, plus synthetic outputs from my 3 models. The main method has been block-by-block selective merging of LoRAs I've trained on the aforementioned assets.
Somewhere along the way I ran a couple of tests and decided to settle on fp8 as my weapon of choice. It performs virtually the same as the fp16 version, just faster and with a smaller memory footprint.
All in all, while it does still suffer from the Flux chin disease, it's a pretty versatile model I'm relatively happy with and that is relatively easy to prompt for.
Hope you'll enjoy,
R.
HOW TO PROMPT
Be methodical!
What I found works well is a sentence or two describing the media or type of asset before even starting to describe the subject. After the subject, you can end with additional details about the rendering, e.g. what makes it special? Is it dirty, grainy, worn down, etc.?
Quick tip: Separate your different prompt elements with a line break for more clarity and ease of reuse. Flux doesn't mind at all, and you can easily isolate things you want to reuse (like the media description) for easier tweaking later.
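If you build prompts from a script, that layout can be sketched roughly like this. The helper name build_prompt and its parameters are hypothetical, just one way to keep the media description reusable; the text is the Ford GT40 prompt from the EXAMPLES section.

```python
def build_prompt(media, subject, details=""):
    """Join prompt elements with line breaks, skipping empty ones."""
    parts = [media, subject, details]
    return "\n".join(p.strip() for p in parts if p and p.strip())


# Reusable media description: keep it in one place and tweak it later.
MEDIA = "A cinematic movie shot, shot with a 35mm camera, filmic grain, vibrant colors."

prompt = build_prompt(
    MEDIA,
    "A ford GT40 racing car parked in a dark alleyway at night.",
    "The shot looks epic, directly from a high production value blockbuster sci-fi movie.",
)
print(prompt)
```

Swapping MEDIA for another media description (say, the 1990s flash photography one) changes the look while the subject lines stay untouched.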
EXAMPLES
A cinematic movie shot, shot with a 35mm camera, filmic grain, vibrant colors.
A ford GT40 racing car parked in a dark alleyway at night.
The car is bright glossy orange, with black stripes on the hood and doors. The interior is black leather.
The shot looks epic, directly from a high production value blockbuster sci-fi movie.
1990s style analog flash photography.
A space hovertank emerging from behind a dune on the Planet Arrakis. It is black, with an angular shape, and a massive laser cannon mounted on its turret. The trapezoidal-shaped mouth of the cannon is lighting up in purple as the hovertank readies to fire. The hover engines of the hovertank send sand flying in its trail. In the background, a bright, perfectly blue sky. The camera angle is slanted, seen from below, in a very dynamic fashion. There's some motion blur and sand particles flying around.
It looks like it has been taken with a cheap disposable camera, with visible film grain and slight lens distortion.
1980s-1990s TV series, retro broadcast aesthetic. High-contrast lighting with dramatic shadows. Soft grainy textures with filmic imperfections, giving the visuals a vintage, analog feel. Dynamic camera angles. The look is gritty yet colorful, capturing the essence of 80s and 90s TV's blend of excitement and bold visual design.
An anthropomorphic brown dog with black ears and tired eyes wearing a very small brown hat sitting at a table in front of a yellow mug of coffee. The whole room is on fire, flames licking the wall dramatically and heavy black smoke accumulating on the ceiling. There's a text bubble next to the man/dog that says "THIS IS FINE".
SETTINGS
RAYFLUX can work with a wide range of settings, but I found one unusual combo that works best across all styles.
Keep both max shift and base shift at 1.0
Flux conditioning stays at 3.5
Heun Beta at 10 or 14 steps (yes, it's a fast model). This is the most important setting. Other combos might work, but I found Heun Beta to be consistently the best choice in 99% of cases. It's literally the first time I've actually gotten good mileage out of this combo, but here you go.
I load the model with weight types fp8_e5m2, but e4m3fn or e4m3fn_fast work too.
My favorite encoder combo for this model is t5xxl_fp8_e4m3fn_scaled and ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF. I use the Text version, but the Smooth version is great too!
I usually do a second pass with an SD upscale at 2 steps with denoise at 0.3. I switch the upscaling model depending on the style I'm looking for, e.g. 1x_ITF_SkinDiffDetail_lite for grainy portraits (works great at x2 too!), 4xUltrasharp for generic illustration, 4x_Foolhardy_Remacri for a softer, cleaner digital illustration style. The prompt you use on the SD upscale is very important with Flux in general, so don't hesitate to abuse it to nudge the upscale in the direction you want (e.g. write grainy for more grain, soft for less grain, add details about your protagonist, etc.).
Quick tip: I recommend sticking to my default settings, as I ran a lot of grids before zeroing in on them. But if you find a seed that's almost what you want but has some imperfections, or you want to squeeze some more detail out, just play around with the max and base shift values. Higher shift tends to be noisier but also more detailed; lower shift gives you softer results.
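For the curious, here is a rough sketch of what the two shift values do. It is based on my reading of how ComfyUI's ModelSamplingFlux node interpolates the shift from the image size; the function names and the constants (256 and 4096 tokens, 8x downscale, 2x2 patches) are assumptions for illustration, not something taken from the RAYFLUX workflow itself.

```python
import math


def flux_shift(width, height, base_shift=1.0, max_shift=1.0):
    """Interpolate the shift (mu) linearly with the number of image tokens."""
    x1, x2 = 256, 4096  # assumed token counts where base_shift / max_shift apply
    slope = (max_shift - base_shift) / (x2 - x1)
    intercept = base_shift - slope * x1
    tokens = width * height / (8 * 8 * 2 * 2)  # assumed 8x latent downscale, 2x2 patches
    return tokens * slope + intercept


def shifted_sigma(t, mu):
    """Shift a timestep t in (0, 1]; a larger mu keeps sigma closer to 1."""
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0))


# With both values at 1.0 the shift no longer depends on the resolution.
for size in (512, 1024, 1536):
    print(size, flux_shift(size, size))

# Raising the shift keeps the noise level higher at the same point of the schedule.
print(shifted_sigma(0.5, 1.0), shifted_sigma(0.5, 1.5))
```

Under these assumptions, keeping base and max shift equal means every resolution gets the same schedule, and raising them keeps more noise around for longer, which lines up with the noisier but more detailed results described above.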
COMFY WORKFLOW
Here's a very simple ComfyUI workflow with SD upscale I made for RAYFLUX. Just drag and drop the image in your Comfy and get creative! It uses a few custom nodes but the Comfy Manager should handle that for you.
KNOWN ISSUES/QUIRKS
With the recommended settings, the outputs are a bit unsharp, but I found that this plus a 2-step SD upscale is just the best combo.
Not very good at nudes and below the belt anatomy, but that's default Flux behavior for you.
I couldn't get rid of the iconic Flux cleft chin that creeps into portraits of women. You can prompt your way out of it, but still.