
Tenofas FLUX Modular Workflow - User guide

FLUX – A guide to my FLUX modular Workflow

Last update: October 25th, 2024 (updated to v.4.2)

FLUX came out on August 1st, and it was unexpected.
It was incredible. Everybody was on it: testing, writing about it, trying to understand how it works and how to make it as usable as SD 1.5 or SDXL.

At the beginning it seemed impossible to have LoRAs or ControlNets, or to make it light enough to run on older (or smaller) computers.
Then someone decided to try…

“Everyone knew it was impossible, until a fool who didn’t know came along and did it.” — Albert Einstein.

So, LoRA’s arrived, Controlnets too and also small and light checkpoints, in place of the 24Gb Unet files, that could be run on a PC that did not have a Rtx 4090 with 64Gb Ram on it. Flux was available to everyone, and you could do anything with it!

I had never used ComfyUI before, but on August 1st that was the only way to try FLUX. I had to install Comfy and start learning how to use it. Now I would never go back!

I started to create a nice and easy workflow for FLUX, and then I added little things like a LoRA manager, an img2img prompt generator, an upscaler, a FaceDetailer… after a few days it was a huge workflow. This post is meant to be a short guide to the latest version (v.4.2) of my workflow. At the end of this post you can find the files you need to run this workflow and the links for downloading them.

Please, before using the workflow, make sure you update ComfyUI and all the Custom Nodes that are used in the workflow. This will avoid many possible errors.

Modular FLUX workflow v.4.2

My complete ComfyUI workflow looks like this:

You have several groups of nodes, that I call “Modules”, with different colors that indicate different activities in the workflow. I will go into details of each module later on. For now, I want to give a quick overview of the workflow. 

The yellow nodes you can find around the workflow are just instructions and links to the files you need to run the workflow. Then you have the blue group, the main core of the workflow, where FLUX is loaded and you can set up its parameters. On the left of the workflow there is a red group: these are the switches and selectors for the various modules. Here you choose what kind of prompt you want to use (more on this later) and whether to activate the various modules: Latent Noise Injector, ADetailer (if you are not generating portraits of human beings, just turn this off, as it won't work and it will waste time and memory), FaceSwap, Expression Editor, the Upscaler (same thing: if you are not planning to upscale your images, keep this turned off) and the Post-Processor.

Next to the main core group, the blue one, there is a bright-green group with a strange name: “Latent space magic dimension”. This is where all the magic of AI image generation takes place. The dark-green nodes are the output nodes: these will save the generated images and allow you to compare the various outputs.

On the top of the workflow there are three orange groups (and also, with different colors, the Inpaint and FaceSwap modules, which are also input groups). These are the prompt options to use if you don't want the classic txt2img prompt ("Input 1") in the core section of the workflow. "Input 2" is an img2img prompt generator that uses the Florence-2 model to convert an uploaded image into a text prompt (Input 2 on the prompt selector). "Input 3" is the LLM prompt generator: just write a short instruction or a few keywords, and the LLM will generate a colloquial-English prompt (Input 3 on the prompt selector); chained to this group there is a Portrait Master module to help you generate the keywords for the LLM prompt if you want to create a portrait image. "Input 4" allows you to batch many different prompts and generate them all with one click.
To the right of the core modules of the workflow there is the Latent Noise Injection module, followed on the right by the new ADetailer module (which will try to enhance and add details to faces, eyes and hands. Feet detailers do not work well yet, but I keep an eye on the web to see if a valid detector comes out).

Right below the Latent Noise module there is the new Expression Editor module. This tool is mostly used for portrait images. Then, to the far right, we have the Upscaler and, at the bottom right, the last module: the Post-Processing module.

Let’s see each part of the workflow in more detail.

The Core groups

My workflow uses the original model files released on August 1st by Black Forest Labs (the developer of FLUX). I use the Dev version, but you could also run the Schnell one. This group is the core of the workflow for generating FLUX images.

You can also use the GGUF FLUX models, which are more lightweight and come in different sizes (my suggestion: Q8 for top quality, Q6_K or Q4_K for low-VRAM GPUs).

Warning: if you are not going to download and use the FLUX GGUF model files, you have to remove the "Load FLUX GGUF Model" and the "Model Switch" nodes, and connect the "Load FLUX original Model" directly to the "FLUX LoRA's Loader" node.

The red node “Model Switch” allows you to choose the model (original Unet or GGUF) you want to use.

You have the basic txt2img prompt node (the orange one), then the blue group that loads the FLUX model and lets you choose the settings.

You can set the image size in the “Basic Image size” node; the sizes are preset to the SDXL standards for better results (apparently they are also the best combination for FLUX). You can set the Flux Guidance (usually set at 3.5; the higher the value, the higher the adherence to the prompt), the Sampler/Scheduler (there are a few suggestions you can try in a note node), the Steps and the Seed (it's set to random seed, but you can select fixed if you want to). Important note about the Seed: the Seed node here controls all the seeds in the workflow, so once you set it to “fixed” the workflow will set it to fixed everywhere. This is very important, as it allows you to modify just limited parts of the workflow without regenerating the whole image from scratch every time you change a setting in one of the modules you are using.
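
For reference, these are the commonly used SDXL preset resolutions, all close to one megapixel (the exact list in the “Basic Image size” node may differ slightly):

```python
# Commonly used SDXL preset resolutions (width, height), all close to one
# megapixel; the exact list in the "Basic Image size" node may differ slightly.
SDXL_RESOLUTIONS = [
    (1024, 1024),  # 1:1
    (1152, 896),   # landscape
    (896, 1152),   # portrait
    (1216, 832),   # landscape
    (832, 1216),   # portrait
    (1344, 768),   # landscape
    (768, 1344),   # portrait
    (1536, 640),   # wide landscape
    (640, 1536),   # tall portrait
]
```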

In this version I also added the “Sigma Modifier” node, ranging from 0.950 to 1.050 (default 1.000), which lets you tweak the starting noise level and how each step removes it. Values below 1.000 may increase the details. Remember that changing this setting even by 0.001 may drastically change your image.
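
As a rough illustration of what the modifier does (this is conceptual pseudo-code, not the node's actual implementation, and the sigma values are made up):

```python
# Conceptual sketch of a sigma multiplier (not the node's actual code, and
# the sigma values below are made up): every sigma in the sampler's noise
# schedule is scaled, which is why tiny changes can alter the image a lot.

def apply_sigma_modifier(sigmas: list[float], modifier: float = 1.000) -> list[float]:
    """Multiply each sigma of the schedule by the modifier (0.950-1.050)."""
    return [s * modifier for s in sigmas]

base_sigmas = [14.61, 7.80, 3.50, 1.20, 0.30, 0.0]   # toy schedule, high noise to zero
print(apply_sigma_modifier(base_sigmas, 0.999))       # slightly less noise at every step
```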

Right after the model loader nodes there is the FLUX LoRA's Loader node. Here you can select the LoRAs you want to use (you can have multiple LoRAs active at once, but I suggest not activating too many, as it may slow down generation and cause OOM problems in the workflow), set each LoRA's strength, and turn each one on or off.

The FLUX LoRA's Loader allows you to retrieve all the information you need from CIVITAI: just right-click on the LoRA you want more info about and select "Show Info", and a lot of information will be available to you.

Warning: the first time you start a generation, the workflow needs to load the UNet, CLIP and VAE files, so it will take a few minutes and it will look “stuck” at the first nodes. The speed depends on which UNet weight and CLIP you are using and on the VRAM/RAM of your computer.

Then we have the bright green group, the Latent space where the image is generated. 

Right after this there is a dark green group, “Save FLUX image with Metadata”. Here you have the "Image Overlay Text" node, very useful for testing your image generation: you can write the description of the image you are generating in the text box, and the text will be added onto the image. Leave the text box empty if you don't need this.

Right below the overlay text, there is a “Create Extra Metadata” node that allows you to add specific metadata to the image: in the example above you can see how I set these metadata for my images.  

Last, there is the Save Image node, which saves the image with its metadata. This means the image will contain all the generation metadata (prompt, scheduler/sampler, seed, steps...) and any extra metadata you set.
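
If you want to double-check what got embedded, you can inspect the PNG text chunks with Pillow; a minimal sketch (the exact keys, such as "prompt" or "workflow", depend on the save node and on the extra metadata you configured):

```python
# Minimal sketch: inspect the metadata a ComfyUI-style save node embeds as
# PNG text chunks. The exact keys ("prompt", "workflow", your extra fields)
# depend on the save node and on the extra metadata you configured.
from PIL import Image

img = Image.open("ComfyUI_00001_.png")        # example file name
for key, value in img.info.items():
    print(f"{key}: {str(value)[:120]}")       # truncate the long JSON blobs
```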

Available Prompts

This workflow allows you to use different prompting methods. The first is the classic txt2img prompt: you just write a description of what your image should look like and generate the image. Remember to use descriptive text, as the FLUX text encoders understand natural language very well and will give you better results. You can also try writing prompts in other languages: for example, Italian works fine (and French, German and Spanish should also work, as far as I know). To select this prompt method, set all the other prompt modules to "off" in the red Prompt Selector.

The second method is an img2img prompt generator that uses the Florence-2 model. Just upload an image and the model will generate a text describing the uploaded image in detail. There are many Florence-2 models you can choose from, but my advice is to use the "MiaoshouAI/Florence-2-large-PromptGen-v1.5" model, as it gives the best results.
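
Inside ComfyUI the Florence-2 custom nodes handle all of this, but if you're curious, this is roughly what the captioning step looks like with the transformers library (the task token and generation settings below are just an example, not the node's exact configuration):

```python
# Rough sketch of what the Florence-2 captioning step does, using the
# transformers library directly. The custom node handles this inside ComfyUI;
# the task token and generation settings below are just examples.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "MiaoshouAI/Florence-2-large-PromptGen-v1.5"
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, trust_remote_code=True
).to("cuda")
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("reference.jpg").convert("RGB")   # the image you would upload
task = "<MORE_DETAILED_CAPTION>"                     # example task token

inputs = processor(text=task, images=image, return_tensors="pt").to("cuda", torch.float16)
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=512,
    num_beams=3,
)
raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
caption = processor.post_process_generation(raw, task=task, image_size=image.size)
print(caption[task])   # this text becomes the Input 2 prompt
```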

You can also use a small upscaler to upscale the latent, and then you should set the img2img denoise: the range for best results is between 0.30 and 0.90; I suggest staying between 0.40 and 0.60. The lower the value, the less the final image will be changed according to the text prompt (this is why you could copy the prompt proposed by Florence-2 into the “Modified Input 2 Text Prompt” node and modify the parts you want to change), to the LoRAs and to the settings in the blue group.

I also tried using JoyCaption img2img nodes but, unfortunately, JoyCaption uses an extremely large model and on a 16 GB VRAM GPU it always goes out of memory, so I decided not to add JoyCaption to my workflow, at least for the moment.

The third method (for those of you who can't write good English, or who are just lazy like me) uses an LLM (Groq is suggested, as it's free at the moment, but you could also use OpenAI). You will need an API key (free on Groq) and it has to be saved in Comfy using a specific node (TaraApiKeySaver, the bright red node in the image), which you can remove once the key is saved: just open the node, insert the key, launch a generation and you are done.

Write your instructions in the bottom-left node; it can be a brief description of the image you want or just a few keywords. You can try changing the LLM settings (temperature can be set between 0.0 and 2.0, never above 2.0 or you will get errors). More about these LLM nodes on GitHub: Tara – ComfyUI Node for LLM Integration.

My suggestion for the LLM settings: use the mixtral-8x7b model on Groq, temperature 1.0, max tokens 512-1024. In my tests, these gave the best results.
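
If you're curious what this prompt expansion amounts to, here is a rough equivalent using Groq's Python client directly; the system prompt, model id string and example keywords are my own assumptions, since inside ComfyUI the Tara nodes do this for you:

```python
# Rough equivalent of the LLM prompt-expansion step, using the Groq Python
# client directly (pip install groq). Inside ComfyUI the Tara nodes do this
# for you; the system prompt and example keywords here are just assumptions.
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])  # the key you saved with TaraApiKeySaver

response = client.chat.completions.create(
    model="mixtral-8x7b-32768",   # Groq's Mixtral 8x7B model id
    temperature=1.0,              # keep between 0.0 and 2.0
    max_tokens=1024,
    messages=[
        {"role": "system", "content": "Expand the user's keywords into one detailed, "
                                      "natural-English prompt for a text-to-image model."},
        {"role": "user", "content": "red-haired woman, rainy street, neon signs, cinematic"},
    ],
)
print(response.choices[0].message.content)   # this text becomes the Input 3 prompt
```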

Above this group, you will find the Portrait Master module.

This module allows you to generate a token-style (keyword-based) txt2img prompt, as was used with SD 1.5 and SDXL, and then feed it to the LLM prompt generator. You can set many details of the image here, like the description of your base character, his/her skin details, the makeup, and the style and pose of the image.

A fourth method is the batch prompt from a txt file. It's a txt2img batch prompting system: write as many prompts as you want in a .txt file, upload it in the workflow, and with a single click on Queue you will generate all the prompts in the .txt file as a batch.
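
Conceptually, batch prompting boils down to "one queued generation per line of the file". The sketch below shows the same idea done outside the workflow, through ComfyUI's HTTP API; the prompt-node id "6" and the file names are purely hypothetical, so treat it only as an illustration of what the module automates for you:

```python
# Sketch of what batch prompting boils down to: one queued generation per
# line of a .txt file. This version goes through ComfyUI's HTTP API instead
# of the workflow node; the prompt-node id "6" is purely hypothetical, so
# check your own API-format workflow export before trying anything like this.
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"

with open("workflow_api.json", encoding="utf-8") as f:
    workflow = json.load(f)                     # exported with "Save (API Format)"

with open("prompts.txt", encoding="utf-8") as f:
    prompts = [line.strip() for line in f if line.strip()]

for prompt in prompts:
    workflow["6"]["inputs"]["text"] = prompt    # hypothetical txt2img prompt node
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(COMFY_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)                 # queue one generation per line
```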

Inpaint module

Another kind of input module is the Inpaint module. Here you can load the image you want to modify, right-click on it to access the masking window, and draw the area you want to modify. Once you start the generation, the workflow applies the prompt you selected and the LoRA you chose (if any) only to the masked area, trying to blend it with the rest of the image. The Inpaint module does not work with the Latent Noise Injection module: since only the masked area is sent to the latent space, the output would be really terrible.

Workflow Control Center

In the bright red group you have the Workflow Control Center, where you control, with On/Off switches, the modules you want to use. If you turn all the switches off, you get a plain and simple FLUX txt2img generator with LoRAs. About the prompt: in this version the workflow uses the normal txt2img prompt (Input 1) if all the other prompt methods are turned off; otherwise it uses the lowest input number among the methods turned on (so, for example, if you turn on Input 2 and Input 4, it will only use Input 2, the img2img prompt).

Remember: the order of the modules is fixed: the workflow first applies the lowest module number, then steps to the following one. So you can't apply ADetailer (module 08) before Latent Noise (module 06) and/or Expression (module 07). But you can skip those two if you don't need them, so the image goes from the Core module directly to the ADetailer. On the other hand, if you turn on Expression (module 07) and ADetailer (module 08), the workflow first executes Expression and then ADetailer; if you turn on ADetailer, Latent Noise and Expression, the workflow first applies the Latent Noise module (06), then the Expression module (07) and last the ADetailer (08).

Remember also not to turn on more than one Prompt module. If all are turned off, the classic txt2img prompt node (Input 1) will be used by the workflow.
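
To make the routing explicit, here is the same logic in plain Python. This is only an illustration of the behavior described above, not actual workflow code, and the module names are abbreviated:

```python
# Plain-Python illustration of the Control Center routing described above
# (not actual workflow code; module names are abbreviated).

def pick_prompt_input(switches: dict[int, bool]) -> int:
    """The lowest-numbered prompt input turned on wins; Input 1 is the fallback."""
    for n in (2, 3, 4):
        if switches.get(n, False):
            return n
    return 1

def module_order(enabled: dict[str, bool]) -> list[str]:
    """Modules always run in ascending order; disabled ones are simply skipped."""
    ordered = ["06 Latent Noise", "07 Expression", "08 ADetailer", "Upscaler", "Post-Processing"]
    return [m for m in ordered if enabled.get(m, False)]

print(pick_prompt_input({2: True, 4: True}))                        # -> 2 (img2img wins)
print(module_order({"07 Expression": True, "08 ADetailer": True}))  # Expression, then ADetailer
```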

Latent Noise Injection module

This module is a very "powerful" way to upscale the image (but only up to 2x; more than that would ruin the image) and to add a lot of detail to it. It can also introduce some changes to the image, so use it carefully. It takes the image you generated in the Core module and sends it back to latent space, upscaling it and adding some noise to it over two passes, and then it generates an enhanced image.

There are many settings; some suggestions are given in the yellow notes node, so use them as a starting point and then play with the values. There are no settings valid for all images: it depends on the subject, and a portrait will need completely different settings from a landscape or an anime image. In the module you can even apply a light sharpening to the image.
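
For intuition only, here is the idea in conceptual pseudo-code; the real module is built from ComfyUI nodes and two sampling passes, none of which is shown here:

```python
# Conceptual pseudo-code for latent noise injection (NOT the module's actual
# nodes): upscale the latent, inject fresh noise, then resample it with a
# partial denoise so that new detail gets generated on top of the original.
import torch

def inject_noise(latent: torch.Tensor, noise_strength: float = 0.35,
                 scale: float = 1.5) -> torch.Tensor:
    # 1) Upscale the latent (up to ~2x, as noted above).
    up = torch.nn.functional.interpolate(latent, scale_factor=scale, mode="bicubic")
    # 2) Add fresh noise on top of the upscaled latent.
    noisy = up + noise_strength * torch.randn_like(up)
    # 3) In the real workflow this latent goes back through the sampler
    #    (twice, with different settings); here we just return it.
    return noisy
```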

ADetailer module

This is the new version of ADetailer. It checks the image FLUX generated, recognizes the face, the eyes and the hands in it, and adds more detail to them. It now works entirely on the FLUX model. Obviously, this module only works on portrait images, where it can recognize a human face; otherwise, you can turn it off. This module may use a lot of VRAM, so I suggest using only the detailers (face, eyes and hands) that you need and turning off hands and/or eyes if you don't need them. Otherwise, you may need to use a lighter FLUX model (like a GGUF Q8 or Q6_K).

Expression module

This module is used to modify the expression of a face. It works mostly for portrait images. You will be able to move the head (left-right/up-down), modify the eyes (open-close/look left-right/wink) and the mouth (smile or more-less open).
The output of this module will be a little blurry, so don't expect incredible results, but it works well and the image will look very natural (unless you push all the settings too far, in which case the output will look weird).
Since ComfyUI cannot work in real time, every change in the Expression Editor settings must be followed by a new generation to see its results.
To use this module efficiently you need to queue just the Expression Editor group. There are two ways to achieve this: 1. Right-click anywhere on the group and select "Queue Group Output Nodes (rgthree)" from the menu; 2. (suggested) Open the rgthree settings, checkmark "Show fast toggles in Group Headers", select to show only "queue" and set "always" in "When to show them". The image below shows you how:


This way, once you generate an image, you can modify the settings of the Expression Editor and then run just the module nodes by pressing "Queue group output nodes"; it will update only the Expression module, without running the whole workflow from the beginning. Remember to set the “Seed” to “fixed”! Otherwise, every time you queue a new generation it will start from the beginning, since ComfyUI re-runs everything from the first node that changed.

The Upscaler

The Ultimate SD Upscale will upscale only (and automatically) the output of the last module you turned on. So if you have everything off and turn on only the Upscaler, it will upscale the Core output; if you turn on the ADetailer, the Upscaler will work on the ADetailer output only. So choose your modules and the kind of workflow you want to use for upscaling carefully.

Here you can set the parameters for the upscale. You can play with the settings, but beware: it could make the upscale process extremely long. Upscale by 2 means the image will double in width and height, so a 1024×1024 image will be upscaled to 2048×2048. You can change the Steps, the tile width/height (but it's better to leave it close to the original image size) and the other settings if you want. There are many upscaler models you can choose from: https://openmodeldb.info.
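
As a quick sanity check on sizes and tile counts (approximate, since Ultimate SD Upscale also adds overlap and seam-fix passes):

```python
# Quick arithmetic for the Ultimate SD Upscale settings: output size and a
# rough tile count (the real node also adds overlap and seam-fix passes).
import math

def upscale_plan(width: int, height: int, factor: float, tile: int = 1024):
    out_w, out_h = int(width * factor), int(height * factor)
    tiles = math.ceil(out_w / tile) * math.ceil(out_h / tile)
    return out_w, out_h, tiles

print(upscale_plan(1024, 1024, 2.0))  # -> (2048, 2048, 4) with 1024x1024 tiles
print(upscale_plan(832, 1216, 2.0))   # -> (1664, 2432, 6)
```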

The FaceSwap module

This module uses the ReActor nodes to manage a simple face swap. You just upload an image (ideally a high-res portrait) of the source face you want to swap into the FLUX-generated image, and start the generation. You can compare the original image with the resulting image using a slider.

Warning: the ReActor nodes may return CUDA/cuDNN errors due to updated Python packages. To fix these errors you have to manually update both the “onnxruntime-gpu” and “onnxruntime” packages. Check the issues on the following website: https://github.com/Gourieff/comfyui-reactor-node

The Post Processing module

This is a completely new module that replaces the LUT apply module. Here you can add a “vignette” effect, apply LUT filters and add some grain to simulate analog film photos.

You can use LUTs (Look-Up Tables) to give your image an analog-film look. Just download the LUTs you need (here are some good free ones). The LUT files must be saved in the following folder (WARNING: the folder has changed! Move the LUT files, those with the .cube extension, into the new folder): ../ComfyUI/models/luts/
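
If you're curious what a .cube LUT actually is, here is a simplified reader and applier (nearest-neighbor lookup only, no trilinear interpolation, and the file names are examples); the post-processing node does the real work for you:

```python
# Simplified .cube LUT reader and applier (nearest-neighbor lookup only, no
# trilinear interpolation), just to show what a Look-Up Table does to colors.
# The file names below are examples; the post-processing node does the real work.
import numpy as np
from PIL import Image

def load_cube(path: str) -> np.ndarray:
    size, rows = 0, []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            if line.upper().startswith("LUT_3D_SIZE"):
                size = int(line.split()[-1])
            elif line[0].isdigit() or line[0] == "-":
                rows.append([float(v) for v in line.split()[:3]])
    # .cube data lists red varying fastest, so the reshaped axes are [b, g, r].
    return np.asarray(rows, dtype=np.float32).reshape(size, size, size, 3)

def apply_lut(image: Image.Image, lut: np.ndarray) -> Image.Image:
    n = lut.shape[0]
    rgb = np.asarray(image.convert("RGB"), dtype=np.float32) / 255.0
    idx = np.clip(np.rint(rgb * (n - 1)).astype(int), 0, n - 1)
    out = np.clip(lut[idx[..., 2], idx[..., 1], idx[..., 0]], 0.0, 1.0)
    return Image.fromarray((out * 255).astype(np.uint8))

graded = apply_lut(Image.open("render.png"),
                   load_cube("ComfyUI/models/luts/film_look.cube"))
graded.save("render_graded.png")
```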

Where to download the workflow from

I have my workflow available for download from two websites:

1. CIVITAI (you need to be logged in to Civitai to download it)

2. OpenArt.ai

Model, Lora and other files you will need

This workflow was designed around the original FLUX model released by the Black Forest Labs team. You will need the following files to use the workflow:

1) UNET – Dev or Schnell version (each one is around 24 GB; Dev gives better results, Schnell is the “turbo” version). These files must be saved in the /models/unet/ folder.
Dev download link:
https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors?download=true
Schnell download link:
https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/flux1-schnell.safetensors?download=true

2) GGUF – if you want to use a FLUX GGUF model, you just have to choose its quantization: Q8, Q6_K and Q4_K are the three I tested and they give good results. These files must be saved in the /models/unet/ folder. You can find them here: https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main

3) CLIP – you need these CLIP files (use fp16 for better results, fp8 if you have low VRAM/RAM). These files must be saved in the /models/clip/ folder.

t5xxl_fp16:
https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors?download=true

t5xxl_fp8:
https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp8_e4m3fn.safetensors?download=true

clip_l:
https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors?download=true

4) VAE – last but not least, the VAE file (must be saved in the /models/vae/ folder):
https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors?download=true

ADetailer uses some specific files to recognize faces, eyes and hands in the image.
These are the files you need for the mask detectors:

1) sam_vit_b_01ec64.pth (goes in the folder /models/sams/ )
https://huggingface.co/datasets/Gourieff/ReActor/blob/main/models/sams/sam_vit_b_01ec64.pth

2) face_yolov8n_v2.pt (goes in folder /models/ultralytics/bbox/ )
https://huggingface.co/Bingsu/adetailer/blob/main/face_yolov8n_v2.pt

3) eyeful_v2-paired.pt (goes in folder /models/ultralytics/bbox/ )
https://civitai.com/models/178518/eyeful-or-robust-eye-detection-for-adetailer-comfyui

4) hand_yolov8n.pt (goes in folder /models/ultralytics/bbox/ )
https://huggingface.co/Bingsu/adetailer/tree/main

(If you get a warning that the .pt files could be unsafe, it's because they are still in the old .pt format, which is now deprecated as it could be harmful. But the files listed above are all tested and safe.)

Last but not least, for the Upscaler I use the 4x_NMKD-Siax_200k.pth model (goes in the folder /models/upscale_models/), which you can download here: https://huggingface.co/gemasai/4x_NMKD-Siax_200k/tree/main or https://civitai.com/models/147641/nmkd-siax-cx

But there are many other upscale models you can use.

I hope you will enjoy this workflow. Leave a message if you have any questions, requests or hints. Thanks!

Tenofas
