Using A1111? Why not SwarmUI? - A transition guide

I’ve recently transitioned from Forge to SwarmUI (previously known as StableSwarmUI), and I’m really glad I did! I had experimented with it before, but it was still somewhat rough around the edges. It’s still in beta, but it has now been refined enough that it surpasses Automatic1111’s SD Web UI for me in almost every aspect. It might be a bit more subjective when compared to Forge or SD-Next, but depending on your workflow, SwarmUI could very well replace them for you too.

The aim of this post is to help “translate” SwarmUI for people who are accustomed to A1111’s interface. I will also highlight several extensions whose functionality is built into SwarmUI. I’ll mainly focus on what’s achievable in the Generate tab, as that’s where most A1111 users will feel at home. The Comfy Workflow tab lets you use ComfyUI in all its noodly glory, so it has fewer limitations and needs less translating.

Interested? Check out the GitHub - https://github.com/mcmonkeyprojects/SwarmUI

So, why SwarmUI?

As someone who has used A1111 and Forge, I find SwarmUI to be an overall improvement in terms of UI, stability, usability, and utility. The single Generate tab is easier to work with than the separate txt2img and img2img tabs. For anything I feel is missing or could be improved, I can either use a Comfy workflow or develop a Simple tab workflow. Since the backend is ComfyUI, everything runs fast and efficiently. While it’s not perfect, it’s more stable than A1111/Forge. And honestly, if nothing else, I’m glad to be done with using a Gradio interface!

So, why not A1111?

Automatic1111 and its forks are still good interfaces, no doubt about it. However, in my opinion, the code foundation and Gradio interface are starting to show their age. Even though I got used to it, I’ve always found A1111 and Forge somewhat unwieldy, especially when tackling more complex tasks. Extensions tend to tack onto an already flawed interface, making it feel cluttered and awkward. While there are some nice UI extensions that help, I think they’re just band-aid solutions to the core problem. A1111 was never built with long-term support in mind, and the Gradio UI has some serious flaws that limit its usability. That's why, once I realized SwarmUI can do everything I was doing in A1111 and Forge but easier and better, I decided to make the switch!

Transitioning from A1111 to SwarmUI can be challenging because of how much you have to adjust to. The changes in layout and terminology can make moving from one UI to another difficult. So, I’d like to address some possible questions you might have as you try SwarmUI for the first time.

Txt2img

Practically everything found in the txt2img tab is also included in SwarmUI’s Generate tab—and then some. Most features are self-explanatory, but I’ll delve into a few that are less obvious:

How do I use styles? - SwarmUI doesn’t use styles per se; it has presets that can be applied to both your prompt and parameters. Nearly every aspect that can be tweaked on the Generate tab can be saved as a preset. You can even activate multiple presets simultaneously if you wish to combine “style” presets with parameter presets.

Can I create an X/Y/Z chart? - Absolutely! At the bottom of SwarmUI, you’ll find a series of tabs for Image History, Presets, Models, VAEs, Loras, Embeddings, ControlNets, Wildcards (more on that later), and, finally, your Tools. Within Tools, there’s a dropdown menu where you can select the Grid Generator. It’s just as robust as the Automatic1111 and Forge X/Y/Z generator, if not more so, and you can choose to output to either a local webpage or a grid of images.

Does SwarmUI have a Highres Fix? – Yes, it does. The functionalities similar to Highres Fix are located under the Refine / Upscale category on the left side of the Generate tab.

Can I specify the resolution I want instead of an aspect ratio? – Absolutely. Simply select Custom as the aspect ratio, and you’ll be able to set a custom resolution.
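If you know the aspect ratio you want but not the exact numbers to type into the Custom fields, a little arithmetic helps. Here’s a rough Python sketch that picks a width and height close to a target pixel count. The 1024x1024 pixel budget and the rounding to multiples of 64 are my own assumptions (common conventions for SDXL-class models), not something SwarmUI requires, and the function name is made up for illustration.

```python
import math


def resolution_for_aspect(aspect_w: int, aspect_h: int,
                          target_pixels: int = 1024 * 1024,
                          multiple: int = 64) -> tuple[int, int]:
    """Suggest a (width, height) near target_pixels for a given aspect ratio.

    Assumption: both sides are snapped to a multiple of 64, which is a
    common convention and not a SwarmUI rule.
    """
    # Ideal (unrounded) sides: width * height == target_pixels
    # and width / height == aspect_w / aspect_h.
    width = math.sqrt(target_pixels * aspect_w / aspect_h)
    height = math.sqrt(target_pixels * aspect_h / aspect_w)

    def snap(value: float) -> int:
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(resolution_for_aspect(16, 9))  # (1344, 768)
print(resolution_for_aspect(1, 1))   # (1024, 1024)
```

You would then enter the two numbers as the custom width and height.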

Can I still add or subtract emphasis from a word in the prompt? – Indeed, you can. SwarmUI utilizes the same syntax for adding and subtracting emphasis, such as (word), [word], and (word:1.5).
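To make the three forms concrete, here’s a small Python sketch of how they are conventionally read in A1111: each pair of parentheses multiplies the weight by 1.1, each pair of square brackets divides it by 1.1, and (word:1.5) sets the weight directly. The 1.1 factor is A1111’s convention. I’m assuming SwarmUI lands on comparable values, and its backend may weigh things a bit differently. The helper only handles a single, simple token and is purely illustrative.

```python
import re


def emphasis_weight(token: str) -> tuple[str, float]:
    """Return (text, weight) for one emphasis token such as "(word:1.5)".

    Assumption: A1111-style weighting, where each () level multiplies
    by 1.1 and each [] level divides by 1.1.
    """
    # Explicit form: (text:number)
    explicit = re.fullmatch(r"\((.+):(\d+(?:\.\d+)?)\)", token)
    if explicit:
        return explicit.group(1), float(explicit.group(2))

    weight = 1.0
    # Peel off matching outer brackets one level at a time.
    while len(token) >= 2 and token[0] + token[-1] in ("()", "[]"):
        weight = weight * 1.1 if token[0] == "(" else weight / 1.1
        token = token[1:-1]
    return token, round(weight, 3)


print(emphasis_weight("(word)"))      # ('word', 1.1)
print(emphasis_weight("[word]"))      # ('word', 0.909)
print(emphasis_weight("((word))"))    # ('word', 1.21)
print(emphasis_weight("(word:1.5)"))  # ('word', 1.5)
```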

What about other advanced prompting syntax? – While the syntax may vary depending on your needs, SwarmUI supports much of what A1111 offers by default, and even more. For detailed information on SwarmUI’s prompt syntax, refer to this documentation: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/Prompt%20Syntax.md

Is there an option to increase the batch size? – Yes, there is, although it’s categorized as an advanced feature. To access it, turn on Display Advanced Options at the bottom of the parameters list and check under the Swarm Internals section. This isn’t to be confused with batch count, which is just listed as Images under Core Parameters.

Img2img

Wait, where is it? How do I do img2img? – I’ll admit, this can be confusing, especially if you’re used to A1111. Instead of a separate img2img tab, you can load images you want to run through diffusion again under the Init Image section of the parameters on the left side of the Generate tab. Make sure to flip the little switch next to Init Image to enable it! Alternatively, when you generate an image, you’ll see the option to Use As Init near the top-middle of the interface to set the image as the Init Image immediately.

Where do I adjust denoising strength? – It’s under Init Image but is called Init Image Creativity. Under Refiner / Upscale, it’s referred to as Refiner Control Percentage.

How do I inpaint? – Once you have an image in the preview, whether you just generated it, selected it from the Image History tab, or dragged it into the center of the UI, you can select Edit Image next to the image preview to enter the Image Editor. You can use this interface to inpaint and perform simple edits. It might take some getting used to, and it’s not 100% finished, but it’s still as good as, if not better than, the A1111 inpainting interface. To keep things brief, I won’t go into the details of how to inpaint here.

Do I need to use ControlNet or an inpaint model? – While you’re free to use them, by default SwarmUI uses Differential Diffusion and mask blending to improve inpainting automatically. This allows you to use a high amount of Image Creativity (or denoise if you prefer) without losing continuity with the original image.

How do I outpaint? - It’s a bit tricky right now as it’s still being implemented, but here’s how to do it currently: https://github.com/Stability-AI/StableSwarmUI/discussions/360#discussion-6739872

Is there an easier way to outpaint? - I created a simple workflow that can be used under the Simple tab to make outpainting easier. I recommend using an inpainting model with it. Download the Simple Outpainting workflow attached to this article and place it under “SwarmUI\src\BuiltinExtensions\ComfyUIBackend\CustomWorkflows”
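If you’d rather script the copy than drag files around, something like the following Python sketch would do it. The folder layout comes from the path above; the function name, the file name “Simple Outpainting.json”, and the example install location are placeholders I made up, so adjust them to match your download and your SwarmUI folder.

```python
import shutil
from pathlib import Path

# Relative location of the custom workflow folder inside a SwarmUI install.
CUSTOM_WORKFLOWS = (
    Path("src") / "BuiltinExtensions" / "ComfyUIBackend" / "CustomWorkflows"
)


def install_workflow(workflow_file: Path, swarm_root: Path) -> Path:
    """Copy a downloaded workflow file into SwarmUI's CustomWorkflows folder.

    Returns the path of the copied file. The folder is created if missing.
    """
    target_dir = swarm_root / CUSTOM_WORKFLOWS
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 returns the destination path (as a string) when given a directory.
    return Path(shutil.copy2(workflow_file, target_dir))


# Example call (hypothetical paths, shown commented out):
# install_workflow(Path.home() / "Downloads" / "Simple Outpainting.json",
#                  Path("C:/SwarmUI"))
```

The same helper works for any other workflow file that belongs in that folder.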

Is there a way to interrogate pictures using CLIP, DeepBooru, or other image captioners? – There isn’t at the moment, but it’s on the todo list!

Extras

Is there an equivalent to the Extras tab? – Not entirely. You can still upscale with an upscale model by setting an Init Image, putting the Image Creativity to 0, and then under Refiner / Upscale setting an upscaling model. If that’s a bit much, I’ve created a simple workflow for using an upscale model under the Simple tab attached to this article; download Simple Upscale and put it under “SwarmUI\src\BuiltinExtensions\ComfyUIBackend\CustomWorkflows”

PNG Info

How do I extract metadata from the images I’ve created? – To extract metadata, drag an image you created (even one made in A1111) to the middle of SwarmUI where generated images display. From there, you can click Reuse Parameters if you’d like, or just read and copy the parts you need manually.
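If you’re curious where that metadata lives, or want to read it outside the UI, here’s a minimal Python sketch that pulls text chunks out of a PNG using only the standard library. It rests on a couple of assumptions: A1111 writes its generation info to a PNG text chunk with the keyword “parameters”, and the text is stored in a plain, uncompressed tEXt chunk. Compressed or international text chunks (zTXt, iTXt) are not handled here, and I haven’t checked exactly which keyword SwarmUI itself writes, so print everything that comes back.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict[str, str]:
    """Return keyword -> text pairs from a PNG's uncompressed tEXt chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")

    chunks: dict[str, str] = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt data is "keyword", a NUL byte, then Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length  # skip length + type + data + CRC
    return chunks
```

To use it on a file, read the bytes first, for example read_png_text_chunks(open("image.png", "rb").read()), where image.png is a placeholder name.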

Checkpoint Merger & Training

Currently, there isn’t an equivalent to these two tabs in the main SwarmUI interface. Adding checkpoint merging and training is on the to-do list. The training tab in A1111 isn’t really recommended these days anyway; it’s better to use a third-party program like Kohya or OneTrainer. As for checkpoint merging, this can be done with a ComfyUI workflow if you desire.

Settings

While SwarmUI offers many options, I’ll be frank: Automatic1111 and its forks have way more. However, many settings that were under the Settings tab in A1111 are instead hidden in the parameters in the Generate tab in SwarmUI. You can expose them by clicking Display Advanced Options at the bottom of the parameter list.

How do I set the default parameters? - To set default parameters, create a preset in the Generate tab and name it Default. The settings in this preset will load whenever SwarmUI is started or when you reset the parameters via Quick Tools > Reset Params to Default.

How do I set Clip Skip? – After enabling Display Advanced Options, you can find it under Advanced Sampling in the parameters as CLIP Stop At Layer. Note that you don’t need to change this for SDXL models, including Pony Diffusion.

Can I enable HyperTile or tiled VAE? – VAE tiling happens automatically if the image you’re processing is too big to work with in memory. You can manually set VAE Tile Size under Advanced Sampling in the parameters after enabling Display Advanced Options.

What about SAG? - SAG is also available in Advanced Sampling after enabling Display Advanced Options.

Is FreeU in there? – Yes, FreeU is implemented and can be found under its own parameter category after enabling Display Advanced Options.

Extensions

The most appealing part of the Automatic1111 ecosystem is undoubtedly the extensions. SwarmUI integrates many of these functionalities directly into its main interface, and anything that isn’t can be replicated in a Comfy workflow. Let’s explore some of the most popular ones and how to use them.

ControlNet – Fully implemented. SwarmUI supports up to 3 ControlNet modules at once (modules two and three are accessible by clicking the “Display Advanced Options” checkbox at the bottom of the parameters). More information can be found here: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/ControlNet.md

IP Adapter, ReVision, Reference Only – These features, typically associated with ControlNet for A1111 users, are technically separate but implemented. To access them, drag the reference image onto the prompt box, and a new ReVision category will be added to the parameters. More information is available here: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/IPAdapter-ReVision.md

TiledDiffusion with Tiled VAE – As mentioned earlier, Tiled VAE is automatically managed with some manual control if desired. Regional prompting will be covered later.

ADetailer, Detection Detailer, µ Detection Detailer – You can replicate and even expand on the functionality of After Detailer and similar extensions using the <segment:> prompt syntax (for example, <segment:face>). For more details, see this segment of the Prompt Syntax doc: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/Prompt%20Syntax.md#automatic-segmentation-and-refining

AnimateDiff, video diffusion – SwarmUI supports Stable Video Diffusion (SVD) under the Video section of the parameters. This includes options for frame interpolation, though additional features are currently limited to external ComfyUI workflows. More details can be found here: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/Video.md

Booru tag autocompletion – You can set up your own autocompletion word list or use Danbooru tags by following the instructions at the bottom of this doc: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/Autocompletions.md

Dynamic Prompts, Wildcards – SwarmUI includes prompting syntax for creating dynamic prompts (prompts with their own random word lists) and built-in wildcard functionality. More information is available in the Prompt Syntax document: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/Prompt%20Syntax.md
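To give a feel for what a dynamic prompt does, here’s a toy Python sketch that mimics the random-choice idea. As I understand the Prompt Syntax doc, the tag looks like <random:red, blue, purple>, with options separated by commas (or by | if any are present), but treat that as my reading and check the doc for the exact rules. This is not SwarmUI’s code, just an imitation of the behavior; the function name and the seed parameter are made up for illustration.

```python
import random
import re

# Matches tags of the assumed form <random:option one, option two, ...>
RANDOM_TAG = re.compile(r"<random:([^<>]+)>")


def expand_random_tags(prompt: str, seed: int) -> str:
    """Replace each <random:...> tag with one of its options."""
    rng = random.Random(seed)  # seeded so the same seed repeats the pick

    def pick(match: re.Match) -> str:
        body = match.group(1)
        # Assumption: "|" takes priority as the separator when present.
        separator = "|" if "|" in body else ","
        options = [option.strip() for option in body.split(separator)]
        return rng.choice(options)

    return RANDOM_TAG.sub(pick, prompt)


print(expand_random_tags("a photo of a <random:red, blue, purple> car", seed=1))
```

Each run with the same seed gives the same prompt, while a different seed may pick a different color.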

TensorRT – It’s built right in! Simply go to the Models tab near the bottom of the Generate tab, find the model you want, select the hamburger menu (you know, the ☰ symbol), and choose Create TensorRT Engine.

Ultimate SD Upscale – While it can be used as a ComfyUI node, that would be cheating. So instead, you can perform tiled upscaling by placing the image you want to upscale under Init Image, setting the appropriate Image Creativity, selecting an upscale model/method under Refiner / Upscale, and turning on Refiner Do Tiling.

Regional Prompter, Latent Couple, multi-subject-render – SwarmUI supports region prompts, though it’s still a bit rough. To get started, add <region: to your prompt and follow the displayed instructions to set the region and prompt for it, then close with >

RemBG, PBRemTools, ABG – There are a couple of ways to remove backgrounds in SwarmUI. One way is to enable “Display Advanced Options” and check “Remove Background” to run RemBG on your diffusion outputs. You can also remove backgrounds and other objects and replace them with transparency using the <clear> syntax. More details are available here: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/docs/Features/Prompt%20Syntax.md#clear-transparency

Dynamic Thresholding, CFG-Schedule – You can enable dynamic thresholding by checking Display Advanced Options, which will add a new category named Dynamic Thresholding under the Generate tab parameters.

CivitAI Browser+, Stable Diffusion Webui Civitai Helper, Civitai Shortcut, CivBrowser – While SwarmUI doesn’t include the full functionality of these extensions, it does have a CivitAI and HuggingFace model downloader under the Utilities tab. This feature allows you to download models and accompanying metadata by pasting in the URL. This functionality is expected to expand in the future, so stay tuned!

Aesthetic Image Scorer – An experimental implementation of AI image scoring is available by enabling “Display Advanced Options” and looking for the Scoring section of the parameters. To enable it, follow the readme doc: https://github.com/mcmonkeyprojects/SwarmUI/blob/master/src/BuiltinExtensions/Scorers/README.md

FaceChain, roop, ReActor – Currently, there is no easy built-in face replacement functionality in SwarmUI, aside from using <segment:>, IP Adapter, or similar functions. This feature is reportedly on the to-do list. In the meantime, ComfyUI workflows can be used as an alternative.

Conclusion

I think that covers it for helping people transition from A1111 and its forks to SwarmUI. There are a lot more features not covered here, and obviously, SwarmUI is not a perfect fit for everyone. However, I think that for most people, SwarmUI offers the best of both worlds: a decent interface for creating images and access to advanced functionality through ComfyUI. Things are still being worked on, but SwarmUI is being updated all the time. Even if you decide not to switch, I suggest you keep an eye on it as it’ll continue to get better and better!