
The Araminta Experiment (SDXL+Flux)

Verified: SafeTensor
Type: Checkpoint Trained
Stats: 4,098
Published: Aug 12, 2024
Base Model: SDXL 1.0
Hash (AutoV2): FC95636A6C

NOTE: All my images are uploaded with an embedded ComfyUI workflow, which is alas incompatible with CivitAI processing, so most often the prompt cannot be retrieved. You can, however, download the original PNG image with the workflow included by clicking on the "DOWNLOAD" icon in the image viewer.

NOTE 2: While I try to mostly publish images straight from my model with maybe a bit of a Lora (mine or some detail enhancer), I now also sometimes use Controlnet to get better, more detailed compositions more easily. In this case the source image is obviously not in the workflow, but I guess you can still use the image I publish as a source if you want to make a variation :)
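For readers who download one of the original PNG files, here is a minimal sketch of how the embedded workflow could be read back with only the Python standard library. It assumes the workflow is stored as JSON in an uncompressed PNG tEXt chunk under the keyword "workflow", which is how ComfyUI normally saves images; the function names are hypothetical and the file may be laid out differently if it was re-encoded.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return the uncompressed tEXt chunks of a PNG as {keyword: text}."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # tEXt payload is "keyword NUL text", encoded as Latin-1.
            keyword, _, text = body.partition(b"\x00")
            chunks[keyword.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length
    return chunks


def extract_workflow(data: bytes):
    """Return the parsed workflow JSON, or None if no such chunk exists.

    Assumption: the keyword is "workflow" (ComfyUI's usual choice).
    """
    raw = read_png_text_chunks(data).get("workflow")
    return json.loads(raw) if raw is not None else None
```

Typical use would be `extract_workflow(open("image.png", "rb").read())`, where `image.png` stands for whatever file was downloaded. The sketch does not verify chunk CRCs and ignores compressed (zTXt) or international (iTXt) text chunks.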


If you enjoy my contribution to this community, feel free to buy me a coffee: the more caffeine I drink the more models I can create 😅

Now

Comparison gallery here: Ev4 - Ev3 and Fv1 - Ev4

Current SOTA models in my experiment:

  • Flux model: A1... let the fun begin!

  • Base model: Fv1 (if you accept unexpected NSFW) or Ev4 or Cv6... Fv1 is biased toward NSFW, but still better in many aspects.

  • NSFW model: Fv1

  • Illustration: Fv1 (NSFW) or Cv6 (SFW)

The Flux A series is my first Flux.1 model, created by merging Flux-dev-fp8 with several Loras I have trained on my own dataset. At this point it has to be considered a WIP, and it is not clear whether it will be possible to create a versatile base model using this approach. But Flux is obviously the SD3 we were all hoping for, and its capabilities out of the box are quite amazing.
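As a rough illustration of what merging a Lora into a base checkpoint means in general, the low-rank update is added to each affected weight matrix: W' = W + scale * (up @ down). The sketch below shows this on tiny pure-Python matrices. It is not the tooling used for this model; the function names and the `scale` parameter are hypothetical, and real merges operate on tensors (with extra care for fp8 weights).

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down).

    weight:    out x in matrix of the base model
    lora_down: rank x in matrix (often called A)
    lora_up:   out x rank matrix (often called B)
    scale:     merge strength (hypothetical knob for this sketch)
    """
    delta = matmul(lora_up, lora_down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]
```

Merging several Loras amounts to applying this update once per Lora, each with its own strength, which is why the balance between them can shift the model's bias noticeably.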

Image Generation Settings for Flux model

This is still a learning process, but at this point my preferred sampler / scheduler settings are DPM++ 2M / beta or sgm_uniform, or DEIS / normal, with beta giving a bolder, stronger image. For a more subtle image, Euler / simple or beta seems a good bet.

CFG seems to have a huge impact on the final image, and the result is very sensitive to even small variations.

  • For photos, CFG should remain low (1.5-2.5) to avoid plastic skin.

  • For fine art and illustration it is more complicated because it depends on the medium. For "rough" styles (painting, watercolours etc.), CFG should stay quite low, in the 1.5-2.5 range, but for anime or comic style, CFG often needs to be pushed further to achieve the desired style (3-6 or more).

If the image is messy/malformed or blurred, it is often because the CFG/steps are inappropriate for this image, but it is not always easy to know whether CFG/steps must be increased or decreased (at least for me 😊).
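The settings above can be summarised as a small lookup table. This is only a sketch: the sampler and scheduler strings assume ComfyUI's identifiers (`dpmpp_2m`, `deis`, `euler`, `beta`, `sgm_uniform`, `normal`, `simple`), "DDEIS" is assumed to mean the DEIS sampler, and the style keys and the `check_cfg` helper are hypothetical names introduced for illustration.

```python
# (sampler, scheduler) pairs taken from the text above.
SAMPLER_CHOICES = {
    "preferred": [("dpmpp_2m", "beta"),        # beta: bolder, stronger image
                  ("dpmpp_2m", "sgm_uniform"),
                  ("deis", "normal")],
    "subtle": [("euler", "simple"),
               ("euler", "beta")],
}

# Starting CFG ranges per style, as (low, high).
CFG_RANGES = {
    "photo": (1.5, 2.5),        # higher values tend to give plastic skin
    "painting": (1.5, 2.5),
    "watercolour": (1.5, 2.5),
    "anime": (3.0, 6.0),        # soft upper bound: "3-6 or more"
    "comic": (3.0, 6.0),        # soft upper bound: "3-6 or more"
}


def check_cfg(style: str, cfg: float) -> str:
    """Report whether cfg is 'low', 'ok' or 'high' for the given style."""
    low, high = CFG_RANGES[style]
    if cfg < low:
        return "low"
    if cfg > high:
        return "high"
    return "ok"
```

For example, `check_cfg("photo", 3.5)` reports `"high"`, while the same value is within the starting range for `"anime"`. The ranges are starting points only, since the right value still depends on the individual image.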

There is certainly a lot to learn about Flux's behaviour, which is quite different from SDXL's, and we will need to adapt.

Future

I will not spend more time on my SDXL models, as Fv1/Cv6 are mature and probably still better than Flux models on a few key points.

I will now focus on my Flux model and try to understand the best way to create a versatile model focused on realism/NSFW as well as excellent styling capabilities. I am not sure it will work as well as with my SDXL models though: styling (illustration) in Flux appears to be a bit hit or miss.

Past


Starting from the E series, the models evolve sometimes by merging with other models (thanks to other contributors!), but mostly via training on my own dataset: a modest dataset (~2000 images currently), but I try to compensate somewhat with quality and originality.

Starting with Fv1, I have included many synthetic images that I created using previous versions, playing hard with the prompt and retouching the result in Photoshop when necessary, so that the dataset contains many original images.


The core idea behind this model was to create a versatile tool by merging some of the best existing models which fit my personal taste (photography and fantasy art, to put it simply). My primary goals were:

  1. Photorealism: The ability to produce stunningly realistic images of both people and objects/nature.

  2. Flexibility: The ability to create highly stylized images, allowing for artistic expression through various styles and combinations of artists. I am from an older generation and come from Europe, so for me "style" does not mean "Japanese kawaii anime with boobs" or "DC Comics cartoon with lots of superheroes and voluptuous blonde babes" but rather the universe of Frank Frazetta, Milo Manara, Boris Vallejo, H.R.Giger, Wojtek Siudmak and similar fantasy art masters: there are boobs involved for sure, but the style is somewhat different :P

  3. As I don't like being limited in my exploration of the human body, the idea is also to have a fairly capable NSFW model. However, due to the nature of the training images available in the datasets, NSFW often comes with a strong bias toward either porn photos or Japanese porn anime, which impacts flexibility (typically, as soon as you use the word "sexy" in your prompt, you need to add weight to the style). This point is thus NOT the priority for the base model but is pushed forward in the NSFW model.

Currently this implies maintaining THREE different model branches depending on the usage (though all are actually quite versatile and none is specialised). Branch C is currently a bit more mature (6 versions so far) and more versatile: I push less NSFW material into it, so it is less biased. Branch B is pushed more toward NSFW and thus tends to be a bit biased. Branch E is a new branch spawned from C and is currently a WIP: I will probably push it more towards some sort of creativity in the future. Branch A is now deprecated.

Model Versions & Included Models

Versions past Cv6

See the description of each version for information about the models included.

Version Cv6

Added models:

  • pixelAlchemy_v22

  • artUniverse_v40

Version Cv5

No specific model added, but merged with some previous tests I made, so the recipe is "not well understood but tasty" :)

Version Cv4

Added models:

  • pixelAlchemy_v16

  • pixelAlchemy_HyperCFG

  • pixelwave_10

Version Bv4

Added models:

  • level4XL_vA04

  • HelloworldXL_60

Version Cv3

Added models:

  • realismEngineXL_v30

  • projectUnrealEngine5_v10

  • pixelAlchemy_v13

  • anterosXXXL_v10

Faces are less stereotyped, and skin texture and NSFW capability are quite a bit better. Framing tends to be a bit narrower, which may require playing with the prompt to widen the view.

Version C: New Base Model

Added models:

  • realvisXL_v40

  • RealitiesEdgeX_V7

  • robsMixUltimate_v10

  • SevenofXL_NSFW_v94

This is a "better A" version, meant to become the new base model. It improves on version A with a bit better everything :)

Overall, compared to A, I would say it has richer composition and is a bit better at NSFW (though version B is still the best one for NSFW stuff), while mostly keeping its ability to stylise, even if it may sometimes require a bit more weighting in your prompt towards "illustration".

I consider version C to have replaced A as my base model, but feel free to disagree ;)

Models used in versions A/B

  • ElysiumXL_v10

  • NewrealityXL_40

  • Haveallsdxl_v10

  • AcornIsBoningXL_v10

  • JuggernautXL_X

  • Onlyfornsfw118_v20

  • EpicrealismXL_v7

Version A: Base Model

  • Goal: To provide a highly versatile base model capable of producing both beautiful photorealistic and stylized images. NSFW is also possible with this model if you ask kindly in your prompt.

Version B: Enhanced NSFW Model based on A

  • Goal: To enhance NSFW capabilities, sometimes at the expense of stylization, which can be managed by adjusting weights in the prompt. Girls also tend to be a bit less dressed with this model, even if you don't ask for it ^-^