Realism: A struggle + plans for the future
Read this article for the full announcement: https://civitai.com/articles/1139
This article details our struggle trying to do a realism model and the problems we face MAKING IT - plus the problems we're noticing with staying within the terms of service, in the face of the current news coverage of AI.
TL;DR: We're not sure we're going to continue our realism line, but we'll get to why in the sections below.
How realism? How, not why? Well ok yea we know why: Realism is for the people who DON'T want anime. We got that much. The how of it is?
Well to make a realism model you just gotta step straight outside touch grass -- oops wait you meant in stable diffusion.
Well since we're REALLY bad at figuring out the ways to stop being Neurodivergent when training a model - we've been doing the Squish & Merge tricks.
Without making this a full tutorial because this isn't a tutorial article - basically we've used super-merger on a ton of our models already to fuel different styles.
The problems with the hows were: Trying to recreate a GOOD realistic or semi realistic model inspired by dreamshaper, NED and other ones like Cyber Realistic. We're so used to how FAST you can make an anime model -- (More like how long, and why we are up for 10 hours doing it but OK)...
The Problem Child
Eh, it's like: imagine you're out babysitting, and it's some rabid, Spider-Man-level hyperactive 4 year old that wants everything at once.
Ok that's us in a nutshell, now imagine that babysitter is more like 85 years old and doesn't understand where their hearing aids went and just wants to feed the kids cookies.
That's how we feel about making realism right now.
Let's go into it by sharing some of the XYZ plots and crying together in a hunk of model making goo because I think we're done and don't wanna do this anymore.
None of these models exist, because after we started doing our XYZ plots at hi res and nearly crashing our instance?
We sent our XYZ plots to the Chief, and Chief's got a good keen eye - swear we owe him fish and chips or a good case of Speight's or Lion Red or something...
"You rezzing at 512 or lower?"
"cause your hi res isnt' fixing the mess"
See we were getting somewhere until it started unravelling.
Sure we were making some consistently amazing things but we'd panic, add another model on top of it - see if we could FUSE a better style out of it (It's realism, you can't unless you're doing 2.5-3D).
You shouldn't always (not EVERY TIME, just most of the time) need adetailer or hi res fix to save your butt - they're meant to enhance, not fix your mistakes. (Except on anime models, where they can sort of do magic.)
Noted that we were dropping the resolution to lower than 512 on one edge to see if that was going to start fixing our mistakes...
And it was only getting worse.
And we were at nearly 12am our time, downloading heaps more stuff to throw at it.
Once your model looks like a dog ate it for breakfast, and shit it out the other end? Ya can't save it if it's realism - you're just literally asking for a Therapist to fix stupid.
Frankly it didn't matter the seed, the sampler - it just wasn't working. And largely, we were making TI's to help our prompting, because testing these things and remembering prompts is probably more difficult than you'd think when you work 20 miles a minute LOL.
What's the deal, you ask? Why are you rambling?
Yea, it's a workflow and let me admit before we hit the next section: This is worse than trying to write a case study in my 3rd year of college.
There's WAY MORE INVOLVED in realism because of the quality that photorealism demands. A 3D/2.5D realism like Rev Animated is WAY easier to contain because it's an illustration of sorts - the hires and adetailer can carry the not-so-great portraits.
Comparison between the failing test and other models.
Keep in mind: we only have ONE current mix and even THAT one isn't as good as other models could be - despite you seeing "ELLIS MIX" in a lot of our TI's, there's still a high chance we're actually going to delete it, and we'll explain later.
Epic Mix test was the original mix that started failing; it was built from Epic Mix V6.
We're not sure how much we enjoy doing realism.
We're just chasing a trend of popularity, and we're getting concerned about it for our own sake.
But regardless of that, you can see that SOMETIMES our main model was just flying off into worse than the netherworlds. It didn't matter how good ONE generation out of 20 was; the other 19 on Epic Mix, and then later on "ELLIS MIX", were just not turning out.
We do get the "if HANDS ARE OK 50+% of the time you're good" mode, but with realism you have detail resolution errors with eyes. Eyes are the worst with realism - you don't mind imperfection, fine - but you'll get oversharp, burned in, pixelated errors that you just can't fix even with inpainting.
While the comparison up there looks FAIR and dandy?
We weren't keeping 90% of the failed gens because we figured we were just going to keep going.
But it wasn't JUST the failures...
The why we're struggling with realism (part one)
The why? Well, frankly we're not sure HOW to do realism after how long we've been focusing on anime. We've never REALLY been THAT enthusiastic about realism, more just about things that blur the lines.
We just know people ASK for realism (not just because the internet's for corn.)
Realism MERGING comes with all sorts of problems: It's FAR less of a gene pool to pull from, and so you're basically playing the Hapsburg royalty genetics game before you mess up and have to start again.
Lora merging isn't guaranteed to work with realism; we've realistically done it once - and that model kinda didn't turn out the best.
(In theory also we're bad at getting the realism VAE ready and still use WD VAE with realism)
The why, how, where and how to stick it (aka part two)
Apart from toying with hyper realistic models and having fun making textual inversions with them- one of the largest issues right now is:
The news circles.
Anything that LOOKS like it's under the age of 30 to some people is going to cause a huge lawsuit shitstorm at some stage. With the release of SDXL also, wtf's the point of doing a realism model on 1.5? (That's not to say anyone else CAN'T just -- hold up lol)
Content that is "UNDERAGE" of any sort is wrong yes, but realism is the focus with the media outlets.
We're not doing AI to break laws here. We're aware of Civitai's terms of service, and largely refuse to break them (bending them with dumb TI memes is different, everyone loves a good meme and you never wanna fully break something - mind you, we think we're breaking them and we just get a pat on the head and get told that's nothing XD).
Besides the part one issues, and the CONSISTENT struggle - we're impatient. We don't want to wait three months for others to beta test. IF we find an error repeating in a model, before we even throw it to anyone who wants to play with it - we'll yeet it.
Also, while we're aware that a REALISM model's intention is never to make illegal, dodgy shit with it - while making our TI's, even a BORDERLINE picture that an adult would look at and go "Nah that's an adult" still gets picked up by the bot as underage.
Now, that's not OUR personal fault per se - but if you see where I'm going with this?
The Drawing end of the Line
Anything that can be mistaken as underage is a lawsuit waiting to happen.
While this is true for illustrative and anime content, there's a big difference between Klee from Genshin Impact and someone that's 4 years old dressed as Klee in the model's pics being shared.
While it's clear that you can do this sort of stuff on Midjourney, and not everyone's trying to be illegal, and some people clearly have interesting reasons for doing so - it's not up to us what happens with it when we're the ones making the content to be consumed.
And 90% of the time no matter what LORA, TI or anything we'd make or test with it?
The men would look like they're 16-18 even without the textual inversions, and with them sometimes as well - the women wouldn't look much older either - and that's where the line stands - We don't feel comfortable continuing our realism line.
On top of that.
We MIGHT retire Epic V6 entirely for that reason alone.
We're not sure yet what we're doing, but there's a VERY LARGE CHANCE that "ELLIS MIX" won't officially release as a realism model and the name may go to a new anime or even 2.5d/3D line.
See, when you're talking like 20-odd mixes later on a Vast AI instance and it's still producing borderline things that shouldn't be shared on any website - let alone the fact that several times I had to go in and DELETE pictures that were coming out - that's where we stop, because I'm not about to get thrown in the slammer just for making a realism model.
This isn't calling anyone out btw.
Realism models ARE NOT bad, there are some CLEARLY AMAZING ONES out there, and we'll continue to play around and make textual inversions to help with those - and even loras.
But I think we're looking at dropping from the realism race; it's a level of stress we've never seen in ourselves making stuff.
I mean lmao - the Anime TI's we made with the embedding merge tool worked MOST of the time on the realism model... so there's always that (kidding - y'all don't want a model full of winged realistic catboys XD) -- Who wouldn't want Anthony of Red Hot Chili Peppers' 19th cousin from an alternative universe? (again kidding).
The Final Verdict
We'll take some time to think it over, but at this stage we're going to at least stop working on realism models.
Ellis Mix has a repo on HF right now, but if anything we're considering removing our realism lines entirely.
We'll make an announcement sooner or later about it.