Type | |
Stats | 1,374 |
Reviews | (243) |
Published | Sep 12, 2023 |
Base Model | |
Usage Tips | Clip Skip: 2 |
Hash | AutoV2 AAA085EBF8 |
This is definitely pre-beta! ha!
One note when you look at the generated images here: MarvinAI-SynFlow was the development name, and I changed it to Fusioncore before uploading. The model is identical, so I'd update the photos to reflect the new name, but I can't figure out how to edit the photos on the model editing page (I know, right? ... engineer can't figure out engineering. sigh...) Anyway, this is my first public model upload here, so I'm still learning.
Donations: https://www.buymeacoffee.com/quadpipe
My background is in product design and engineering. I used to work at MIT and DreamWorks. I've been working with AI for a couple of years, but this year is my first deep dive into model creation. That doesn't make me any better at this than anyone else, but it may help explain why I approach my process the way I do. Anyway, I love the work you guys are putting together and thought I could add some value, too.
I constantly notice that no matter how well rendered many photographic images are, they tend to miss something "human". There's a certain essence to being human that's hard to capture by just throwing great photos at a training regime. So, I focused my training on photos that captured emotions AND photographic principles. I hoped the generations would find an easier path to "getting it" when a scene was described in a prompt. Right now, the prompts in the sample images are pretty heavy. The model does pretty well with light negative prompting, but using prompts others have already tested is a great way to validate what we're doing. They don't need to be that heavy, so have fun experimenting and let me know what you learn.
My images used A LOT of steps, but yours don't need to. It was intentional on my part, to see how far I could push before getting terrible burn-in. I had pretty good success in the 30-50 step range, and many of the photos looked great at 512x768 without upscaling. The samples here are mostly upscaled 2x. I used DDIM often but had solid success with Euler and others. It is important to note that the only reason Epochs and Steps aren't listed is that so many models were created and merged together here. Also, there's some secret sauce in that.
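If you want to try these settings from code rather than a UI, here is a minimal sketch. It assumes the Hugging Face diffusers library, which is my assumption and not something the notes above specify. The function names, the default of 40 steps (just the middle of the 30-50 range), and the checkpoint path are all hypothetical. One detail worth knowing: diffusers counts skipped layers, so "Clip Skip: 2" from the usage tips corresponds to `clip_skip=1` there.

```python
def build_generation_kwargs(prompt, negative_prompt="", steps=40,
                            width=512, height=768, a1111_clip_skip=2):
    """Collect the settings mentioned above as keyword arguments
    for a diffusers StableDiffusionPipeline call (hypothetical helper)."""
    kwargs = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "num_inference_steps": steps,  # 30-50 worked well per the notes
        "width": width,
        "height": height,
    }
    # UI-style "Clip Skip: 2" means "use the second-to-last CLIP layer".
    # diffusers instead counts how many layers to skip, so subtract one.
    skipped = a1111_clip_skip - 1
    if skipped > 0:
        kwargs["clip_skip"] = skipped
    return kwargs


def generate(checkpoint_path, prompt, negative_prompt=""):
    """Load a single-file SD1.5 checkpoint and render one image with DDIM.
    checkpoint_path is a placeholder for wherever you saved the model."""
    # Imported lazily so the helper above works without torch/diffusers.
    import torch
    from diffusers import DDIMScheduler, StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        checkpoint_path, torch_dtype=torch.float16
    )
    # Swap the default scheduler for DDIM, reusing the existing config.
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to("cuda")
    return pipe(**build_generation_kwargs(prompt, negative_prompt)).images[0]
```

Calling `generate("path/to/model.safetensors", "portrait of a person in deep thought")` would return a PIL image at 512x768; any 2x upscaling would be a separate step.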
Once I feel like this is a solid 1.0 candidate, I'll try to take what I've learned to an SDXL model - that might be a little while because I'm not sure if I can just upscale what I need or if I have to generate a ton of images again to build the training data.
So, how did I approach this?
I created about 18 models focused on elements of photography that I thought were appropriate. For example, I made a model called "contemplation" that tried to capture photos of people in deep thought. I focused on emotions and the essence of what being human is all about.
After creating my models, I merged them with the SD1.5 base and tested and tweaked - for months. As I went, I would develop new submodels to fix things, based on successful images generated. I was trying to break away from inheriting derivative challenges (still working on that), so I relied only on images generated from other models rather than merging those models in directly. Hands can be a challenge, as they are for everyone, and I'm still working on improving those.
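For anyone new to merging, here is a rough sketch of the most common approach, a weighted-sum merge of two sets of weights. This is NOT the actual recipe used for this model (that part stays secret sauce); it only illustrates the general idea. The function name, the key names, and the `alpha` value are hypothetical, and plain floats stand in for the weight tensors so the example runs without any dependencies.

```python
def weighted_merge(base, other, alpha):
    """Blend two state dicts: (1 - alpha) * base + alpha * other.

    Works on anything that supports * and +, so plain floats here
    and weight tensors in a real checkpoint behave the same way.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    merged = {}
    for key, base_value in base.items():
        if key in other:
            merged[key] = (1.0 - alpha) * base_value + alpha * other[key]
        else:
            # Keys the other model lacks are carried over from the base.
            merged[key] = base_value
    return merged


# Toy stand-ins for the SD1.5 base and one focused submodel.
base = {"layer.weight": 1.0, "layer.bias": 0.0}
contemplation = {"layer.weight": 3.0, "layer.bias": 2.0}

merged = weighted_merge(base, contemplation, alpha=0.25)
print(merged)  # {'layer.weight': 1.5, 'layer.bias': 0.5}
```

A low `alpha` keeps the result close to the base, and repeating the merge with each submodel in turn is one simple way to fold several of them in.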
If you want to see me do more here, I'll do my best. I really appreciate any donations; they mean a lot to me. Thank you. (I'll put this line at the bottom too. It may help my chances a bit.)
Donations: https://www.buymeacoffee.com/quadpipe
These models inspired me, and I occasionally used images generated with them to build my training data:
Analog Diffusion by wavymulder
AbsoluteReality by Lykon
CyberRealistic by Cyberdelia
Analog Madness by CornmeisterNL
A-Zovya Photoreal by Zovya
epiCRealism by epinikion
Juggernaut by KandooAI