
Big Love Photo1: High-Quality Photorealism on a Shoestring


"And carv'd in iv'ry such a maid, so fair, as Nature could not with his art compare... It caught the carver with his own deceit: He knows 'tis madness, yet he must adore, and still the more he knows it, loves the more..." Ovid, Metamorphoses


Introduction


Hi there! I’m Subtle, the one behind Big Love Photo1, and this project feels like I pulled off a magic trick without a hat. Please excuse my jolly tone. It is only intended to keep you from falling asleep mid-article. (But feel free to skip to the New Features section if you are in a hurry.)

Big Love XL was stitched together purely with merging techniques like a beautiful & sexy Frankenstein woman. To fill in the gaps, I cranked out a pile of loras, trying to patch up missing or weak concepts. But let’s be real, loras can only take you so far and are a duct tape solution.

Photo1 marks my big leap into finetuning, where I got my hands dirty with infusing new concepts directly into Big Love. It’s like finally learning to cook a proper meal instead of just reheating leftovers. Nerve-wracking but oh-so-worth it.

While bigASP 2.5 was throwing around $16,000, 13 million images, and several days on 32 GPUs (for a total of 2560 GB VRAM, holy cow!), I set out to see what I could do with just $100, plus countless hours and sleepless nights. Turns out, quite a bit. With a single 4070-S GPU and 12 GB VRAM, 2150 hand-picked images, 28k steps and 27 hours of final training, Big Love Photo1 produces images you’d swear came from a camera (OK, maybe if you squint a little). This is my write-up of proving that you don’t need a big budget to make something special.


Why SDXL?

I prefer SDXL for Big Love because it’s a powerhouse that doesn’t demand a supercomputer. It runs smoothly on consumer hardware (just 4 GB VRAM at fp8) and with DMD2, it spits out images in seconds, faster than I can decide what to have for lunch. After two years, SDXL is the most trained image generation model out there, and it shows: it’s wildly creative and arguably the king of photorealism.

Sure, it has quirks, like the occasional wonky anatomy or text that often looks like output from an alien printer, but its versatility and quality make it a go-to for image generation projects.


Image Selection & Tagging


Having trained a bunch of LoRAs this year, I had a treasure trove of training material ready to go. I picked only the fully photorealistic sets, ruthlessly cutting out half the images that didn’t make the grade, trimming each set down to 70-130 images. I also crafted new sets for concepts like real life, skin texture, voluminous hair (big thanks to Kai for the assist!), 360-degree photos, and, yes, vaginal sex to balance out the anal set, finally making SDXL distinguish the two.

Tagging Tricks


I’ve never been a fan of natural language prompts. Like writing a novel when you only need a grocery list. They’re tedious to edit, make Big Love get too artsy, and eat up SDXL’s precious 75-token limit with fluff. Rumor has it that tagging with descriptions of several words separated by commas is the crème de la crème of SDXL tagging.

My tagging style, though, was a bit booru-heavy, with one or two words per tag mostly. I apply a long blacklist of useless booru tags like photorealistic, solo, nose, or jewelry, because isn't a nose always supposed to be a part of an enjoyable face? Next time, I might go for longer, more descriptive tags to see if they add some extra punch.
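A blacklist like this can be applied with a few lines of Python. This is only a minimal sketch: the four blacklist entries are the examples named above (the real list is much longer), and the function name `clean_caption` and the sample caption are made up for illustration.

```python
# Hypothetical blacklist; only the four example tags from the article.
BLACKLIST = {"photorealistic", "solo", "nose", "jewelry"}


def clean_caption(caption: str, blacklist: set[str] = BLACKLIST) -> str:
    """Drop blacklisted and duplicate tags from a comma-separated caption."""
    kept: list[str] = []
    seen: set[str] = set()
    for raw in caption.split(","):
        tag = raw.strip()
        key = tag.lower()
        # Skip empty fragments, blacklisted tags and repeats.
        if not tag or key in blacklist or key in seen:
            continue
        seen.add(key)
        kept.append(tag)
    return ", ".join(kept)


print(clean_caption("1girl, solo, smile, nose, long hair, jewelry, smile"))
```

Running a pass like this over the auto-tagger output before training keeps the useless tags from eating into the token budget.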

My older training sets were already tagged with tools like WD14, WD v3 Large, and Joytag, so I kept the tradition alive. I also manually added tags to highlight specific properties of each set, making them easier to prompt. One experiment was using JoyCaption Beta1 to auto-tag skin tones and lighting styles (highkey, midkey, lowkey), but it was like asking a toddler to push a button. Only up to 50% accurate. I had to roll up my sleeves and fix the rest manually, but it still saved me from tagging apathy.

Then there’s coyotte, of Lustify fame, who suggested enabling tag shuffling. I jumped on that advice like it was a free meal. With my tags often busting past the 75-token limit, shuffling brought the late ones to the party instead of leaving them buried. It’s supposed to make training more robust, too, like giving the model a well-rounded mental workout. To keep things under control, I made sure that the first few essential tags for each set stayed put. No shuffling of the VIPs to the back of the line!
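To make the idea concrete, here is a rough sketch of tag shuffling with a protected head. It is not the trainer's actual code, just an illustration; `shuffle_tags`, `keep_first` and the sample caption are names and values I made up for it.

```python
import random


def shuffle_tags(caption: str, keep_first: int = 2, seed=None) -> str:
    """Shuffle comma-separated tags but leave the first `keep_first` in place."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    head, tail = tags[:keep_first], tags[keep_first:]
    # Only the tail is shuffled, so the essential tags stay at the front.
    random.Random(seed).shuffle(tail)
    return ", ".join(head + tail)


caption = "1girl, portrait, smile, long hair, soft light, daylight"
for epoch in range(3):
    # A different seed per epoch gives a different tail order each time.
    print(shuffle_tags(caption, keep_first=2, seed=epoch))
```

Because the tail order changes every time, tags that would otherwise sit past the token limit get their turn near the front.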


Twisted Tagging Troubles

After a full finetune, three nasty concepts just wouldn’t behave. Unlike Big Love XL4, which churned out vaginal and anal sex images 50:50 no matter what I typed, Photo1 developed a bit of a bias toward anal. Fisting was even trickier, landing right only about 25% of the time. Then again, XL4 manages only a meager 1%.

After some head-scratching, I discovered that tossing in extra tags during prompting could nudge things along. For instance, adding spread legs to vaginal sex (as well as vaginal fisting) prompts bumped the success rate to 75%, while doggystyle vaginal sex needed sex from behind to play nice (More details below on booster tags!).

Feeling clever, I thought stripping those extra tags from training would make the main tags shine on their own. Spoiler: I was dead wrong. No improvement, and I lost the backup tags that were boosting the hit rate.

I ran extra training on these sets to patch things up, which helped a lot, but the booster tags are still needed sometimes. I’m convinced that SDXL can tell the difference between two holes and a finger or hand in them. It’s just haunted by some hideous training from the past. I will tackle this exorcism challenge again in the future, because I’m too stubborn to let the devil win! Until then, please use my booster tags (or my loras) if there is a problem.

Training


I trained with KohyaSS back in the good old SD1.5 days, when I still insisted on selecting shitty training images. This year, when I dove into loras for Big Love, I gave OneTrainer a spin (I think coyotte mentioned it to me). Wow, it was like trading a rusty bike for a sports car. The UI is so intuitive I didn’t need a PhD to navigate it, and the presets are like cheat codes that get you rolling fast. I haven’t looked back since. It’s my new best friend for training, and I’m pretty sure my quality of life improved for it too.

I initially figured training Photo1 would take 72 hours. Long enough to question my life choices. Problem was that the default SDXL finetuning preset of OneTrainer was created with 24 GB VRAM in mind, and I only had 12.

After some late-night tweaks, I cut the estimated training time down to 23 hours. The key was maxing out my 12 GB VRAM without choking the GPU by using the Adafactor optimizer (a fancy way to save VRAM), the Fused Back Pass option (cool, but won't get you backstage), and a batch size of 4 for more speedup. No sluggish RAM swapping, just smooth sailing.
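For a feel of the scale, here is a back-of-the-envelope calculation using the final-run figures from the introduction (2150 images, 28k steps, 27 hours) and the batch size of 4. It assumes every image is seen exactly once per epoch, with no repeats and no images dropped by bucketing, so treat the epoch count as an estimate.

```python
import math

images = 2150         # hand-picked training images
batch_size = 4
total_steps = 28_000  # "28k steps", a rounded figure
train_hours = 27

# Assumption: one pass over all images per epoch, last partial batch included.
steps_per_epoch = math.ceil(images / batch_size)
approx_epochs = total_steps / steps_per_epoch
seconds_per_step = train_hours * 3600 / total_steps

print(f"steps per epoch:  {steps_per_epoch}")
print(f"approx. epochs:   {approx_epochs:.1f}")
print(f"seconds per step: {seconds_per_step:.2f}")
```

Under these assumptions the final run works out to roughly 52 epochs at about 3.5 seconds per step.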

The Quest For the Magic Number


Getting Photo1’s training settings just right was like teaching myself to cook a gourmet meal without burning the kitchen. 17 alpha versions and 6 beta runs later, we (me & McFly) finally cracked it. For the alpha runs, I started small with 350-500 images, which pointed us to an optimal learning rate of 10.5e-06 at 45 epochs. Kind of like dialing in the right focus on an old lens after too many blurry shots.

When I scaled up to four times the images, we had to slash the learning rate in half (dividing by the square root of 4, thanks, math) to roughly 5e-06, just as coyotte, the mastermind behind Lustify, recommended. But I couldn’t just take the genius’s word for it. I had to stumble through an AI minefield myself to believe it.
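The square-root rule of thumb used here fits in one line of Python. This is a sketch of the heuristic as described above, not a law of nature; the function name `scale_lr` is made up.

```python
import math


def scale_lr(base_lr: float, size_factor: float) -> float:
    """Divide the learning rate by the square root of the dataset growth factor."""
    if size_factor <= 0:
        raise ValueError("size_factor must be positive")
    return base_lr / math.sqrt(size_factor)


small_set_lr = 10.5e-6   # rate found on the 350-500 image alpha runs
print(scale_lr(small_set_lr, 4))  # four times the images
```

For a factor of 4 this gives about 5.25e-06, which I rounded to 5e-06.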

If I’d stuck with the higher rate, I could’ve shaved it down to 23 epochs and 10 hours - tempting, like taking the elevator instead of the stairs - but the results wouldn’t have been as tasty. I'm hoping future versions don’t make me relive this trial-and-error marathon!

Later, when training on top of the fully finetuned version of Photo1, a learning rate of 10.5e-06 produced lower quality, despite only training 400 images. So using 5e-06 might generally be a good idea. My new lucky number.

Cooking It Right


Figuring out the perfect epoch for Photo1 was like grilling a steak - undercook it and it’s raw, overcook it and it’s shoe leather. Some folks swear by loss or validation graphs, but I’m old-school: I trust my eyes over smart diagrams any day. So, I put Photo1 through its paces, generating at least 12 images each with 10 different prompts for interesting epochs to get a feel for what’s going on.

The telltale sign of undertraining? Skin so soft it looks like it’s been blurred by a rookie photographer (looking at you, Flux!). You keep training until that baby-smooth skin is pretty much gone. Overtraining, on the other hand, is like overcooking your dinner: Images come out dark and burned, or the fine details start looking degraded and grainy.

I’ve also seen overtraining show as too bright and washed out. But especially if anatomical errors start popping up like uninvited guests at a party, that’s a cue to dial down the learning rate. Down by a factor of 2 is a start.

As I closed in on the sweet spot of epochs, I learned to save every second epoch to avoid missing my big love. Earlier on, saving every 5th or 10th epoch is fine. Then, I compare the results using the same prompts and seeds, eyeballing them until I’m sure I’ve nailed the best one. The one that makes my heart beat faster and wonder why I cannot stop generating.
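The comparison itself is just a grid of checkpoint x prompt x seed. Here is a small sketch of how such a job list could be built. The checkpoint filenames, prompts and seeds are placeholders, and the actual image generation (in whatever UI or script you use) is left out.

```python
from itertools import product


def build_eval_jobs(checkpoints, prompts, seeds):
    """List every (prompt, seed, checkpoint) combination.

    Checkpoints vary fastest, so images sharing a prompt and seed
    end up next to each other for side-by-side comparison.
    """
    return [
        {"checkpoint": c, "prompt": p, "seed": s}
        for p, s, c in product(prompts, seeds, checkpoints)
    ]


# Placeholder values for illustration only.
checkpoints = ["epoch_40.safetensors", "epoch_42.safetensors", "epoch_44.safetensors"]
prompts = [f"test prompt {i}" for i in range(10)]
seeds = list(range(1000, 1012))  # 12 fixed seeds per prompt

jobs = build_eval_jobs(checkpoints, prompts, seeds)
print(len(jobs), "images to generate")
print(jobs[:3])
```

Keeping prompts and seeds fixed means any difference between neighbouring images comes from the epoch alone.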

Finetuning on Top


To improve vaginal sex and fisting prompt adherence, I ran a second round of fine-tuning on top to nudge them into line. As the lowkey tag only reduced brightness a bit instead of spitting out dark images, I squeezed in a new lowkey set. But that threw me a curveball: Instead of delivering moody, dark images (without a black background booster tag), it just made faces look plainer by default. Poof went the charm and sparkle of the beauty and cuteness sets from my first finetuning run. Still, these average faces sometimes added a greater sense of reality.

Annoyed, I yanked the lowkey set, and craving more realness, I whipped up a new faces set, splitting it into average face and remarkable face. Sorting those was like trying to decide which of my old t-shirts is cool enough to keep. Half the time, I was second-guessing myself and swapping images between folders. The new set leaned hard into skin detail, which was a happy bonus, giving Photo1’s faces some extra polish.

Lesson learned: Every finetuning round is like tweaking a recipe. One wrong ingredient, and your defaults can shift, especially with faces. It’s a tightrope walk, but I’m determined to keep those charming mugs on point for the next version!

Final Touch

After all the finetuning sweat, I decided Photo1 needed a final round of "remastering" and black magic tricks. Some stuff that I did with XL4 but improved a bit. These tweaks give Photo1 a certain flair, ironing out that raw training look into something that’s as irresistible as a freshly baked cake that you cannot stop eating. Let’s just say these are my secret sauce ingredients to keep me one step ahead of the AI pack.

But, bloody hell, just when I thought I was done, my inner perfectionist piped up. For certain prompts XL4 was stealing the show with better looks. So I dared to merge Photo1 with XL4 to get the best of both worlds. The ComfyUI DARE nodes caused utter pixel chaos, but coyotte, my Lustify hero, swooped in with the tip to use the DARE option of A1111’s UntitledMerger extension (In the meantime I created my own ComfyUI version of it).

What a game changer! With only 2 DARE parameters it worked right off the bat. After 14 merges I decided on 0.6 and 0.45 as params. The first seems to grab more new concepts with higher values, the second one integrates them better with lower values. At first I thought this weakened the trained concepts, but after some testing I was able to fully reproduce all concepts, especially with the right booster tags. Verve & style were improved too. So finally... I had a winner!
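For the curious, the general DARE idea (Drop And REscale) can be sketched on plain lists of numbers: take the difference between two models, randomly drop part of it, rescale what survives, and add it back. This is a toy illustration of the textbook technique, not the extension's code. The names `drop_rate` and `weight` are my own, and I am not claiming they correspond to the two parameters mentioned above.

```python
import random


def dare_merge(base, tuned, drop_rate=0.5, weight=1.0, seed=0):
    """Toy DARE merge on flat lists of floats.

    Each delta (tuned - base) is dropped with probability `drop_rate`;
    the survivors are rescaled by 1 / (1 - drop_rate) so the expected
    delta stays the same, then added to the base scaled by `weight`.
    """
    if len(base) != len(tuned):
        raise ValueError("base and tuned must have the same length")
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - drop_rate)
    merged = []
    for b, t in zip(base, tuned):
        delta = t - b
        if rng.random() < drop_rate:
            delta = 0.0          # dropped
        else:
            delta *= scale       # kept and rescaled
        merged.append(b + weight * delta)
    return merged


base = [0.0, 1.0, 2.0, 3.0]
tuned = [1.0, 1.5, 4.0, 2.0]
print(dare_merge(base, tuned, drop_rate=0.5, weight=1.0, seed=42))
```

In a real merge the same thing happens per weight tensor of the checkpoint rather than on a short list.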

But let’s get back to the end result of all this fuss: Photo1. It is built on Big Love XL4, a blend of checkpoints like Lustify Endgame & bigASP 2 as well as various loras. XL4 is solid, but Photo1 takes it further with new trained concepts and a photographic quality that makes you want to touch the screen. (Sorry, I’m still too enthusiastic about how well it came together!)

New Features

I poured a lot of care into Photo1, giving it 22 new or improved main concepts. Here’s what I added:

  1. Photography Styles: candid, amateur photo (like my early attempts at photography) and cute, beautiful, pro photo, voluminous hair for images that could grace a magazine cover.

  2. Lighting Techniques: Highkey, midkey, lowkey, and two dozen lighting variations from soft to dramatic light including shadows.

  3. Better Realism: Choosable skin tones, improved skin texture, a real life look, average face & remarkable face.

  4. Perspective: 360 degree photo capabilities

  5. Sex stuff: Adult games (e.g. different ways to fill a hole).

About 66% of the training images were NSFW (only for office presentations after a lot of whiskeys) and 33% were SFW (safe for people with heart problems). The adult content needed more images to nail the details, addressing XL4’s inherited gaps, metaphorically and literally. The result? Photo1 feels polished and versatile, like your pal who looks Instagram-ready even after a 3am pizza run.

Candid, Amateur Photo


Big Love Photo1 generates rather pro looking images by default, so if you are more into homegrown photos, feel free to use candid and amateur photo in your prompts. Also try real life and average face mentioned below.

Cute, Beautiful, Pro Photo, Voluminous Hair



Try these 4 tags to make women and poses cuter, beautify the ladies, give photos a more professional touch and pump up the hair volume. smile boosts cute, makeup boosts beautiful, long hair boosts voluminous hair. They all work best if prompted as portrait or upper body.


Highkey, Midkey, Lowkey


While highkey usually adds a rather bright background, you can boost it by adding white background. lowkey needs moody or black background in the prompt. Otherwise it only reduces brightness a bit. Midkey makes sure that no overexposure occurs and keeps shadows & highlights balanced.
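If you build prompts in a script, booster tags like these can be appended automatically. Below is a small sketch using only pairings mentioned in this section and the previous one. The `BOOSTERS` table and the `add_boosters` function are made up for illustration, and for lowkey I arbitrarily picked black background (moody works too).

```python
# Main tag -> booster tags, taken from the pairings described in the article.
BOOSTERS = {
    "cute": ["smile"],
    "beautiful": ["makeup"],
    "voluminous hair": ["long hair"],
    "highkey": ["white background"],
    "lowkey": ["black background"],
}


def add_boosters(prompt: str, boosters: dict[str, list[str]] = BOOSTERS) -> str:
    """Append the booster tags for every main tag found in the prompt."""
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    present = {t.lower() for t in tags}
    out = list(tags)
    for tag in tags:
        for extra in boosters.get(tag.lower(), []):
            if extra not in present:   # avoid duplicates
                out.append(extra)
                present.add(extra)
    return ", ".join(out)


print(add_boosters("portrait, 1girl, cute, lowkey"))
```

The same table can be extended with any other booster pairing that proves useful.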


Lighting


The main lighting tags are soft light, hard light, daylight and dramatic light. Other tags include flash light, dim light, shadows, studio flash, side light, flat light, colored light, backlight, rim light, natural light, sunlight, frontal flash light, interior light, shadow pattern, blue light, blacklight, ring flash light, spot light, film noir light, light pattern, light rays and Rembrandt light. A lot to experiment with. You can combine multiple of these.

Realism


An online buddy always complains that various SDXL checkpoints produce skin that is too blue or yellow. So, to keep him from ranting quite so much, I thought I'd add these tags: neutral skin tone, pink skin tone, pale skin tone. But as a fan of warmer skin I could not resist also sneaking in the tags yellow skin tone and orange skin tone.

As Photo1 includes a full set of detailed skin, the skin texture tag produces more pronounced skin details. If you add little else to the prompt, it tends to zoom in on the skin. Additional tags are wet skin and goosebumps for even more texture. This set improves Photo1's general look of skin whether you prompt for skin texture or not.



The real life tag is trained on snapshots that look like the photos we take every day with our phone. Nothing special but can add more authenticity when used. Add candid to boost.



Photo1 mostly outputs faces with a model look. If you want more realistic everyday faces, try the average face prompt. portrait and upper body are booster tags. To notch it up a bit (if cute & beautiful are not enough) add remarkable face instead. Also try amateur photo.

360 Degree Photo


These types of photos are interesting because of the strange perspective they provide. Photo1 may sometimes break the laws of physics in these images, but that makes them even more thrilling. Try adding 360 degree photo to existing prompts for fun. The booster tag is scenery. Add perspective and extreme fisheye for more distortions. More training on these in a future version.


Anal vs Vaginal



Big Love XL tends to go 50:50 no matter what you prompt or, depending on the prompt, does only one or the other. Photo1 often does only whichever is prompted. The default hetero sex prompt is: 1girl, 1boy, anal/vaginal sex, penis.
Photo1 also knows these positions: missionary position, (reverse) cowgirl position, doggystyle, spooning, pronebone position, straddling and (reverse) suspended congress. Anal-only: bridge position, piledriver position. Vaginal-only: double vaginal. Double penetration was also trained.

Depending on the position, booster tags are spread legs, sex from behind, girl on top, ass grab and on side or from side. These tags produce variations of the above sex positions: legs up, standing, top-down bottom-up, straddling, pov, bent over.

There are still prompts that only produce one or the other. More training will thus be added in future versions.

Deepthroat & Fisting


XL4 does next to no fisting and little deepthroat. Photo1 does deepthroat reliably but is still a bit hesitant with fisting, though much better. Tags: deepthroat, fully swallowed, half swallowed, chipmunking, gagging. And: vaginal fisting, anal fisting, full insertion, half insertion, finger insertion, elbow insertion, double fisting.

Boost deepthroat with oral and fellatio. Booster tags for fisting are: spread legs (vaginal), anus, bent over, from behind (anal).

Cum, Anal Gape & Spread Pussy



XL4 already does spread pussy, but Photo1 provides more variation and details. Anal gaping was next to impossible with XL4. With Photo1 you can use these tags: anal gape, big gape, medium gape, small gape. Boost with gaping, spread anus, spread ass or ass grab.

Cum in Photo1 is more realistic and more versatile: cum, facial, cum on body, cum in mouth, cum on tongue, cum on breasts, cum in ass, cum on stomach, cum on pussy, cum licking, cum on penis, cum overflow.

The more of these sex concepts are prompted at the same time, the more Photo1 struggles. So better not to overdo it. It will be patched up in the next version. But this would be even worse with multiple loras. Photo1 quality is usually better than using XL4 with Subtle Pose loras, as the skin is not softened by a lora.

Future Plans


I’m already sketching out Big Love Photo2, aiming to build on this foundation. I plan to add dozens of new concepts, but also improve old concepts. Let me know if you miss anything!

My goal is to make Big Love a top choice for spicy AI imagery while giving SDXL a nudge, so it doesn’t stall out like a forgotten new year’s resolution.

Conclusion


Big Love Photo1 is my proof that passion and a tight budget can go a long way. It’s like crafting a gem with a chisel, and I’m thrilled to share it with you.

Photo1 isn’t perfect. It still has a few rough edges. Some prompts need a tweak here or there to behave. But I lost a lot of sweat picking concepts to patch up the shaky spots in the Big Love XL versions and plugged in a bunch of fresh features to play with. It still needs a bit of future tuning, but it’s already very nice for some fun rides!

Try Photo1, let me know what you think (without wishing that my graphics card takes a shit, like one guy did), and let’s keep pushing SDXL forward. Photo1 is my way of saying: “Great AI models don’t need a fortune, just a bit of courage and a lot of hustle.”

P.S. A huge thanks to my assistant Marty who powered this project like a lightning bolt to the clock tower and made sure we didn’t end up in 1955 with our DeLovean.

Another big hug for coyotte, my finetuning guru, who’s always ready to lend an ear and share tips that feel like revelations.
