I use a lot of vehicles in what I make, and I generally find the experience pretty frustrating. It's hard to pose things the way you want, very hard to get vehicles to have the shape you want, and almost impossible to keep things consistent from one image to the next. For characters we have a good array of openpose/depth libraries, but for cars the options are very limited.
I thought about using 3D models, but I already suck at drawing in 2D, so I'm definitely going to suck at 3D modeling.
That said, I can do Legos. I could make a Lego model of whatever vehicle/ship I want to render, and keep that around for all the different angles.
So here is my investigation into doing just that.
Disclaimer: I used stud.io since it was a bit faster than trying to make something out of my own limited legos. I tried with my own models, then with official sets once I realized I actually suck at legos too. It's nice in that you can get a model done quickly, you can easily keep it around without losing it or having it take up space, you can easily take pictures, and there are thousands of available models. The biggest flaw is that it's much, much harder to move a model's articulations than if you had the real thing in hand.
Disclaimer 2: This was going to be cars + spaceships, but this is getting long already so I'll do spaceships in a follow-up, sorry.
Settings in common for all images:
Model: RevAnimated, 30 steps, Sampler DPM++ 2M Karras, 768x512, CFG 7
Hi-Res Fix Model: 4x_foolhardy_Remacri, 10 steps, Denoising 0.5, Upscale By 2
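For anyone who would rather script this than click through the UI, here is a rough sketch of the shared settings as a request body. It assumes the AUTOMATIC1111 web UI API (the /sdapi/v1/txt2img endpoint), which the post does not actually confirm, and the field names are from memory, so check them against the /docs page of your own install. The prompt is just a placeholder, and the checkpoint (RevAnimated) is assumed to be already selected in the UI.

```python
def build_txt2img_payload(prompt, seed):
    """Return a request body mirroring the shared settings listed above.

    Assumption: field names follow the AUTOMATIC1111 web UI API.
    """
    return {
        "prompt": prompt,
        "seed": seed,
        # Base pass
        "steps": 30,
        "sampler_name": "DPM++ 2M Karras",
        "width": 768,
        "height": 512,
        "cfg_scale": 7,
        # Hi-Res Fix pass
        "enable_hr": True,
        "hr_upscaler": "4x_foolhardy_Remacri",
        "hr_second_pass_steps": 10,
        "denoising_strength": 0.5,
        "hr_scale": 2,
    }


# Placeholder prompt, only here to show the call.
payload = build_txt2img_payload("red sports car, night city", seed=1)
print(payload["width"] * payload["hr_scale"], "x", payload["height"] * payload["hr_scale"])
```

Keeping the settings in one function makes it easy to change only the seed or prompt between runs, which is what the comparisons below rely on.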
1 - Depth & HED
I used Toyota GR Supra 76901 for this.
Here are the images from the render I'll use: https://imgur.com/a/JFVyVO4
I tried only with Depth and HED models, either separately or combined. Here are the results:
Depth only, seed 1: https://imgur.com/a/9IQI5nY
Depth only, seed 2: https://imgur.com/a/ASCyUZG
HED only, seed 1: https://imgur.com/a/wRnscum
HED only, seed 2: https://imgur.com/a/ZWQbivG
Depth + HED, seed 1: https://imgur.com/a/EWzOBd7
Depth + HED, seed 2: https://imgur.com/a/zY97lVK
Some observations up to here:
All models are able to correctly follow the car shape and orientation, which is already useful
Depth makes cleaner images than HED, but has more trouble being consistent
Depth + HED looks like a good compromise, with consistent clean images
Color is an issue for all models, with random parts of the image being different colors every time
Front part is definitely difficult to keep consistent, with car lights being different on almost every seed (to be fair the lego set is very undetailed there)
Studs and plate lines definitely impact the final result. While it can be used to our advantage (unique stud placement helps identify it as a specific car), it's problematic as those studs tend to be rendered as something different every time. It will definitely need fine tuning depending on your Lego set, or even just covering all the studs
All images above were generated with ControlNet weight 0.8 and Ending Control Step 0.5. Since the shape of the model is very strong in previous images, here are more tries with Ending Control Step 0.3, same seed as "seed 1" above
Only depth, ending control step 0.3: https://imgur.com/a/R11ugGj
Only HED, ending control step 0.3: https://imgur.com/a/0sSm9V8
Depth + HED, both until 0.3: https://imgur.com/a/HVpER2V
Depth until 0.5, HED until 0.3: https://imgur.com/a/55y1FTH
Depth until 0.3, HED until 0.5: https://imgur.com/a/IEERTZo
Single models are barely useful here, giving a new car on almost every picture and having trouble following the car shape. Dual models fare better when at least one of them is at 0.5.
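To make the five combinations above easier to reproduce, here is a minimal sketch that writes them out as ControlNet units and estimates how many of the 30 sampling steps each one still controls. It assumes the sd-webui-controlnet extension's API layout (units passed under alwayson_scripts), the module and model names are placeholders that depend on your install, and the step count is only an approximation since the extension may round differently. Each unit would also need the render image attached, which is left out here.

```python
TOTAL_STEPS = 30      # from the shared settings
CONTROL_WEIGHT = 0.8  # ControlNet weight used for all images

# (module, ending control step) pairs for each comparison in the list above.
COMBOS = {
    "depth only": [("depth", 0.3)],
    "hed only": [("hed", 0.3)],
    "depth + hed, both 0.3": [("depth", 0.3), ("hed", 0.3)],
    "depth 0.5, hed 0.3": [("depth", 0.5), ("hed", 0.3)],
    "depth 0.3, hed 0.5": [("depth", 0.3), ("hed", 0.5)],
}


def controlnet_unit(module, end_step, weight=CONTROL_WEIGHT):
    """One ControlNet unit. Module and model names are placeholders (assumption)."""
    return {
        "module": module,
        "model": f"<your {module} model>",
        "weight": weight,
        "guidance_start": 0.0,
        "guidance_end": end_step,
    }


def controlled_steps(end_step, total_steps=TOTAL_STEPS):
    """Approximate number of sampling steps that still receive control."""
    return round(end_step * total_steps)


def controlnet_args(combo_name):
    """Wrap the units of one combination in the assumed extension layout."""
    units = [controlnet_unit(module, end) for module, end in COMBOS[combo_name]]
    return {"alwayson_scripts": {"controlnet": {"args": units}}}


for name, pairs in COMBOS.items():
    summary = ", ".join(
        f"{module} for ~{controlled_steps(end)}/{TOTAL_STEPS} steps"
        for module, end in pairs
    )
    print(f"{name}: {summary}")
```

Seen this way, an ending step of 0.3 means the control only shapes roughly the first 9 steps, against roughly 15 at 0.5, which fits with the car shape drifting more in the 0.3 runs.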
2 - Depth & Reference
Since results are better but we still get variations in car lights and color, I wanted to try using a reference image to see if it changed things. For these I will just use Depth, since HED tends to overpower the results otherwise. I also had to add "from above" and "from side" to the prompt for some images so they would not completely break.
Here is the image used: https://imgur.com/kcGa652
Seed1: https://imgur.com/a/TCMuI95
Seed2: https://imgur.com/a/OtfNPnw
This completely breaks the angles that don't match the reference picture, but for all the "front" angles the results are pretty nice. You get more or less similarly shaped lights and hoods, even though the color is still a bit random.
I think if you could get a front and back picture of the car you want to imitate and use them accordingly, this could go a long way
3 - Depth & LORA
Since reference image was working okay on front angles, I was wondering how LORA would fare. I tried 3: one similar to the original model, one pretty different, and one for generic car making. Again only Depth, and "from above" "from side" when relevant
First is this LORA for the Mitsubishi 3000GT: https://civitai.com/models/40497/mitsubishi-3000gt-vr-4
Prompt: <lora:3000GTVR4:0.8> red 3000GTVR4, night city
Seed 1: https://imgur.com/a/V21NvwS
Seed 2: https://imgur.com/a/bUS95Xx
This is getting pretty good, there is some consistency between images, and between seeds
Color is still all over the place
Next is this LORA for the Toyota AE86: https://civitai.com/models/11464/toyota-ae86
Prompt: <lora:AE86:0.8> car, mountain road
Seed1: https://imgur.com/a/gvPaoAY
Seed2: https://imgur.com/a/wKfeewH
This is... less good. The LORA has a lot of trouble following the Control, and some orientations are impossible to get.
I think in that case the options are:
use another Lego model that better matches your goal
tweak ControlNet settings until it works
maybe see if a more flexible LORA would help
Finally, zeekars / badass cars: https://civitai.com/models/54798/badass-cars
Prompt: <lora:zeekars:0.8> red sport zeekars, night city
I pushed Control Ending Step to 1 since the LORA is pretty strong and at 0.5 it would just do its own thing
Seed1: https://imgur.com/a/Qo7xMPV
Seed2: https://imgur.com/a/fuLlYJo
This one is a bit less interesting. Images are good by themselves, but there is almost no consistency between poses or between seeds
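The three LORA runs above can be summarized in a small sketch that builds each prompt with the `<lora:name:weight>` tag and records the Ending Control Step that goes with it. The helper name, the optional view tag and where it is placed in the prompt are my own assumptions; the post only says "from above" / "from side" were added when relevant. The 0.5 ending step for the first two runs is also assumed from the earlier default, since only the zeekars run states its value.

```python
def lora_prompt(lora_name, text, weight=0.8, view=None):
    """Build a prompt with a <lora:name:weight> tag.

    `view` is an optional tag such as "from side" (placement is an assumption).
    """
    prompt = f"<lora:{lora_name}:{weight}> {text}"
    if view:
        prompt += f", {view}"
    return prompt


# Prompts are copied from the post; guidance_end of 0.5 is assumed for the
# first two runs, and 1.0 is the value stated for zeekars.
RUNS = [
    {"lora": "3000GTVR4", "text": "red 3000GTVR4, night city", "guidance_end": 0.5},
    {"lora": "AE86", "text": "car, mountain road", "guidance_end": 0.5},
    {"lora": "zeekars", "text": "red sport zeekars, night city", "guidance_end": 1.0},
]

for run in RUNS:
    print(lora_prompt(run["lora"], run["text"]), "| ending control step:", run["guidance_end"])
```

For an angle that needs a hint, something like `lora_prompt("AE86", "car, mountain road", view="from side")` would append the tag at the end of the prompt.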
4 - Conclusion
I think this can be interesting, and can help generate exactly the cars you need, in the poses you need.
It's not perfect, however, and will still need fine-tuning per image, Lego models that match your intended use case, picking the right poses (pose #4 was almost impossible to get right, but could be done by using #5 and flipping it) and, as always, some gacha.
Please tell me if there are more comparisons that you want to see, or if there is something I can improve in the process.
There was supposed to be a Part 2 about generating spaceships, but the post is long and the hour is late so I will do that in a follow up post m(_ _)m
Cheers!