EnvyBetterHands LoCon

Name: EnvyBetterHands LoCon
Rating: 5 (11075 reviews)
Author: _Envy_

Updated: Oct 5, 2024

concept

photorealistic hands

Verified: 2 years ago

SafeTensor

Details

Type	LyCORIS
Stats	115,907 1.6m 15.6k
Reviews	Overwhelmingly Positive (10,615)
Published	Apr 25, 2023
Base Model	SD 1.5
Training	Steps: 2,800 Epochs: 28
Hash	AutoV2 BA43B0EFEE

About this version

Restarted training from scratch, because apparently training on vanilla 1.5 is actually better in terms of making models that don't overcook things or change the style very much. This new version is still in need of more training, so it's not quite as effective as the old one, but it does seem to, on average, improve things a bit, and it works across a lot more models and doesn't mess with style at all, so I think this is probably the right direction to go in. I'll play around with prompting a bit and update the main description with advice.

This model is a LoCon. You MUST install the Lycoris extension for it to load.

I'm using Lora Block Weight. I believe you can also use Additional Networks and SD Webui Lycoris.

UPDATE 4/27/2023: I've hit a training plateau so I'm in the process of adding a bunch more images to the dataset, including some more complicated stuff like intertwined fingers. I'm probably going to have to drop the learning rate some more, so things may be slower from here on. I'll keep everyone posted as things progress.

UPDATE: Prompting advice for beta 2:

This is a completely new train on top of vanilla Stable Diffusion 1.5. I did this based on the advice of a fellow enthusiast, and it's surprising how much more compatible it is with different models. It doesn't mess with the style of your model at all as far as I can tell, and it really only affects hands and occasionally arms, leaving everything else untouched.
It seems to work best at a strength of 1, although turning it up higher than that (1.5, 2, etc.) can help on some images at the cost of making others worse. No need to mess with your CFG scale, as it doesn't cause things to overcook at these levels.
Freely mix it with other LoRAs.
I've had best results putting "nice hands, perfect hands" in the positive prompt (increasing the weight makes things worse), and "(extra fingers, deformed hands, polydactyl:1.5)" in the negative prompt. This is on EnvyMix v1 (and probably RevAnimated), but YMMV for other models.
"Bad hands" negative embeddings appear to make it worse, although I haven't tested this extensively.
As usual, this won't work miracles, but I do find that over a large number of images, it does make things generally better on average. Hopefully this will continue to improve with a few more nights of training.
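
The beta 2 advice above can be sketched as a small prompt builder. This is only an illustration: the `<lora:NAME:WEIGHT>` tag form is assumed from the AUTOMATIC1111 web UI convention (some LyCORIS extension versions use a `<lyco:...>` prefix instead), and the file name `EnvyBetterHands` and the subject text are placeholders, so substitute whatever your install actually uses.

```python
# Hypothetical helper: assembles the beta 2 prompt pair described above.
# NETWORK_NAME and the "<lora:...>" prefix are assumptions, not confirmed here.
NETWORK_NAME = "EnvyBetterHands"  # placeholder file name


def build_beta2_prompts(subject, strength=1.0, prefix="lora"):
    """Return (positive, negative) prompt strings for beta 2."""
    tag = f"<{prefix}:{NETWORK_NAME}:{strength:g}>"
    # Hand keywords are left unweighted; the advice says extra weight hurts.
    positive = f"{subject}, nice hands, perfect hands, {tag}"
    negative = "(extra fingers, deformed hands, polydactyl:1.5)"
    return positive, negative


positive, negative = build_beta2_prompts("portrait of a blacksmith holding a hammer")
print(positive)
print(negative)
```

Passing `strength=1.5` or `2` reproduces the "turn it up" experiments, and `prefix="lyco"` switches the tag form if your extension expects it.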

Prompting advice for alpha 3 and beta 1:

Note that this advice is for RevAnimated 1.2. YMMV with other models.
It overcooks things a bit, but you need the strength set to 1.0 for it to really work well. You can work around this by reducing the CFG value to 5 or 6 or so. I've had good luck with enabling the dynamic thresholding extension and setting it to mimic CFG 5, and then I can set my CFG value to 9 or 10 and things come out fine.
I tried using it with another LoRA and got some pretty strange results, so YMMV there as well. Right now I'm just trying to get it to work consistently in a simple use case.
Oddly, I think it's regressed a bit on hands in neutral positions, but it's noticeably better at more complicated interactions, such as holding objects (which is why I have so many pictures of blacksmiths and librarians in the example images).
Keep your prompts simple and it tends to do better.
With RevAnimated, I tend to get 1 or 2 usable images out of every 8, with a bunch of other ones that are pretty close and can probably be fixed with inpainting.
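
As a rough sketch, the two CFG workarounds and the hit rate above could be written down like this. The key names are made up for illustration and don't correspond to any particular UI or API field; only the numbers come from the advice above, and the yield estimate is simple arithmetic, not a guarantee.

```python
# Illustrative presets for alpha 3 / beta 1 on RevAnimated 1.2.
# Key names are hypothetical; only the values come from the advice above.
PRESETS = {
    # Option 1: keep strength at 1.0 and lower CFG (5 or 6) to avoid overcooking.
    "low_cfg": {"network_strength": 1.0, "cfg_scale": 5, "mimic_cfg": None},
    # Option 2: dynamic thresholding mimicking CFG 5, with real CFG at 9 or 10.
    "dynamic_thresholding": {"network_strength": 1.0, "cfg_scale": 9, "mimic_cfg": 5},
}


def images_to_generate(target_usable, usable_per_batch, batch_size=8):
    """Rough number of images to generate for a target count of usable ones."""
    # Ceiling division: round the number of batches up.
    batches = -(-target_usable // usable_per_batch)
    return batches * batch_size


# At 1 to 2 usable images per batch of 8, four keepers take roughly:
print(images_to_generate(4, 2), "to", images_to_generate(4, 1), "images")
```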

Prompting advice for alpha 2:

It's getting stronger now, and it works best around strength 1. Setting it to 1.3 like the previous version will make things look bad.
My negative prompt is still "(extra fingers, deformed hands:1.15), (worst quality, low quality, poor quality, bad quality:1.35)"
I had good luck just putting "nice hands" in the main prompt.
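
The parenthesized groups in the negative prompt use the `(terms:weight)` emphasis syntax from the AUTOMATIC1111 web UI. A minimal sketch, assuming that syntax: the hypothetical `weighted` helper below rebuilds the alpha 2 negative prompt so each weight can be adjusted on its own.

```python
def weighted(terms, weight):
    """Format terms as one '(a, b:weight)' emphasis group."""
    return f"({', '.join(terms)}:{weight:g})"


negative = ", ".join([
    weighted(["extra fingers", "deformed hands"], 1.15),
    weighted(["worst quality", "low quality", "poor quality", "bad quality"], 1.35),
])
print(negative)
```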

Prompting advice for alpha 1:

Your prompt should contain these words: "beautiful hands, perfect hands, fingernails". I've had the best luck with them towards the middle, and at no emphasis.
The alpha 1 LoCon seems to work best at a strength of around 1.3 (on RevAnimated 1.1, where I'm testing it right now -- YMMV for other models).
Don't use negative embeddings for improving hands. When I removed badhandv4 from my negative prompt, things improved noticeably. You may want to try without any negative embeddings at all. I haven't used them for a while now.
My negative prompt is: "(extra fingers, deformed hands:1.15), (worst quality, low quality, poor quality, bad quality:1.35)", which I arrived at through a lot of experimentation adjusting strengths and terms one at a time. It should work decently well.
This all gives me hope that there's a real shot at solving hands on SD 1.5. Even with good prompting, I'm generally not getting perfect results, but things are close. I'll consider this done when it creates well-formed hands without having to add anything to the positive or negative prompt.
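
Placing the alpha 1 keywords "towards the middle, and at no emphasis" can be sketched as below. `insert_mid_prompt` is a hypothetical helper, and it assumes a simple comma-separated prompt; it would split weighted groups such as `(a, b:1.2)` incorrectly, so treat it as an illustration only.

```python
HAND_TERMS = ["beautiful hands", "perfect hands", "fingernails"]


def insert_mid_prompt(prompt, extra_terms=HAND_TERMS):
    """Insert unweighted terms near the middle of a comma-separated prompt."""
    terms = [t.strip() for t in prompt.split(",") if t.strip()]
    mid = len(terms) // 2
    return ", ".join(terms[:mid] + list(extra_terms) + terms[mid:])


print(insert_mid_prompt("a librarian, holding a book, warm lighting, detailed"))
```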

Now back to your regularly scheduled readme...

I'm testing the theory that maybe the reason MidJourney's hands are so much better now is that they just took the time to specifically train a network on a high quality set of pictures of hands, and literally nobody else has actually tried. This LoRA definitely isn't at MidJourney levels yet, but I've been training it overnight for several nights now and adding to the dataset where it appears deficient, and quality seems to be steadily improving. As such, I'm going to post this now so people can start using it. Consider this an early alpha -- I'll only stop updating once it stops getting better.

Example images are cherry-picked. Please don't expect this model to make all of your hand generations better. It may even make some of them worse, so you should evaluate its usefulness on a large number of images and not just one. If it works for you like it does for me, then a lot of your results should be the same or better quality (some will just be bad in different ways).
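
One way to do that kind of evaluation is to rate the same prompt and seed with and without the LoCon and tally the outcomes. The sketch below assumes you score each image yourself (for example 1 to 5 for hand quality); the scores shown are placeholder values, not real results.

```python
def summarize(pairs):
    """Tally paired ratings given as (score_without, score_with) tuples."""
    tally = {"better": 0, "same": 0, "worse": 0}
    for without, with_network in pairs:
        if with_network > without:
            tally["better"] += 1
        elif with_network < without:
            tally["worse"] += 1
        else:
            tally["same"] += 1
    return tally


# Placeholder ratings for four seeds; replace with your own scores.
example_pairs = [(2, 3), (3, 3), (4, 1), (1, 5)]
print(summarize(example_pairs))
```

The more pairs you rate, the less a single lucky or unlucky seed skews the picture.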