I've learned a lot from the community here, and since a lot of people ask about this, I finally decided to write down some notes. I hope they're helpful, and I'll keep updating them as I have new findings.
I don't have much time after work to reply to everyone in the comments, but I will try to answer comments under this article, so feel free to correct me here or share your findings and results.
Basics:
The easiest way to generate animations is to use the AnimateDiff extension in AUTOMATIC1111; this is how I generate all of my animations.
The latest version of the extension writes the generation information into the video metadata. Download a recent video and you can see the prompts and parameters there; Civitai should be able to build a tool to extract this information.
The GitHub page for the extension, https://github.com/continue-revolution/sd-webui-animatediff,
has plenty of information; be sure to follow the setup instructions there.
Using the extension is super easy, just like generating images: enable the extension, press "Generate forever", and let your video card do the job.
I know ComfyUI has much more powerful workflows, but this one beats Comfy in simplicity and stability, and I basically only do text-to-video, so it suits me well. If you can generate a good, stable image with auto1111, you should be able to get video of the same quality once you enable the extension.
I only use SD 1.5 models so far; I'll try SDXL if a better motion model for XL comes out. The current 1.5 results are already pretty good.
AnimateDiff parameters and explanations:
model: v3_sd15_mm.ckpt ---> You can use the v2 model, but v3 gives sharper images, and v2 does well with unstable base models. There are more motion models out there if you want to experiment.
video_length: 24 ---> Don't make it too long, or you'll have a higher chance of getting deformed bodies. Set it longer when you want to make a morph animation and design transitions within the clip.
fps: 16 ---> Keep the default.
loop_number: 0,
closed_loop: R+P ---> For me, this works well for making looped animations.
batch_size: 16,
stride: 1,
overlap: 4,
interp: FILM, interp_x: 4 ---> Makes your animation longer; here it's 4 times longer than the original clip, which is better than most AI animation solutions.
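As a rough sanity check on clip length, here is a small Python arithmetic sketch (my own illustration, assuming the interpolated frames are written out at the same fps, which is how the clip ends up about 4 times longer; if your output keeps the higher frame rate instead, the clip stays the same length but plays smoother):

video_length = 24   # frames generated by AnimateDiff
fps = 16
interp_x = 4        # FILM multiplies the frame count by 4
base_seconds = video_length / fps          # 1.5 s before interpolation
total_frames = video_length * interp_x     # 96 frames after FILM
final_seconds = total_frames / fps         # 6.0 s if written back at 16 fps
print(base_seconds, total_frames, final_seconds)   # 1.5 96 6.0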
Other general parameters:
Sample generation parameters.
Nothing special here. Keeping the CFG low seems necessary when using the v3 motion model; 5.5 seems to be a good number:
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 5.5, Seed: 1150333683, Size: 512x768, Model hash: b2eb92023a, Model: henmixReal_v5c, VAE hash: 735e4c3a44, VAE: vae-ft-mse-840000-ema-pruned.safetensors, Denoising strength: 0.4, Clip skip: 2
Hires fix:
Hires upscale: 1.75, Hires steps: 10, Hires upscaler: 4x-UltraSharp,
Picture prompts:
Like I said, you might be able to find the prompts in the video metadata if the video was generated with the latest extension; download the video and take a look.
Some things I found useful:
Keep prompts and negative prompts as simple as possible.
Using a pose in the prompt helps stabilize the animation: things like standing, walking, kneeling, sitting. Big movements (dance, fight) give poor results in most cases.
Using slider LoRAs to fix certain elements helps stabilize the animation as well.
The negative prompt is still important.
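A minimal illustration of these tips might look like this (the exact tags are just an example, not a recipe, so adapt them to your model; the slider LoRA is the backlight one listed in the Lora section below):

prompt: (best quality),1girl,standing,simple background,<lora:backlight_slider_v10:1>
negative prompt: (worst quality),(low quality),deformed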
Prompt travel:
You will need it when making morph animations or long animation clips with prompt keyframes. Take a look at this great article:
https://civitai.com/articles/2967/how-i-make-morph-animation-workflow
It's a great tool. Some key points:
Make sure you see in the console log that prompt travel is enabled.
Disable the Dynamic Prompts extension and be careful about conflicts with other extensions; extensions that modify prompts or seeds will affect the result.
Before making a long video, make a simple one: test with open eyes and closed eyes to see if prompt travel is working (a minimal test is sketched below).
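A minimal open/closed-eyes test could look like this (the first line is the shared head prompt, then each keyframe line starts with a frame index, the same structure as the Gacha example further down; the tags themselves are just an illustration):

(best quality),1girl,portrait,looking at viewer,
0: (closed eyes),
8: (open eyes),
16: (closed eyes),

If prompt travel is working, the eyes should blink across the clip; if every frame looks the same, check the console log and your extension conflicts.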
Base model:
The base model is very important; choose ones that generate less noise, produce sharp images, and handle motion well.
I only use SD 1.5 so far. Here are some base models I've used. I tried some LCM models, but quality is an issue. Feel free to try your favorite model and share it with us:
henmix https://civitai.com/models/20282/henmixreal both the 4 and 5 versions are great
picx_real https://civitai.com/models/241415?modelVersionId=272376
dreamshaper https://civitai.com/models/112902?modelVersionId=251662
epicrealism https://civitai.com/models/25694/epicrealism?modelVersionId=143906 absolutely wonderful for realism clips.
Lora
Yes, real-people LoRAs work well with AnimateDiff, and so do most other LoRAs. But if a LoRA generates a lot of noise and variation, the animation can get messed up. You might need to tweak the weight a bit for a better result; a weight greater than 1 works with some LoRAs if you don't see their effect in the animation.
Some useful slider LoRAs for stabilization; you can find more on Civitai:
<lora:backlight_slider_v10:1>
<lora:people_count_slider_v1:4>
<lora:color_temperature_slider_v1:-2>
Some LoRAs can generate good character movement. If you find more of these, it would be appreciated if you share them with us. Some I've used:
<lora:cowgirl-2.0:1>
<lora:ass against glass:1>
<lora:onoff4:1.2>
Gacha
The workflow uses the wildcards extension to "gacha" (roll random prompt variations) and allows some runtime modification without restarting automatic1111, which is one of the reasons I stick with auto1111.
A simple sample prompt looks like this:
(best quality),half body shot,<lora:backlight_slider_v10:1>,__location__,
0: 1 ((monkey-__animal__-human creature)):1.5,(closed eyes),wedding dress,covered with fur,furry,smoke,mist,
8: 1 ((monkey-__animal__-human creature)):1.5,(open eyes),wedding dress,covered with fur,furry,smoke,mist,
16: 1 ((monkey-__animal__-human creature)):1.5,(open eyes),wedding dress,covered with fur,furry,fire,flame,
24: 1girl,fantasy movie,claws,__random_exp__,__hair-color__ __hair-female__ hair,__color__ eyes,fire,flame,
Picking out the nice ones after my daily work is my favorite entertainment these days. The extension is here:
https://github.com/AUTOMATIC1111/stable-diffusion-webui-wildcards
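For reference, each __name__ token in the prompt maps to a text file in the extension's wildcards folder, with one option per line, and a random line is substituted on every generation. A made-up location.txt could look like this (the entries are only examples):

forest clearing
ruined temple
neon city street
misty lakeside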
How to get video detail meta info
Right-click the video file and look at Details -> Comments: https://answers.microsoft.com/en-us/windows/forum/all/file-explorer-tagging-editing/411ec45f-09fb-4840-b0fb-5adf8fd02730
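If you prefer a script, here is a small Python sketch (my own illustration, not part of the extension; it assumes ffprobe is installed and that the generation info is stored in a container-level tag such as the comment field) that dumps all metadata tags of a video:

import json, subprocess, sys

# Print every container-level metadata tag of a video file (needs ffprobe on PATH).
# The AnimateDiff generation info usually shows up in one of these tags.
def video_tags(path):
    out = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out).get("format", {}).get("tags", {})

if __name__ == "__main__":
    for key, value in video_tags(sys.argv[1]).items():
        print(key + ":", value)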
My understanding, findings, and problems:
AI is not my area, and my understanding can be very wrong.
The AnimateDiff motion model uses the generated picture as a condition and uses noise to find the next best-fitting latent image in the model. My understanding is that the process is a bit like how snowflakes form, but it still has no knowledge of movement (speed, direction) or physics; walking backward is quite normal in generations.
And sometimes it's interesting to see that the SD model actually understands depth and somehow tries to solve collision problems; sometimes it generates great-looking fluid simulation without understanding physics!
The biggest problems for me are still stability and the general AI image problems (hands); please share your solutions.
References:
Great people have already written great tutorials:
https://civitai.com/articles/2877/how-i-make-animation-via-animatediff-tutorial
https://civitai.com/articles/2967/how-i-make-morph-animation-workflow
https://civitai.com/user/efastcurex/articles
Have fun!