Video test

Keep in mind that this is my first video test; most of this was more of a learning process for me than a real guide.

I first tried several video extensions (mov2mov, animate-clip, TemporalKit), but I couldn't get any of them working with ADetailer/MultiDiffusion/ControlNet. So I just ended up processing every single frame individually, which is NOT the best way to do this.

Here is what I did :

  1. Extracted frames using VLC, with a frame ratio chosen to get 20 fps (932 images for 46 sec)

  2. First batch generation with a model that produces basic colors to flatten the image, MultiDiffusion to upscale, and ControlNet (openpose + softedge), to get a nice static background

  3. Second batch generation on the previous result, this time with ADetailer for body + face, using the openpose ControlNet for both

  4. Used Flowframes to interpolate the frames for a 120 fps look
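
The frame-rate arithmetic behind these steps can be sketched in a few lines of Python. This is only an illustration: the helper names are made up for this example, and the 60 fps source rate is an assumption (the original video's frame rate is not stated here).

```python
def duration_of(frame_count: int, fps: float) -> float:
    """Seconds of video covered by frame_count frames played at fps."""
    return frame_count / fps


def keep_ratio(source_fps: int, target_fps: int) -> int:
    """Keep one frame out of every N to go from source_fps down to target_fps."""
    if source_fps % target_fps != 0:
        raise ValueError("target fps must divide the source fps evenly")
    return source_fps // target_fps


def interpolation_factor(base_fps: float, target_fps: float) -> float:
    """How many output frames the interpolator must produce per input frame."""
    return target_fps / base_fps


# 932 extracted frames at 20 fps cover a bit more than 46 seconds.
print(duration_of(932, 20))

# ASSUMPTION: a 60 fps source, so one frame out of every 3 is kept.
print(keep_ratio(60, 20))

# Going from 20 fps to a 120 fps look means a 6x interpolation.
print(interpolation_factor(20, 120))
```

The useful number for planning is the frame count: every extracted frame goes through both generation passes, so halving the extraction fps roughly halves the generation time and leaves more work to the interpolation step.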

All programs used :

  • Avidemux : cropping the video, setting the target fps, and re-encoding the frames into a video

  • Flowframes : interpolation of the final video

  • GIMP : retouching of bad frames (had to manually retouch around 40 frames, removing background faces, etc.)

All A1111 extensions used :

  • ControlNet

  • Tiled Diffusion

  • ADetailer

Models used : flat2DAnimerge_v10 (first pass) and camelliamix_v3 (2nd pass)

First pass generation :

1girl,dancing,cat ear,white tee-shirt,long blond hair,barefeet,<lora:cartoony:0.3>,masterpiece,8k resolution,HDR,
Negative prompt: (worst quality, low quality:1.2),verybadimagenegative_v1.3,nsfw,lowres,bad hands,bad anatomy,watermark,badhandv4,bad-hands-5,
Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3500173058, Size: 808x992, Model hash: 2c3bbd47cb, Model: flat2DAnimerge_v10, VAE hash: df3c506e51, VAE: kl-f8-anime2.ckpt, Denoising strength: 0.5, Clip skip: 2, Mask blur: 4, ControlNet 0: "Module: openpose_full, Model: control_v11p_sd15_openpose [cab727d4], Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, Processor Res: 512, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: Balanced", ControlNet 1: "Module: tile_resample, Model: control_v11f1e_sd15_tile [a371b31b], Weight: 1, Resize Mode: Crop and Resize, Low Vram: False, Threshold A: 1, Guidance Start: 0, Guidance End: 1, Pixel Perfect: True, Control Mode: Balanced", Lora hashes: "cartoony: 9d01a9eac50f", TI hashes: "verybadimagenegative_v1.3: d70463f87042, badhandv4: 5e40d722fc3d, bad-hands-5: aa7651be154c", Version: v1.6.0

2nd pass generation:

masterpiece,8k resolution,HDR,
BREAK
1girl,dancing ,bare feet, cat ear, cat tail, long white hair,
BREAK
bedroom background,

Negative prompt: (worst quality, low quality:1.2),verybadimagenegative_v1.3,nsfw,lowres,bad hands,bad anatomy,watermark,badhandv4,bad-hands-5,
Steps: 24, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 801237163, Size: 808x992, Model hash: 2c436fe5d3, Model: camelliamix_v3, VAE hash: df3c506e51, VAE: kl-f8-anime2.ckpt, Denoising strength: 0, Clip skip: 2, ADetailer model: person_yolov8n-seg.pt, ADetailer prompt: "1girl,smiling, dancing ,barefeet, cat ear, cat tail, long white hair, ", ADetailer confidence: 0.3, ADetailer dilate/erode: 4, ADetailer mask blur: 4, ADetailer denoising strength: 0.4, ADetailer inpaint only masked: True, ADetailer inpaint padding: 32, ADetailer ControlNet model: control_v11p_sd15_openpose [cab727d4], ADetailer model 2nd: face_yolov8n.pt, ADetailer prompt 2nd: 1girl, ADetailer confidence 2nd: 0.3, ADetailer dilate/erode 2nd: 4, ADetailer mask blur 2nd: 4, ADetailer denoising strength 2nd: 0.39, ADetailer inpaint only masked 2nd: True, ADetailer inpaint padding 2nd: 32, ADetailer ControlNet model 2nd: control_v11p_sd15_openpose [cab727d4], ADetailer version: 23.9.3, TI hashes: "verybadimagenegative_v1.3: d70463f87042, badhandv4: 5e40d722fc3d, bad-hands-5: aa7651be154c", ControlNet 0: "Module: openpose_full, Model: control_v11p_sd15_openpose [cab727d4], Weight: 1.0, Resize Mode: ResizeMode.INNER_FIT, Low Vram: False, Guidance Start: 0.0, Guidance End: 1.0, Pixel Perfect: True, Control Mode: ControlMode.BALANCED", Version: v1.6.0
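
If you want to reuse these settings in a script, the parameter lines above can be split into key/value pairs. Here is a minimal sketch, assuming the `Key: value, Key: value` layout shown above, where a quoted value (like the ControlNet entries) may itself contain commas. `parse_params` is a hypothetical helper written for this example, not a function from A1111, and the example line is a shortened excerpt of the first pass parameters.

```python
import re

# One `Key: value` pair, ended by a comma or the end of the line.
# The value is either a double-quoted string (commas allowed inside)
# or a run of characters without commas.
PARAM_RE = re.compile(r'\s*([^:,]+):\s*("(?:\\.|[^"\\])*"|[^,]*)(?:,|$)')


def parse_params(line: str) -> dict[str, str]:
    """Split a generation-parameter line into a {key: value} dict of strings."""
    params = {}
    for key, value in PARAM_RE.findall(line):
        value = value.strip()
        # Drop the surrounding quotes of quoted values.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        params[key.strip()] = value
    return params


# Shortened excerpt of the first pass parameters.
example = (
    'Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Size: 808x992, '
    'Denoising strength: 0.5, ControlNet 0: "Module: openpose_full, '
    'Model: control_v11p_sd15_openpose [cab727d4], Weight: 1", Version: v1.6.0'
)

params = parse_params(example)
print(params["Sampler"])

# Quoted entries use the same layout, so they can be parsed a second time.
controlnet = parse_params(params["ControlNet 0"])
print(controlnet["Module"], controlnet["Weight"])
```

All values come back as strings, so numeric settings such as `Steps` or `Denoising strength` still need converting with `int()` or `float()` before use.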

Link to the original video :