Type | Other |
Published | Jun 24, 2023 |
Hash | AutoV2 7737C257BE |
Stop! These models are not for txt2img inference!
Don't put them in your stable-diffusion-webui/models directory and expect to make images!
So what are these?
These are new ModelScope-based models for txt2video, optimized to produce 16:9 video compositions. They've been trained on 9,923 video clips and 29,769 tagged frames at 24 fps, 576x320 resolution.
Note that the outputs can look much better than these previews: I had to convert the mp4 outputs to gif for Civitai. You can also upscale these videos using the Zeroscope v2 XL txt2vid models, which I'm currently uploading!
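The mp4-to-gif conversion mentioned above can be done with ffmpeg. This is a minimal sketch, assuming ffmpeg is installed; the synthetic testsrc clip just stands in for a real txt2video output, and the fps/scale values are illustrative:

```shell
# Generate a 1-second 576x320 test clip as a stand-in for a real t2v output.
ffmpeg -y -f lavfi -i testsrc=duration=1:size=576x320:rate=24 sample.mp4

# Convert mp4 to gif; lower fps and lanczos scaling keep the file size down.
ffmpeg -y -i sample.mp4 -vf "fps=12,scale=576:-1:flags=lanczos" sample.gif
```

For sharper gifs at the cost of an extra pass, ffmpeg's palettegen/paletteuse filters can generate a per-clip color palette instead of the default one.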
Note: this model is the lighter version of the XL model (available here), which requires a lot more VRAM. If you have more than 15 GB of VRAM, you should be using the XL version.
Where do they go?
Drop them into the \stable-diffusion-webui\models\ModelScope\t2v folder.
It's imperative that you rename text2video_pytorch_model.pt to a .pth extension after downloading.
The files must be named open_clip_pytorch_model.bin and text2video_pytorch_model.pth.
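On Linux/macOS, the placement and rename above can be sketched like this. A scratch directory stands in for your real webui install, and the touch lines simulate the downloaded files; substitute your actual paths:

```shell
# Scratch dir standing in for your actual stable-diffusion-webui install.
WEBUI="$(mktemp -d)/stable-diffusion-webui"   # substitute your real install path
TARGET="$WEBUI/models/ModelScope/t2v"
mkdir -p "$TARGET"

# Pretend these are the files downloaded from Hugging Face:
touch "$TARGET/open_clip_pytorch_model.bin"
touch "$TARGET/text2video_pytorch_model.pt"

# The extension expects a .pth extension on the text2video weights:
mv "$TARGET/text2video_pytorch_model.pt" "$TARGET/text2video_pytorch_model.pth"
ls "$TARGET"
```

After this, the t2v folder should contain exactly open_clip_pytorch_model.bin and text2video_pytorch_model.pth.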
Who made them? What's the original source?
https://huggingface.co/cerspense/zeroscope_v2_576w
What else do I need?
These models are specifically for use with the txt2video Auto1111 WebUI Extension.