Type | |
Stats | 18,093 |
Reviews | (1,039) |
Published | Feb 29, 2024 |
Base Model | |
Hash | AutoV2 B363551A70 |
🌍欢迎加入QQ群"兔狲·AIGC梦工北厂",群号 :780132897 ;"兔狲·AIGC梦工南厂",群号 :835297318(入群答案:兔狲)。Telegram群聊“兔狲的SDXL百老汇”,链接:https://t.me/+KkflmfLTAdwzMzI1
🚨Recommended parameters for FilmGirl Ultra:
Clip skip:1
CFG scale: 9
Direct output image resolution: ~500,000 pixels (640x768)
2024.2.29 Introducing "FilmGirl Ultra",Say goodbye to the AI face of SD1.5
On February 24th last year, I completed the first version of FilmGirl LoRA. This LoRA was my first model to achieve a high download volume and marks the beginning of my dream in AI. Since the launch of SDXL, I have devoted a great deal of effort to improving the HelloWorld and AIArt XL models. It has also been 8 months since the FilmGirl series was last updated.
In fact, whether it's FilmGirl, or the subsequent Polaroid LoRA or Helloworld XL, I have always been pursuing the ultimate in photorealism. Now a whole year has passed, and to commemorate the first anniversary, I have decided to release a model that elevates the photorealism of SD1.5 to new heights. The LoRA model is no longer sufficient for this mission; the new FilmGirl Ultra is an SD1.5 base model.
To completely break away from the homogenization of SD1.5 photorealistic models and the issue of AI faces, FilmGirl Ultra didn't choose basilmix, chilloutmix, or their descendants as the training base model, but instead selected the newly released SPIN-Diffusion by UCLA. SPIN-Diffusion is a Self-Play Fine-Tune SD1.5 base model using the winner images of the pickapic_v2 dataset, which outperforms the SD1.5 original base model and SD1.5 DPO base model, and its prompt alignment performance is far superior to heavily fine-tuned and merged base models like Chilloutmix.
The training set for FilmGirl Ultra comes from HelloWorld XL. In fact, the first version of HelloWorld XL also used the training set from the last version of FilmGirl LoRA. Throughout this year, I have been meticulously accumulating and selecting this training set, which now totals ~10,000 images. The training process for FilmGirl Ultra utilized multiple labeling methods, including GPT4V natural language captions, GPT4V tag-style captions, and Blip+Clip captions. To ensure that the model is compatible with the commonly used prompts "1girl", "best quality", and "masterpiece", these terms were also appropriately added to some images (but you can still accurately trigger the effect of a little girl with "little girl/child girl"). The reason for using multiple sets of labels is to maximize the likelihood of triggering the desired effect. As part of the FilmGirl tradition, the film style has been given special attention, and you can trigger this style with the prompt "film grain analog photography".
This model underwent a total of 7 training phases, with different batch sizes, optimizers, learning rates, and training set ratios used in each phase to achieve the current effect. If anyone is interested in fine-tuning SPIN-Diffusion, I recommend that your total training iterations should exceed 50,000 steps; in fact, I trained for about 100,000 steps with batch sizes ranging from 40 to 64.
The photorealistic effect of FilmGirl Ultra exceeded my expectations and is now close to the image quality of SDXL. Below is a comparison of this model with Realistic Vision v6 and epiCPhotoGasm, the former being the currently highest downloaded base model on Civitai, and the latter being the most photorealistic SD1.5 base model in my opinion for a long time. I pay tribute to these two excellent base models and their creators.
close-up couple's portrait,African young woman and man,clear skin face,looking at camera,fashion photography,simple background
Negative prompt: watermark,anime,cartoon,open mouth
close-up couple's portrait,African little girl and boy,clear skin face,looking at camera,fashion photography,simple background,
Negative prompt: watermark,anime,cartoon,open mouth,
Thanks to GPT4V captions and the SPIN-Diffusion base model, the model's prompt alignment performance is excellent. Below are some xy plot tests for different concepts.
Ethnic test
Body shape test
Skin color test
Age test
Animal test
However, FilmGirl Ultra doesn't lead in all dimensions. After all, it started from a new point and gave up on the continuous optimization and refinement of the community's 1.5 base models over the past year. Through extensive testing and comparison, I found that this base model has a higher rate of limb errors than the community's mature realistic models. Also, due to a lack of anime-related content in the training set, the output is not good when your prompts involve related tags of ACGN. It is recommended to avoid using words like "digital art", "anime", "cartoon", etc. These two issues are the main current shortcomings of FilmGirl Ultra.
FilmGirl Ultra is an annual summary of my first year on my AI journey, a gift to those AI enthusiasts who have supported me. The open-source community has brought me many friends, memories, joy, and knowledge. I also hope to contribute a bit back to the community. I welcome everyone to base your model training or merge it with FilmGirl Ultra. If you find this model helpful in improving your own model, please mention it in the model description. I hope that FilmGirl Ultra and SPIN-Diffusion will become more widely known and used.
FilmGirl Ultra will continue to be updated, and I wish everyone happy usage!
Hope we can continue to progress with AI, and meet here again this time next year!
去年的2月24日,我完成了第一版FilmGirl LoRA制作。这个LoRA是我的首个高下载量模型,是我的AI之梦的开始。自从SDXL推出后,我将大量精力投入到HelloWorld和AIArt两个XL大模型的改进中。FilmGirl这个系列也已经8个月没有更新了。
其实不管是FilmGirl,还是后来的拍立得LoRA、Helloworld XL,我一直都在追求极致的写实感。如今已整整一年过去,作为一周年纪念,我决定推出一个可以将SD1.5的写实感抬升至新高度的模型,LoRA模型已不足以承载这个使命,新的FilmGirl Ultra是一个SD1.5大模型。
为了彻底摆脱SD1.5写实感大模型的同质化和AI脸问题,FilmGirl Ultra没有选择basilmix、chilloutmix及其子子孙孙们作为训练底模,而是选择了UCLA最新发布的SPIN-Diffusion。SPIN-Diffusion是一个使用 pickapic_v2 数据集胜者图像进行自我对弈微调的SD1.5底模,其表现优于SD1.5原始底模以及DPO底模,同时提示词对齐性能远好于Chilloutmix等经过大量微调与融合的底模。
FilmGirl Ultra的训练集来自HelloWorld XL。实际上HelloWorld XL的第一版所使用的训练集也来自最后一版FilmGirl LoRA。这一年我都在精益求精地积累和筛选该训练集,如今整个训练集数量已达到1万张。FilmGirl Ultra的整个训练过程使用了多种打标方法,包括GPT4V自然语言caption、GPT4V 标签式caption、Blip+Clip caption。同时为了使得该模型可以兼容大家超常用的1girl、best quality、masterpiece三个词,也适当地在部分图像中添加了这三个词(但您仍可以通过child girl/girl这两个词准确触发小女孩效果)。之所以使用多套打标,是为了使训练集的效果可以尽可能高概率地触发。同时作为FilmGirl的传统,胶片风格被重点关注,您可以通过film grain analog photography来触发该风格。
本模型进行了共7阶段的训练,不同阶段选用了不同的batch size、优化器、学习率以及训练集比例,方才达到了目前的效果。如果有朋友同样对微调SPIN-Diffusion感兴趣,我建议您的总体训练迭代步数应在5万步以上,实际上我以batch size 40~64,共训练了约10万步。
FilmGirl Ultra的写实效果超出了我的预料,已经与SDXL的图像效果接近。上图中列出了该模型与Realistic Vision v6以及epiCPhotoGasm的对比,前者是目前C站下载量最高的1.5底模,后者是我心目中长期以来最为写实的1.5底模,向这两个优秀底模以及其背后的作者致敬。
同时得益于GPT4V打标以及SPIN-Diffusion底模,该模型的提示词对齐性能优异。
但FilmGirl Ultra也并非在所有维度都全面领先。它毕竟是从一个全新起点出发制作,放弃了社区一年多来对1.5底模的不断调优打磨内容,经过我的大量测试对比,该底模的肢体错误率要高于社区成熟的写实模型。同时由于训练集缺乏二次元内容,当你的提示词中涉及二次元相关tag时,出图效果不佳。建议大家避免使用digital art、anime、cartoon等词。这两个问题是FilmGirl Ultra目前最主要的两个缺陷。
FilmGirl Ultra是我AI之旅第一年的年终总结,是我送给支持我的AI同好们的礼物。开源社区为我带来了诸多朋友、回忆、快乐以及知识,我也希望回馈社区做出自己的一点点贡献。希望上述的模型制作总结能为大家带来一些帮助,同时也欢迎大家基于FilmGirl Ultra进行你的模型训练或融合。本模型与其训练底模SPIN-Diffusion一样,请大家遵循Apache-2.0许可证使用,否则将被追责。如果您觉着这个模型有帮助您让自己的模型变得更好,请在模型说明中提及下它,希望FilmGirl Ultra以及SPIN-Diffusion能被更多人了解和使用。
FilmGirl Ultra后续还会持续更新,祝大家使用愉快!
希望我们能随AI一起不断进步,明年此时,仍能在此相遇!
版权声明:
FilmGirl Ultra系列模型(以下简称“本模型”)是由我(以下简称“所有者”)基于SPIN-Diffusion开发的SD1.5大模型。
所有者授权个人或机构可免费使用本模型所生成的图像用于非商业性质的教育或信息传播目的,并且:
- 遵守相关法律规定,不侵犯本模型或任何第三方的合法权益。
- 在使用图像时需注明图像来源为“由LEOSAM's FilmGirl Ultra大模型生成”。
对于商业目的的使用,必须先与所有者签署商用授权协议。有关商业授权和模型定制事宜,请通过所有者在Civitai平台的主页信息联系。
所有者将持续为个人玩家免费提供FilmGirl Ultra模型的更新,以此表达对社区开源贡献者的支持和感谢。商业用户的有偿合作是推动本模型开发和持续改进的重要动力。感谢每一位用户的理解与支持。
请注意,任何未经授权的使用行为都可能违反相关法律规定,并可能承担法律责任。本声明的最终解释权归所有者所有,并受相关法律法规的约束。