Type | |
Stats | 937 |
Reviews | (120) |
Published | Apr 22, 2024 |
Base Model | |
Usage Tips | Clip Skip: 1 |
Hash | AutoV2 2678E58740 |
V5_SD1.5:
模型主要训练胶片摄影、计算摄影、数码原片三大类,对亚洲人像进行深入微调。目前训练集主体为亚洲年轻男性、女性、少量风光。使用GPT4V进行打标,部分使用Cogvqa、WD1.4混合标注,其中wd1.4标注的1girl、1boy已修改为woman、man,防止年龄错乱的情况发生。
注意:此版本为SD1.5版本,从其他模型页合并而来。同时模型并不是以全面均衡为训练训练目标,可能会存在明显的功能、审美偏差。
数码原片:Raw format
胶片类:Film photography
部分胶片型号:Fuji C100 shooting、Fuji C200 shooting、Kodak 400 shooting、Kodak gold 200 shooting、Nolan 5219 shooting
计算摄影:computational photography
画质评级:Mobile phone image quality、Landline image quality、Pager image quality
起手式(可根据实际情况修改,将计算摄影及以下评级加入负面,可获得画质的最大提升,反之同理):
正面提示词:8K,masterpiece, best quality:1.2,ultrahigh-res,
负面提示词:anime,cartoon,3D rendering,high saturation,facial blemishes,lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username,CyberRealistic_Negative-neg,(SkinPerfection_NegV15),
其余参数设置可参考示例图
存在的问题:1.5模型眼睛的处理始终很勉强,如需深化可借助XL模型对眼睛进行单独重绘。
模型基于LEOSAM FilmGirl Ultra训练,所有版权使用声明皆延续上一级模型
V4:
模型主要训练胶片摄影、计算摄影两大类,对亚洲人像进行深入微调。在优化核心数据的同时增加部分流行的胶卷型号。对图片进行画质评级,引入各种低质量图像对负面特征进行提取,增强画面质量控制能力。
The model mainly trains film photography and computational photography, and makes in-depth fine-tuning of Asian portraits.
Add some popular film models while optimizing the core data.
The picture quality is rated, and a variety of low-quality images are introduced to extract negative features to enhance the ability of picture quality control.
本模型使用LEOSAM's HelloWorld5.0大模型(后文简称HW5.0)作为基础模型,所有使用规则同样遵循HW5.0的申明,如果对本模型进行融合修改,请务必在简介中提及本模型以及HW5.0。以下为HW5.0模型地址:https://civitai.com/models/43977/leosams-helloworld-sdxl-base-model
This model uses the LEOSAM's HelloWorld5.0 large model (hereinafter referred to as HW5.0) as the basic model, and all the usage rules also follow the statement of HW5.0. If you modify this model, be sure to mention this model and HW5.0 in the introduction.
The following is the HW5.0 model address: https://civitai.com/models/43977/leosams-helloworld-sdxl-base-model
如果你喜欢我的模型,欢迎给我买杯咖啡或者来爱发电支持我。也欢迎多多返图点星评论,这对我真的很重要!
训练提示词概览:
分辨率这块优先推荐896*1152,其他参数可参考V4RC1、或者HW5.0的说明。
以下为重点训练概念及召唤词
胶片:Film photography
部分胶片型号:Fuji C100 shooting、Fuji C200 shooting、Kodak 400 shooting、Kodak gold 200 shooting、Nolan 5219 shooting
计算摄影:computational photography
画质评级:Mobile phone image quality、Landline image quality、Pager image quality
(画质依次递减:手机画质、座机画质、寻呼机画质)
训练过的负面提示:overexposed background,poor lighting,overexposed areas,uneven lighting,Low resolution,potential compression artifact(这部分我进行了测试,不会过度干扰画面)
使用建议:
在使用胶片时,输入Film photography+任意胶片(或者不加胶片名)
使用训练的负面提示和worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch,
可选加computational photography、Mobile phone image quality、Landline image quality、Pager image quality
将计算摄影类目完全嵌入负面可能会导致过度的锐化的风险,同时由于数据主体依然是胶片,所以不加入特定提示词一般也默认为胶片效果。
在使用计算摄影时(即手机画质),输入computational photography+任意画质评级
负面可不加入训练过的负面提示,因为这些也正是计算摄影的特点。
不同摄影类别的画质对比:
不同胶片型号对比:
不同负面嵌入对比:
注意:所有摄影效果均不能代表现实世界中的胶片、手机拍摄效果,此处均为AI模拟结果,同时带有个人审美,请勿将模型效果对应到现实中的具体设备。
Note: all photography effects can not represent the real world film, mobile phone shooting effects, here are AI simulation results, with personal aesthetic, do not correspond to the model effect to the real specific equipment.
V4_RC1:
V4已经在本地训练了上百个小时,数据主体使用GPT4V进行打标,部分使用WD1.4+cog。模型也遇到了部分过拟合的情况,已经进行了第一轮MBW进行缓解。比较之前版本V4的手部结构有相对更好的表现,更灵敏的提示词反应,更激进的色彩调教。(由于数据的有限,并不能保证全面的效果)
V4_RC1:
V4 has been training locally for hundreds of hours, and the data body is marked with GPT4V and partly with WD1.4+cog.
The model has also encountered the situation of partial overfitting, and the first round of MBW has been carried out to alleviate it.
Compared with the previous version of V4, the hand structure has relatively better performance, more sensitive cue response, and more radical color adjustment.(Due to the limited data, the comprehensive effect can not be guaranteed.)
使用说明:
通常情况下我使用DPM++2M K或者restart,采样30-40(restart20),cilp终止1,CFG:5-7
ADetailer中重绘边缘模糊20,重绘幅度0.4。
可以接入refiner0.8获得更好的高频细节。
高清修复使用8x_NMKD-Superscale_150000_G,放大1.5,迭代12,幅度0.35。
负面提示词:(worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch),mole,skin blemishes,(这个版本可能会出现皮肤瑕疵,增加最后两个可以有效缓解)
Instructions for use:
usually I use DPM++2M K or restart, sample 30-40 (restart20), cilp terminates 1pm CFG restart20 5-7.
In ADetailer, the redraw edge is blurred by 20, and the redraw range is 0.4.
You can access refiner0.8 for better high-frequency details.
HD repair uses 8x_NMKD-Superscale_150000_G, zoom in 1.5, iterate 12, and range 0.35.
Negative prompts: (worst quality, low quality, illustration, 3D, 2d, painting, cartoons, sketch), mole,skin blemishes, (this version may have skin defects, adding the last two can effectively relieve)
The model may still have problems, and now GPT4V marking + new process retraining has been used.
V3正式版说明:
采用自然语言在FP32下的训练尝试,无固定触发词,测试未发现有采样器存在问题。人种不稳定的情况下可以输入亚洲人来加强触发。根据实际情况开启高清修复、面部修复(脸部全屏镜头不建议面部修复)
2.1.2:
现在大部分采样方法在大部分场景下不再产生过度的噪点,所以refiner不再是必须,After Detailer 也可以正常使用。其余部分依然同下。
DPM++ 2M Karras、Euler、Restart表现相对更好。
Most sampling methods now no longer generate excessive noise in most scenarios, so refiners are no longer necessary, and After Detailer can also be used normally.
DPM++2M Karras, Euler, and Restart performed relatively better.The remaining parts are still the same as below.
2.1.0:
1:我进行了很多测试,目前最推荐的是使用euler和euler a采样器,最佳步数建议50,可以启用refiner,切换时机0.9。
其他采样器均会在部分场景上不同程度产生大量噪点,可以使用refiner降噪,这很有效。但是开启ADetailer后又会重新添加噪声,造成几乎不可用的情况。如果必须使用其他采样器,请不要开启修脸插件,直接高清修复,这也是我目前遇到的最大的问题。
2:CILP:2(使用1训练,测试开2无区别)
3:ADetailer仅在euler和euler a采样器时建议开启。
4:分辨率:896*1152(或者其他官方推荐的分辨率)
Instructions for use:
1: I have conducted many tests, and currently the most recommended ones are using Euler and Euler a samplers. The optimal number of steps is recommended to be 50, and the refiner can be enabled with a switching time of 0.9.
Other samplers will generate a large amount of noise to varying degrees in some scenes, and refiners can be used for noise reduction, which is very effective. However, when ADetailer is turned on, noise will be added again, resulting in almost unusable situations. If you have to use another sampler, please do not open the facelift plugin and fix it directly in high-definition. This is also the biggest problem I have encountered so far.
2: CILP: 2 (training with 1, no difference in testing with 2)
3: It is recommended to turn on ADetailer only when using Euler and Euler a samplers.
4: Resolution: 896 * 1152 (or other officially recommended resolution)
提示词部分:
1:使用了新的打标模型,cilp modle:VIT-L-14/openai,captain model:bilp2-flan-t5-xl,所以当你不知道如何描述画面的时候,不妨尝试使用这两个模型反推提示词,会获得最佳的效果。
2:关于召唤词:我也做了测试,结论是对画面会有一定影响,但不多,但建议还是带上。
3:重点来了!我统计了所有的字幕文件,总结出了最有效的几个tag:ulzzang,naver fanpop,ffffound,streaming on twitch,character album cover, 他们是除了召唤词出现频次最多的内容,你可以一次性全部加入作为起手,效果很好。
4:关于正面质量提示,我一般会使用这两个:<lora:DetailedEyes_xl_V2:1>,<lora:neg4all_bdsqlsz_xl_V7:1>,他们都来自@bdsqlsz,几乎没有污染。手部修复<lora:ClearHand-V2:1>,来自@frostyforest,对简单的手部关系处理的很好,复杂情况下确实是个难题。
5:负面提示词:(worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch),greasy skin,
最后这是我的QQ交流群749047075 密码:SDS,感兴趣的小伙伴可以加入探讨交流,如果有新的模型我也会第一时间进行内测。
Reminder section:
1: We have used a new marking model, cip model: VIT-L-14/openai, and capture model: bilp2 lan t5-xl. Therefore, when you don't know how to describe the image, you may want to try using these two models to infer prompt words, which will achieve the best results.
2: Regarding summoning words: I have also conducted a test and the conclusion is that it will have a certain impact on the screen, but not much, but it is recommended to still use them.
3: Here comes the key! I have compiled all the subtitle files and summarized the most effective ones: ulzzang, naver fanpop, ffffffound, streaming on tweet, character album cover. They are the content that appears the most frequently except for summoning words. You can add them all at once as a starting point, and the results are very good.
4: Regarding positive quality reminders, I usually use these two:<lora: DetailedEyes_ Xl_ V2:1>,<lora: neg4all_ Bdsqlsz_ Xl_ V7:1>, they all come from @ bdsqlsz and have almost no pollution. Hand repair<lora: ClearHand V2:1>, from @ frost forest, handles simple hand relationships well, and is indeed a challenge in complex situations.
5: Negative prompt words: (best quality, low quality, illustration, 3d, 2d, painting, cartons, sketch), great skin,
Finally, this is my QQ communication group 749047075, password: SDS. Interested friends can join the discussion and exchange. If there are new models, I will also conduct internal testing as soon as possible.
采样器/步数 / Sampler/Steps:
召唤词 / trigger word