Type | |
Stats | 336 843 |
Reviews | (52) |
Published | Sep 22, 2023 |
Base Model | |
Trigger Words | s4s the pallas's cat |
Hash | AutoV2 B83E32F30D |
Required trigger words: s4s the pallas's cat. Other suggested prompt terms: manul, 4K HD hi-res photo, realistic Hasselblad photography.
Suggested LoRA weight: 0.6~0.8
Suggested parameters: Sampler: DPM++ 2M Karras or Restart; CFG scale: 7~10; Size ≥ 1024x1024; Steps ≥ 30
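As a rough illustration of the suggestions above, here is a minimal Python sketch that checks a set of generation settings against the suggested ranges and assembles a prompt. It assumes a web UI that accepts the `<lora:name:weight>` prompt tag (as the AUTOMATIC1111 web UI does). The file name `s4s_pallas_cat`, the `Settings` class, and the helper functions are hypothetical names made up for this example, and reading "Size ≥ 1024x1024" as "each side at least 1024" is also an assumption.

```python
from dataclasses import dataclass

# Values taken from the suggestions above.
TRIGGER = "s4s the pallas's cat"
EXTRA_TERMS = ["manul", "4K HD hi-res photo", "realistic Hasselblad photography"]


@dataclass
class Settings:
    lora_weight: float = 0.7          # suggested range 0.6~0.8
    sampler: str = "DPM++ 2M Karras"  # or "Restart"
    cfg_scale: float = 7.0            # suggested range 7~10
    width: int = 1024                 # assumption: each side should be >= 1024
    height: int = 1024
    steps: int = 30                   # suggested >= 30


def check(s: Settings) -> list[str]:
    """Return the names of settings that fall outside the suggested ranges."""
    problems = []
    if not 0.6 <= s.lora_weight <= 0.8:
        problems.append("lora_weight")
    if s.sampler not in ("DPM++ 2M Karras", "Restart"):
        problems.append("sampler")
    if not 7 <= s.cfg_scale <= 10:
        problems.append("cfg_scale")
    if s.width < 1024 or s.height < 1024:
        problems.append("size")
    if s.steps < 30:
        problems.append("steps")
    return problems


def build_prompt(scene: str, s: Settings, lora_file: str = "s4s_pallas_cat") -> str:
    """Put the trigger words first and append a <lora:name:weight> tag.

    `lora_file` is a hypothetical file name; replace it with your local file's name.
    """
    terms = [TRIGGER, scene, *EXTRA_TERMS]
    return ", ".join(t for t in terms if t) + f" <lora:{lora_file}:{s.lora_weight}>"


settings = Settings()
print(check(settings))
print(build_prompt("sitting on a snowy rock", settings))
```

The ranges are suggestions rather than hard limits, so `check` only reports deviations and does not reject anything.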
A personal note:
Sun Simiao (狲思邈, 2015 to October 10, 2022) was a male Pallas's cat who was rescued from the wild and lived at Xining Wildlife Zoo. He was the best-known Pallas's cat in China.
Thanks to years of tireless science outreach by @圆掌 of Xining Wildlife Zoo, and to the immense charm of Sun Simiao and his family, the Pallas's cat has gone from little-known to a star species in recent years. Together with the snow leopard, the Chinese mountain cat, and other species, it has greatly raised public attention to wildlife conservation and ecological restoration on the Qinghai-Tibet Plateau.
Sun Simiao's name comes from his first observed mating, which lasted only 4 seconds: "four seconds" (四秒, sì miǎo) sounds like "Simiao". His life began with a joking name and ended in a way that sounds like a joke. On October 10 last year, he choked on a chicken bone while eating too hastily and suffocated. Captive Pallas's cats typically live to around 12-15 years, so at 7 he was still in his prime. My friends who love Pallas's cats were all deeply shocked and saddened.
In February this year, I first learned about LoRA models, and the first model I made was a Pallas's cat LoRA. Because the Pallas's cat is an obscure species, neither Midjourney nor the official Stable Diffusion models can yet render it accurately, and LoRA is an excellent way to fine-tune such a niche concept. My first Pallas's cat LoRA was a general-purpose model similar to filmgirl, trained on 300 photos of different individual Pallas's cats. My second LoRA was a dedicated likeness model for Sun Simiao. For it, I collected 263 photos of Sun Simiao from fellow Pallas's cat enthusiasts, 70% from @yspenny and 30% from @西宁野生动物园, @圆掌, and @天音文创馆. However, the results were not good, mainly for three reasons:
1. The vast majority of photos were taken on mobile phones and cropped from distant shots, so they are generally blurry and carry automatic filter color casts.
2. The glass of the small cat house at Xining Wildlife Zoo is green-tinted and highly reflective, which degrades image quality further.
3. Sun Simiao spent his whole life in a small exhibit room of the small cat house, so the setting is always the same.
As a result, the model had a low rate of usable images and required many re-rolls. I therefore never released this SD1.5 version of the Sun Simiao LoRA publicly. Instead, I often left my laptop generating images overnight and, the next morning, picked the ones I considered best to post on my animal science outreach account. I once joked on Weibo that screening so many images was the AI reverse fine-tuning and brainwashing me, distorting my memory of what Sun Simiao really looked like, so that after screening for a while I had to go look at real photos to cleanse my eyes.
Now, however, these problems have new solutions. Although it still has many shortcomings, the SDXL model is undeniably more capable than SD1.5, with a higher ceiling. Ahead of the first anniversary of Sun Simiao's death, I decided to make a dedicated SDXL LoRA of him once more. To address the shortcomings of the SD1.5 model, I tried the following improvements:
1. SDXL is extremely sensitive to training set quality, and the original training set was uneven in quality and generally blurry. I therefore re-curated the training set down to 224 images, discarding a batch of images and applying AI image enhancement to some that could be salvaged.
2. I added 1006 carefully selected photos of individual Pallas's cats from around the world to the regularization set. For captioning, I used a mix of natural language and tags, and I tested how well different trigger words worked, aiming for the right degree of distinction and association between the Sun Simiao training set and the regularization set of other Pallas's cats. The goal was to avoid contaminating Sun Simiao's individual appearance while using the regularization set to strengthen fur detail and to generalize backgrounds, expressions, and poses.
3. I trained separately with four optimizers: adam8bit, DAdaptAdam, Prodigy, and adaFactor. I then compared and merged the resulting models, and chose as the release LoRA the version that best balanced color detail, generalization, and likeness.
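The mixed natural-language-plus-tags captioning mentioned in the second item above might look roughly like the Python sketch below. The actual caption format and training tool are not stated here, so everything in it is an assumption for illustration: the class phrase `the pallas's cat` for regularization images, the helper names, and the example sentences and tags are all hypothetical.

```python
TRIGGER = "s4s the pallas's cat"   # trigger words from this model card
CLASS_PHRASE = "the pallas's cat"  # assumed class phrase for regularization images


def mixed_caption(subject: str, sentence: str, tags: list[str]) -> str:
    """Join a subject phrase, one natural-language sentence, and tags with commas."""
    parts = [subject, sentence.strip().rstrip("."), *[t.strip() for t in tags]]
    return ", ".join(p for p in parts if p)


def caption_for(sentence: str, tags: list[str], is_sun_simiao: bool) -> str:
    """Photos of Sun Simiao get the trigger words; other cats get only the class phrase."""
    subject = TRIGGER if is_sun_simiao else CLASS_PHRASE
    return mixed_caption(subject, sentence, tags)


# Hypothetical examples, not taken from the real dataset.
print(caption_for("a cat crouching behind glass.", ["indoor", "close-up"], True))
print(caption_for("a cat walking across a rocky slope.", ["grassland", "daylight"], False))
```

Under this assumption the trigger words contain the class phrase, so captions for Sun Simiao share tokens with the regularization captions while the `s4s` prefix keeps the individual distinct. That is one possible reading of the balance between distinction and association described above, not a confirmed description of the author's setup.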
I am satisfied with the final result. It still shows SDXL's common weakness of blurry distant shots, but the model generalizes across color, expression, and environment markedly better than the SD1.5 version.
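That is the full story behind the Sun Simiao LoRA. It is bound to be a niche model that only a handful of people will use regularly, but to me it is worth no less than any other model I have made, because it carries the remembrance of Sun Simiao from me and from the photographers who provided the training images. You lived briefly but brilliantly. Living in a poor zoo in the northwest meant that few good photos were ever taken of you, but that will not stop you from living on forever in cyberspace. A year has passed, and we miss you very much. May you eat slowly on Planet Manul, and may all be well.
Sun Simiao is gone, yet here he is as lifelike as ever.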