Hello everyone,
In my latest update on Pony Diffusion, I expressed my interest in leveraging SD3 for the upcoming V7, so let’s talk about it!
Today, as Stability AI introduces their new SD3 Medium model, I extend my congratulations on this significant milestone in the open model community. Although it's only the 2-billion parameter variant (somewhat comparable in size to SDXL), it includes a number of serious technical improvements. I'm hopeful that SD3 will provide a strong foundation for one of the next Pony iterations and that we'll eventually gain access to larger variants.
As expected, the initial release of SD3 lacks robust community and fine-tuning support. My original goal was to tackle these challenges head-on to provide an early SD3-based model. Unfortunately, despite the interest in an SD3-based Pony, I must ask for your patience as I outline my roadmap and explain the decision-making process.
For newcomers, Pony Diffusion is a non-photorealistic, character-focused model that promotes a wide spectrum of creative expression. It supports natural language prompting and a broad range of artistic styles without relying on artist names. Starting with SD 1.4, Pony has evolved through eight versions, each refining its technical capabilities and community impact. The latest V6 model is a fine-tune of SDXL, yet it has established itself as a well-recognized base model in its own right.
Personally, I've been surprised by the model's popularity and by the ways people use it to express themselves, but I cherish all (legal) uses of Pony Diffusion that bring them joy.
As of today, Pony and its derivatives surpass base SDXL models in both downloads and generations on Civit. The rising popularity of Pony fills me with pride—it's not every day you successfully compete with corporations from your garage. However, I fully recognize that this is only possible due to the work of Stability AI. The synergy between Stability AI and its community has been profound in the text-to-image generation space.
Regrettably, the ambiguous rollout of SD3's commercial licensing has been quite disheartening. The lack of clear and proactive communication from Stability AI, especially concerning the new model's commercial use, has left me in the dark, as only the non-commercial license was mentioned in the initial release announcement.
Stability AI provides a Professional Membership, which allows users to pay a nominal fee and use the models commercially as long as their company makes less than $1M in annual revenue. PurpleSmartAI is part of this program, but there has been no announcement about whether the Membership will cover SD3, which prompted me to seek clarification on the Stability AI Discord. While I received some responses indicating that the membership might suffice, I was not able to fully resolve the issue. It seemed that even the technical Stability employees active there were uncertain about the specifics, and no one from the product side was present to provide guidance.
So, why should I care about a commercial license for Pony, a model that has always been free? Pony is not just a labor of love but a significant investment. The extensive data preparation and costly GPU time underscore my commitment, and being able to monetize the model in various ways, such as working with SaaS services, helps offset development costs and covers the cost of the Discord server where anyone can use Pony Diffusion for free. All these endeavors make Pony a commercial project, and I strive to do things the right way and ensure that Pony remains a responsible and community-friendly project, from respecting licensing requirements to honoring artist preferences via the Opt-out program.
During my efforts to clarify these concerns, I finally had the opportunity to engage with the SAI technical team as they were active in preparations for the SD3 release. It's difficult to determine whether their perspectives are representative of the entire company, but the conversations left a bitter aftertaste. It seems their understanding of Pony's purpose and technical underpinnings is limited, and they've been unexpectedly patronizing. At the heart of the issue, they appear to dismiss Pony as merely a (perhaps low-effort) niche-focused fine-tune, and they seem uninterested in my technical efforts.
This has been a letdown; however, there were still important topics to explore, such as improving CSAM protection in derivative models. These areas, which are critical to model creators and benefit from Stability's unique expertise, are worth discussing, even if our technical viewpoints are not aligned. Regrettably, my efforts to learn more about methods in these domains have gone unanswered.
We at PurpleSmartAI take these issues very seriously. From the project's inception, we have dedicated substantial resources to human and automatic moderation of our inference network and the data that feeds into our models. We will continue to invest in this area, even if SAI is unwilling to assist model creators, and we hope to share our tools and insights with the broader community.
The good news is that with today's release of SD3, the new licensing terms are available... yet they complicate things further. The "Professional Tier" has been replaced by the new "Creator License", which introduces a limit of 6,000 images per month. Anything above that now requires an Enterprise License, which I would gladly acquire. I reached out to Stability AI the day the new commercial license was pre-announced, but I have not received any acknowledgment or information.
In addition, the announcement states that the large-scale and enterprise license exists to "ensure that businesses can leverage the full potential of our model while adhering to our usage guidelines." However, it is not clear what these guidelines entail. They may simply be Stability AI's Acceptable Use Policy, but no official clarification has been provided. The AUP that SAI established is without a doubt a reasonable and welcome set of guidelines. However, this wording may also be an attempt to control the types of models created from the base SD3 through arbitrary enforcement. I wish I had clearer answers, but as mentioned before, I have not received any communication from SAI on this matter so far.
So looking ahead, my enthusiasm for SD3 has waned, but my commitment to Pony has not. The upcoming V6.9 ("noice") will incorporate all technical improvements I’ve covered in the last V7 update, and I'm excited to share early samples as I begin training the model in the coming weeks.
As for an SD3-based V7, I remain optimistic that the licensing concerns will eventually be cleared up and that my worries about how model restrictions might be enforced under the enterprise license will prove unjustified. I still intend to acquire that license under reasonable terms. In the meantime, all I can do is wait.
In any case, thank you for your continued support. Stay tuned for the rollout of the next model, and let’s celebrate together when it arrives!