Is there currently a way to batch-download everything on Civitai and keep it updating automatically as the models and LoRAs are updated?
I understand that this will be a massive repository but I will expand our storage to match. Is there an estimate on how many TBs or PBs there will be?
There is an extension called Civitai Helper. As far as I know, it gives you various useful tools for downloading and updating all your Stable Diffusion stuff, but it can't download everything.
I hope this helps at least.
Here's my estimate:
First, let's distinguish between lora variants and checkpoints, and between SD1.5 and XL (and say we drop all 2.X related garbage for simplicity and the benefit of humanity).
SD1.5 checkpoints started from 8-11gb models, but the overwhelming majority soon became 2gb pruned only (or pruned + full, I don't think it necessary to collect both). I estimate ~3gb on average, up to 4gb. XL checkpoints are recent, ~3 months, and are 6.5gb a pop.
SD1.5 loras in the first few months were 144mb standard, then most people realised the default dims were excessive in most cases and dropped to 9 - 72mb. I would estimate the average at 50mb, though that's very uncertain.
XL loras went through a similar process, dropping from 1.7gb to under 100mb on a much shorter timescale, apart from a few purists and a not-so-small number of people with concepts that are harder to train. They would probably average ~200-300mb, but there currently aren't very many of them.
There are also checkpoint extractions, which I think would be futile to keep since you've got the source, but let's say that's another 10% of about a tenth of the checkpoints (most are forgettable merges), which is insignificant.
There appear to be approximately 160 new SD1.5 checkpoints a week and 30 XL, plus ~1500 SD1.5 loras and ~200 XL loras.
This means increasing storage by, say, about 1TB a week. Civitai's been around since January (40 weeks), and XL since ~mid July (12 weeks), so that should be ~26TB for full history. I haven't taken multiple versions into account and lack the data for an estimate, so let's say that makes x2 - x4 of those numbers (mostly due to the checkpoint version updates).
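For anyone who wants to redo the arithmetic with their own numbers, here is a minimal sketch of the estimate in Python. The upload counts and week spans are the ones quoted above; the per-file averages are my picks from the quoted ranges (3gb for SD1.5 checkpoints, 250mb as the midpoint for XL loras), and I'm assuming decimal units (1TB = 1,000,000mb). The names `CATEGORIES`, `weekly_mb` and `history_mb` are just for illustration.

```python
# Rough storage estimate from the figures quoted in the post.
# Sizes are in MB so the arithmetic stays in whole numbers.
# Each entry: (new uploads per week, assumed average size in MB, weeks of history)
CATEGORIES = {
    "SD1.5 checkpoints": (160, 3000, 40),  # ~3gb average (assumed)
    "XL checkpoints": (30, 6500, 12),      # 6.5gb each
    "SD1.5 loras": (1500, 50, 40),         # ~50mb average (very uncertain)
    "XL loras": (200, 250, 12),            # midpoint of ~200-300mb (assumed)
}

MB_PER_TB = 1_000_000  # decimal units, an assumption


def weekly_mb(categories):
    """Total new data per week across all categories, in MB."""
    return sum(count * size for count, size, _weeks in categories.values())


def history_mb(categories):
    """Total data for the full history, one version per model, in MB."""
    return sum(count * size * weeks for count, size, weeks in categories.values())


if __name__ == "__main__":
    weekly_tb = weekly_mb(CATEGORIES) / MB_PER_TB
    total_tb = history_mb(CATEGORIES) / MB_PER_TB
    print(f"weekly growth: {weekly_tb:.2f} TB")
    print(f"full history:  {total_tb:.2f} TB")
    # Multiple versions per model: the x2 - x4 guess from the post.
    print(f"with versions: {2 * total_tb:.0f} - {4 * total_tb:.0f} TB")
```

With these inputs it comes out to roughly 0.8TB a week and ~25TB of history, which is in line with the rounded "about 1TB" and "~26TB" figures, and about 50 - 100TB once the x2 - x4 version multiplier is applied.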
Personally, I believe Sturgeon's law applies and you could very easily reduce the collection by quality metrics to ~10% with little loss of diversity, but if one has deep pockets, 100TB doesn't break the bank either and saves the hassle.
You'd need to code a daemon yourself to pull the recent models from Civitai's API, link below. I won't say how, since the servers are already collapsing under the pressure and I do not endorse additional redundant strain of this nature, but it's not difficult.