Screencap Datasets: Where to Find Them?


Ever needed a simple fix without wading through complicated datascrapers? Just wanted to fish through a zip file of images hand-capped by gracious people on the interwebs? While this is partially a self-shill post (we just got into capping), we'll also introduce you to some websites and Tumblrs, and show you where to scroll around cheekily to find publicly accessible Hugging Face repos (not all are screencaps).

Part One

The Shill Introduction:

We just got into screencapping, and we're enjoying finding lost and found media while doing it, so we wanted to take a moment to advertise it in this article as well as show you other places you can find stuff.

Isekai's Caps is a joke name; if you're into anime or gaming, you'll get the joke. (We did it around the time we made the Isekai Truck TI.)

Our mission is to cap things we enjoy, as well as requests. That doesn't mean everything we cap is going into our own data. There's just way too much that we've capped for that!

Where to find our caps:

You can find our caps at two locations:

Hugging Face: (uploads will be pretty much random, since we just throw zip files in there when we can; we don't have the ability to wait forever for things to upload lol)


Tumblr: this will also include reblogs of good content, or things we're hoping to find.
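Since the Hugging Face uploads land as zip files, here's a minimal sketch of pulling just the images out of one after you've downloaded it. The function name, the list of image extensions, and the file names in the usage line are our own assumptions for illustration, not anything specific to these uploads; adjust them to whatever is actually in the zip.

```python
import zipfile
from pathlib import Path

# Assumed set of image extensions; extend it if your zip uses others.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp"}


def extract_images(zip_path, out_dir):
    """Extract only the image files from zip_path into out_dir.

    Returns the list of extracted file paths. The folder layout inside
    the zip is kept as-is under out_dir.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Skip anything that doesn't look like an image (readmes, etc.)
            if Path(info.filename).suffix.lower() not in IMAGE_SUFFIXES:
                continue
            # ZipFile.extract returns the path of the file it created
            extracted.append(Path(archive.extract(info, out_dir)))
    return extracted
```

Usage would look something like extract_images("some_caps.zip", "caps/") where both names are placeholders for your own files.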

Examples of what we've done: the Raggedy Ann movie (but before you use it, please remove any realistic content), several older K-pop MVs, random Hogan's Heroes clips, and some found content from doujinshi and other things.

What's to come from us: FFXIV, Baldur's Gate, Genshin, Honkai, and just some really off-the-wall LaserDisc, Betamax and VHS rips.

Part Two

Locations on Tumblr

Apart from OUR specific caps blog, there are more locations on Tumblr:

 - They have a really LARGE list, and some really large datasets including Sailor Moon and some older movies.
 - More or less a gallery; you'll need a "mass downloader" for this.

Clearly, there are MANY MORE on Tumblr LOL. We're just scratching the surface.

Locations Elsewhere

 - Warning on this one: the file sites you end up on after this aren't dodgy to download from, it's just that you're gonna have to fight through ads etc.

Where else?

If you go peeking around certain anime nerds' profiles on Hugging Face, they may have datasets that are out in public. I won't name or shame anyone; that's not my game :).

Part Three

Why not use a datascraper?

Say you're after something extremely rare, and your dataset scraper doesn't have access to the API of that specific website. Not everyone's scraping Danbooru for HD Sexy Raggedy Ann fan art or rare pictures of Colonel Klink in his maid apron. (Please tell me the 2nd one exists on Danbooru.)

So if you're looking for something on the rarer side, these are some great starting links.

Why cap things yourself?

If you're interested in knowing how to cap something, just do a quick Google search for VLC screencapping. Which settings you use is clearly a matter of preference, and you're likely already a LoRA or model trainer by the time you're reading this.
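If you'd rather script it than click through a player, here's a rough sketch of the same idea. Note this swaps VLC for the ffmpeg command-line tool (it is not how we cap, just one way to do it), and it assumes ffmpeg is installed and on your PATH. The function names, the cap_%06d file naming, and the 5-second default are all made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_cap_command(video, out_dir, every_seconds=5, ext="png"):
    """Build an ffmpeg command that saves one frame every `every_seconds` seconds."""
    every_seconds = int(every_seconds)
    if every_seconds <= 0:
        raise ValueError("every_seconds must be a positive whole number")
    # %06d is ffmpeg's numbered-output pattern: cap_000001.png, cap_000002.png, ...
    pattern = str(Path(out_dir) / f"cap_%06d.{ext}")
    # The fps filter with a value of 1/N keeps one frame per N seconds of video
    return ["ffmpeg", "-i", str(video), "-vf", f"fps=1/{every_seconds}", pattern]


def cap_video(video, out_dir, every_seconds=5):
    """Run the command above; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_cap_command(video, out_dir, every_seconds), check=True)
```

Something like cap_video("episode01.mkv", "caps/episode01", every_seconds=5) would then fill the folder with one PNG per five seconds (the file and folder names here are placeholders). A smaller interval gives you more caps but also a lot more near-duplicates to sort through, so it really is a preference thing.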

What about Copyright?

"IT'S FOR RESEARCH" - Look, it's already against copyright to screencap, let alone rip half these videos in the first place.

And what about when you're already scratching through Tumblr or Instagram for the "NEXT BEST ARTIST"? That's almost worse in some ways than screencapping billion-dollar movies LOL. (EAT THE RICH! LOL)

Anyways, it's up to you - it's research, it's learning - and it is what it is.

Just promise me you're not going to cap the artist who just released their first animatic ever and has no clue what AI is. Scratch that, most people are the "I DO WHAT I WANT AND DON'T TELL ME WHAT TO DO" type. Just, yeah, I won't even ask LOL.

What to watch for

When you're either capping things yourself or trying to find caps, you might run into dodgy websites.

Don't go into the dark web, don't try scraping 4chan, and please, for the love of god, use an adblocker LOL.

Beyond that I don't know how to help ya! Just don't go clicking rogue ads!

Pinterest can be useful, but watch out because there ARE lower-quality niji gens on there.

I would recommend that you don't try scraping Tumblr for actual art; they're all glazing and "POISONING" their art like morons over there.

