Joschek's Batch Image Captioning Tools for Qwen-vl-3 (and other models)

I couldn't find a captioning tool that runs qwen vl 3, is simple to use, and has the features I need, so I vibecoded my own.

It now has exactly the features I need and is quite a timesaver.

qwen vl 3 is the first actually good vision model I've tried that I can easily run locally (IMO cogvlm, llava, moondream etc. all suck).

So if you happen to be looking for something like this, you can find it here:

https://github.com/realjoschek/joschekscaptioner

It should be relatively easy to install on Linux; no idea if it works on Windows. If you have problems installing, I probably can't help you.

If you find there's a feature missing, let me know.

If you find a bug: drop it into a good LLM and ask it to fix it pls. That's all I would do as well XD

Have a good one!