Dolphin Vision 72b is not usable for NSFW caption generation beyond questionable content.

If the creator or maintainers of Dolphin read this, know that I appreciate the effort and time spent on these models. They're obviously not easy to train, and they're time-consuming and expensive on top of that. Update it and we'll all use it. Right now it's still absolutely jam-packed with difficult-to-navigate morality clauses that destroy the NSFW sections. I know you lot can fix it, though.

I tested a few baseline images and a ton of baseline jailbreak prompts and found that nothing worked. It simply hits a point and falls over dead.

You cannot get it to output anything genuinely explicit or sexual in nature. Its identification capability hits a wall once the image passes a certain level of unacceptable NSFW, and it bricks. Not only does it brick, but it craps morality all over your prompt file along with a bunch of staggered, comma-delimited tags.

It needs a full NSFW identification fine-tune to work, which I have zero clue how to do today.

A highly disappointing outcome for a model with such promising potential.

Its capability to identify SFW imagery is actually substandard on top of that. If you want it to read text, go for it, but you can do that with GPT-4V.

Even with just baseline images, it was very, VERY hit or miss no matter which settings I chose. It's definitely far less reliable than expected. It would often mistake one color for another, completely forget certain elements, and omit entire subjects. Most of the vision-capable models I ran would identify the majority of the information fairly effectively in both anime and realism. This model did not identify anime at all, 3D was hit or miss, realism was shifty, and it even failed at actual real colors and settings.

I had it running on FOUR A40s and it would take nearly 2 minutes for ONE prompt. ONE. At a full 2 minutes per prompt, tagging the 2 million images would take roughly 2,778 days. Even at one minute per prompt it would be a solid 1,388 days.
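The time estimate is plain arithmetic: number of images times seconds per image, divided by the seconds in a day. Here is a minimal sketch of that calculation, assuming prompts run one at a time at a fixed latency. The function name and the `parallel_workers` parameter are made up for illustration and are not part of any Dolphin tooling.

```python
SECONDS_PER_DAY = 86_400


def days_to_caption(num_images: int, seconds_per_image: float,
                    parallel_workers: int = 1) -> float:
    """Wall-clock days needed to caption num_images at a fixed per-image latency.

    parallel_workers is a hypothetical knob: it assumes the work splits
    evenly across independent model instances, which may not hold in practice.
    """
    total_seconds = num_images * seconds_per_image / parallel_workers
    return total_seconds / SECONDS_PER_DAY


# 2 million images at a full 2 minutes each, one prompt at a time
print(round(days_to_caption(2_000_000, 120)))  # 2778

# Same dataset at 1 minute each
print(round(days_to_caption(2_000_000, 60)))   # 1389
```

Either way, the result is measured in years, which is the real point: at this latency the model is not practical for captioning a dataset of that size.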

