<h2 id="qwen3-8b-imagevideo-caption-(uncensored)">QWEN3-8B Image/Video Caption (Uncensored)</h2><a target="_blank" rel="ugc" href="https://civitai.com/articles/27449/qwen-videoimage-caption-full-nsfw-standalone">Version 2</a> - This version is highly attuned to NSFW content. However, due to image-only training, it may caption some videos as if they were images.<ul><li>This version requires 24GB or more of VRAM.</li><li>Full finetune (NOT A LORA MERGE) of the 8B parameter model (vision frozen).</li><li>BF16/TF32 training. Unfortunately, due to the size of the model, Adam8bit needed to be used.</li><li>Version 2 can use nearly any LLM prompt. Version 1 should use the prompt given, in whole or in part.</li><li>Details regarding the training of version 1 can be read <a target="_blank" rel="ugc" href="https://civitai.com/articles/27203">here</a>.</li></ul><hr />Note: No image size safety is built in. I have captioned 4k images, which are processed into a very large tensor shape; however, reducing images to 1k is recommended. I have an Ampere series card and cannot convert this to FP8 or NF4 in high quality. If you have experience converting models with Linux and Transformer Engine, DM me.
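
The note above recommends reducing large images to about 1k before captioning, since no size check is built in. Below is a minimal sketch of the dimension arithmetic, assuming "1k" means a longest side of 1024 pixels (the page does not define it). The names `fit_within` and `max_side` are hypothetical and are not part of the model or of any library.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Return (width, height) scaled down so the longest side is at most max_side.

    The aspect ratio is preserved, and images that are already small
    enough are returned unchanged (no upscaling).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Multiply before dividing so the integer product stays exact.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


# Example: a 4k (3840x2160) frame reduced to a 1024-pixel longest side.
print(fit_within(3840, 2160))
```

The returned size could then be passed to whatever image library you already use for resizing, before the image is handed to the captioning model.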

BF16_Version_2

BF16_Version_1.0

QWEN3-8B-VL Image/Video Caption (Uncensored)

Type	Other
Stats	79 0
Reviews	Positive (13)
Published	Mar 12, 2026
Base Model	Other
Training	Steps: 2,000 Epochs: 20
Hash	AutoV2 06A5AC3FBF