Image To Prompt.json

<img src="https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/a4f261ce-7529-458d-83a7-f5d2ea605b94/width=525/a4f261ce-7529-458d-83a7-f5d2ea605b94.jpeg" /><h1 id="what-is-joycaption-vx0bj6p8j">What is JoyCaption?</h1>JoyCaption is a tool that automatically generates descriptive captions for images, built to support the training of image diffusion models. Automated captioning offers two key benefits:<ul><li>Models can be trained or fine-tuned on a much wider range of images, because the images no longer need pre-existing captions or manually written descriptions.</li><li>Text-to-image models trained on descriptive captions generate higher-quality images, as reported in the DALL-E 3 research paper.</li></ul>The goal of JoyCaption is to provide a powerful, free, open, and unrestricted solution with performance comparable to GPT-4 for caption generation. For more information: <a target="_blank" rel="ugc" href="https://github.com/fpgaminer/joycaption">GitHub - JoyCaption</a><hr /><h1 id="where-and-how-to-install-joycaption-hrafem8qn">Where and How to Install JoyCaption?</h1>Installation instructions for using JoyCaption with ComfyUI are in this Git repository: <a target="_blank" rel="ugc" href="https://github.com/EvilBT/ComfyUI_SLK_joy_caption_two/blob/main/readme_us.md">ComfyUI_SLK_joy_caption_two - ReadMe</a><hr /><h1 id="workflow-with-comfyui-and-joycaption-l9iz02141">Workflow with ComfyUI and JoyCaption</h1>The workflow for using JoyCaption with ComfyUI is divided into three main steps:<ol><li>Loading Images<ul><li>Images can be imported from a local disk, from a URL, or by loading a folder containing multiple images.</li><li>Folder loading is especially useful for preparing training datasets for LoRAs.</li></ul></li><li>Loading and Configuring the VLM Model<ul><li>The VLM (Visual Language Model) performs the inference, i.e., the text generation (captions or prompts).</li><li>Adjustable parameters include:<ul><li>Caption type: description, training prompt, art critique, etc.</li><li>Text length: short, medium, long, or very long.</li><li>Temperature: higher values make the responses more creative, lower values make them more precise.</li><li>Custom prompt: an optional instruction to guide text generation.</li></ul></li></ul></li><li>Saving Results<ul><li>Each generated text is saved in the same folder as its image, under a matching name, which simplifies the preparation of training datasets for LoRAs.</li><li>Images are automatically resized to a maximum height and/or width of 1024 pixels.</li></ul></li></ol>
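The saving step can be illustrated outside ComfyUI with a small Python sketch. It is not the workflow's actual code: the helper names (`fit_within`, `caption_path_for`, `save_caption`) are hypothetical, and it assumes that the resize preserves the aspect ratio, that images already within 1024 pixels are left unchanged, and that captions use a `.txt` extension. The description above only states the 1024-pixel limit and the matching file names.

```python
from pathlib import Path

MAX_SIDE = 1024  # size limit stated in the workflow description


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Return (width, height) scaled down so that neither side exceeds max_side.

    Assumption: the aspect ratio is preserved and smaller images are not upscaled.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def caption_path_for(image_path: Path) -> Path:
    """Caption file lives next to the image and shares its base name.

    Assumption: captions are stored as .txt files.
    """
    return image_path.with_suffix(".txt")


def save_caption(image_path: Path, caption: str) -> Path:
    """Write the caption beside its image and return the caption file path."""
    out = caption_path_for(image_path)
    out.write_text(caption.strip() + "\n", encoding="utf-8")
    return out
```

For example, `fit_within(2048, 1024)` gives `(1024, 512)`, and `caption_path_for(Path("dataset/cat_01.png"))` gives `dataset/cat_01.txt`, which is the image and caption pairing that LoRA training tools commonly expect.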

minature.png

Image to Text - Workflow (Joy-Caption two)
