LoRA Training - Dataset Creation - ChatGPT API

This is a short workflow guide on generating an image dataset through OpenAI's API with a tool I wrote called DallE Image Generator. It is a quick and mostly automated way to create a dataset.

WARNING: Each API call to OpenAI costs money, around $0.04 per generated image. Take that into account when deciding which workflow to use for your dataset generation.

This is part of a series on how to generate datasets with: ChatGPT API, ChatGPT, Bing, ComfyUI.

Setup

https://civitai.com/models/195318 or https://github.com/MNeMoNiCuZ/DallE-Image-Generator

The tool itself can be downloaded from either link. It's a small Python script.

The first thing you'll need to do is to get an OpenAI API Key. You can generate one here: https://platform.openai.com/account/api-keys

The key itself is free to create, but you also need credits on your account, which you can add here: https://platform.openai.com/account/billing/overview

Once you have your key, open the settings.ini file in a text editor of your choice.

Add your key to the api_key line, like this: api_key = sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
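If you want to double-check that the key was saved correctly, here is a minimal sketch of how the api_key line could be read back. This is not the tool's own code: the function names find_api_key and looks_filled_in are made up for illustration, and I am only assuming the file contains a plain "api_key = value" line as shown above.

```python
from typing import Optional


def find_api_key(ini_text: str) -> Optional[str]:
    """Return the value of the first 'api_key = ...' line, or None if absent."""
    for line in ini_text.splitlines():
        name, separator, value = line.partition("=")
        if separator and name.strip() == "api_key":
            return value.strip()
    return None


def looks_filled_in(key: Optional[str]) -> bool:
    """Rough sanity check (an assumption): starts with 'sk-' and is not the xxxx placeholder."""
    return bool(key) and key.startswith("sk-") and "xxxx" not in key


# Example with an in-memory string instead of the real file.
sample_text = "api_key = sk-example-not-a-real-key\n"
api_key = find_api_key(sample_text)
print(api_key, looks_filled_in(api_key))
```

To check the real file, pass in the text of settings.ini, for example pathlib.Path("settings.ini").read_text().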

Save the file and close it.

Running the tool

Double-click DallEGenerator.py to run it. I think it will ask whether you want to install any missing packages. If not, follow the installation instructions that should appear.

It now shows you the interface.

The top part has a few buttons and settings. Let's start by adjusting the settings.

Settings

Change the Dataset to the name of your model. I created this tool primarily to help me build Style LoRA datasets, but you can use it for whatever you want. The name here will be added as a trigger word to the caption files for your images.

You can also choose which model and resolution you want. I recommend leaving the defaults.

Make sure Captions is checked, as well as "Conceptify" if you are going to be creating concepts for your dataset (see the Conceptify section below).

After you have chosen your settings, press the "Save Settings" button and restart the program. Unfortunately, there is a bug where the Dataset name is not updated until you restart. Hopefully I'll fix it in the future, but better safe than sorry for now.

Editing the prompt

In the "Original Prompt" section (top text field), you can enter the main prompt you intend to use. What you write here is what DallE will generate. As you can see from the example prompt:

In a style and spirit of christmas, a semi-realistic artstyle, photo manipulation [SUBJECT], making you feel a sense of joy and happiness from the celebratory holiday season. sparkling effects, winter atmosphere and Christmas feeling, snow covered, with sparkling lights and a lot of red feeling

You can use [SUBJECT] or other words in [BRACKETS] as wildcards. Each one is later replaced by words from a list that you provide.

Subjects / Wildcards

Try clicking the "Analyze Prompt" button in the top left corner. You should now see that each word in brackets ([SUBJECT] in the example prompt) has its own text area. This means that the word [SUBJECT] will be replaced in the final text with EACH OF THE LINES in the [SUBJECT] section. In the example below, I have entered "airplane" and "pyramids".

If you then click the "Preview Prompt"-button, you will see a list of prompts that will be used to generate images for you.

If you have multiple [BRACKET WORDS], it will generate each possible combination of them. So be careful not to have too many combinations.
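To get a feel for how quickly the combinations add up, here is a small sketch of the expansion idea. It is not the tool's actual code: the function name expand_prompt and the [MOOD] wildcard are made up for illustration, and I am assuming every combination of the listed values produces exactly one prompt.

```python
import itertools
import re


def expand_prompt(template: str, wildcards: dict[str, list[str]]) -> list[str]:
    """Return one prompt for every combination of wildcard values."""
    # Collect the bracketed names in order of first appearance, without duplicates.
    names = list(dict.fromkeys(re.findall(r"\[([^\[\]]+)\]", template)))
    prompts = []
    for combination in itertools.product(*(wildcards[name] for name in names)):
        prompt = template
        for name, value in zip(names, combination):
            prompt = prompt.replace(f"[{name}]", value)
        prompts.append(prompt)
    return prompts


template = "a semi-realistic artstyle, [SUBJECT], [MOOD] atmosphere"
wildcards = {
    "SUBJECT": ["airplane", "pyramids"],
    "MOOD": ["winter", "festive"],  # hypothetical second wildcard
}
prompts = expand_prompt(template, wildcards)
for prompt in prompts:
    print(prompt)
print(len(prompts), "prompts")
```

Two wildcards with two values each already give 2 x 2 = 4 prompts, and every extra wildcard multiplies that number again, which is why the combination count (and the cost) can grow fast.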

The intended way to use this is to have the prompt describe the style that you want, then generate lots of images in that style while only swapping out the [SUBJECT]. This gives the AI lots of variations to train on, while the prompt surrounding the subject stays the same so that your images are consistent.

You can now choose how many images you want for each prompt by changing the Quantity setting at the top right. For styles, I would recommend 4 images per subject/concept. By default, all images are output to the same folder. See the next section for how to improve the output file structure.

Conceptify

When you add each concept/subject in the list of subjects, there's one more layer to it. You can add an initial word (or words), which will be used to name your files, as well as create a matching caption file. I call this "Conceptifying".

To do this, simply add a word in [brackets] at the start of a line in the subject list itself. Example:

[airplane]airplane in the skies

This would send a prompt to the image generator with the text "airplane in the skies", but the tool understands that this image is related to the concept of "airplane", so it will name your file airplane and caption it "airplane".
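The split between concept and prompt text can be sketched like this. It is only an illustration of the idea, not the tool's real parsing: the function name split_concept is made up, and what happens to a line without a [concept] prefix is my own assumption.

```python
import re


def split_concept(line: str) -> tuple[str, str]:
    """Split '[concept]prompt text' into (concept, prompt text)."""
    text = line.strip()
    match = re.match(r"^\[([^\]]+)\](.*)$", text)
    if match is None:
        # Assumption: without a [concept] prefix, the whole line serves as both.
        return text, text
    return match.group(1).strip(), match.group(2).strip()


concept, prompt_text = split_concept("[airplane]airplane in the skies")
print(concept)      # used for the file name and the caption
print(prompt_text)  # sent to the image generator
```

The same split works for the multi-word concepts in the list further down, such as "[landscape, hills, terrain]landscape with hills, (terrain:0.7), nature".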

This is very useful, as you can now give it a long list of more detailed subjects while still keeping the concepts short, concise and clean, which is good for a style LoRA at least.

Generating

This should now essentially output a dataset for a full style LoRA that you can train with. It's likely that you may need to improve some of the concepts, prompts etc. based on the results.

Remember to choose the number of images for each prompt with the Quantity option at the top. For a style LoRA I recommend 4 for each concept. You can also make do with fewer, but the model won't be as flexible, as it hasn't seen as many different examples.

When you press the "Generate" button, it will show you an estimated cost. This is just an estimate. It is usually accurate for me, but use this tool at your own risk.
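If you want to sanity-check the number yourself before generating, the arithmetic is simple. This is a rough sketch, not the tool's own estimate: the function name is made up, and the $0.04 price is the approximate figure from the warning at the top of this guide, so check OpenAI's current pricing for your chosen model and resolution.

```python
PRICE_PER_IMAGE_USD = 0.04  # approximate figure from this guide, not an official price


def estimate_images_and_cost(
    num_prompts: int,
    quantity: int,
    price_per_image: float = PRICE_PER_IMAGE_USD,
) -> tuple[int, float]:
    """Return (total images, estimated cost in USD)."""
    total_images = num_prompts * quantity
    return total_images, total_images * price_per_image


# 37 subjects (the length of the concept list in this guide) at Quantity 4.
images, cost = estimate_images_and_cost(num_prompts=37, quantity=4)
print(f"{images} images, roughly ${cost:.2f}")
```

With these numbers, that is 148 images for roughly $5.92.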

The files should be output one at a time in a folder with today's date.

I think OpenAI's API has a cap of around 1000 images per hour, so it may become an issue if you're running several of these at once. For normal personal use, it should be fine.

List of Concepts

You should figure out what kinds of concepts work well for your training. But if you want to create something similar to the types of Style / World Morph LoRAs that I have created, here's the list that I use.

Big thanks to Konicony for writing the original guide to this workflow, and to Navimixu for teaching me additional tips and tricks and for providing the core of the list of concepts below.

[airplane]majestic airplane soaring through the skies
[android]humanoid mechanical robotic cyborg character inside a spaceship
[battle mech]battle mech on a battlefield
[bookshelf]bookshelf in a living room
[camera]hi-tech camera on a table in an office
[car]car driving on a road
[castle]castle in the mountains
[city]modern city skyline, closeup daytime
[cloud]cloud in the skies, fluffy
[coffee machine]coffee machine in a kitchen
[combine harvester]combine harvester plowing the fields on the farm, planted harvest crops
[cube]cube hovering in the air, artifact
[dirigible]dirigible soaring the skies on a sunny day
[excavator]excavator working the construction site
[forest]forest trees, close up of trees, stems and roots, branches in forest
[grass]grass closeup blade of grass in grassy field
[industrial boiler]industrial boiler, technical parts, pipes, gauges, in a warehouse
[lake]lake in nature with water, ripples of a pond
[landscape, hills, terrain]landscape with hills, (terrain:0.7), nature
[light rays]light rays from a glowing center in space
[macro, cells, microscope]cells and molecules and macro photography
[magical energy]magical energy swirling magic energies, glowing
[mystical runes]mystical runes and magic energies
[planet]planet in space, detailed and realistic, stars and cosmos in the background
[pyramid]pyramids in the desert
[recliner]recliner in a living room
[space station]space station in outer space, ISS, hi-tech, starry cosmos background in space
[spaceship]spaceship in outer space, {star-trek|millennium falcon}, stars and cosmos background
[sphere]sphere with hi-tech details, alien artifact
[submarine]submarine deep under water
[tank]tank panzer wagon, gun barrel, driving through the jungle and desert, tech
[toaster]toaster on a kitchen counter in a modern kitchen
[tree]tree in the forest, detailed leaves
[truck]truck driving in a city
[tv]vintage TV in a living room
[water]water closeup of a liquid surface, wet, dripping puddle
[wooden chair]wooden chair in a living room