LoRA Training Data - Upload to CivitAI

Sharing Training Data on CivitAI

I always love seeing other people's training data, so I figured, let's start sharing my own, and maybe other people will be interested in what I used to train my models. Be the change you want to see etc.

There is actually a category to upload training data on Civit. I wish it was a bit more exposed. If you agree, you can always thumb up the Feature Request.

If you upload your own training data, I would love to see it! Please post a link below and I will link to it in the article as a resource for all training data on Civit.

How to add your own training data to CivitAI

Would you also like to share training data? Do it!

  1. Click on Manage Files on your model

  2. Upload your training data in a zip-file and choose "Training Data" as the type
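The zip for step 2 can be built by hand, or with a few lines of Python. This is a minimal sketch, not a CivitAI tool: the function name `zip_training_data`, the flat folder of images with `.txt` captions, and the list of file extensions are all assumptions for illustration.

```python
import zipfile
from pathlib import Path

# Assumed file types: images plus their .txt caption files.
DATASET_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".txt"}


def zip_training_data(dataset_dir, zip_path):
    """Pack images and captions from dataset_dir into one zip; return the archived names."""
    dataset_dir = Path(dataset_dir)
    archived = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(dataset_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in DATASET_EXTENSIONS:
                # Store paths relative to the dataset folder so the zip unpacks cleanly.
                name = path.relative_to(dataset_dir).as_posix()
                archive.write(path, arcname=name)
                archived.append(name)
    return archived
```

Calling `zip_training_data("my_dataset", "training_data.zip")` (both names hypothetical) would produce a single file to upload with the "Training Data" type.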


Links to my dataset gathering articles

Bing Dataset Generation

ChatGPT Plus Dataset Generation

ChatGPT API Dataset Generation

ComfyUI "One Click" Dataset Generation


Links to my training data and lessons learned

In the Attachments section of this article, you'll find my current Kohya_ss LoRA training data config (kohya_ss Example Config - CakeStyle.json). Remember to change the name, file paths, settings and sample info before using it. This is not a LoRA training guide.
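If you prefer to change the name and paths in a script rather than by hand, something like the sketch below could work. It treats the config as plain JSON and only replaces keys that already exist in the file, so a misspelled key is reported instead of silently added. The function name `customize_config` is hypothetical, and which keys the config actually contains is something to check in the file itself.

```python
import json
from pathlib import Path


def customize_config(source_path, target_path, overrides):
    """Copy a JSON config, replacing only keys that already exist in it.

    Returns the override keys that were NOT found, so they can be checked by hand.
    """
    config = json.loads(Path(source_path).read_text(encoding="utf-8"))
    missing = []
    for key, value in overrides.items():
        if key in config:
            config[key] = value
        else:
            missing.append(key)
    Path(target_path).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return missing
```

For example, `customize_config("kohya_ss Example Config - CakeStyle.json", "MyStyle.json", {"output_name": "MyStyle"})` would write a copy with a new name, assuming the file has a key called `output_name`. If it does not, the key comes back in the returned list.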

Disclaimer:

My learnings below are just my own theories and understanding. This is not a technical guide, and this just documents my process of making models. If you follow the steps of my earliest models, you'll get worse results. Also, so far I've mostly made styles. My musings below are not useful for things like characters, art styles and other concepts.

Sushi Style

v2

This was my first model. I had no idea what I was getting into, and how addictive it would be. I just wanted to see if it could be done. On my first few days with SD, I found Konyconi's style LoRAs and was blown away by what you could do. So my first idea was to sushify something. I read Konyconi's style creation guide and the video guide from Olivio Sarikas.

Easy, right? Well, my initial results weren't great. I used way too few images, and of too low quality. I assumed that the trainer would pick out the good parts of the images I provided. This is not how it works. It picks up all parts. Including stuff I didn't want in my sushi. Like grass.

Anyway, like 12 versions later, we have the v1 model I uploaded. If I were to re-train it, I would get much better and many more images from Bing. This is currently the best way I know to train models. I also have better settings now.

Gelato Style

v2

This model took 4 versions to get to this result. A few images were removed where I didn't like their influence on the style.

The whole model works, but is still inflexible. If you try to make a "tree", you WILL get the tree that's part of the training data, unless you prompt it more and specify in detail what you want it to look like. I later learned that this is called memorization, meaning that my model has memorized my very specific image, rather than learned and understood the concept of a gelato tree. It means I need more variations of the concept to show it, and I need fewer repeats of my data in the training. So all in all, I need a lot more images for my training data.
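To make the "more images, fewer repeats" trade-off concrete, here is a back-of-the-envelope sketch. It assumes the common convention that one epoch shows every image `repeats` times and that steps are counted per batch. The numbers are made up for illustration and are not my actual settings.

```python
import math


def training_steps(num_images, repeats, epochs, batch_size=1):
    """Estimate optimizer steps: every image is seen `repeats` times per epoch."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# Hypothetical numbers: a small set with many repeats vs. a larger set with few.
small_set = training_steps(num_images=15, repeats=20, epochs=10, batch_size=2)
large_set = training_steps(num_images=100, repeats=3, epochs=10, batch_size=2)
print(small_set, large_set)  # 1500 1500
```

Both runs take the same number of steps, but in the second one each individual image is seen 30 times (3 repeats x 10 epochs) instead of 200 (20 repeats x 10 epochs), which leaves less room for memorizing any single image.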

Later on, I also learned that if you don't add environment or locations to your training data, the trainer will figure out the most logical location for your data, and place generations there if the generation prompt doesn't specify it. This means that your images will end up at an ice cream shop with this model if you don't prompt the location.

Chocolate Wet Style

v2

With this one I started experimenting with shapes. I figured, if I give it geometric shapes like a cube, a sphere and a pyramid, maybe it can extract a lot of useful shape information from this, and therefore learn it pretty well. I'm still not sure about it, but I tend to include it in all my styles anyway. It's not everything you need but I think it helps.

I also started adding more copies of each item here. You can see there are several tree images, to get less memorization and more variety of that subject. You can see from the removed images that I removed all the dog images I had. This was because all animals started to look like that and I didn't like it. It does animals fine without any animal training data anyway, but I'm thinking of incorporating them in my training again, with more varied training images.

Whipped Cream On Top Style

The purpose of this one was to combine it with my other food LoRAs, and to see if rather than transforming the subjects into a type of food, it would instead add something on top. I think it kind of works, but it gets boring pretty quickly.

The images are a mix of Leonardo.ai and Bing, with some real images mixed in too.

I decided to make an adult version too, which almost works like a "whipped cream censor" model funnily enough. The attached training data is for the SFW version.

Peach Fuzz

With this model I wanted to see if you could force more visible vellus hair (the name for peach fuzz).

I used images of blonde female vellus hair as it's more subtle than hairy male bodies for example.

The dataset is from stock photo sites and some googling. No generated images are used.

I have no rights to the images used, so this LoRA is not commercially usable.

Pink Skin

I had the idea to have a way to control the phenomenon of having red elbows and knees. It's usually seen in colored manga images, to get some color variation in the skin tones I think. You will also see it if you walk on your knees and elbows for a bit. Probably from irritating the blood vessels in there or something. It looks a little bit like a sunburn, which could be useful too.

The images are a mix of stock photos, and some generated images. I painted on the effect on all body parts in Photoshop. In the first attempt I made everything way too red, and I ended up with something like this for a more subtle effect. I also used WD14 captioning, for no particular reason, I was just testing around.

I have no rights to most of the images used, so this LoRA is not commercially usable.

Semla Style

This was the first model where I trained using Bing images exclusively. I was still doing random subjects without any plan here. I was using some geometric shapes, and a lot of animals for this set. I do not have the prompt I used, but it was pretty much something like:

A [parrot] made out of semla

So pretty straight-forward and to the point. Sometimes you don't need much more than that.

Croissant Style

Most of the images for this one were generated by Bing, with a few from other sources. If I were to tweak it, I would like to figure out how to make it not convert the skin of humans to crispy croissants. That grosses me out so much. I should probably remove the examples of faces made out of croissants from the data.

There were a few images removed (included in dataset sharing). They contributed too much of a "cute face" to early versions of the model. So I removed them from the dataset.


More in-between to come


Carnage Style (NSFW, gore / violence)

The training data is SFW though. It's mostly just tendrils, teeth and glowing. No real photos or gore.

I made the style with Bing Image Creator. Here's the prompt I ended up using for a lot of it:

styled after marvel's Carnage, photorealistic [tree in a forest], made out of glowing red spirals, swirls, long sharp teeth fangs and claws, white glowing eyes, tendrils, red slimy skin, brooding horror and darkness, evil energies, slimy guts

In this case, I would mostly swap out "tree in a forest" for the next concept.

For some of them, you have to adjust it. Things like the environment, or water, where I didn't want to have as many faces and mouths/teeth in general.

If you don't specify a location, it's likely to just be a black background. If you train only on white / black backgrounds, it will be absorbed by the training data and will generate without an environment unless you prompt for it. So I try to always have some kind of background in most descriptions.

I went with 40 concepts for this model. I'm including a few new concepts for micro, macro and "ethereal" types of connections in the model. Such as [macro, cells, microscope], light rays, magical energy, mystic runes, grass, forest, water, planet.

Galactic Empire Style

As with the Carnage Style, I tagged everything as "white" in the hope of increasing color control. I'm gonna make a new version without this and compare. I'll update this once I have some results.

My prompt for Bing Image Creator was:

studio photography in a style and spirit and sleek design of star Wars imperial regime, a Photorealistic white plastic [airplane]. The Empire from Star Wars. sleek plastic retro futuristic sci Fi look, black secondary details, [soaring the blue skies among the clouds]

I often put the environment description last. To make sure to get the style for my SUBJECT, rather than the ENVIRONMENT. Unless the subject I'm going for is an environment, or the environment is a key part of it.
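The subject-first, environment-last ordering can be sketched as a small template. The template below is condensed from the Galactic Empire prompt, with its two bracketed parts turned into named slots. The helper name `build_prompt` and the second subject/environment pair are made up for illustration.

```python
# Condensed from the Galactic Empire prompt, with the two brackets as named slots.
TEMPLATE = (
    "studio photography in a style and spirit and sleek design of star Wars "
    "imperial regime, a Photorealistic white plastic {subject}. "
    "sleek plastic retro futuristic sci Fi look, black secondary details, {environment}"
)


def build_prompt(subject, environment=""):
    """Fill the subject slot first and put the environment at the very end."""
    prompt = TEMPLATE.format(subject=subject, environment=environment)
    # Drop the dangling separator when no environment is given.
    return prompt.rstrip(", ")


pairs = [
    ("airplane", "soaring the blue skies among the clouds"),
    ("toaster", "in a kitchen"),  # hypothetical pair for illustration
]
for subject, environment in pairs:
    print(build_prompt(subject, environment))
```

When the subject itself is an environment, calling `build_prompt("forest")` without a second argument simply leaves the environment slot out.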

NES-Style

It's possible that the training data is a bit too abstract in its shapes. I feel like often I'm not getting what I'm hoping for, but sometimes I very much do. If I were to re-train it, I think I would get new data for zoomed out images. Hopefully with more of a plastic feeling.

taintedcoil2 trained a version of it that got slightly different results. I'd be interested in seeing more people attempting to re-train this one and sharing their version.

My prompt for Bing Image Creator was:

A photorealistic RAW photo of a [toaster] in the style of vintage video games, NES, Nintendo Entertainment System, with sharp geometry and edges, gray plastic. With red, white and dark gray details, [in a kitchen]. Controller. Game system.

You have to switch parts of it around, and remove some parts for different prompts of course.

Minion-Style

This dataset was generated with different colors in the prompting and captions, which makes it color-changeable.

The Bing prompt for this one was quite hard to deal with. It had to be changed a lot depending on the results I got. Very often I got the Minion characters, instead of the thing I was prompting for.

a photorealistic, [silver castle], styled inspired the Minions from Illumination, big white eyes with metal goggles, anthropromorphic hybrid creature, 3D CG Render, movie, cute,  with yellow details, [on the countryside]

photorealistic, [pink magical energy], pixar style, 3D CG Render, movie, with yellow details

photorealistic, black [macro, cells, celldivision, microscope view], pixar style, 3D CG Render, movie, big white eyes with goggles from The Minions

Keep in mind when developing a look that you are essentially art-directing, but you have to imagine what the results will look like after the model is trained. You're not gonna get an exact replica of your input as output.

For this one I always had to consider how many yellow details to include alongside the other colors. I ended up having yellow "minion"-skin on some of the images, but not all. I'm not sure how much it has affected the training.

Spy World 1950s

This model set out to make a style that integrated hidden cameras and microphones into everyday things. The prompting was pretty straight-forward when generating training data with Bing:

1950s spy themed and espionage flavored Photorealistic [toaster in a kitchen]. with the feeling of intelligence Gathering, unknown situations and a sense of dread, retro scifi gadgetry with lens, spy camera and hidden wires

If I were to gather data for another version of this set, I think I would tone it down to be much more subtle with the cameras. To make them really hard to detect. But I would worry that the training may not be able to pick up on it.

Science DNA Style

This model ran on a mix of mostly 2 Bing prompts:

a phororealistic [toaster] made out of genetic sequences and building blocks of life. proteins, enzymes, cells, chromosomes, Hi-Tech, DNA molecule, glowing science discovery
Inspired by science, a phororealistic [car] made out of genetic sequences, proteins, enzymes, cells, chromosomes, Hi-Tech, DNA molecule, science discovery, scientific diagram, DNA

I had this idea of a very transparent style that would almost work like an X-Ray on objects, but showing exaggerated versions of the "building blocks" that makes the object up.

In the end the result is more tangible, but still fairly ethereal.

Expedition Style

The images for this dataset are very pretty. I'm not super happy with the result of the model. I feel like it didn't pick up enough of the style compared to the images.

I may try to figure out a way to capture more of the style. Not quite sure how.

The Bing prompt was for the most part:

In a style of adventurous explorers and archeologists, a Photorealistic [toaster in a kitchen]. Giving the feeling and sense of adventure, exploring the unexplored, mysterious tombs and temples, archeological digs and traps. the spirit of ancient knowledge in dusty tomes, with some magical elements

Cyberpunk Style / World

This dataset is a bit generic. The data is generated from Bing with this prompt:

a cyberpunk inspired, netrunner styled motorbike, in neo tokyo, Inspired by cyberpunk art, Augmented reality. High tech. Low life. Cybernetics. Crime rate.

I then decided to train on it twice. They are both trained from the same data. The only difference is the trigger keyword. One uses the actual word of "Cyberpunk Style", so it "inherits" more information about this subject from the base model you're using. The other one uses a made-up word which should make the training more pure and it should inherit less knowledge about what cyberpunk means, outside of the training data. In theory at least.
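Preparing the two caption sets from the same data could look roughly like this. It is a sketch under assumptions: captions are comma-separated tags, the trigger goes first, and the example captions are invented. Only the two trigger words come from the models themselves.

```python
def add_trigger(caption, trigger):
    """Put the trigger word first in a comma-separated caption, without duplicating it."""
    tags = [tag.strip() for tag in caption.split(",") if tag.strip()]
    tags = [tag for tag in tags if tag.lower() != trigger.lower()]
    return ", ".join([trigger] + tags)


# Hypothetical captions; the real ones are in the shared training data.
captions = ["motorbike, neo tokyo", "vending machine, alley at night"]

# Same data, two trigger words: a real phrase and a made-up token.
real_word = [add_trigger(c, "Cyberpunk Style") for c in captions]
made_up = [add_trigger(c, "C7b3rp0nkStyle") for c in captions]
print(real_word[0])  # Cyberpunk Style, motorbike, neo tokyo
print(made_up[0])    # C7b3rp0nkStyle, motorbike, neo tokyo
```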

From my test generations so far, I feel like the "Style" one is generally more creative and unique feeling. This was the one with the weird activation trigger (C7b3rp0nkStyle).

Halloween Glowing Style

This model was a part of the Halloween Collaboration between myself, navimixu, DonMischo and taintedcoil2. Our goal was to each train a Halloween model, and also train a model with the training data from all our datasets.

We wanted to see what the results would be when multiple "related" art styles were merged into one model.

This is my contribution to that model.

This model was trained during the transition from Dalle2 to Dalle3. So the training data may be a little bit inconsistent, as the original prompting for the model stopped working for Dalle3.

The dataset is pretty good, but it ended up with too many pumpkins. I wanted more sinister glows and eyes to be the focus, but it was nearly impossible to get that without also getting pumpkined to hell.

It's the largest style dataset I have gathered so far, 751 images, as I wanted to cover all the concepts that the other collaborators were using. It is certainly overkill, but it also produces very stable results. For the merged model, I feel like the glowing eyes were very prominent, as the number of images from this set was larger than from the other ones.

Halloween Collaboration Model

This model was a part of the Halloween Collaboration between myself, navimixu, DonMischo and taintedcoil2. Our goal was to each train a Halloween model, and also train a model with the training data from all our datasets.

We wanted to see what the results would be when multiple "related" art styles were merged into one model.

This is the resulting model of everything combined!

A total of 1738 images, in 5 different styles, all trained at the same time.

The result is actually a fairly consistent, yet flexible model that actually takes from all styles.

It is a little bit desaturated, but it definitely works well!
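The image counts give a rough idea of why the Halloween Glowing Style comes through so strongly in the merge. A quick calculation, using the 751 images of that set and the 1738 images of the combined set (the helper name `dataset_share` is just for this example):

```python
def dataset_share(part, total):
    """Return the fraction of the merged dataset that comes from one contributor."""
    if total <= 0:
        raise ValueError("total must be positive")
    return part / total


glowing = 751    # images in the Halloween Glowing Style set
merged = 1738    # images in the combined collaboration set
share = dataset_share(glowing, merged)
print(f"{share:.1%}")  # 43.2%
```

With 5 styles, an even split would be 20% each, so this one set carries roughly twice its "fair" share of the training images.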

Swedish Desserts

A simple object LoRA. It was trained with images from Google, on 14 different subjects (different desserts) at the same time. The number of images per subject varied between 7 and 19. Collaboration between myself and @kvacky.

While it's trained on specific subjects with keywords, it still picked up the "feeling" of Swedish cakes and desserts, so it can also be used as a dessert/cake enhancing model which adds some Swedish flavors to it.

Cinnamon Bun Style

This was the second model I created entirely with the use of ChatGPT and DallE-3.

The image prompt is something like this:

Photo of a delicious-looking cinnamon-bun [SUBJECT]. The entire aircraft is constructed from fresh, golden-brown cinnamon bun material, with swirls of cinnamon and icing visible.

I use a handcrafted prompt to automate the generation of the dataset images.

Piano Style

The training data from this model was gathered mostly from ChatGPT. I used some kind of starting prompt, but ChatGPT changes the prompt a bit with each one. I used prompts like:

Photo of a shiny luxurious sleek and elegant coffee machine in a kitchen, styled after the aesthetics of a piano.

and

Photo of a sleek coffee machine in a kitchen, exuding the aesthetics of a piano. The machine captures the reflective quality of a piano's surface

BatmanCore

The images from this set were generated with DallE3. Most of them were generated fully automatically with my DallE Image Generator script: (https://github.com/MNeMoNiCuZ/DallE-Image-Generator). A guide on how to use it should be up on the github, as well as in a CivitAI Article (https://civitai.com/user/mnemic/articles).

The prompt I used was mostly:

Envision a [SUBJECT], with an ultra-modern aesthetic, characterized by a sleek silhouette and aggressive geometry. Inspired by the dark knight. The design should incorporate a monochromatic palette, dominated by a deep, matte black and accented with elements that suggest cutting-edge technology. With textures reminiscent of kevlar or carbon fiber. Subtle bat motifs or insignia should be integrated into the design, suggesting a connection to a dark and mature bat-themed super hero.

Overall I think the dataset is fine, but the results are often not too interesting. It keeps on producing way too many armored men if I use the word "Batman" as part of the triggering keyword, and if I train without it, I don't feel like it got the style good enough. There's probably something that can be done to improve it, but I'm not quite sure what.

Transformers Style

The dataset was generated with a mix of Bing and ChatGPT (DallE-3 for both).

The prompt used was something like: A photorealistic RAW photo, a semi-realistic artstyle, photo manipulation of a red and blue spaceship in outer space, in the style of by Michael Bay's Transformers, metallic-looking sharp design, shiny two toned, dual colors scheme, metallic parts transforming it, high-class luxury item

The new experiment here was generating images with mixed colors, and captioning them as such. I think the results are quite positive. If prompted with multiple colors, it will usually split the design up in a reasonable way. Example captions:

blue, white, red, airplane
red, gold, camera
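Captions in this "colors first, then subject" form are easy to generate. A minimal sketch, assuming the caption is stored as a `.txt` file with the same base name as the image (a common convention, but check what your trainer expects). The function names are made up for this example.

```python
from pathlib import Path


def color_caption(colors, subject):
    """Build a caption like 'blue, white, red, airplane' from colors plus subject."""
    return ", ".join(list(colors) + [subject])


def write_caption(image_path, colors, subject):
    """Write the caption to a .txt file next to the image (same base name)."""
    caption_path = Path(image_path).with_suffix(".txt")
    caption_path.write_text(color_caption(colors, subject), encoding="utf-8")
    return caption_path


print(color_caption(["blue", "white", "red"], "airplane"))  # blue, white, red, airplane
print(color_caption(["red", "gold"], "camera"))             # red, gold, camera
```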

Christmas Postcard Style

The style was developed with Bing with DallE 2.5. But I didn't get around to generating the images then. They were instead generated with ChatGPT and DallE3. I ended up with a prompt like this for DallE3:

In a style and spirit of christmas, a Photorealistic [cup of coffee in a kitchen] making you feel a sense of joy and happiness from the celebratory holiday season. sparkling effects, winter atmosphere and Christmas feeling, snow covered, with sparkling lights and a lot of red feeling

The goal was to get MAXXIMUM CHRIXXMAS, very much over-the-top joy and merriment. I'm not loving the visuals of the result, but I can't argue with the learned style, it very much is exaggerated Christmas. I ended up naming it Christmas Postcard since all the decorations and framing make it useless for most other things.

Christmas Winter Style

This dataset was created as a bit of an experiment. I used 38 different concepts, but only one image of each concept. This means that the model is a bit poorly trained, and you don't get a lot of variety within each concept unless you specify your details. For example, all images of cities are very similar.
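One way to spot concepts with too little variety is to count images per concept before training. A rough sketch, assuming captions are plain strings and that a concept counts as present when its name appears in the caption. The function names, the example captions and the default threshold of 4 are illustrative choices, not a rule.

```python
from collections import Counter


def images_per_concept(captions, concepts):
    """Count how many captions mention each concept (case-insensitive substring match)."""
    counts = Counter({concept: 0 for concept in concepts})
    for caption in captions:
        lowered = caption.lower()
        for concept in concepts:
            if concept.lower() in lowered:
                counts[concept] += 1
    return counts


def low_variety(counts, minimum=4):
    """Return concepts with fewer than `minimum` images, sorted by name."""
    return sorted(concept for concept, n in counts.items() if n < minimum)


captions = ["city street at night", "city skyline", "snowy forest"]  # made-up captions
counts = images_per_concept(captions, ["city", "forest", "castle"])
print(low_variety(counts, minimum=2))  # ['castle', 'forest']
```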

As I did use the words "Christmas" and "Wintery" in the model name, it did of course maintain a lot of knowledge about these subjects. Which is perfectly fine in this case.

The dataset was generated using my DallE Batch Image Generator: https://civitai.com/models/195318/

It generated the dataset in 10 minutes, all pre-captioned (except for 2 images that I took from developing the prompt in Bing).
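When a few hand-added images are mixed into a pre-captioned set, it may be worth checking that nothing is missing a caption. The sketch below assumes a flat folder where each image has a `.txt` caption with the same base name. The function name and the list of image extensions are assumptions.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}  # assumed image types


def images_missing_captions(dataset_dir):
    """List image files that have no non-empty .txt caption with the same base name."""
    missing = []
    for image in sorted(Path(dataset_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption = image.with_suffix(".txt")
        if not caption.is_file() or not caption.read_text(encoding="utf-8").strip():
            missing.append(image.name)
    return missing
```

An empty list back from `images_missing_captions("my_dataset")` (folder name hypothetical) means every image is paired with a caption.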

Neon Christmas Style

This is a merge of three models from MNeMiC and DonMischo.

The images that come from the Christmas Postcard Style are quite overwhelming. This is likely because there are more of them (4 for each concept), so they are more consistent.

The Christmas Wintery style also merges quite well with it, so even though there's only 1 image for each concept, it comes through quite a lot.

The Neon Christmas style comes through less automatically, but if you use words like "neon", the trained data will show up quite clearly.

ComfyUI One Click Generator

From this point on, I will mostly be using the ComfyUI One Click LoRA method as outlined by this walkthrough guide on civit. Mirror.

Davy Jones Locker Style

This model started as a DallE 2.5 style, and ended up in a ComfyUI learning experience.

It uses the IPAdapter functionality to extract the style from a source image, and apply it to the generated image. And with combinatorial prompts, I can quickly batch through prompts designed for the concepts I use for training. This is also the first time I've used SDXL to generate training images.
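The idea behind combinatorial prompts is simply "every option from one list combined with every option from another". The sketch below shows that idea in plain Python. It is not the ComfyUI node itself, the option lists are hypothetical, and the style text is a shortened form of the style prompt for this model.

```python
from itertools import product

STYLE = "found at the bottom of the sea, covered in barnacles"  # shortened style prompt

# Hypothetical option lists; the real concept lists are in the shared workflow.
subjects = ["toaster", "car", "castle"]
environments = ["in a kitchen", "on a beach"]


def combinatorial_prompts(subjects, environments, style):
    """Yield one prompt for every subject/environment combination."""
    for subject, environment in product(subjects, environments):
        yield f"{subject} {environment}, {style}"


prompts = list(combinatorial_prompts(subjects, environments, STYLE))
print(len(prompts))  # 6
print(prompts[0])    # toaster in a kitchen, found at the bottom of the sea, covered in barnacles
```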

The ComfyUI workflow is shared along with the training data, as well as the Style Image used.

The prompt I used to pull out the style from the source was:

found at the bottom of the sea, covered in barnacles, tentacles, moist, dark, damp, wet, dirty, filthy, untouched for centuries

Deadpool-Style

The dataset for the Deadpool style was generated in ComfyUI using my Style Generator workflow (still to be shared). It came from one source image from DallE3, and I then generated the dataset using the following style prompt with SDXL in ComfyUI:

(red:1.4) colored, black and white details secondary color, modern sharp design, made from luxurious materials, leatherwork, stitches, deadpool-style

NES Voxel Style

The dataset for this style was generated in ComfyUI using my Style Generator workflow (still to be shared). It came from one source image from DallE3, and I then generated the dataset using the following style prompt with SDXL in ComfyUI:

a colored, photorealistic (3D:1.2) (Voxel:1.3) hi-tech [SUBJECT], 3D-pixel, (wireframe:1.3), In the style and spirit of Nintendo NES pixel art, AR, 3D, thick outline,  wireframe
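The `(term:weight)` parts in these style prompts are the emphasis notation, where a number above 1 asks for more of that term. If you build such prompts in a script, a tiny helper keeps the formatting consistent. The helper name `weighted` is made up for this sketch, and it only formats text. It does not talk to ComfyUI.

```python
def weighted(term, weight=1.0):
    """Format a term in the (term:weight) emphasis syntax; weight 1.0 needs no wrapper."""
    if weight == 1.0:
        return term
    return f"({term}:{weight:g})"


parts = [
    "a colored, photorealistic",
    weighted("3D", 1.2),
    weighted("Voxel", 1.3),
    "hi-tech [SUBJECT]",
]
print(" ".join(parts))  # a colored, photorealistic (3D:1.2) (Voxel:1.3) hi-tech [SUBJECT]
```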

Wrong Hole Generator

To generate the dataset for this model I used the IPAdapter workflow for ComfyUI. I just added an image of a suitable hole to use as a reference, and gave it a list of wildcards to generate holes onto different things.

Hornify Style

Used the ComfyUI One Click LoRA method as outlined by this walkthrough guide on civit.

I forgot to remove the goats/deers that I wasn't happy with from the first generated dataset. So I did a second version shortly after to fix the model.

Jedi Style

Used the ComfyUI One Click LoRA method as outlined by this walkthrough guide on civit.

The prompt for generating the set was:

inspired by the Jedi, blue light, Glow, soft edges, hi-tech, sci-fi, [SUBJECT], sci-fi, tech, soft, beige, white design

Cardboard Style

The images for this dataset were generated using Bing and ChatGPT (DallE-3 for both).

The prompt was usually something like:

A realistic, life-sized,(SUBJECT GOES HERE), it is entirely made out of cardboard. It should have a detailed, cardboard construction, appearing like a large-scale model. Its texture and appearance should clearly show that it is made of cardboard, with visible corrugations and seams, large chunks of cardboard, oversized, detailed background

Dark Charcoal Style

This dataset was generated with the Bing dataset generation method.

My prompt was usually something like this:

a dark charcoal painting of a bookshelf in a living room. Professional master of charcoal, dark strokes, rough lines, full screen

Semla Style v2

A new dataset generated with the Bing dataset generation method. I realized the old version wasn't up to par when I tried to re-use it for SDXL training. The old model was also a bit unreliable.

Gaelic Pattern Style

An initial image was generated with Bing using this prompt:

a creative fantastic creative depiction of a detailed and Lá Fhéile Pádraig-themed, "industrial boiler in a warehouse". Inspirid by celtic culture, iconographic patterns

And a resulting image was then used with the Comfy Workflow and a similar prompt to create a synthetic dataset resulting in the model.


Others' Training Data

navimixu

konyconi

taintedcoil2

DonMischo

vimesrandom471

demoran - LoRA Guide - List of Character Models with training data

JohnJohn

duskfallcrew

_Pixel

LVl5Mage

Isabelia

KhajiitHasWares

Ceranos

Jentix

generateddavidk726

ArtfullyPrompt

BerserkFG

zer0TF

TikFesku

denrakeiw

Mikalichou

skraaglenax

sinatra

miette

kuritsutian197

steffangund

norod78

diogod

jinxit4

Please consider sharing your own training data <3
