The Essence of Star Trek DS9

Type: LoRA
Published: Mar 18, 2024
Base Model: SDXL 1.0
Training: 7,000 steps, 20 epochs
Usage Tips: Strength 0.7
Trigger Words: star trek ds9
Hash (AutoV2): 48F41DF1C7
Creator: mlsa

This LoRA aims to capture the general "feel" of Deep Space Nine and to add some of the show's core themes to the outputs it influences. It is not intended to replicate any one character perfectly, although it has been trained on the names of some of the main characters.

I trained it on 1000~ highly curated images from Star Trek: Deep Space Nine, captioned with a mix of automatic and manual methods: photos, magazine scans, official artwork, episode screencaps, archival footage and more.

The source images were captioned with 'star trek ds9' as a prefix so most of the time it will help if you add that to your prompt somewhere.
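As a small illustration of the tip above, here is a minimal sketch of a helper that prepends the trigger phrase when a prompt does not already contain it. The function name `with_trigger` and the example prompt are made up for illustration; they are not part of InvokeAI or any other tool.

```python
TRIGGER = "star trek ds9"  # trigger phrase used as the caption prefix


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Return the prompt with the trigger phrase prepended if it is missing."""
    # Case-insensitive check so "Star Trek DS9" also counts as present.
    if trigger.lower() in prompt.lower():
        return prompt
    return f"{trigger}, {prompt}"


# Hypothetical example prompt.
print(with_trigger("a crowded promenade, cinematic lighting"))
```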

It was trained with SDXL and works well with SDXL Lightning.

In my testing it works especially well with Lightning Fusion XL v1.4 as the base model, although the parameters can be somewhat finicky, so make very small changes when adjusting them.

Suggested Inference Parameters

I use InvokeAI for all image generation.

  • Resolution: 1024x1024

  • Base model: Lightning Fusion XL

    • Sampler: LMS Karras

    • Steps: 6-8

    • CFG scale: 1.5-1.8~

    • LoRA strength (weight): 0.6-0.8~
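Because the settings are sensitive, it can help to sweep the LoRA strength in small increments while holding everything else fixed. The sketch below only builds the list of settings to try; it does not call InvokeAI or any other generator. The `Settings` class, the `lora_sweep` function, the 0.05 step size and the default values chosen from the middle of each suggested range are all assumptions for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Settings:
    """Suggested starting point (mid-range values are an assumption)."""
    width: int = 1024
    height: int = 1024
    sampler: str = "LMS Karras"
    steps: int = 7
    cfg_scale: float = 1.6
    lora_strength: float = 0.7


def lora_sweep(base: Settings, low: float = 0.6, high: float = 0.8,
               step: float = 0.05) -> list[Settings]:
    """Return copies of `base` that differ only in LoRA strength."""
    count = round((high - low) / step)
    # Round to 2 decimals to avoid float drift such as 0.7500000000000001.
    return [replace(base, lora_strength=round(low + i * step, 2))
            for i in range(count + 1)]


for s in lora_sweep(Settings()):
    print(s.lora_strength, s.cfg_scale, s.steps)
```

Generating one image per entry with a fixed seed makes it easier to see which strength works for a given prompt.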

Training Data

  • I scoured the internet and ended up with around 2000~ images, which I refined down to 1500~ and then, for v1.0, to around 700.

  • I upscaled and de-noised many of the source images with a combination of Topaz Photo AI, Pixelmator Pro and a few hacked-together scripts.

  • Captioning

    • First, I used the WD14 tagger in Kohya_SS to auto-caption every source image.

    • I then (painstakingly) spent many hours manually adding caption details to the majority of images, including character names, locations, etc.

    • The model was trained with the captions keeping the first 6 tokens and shuffling the rest.


Limitations

Generally I'm pretty happy with v1.0, although there are some weak areas which include:

  • Parameters are quite sensitive (e.g. in some situations changing the LoRA strength from 0.75 to 0.70 can make all the difference).

  • It does not seem able to render two known subjects at once. For example, if you prompt something like "star trek ds9, Miles and Bashir drinking coffee", it will likely generate either two Miles or two Bashir characters. I haven't looked into how to remedy this, but would be keen to if I train another version.

  • It sometimes generates glitchy eyes. I suspect I need a few more high-resolution close-ups from various angles, and I may have overtrained part of the model.

  • While the model isn't intended to directly replicate individual characters, I have added some of my favourites. They're not perfect and not designed to be, but they're not terrible either. Known issues (at least with Lightning as a base model):

    • Bashir's forehead is often oddly large.

    • Dax's eyes are often too 'dreamy' and TOS-like.

    • Miles's eyes are often glitched.

    • Quark looks too evil and chiselled.

    • I didn't include enough training images of Worf, so he has a more generic Klingon vibe when he is in the prompt.