Basically, this is a text improvement LoRA, but it also works as an aesthetics enhancement LoRA (unintended, but still welcome). Beta version.
How to use:
There are three main trigger words: "speech bubble", "text", and "snapchat".
While the first two are self-explanatory, "snapchat" adds a Snapchat-like semi-transparent bar with text on it.
The basic pattern is: "words words words", trigger word. In other words, put the text you want in quotes, then a comma, then the trigger word.
In some cases you can also use [n] to create a speech bubble or a string of text (unstable!).
Just look at the prompts in the example images :3
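If you build prompts from a script, here's a minimal sketch of the pattern above. The function name `build_prompt` and the `extra_tags` parameter are made up for illustration; only the three trigger words and the quoted-text-then-trigger order come from this description.

```python
# Hypothetical helper: assembles a prompt in the '"text", trigger word' pattern.
# The trigger words are the three listed above; everything else is an assumption.
TRIGGER_WORDS = ("speech bubble", "text", "snapchat")


def build_prompt(text, trigger, extra_tags=()):
    """Return '"<text>", <trigger>' followed by any extra tags."""
    if trigger not in TRIGGER_WORDS:
        raise ValueError(f"unknown trigger word: {trigger!r}")
    # Quoted text first, then the trigger word, then the rest of the tags.
    parts = [f'"{text}"', trigger, *extra_tags]
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt("hello there", "speech bubble"))
    print(build_prompt("good morning", "snapchat", ["1girl", "selfie"]))
```

The extra tags in the second call are just placeholders; the prompts in the example images are the better reference for what actually works.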
Training info:
15 epochs, 2925 steps
Dataset: ~250 images, text captioned manually, WD3 Large for tags
Batch size 2, gradient accumulation 4, keep tags 5, shuffle the rest, no dropout; the text encoder (TE) was NOT trained.
Resolutions = [768, 1024, 1280]
Trained on an RTX 5060 Ti 16 GB for ~16 hrs.
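For anyone comparing settings, here's a quick sketch of the numbers that follow from the training info above. It only does arithmetic on the listed values; the variable names are mine, and the time per step is rough because the 16 hours is an approximate figure.

```python
# Values taken from the training info above.
epochs = 15
total_steps = 2925
batch_size = 2
grad_accum = 4          # gradient accumulation steps
train_hours = 16        # approximate ("~16 hrs")

# Steps in one epoch (2925 divides evenly by 15).
steps_per_epoch = total_steps // epochs

# Samples contributing to one weight update with gradient accumulation.
effective_batch = batch_size * grad_accum

# Rough wall-clock time per step, assuming the whole ~16 hrs was training.
seconds_per_step = train_hours * 3600 / total_steps

print(f"steps per epoch:      {steps_per_epoch}")
print(f"effective batch size: {effective_batch}")
print(f"seconds per step:     {seconds_per_step:.1f}")
```

Whether a "step" here means one batch or one accumulated update depends on the trainer, so treat the per-step time as a ballpark only.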
I want to continue working on this LoRA; however, captioning the text by hand is a huge pain in the ass.
