FAQ: Dataset Filename & Trigger Word Conventions
What problem does this filename format solve?
The filename format avoids collisions with generic or common names and doubles as a programmatic signal. It encodes both the trigger word and the dataset type, so scripts and training pipelines can identify the dataset and handle it correctly.
Why not use a generic filename?
Generic filenames tend to overlap across projects and environments. This format ensures:
Uniqueness across datasets
Clear intent when parsed programmatically
No ambiguity about dataset content or usage
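As a rough illustration of "parsed programmatically", a script might inspect a filename as sketched below. This is a minimal sketch under stated assumptions, not the series' actual tooling. The function name describe_dataset_file, the underscore separator, and the .zip extension in the example are hypothetical. Only the trigger word and the XLrd tag come from this FAQ.

```python
from pathlib import Path

# Values taken from this FAQ.
TRIGGER_WORD = "bukkmxft"
KNOWN_TAGS = ("XLrd",)


def describe_dataset_file(filename: str) -> dict:
    """Report whether a filename starts with the trigger word and which known tags it contains.

    Hypothetical helper: the real filename layout (separators, extension)
    is not specified in this FAQ, so only prefix and substring checks are used.
    """
    stem = Path(filename).stem  # filename without directory or final extension
    return {
        "stem": stem,
        "has_trigger_prefix": stem.startswith(TRIGGER_WORD),
        "tags": [tag for tag in KNOWN_TAGS if tag in stem],
    }


# Hypothetical filename, used only to exercise the helper.
info = describe_dataset_file("bukkmxft_XLrd.zip")
print(info["has_trigger_prefix"], info["tags"])
```

Prefix and substring checks keep the sketch independent of any particular separator, which is why no splitting on "_" or "-" is attempted.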
What does “bukk” mean in the filename?
bukk is a shortened form of bukkake. It functions purely as a dataset category label and helps distinguish this dataset from others at a glance.
What does “mx” stand for?
mx means mix. It indicates that the dataset is diverse rather than focused on a single, narrowly defined subset.
What does “XLrd” represent?
XLrd specifies both:
The resolution of the dataset
The model architecture tier it is intended for
This makes it immediately clear what kind of model configuration the dataset targets.
What is the trigger word for this dataset series?
The trigger word is:
bukkmxft
Why is the trigger word placed first?
The trigger word is kept at the beginning and applied consistently across all instances to:
Ensure reliable activation during training and inference
Make automated detection trivial
Maintain consistency across the entire dataset series
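The "trigger word first" rule can be checked and enforced with a few lines of code. The sketch below assumes that "instances" refers to comma-separated caption strings, which this FAQ does not state explicitly. The helper names starts_with_trigger and ensure_trigger_first are hypothetical, and the sample captions are placeholders.

```python
TRIGGER_WORD = "bukkmxft"  # trigger word given in this FAQ


def starts_with_trigger(caption: str, trigger: str = TRIGGER_WORD) -> bool:
    """Return True if the first comma-separated tag is exactly the trigger word."""
    first_tag = caption.split(",", 1)[0].strip()
    return first_tag == trigger


def ensure_trigger_first(caption: str, trigger: str = TRIGGER_WORD) -> str:
    """Return the caption with the trigger word as its first tag.

    Any existing copies of the trigger word are removed first, so the
    result contains it exactly once, at the front.
    """
    tags = [tag.strip() for tag in caption.split(",") if tag.strip()]
    tags = [tag for tag in tags if tag != trigger]
    return ", ".join([trigger] + tags)


# Placeholder captions, assumed to be comma-separated tag lists.
print(starts_with_trigger("bukkmxft, photo, outdoors"))
print(ensure_trigger_first("photo, bukkmxft, outdoors"))
```

Comparing the whole first tag, rather than using a plain prefix match, avoids false positives on longer tags that merely begin with the same characters.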