FAQ: Dataset/Model Filename & Trigger Word Conventions

What problem does this filename format solve?

The filename is designed to avoid collisions with generic or common names while also serving as a programmatic signal. It encodes both the trigger word and the dataset type, making it easy for scripts and training pipelines to identify and handle the dataset correctly.

Why not use a generic filename?

Generic filenames tend to overlap across projects and environments. This format ensures:

Uniqueness across datasets

Clear intent when parsed programmatically

No ambiguity about dataset content or usage

What does “bukk” mean in the filename?

bukk is a shortened form of bukkake. It functions purely as a dataset category label and helps distinguish this dataset from others at a glance.

What does “mx” stand for?

mx means mix. It indicates that the dataset is diverse, rather than focused on a single, narrowly defined subset.

What does “XLrd” represent?

XLrd specifies both:

The resolution of the dataset

The model architecture tier it is intended for

This makes it immediately clear what kind of model configuration the dataset targets.
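
As a rough illustration of how a script might read these labels, the sketch below looks up the three tokens described in this FAQ inside a filename. The exact filename layout is not spelled out here, so the sample filename, the function name describe_tokens, and the plain substring matching are all assumptions made for illustration, not the series' actual tooling.

```python
# Token meanings exactly as described in this FAQ.
TOKEN_MEANINGS = {
    "bukk": "dataset category label",
    "mx": "mix (diverse dataset)",
    "XLrd": "resolution and model architecture tier",
}


def describe_tokens(filename: str) -> dict[str, str]:
    """Return the FAQ meaning of each documented token found in the filename.

    Matching is a case-sensitive substring check, which is an assumption:
    the FAQ does not define separators or token positions.
    """
    return {
        token: meaning
        for token, meaning in TOKEN_MEANINGS.items()
        if token in filename
    }


if __name__ == "__main__":
    # Hypothetical filename, used only to exercise the function.
    for token, meaning in describe_tokens("bukkmxft_XLrd_example.safetensors").items():
        print(f"{token}: {meaning}")
```

Substring matching is deliberately naive. A short token such as mx can appear by accident in unrelated names, so a real pipeline would likely want a stricter pattern once the full filename layout is known.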

What is the trigger word for this dataset series?

The trigger word is:

bukkmxft

Why is the trigger word placed first?

The trigger word is kept at the beginning and applied consistently across all instances to:

Ensure reliable activation during training and inference

Make automated detection trivial

Maintain consistency across the entire dataset series
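
Because the trigger word always comes first, automated detection can be reduced to a prefix check. The sketch below is one possible way to do that. The helper names starts_with_trigger and missing_trigger are hypothetical, and the FAQ does not say whether "instances" refers to filenames, captions, or both, so the check is written to accept any string.

```python
TRIGGER_WORD = "bukkmxft"  # trigger word stated in this FAQ


def starts_with_trigger(text: str, trigger: str = TRIGGER_WORD) -> bool:
    """Return True if the text begins with the trigger word.

    Leading whitespace is ignored; the comparison is case-sensitive.
    """
    return text.lstrip().startswith(trigger)


def missing_trigger(items: list[str], trigger: str = TRIGGER_WORD) -> list[str]:
    """Return the items that do not begin with the trigger word."""
    return [item for item in items if not starts_with_trigger(item, trigger)]


if __name__ == "__main__":
    # Hypothetical sample strings, used only to exercise the check.
    samples = [
        "bukkmxft_XLrd_example.safetensors",
        "bukkmxft, sample caption",
        "sample caption without the trigger word",
    ]
    for item in missing_trigger(samples):
        print(f"missing trigger word: {item!r}")
```

A check like this can run before training to flag any entry where the trigger word is absent or not in first position.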