
KPOP Idol turn

Updated: Jul 30, 2025

Tags: concept, kpopdance

Format: SafeTensor

Type: LoRA

Published: Jul 19, 2025

Base Model: Wan Video 14B t2v

Training

Steps: 4,655
Epochs: 133

Usage Tips

Clip Skip: 1
Strength: 1

Trigger Words

doing the zxtp_1y7urn move

Training Images: download available

Hash (AutoV2): D81885037B

ZXTOPOWER

2025-07-21

Alright, v2.0 is finally here!

After releasing v1.0, my first public LoRA, I honestly had no idea how much testing it would take just to get one of these out the door. Haha...

This whole process has given me a newfound respect for everyone who makes and shares LoRAs; the amount of time and effort involved is just incredible.

In my notes for v1.0, I mentioned that I got the best results by training on full-length video clips at once. I stuck with that method here, but it looks like it created a new little problem: if you try to generate a video longer than the training data (e.g., more than the original 2 seconds or 32(+1) frames), the motion gets skipped. This seems to be because the model is learning the entire motion at once without any breaks. I thought about trying to fix this by slicing the motion into different sections, but I'd already sunk so much time into this version that I decided to save that technique for a future concept.
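
In practice this means capping the generated frame count at what the training clips covered. A minimal sketch of that budget, assuming the usual Wan 2.1 4n+1 frame-count convention (an assumption about the base model, not something specific to this LoRA):

```python
# Minimal sketch: keep the generated clip within the motion span the LoRA saw in
# training (2 s at 16 fps). The 4n+1 frame-count rule is an assumption based on
# Wan 2.1's temporal compression, not something specific to this LoRA.
TRAIN_SECONDS = 2
TRAIN_FPS = 16

def safe_num_frames(seconds: float = TRAIN_SECONDS, fps: int = TRAIN_FPS) -> int:
    """Largest 4n+1 frame count that fits inside the trained motion span."""
    frames = int(seconds * fps)      # 32 raw frames
    return (frames // 4) * 4 + 1     # 33, matching the "32(+1)" note above

print(safe_num_frames())  # 33; asking for more tends to skip parts of the move
```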

Another change from v1.0 is that I didn't do the block weight adjustments this time. I realized that while those tweaks worked great in my personal workflow, they weren't a one-size-fits-all solution and didn't always work well in every environment.

So, while v2.0 is still far from perfect, I think it's the best one from the current training batch. I'm already thinking about a future version with even better motion, but no promises on when that'll be ready!


Quick heads-up: I didn't get to fully test this with img2vid. πŸ™

It tends to work best on images where the full body is visible, just like in the training data. If parts of the body are covered up or occluded, the motion might get a little wonky.

2025-07-19

Hey everyone!

I'm working on v2 now.

And I'm using all the awesome work you've shared in the gallery as a reference!

Honestly, I had no idea so many people would be interested in this concept!

Hope to release it soon, thanks!

2025-07-15

Hotfix v1.1 - Re-added important weights that were excluded in the previous version.

I2V isn't quite working. My LoRA is super picky about reference images. I'll test more and get it right in the next version. SORRY! πŸ™Œ

Recommendations

Workflow: Kijai - WanVideoWrapper T2V, FusionX

Utility LoRA: Wan Self Forcing Rank 16 (Accelerator) at strength 1.0

Sampler: LCM, flowmatch_distill, UniPC

Recommended LoRA Strength: 0.8 - 1.2

πŸ‘‰(The training data includes a workflow)
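
For anyone not using ComfyUI, here is a minimal text-to-video sketch assuming the Hugging Face diffusers WanPipeline API and that its LoRA loader accepts this file's key layout; the file name, adapter name, and prompt are placeholders, and the Kijai workflow above remains the reference setup:

```python
# Illustrative sketch only; the recommended setup is Kijai's WanVideoWrapper in ComfyUI.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-14B-Diffusers", torch_dtype=torch.bfloat16
).to("cuda")

# Load the dance LoRA within the recommended strength range (0.8 - 1.2).
pipe.load_lora_weights("kpop_idol_turn_v2.safetensors", adapter_name="idol_turn")
pipe.set_adapters(["idol_turn"], adapter_weights=[1.0])

video = pipe(
    prompt="a woman on stage, full body visible, doing the zxtp_1y7urn move",
    height=1024,
    width=576,
    num_frames=33,          # stay within the trained 2-second motion span
    guidance_scale=5.0,
).frames[0]
export_to_video(video, "idol_turn.mp4", fps=16)
```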

About This LoRA

This is my first public LoRA release.

The sample videos were generated using the workflow by Kijai. I have also confirmed that this LoRA works with Wan 2.1 14B I2V models.

The trigger phrase is "doing the zxtp_1y7urn move". However, I've noticed that the motion can be activated simply by increasing the LoRA strength, even without the trigger phrase. To be honest, I'm not entirely sure why this happens! :p

Training Details

Dataset: The model was trained on 10 video clips, each 2 seconds long at 16 fps with a resolution of 576x1024.
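
For reference, a rough preprocessing sketch that conforms source clips to that spec with ffmpeg; this is illustrative only, not necessarily the pipeline used for this LoRA, and the directory names are placeholders:

```python
# Conform source clips to 2 s, 16 fps, 576x1024. Requires ffmpeg on PATH.
import subprocess
from pathlib import Path

def conform(src: Path, dst: Path) -> None:
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-i", str(src),
            "-t", "2",     # trim to the 2-second motion
            "-r", "16",    # resample to 16 fps
            "-vf", "scale=576:1024:force_original_aspect_ratio=increase,crop=576:1024",
            "-an",         # drop audio
            str(dst),
        ],
        check=True,
    )

out_dir = Path("dataset")
out_dir.mkdir(exist_ok=True)
for i, clip in enumerate(sorted(Path("raw_clips").glob("*.mp4"))):
    conform(clip, out_dir / f"clip_{i:02d}.mp4")
```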

Captioning Process: Captions were generated in a two-stage process. First, 64 frames were extracted from each video for initial captioning with the Qwen-VL-7B model. These captions were then rewritten and refined by the Qwen-32B model.
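
A rough outline of that two-stage flow is sketched below. The frame sampling uses OpenCV; the two model calls are hypothetical stubs standing in for Qwen-VL-7B and Qwen-32B, since the actual inference code is not part of this card:

```python
import cv2
from pathlib import Path

def sample_frames(video_path: Path, num_frames: int = 64) -> list:
    """Evenly sample frames from a clip for the vision-language model."""
    cap = cv2.VideoCapture(str(video_path))
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    wanted = (
        {round(i * (total - 1) / (num_frames - 1)) for i in range(num_frames)}
        if total > 1 else {0}
    )
    frames = []
    for idx in range(total):
        ok, frame = cap.read()
        if not ok:
            break
        if idx in wanted:
            frames.append(frame)
    cap.release()
    return frames

def caption_with_vlm(frames) -> str:
    # Hypothetical stub: this is where Qwen-VL-7B would produce the initial caption.
    return "a person performs a turning dance move in a studio"

def rewrite_with_llm(draft: str) -> str:
    # Hypothetical stub: Qwen-32B rewrites/refines the draft caption here.
    return draft

for clip in sorted(Path("dataset").glob("*.mp4")):
    caption = rewrite_with_llm(caption_with_vlm(sample_frames(clip)))
    clip.with_suffix(".txt").write_text(caption)  # caption file saved next to the clip
```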

Methodology: I experimented with various methods to train a LoRA for continuous motion like dancing. I found that training on full-length video clips at once yielded the best results. Furthermore, it seemed more effective to omit detailed motion descriptions in the captions and let the model infer the action itself.

Overfitting Issue: A side effect of this approach is that the model learned the entire content of the videos, leading to significant overfitting. I plan to conduct further research into more effective captioning strategies to address this.

Post-processing & Limitations: After training, I adjusted the block weights to minimize the inclusion of unnecessary details. This significantly reduced image degradation even at higher strengths. However, the problem of the model directly replicating the training data remains. As mentioned, the LoRA is heavily biased towards the dataset, so prompts for attributes like clothing or gender will not be very effective.
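
Block-weight adjustments of this kind can also be baked into a copy of the LoRA file by scaling selected blocks' weights. A hedged sketch follows; the key pattern, the LoRA matrix naming, and the choice of blocks and scales are all assumptions for illustration, not the values used for this model:

```python
# Scale the LoRA delta for selected transformer blocks and save a tweaked copy.
import re
from safetensors.torch import load_file, save_file

SCALE_BY_BLOCK = {i: 0.5 for i in range(10)}   # example: dampen blocks 0-9

state = load_file("kpop_idol_turn_v1.safetensors")
for key in list(state.keys()):
    m = re.search(r"blocks\.(\d+)\.", key)     # assumed key pattern
    if m and ("lora_up" in key or "lora_B" in key):   # scaling the up/B matrix scales the delta
        state[key] = state[key] * SCALE_BY_BLOCK.get(int(m.group(1)), 1.0)

save_file(state, "kpop_idol_turn_v1_blockweighted.safetensors")
```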

I welcome all feedback and constructive criticism on my LoRA.