
Disguise Drop - LTX-2 / Wan2.1 14B FLF2V


Verified: SafeTensor

Type: LoRA

Stats: 747 · 140 · 1.2K

Published: May 19, 2025

Base Model: Wan Video 14B i2v 720p

Training: Epochs: 70

Usage Tips: Strength: 1

Hash: AutoV2 E824840D74
kabachuha

When the mask falls off… who’s really inside?

LTX-2 Disguise Drop!

A remake of my very first Wan LoRA, now for LTX-2! The sound effects take it to a new level!

The LoRA brings the classic unzipping-surprise cartoon effect, where a character suddenly unzips their skin to reveal another character who was inside them all along!

This time, unlike my other LTX-2 LoRAs so far, calibrated LR scheduling means it has virtually no grayness, and you can use it in image2video and first-last-frame2video modes alike!

The workflows are embedded in the videos, or you can use the .json file in the Hugging Face files folder at https://huggingface.co/kabachuha/ltx2-disguise-drop. It's the native workflow with KJNodes and VideoHelperSuite.

The training recipe is exactly the same as for my previous one-hand-squish LoRA, except for a lowered rank to prevent overfitting. This LoRA converged in 3520 steps.

The model was trained with "Disguise drop." as the prompt prefix, but you can also simply describe the action in detail.

Wan 2.1

This Wan2.1 first-last-frame LoRA aims to replicate the weird cartoonish full-body disguise transformation, where it turns out that another person or a cartoon character has been inside the source person all along (or vice versa), revealed when the first person unzips their skin.

This LoRA has been tested with kijai's Wan Wrapper nodes plus a ComfyUI Essentials histogram correction of 0.2 (otherwise the frames come out too dark), at the default LoRA weight of 1.0.

The LoRA has been trained using diffusion-pipe for 70 epochs with a variable flow shift, 4.3 (for ~ 40 epochs) -> 3.9, to first capture the macroscopic changes (the falling skin) and then the exact hand movements, using the Prodigy optimizer. The training resolutions are 688x688x49 and 688x688x65.
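The two-stage flow-shift schedule described above can be sketched as a tiny helper. This is only an illustration of the stated recipe, not code from the actual diffusion-pipe setup, and the 40-epoch switch point is approximate per the "~ 40 epochs" note:

```python
def flow_shift(epoch: int, switch_epoch: int = 40) -> float:
    """Two-stage flow-shift schedule: a higher shift (4.3) for roughly
    the first 40 epochs to capture macroscopic changes (the falling
    skin), then a lower shift (3.9) to refine the exact hand movements.
    The switch point is an assumption based on the "~ 40 epochs" note."""
    return 4.3 if epoch < switch_epoch else 3.9

# The schedule across all 70 training epochs:
schedule = [flow_shift(e) for e in range(70)]
```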

As the text encoder is not trained, keywords have little effect, and I recommend describing the action in detail following the prompt template below:

"""

The video begins with [object]. [She/He] touches the top of [her/his] head with [her/his] hand and then [she/he] fully unzips [her/his] skin in two halves slowly pulling the zip with [her/his] hand across [her/his] body from the top of [her/his] head, splitting [her/his] face, to the waist to reveal [target object] inside [her/him]. The [target object] then throws away the previous skin.

"""

An alternative ending sentence is "The previous skin then falls down.", if you don't want the skin tossed around. You can also use the same picture at the beginning and the end, resulting in an endless loop of unpeeling :)
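To avoid placeholder slips when filling the template by hand, the substitution can be sketched as a small helper. This is a hypothetical convenience function, not part of the released files; it lightly adapts the template's ending so the target's article reads naturally:

```python
def disguise_drop_prompt(source: str, target: str, pronoun: str = "she",
                         toss_skin: bool = True) -> str:
    """Fill the Disguise Drop prompt template for a source object and a
    target object (both with their article, e.g. "a cartoon fox"), and
    a pronoun ("she" or "he")."""
    subj, poss, objc = {"she": ("she", "her", "her"),
                        "he": ("he", "his", "him")}[pronoun]
    # Ending variation from the notes above; target.capitalize() stands
    # in for the template's "The [target object]" slot.
    ending = (f"{target.capitalize()} then throws away the previous skin."
              if toss_skin else "The previous skin then falls down.")
    return (f"The video begins with {source}. {subj.capitalize()} touches the top "
            f"of {poss} head with {poss} hand and then {subj} fully unzips {poss} "
            f"skin in two halves slowly pulling the zip with {poss} hand across "
            f"{poss} body from the top of {poss} head, splitting {poss} face, to "
            f"the waist to reveal {target} inside {objc}. {ending}")

print(disguise_drop_prompt("a woman in a red coat", "a cartoon fox", "she"))
```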

This is an experimental LoRA, and the success rate varies! The model sometimes tries to cheat by putting the subject into a "bag" for later unzipping. For the time being, add "bag, shell, case, hood, zipper, cloth, grainy, dark, movie shot, old movie, jungle, vegetation, plants" to the negative prompt. Naturally, the LoRA works best when the subjects share the same, or at least a similar, background.