T2I2V Workflow – How I Make My Short Videos

May 22, 2025

Hey everyone!

I just noticed I’ve passed 1.7k followers — that was fast!
It feels like I hit 1k just a short while ago.
Thank you all so much for the continued support ❤️

Recently, I’ve been spending most of my spare time working on animated video content.
Luckily, many of you seem to be enjoying it so far — so I’m staying motivated and pushing forward.

I’ve received a number of messages asking how I make these short videos,
so I wanted to briefly share my current workflow here.

🧩 1. Image Generation (T2I) – with Forge

I use Forge to generate images.
While most people prefer ComfyUI, I personally find Forge much simpler and faster.
I recently upgraded from a 3090 to a 5090, and I’m extremely happy with the speed boost.

Once I get an image I like (usually generated with SDXL or Flux),
I upscale it with Hires Fix, and then do facial inpainting using a custom SD1.5 model
that reflects the Asian female aesthetic I personally like.
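
Side note: if you ever want to script this step instead of clicking through the UI, Forge inherits the A1111-style REST API when launched with --api. A minimal sketch, with purely illustrative prompt and parameter values:

# Assumes Forge was started with --api; the endpoint and payload fields
# follow the stock A1111 /sdapi/v1/txt2img schema
curl -s http://127.0.0.1:7860/sdapi/v1/txt2img \
  -H "Content-Type: application/json" \
  -d '{
        "prompt": "portrait photo, soft window light",
        "steps": 28,
        "width": 832,
        "height": 1216,
        "enable_hr": true,
        "hr_scale": 2.0,
        "denoising_strength": 0.4
      }' \
  | jq -r '.images[0]' | base64 -d > result.png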

🎥 2. I2V – with FramePack Studio

Next, I use FramePack Studio to convert still images into videos.

FramePack Studio is a separate tool from Forge, and I run it on another machine.
That setup includes two 4090s and one 5090, and I use Docker to deploy FramePack builds remotely (the Dockerfiles are in section 5 below).

The biggest advantage of FramePack is that it allows me to review the video as it's being generated,
one second at a time. If something feels off, I can simply stop and discard it — which saves a lot of time.

🖼 3. Upscaling – with Waifu2x Extension GUI

After generating the base video, I upscale it using Waifu2x Extension GUI.
It’s fast, clean, and gets the job done well.
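
For anyone who prefers the command line, the same engine exists as waifu2x-ncnn-vulkan; a rough sketch of the equivalent split / upscale / reassemble loop (the paths, noise level, and scale factor are illustrative, and this is not my actual GUI setup):

mkdir -p frames upscaled
ffmpeg -i base.mp4 frames/%06d.png                   # split video into frames
waifu2x-ncnn-vulkan -i frames -o upscaled -n 1 -s 2  # denoise level 1, 2x upscale
ffmpeg -framerate 30 -i upscaled/%06d.png -i base.mp4 \
    -map 0:v -map 1:a? -c:v libx264 -pix_fmt yuv420p upscaled.mp4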

⏩ 4. Frame Interpolation – with Video2X

Lastly, I convert my videos from 30fps to 60fps using Video2X.
I’ve tried pushing it to 144fps before, but honestly, 60fps looks perfectly smooth for my needs.
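
I run Video2X through its UI; if you just want a quick CLI approximation of 30fps to 60fps to compare against, ffmpeg's motion-compensated minterpolate filter is one option (quality is noticeably below RIFE-style interpolation, so treat it as a preview, not a replacement):

ffmpeg -i upscaled.mp4 -vf "minterpolate=fps=60:mi_mode=mci" \
    -c:v libx264 -pix_fmt yuv420p final_60fps.mp4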

🗒️ Final Thoughts

This is my current workflow.
Sure, with ComfyUI you could technically combine T2I and I2V into a single graph...
but I’ve been using AUTOMATIC1111 WebUI for so long, it’s hard for me to leave Forge behind 😅

I spend around 3 to 5 hours a day on this workflow,
and on a good day I can generate anywhere from 10 to 30 short videos that I’m happy with.

Lately I’ve been creating more NSFW content, which makes it harder to keep things fresh and creative.
If you have any scene ideas or interesting concepts, feel free to drop a comment — I’d love some inspiration!

Thanks for reading — and if you have questions, just leave them below. I’ll do my best to respond!

🐳 5. Dockerfile

The following Dockerfile is a CUDA 12.4-based build script for running FramePack Studio on an RTX 4090.

FROM nvidia/cuda:12.4.0-devel-ubuntu22.04

ARG UID=1000
ARG GID=1000

ENV DEBIAN_FRONTEND=noninteractive \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    VIRTUAL_ENV=/app/venv \
    PATH="/app/venv/bin:$PATH" \
    USER=appuser

# Create a non-root user
RUN groupadd -g $GID appuser && \
    useradd -u $UID -g $GID -m -s /bin/bash appuser

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    python3.10 \
    python3.10-venv \
    python3.10-dev \
    python3-pip \
    libgl1 \
    libglib2.0-0 \
    libsm6 \
    libxrender1 \
    libxext6 \
    ninja-build \
    && rm -rf /var/lib/apt/lists/*

# Prepare working directory
RUN mkdir -p /app && \
    chown -R $UID:$GID /app

USER $UID:$GID

# Clone FramePack Studio repo
RUN git clone https://github.com/colinurbs/FramePack-Studio.git /app
WORKDIR /app

# Copy additional files or modifications (if any)
COPY --chown=$UID:$GID mod/ /app/

# Create virtual environment
RUN python3.10 -m venv $VIRTUAL_ENV

# Upgrade pip
RUN pip install --upgrade pip

# Install PyTorch with CUDA 12.4 first, so the acceleration
# libraries below resolve against this pinned version
RUN pip install --no-cache-dir \
    torch==2.6.0 \
    torchvision \
    torchaudio \
    --index-url https://download.pytorch.org/whl/cu124

# Install optional acceleration libraries (CUDA 12.4 builds)
RUN pip install xformers --index-url https://download.pytorch.org/whl/cu124
RUN pip install triton
RUN pip install sageattention==1.0.6

# Install project requirements
RUN pip install --no-cache-dir -r requirements.txt

# Prepare output and model download directories
RUN mkdir -p /app/outputs /app/hf_download && \
    chmod -R 777 /app/outputs /app/hf_download

VOLUME /app/hf_download
VOLUME /app/outputs

# Expose default port for web UI
EXPOSE 7860

# Launch FramePack Studio
CMD ["python", "studio.py", "--share"]

This Dockerfile is a CUDA 12.8-based build script for running FramePack Studio on an RTX 5090.

I tried installing xFormers, Sage Attention, and Flash Attention,
but they actually made things slower —
perhaps they’re not yet optimized for PyTorch 2.7, though I’m not entirely sure.

FROM nvidia/cuda:12.8.0-devel-ubuntu22.04

# Set default user and environment variables
ARG UID=1000
ARG GID=1000

ENV DEBIAN_FRONTEND=noninteractive \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    VIRTUAL_ENV=/app/venv \
    PATH="/app/venv/bin:$PATH" \
    USER=appuser

# Create a non-root user for the application
RUN groupadd -g $GID appuser && \
    useradd -u $UID -g $GID -m -s /bin/bash appuser

# Install system dependencies and Python 3.10
RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    python3.10 \
    python3.10-venv \
    python3.10-dev \
    python3-pip \
    libgl1 \
    libglib2.0-0 \
    libsm6 \
    libxrender1 \
    libxext6 \
    ninja-build \
    && rm -rf /var/lib/apt/lists/*

# Create and set permissions for the application directory
RUN mkdir -p /app && \
    chown -R $UID:$GID /app

# Switch to the non-root user
USER $UID:$GID

# Clone FramePack Studio repository
RUN git clone https://github.com/colinurbs/FramePack-Studio.git /app
WORKDIR /app

# Create a virtual environment for Python packages
RUN python3.10 -m venv $VIRTUAL_ENV

# Upgrade pip to the latest version
RUN pip install --upgrade pip

# Install PyTorch and related libraries (CUDA 12.8 compatible versions)
RUN pip install --no-cache-dir \
    torch \
    torchvision \
    torchaudio \
    --index-url https://download.pytorch.org/whl/cu128

# Install Python dependencies required by FramePack Studio
RUN pip install --no-cache-dir -r requirements.txt

# Prepare output and model download directories so the volumes
# stay writable for the non-root user
RUN mkdir -p /app/hf_download /app/outputs && \
    chmod -R 777 /app/hf_download /app/outputs

# Declare volume mounts for model downloads and output videos
VOLUME /app/hf_download
VOLUME /app/outputs

# Expose default web UI port
EXPOSE 7860

# Start FramePack Studio with web UI sharing enabled
CMD ["python", "studio.py", "--share"]

🧪 6. FramePack Studio Benchmark: RTX 4090 vs 5090

Here’s a simple benchmark I ran to compare FramePack Studio performance on the 4090 and the 5090.
Please note this is just for reference; results will vary with system setup and TeaCache usage.

4090: running at its stock 450 W (no power limit applied)

5090: power-limited to 450 W

Settings: Steps=25 | Width=768 | LoRAs=2 | Length=8s | MP4 Compression=16

Type     | TeaCache | 4090 (sec) | 5090 (sec)
---------|----------|------------|-----------
Original | TRUE     | 819.16     | 558.58
Original | FALSE    | 1390.85    | 966.03
F1       | TRUE     | 650.48     | 467.13
F1       | FALSE    | 1068.94    | 848.26
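
Taking ratios from the table, the power-limited 5090 comes out roughly 1.26x to 1.47x faster than the 4090 across these four runs (e.g., 819.16 / 558.58 ≈ 1.47 for Original with TeaCache).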
