@dougbtv
Last active March 18, 2026 18:32
Running LTX-2 Video Generation with vLLM-Omni and ComfyUI - Conference Guide

Running LTX-2 Video Generation with vLLM-Omni and ComfyUI

This guide shows you how to run LTX-2 video generation (text-to-video and image-to-video) using vLLM-Omni as the inference backend and ComfyUI as the frontend.

Background

LTX-2 is a powerful video generation model from Lightricks that supports both text-to-video (T2V) and image-to-video (I2V) generation with audio synthesis.

Prerequisites

  • Docker or Podman
  • NVIDIA GPU with sufficient VRAM (recommended: 24GB+)
  • Hugging Face account and token (for downloading models)
  • Basic familiarity with terminal/command line

Part 1: Setting Up ComfyUI

1.1 Clone and Install ComfyUI

# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

1.2 Install vLLM-Omni Custom Node

The vLLM-Omni custom node enables ComfyUI to connect to vLLM-Omni inference servers.

# From ComfyUI directory
cd custom_nodes

# Clone vLLM-Omni repository
git clone https://github.com/vllm-project/vllm-omni.git

# Copy the ComfyUI custom node
cp -r vllm-omni/apps/ComfyUI-vLLM-Omni ./

# Clean up (optional)
rm -rf vllm-omni

# Verify installation
ls -la ComfyUI-vLLM-Omni

The custom node requires no additional dependencies beyond what ComfyUI already has installed.

1.3 Start ComfyUI

# From ComfyUI root directory
cd ..
python main.py --cpu

ComfyUI will start on http://127.0.0.1:8188

Note: We use --cpu flag because vLLM will handle GPU inference. This keeps ComfyUI lightweight and prevents GPU memory conflicts.
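Once ComfyUI is up, a quick reachability check confirms the UI is actually listening before you move on. This is a small sketch assuming the default port 8188; the helper name check_url is ours, not part of ComfyUI:

```shell
# Report whether a URL answers within 2 seconds (-f makes curl fail on HTTP errors)
check_url() {
  if curl -fsS -o /dev/null --max-time 2 "$1" 2>/dev/null; then
    echo "reachable: $1"
  else
    echo "not reachable: $1"
  fi
}

check_url http://127.0.0.1:8188
```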

Part 2: Running vLLM-Omni with LTX-2

vLLM-Omni supports both Text-to-Video (T2V) and Image-to-Video (I2V) modes for LTX-2. You'll need to choose which mode at server startup.

2.1 Set Up Environment Variables

# Set your Hugging Face token for model downloads
export HF_TOKEN="your_hf_token_here"

# Create directories for model cache and output
mkdir -p ~/model_cache
mkdir -p ~/video_output
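Before launching the container, a quick preflight sketch can catch a missing token or directory early (the warning text is illustrative, not emitted by vLLM itself):

```shell
# Warn early if HF_TOKEN is unset or empty -- gated model downloads will fail without it
if [ -z "${HF_TOKEN:-}" ]; then
  echo "warning: HF_TOKEN is not set" >&2
fi

# Confirm the cache/output directories exist, creating them if needed
for d in "$HOME/model_cache" "$HOME/video_output"; do
  mkdir -p "$d" && echo "ok: $d"
done
```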

2.2 Option A: Text-to-Video (T2V) Mode

Use this for generating videos from text prompts only.

podman run -d --name ltx2-t2v \
  --device nvidia.com/gpu=0 \
  --security-opt=label=disable \
  --userns=keep-id \
  --security-opt label=level:s0 \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e CUDA_VISIBLE_DEVICES=0 \
  -e HF_TOKEN="${HF_TOKEN}" \
  -e HF_HOME=/hf/hub \
  -e HUGGINGFACE_HUB_CACHE=/hf/hub \
  -e TRANSFORMERS_CACHE=/hf/hub \
  --mount type=tmpfs,target=/workspace/vllm-omni/.triton \
  --mount type=tmpfs,target=/workspace/vllm-omni/.cache \
  -v ~/model_cache:/hf/hub \
  -v ~/video_output:/output \
  -p 8000:8000 \
  -w /workspace/vllm-omni \
  public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:4036cd547c7552fd9329a87f38b9d6c484f3f14b \
  vllm serve \
    Lightricks/LTX-2 \
    --omni \
    --port 8000

2.3 Option B: Image-to-Video (I2V) Mode

Use this for animating existing images (should support T2V as fallback).

podman run -d --name ltx2-i2v \
  --device nvidia.com/gpu=0 \
  --security-opt=label=disable \
  --userns=keep-id \
  --security-opt label=level:s0 \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  -e CUDA_VISIBLE_DEVICES=0 \
  -e HF_TOKEN="${HF_TOKEN}" \
  -e HF_HOME=/hf/hub \
  -e HUGGINGFACE_HUB_CACHE=/hf/hub \
  -e TRANSFORMERS_CACHE=/hf/hub \
  --mount type=tmpfs,target=/workspace/vllm-omni/.triton \
  --mount type=tmpfs,target=/workspace/vllm-omni/.cache \
  -v ~/model_cache:/hf/hub \
  -v ~/video_output:/output \
  -p 8000:8000 \
  -w /workspace/vllm-omni \
  public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:4036cd547c7552fd9329a87f38b9d6c484f3f14b \
  vllm serve \
    Lightricks/LTX-2 \
    --omni \
    --model-class-name LTX2ImageToVideoPipeline \
    --port 8000

Key Difference: The I2V mode adds --model-class-name LTX2ImageToVideoPipeline, which enables image input support. Note that both commands bind host port 8000 and differ only in container name, so stop one container before starting the other.

2.4 Check Server Status

# View logs
podman logs -f ltx2-t2v  # or ltx2-i2v

# Check API health
curl http://localhost:8000/health

# List available models
curl http://localhost:8000/v1/models

The first run will take a while as the model (~50GB) downloads. The model is cached in ~/model_cache, so subsequent runs start much faster.
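Since the first startup can take many minutes, a small polling helper saves repeated manual curls. A sketch, assuming /health only answers successfully once the server is ready (as the curl check above implies); the function name wait_for_health is ours:

```shell
# Poll the health endpoint once per second until the server answers,
# or give up after N tries (default 60).
wait_for_health() {
  url="$1"; tries="${2:-60}"; i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      echo "server is healthy: $url"
      return 0
    fi
    i=$((i + 1)); sleep 1
  done
  echo "server did not become healthy after ${tries} attempts" >&2
  return 1
}

# Example: wait up to 5 minutes for the model to finish loading
# wait_for_health http://localhost:8000/health 300
```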

Part 3: Using ComfyUI with vLLM-Omni

3.1 Create a Text-to-Video Workflow

  1. Open ComfyUI at http://127.0.0.1:8188

  2. In the Node Library (sidebar), find vLLM-Omni category

  3. Add these nodes:

    • Generate Video - Main generation node
    • Diffusion Sampling Params (optional) - Control quality/speed
    • Save Video - Output the result
  4. Configure Generate Video node:

    • url: http://localhost:8000/v1
    • model: Lightricks/LTX-2
    • prompt: Your text description (e.g., "A cat playing with yarn")
    • width: 768
    • height: 512
    • fps: 24
    • num_frames: 121 (5 seconds @ 24fps)
  5. Connect nodes:

    • Generate Video output → Save Video input
  6. Click Queue Prompt to generate
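The num_frames setting above deserves a note: clip duration is num_frames / fps, and LTX-family models generally expect frame counts of the form 8k + 1 (121 = 8×15 + 1) - treat that as a rule of thumb rather than a guarantee. A quick sanity check:

```shell
fps=24
seconds=5
# Round down to a multiple of 8, then add 1 (the 8k + 1 pattern)
frames=$(( (seconds * fps / 8) * 8 + 1 ))
echo "num_frames=${frames}"
# Actual clip length in seconds
awk -v f="$frames" -v r="$fps" 'BEGIN { printf "duration=%.2fs\n", f / r }'
```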

3.2 Create an Image-to-Video Workflow

Requirement: the vLLM server must be running in I2V mode (LTX2ImageToVideoPipeline)

  1. Add these nodes:

    • Load Image - Input your image
    • Generate Video - Main generation node
    • Save Video - Output the result
  2. Connect nodes:

    • Load Image output → Generate Video image input
    • Generate Video output → Save Video input
  3. Configure Generate Video:

    • Same settings as T2V, but now with image input
    • Prompt describes the motion/animation (e.g., "gentle movement, cinematic")
  4. Queue and generate

3.3 Advanced: Using Sampling Parameters

For fine-tuning generation quality:

  1. Add Diffusion Sampling Params node

  2. Configure parameters:

    • num_inference_steps: 40-50 (higher = better quality, slower)
    • guidance_scale: 3.0-5.0 (how much to follow the prompt)
    • seed: Set for reproducible results
  3. Connect Diffusion Sampling Params output → Generate Video sampling_params input

Part 4: Tips and Troubleshooting

Recommended Settings for LTX-2

Parameter         T2V        I2V
Width             768        768
Height            512        512
FPS               24         24
Num Frames        121 (5s)   81-121 (3.4-5s)
Guidance Scale    4.0        2.0-3.0
Inference Steps   40-50      40-50

Common Issues

Problem: PermissionError: [Errno 13] Permission denied: '/workspace/vllm-omni/.triton'
Solution: The tmpfs mounts fix this; make sure they're included in your podman run command.

Problem: GPU out of memory
Solution:

  • Reduce num_frames (fewer frames)
  • Reduce resolution (512x512 instead of 768x512)
  • Enable vae_use_slicing and vae_use_tiling in sampling params

Problem: Video generation is very slow
Solution:

  • First generation is always slower (model loading)
  • Reduce num_inference_steps to 30-40
  • Check GPU utilization with nvidia-smi

Problem: ComfyUI can't connect to vLLM
Solution:

  • Verify vLLM is running: podman ps
  • Check port mapping: curl http://localhost:8000/health
  • Ensure URL in ComfyUI node is correct: http://localhost:8000/v1

Resource Requirements

  • VRAM: ~22-28GB for LTX-2 (depends on resolution/frames)
  • RAM: 32GB+ recommended
  • Storage: ~50GB for model cache
  • Generation Time: 20-60 seconds per video (varies by GPU)
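A rough preflight against these requirements can be scripted. A sketch: the nvidia-smi query flags are standard, and the ~50GB disk figure comes from the model-cache size above:

```shell
# Free disk space where the model cache will live (need ~50GB)
df -hP "$HOME" | awk 'NR==2 { print "free disk:", $4 }'

# Free VRAM, if nvidia-smi is available; skipped otherwise
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=memory.total,memory.free --format=csv,noheader
else
  echo "nvidia-smi not found - skipping VRAM check"
fi
```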

Part 5: Stopping and Cleanup

# Stop the vLLM container
podman stop ltx2-t2v  # or ltx2-i2v

# Remove the container
podman rm ltx2-t2v  # or ltx2-i2v

# Stop ComfyUI (Ctrl+C in the terminal)

# Optional: Clear model cache to free space
rm -rf ~/model_cache/*

# Optional: Clear generated videos
rm -rf ~/video_output/*
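Before running the optional cleanup above, it can help to see what the cache and outputs actually occupy:

```shell
# Summarize disk usage of the model cache and generated videos
du -sh "$HOME/model_cache" "$HOME/video_output" 2>/dev/null
```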

Additional Resources

Example Prompts

T2V:

  • "A cinematic shot of ocean waves at golden hour"
  • "Timelapse of flowers blooming in spring"
  • "Aerial view flying over a mountain range"

I2V (describe motion, not content):

  • "gentle swaying, natural movement"
  • "slow zoom in, cinematic"
  • "subtle animation, soft lighting changes"

Conference Note: This guide uses a vLLM-Omni CI test image pinned to a specific commit. For production use, consider using official releases from https://github.com/vllm-project/vllm-omni/releases.
