Description:
This workflow is built on the LTX2.3 Distilled v1.1 large model and uses the VBVR Spatial Reasoning LoRA. It generates video from an input image and produces the audio automatically.
Model Requirements:
- ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors Location: models\diffusion_models
- Ltx2.3-Licon-VBVR-I2V-240K-R32.safetensors Location: models\loras
- gemma_3_12B_it_fp8_scaled.safetensors Location: models\clip
- ltx-2.3_text_projection_bf16.safetensors Location: models\text_encoders
- LTX23_audio_vae_bf16.safetensors Location: models\vae
- LTX23_video_vae_bf16.safetensors Location: models\vae
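
Before loading the workflow, it may help to confirm that every file above is in the expected folder. The following is a minimal sketch, not part of the workflow itself: the function name `find_missing_models` and the default path `ComfyUI/models` are assumptions made for illustration, so pass the path to your own models folder instead.

```python
from pathlib import Path
import sys

# File names and subfolders exactly as listed in the requirements above,
# relative to the "models" folder.
REQUIRED_MODELS = {
    "ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors": "diffusion_models",
    "Ltx2.3-Licon-VBVR-I2V-240K-R32.safetensors": "loras",
    "gemma_3_12B_it_fp8_scaled.safetensors": "clip",
    "ltx-2.3_text_projection_bf16.safetensors": "text_encoders",
    "LTX23_audio_vae_bf16.safetensors": "vae",
    "LTX23_video_vae_bf16.safetensors": "vae",
}


def find_missing_models(models_dir):
    """Return 'subfolder/filename' for each required file not found under models_dir."""
    root = Path(models_dir)
    missing = []
    for filename, subfolder in REQUIRED_MODELS.items():
        if not (root / subfolder / filename).is_file():
            missing.append(f"{subfolder}/{filename}")
    return missing


if __name__ == "__main__":
    # Assumed default location; override with the first command-line argument.
    models_dir = sys.argv[1] if len(sys.argv) > 1 else "ComfyUI/models"
    missing = find_missing_models(models_dir)
    if missing:
        print("Missing model files:")
        for entry in missing:
            print("  " + entry)
    else:
        print("All required model files were found.")
```

Run it with the path to your models folder as the only argument. Any file it reports as missing needs to be downloaded or moved before the workflow will load.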


