🎨 LightX2V has released Wan2.2-NVFP4-Sparse, an extremely fast version of the Wan 2.2 video generation model (14B parameters).

The model combines NVFP4 quantization with sparse attention and is optimized for the NVIDIA Blackwell architecture. Inference is cut to 4 steps, for a roughly 59x speedup: generating a 720p video on an RTX 5090 takes about 45 seconds instead of 2668 seconds.
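For anyone who wants to check the math, here is a quick sketch of where the speedup figure comes from. It uses only the two timings quoted above; the variable and function names are made up for illustration and are not part of any released tooling.

```python
# Timings quoted in the post (720p generation on an RTX 5090).
baseline_seconds = 2668   # reported time without the optimizations
optimized_seconds = 45    # reported time with NVFP4 + sparse attention, 4 steps


def speedup(baseline: float, optimized: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    return baseline / optimized


ratio = speedup(baseline_seconds, optimized_seconds)
baseline_minutes = baseline_seconds / 60

print(f"speedup: {ratio:.1f}x")                  # speedup: 59.3x
print(f"baseline: {baseline_minutes:.1f} min")   # baseline: 44.5 min
```

So the quoted timings work out to the top of the 50-60x range, and the baseline is a bit under 45 minutes per clip.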

🌍 This shows what deep optimization for the new generation of GPUs (Blackwell) can do, bringing heavy video models much closer to interactive use.

👤 High-quality 720p video creation can now take less than a minute instead of roughly 44 minutes, which radically changes the workflow in AI production.

Source 1: https://huggingface.co/lightx2v/Wan2.2-NVFP4-Sparse