ByteDance Unveils Multimodal Model Seedance 2.0

ByteDance has released Seedance 2.0, a generative video model built on a joint audio-video generation architecture. The model accepts text, image, audio, and video inputs and offers high character consistency, physically accurate motion, and director-style camera control.

🌍 Generating audio and video jointly narrows the gap between picture and sound, making the generated video more cohesive and professional.

👤 This makes it possible to create much more complex videos in which characters remain recognizable from frame to frame and sound effects stay synchronized with the physical movement of objects.

Source 1: https://seed.bytedance.com/en/seedance2_0
