At the Volcano Engine FORCE 2026 conference in Beijing, ByteDance introduced a new lineup of models, including the Seedance 2.5 video model, capable of creating up to 30 seconds of continuous video in a single segment, as well as the Doubao 2.1 Pro multimodal model for agentic tasks and coding.

What Happened

ByteDance presented the Seedance 2.5 video model, which supports up to 50 multimodal references (images and videos) for precise control over style and consistency. The model allows for generating up to 30 seconds of video in a single pass and includes a 3D white-box preview function for planning camera movement. Also announced were the Doubao 2.1 Pro multimodal model, focused on coding and AI agent workflows, and the Seedream 5.0 Pro image model. The public release of Seedance 2.5 is scheduled for early July 2026.

Context

The technological shift is the move from generating short, disjointed clips to creating longer, continuous scenes. The 3D pre-visualization step lets users plan a scene before the final render, which conserves computational resources and brings generative AI tools closer to professional directorial workflows.

Why It Matters for the Industry

The release of these models intensifies competition with players such as OpenAI and Runway, particularly over control of character and style consistency. Doubao 2.1 Pro, which can compete with Claude Opus 4.6, also changes the landscape of high-performance models for development and complex task automation.

Why It Matters for Users

For ML researchers and content creators, this signifies a shift toward a 'reference + 3D pre-visualization -> video' paradigm. Professional users will be able to integrate these tools into existing pipelines to create more complex and coherent video productions with less effort spent on stitching segments together.
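
To make the stitching point concrete, here is a minimal sketch of the arithmetic, not an official tool. It assumes only the 30-second single-pass limit stated in the announcement; the function names and the 10-second comparison value are hypothetical and chosen purely for illustration.

```python
import math

# From the announcement: up to 30 seconds of video in a single pass.
MAX_SEGMENT_SECONDS = 30.0


def plan_segments(target_seconds: float,
                  max_segment_seconds: float = MAX_SEGMENT_SECONDS) -> list[float]:
    """Split a target duration into the fewest equal segments within the limit.

    Hypothetical helper for illustration; it does not call any real API.
    """
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    if max_segment_seconds <= 0:
        raise ValueError("max_segment_seconds must be positive")
    count = math.ceil(target_seconds / max_segment_seconds)
    # Spread the duration evenly so that no segment exceeds the limit.
    return [target_seconds / count] * count


def stitch_points(target_seconds: float,
                  max_segment_seconds: float = MAX_SEGMENT_SECONDS) -> int:
    """Number of joins needed between consecutive segments."""
    return len(plan_segments(target_seconds, max_segment_seconds)) - 1


if __name__ == "__main__":
    print(stitch_points(90))      # 30-second limit: 2 joins
    print(stitch_points(90, 10))  # hypothetical 10-second limit: 8 joins
```

Under these assumptions, a 90-second sequence needs two joins at a 30-second limit, compared with eight if each pass were capped at a hypothetical 10 seconds.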

What Is Not Yet Known / Limitations

For teams planning an engineering implementation, key details are still missing: inference costs, latency, and whether a public API will be available.

Author

Look at AI, Editorial Team