Release of the MiniMax-M3 Multimodal Model

MiniMax has introduced MiniMax-M3, a multimodal model that supports up to 1 million tokens of context and uses the new MiniMax Sparse Attention mechanism.

Compiled by Sergey Kostenchuk. Published 2026-06-12. Updated 2026-06-12.



MiniMax has released the MiniMax-M3 model with support for up to 1 million tokens of context. Thanks to its MiniMax Sparse Attention (MSA) mechanism and a dual-branch architecture, the model is reported to run many times faster than standard architectures while using significantly fewer resources.

🌍 Efficient sparse attention mechanisms make it possible to scale to ultra-long contexts without the quadratic growth in cost that full attention incurs. This makes multimodal agents more practical for commercial use.
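
To make the "quadratic cost growth" point concrete, here is a rough back-of-the-envelope sketch in Python. It does not model MSA itself, whose internals are not described here. It assumes a generic sparse scheme in which each query attends to a fixed budget of keys; the function names and the budget of 4096 are hypothetical values chosen only for illustration.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key pairs scored by full attention: every token attends to every token."""
    return n_tokens * n_tokens


def sparse_attention_pairs(n_tokens: int, budget: int) -> int:
    """Pairs scored when each query attends to at most `budget` keys.

    `budget` is a hypothetical fixed per-query limit, not a MiniMax-M3 parameter.
    """
    return n_tokens * min(budget, n_tokens)


if __name__ == "__main__":
    BUDGET = 4096  # assumed per-query key budget, for illustration only
    for n in (8_000, 128_000, 1_000_000):
        dense = dense_attention_pairs(n)
        sparse = sparse_attention_pairs(n, BUDGET)
        print(f"{n:>9} tokens: dense={dense:.2e} sparse={sparse:.2e} "
              f"ratio={dense / sparse:.0f}x")
```

Under these assumptions, full attention at 1 million tokens scores 10^12 query-key pairs, while the fixed-budget variant scores about 4.1 × 10^9, roughly 244 times fewer. The gap widens linearly as the context grows, which is why sparse schemes matter most at ultra-long context lengths.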

👤 You will be able to work with massive amounts of data, such as entire libraries of books or long videos, much faster and cheaper than with current architectures.

Source 1: https://huggingface.co/MiniMaxAI/MiniMax-M3
Source 2: https://arxiv.org/abs/2606.13392
