What is MiniMax M3?

MiniMax, the Shanghai-based AI lab backed by Tencent, Alibaba, and miHoYo (the studio behind Genshin Impact), has teased M3, its next-generation AI model. The company claims that, when processing 1M-token contexts, M3 decodes 15.6x faster and prefills 9.7x faster than M2 [citation:4].

Technical Innovation: MiniMax Sparse Attention (MSA)

The M3 teaser attributes these gains to what MiniMax calls MiniMax Sparse Attention (MSA), which is built on a technique called GQA-driven dynamic block selection. Instead of having the model attend to every token in a massive context window, MSA selects the blocks of context that actually matter for a given query and attends only to those. According to MiniMax, the result is dramatically less compute for roughly the same output quality [citation:4].
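MiniMax has not published how MSA scores or selects blocks, so the following is only a minimal sketch of the general idea of block-sparse attention, not MiniMax's implementation. It assumes that "GQA" refers to grouped-query attention, that each block is summarized by its mean key vector, and that all query heads sharing one key/value head vote on a single block selection. The function names `select_blocks` and `sparse_attention`, the scoring rule, and the toy data are all hypothetical.

```python
import math

def select_blocks(group_queries, keys, block_size, top_k):
    """Pick top_k key blocks for one query group (hypothetical scoring rule)."""
    dim = len(keys[0])
    n_blocks = math.ceil(len(keys) / block_size)
    scores = []
    for b in range(n_blocks):
        block = keys[b * block_size:(b + 1) * block_size]
        # Assumption: summarize each block by its mean key vector.
        mean_key = [sum(k[d] for k in block) / len(block) for d in range(dim)]
        # Assumption: sum relevance over every query head sharing this KV head.
        scores.append(sum(sum(q[d] * mean_key[d] for d in range(dim))
                          for q in group_queries))
    ranked = sorted(range(n_blocks), key=lambda b: scores[b], reverse=True)
    return sorted(ranked[:top_k])

def sparse_attention(query, keys, values, blocks, block_size):
    """Softmax attention restricted to tokens inside the selected blocks."""
    idx = [i for b in blocks
           for i in range(b * block_size, min((b + 1) * block_size, len(keys)))]
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * sum(q * k for q, k in zip(query, keys[i])) for i in idx]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    return [sum(w * values[i][d] for w, i in zip(weights, idx)) / total
            for d in range(len(values[0]))]

# Toy data: 8 tokens in 4 blocks of 2; two query heads share one KV head.
keys = [[0, 1], [0, 1], [1, 0], [1, 0], [-1, 0], [-1, 0], [2, 0], [2, 0]]
values = [[float(i)] for i in range(8)]
group = [[1.0, 0.0], [1.0, 0.1]]

blocks = select_blocks(group, keys, block_size=2, top_k=2)
print(blocks)  # [1, 3]
print(sparse_attention(group[0], keys, values, blocks, block_size=2))
```

In this toy case each query attends to 4 of 8 tokens. With a fixed number of selected blocks, the attended set stays the same size as the context grows, which is the kind of saving the teaser describes for 1M-token inputs.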

Key Performance Claims

  • 15.6x faster decoding than M2 for 1M-token contexts
  • 9.7x faster prefill than M2 for 1M-token contexts
  • Maintains output quality comparable to M2
  • 1M-token context window support
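The two multipliers apply to different phases of a request, so the end-to-end gain depends on how time splits between prefill and decoding. The arithmetic below is a sketch of that relationship, assuming both claimed factors hold exactly. The baseline timings are hypothetical placeholders chosen for round numbers, not measured M2 figures, and `projected_latency` is an illustrative helper.

```python
PREFILL_SPEEDUP = 9.7   # claimed by MiniMax for 1M-token contexts
DECODE_SPEEDUP = 15.6   # claimed by MiniMax for 1M-token contexts

def projected_latency(baseline_prefill_s, baseline_decode_s):
    """Return (projected seconds, overall speedup) if both claims held exactly."""
    projected = (baseline_prefill_s / PREFILL_SPEEDUP
                 + baseline_decode_s / DECODE_SPEEDUP)
    overall = (baseline_prefill_s + baseline_decode_s) / projected
    return projected, overall

# Hypothetical baseline: 97 s of prefill and 156 s of decoding.
total, speedup = projected_latency(97.0, 156.0)
print(round(total, 2), round(speedup, 2))  # 20.0 12.65
```

The overall speedup always falls between 9.7x and 15.6x: prefill-heavy workloads land closer to the lower figure, and decode-heavy workloads closer to the higher one.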

Company Context

MiniMax listed on the Hong Kong Stock Exchange in January 2026. Its backers (Tencent, Alibaba, and miHoYo) represent a cross-section of China's tech and gaming elite [citation:4]. The company previously released the M2 model series (M2, M2.5, and M2.7), accompanied by detailed technical reports on its engineering innovations.

What We Don't Know Yet

  • Parameter count
  • Licensing terms
  • Release date
  • Full benchmark results

Pricing

Not yet announced.

Pros

  • Large claimed speedups at 1M-token contexts (15.6x decoding, 9.7x prefill)
  • Output quality claimed to be comparable to M2
  • Backed by major investors
  • Built on the proven M2 foundation
  • MSA is a novel technical approach

Cons

  • Teaser only — no release date
  • No parameter count yet
  • No independent benchmarks
  • China-based provider (possible data governance concerns for some organizations)

Who Should Watch?

Worth watching for: AI researchers, efficiency-focused developers, and organizations that need high-speed long-context models.

Verdict

MiniMax M3 represents a potential breakthrough in efficient attention mechanisms. If MSA delivers on its promises, it could significantly reduce inference costs for long-context applications [citation:4].

Rating: N/A (Teaser) - Watch for official release.