Mixtral 8x22B

Mistral AI's sparse mixture-of-experts (MoE) model with 8 experts of 22B parameters each; only two experts are activated per token, which keeps inference cost well below that of a comparably sized dense model.
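For intuition, the sketch below shows how a sparse MoE feed-forward layer routes each token to a small subset of experts (top-2 of 8, as in Mixtral). This is an illustrative PyTorch sketch, not Mistral's implementation; the layer dimensions (`d_model`, `d_ff`) are placeholder values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    """Illustrative sparse MoE feed-forward layer: a router picks the
    top-k experts per token, so only a fraction of the layer's
    parameters are active for any given input."""

    def __init__(self, d_model=1024, d_ff=4096, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # (tokens, n_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e   # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Example: 4 tokens pass through the layer; each touches only 2 of the 8 experts.
layer = SparseMoELayer()
tokens = torch.randn(4, 1024)
print(layer(tokens).shape)  # torch.Size([4, 1024])
```

Because each token only runs through its two selected experts, compute per token scales with the active parameters rather than the full expert count, which is the source of the efficiency noted below.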

Scores

  • Coding: 7
  • Agentic: 6.5
  • Reasoning: 7
  • Math: 6.5
  • Language: 7.5

Strengths

  • Efficient MoE architecture
  • Low latency (0.36 seconds)
  • Strong performance-to-parameter ratio (about 39B active of 141B total parameters)
  • Good multilingual capabilities

Weaknesses

  • Trails larger frontier models on specialized tasks
  • Smaller context window (65K tokens)
  • No native multimodal capabilities (text-only)
  • Ecosystem and tooling still maturing