Kimi-K2-Instruct-0905

Kimi K2-Instruct-0905 is the latest and most capable release in the Kimi K2 family. It is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion activated parameters and 1 trillion total parameters.
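To make the "32B activated of 1T total" figure concrete, here is a toy sketch of top-k MoE routing using the numbers from the model summary below (384 experts, 8 selected per token): only the selected experts' weights participate in each token's forward pass, so the activated parameter count is a small fraction of the total. The softmax gating shown is a standard MoE convention, not a confirmed detail of K2's router.

```python
# Toy top-k MoE routing sketch: the router scores all 384 experts and
# only the top 8 (plus 1 always-on shared expert) run for each token.
import math
import random

NUM_EXPERTS = 384  # total routed experts (from the model summary)
TOP_K = 8          # experts selected per token

def route_token(router_logits: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Return the k selected expert indices with softmax-normalized gate weights."""
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)[:k]
    m = max(router_logits[i] for i in top)          # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
selected = route_token(logits)
# Only 8 of the 384 routed experts fire for this token; gate weights sum to 1.
```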

Key Highlights

  • Stronger agentic coding intelligence: This version delivers notable gains across public benchmarks and practical coding-agent tasks.
  • Enhanced frontend development: Kimi K2-Instruct-0905 brings improvements to both usability and design in frontend programming scenarios.
  • Extended context capacity: The model’s context window has been expanded from 128k to 256k tokens, enabling more reliable performance on long-horizon tasks.

Model Summary

| Field | Value |
| --- | --- |
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 1T |
| Activated Parameters | 32B |
| Number of Layers (Dense layer included) | 61 |
| Number of Dense Layers | 1 |
| Attention Hidden Dimension | 7168 |
| MoE Hidden Dimension (per Expert) | 2048 |
| Number of Attention Heads | 64 |
| Number of Experts | 384 |
| Selected Experts per Token | 8 |
| Number of Shared Experts | 1 |
| Vocabulary Size | 160K |
| Context Length | 256K |
| Attention Mechanism | MLA |
| Activation Function | SwiGLU |

Evaluation Results

| Benchmark | Metric | K2-Instruct-0905 | K2-Instruct-0711 | Qwen3-Coder-480B-A35B-Instruct | GLM-4.5 | DeepSeek-V3.1 | Claude-Sonnet-4 | Claude-Opus-4 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SWE-Bench verified | ACC | 69.2 ± 0.63 | 65.8 | 69.6* | 64.2* | 66.0* | 72.7* | 72.5* |
| SWE-Bench Multilingual | ACC | 55.9 ± 0.72 | 47.3 | 54.7* | 52.7 | 54.5* | 53.3* | — |
| Multi-SWE-Bench | ACC | 33.5 ± 0.28 | 31.3 | 32.7 | 31.7 | 29.0 | 35.7 | — |
| Terminal-Bench | ACC | 44.5 ± 2.03 | 37.5 | 37.5* | 39.9* | 31.3* | 36.4* | 43.2* |
| SWE-Dev | ACC | 66.6 ± 0.72 | 61.9 | 64.7 | 63.2 | 53.3 | 67.1 | — |

All results for K2-Instruct-0905 are reported as mean ± standard deviation across five independent, full test-set runs. Prior to each run, we prune the repository to remove any Git objects unreachable from the target commit. This ensures that the agent has access only to code that would legitimately exist at that point in history.
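For readers unfamiliar with the reporting convention, a "mean ± standard deviation" entry such as 69.2 ± 0.63 is the sample mean and sample standard deviation over the five runs. The per-run scores below are illustrative, not the actual results.

```python
# How a "mean ± standard deviation" table entry is computed from five
# independent runs. These five accuracies are made up for illustration.
import statistics

runs = [68.5, 69.0, 69.2, 69.5, 69.8]  # hypothetical per-run accuracies
mean = statistics.mean(runs)
stdev = statistics.stdev(runs)  # sample standard deviation (divides by n-1)
print(f"{mean:.1f} ± {stdev:.2f}")  # prints: 69.2 ± 0.49
```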

With the exception of Terminal-Bench (Terminus-2), every benchmark result was obtained using our in-house evaluation harness. This harness is adapted from SWE-agent, but we apply two key modifications:

  • context windows for the Bash and Edit tools are clamped, and
  • the system prompt is rewritten to align with the task semantics.
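The first modification can be illustrated with a minimal sketch: oversized output from a Bash or Edit tool call is truncated to a fixed budget before being returned to the model, keeping the head and tail of the output. The character budget and elision marker here are assumptions, not the harness's actual values.

```python
# Illustrative sketch of clamping a tool's context window: long tool
# output is truncated to a fixed budget before the model sees it.
MAX_TOOL_OUTPUT = 8_000  # hypothetical per-call character budget

def clamp_tool_output(output: str, limit: int = MAX_TOOL_OUTPUT) -> str:
    """Keep the head and tail of oversized output, eliding the middle."""
    if len(output) <= limit:
        return output
    half = limit // 2
    return output[:half] + "\n... [output clamped] ...\n" + output[-half:]
```

Clamping keeps both ends because the start of tool output (the command echoed, file headers) and the end (errors, exit status) tend to carry the most signal.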

Baseline figures marked with an asterisk (*) are taken directly from their official reports or public leaderboards. All other metrics were re-evaluated by us under the same conditions used for K2-Instruct-0905.

For SWE-Dev, we take an additional precaution: we overwrite the original repository files and remove any test file that explicitly exercises the functions the agent is tasked with generating. This eliminates the possibility of indirect hints about the target implementation.

You can access the Kimi K2 API at https://platform.moonshot.ai; we provide an OpenAI/Anthropic-compatible API.
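Because the API is OpenAI-compatible, a standard chat-completions request works against it. The sketch below uses only the Python standard library; the base URL and model identifier are assumptions — check the platform documentation for the exact values.

```python
# Sketch of calling the Kimi K2 API via its OpenAI-compatible chat
# completions endpoint. BASE_URL and MODEL are assumed values; verify
# them against the platform docs before use.
import json
import os
import urllib.request

BASE_URL = "https://api.moonshot.ai/v1"  # assumed API base URL
MODEL = "kimi-k2-0905-preview"           # assumed model identifier

def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for Kimi K2."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

if __name__ == "__main__" and os.environ.get("MOONSHOT_API_KEY"):
    req = build_chat_request("Hello, Kimi!", os.environ["MOONSHOT_API_KEY"])
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

In practice most users would point the official `openai` Python client at the same base URL rather than building requests by hand; the manual version is shown here only to make the wire format explicit.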

You can also access it on Groq.