Moonshot AI Releases Kimi K2.7-Code: a Coding Model Reporting +21.8% on Kimi Code Bench v2 Over K2.6

[ARTICLE · art-26369] src=marktechpost.com ↗ pub=2026-06-13T04:57Z topic=artificial-intelligence verified=true sentiment=↑ positive

Moonshot AI released Kimi K2.7-Code, a 1-trillion-parameter Mixture-of-Experts coding model that outperforms its predecessor K2.6 by 21.8% on the Kimi Code Bench v2 benchmark. The model, available on Hugging Face under a Modified MIT license, targets long-horizon software engineering tasks and uses 30% fewer reasoning tokens than K2.6.

5 min read · published Jun 13, 2026

This week, Moonshot AI released Kimi K2.7-Code. It is a coding-focused, agentic model. The model weights ship on Hugging Face under a Modified MIT license. You can also reach it through the Kimi API and Kimi Code.

K2.7-Code targets long-horizon software engineering, not general chat. It plans, edits, runs tools, and debugs across many steps. Moonshot pairs the model with a subscription coding platform.

Kimi K2.7-Code

K2.7-Code is a Mixture-of-Experts model. It holds 1T total parameters and activates 32B per token. The design uses 384 experts, with 8 selected per token and 1 shared. It has 61 layers, including 1 dense layer.

Attention uses MLA, and the feed-forward path uses SwiGLU. A MoonViT vision encoder adds 400M parameters for image and video input. The model ships with native INT4 quantization. The context window is 256K tokens (262,144).

Two constraints matter. Thinking mode is mandatory; disabling it returns an API error. Sampling is fixed at temperature 1.0, top_p 0.95, n 1, and penalties 0.0. The default max output is 32,768 tokens.
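One way to avoid tripping the fixed-sampling rule is to check request arguments on the client before sending them. The sketch below is illustrative only: the helper name `check_request_kwargs` is hypothetical, and the penalty parameter names (`presence_penalty`, `frequency_penalty`) are assumed from the OpenAI-compatible request format rather than taken from Moonshot's documentation. The fixed values themselves come from the paragraph above.

```python
# Fixed sampling values as reported in the article. The two penalty key names
# are an assumption based on the OpenAI-compatible request shape.
FIXED_SAMPLING = {
    "temperature": 1.0,
    "top_p": 0.95,
    "n": 1,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
}


def check_request_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs without sampling keys; raise if one conflicts.

    Keys that match the fixed value are dropped so the server default applies.
    Keys that differ raise locally instead of producing an API error later.
    """
    cleaned = dict(kwargs)
    for key, fixed in FIXED_SAMPLING.items():
        if key in cleaned:
            if cleaned[key] != fixed:
                raise ValueError(
                    f"{key}={cleaned[key]!r} conflicts with fixed value {fixed!r}"
                )
            del cleaned[key]
    return cleaned


safe = check_request_kwargs({"model": "kimi-k2.7-code", "temperature": 1.0})
print(safe)
```

Dropping matching keys rather than forwarding them is a design choice here; whether the API accepts explicitly passed fixed values is not stated in the article.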

You can self-host with vLLM, SGLang, or KTransformers. The Hugging Face repository is large, roughly 595 GB on disk. This is a server-class deployment target, not a laptop model.
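A back-of-envelope check puts the reported sizes in perspective. The sketch below assumes every weight is stored at 4 bits and ignores quantization scales and metadata, which is a simplification. The article does not explain the gap between this estimate and the roughly 595 GB on disk.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1T total parameters (from the article)
ACTIVE_PARAMS = 32_000_000_000    # 32B activated per token (from the article)
BITS_PER_WEIGHT = 4               # INT4; assumes all weights use 4 bits


def int4_footprint_gb(params: int, bits: int = BITS_PER_WEIGHT) -> float:
    """Raw weight storage in decimal gigabytes, ignoring scales and metadata."""
    return params * bits / 8 / 1e9


def active_fraction(active: int, total: int) -> float:
    """Share of parameters activated per token."""
    return active / total


print(int4_footprint_gb(TOTAL_PARAMS))               # 500.0
print(active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS))  # 0.032
```

Under these assumptions the raw weights alone come to about 500 GB, and only about 3.2% of parameters are active for any one token.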

Benchmark

The Moonshot team published six benchmark rows. They compare K2.7-Code against K2.6, GPT-5.5, and Claude Opus 4.8. K2.7-Code beats K2.6 on every row. The largest absolute gain is on Kimi Code Bench v2, from 50.9 to 62.0 (+11.1 points). The largest relative gain is on MLS Bench Lite (+31.5%).

Benchmark	Kimi K2.6	Kimi K2.7-Code	GPT-5.5	Claude Opus 4.8	K2.7 vs K2.6
Kimi Code Bench v2	50.9	62.0	69.0	67.4	+21.8%
Program Bench	48.3	53.6	69.1	63.8	+11.0%
MLS Bench Lite	26.7	35.1	35.5	42.8	+31.5%
Kimi Claw 24/7 Bench	42.9	46.9	52.8	50.4	+9.3%
MCP Atlas	69.4	76.0	79.4	81.3	+9.5%
MCP Mark Verified	72.8	81.1	92.9	76.4	+11.4%

K2.7-Code beats Opus 4.8 on MCP Mark Verified, 81.1 versus 76.4. It also lands close to GPT-5.5 on MLS Bench Lite, 35.1 versus 35.5. On the other four rows it trails both closed models. Each model ran in its own harness: K2.7-Code in Kimi Code CLI, GPT-5.5 in Codex xhigh, and Opus 4.8 in Claude Code xhigh.
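The "K2.7 vs K2.6" column appears to be the relative change, (new - old) / old. That is an inference from the numbers, not a formula stated by Moonshot. A short check against the table values:

```python
# (K2.6 score, K2.7-Code score), copied from the benchmark table above.
SCORES = {
    "Kimi Code Bench v2": (50.9, 62.0),
    "Program Bench": (48.3, 53.6),
    "MLS Bench Lite": (26.7, 35.1),
    "Kimi Claw 24/7 Bench": (42.9, 46.9),
    "MCP Atlas": (69.4, 76.0),
    "MCP Mark Verified": (72.8, 81.1),
}


def relative_gain_pct(old: float, new: float) -> float:
    """Relative improvement in percent, rounded to one decimal place."""
    return round((new - old) / old * 100, 1)


def absolute_gain(old: float, new: float) -> float:
    """Improvement in raw benchmark points, rounded to one decimal place."""
    return round(new - old, 1)


for name, (old, new) in SCORES.items():
    print(f"{name}: +{absolute_gain(old, new)} points, +{relative_gain_pct(old, new)}%")
```

Reading both columns side by side matters: a low baseline such as MLS Bench Lite produces a large percentage from a smaller point gain.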

Reasoning-Token Efficiency: A Cost Claim, Not Just Quality

The Moonshot team reports about 30% lower reasoning-token usage than K2.6. It frames this as ‘less overthinking.’

Reasoning tokens bill as output tokens on most price cards. Agentic coding runs hundreds or thousands of steps. Each plan, retry, and verification pays the thinking cost again. A 30% cut adds up across a long run.

The effect lands in three places at once. First, lower output-token cost per task. Second, faster steps, which helps interactive CLI sessions. Third, more steps before hitting context limits.
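To make the cost effect concrete, here is a rough estimate for a single agent run. The step count and reasoning tokens per step are hypothetical workload numbers chosen for illustration. The $4.00 per 1M output-token price is the K2.7-Code list price quoted in this article, and the 30% figure is Moonshot's own claim.

```python
OUTPUT_PRICE_PER_M = 4.00  # USD per 1M output tokens (K2.7-Code price in this article)
REASONING_CUT = 0.30       # ~30% fewer reasoning tokens, as reported by Moonshot


def reasoning_cost_usd(steps: int, tokens_per_step: int,
                       price_per_m: float = OUTPUT_PRICE_PER_M) -> float:
    """Cost of reasoning tokens for one run, billed as output tokens."""
    return steps * tokens_per_step * price_per_m / 1_000_000


# Hypothetical workload: 500 agent steps at 2,000 reasoning tokens each.
steps = 500
baseline_tokens = 2_000
reduced_tokens = round(baseline_tokens * (1 - REASONING_CUT))

baseline = reasoning_cost_usd(steps, baseline_tokens)
reduced = reasoning_cost_usd(steps, reduced_tokens)
print(f"baseline ${baseline:.2f}, reduced ${reduced:.2f}, saved ${baseline - reduced:.2f}")
```

The saving scales linearly with step count, so the same percentage matters more on longer runs.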

Use Cases With Examples

Repo-scale refactors are the main use case. Point the agent at a failing test suite. It reads files, edits across modules, then reruns tests until green.
Code review is a second fit. Feed a pull request diff and ask for risk analysis. The 256K window holds large diffs, logs, and related files together.
MCP tool-use workflows are a third fit. K2.7-Code scored 81.1 on MCP Mark Verified. That suite tests correct tool invocation through the Model Context Protocol. Think CI checks, ticket updates, and file edits in one loop.
Long-context analysis is a fourth fit. The model accepts text, image, and video input. Documentation, screenshots, and a recorded repro can share one prompt.
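For the code-review and long-context cases, it helps to estimate whether a bundle of diffs, logs, and files fits before sending it. The sketch below uses a crude four-characters-per-token heuristic, which is an assumption and not the model's tokenizer. It also assumes the output budget counts against the same 262,144-token window, which the article does not confirm. The function names are hypothetical.

```python
CONTEXT_WINDOW = 262_144  # tokens, from the article
MAX_OUTPUT = 32_768       # default max output tokens, from the article
CHARS_PER_TOKEN = 4       # crude heuristic, NOT the model's real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate using ceiling division on character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(parts: list[str], reserve_output: int = MAX_OUTPUT) -> bool:
    """True if the estimated prompt plus reserved output fits the window."""
    used = sum(estimate_tokens(p) for p in parts)
    return used + reserve_output <= CONTEXT_WINDOW


# Stand-in strings for a PR diff, a CI log, and a related source file.
bundle = ["diff --git a/utils.py b/utils.py\n" * 100, "ERROR test_utils failed\n" * 50]
print(fits_in_context(bundle))
```

A real check would use the model's tokenizer; the heuristic only flags bundles that are clearly too large.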

A Minimal Quickstart

The Kimi API is OpenAI-compatible. The model string is kimi-k2.7-code. Do not override the fixed sampling parameters, or the request errors.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("MOONSHOT_API_KEY"),
    base_url="https://api.moonshot.ai/v1",
)

messages = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Refactor utils.py to remove duplicate code."},
]

resp = client.chat.completions.create(
    model="kimi-k2.7-code",
    messages=messages,
    max_tokens=32768,  # default cap; also the maximum
)

msg = resp.choices[0].message
print(msg.content)

Two tool-use rules come from the docs. Keep reasoning_content from the current turn in context. And set tool_choice to only "auto" or "none".
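The two tool-use rules can be sketched as small helpers around the message history. This is a hedged illustration: `check_tool_choice` and `assistant_turn_to_dict` are hypothetical names, the `read_file` tool is invented for the demo, and the way `reasoning_content` surfaces on the SDK's message object is an assumption. A `SimpleNamespace` stands in for a real API response so the example runs offline.

```python
from types import SimpleNamespace

ALLOWED_TOOL_CHOICE = {"auto", "none"}  # the only values the article says to use


def check_tool_choice(tool_choice: str) -> str:
    """Reject tool_choice values other than "auto" or "none"."""
    if tool_choice not in ALLOWED_TOOL_CHOICE:
        raise ValueError(f"tool_choice must be 'auto' or 'none', got {tool_choice!r}")
    return tool_choice


def assistant_turn_to_dict(message) -> dict:
    """Build a history entry from an assistant message, keeping reasoning_content."""
    entry = {"role": "assistant", "content": getattr(message, "content", None)}
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning is not None:
        entry["reasoning_content"] = reasoning  # preserved for the next request
    tool_calls = getattr(message, "tool_calls", None)
    if tool_calls:
        entry["tool_calls"] = [
            {
                "id": tc.id,
                "type": "function",
                "function": {"name": tc.function.name, "arguments": tc.function.arguments},
            }
            for tc in tool_calls
        ]
    return entry


# Stand-in for resp.choices[0].message; "read_file" is a hypothetical tool.
fake_message = SimpleNamespace(
    content=None,
    reasoning_content="Need to read utils.py first.",
    tool_calls=[
        SimpleNamespace(
            id="call_1",
            function=SimpleNamespace(name="read_file", arguments='{"path": "utils.py"}'),
        )
    ],
)

history = [{"role": "user", "content": "Refactor utils.py to remove duplicate code."}]
history.append(assistant_turn_to_dict(fake_message))
print(history[-1]["reasoning_content"])
```

In a real loop, the tool result would be appended as a "tool" role message after this entry, and the whole history sent back with `tool_choice=check_tool_choice("auto")`.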

How K2.7-Code Compares

Model	License	Params	Context	API price (in / out per 1M)
Kimi K2.7-Code	Modified MIT (open)	1T total / 32B active	256K	$0.95 / $4.00
Kimi K2.6	Open-weight	1T-class MoE	256K	~$0.67–0.95 / ~$3.39–4.00
GPT-5.5	Closed	Not disclosed	—	Not in Moonshot table
Claude Opus 4.8	Closed	Not disclosed	1M	$5.00 / $25.00
Qwen3-Coder-480B-A35B	Open (Qwen license)	480B / 35B active	256K native	Varies by host

K2.7-Code lists $0.19 per 1M tokens for cached input.
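The listed prices can be turned into a per-task estimate. The prices below are copied from the comparison table and the cached-input note above. The token counts and the cached fraction are hypothetical. No cached-input price for Claude Opus 4.8 appears in this article, so the sketch bills all of its input at the normal rate, which may overstate its cost.

```python
# USD per 1M tokens, from the comparison table and the cached-input note.
PRICES = {
    "Kimi K2.7-Code": {"input": 0.95, "cached_input": 0.19, "output": 4.00},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
}


def task_cost_usd(model: str, input_tokens: int, output_tokens: int,
                  cached_fraction: float = 0.0) -> float:
    """Estimate cost for one task; cached_fraction is the share of input served from cache."""
    p = PRICES[model]
    # If no cached price is listed, fall back to the normal input price.
    cached_price = p.get("cached_input", p["input"])
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    total = fresh * p["input"] + cached * cached_price + output_tokens * p["output"]
    return total / 1_000_000


# Hypothetical task: 2M input tokens (half cached), 200K output tokens.
for model in PRICES:
    cost = task_cost_usd(model, 2_000_000, 200_000, cached_fraction=0.5)
    print(f"{model}: ${cost:.2f}")
```

Agent loops resend much of the same context each step, so the cached-input rate can dominate the input side of the bill.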

Strengths and Weaknesses

Strengths:

Open weights under Modified MIT, with a real self-host path.
Broad, consistent gains over K2.6 on coding and agent evals.
Low API pricing relative to closed frontier models.
Beats Opus 4.8 on the MCP Mark Verified benchmark (company-reported).

Weaknesses:

All headline numbers are first-party at launch.
Thinking mode cannot be disabled.
Sampling controls are locked to fixed values.
Multi-step tool calls must preserve reasoning_content.
595 GB weights make self-hosting a serious commitment.

Key Takeaways

All headline benchmarks are vendor-run; independent results are pending.
K2.7-Code is open-weight, coding-specialized, and built on Kimi K2.6.
Moonshot reports +21.8% on Kimi Code Bench v2 over K2.6.
The model uses roughly 30% fewer reasoning tokens than K2.6.
