{"slug": "mistral-ai-releases-leanstral-1-5-an-apache-2-0-lean-4-code-agent-model-solving", "title": "Mistral AI Releases Leanstral 1.5: An Apache-2.0 Lean 4 Code Agent Model Solving 587 of 672 PutnamBench Problems", "summary": "Mistral AI released Leanstral 1.5, an Apache-2.0 licensed Lean 4 code agent model that solves 587 of 672 PutnamBench problems. The model achieves state-of-the-art results on multiple theorem-proving benchmarks at significantly lower cost than competitors, with a free API endpoint now available.", "body_md": "Today, Mistral AI released **Leanstral 1.5**. It is a code agent model built for Lean 4. The release targets automated theorem proving and proof engineering. Weights are open under Apache 2.0. A free API endpoint, `leanstral-1-5`, is now live.\n\nLeanstral 1.5 updates the earlier Leanstral-2603 model. It belongs to the Mistral Small 4 family.\n\n**What is Leanstral 1.5**\n\nLeanstral 1.5 is a code agent model for [Lean 4](https://github.com/leanprover/lean4), a proof assistant. A proof assistant checks every logical step mechanically. Lean 4 can express objects like perfectoid spaces and properties of Rust fragments.\n\nThe architecture is a mixture-of-experts, or MoE. An MoE routes each token to a few specialized sub-networks. This keeps compute low while total capacity stays large. Leanstral uses 128 experts, with 4 active per token.\n\nTotal size is 119B parameters, with 6.5B activated per token. Context length is 256k tokens. Input is multimodal, accepting text and image. Output is text only.\n\n**How Mistral Trained Leanstral 1.5**\n\nTraining runs in three stages: mid-training, supervised fine-tuning, then reinforcement learning with CISPO. Two reinforcement-learning environments shaped the model’s agentic behavior.\n\nIn the **multiturn environment**, the model receives a theorem statement. It must prove or disprove it. It submits a proof, then reads Lean compiler feedback. 
It refines across attempts until it succeeds or exhausts its budget.\n\nIn the **code agent environment**, Leanstral works inside a raw filesystem. It edits files, runs bash commands, and uses the Lean language server. That server exposes goals, errors, and type information in real time.\n\nThis lets it complete partial proofs, build auxiliary lemmas, and persist through context compaction. Compaction compresses earlier context so long tasks still fit the window. Correctness is verified by Mistral’s fork of SafeVerify against target theorems.\n\n**Benchmarks and Performance**\n\nThe Mistral team reports that Leanstral 1.5 saturates miniF2F. It reaches 100% on both the validation and test sets. It solves 587 of 672 PutnamBench problems.\n\nThe model sets a new state-of-the-art on the FATE-H and FATE-X algebra benchmarks. Mistral lists 87% on FATE-H and 34% on FATE-X. On FLTEval, pass@1 rises from 21.9 to 28.9. Pass@8 rises from 31.9 to 43.2.\n\nFLTEval is built from real pull requests to the Fermat’s Last Theorem repository. On it, Leanstral’s pass@8 of 43.2 surpasses Opus 4.6’s 39.6 at one-seventh the cost. It also widens its lead over open-source models three to ten times larger. Pass@8 means eight attempts are allowed per problem.\n\n| Benchmark | Leanstral 1.5 | Detail |\n|---|---|---|\n| miniF2F (val + test) | 100% | Saturated, per Mistral |\n| PutnamBench | 587 / 672 | ~$4 per problem |\n| FATE-H | 87% | New state-of-the-art |\n| FATE-X | 34% | New state-of-the-art |\n| FLTEval pass@1 | 28.9 | Up from 21.9 |\n| FLTEval pass@8 | 43.2 | Beats Opus 4.6’s 39.6 |\n\nOn PutnamBench, Leanstral edges out Seed-Prover 1.5 at its high setting by 7 problems. It does so at about $4 per problem. Mistral estimates Seed-Prover’s high setting near $300 or more per problem.\n\nThat setting runs a budget of 10 H20-days per problem. Mistral also compares against Goedel-Architect and AxProverBase. It notes Aleph Prover costs roughly $54 to $68 per problem.\n\nTest-time scaling is the model’s defining behavior. 
Raising the token budget per attempt lifts PutnamBench pass@8. The Mistral team reports 44 solved at 50k tokens, 244 at 200k, 493 at 1M, and 587 at 4M.\n\n**Case Studies and Use Cases**\n\nLeanstral trained mainly on mathematics, but it also verifies code. The Mistral team documents two case studies that matter for engineers.\n\n- First, Leanstral proved O(log n) time complexity for a real AVL tree implementation. AVL trees are self-balancing binary search trees. The proof used structural induction and monadic time tracking via the TimeM monad. It ran over 2.7 million tokens across 22 compactions. It established a bound near 48 steps per height unit, plus a constant.\n- Second, Leanstral found real bugs in open-source code. An automated pipeline used Aeneas to translate Rust into Lean. Leanstral inferred user intent and generated correctness properties. It attempted each property in four tries, then the negation in four more.\n\nAcross 57 repositories, it flagged 47 violated properties and 11 genuine bugs. Five were previously unreported on GitHub. One bug sat in the sign function for zigzag decoding in `datrs/varinteger`. On input `Std.U64.MAX`, the expression `(value + 1)` overflowed. That caused crashes in debug mode and silent corruption in release.\n\nPractical use cases follow directly from these examples. Dev teams can complete partial proofs inside a repository. They can generate correctness properties for a function automatically. They can stress-test Rust code by proving or disproving inferred invariants.\n\n**Getting Started: Code and Deployment**\n\nThe simplest path is Mistral Vibe, Mistral’s agent CLI. Leanstral runs on Mistral’s free plan. Enable ‘Labs models’ in your account, then create an API key.\n\nInstall Vibe, add the Lean agent, then launch it:\n\n```\n# 1. Set up Mistral Vibe\nuv tool install mistral-vibe\nuv tool upgrade mistral-vibe\nvibe --setup\n\n# 2. 
Inside vibe, install Leanstral, then leave vibe\n/leanstall\nexit\n\n# 3. Launch the Lean agent\nvibe --agent lean\n```\n\nFor self-hosting, install vLLM 0.24.0 or newer, then serve the weights:\n\n```\n# Installs mistral_common >= 1.11.5 automatically\nuv pip install -U vllm --torch-backend=auto\n\nvllm serve mistralai/Leanstral-1.5-119B-A6B \\\n  --max-model-len 200000 \\\n  --tensor-parallel-size 4 \\\n  --attention-backend FLASH_ATTN_MLA \\\n  --tool-call-parser mistral \\\n  --enable-auto-tool-choice \\\n  --reasoning-parser mistral\n```\n\nCall the server through the OpenAI-compatible client. Set `reasoning_effort` to `high` for complex prompts, or `none` for speed:\n\n```python\nfrom openai import OpenAI\n\n# Point the OpenAI client at your vLLM server\nclient = OpenAI(api_key=\"EMPTY\", base_url=\"<your-host-url>\")\n\nTEMP = 1.0\nMAX_TOK = 32000\nREASONING = \"high\"  # switch to 'none' for faster answers\n\nmodel = client.models.list().data[0].id\n\nmessages = [\n    {\"role\": \"user\", \"content\": [\n        {\"type\": \"text\", \"text\": \"Define the transition rules as an inductive proposition in Lean 4.\"}\n    ]},\n]\n\nresponse = client.chat.completions.create(\n    model=model,\n    messages=messages,\n    temperature=TEMP,\n    max_tokens=MAX_TOK,\n    reasoning_effort=REASONING,\n)\n\nprint(response.choices[0].message.content)\nprint(response.choices[0].message.reasoning)\n```\n\nLeanstral also supports OpenAI-style tool calling. You can expose a function such as `lean_run_code` to compile snippets. 
Mistral further recommends the `lean-lsp-mcp` server for tighter Lean integration.\n\n**Key Takeaways**\n\n- Leanstral 1.5 is a free, Apache-2.0 Lean 4 proof-engineering model.\n- It uses a 119B mixture-of-experts with 6.5B active parameters.\n- It saturates miniF2F and solves 587 of 672 PutnamBench problems.\n- It found 5 previously unreported bugs across open-source repositories.\n- Access it via Hugging Face weights, a free API, or local vLLM.", "url": "https://wpnews.pro/news/mistral-ai-releases-leanstral-1-5-an-apache-2-0-lean-4-code-agent-model-solving", "canonical_source": "https://www.marktechpost.com/2026/07/03/mistral-ai-releases-leanstral-1-5-an-apache-2-0-lean-4-code-agent-model-solving-587-of-672-putnambench-problems/", "published_at": "2026-07-03 22:20:26+00:00", "updated_at": "2026-07-03 22:32:21.362807+00:00", "lang": "en", "topics": ["artificial-intelligence", "large-language-models", "ai-research", "ai-products", "ai-tools"], "entities": ["Mistral AI", "Leanstral 1.5", "Lean 4", "PutnamBench", "miniF2F", "FATE-H", "FATE-X", "FLTEval"], "alternates": {"html": "https://wpnews.pro/news/mistral-ai-releases-leanstral-1-5-an-apache-2-0-lean-4-code-agent-model-solving", "markdown": "https://wpnews.pro/news/mistral-ai-releases-leanstral-1-5-an-apache-2-0-lean-4-code-agent-model-solving.md", "text": "https://wpnews.pro/news/mistral-ai-releases-leanstral-1-5-an-apache-2-0-lean-4-code-agent-model-solving.txt", "jsonld": "https://wpnews.pro/news/mistral-ai-releases-leanstral-1-5-an-apache-2-0-lean-4-code-agent-model-solving.jsonld"}}