
Show HN: Lazarus, a coding agent for long-horizon tasks

A developer released Lazarus, a new coding agent that uses a single persistent Python runtime instead of multiple specialized tools, achieving scores comparable to those reported for GPT-5.5 in Codex on two FrontierSWE tasks. The agent avoids tool selection and agent hierarchies by letting the model write Python code to inspect repos, edit files, run builds, and automate workflows. The project suggests that coding agents may have over-invested in tool collections and orchestration systems while under-investing in giving models a programmable environment they can shape themselves.

Published Jun 5, 2026 · 2 min read

I have been interested in long-horizon coding tasks for a while, especially with benchmarks like FrontierSWE, where even the best coding agents like Codex and Claude Code struggle to complete tasks.

These agents come with a collection of tools like bash, file edits, grep, glob, etc.

Lazarus takes a different approach. The idea is to give the model exactly one tool: a persistent Python runtime.

The model writes Python code, executes it, and receives stdout/stderr. Through Python it inspects repos, reads and edits files, runs builds, executes tests, invokes linters, and even builds custom harnesses and automates whatever workflows it needs.
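
As a rough illustration of what "one tool: a persistent Python runtime" can mean, here is a minimal sketch. It is not Lazarus's actual code: the `PersistentRuntime` class and its `run` method are hypothetical names, and running cells in-process with `exec` is a simplifying assumption (a real agent might use a subprocess or a notebook-style kernel instead). The point it shows is that variables defined in one cell stay available in the next, and that each call returns captured stdout and stderr.

```python
import contextlib
import io
import traceback


class PersistentRuntime:
    """Hypothetical single tool: every cell runs in the same namespace."""

    def __init__(self):
        # One dictionary holds all state, so later cells can reuse earlier results.
        self.namespace = {"__name__": "__main__"}

    def run(self, code: str) -> dict:
        stdout, stderr = io.StringIO(), io.StringIO()
        with contextlib.redirect_stdout(stdout), contextlib.redirect_stderr(stderr):
            try:
                exec(compile(code, "<cell>", "exec"), self.namespace)
            except Exception:
                # print_exc writes to sys.stderr, which is redirected here.
                traceback.print_exc()
        return {"stdout": stdout.getvalue(), "stderr": stderr.getvalue()}


runtime = PersistentRuntime()
runtime.run("import os\nentries = sorted(os.listdir('.'))")
result = runtime.run("print(type(entries).__name__)")  # reuses `entries` from the first cell
```

Errors are returned as text rather than raised, so in this sketch the model would see a traceback in `stderr` and could decide what to try next.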

The motivation for this was:

  • Tool selection itself is a planning problem.

  • Specialized tools are often difficult to compose together efficiently.

  • Long-horizon tasks frequently require custom workflows that predefined tools don't provide.

  • Python is expressive enough for the model to build those workflows itself.
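
To make the composition point concrete, here is the kind of cell a model could write for itself, sketched under assumptions: `rename_identifier` is a hypothetical helper, not part of Lazarus. It folds what would otherwise be separate glob, grep, and file-edit tool calls into one function that the model can then reuse for the rest of the session.

```python
import re
from pathlib import Path


def rename_identifier(root, old, new, glob="*.py"):
    """Glob, grep, and edit in one pass: rename whole-word `old` to `new` under `root`."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    changed = {}
    for path in sorted(Path(root).rglob(glob)):
        text = path.read_text(encoding="utf-8")
        # A function replacement avoids backslash-escape processing of `new`.
        updated, count = pattern.subn(lambda _match: new, text)
        if count:
            path.write_text(updated, encoding="utf-8")
            changed[str(path.relative_to(root))] = count
    return changed
```

The return value maps each modified file to the number of replacements, which gives the model a compact summary to print instead of reading every file back.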

Another decision is to avoid agent hierarchies. Lazarus runs a single tool-calling loop rather than a hierarchy of managers, planners, and worker agents.

The intuition is that current models are much better at writing code than at coordinating fleets of agents. Agent orchestration consumes context, introduces extra failure modes, and adds complexity.
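
A single tool-calling loop of this shape can be sketched in a few lines. Everything here is an assumption made for illustration: `agent_loop`, `run_cell`, and `scripted_model` are hypothetical names, and the "model" is a stub that replays a fixed list of cells in place of a real LLM call. The loop has no planner or worker layer: it asks for the next cell, runs it, appends the output to the transcript, and repeats.

```python
import contextlib
import io


def run_cell(code, namespace):
    """Execute one cell in the shared namespace and return its combined output."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf), contextlib.redirect_stderr(buf):
        try:
            exec(code, namespace)
        except Exception as exc:
            print(f"{type(exc).__name__}: {exc}")
    return buf.getvalue()


def agent_loop(task, next_cell, max_steps=20):
    """One flat loop: ask for a cell, run it, feed the output back."""
    namespace = {}
    transcript = [("task", task)]
    for _ in range(max_steps):
        code = next_cell(transcript)  # a real agent would call the model here
        if code is None:  # the stub signals completion with None
            break
        transcript.append(("cell", code))
        transcript.append(("output", run_cell(code, namespace)))
    return transcript


def scripted_model(cells):
    """Stand-in for an LLM: returns the given cells one at a time, then None."""
    remaining = iter(cells)
    return lambda transcript: next(remaining, None)


history = agent_loop("add two numbers", scripted_model(["total = 1 + 2", "print(total)"]))
```

Swapping `scripted_model` for a function that sends the transcript to a model API is the only change needed to turn the stub into a working loop.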

How does Lazarus manage context? When the "usable" context window of a model is nearly exhausted, the model gets one final opportunity to execute a Python tool call, the carryover cell, which can contain anything it wants to preserve: notes, plans, functions, summaries, partial results, etc.

The loop is then restarted with only:

  • The original user task

  • The carryover cell

  • The carryover cell's output

This allows the agent to periodically compress its own state and continue working without requiring an ever-growing context window.
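
The restart step can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, context size is approximated by character count where a real agent would count tokens, and the 90% threshold is an arbitrary choice. The post does not say whether the runtime's variables survive the restart, so the sketch simply reuses the same namespace.

```python
import contextlib
import io

USABLE_CONTEXT_CHARS = 200  # hypothetical budget; a real agent would count tokens


def run_cell(code, namespace):
    """Execute one cell in the shared namespace and return what it printed."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, namespace)
    return buf.getvalue()


def context_size(transcript):
    return sum(len(text) for _, text in transcript)


def nearly_exhausted(transcript, budget=USABLE_CONTEXT_CHARS, threshold=0.9):
    return context_size(transcript) >= threshold * budget


def restart_with_carryover(task, carryover_code, namespace):
    """Run the final carryover cell, then rebuild the context from three items only."""
    output = run_cell(carryover_code, namespace)
    return [
        ("task", task),
        ("carryover_cell", carryover_code),
        ("carryover_output", output),
    ]


namespace = {}
transcript = [("task", "port the parser"), ("cell", "# long history " * 20)]
if nearly_exhausted(transcript):
    notes = 'print("done: lexer; next: parser tests")'  # written by the model in practice
    transcript = restart_with_carryover("port the parser", notes, namespace)
```

After the restart the transcript holds only the task, the carryover cell, and its output, so its size depends on what the model chose to preserve and not on how long the previous run was.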

I evaluated Lazarus on two FrontierSWE tasks:

  • git-to-zig (rewriting Git in Zig)

  • dart-style-haskell (rewriting the dart-style formatter in Haskell)

The runs with scores are available here: [https://github.com/ExpressGradient/frontier-swe-lazarus-runs](https://github.com/ExpressGradient/frontier-swe-lazarus-runs)

Using GPT-5.5 at medium reasoning effort, Lazarus achieved scores comparable to those reported for GPT-5.5 in Codex at xhigh reasoning effort.

The runs were not completed to exhaustion; I stopped them because I ran out of OpenAI credits. So I suspect there is still room for improvement from longer runtimes and higher reasoning effort.

The project is still early, but the results made me wonder whether coding agents have become over-specialized around tool collections and orchestration systems, while under-investing in giving models a programmable environment they can shape themselves.

Lazarus: [https://github.com/ExpressGradient/lazarus](https://github.com/ExpressGradient/lazarus)

Comments URL: [https://news.ycombinator.com/item?id=48416473](https://news.ycombinator.com/item?id=48416473)

Points: 1
