Show HN: Self-managing codebase with long-horizon agents


A developer released a demo of a self-managing codebase in which long-horizon agents autonomously triage errors, upgrade libraries, and roll back regressions. The system runs three Anthropic Managed Agents (a manager, a reviewer, and a retro agent) that monitor Vercel and Sentry, file Linear tickets, write code, run Playwright tests, and open pull requests without human intervention. The project aims to offload constant maintenance work from developers onto automated agents that operate on cron-driven loops and webhook-triggered resumes.

Published May 26, 2026

A demo showcasing an app maintained by a long-running agent.

Production apps need constant care: errors and stack traces to triage, slow endpoints to investigate, libraries to upgrade, regressions to roll back. That work eats developer time and can require on-call rotations.

This repo is a demo that shows what becomes possible when those tasks are pushed onto a long-horizon agent.

The demo product. The demo product is a Next.js travel-planner app.

The managing agent. A cloud-hosted Anthropic Managed Agent runs every 30 minutes (and on GitHub webhooks). It monitors Vercel + Sentry, files Linear tickets for new issues, picks up tickets, writes code, runs Playwright to view its own work in a local dev server, and opens PRs.

The review agent. A separate reviewer agent reads each PR cold, builds, runs tests, and posts approve / request-changes / escalate. After three rounds of changes it escalates to a human.

The retro agent. The agent system is self-learning: each session can append to .claude/memory/. A retro agent runs daily, summarises 24h of activity in a Linear project update, and proposes memory edits as a PR.
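The reviewer's escalation rule can be sketched as a small pure function. This is only an illustration of the "three rounds of changes, then a human" policy described above; the type names, the `nextStep` function, and the exact counting rule (escalate once the third request for changes has been posted) are assumptions, not code from the repo.

```typescript
// Hypothetical verdict names, mirroring "approve / request-changes / escalate".
type Verdict = "approve" | "request-changes" | "escalate";
type Step = "await-review" | "merge" | "revise" | "escalate-to-human";

// Assumption: the third request for changes triggers escalation.
const MAX_CHANGE_ROUNDS = 3;

// Given the verdicts posted on one PR so far (oldest first), decide what happens next.
function nextStep(history: Verdict[]): Step {
  const last = history[history.length - 1];
  if (last === undefined) return "await-review"; // no review posted yet
  if (last === "escalate") return "escalate-to-human";
  if (last === "approve") return "merge";
  // Latest verdict is "request-changes": count how many rounds have been used.
  const rounds = history.filter((v) => v === "request-changes").length;
  return rounds >= MAX_CHANGE_ROUNDS ? "escalate-to-human" : "revise";
}
```

Keeping the rule in a function of the verdict history means it needs no extra state beyond what is already visible on the PR.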

Two cron-driven loops + webhook-driven resumes

Manager (/api/manager-tick, every 30 min). Reads memory, checks Vercel + Sentry, files tickets, picks up one Linear ticket, runs the dev loop locally in a sandbox (install, dev server, Playwright, build, lint), opens a PR.

Reviewer (/api/reviewer-tick, every 30 min). Reads each open PR cold, builds, runs e2e, posts AGENT_REVIEW: APPROVED | REQUEST_CHANGES | ESCALATE.

Webhooks (/api/github-webhook). PR opens → reviewer fires immediately. Reviewer approval → manager session resumes (full context) and squash-merges.
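As a rough sketch of the webhook-driven path, the snippet below parses an `AGENT_REVIEW:` marker out of a review comment and maps two kinds of event to an action. The event and action shapes, the function names, and the assumption that the marker sits on its own line are all hypothetical; the source does not say how verdicts other than approval are handled by the webhook, so this sketch simply leaves them to the next cron tick.

```typescript
type ReviewVerdict = "APPROVED" | "REQUEST_CHANGES" | "ESCALATE";

// Assumption: the marker appears on its own line, e.g. "AGENT_REVIEW: APPROVED".
const REVIEW_MARKER = /^AGENT_REVIEW:\s*(APPROVED|REQUEST_CHANGES|ESCALATE)\s*$/m;

function parseReviewVerdict(body: string): ReviewVerdict | null {
  const match = REVIEW_MARKER.exec(body);
  return match ? (match[1] as ReviewVerdict) : null;
}

// Hypothetical, simplified event shapes (not the real GitHub payloads).
type WebhookEvent =
  | { kind: "pr_opened"; pr: number }
  | { kind: "review_posted"; pr: number; body: string };

type Action =
  | { do: "run_reviewer"; pr: number }
  | { do: "resume_manager_and_squash_merge"; pr: number }
  | { do: "leave_for_next_tick"; pr: number };

function route(event: WebhookEvent): Action {
  if (event.kind === "pr_opened") {
    // PR opens: the reviewer fires immediately.
    return { do: "run_reviewer", pr: event.pr };
  }
  // Reviewer approval: the manager session resumes and squash-merges.
  if (parseReviewVerdict(event.body) === "APPROVED") {
    return { do: "resume_manager_and_squash_merge", pr: event.pr };
  }
  // Assumption: anything else waits for the regular cron loop.
  return { do: "leave_for_next_tick", pr: event.pr };
}
```

A machine-readable marker in the comment body lets the manager and the webhook route agree on the verdict without sharing any state beyond the PR itself.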

Daily retro (/api/retro-tick, 07:00 UTC). Reads 24h of activity, posts a Linear project update with a health field, and proposes .claude/memory/ edits as a PR.
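The three schedules above map naturally onto Vercel cron entries, which are evaluated in UTC. The snippet below builds an object in the shape of the `crons` array that `vercel.json` accepts; the repo's actual configuration is not shown in the source, so the cron expressions here are an assumption derived from "every 30 min" and "07:00 UTC".

```typescript
// Shape of one entry in the "crons" array of vercel.json.
interface CronEntry {
  path: string;     // route that Vercel calls on schedule
  schedule: string; // standard five-field cron expression, in UTC
}

const crons: CronEntry[] = [
  { path: "/api/manager-tick", schedule: "*/30 * * * *" },  // every 30 minutes
  { path: "/api/reviewer-tick", schedule: "*/30 * * * *" }, // every 30 minutes
  { path: "/api/retro-tick", schedule: "0 7 * * *" },       // daily at 07:00 UTC
];

// Sanity check: a cron expression of this kind has exactly five fields.
function hasFiveFields(schedule: string): boolean {
  return schedule.trim().split(/\s+/).length === 5;
}

// Serialised form, suitable for pasting into vercel.json.
const vercelJson: string = JSON.stringify({ crons }, null, 2);
```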

Stack

Anthropic Managed Agents: three agents (manager, reviewer, retro), all running in cloud sandboxes with a mounted clone of this repo.

Linear MCP: store and control plane for tasks; also the location where the agent provides project updates.

GitHub MCP: PRs, reviews, files.

Vercel MCP: deployments, logs.

Sentry MCP: runtime errors.

Vercel: hosting + cron + webhooks for the orchestration routes.

Playwright: e2e tests inside the sandbox.

npm run manager           # interactive REPL into a manager session
npm run manager:tick      # one-shot operational loop, prints transcript
npm run manager:bootstrap # apply agent YAML changes to the live agent

Bring your own infrastructure. The agent cannot create infrastructure, set env vars, register OAuth apps, or pay for services. Bootstrapping is human-only.

Further observability. Product analytics, metrics and so on are trivial to add once the agent is running, and thus not done here.

Single-agent throughput. WIP limit of 1: the manager won't pick up a second ticket while another is in flight. Keeps things simple, caps throughput.
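The WIP limit of 1 could be enforced with a guard like the one below. This is a minimal sketch: the `Ticket` shape, the status names, and the "lowest priority number first" ordering are hypothetical and are not taken from the repo or from Linear's data model.

```typescript
// Hypothetical ticket shape for illustration only.
interface Ticket {
  id: string;
  status: "todo" | "in_progress" | "done";
  priority: number; // assumption: lower number means more urgent
}

const WIP_LIMIT = 1;

// Returns the ticket the manager should pick up, or null if it should not start new work.
function pickNextTicket(tickets: Ticket[]): Ticket | null {
  const inFlight = tickets.filter((t) => t.status === "in_progress").length;
  if (inFlight >= WIP_LIMIT) return null; // something is already in flight

  const todo = tickets.filter((t) => t.status === "todo");
  if (todo.length === 0) return null; // nothing to do

  // Pick the most urgent remaining ticket.
  return todo.reduce((best, t) => (t.priority < best.priority ? t : best));
}
```

Raising `WIP_LIMIT` would be the obvious lever for more throughput, at the cost of the manager having to juggle several open PRs at once.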

I hope this inspires you to set up your own. If you would like to swap notes, I can be reached at willtay.com.

Read original on github.com → github.com/WillTaylor22/self-managing-codebase
