
Show HN: I applied Lyapunov stability theory to detect when LLM agents spiral

A developer released state-harness, an open-source Python library that uses Lyapunov stability theory to detect and classify failure patterns in multi-turn LLM agents without extra LLM calls. The tool monitors token consumption relative to a baseline, identifying spirals, retry storms, and policy drift while providing actionable fix suggestions. Validated across 3,175 runs with zero false positives, it aims to improve debugging and compute efficiency for production agent systems.

Published: Jun 22, 2026

Lyapunov-stability monitor for multi-turn LLM agents. Detects token spirals, classifies failure patterns, and tells you why a task failed, with no extra LLM calls.

from state_harness import GrowthRatioGuard, FailureReport

guard = GrowthRatioGuard(token_budget=50_000)

with guard:
    for turn in agent_loop:
        result = llm.invoke(turn.prompt)
        guard.record_step(tokens_used=result.usage.total_tokens)

report = FailureReport.from_guard(guard)
print(report)
⚠️  STABILITY TRIPPED at turn 12

Pattern: Context Accumulation Spiral (confidence: 92%)
  • Last 5 turns all exceeded 1.5× baseline (4/4 were accelerating).
  • Peak growth ratio: 5.2× baseline.
  • Without intervention, projected cost was $0.0396 (actual: $0.0039).

Energy: ▁▁▁▁▁▂▂▃▄▆█
  Baseline: 1050 tokens/turn
  Peak ratio: 5.2× baseline

Cost: $0.0039 (saved ~$0.0357 by tripping early)

Suggested actions:
  🔴 1. Enable RG history compression in your agent loop.
     → Compressing older messages reduces prompt tokens by 40-60%.
  🟡 2. Lower the growth ratio threshold to 1.8×.
     → A lower threshold would have caught it earlier.
  🟢 3. Add a sliding-window context strategy.
     → Send only the last N messages plus a summary of earlier ones.

Production multi-agent systems fail at rates of 41–87% (Kore.ai 2026). When an agent spirals (replaying full context, retrying a broken tool, drifting off-task), a budget cap will kill it but tells you nothing about why.

State-harness monitors token consumption relative to a warmup baseline via a Lyapunov energy function. When the growth ratio exceeds a threshold for W consecutive steps, it trips and classifies the failure pattern (context spiral, retry storm, policy drift) with fix suggestions, all from the energy trajectory alone, with no LLM calls.
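
To make the trip rule concrete, here is a minimal pure-Python sketch of it. It is an illustration under stated assumptions, not the library's implementation: the class name `RatioTripSketch`, the choice of the warmup mean as the baseline, and the strict `>` comparison are all hypothetical. Only the parameter names and defaults (ratio threshold, window W, warmup turns, budget gate) come from this README.

```python
from dataclasses import dataclass, field


@dataclass
class RatioTripSketch:
    """Toy growth-ratio monitor (hypothetical, not the state-harness API)."""
    ratio_threshold: float = 2.0   # a turn "escalates" above this ratio
    window: int = 3                # consecutive escalating turns needed to trip
    warmup_turns: int = 3          # turns used to estimate the baseline
    budget_gate: int = 8_000       # never trip before this many tokens are spent
    history: list = field(default_factory=list)
    streak: int = 0

    @property
    def baseline(self):
        # Assumption: the baseline is the mean of the warmup turns.
        warm = self.history[: self.warmup_turns]
        return sum(warm) / len(warm) if len(warm) == self.warmup_turns else None

    def record(self, tokens_used: int) -> bool:
        """Record one turn; return True when the monitor trips."""
        self.history.append(tokens_used)
        if len(self.history) <= self.warmup_turns:
            return False                      # still warming up
        ratio = tokens_used / self.baseline
        self.streak = self.streak + 1 if ratio > self.ratio_threshold else 0
        gated = sum(self.history) < self.budget_gate
        return self.streak >= self.window and not gated


monitor = RatioTripSketch()
turns = [1000, 1100, 1050, 1200, 2500, 3400, 5200]
tripped_at = None
for i, tokens in enumerate(turns, start=1):
    if monitor.record(tokens):
        tripped_at = i
        break
```

In this toy trajectory the baseline settles at 1050 tokens per turn, turns 5 to 7 all exceed 2× that baseline, and cumulative spend has passed the 8,000-token gate by turn 7, so the sketch trips there. A single spike followed by recovery would reset the streak and never trip.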

Install it and wrap your agent loop:

pip install state-harness

| Pattern | Signal | Example |
| --- | --- | --- |
| Context Spiral | Token growth accelerating beyond baseline | Agent replaying full history each turn |
| Retry Storm | Low-variance repeated calls | Tool failing, agent retrying identically |
| Policy Drift | VSA similarity score dropping | Agent going off-topic mid-conversation |
| Early Explosion | Token spike in first 3 turns | Oversized system prompt or tool response |
| Budget Exhaustion | Cumulative spend hits ceiling | Complex task, not necessarily broken |
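
As one illustration of how a signal in this table could be read off a token trajectory, the sketch below flags a possible retry storm when recent turns are nearly identical in size. The function name, the 4-turn lookback, and the 5% coefficient-of-variation cutoff are hypothetical choices for the example; the library's actual classifier is not shown in this README.

```python
from statistics import mean, pstdev


def looks_like_retry_storm(tokens, min_turns=4, max_cv=0.05):
    """Flag a run of near-identical turn sizes.

    Uses the coefficient of variation (population stdev / mean) of the
    last `min_turns` turns; both thresholds are illustrative assumptions.
    """
    recent = tokens[-min_turns:]
    if len(recent) < min_turns:
        return False
    return pstdev(recent) / mean(recent) <= max_cv


# An agent retrying the same failing tool call produces near-constant turns.
retrying = looks_like_retry_storm([1800, 1210, 1205, 1208, 1206])
# An agent making varied progress does not.
progressing = looks_like_retry_storm([900, 1400, 2100, 1300, 2600])
```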

State-harness does not improve resolve rates: a naive budget cap achieves comparable task success (multi-trial results below). The value is:

- Failure diagnostics: classified failure patterns with actionable fixes, not just "budget exceeded." No extra LLM calls.
- Compute efficiency on long loops: 38.6% fewer search nodes and 30% less wall time on SWE-bench by terminating dead-end branches early.

Validated across 3,175 runs (4 benchmarks, 5-condition ablation, multi-trial with bootstrap CIs). Zero false positives across 7 models incl. 4 local via Ollama. Details in Benchmarks.

- Search-tree agents (MCTS, beam search): per-branch caps look fine in isolation; tree-level cost explosion is silent.
- Platform teams at scale: failure classification at the edge, exported as OpenTelemetry attributes.
- Benchmarking: the ~4–5% nondeterminism floor means single-run deltas <8% are noise.

Not needed for chatbots, RAG, single-turn apps, or ReAct loops with <10 turns: max_iterations + budget cap suffice.

pip install state-harness

Python ≥ 3.10. Pre-built wheels for Linux, macOS, Windows (x86_64 + ARM64). No Rust toolchain needed.

git clone https://github.com/vishal-dehurdle/state-harness.git
cd state-harness

python -m venv .venv && source .venv/bin/activate

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

pip install maturin
maturin develop --release

pip install pytest
pytest tests/

GrowthRatioGuard normalizes token usage against a baseline: it trips only on disproportionate growth, not natural context-window accumulation.

from state_harness import GrowthRatioGuard, StabilityViolation

guard = GrowthRatioGuard(
    token_budget=100_000,     # hard ceiling
    ratio_threshold=2.0,      # trip when turn is 2× the baseline
    window=3,                 # 3 consecutive escalating turns to trip
    budget_gate=8_000,        # don't trip until 8K tokens spent
)

with guard:
    for turn in agent_loop:
        try:
            result = llm.invoke(turn.prompt)
            guard.record_step(
                tokens_used=result.usage.total_tokens,
                errors=0,
            )
        except StabilityViolation as e:
            print(f"Agent killed: {e}")
            break

print(f"Total cost: {guard.total_tokens} tokens")
print(f"Baseline: {guard.baseline} tokens/turn")
print(f"Peak ratio: {guard.current_ratio}×")

After any execution (tripped or not):

from state_harness import FailureReport

report = FailureReport.from_guard(guard, model="gemini-2.5-flash")

print(report)

import json
print(json.dumps(report.to_dict(), indent=2))

Classifies the failure pattern, provides evidence, estimates cost, and suggests fixes, with no LLM calls.

For lower-level control using raw token counts (no normalization):

from state_harness import BoundaryGuard

with BoundaryGuard(token_budget=100_000, lambda_=1.0, window=5) as guard:
    for turn in agent_loop:
        result = llm.invoke(turn.prompt)
        guard.record_step(
            tokens_used=result.usage.total_tokens,
            errors=0,
            tool_name="search",
        )

As a decorator:

from state_harness import boundary_guard

@boundary_guard(
    token_budget=50_000,
    token_counter=lambda r: r.usage.total_tokens,
)
def agent_step(prompt: str):
    return llm.invoke(prompt)

With LangGraph:

from langgraph.prebuilt import create_react_agent
from state_harness.adapters import monitor_graph

agent = create_react_agent(model, tools=[search, calculate])
safe = monitor_graph(agent, token_budget=100_000)

result = safe.invoke({"messages": [("user", "Fix the login bug")]})

print(safe.total_tokens)  # cumulative usage
print(safe.tripped)       # did stability trip?
print(safe.report)        # full FailureReport with pattern + suggestions

For streaming:

for chunk in safe.stream({"messages": [("user", "Refactor this module")]}):
    print(chunk)

With a trip callback (e.g., for Slack alerts):

safe = monitor_graph(
    agent,
    token_budget=100_000,
    on_trip=lambda report: slack.send(f"Agent tripped: {report.pattern}"),
)

Advanced: per-tool wrapping with LangGraphMiddleware

from state_harness import BoundaryGuard
from state_harness.adapters import LangGraphMiddleware

guard = BoundaryGuard(token_budget=150_000)
middleware = LangGraphMiddleware(guard)

@middleware.wrap_tool
def search_database(query: str):
    return db.search(query)

with guard:
    result = agent.invoke({"messages": [...]})

With CrewAI:

from crewai import Agent, Task, Crew
from state_harness.adapters import CrewAICallback

callback = CrewAICallback(token_budget=200_000)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task],
    step_callback=callback.step_callback,
    task_callback=callback.task_callback,
)

result = crew.kickoff()
print(callback.report)  # FailureReport
callback.close()

With a hand-written loop (VanillaHook):

from state_harness import BoundaryGuard
from state_harness.adapters import VanillaHook

guard = BoundaryGuard(token_budget=50_000)
hook = VanillaHook(guard)

with guard:
    for step in agent_loop:
        hook.before_call(tool_name="search")
        result = execute_tool(step)
        hook.after_call(tokens_used=result.tokens)

From the command line:

state-harness simulate 1000 1200 1500 2000 3000 5000 8000 --budget 50000

state-harness analyze report.json
state-harness analyze report.json --json    # JSON output
state-harness analyze report.json --otel    # OpenTelemetry attributes

state-harness batch --dir ./reports/ --output results.csv

FailureReport supports multiple output formats:

report = FailureReport.from_guard(guard)

report.to_json()            # pretty-printed
report.to_json(indent=None) # compact, single line

with open("results.csv", "w") as f:
    f.write(FailureReport.csv_header() + "\n")
    for r in reports:
        f.write(r.to_csv_row() + "\n")

from opentelemetry import trace
span = trace.get_current_span()
span.set_attributes(report.to_otel_attributes())

Three mechanisms, implemented in Rust (via PyO3):

graph TD
    A["Agent Loop"] --> B["GrowthRatioGuard\n(Python SDK)"]
    B --> |"Normalizes tokens → growth ratio\nWarmup baseline · Budget gate"| C{" "}
    C --> D["Lyapunov Monitor\nV(k) = S + λθ\nΔV ≥ 0?"]
    C --> E["RG Decimator\nTF-IDF\nCompression"]
    C --> F["Holographic Engine\n(VSA)\nDrift Detection"]
    
    style D fill:#1a1a1a,stroke:#555,color:#e8e8e8
    style E fill:#1a1a1a,stroke:#555,color:#e8e8e8
    style F fill:#1a1a1a,stroke:#555,color:#e8e8e8
    style B fill:#0d1117,stroke:#30363d,color:#e6edf3

| Component | Purpose | Speed |
| --- | --- | --- |
| Lyapunov Monitor | Tracks energy derivative ΔV(k). Trips when ΔV ≥ 0 for W consecutive steps. | ~1 µs/step |
| RG Decimator | RG-inspired decimation of conversation history (TF-IDF scoring). Retains structurally important messages. | ~100 µs/compress |
| Holographic Engine | VSA-based policy drift detection. Binds domain invariants to high-dimensional vectors. | ~10 µs/check |
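
For readers new to Vector Symbolic Architectures, the sketch below shows the general technique the Holographic Engine is described as using: bind role and value vectors by element-wise multiplication, bundle the bindings into one policy vector, and measure drift as a drop in similarity. It is a generic pure-Python illustration, not the Rust engine. The role and value names (`domain`, `airline`, `poetry`, and so on) are invented for the example, and the majority-sign bundling is an assumption about one common VSA variant.

```python
import random

DIM = 10_000  # dimensionality quoted in this README


def random_bipolar(rng: random.Random) -> list[int]:
    """Draw a random vector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a: list[int], b: list[int]) -> list[int]:
    """Bind two bipolar vectors by element-wise multiplication."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors: list[list[int]]) -> list[int]:
    """Superpose vectors by majority sign (ties resolve to +1)."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*vectors)]


def similarity(a: list[int], b: list[int]) -> float:
    """Normalized dot product: 1.0 for identical, near 0.0 for unrelated."""
    return sum(x * y for x, y in zip(a, b)) / DIM


rng = random.Random(0)
roles = {name: random_bipolar(rng) for name in ("domain", "action", "tone")}
values = {name: random_bipolar(rng)
          for name in ("airline", "rebook", "formal", "poetry")}

# Policy vector: a bundle of role-value bindings the agent should respect.
policy = bundle([
    bind(roles["domain"], values["airline"]),
    bind(roles["action"], values["rebook"]),
    bind(roles["tone"], values["formal"]),
])

on_task = bind(roles["domain"], values["airline"])
off_task = bind(roles["domain"], values["poetry"])

on_score = similarity(policy, on_task)    # well above zero
off_score = similarity(policy, off_task)  # close to zero
```

Because each comparison is a fixed-length dot product, its cost does not depend on conversation length, which is consistent with the constant-time drift check described for the engine.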

5-condition ablation across 4 benchmarks (3,175 total runs). Full methodology in the research paper.

| Condition | Lyapunov | RG Decimation | VSA Dual-Gate | Description |
| --- | --- | --- | --- | --- |
| A. Baseline | - | - | - | Unmonitored agent |
| B. Lyapunov-only | ✅ | - | - | Energy monitoring, no intervention |
| C. Lyapunov+RG | ✅ | ✅ | - | + history compression on violation |
| D. Full-stack | ✅ | ✅ | ✅ | + policy drift gating |
| E. Naive Cap | - | - | - | Hard budget cap (control) |

| Benchmark | Runs | Stability Trips | Cost Savings (D vs A) | Resolve-Rate Δ | Diagnostics |
| --- | --- | --- | --- | --- | --- |
| MINT (reasoning + coding) | 1,136 | 0 | ~0% | −0.7pp (noise) | N/A (no trips) |
| τ³-bench (customer service) | 750 | 0 | 8.1% | within ±12pp nondeterminism | N/A (no trips) |
| SWE-bench Verified (coding) | 333 + 148 | ~38% | 38.6% (nodes) | −3.6pp (within ±4–5% noise) | Pattern classification |
| Custom Local (4 models) | 240 | 3 (true pos.) | 15.2% | 0pp | Pattern classification |
| MINT Local (Qwen3:4B) | 568 | 0 | ~0% | +1.8pp | N/A (no trips) |

Resolve-rate deltas fall within LLM nondeterminism (~4–5% stdev). No trips on short/medium loops (1,886 runs). Savings concentrate on long-loop search trees.

37 Django instances, SWE-bench Verified. Agent: moatless-tools SearchTree, 50-node budget. Model: Gemini 2.5 Flash.

| Condition | Resolved | Rate | Total Nodes | Wall Time | Nodes/Resolve |
| --- | --- | --- | --- | --- | --- |
| A. Baseline | 15 / 37 | 40.5% | 945 | 80 min | 63.0 |
| B. Lyapunov | 16 / 37 | 43.2% | 620 | 69 min | 38.8 |
| D. Full-stack | 14 / 37 | 37.8% | 580 | 56 min | 41.4 |
| E. Naive Cap | 21 / 37 | 56.8% | 876 | 77 min | 41.7 |

Note: Single-trial resolve rates have ~±8pp standard error. E's apparent 56.8% is not statistically significant vs A's 40.5%. Multi-trial results below confirm this.

Full-stack monitoring: 38.6% fewer nodes (945 → 580), 30% less wall time (80 → 56 min). Baseline had 7 tasks burning the full 50-node budget (all failed); with monitoring, zero hit ceiling. Lyapunov alone (Condition B, ~5 lines of code) delivers ~90% of the savings.

Ablation: each mechanism contributes independently.

| Layer Added | Compute (nodes) | Δ vs Previous Layer | Cumulative Reduction |
| --- | --- | --- | --- |
| A. No monitoring | 945 | - | - |
| B. + Lyapunov | 620 | −325 | 34.4% |
| D. + RG + VSA | 580 | −40 | 38.6% |

Lyapunov alone delivers ~90% of the benefit. RG and VSA add incremental value.
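
The percentages in the ablation follow directly from the node counts reported above. A short check, using only those three numbers (the helper name `reduction` is just for this example):

```python
# Node counts from the SWE-bench single-trial table above.
nodes = {"A": 945, "B": 620, "D": 580}


def reduction(baseline: int, value: int) -> float:
    """Percentage reduction of `value` relative to `baseline`."""
    return 100 * (baseline - value) / baseline


lyapunov_only = reduction(nodes["A"], nodes["B"])  # cumulative reduction at B
full_stack = reduction(nodes["A"], nodes["D"])     # cumulative reduction at D

# Share of the full-stack savings already delivered by Lyapunov alone.
share = lyapunov_only / full_stack

print(f"B: {lyapunov_only:.1f}%  D: {full_stack:.1f}%  share: {share:.0%}")
```

Lyapunov alone removes 325 of the 365 nodes saved by the full stack, which is where the "~90% of the benefit" figure comes from.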

3 trials per condition (A, D, E) across all 37 instances: 333 total runs. 12 runs stuck in Docker (28+ min), counted as failures:

| Condition | Trial 1 | Trial 2 | Trial 3 | Mean ± σ |
| --- | --- | --- | --- | --- |
| A. Baseline | 18/37 (48.6%) | 16/37 (43.2%) | 15/37 (40.5%) | 44.1% ± 4.1% |
| D. Full-stack | 15/37 (40.5%) | 16/37 (43.2%) | 14/37 (37.8%) | 40.5% ± 2.7% |
| E. Naive Cap | 19/37 (51.4%) | 15/37 (40.5%) | 17/37 (45.9%) | 45.9% ± 5.4% |

Cross-condition variance (2.9%) ≤ within-condition nondeterminism (4.1%). All differences fall within the noise band.
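
The Mean ± σ column can be recomputed from the per-trial resolved counts. The sketch below assumes σ is the sample standard deviation (n − 1 in the denominator) of the three trial rates, which is the convention that matches the reported values:

```python
from statistics import mean, stdev

INSTANCES = 37

# Resolved counts per trial, taken from the multi-trial table above.
resolved = {
    "A. Baseline": [18, 16, 15],
    "D. Full-stack": [15, 16, 14],
    "E. Naive Cap": [19, 15, 17],
}

summary = {}
for condition, counts in resolved.items():
    rates = [100 * c / INSTANCES for c in counts]
    # stdev() is the sample standard deviation (divides by n - 1).
    summary[condition] = (round(mean(rates), 1), round(stdev(rates), 1))

for condition, (m, s) in summary.items():
    print(f"{condition}: {m}% ± {s}%")
```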

The ~4% within-condition stdev converges with τ³-bench (±4.6%), establishing a ~4–5% nondeterminism floor for Gemini 2.5 Flash on code tasks. Single-run deltas <8% are unreliable.

Bootstrap CIs (10,000 resamples) and Welch's t-tests: A−D = +3.6pp [−0.9, +8.1], p ≈ 0.17; A−E = −1.8pp [−8.1, +4.5], p ≈ 0.68; D−E = −5.4pp [−10.8, 0.0], p ≈ 0.09. Full analysis in paper §7.3.1.

50 tasks × 3 trials × 5 conditions = 750 total runs. Agent handles airline reservations via tool calls. Model: Gemini 2.5 Flash. Concurrency=1.

| Condition | Trial Pass | Trial Rate | Task Pass (maj) | Task Rate | Cost | Cost Δ |
| --- | --- | --- | --- | --- | --- | --- |
| A. Baseline | 99/150 | 66.0% | 35/50 | 70.0% | $2.47 | - |
| B. Lyapunov-only | 83/150 | 55.3% | 28/50 | 56.0% | $2.42 | −2.0% |
| C. Lyapunov+RG | 79/150 | 52.7% | 26/50 | 52.0% | $1.69 | −31.8% |
| D. Full-stack | 86/150 | 57.3% | 30/50 | 60.0% | $2.28 | −8.1% |
| E. Naive Cap | 81/150 | 54.0% | 26/50 | 52.0% | $2.33 | −5.7% |

Key findings:

- Zero stability trips across 750 runs. All airline tasks classified as stable; no interventions.
- Pass-rate variance is nondeterminism. Naive cap (E, zero monitoring) drops −16pp from baseline, worse than full-stack (D, −10pp). The ~10–16pp spread is intrinsic variance.
- 25% of tasks flip pass/fail within the same condition across trials (~±12pp nondeterminism floor).
- 8.1% cost savings from passive monitoring (zero interventions).

284 tasks × 4 conditions = 1,136 total runs across GSM8K (48), MATH (100), HumanEval (45), MBPP (91). Agent uses up to 5 turns per task.

| Condition | GSM8K | MATH | Total | Tokens |
| --- | --- | --- | --- | --- |
| A. Baseline | 91.7% | 39.0% | 29.2% | 1,909,582 |
| B. Lyapunov | 91.7% | 41.0% | 29.9% | 1,904,421 |
| C. Lyapunov+RG | 89.6% | 37.0% | 28.2% | 1,910,926 |
| D. Full-stack | 87.5% | 39.0% | 28.5% | 1,949,708 |

Zero stability violations across 1,136 runs. Token usage is essentially invariant (at most ~2.1% overhead, in Condition D).

Failed tasks cost disproportionately more:

| Task | Success Avg | Failure Avg | Ratio |
| --- | --- | --- | --- |
| GSM8K | 2,613 tok | 8,857 tok | 3.4× |
| MATH | 5,154 tok | 8,188 tok | 1.6× |

HumanEval and MBPP show 0% across all conditions. This is a MINT framework limitation in code execution evaluation and is consistent across conditions (the harness does not introduce new failure modes).

20 custom tasks (5 easy, 10 medium, 5 hard) × 4 models × 3 conditions = 240 runs. Hardware: Apple M4 MacBook Pro, 16 GB RAM, Ollama local inference.

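| Model | Size | Baseline | Harness | Naive Cap | Token Savings | FP |
| --- | --- | --- | --- | --- | --- | --- |
| Llama 3.2:3B | 2.0 GB | 45% | 45% | 60% | 1.2% | 0 |
| Phi-4-Mini | 2.5 GB | 30% | 30% | 40% | 20.7% | 0 |
| Qwen3:4B | 2.5 GB | 30% | 30% | 40% | 0.9% | 0 |
| Gemma4:E4B | 9.6 GB | 35% | 35% | 70% | 37.9% | 0 |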

Key findings:

- Zero false positives across 80 harness runs: 4 model families, 3 difficulty tiers. Growth-ratio generalizes without threshold retuning.
- Small-model self-sabotage: Naive cap beats baseline by +17.5pp avg (+12.5pp median). Small models solve early turns correctly, then destroy solutions in later turns. Strongest on Gemma4:E4B (+35pp).
- Model-family behavioral signatures:
  - Llama 3.2:3B: classic spirals (ratios: 2.3×, 5.9×, 7.6×), 3 true-positive trips
  - Phi-4-Mini: spike-and-recover, 20.7% passive savings
  - Qwen3:4B: 255K tokens but flat ratios (≤1.06×), stable despite 3× volume
  - Gemma4:E4B: decreasing ratios, 37.9% passive savings, zero trips

Deploying ≤4B models via Ollama? State-harness works out of the box (zero false positives). The self-sabotage finding suggests adding a turn limit (2–3 turns) for open-ended code generation.

| Task | Harness (max=5) | Naive Cap (max=2) | Δ |
| --- | --- | --- | --- |
| GSM8K | 37.5% | 27.1% | +10.4pp |
| MATH | 0.0% | 0.0% | - |
| HumanEval | 11.1% | 11.1% | - |
| MBPP | 14.3% | 14.3% | - |
| Total | 12.7% | 10.9% | +1.8pp |

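Zero interventions across 284 tasks. With max 5 turns and W=3, the monitor cannot trigger within the available post-warmup turns: with the default 3 warmup turns only 2 monitored turns remain, fewer than the 3 consecutive escalations required. This is a structural guarantee.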
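
The structural guarantee is simple arithmetic and can be checked for any loop length. This sketch assumes the monitor needs `window` consecutive escalating turns after warmup, as described in the configuration section; the function name `can_trip` is hypothetical and is not part of the library.

```python
def can_trip(max_turns: int, warmup_turns: int, window: int) -> bool:
    """True if enough monitored turns exist for `window` consecutive escalations."""
    return max_turns - warmup_turns >= window


# MINT setup: 5 turns, default warmup of 3, W = 3 leaves only 2 monitored turns.
mint_can_trip = can_trip(max_turns=5, warmup_turns=3, window=3)

# A longer loop, such as the 12-turn example report, has room to trip.
long_loop_can_trip = can_trip(max_turns=12, warmup_turns=3, window=3)
```

The same check is a quick way to decide whether a short loop needs the harness at all, or whether a plain turn limit and budget cap already cover it.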

Full reproduction steps (all three benchmarks)

git clone https://github.com/vishal-dehurdle/state-harness.git
git clone https://github.com/sierra-research/tau-bench.git tau3-bench

cd state-harness
python -m venv .venv && source .venv/bin/activate
pip install maturin && maturin develop --release

cd ../tau3-bench
uv sync
cp ../state-harness/tau3_integration/harness_agent.py src/tau2/agent/
cp ../state-harness/tau3_integration/naive_cap_agent.py src/tau2/agent/

export GOOGLE_CLOUD_PROJECT=your-project-id
export VERTEXAI_LOCATION=asia-south1

bash benchmarks/tau3/run_5phase_airline.sh

bash benchmarks/swe_bench/run_benchmark.sh
bash benchmarks/swe_bench/run_benchmark_dbe.sh

bash benchmarks/mint/run_mint_fullstack.sh

Ablation conditions are controlled via environment variables:

| Variable | Values | Effect |
| --- | --- | --- |
| HARNESS_RG | on / off | Enable/disable RG history compression |
| HARNESS_VSA | on / off | Enable/disable VSA policy drift detection |
| HARNESS_RATIO_THRESHOLD | float (e.g., 2.0) | Override growth ratio threshold |
| HARNESS_BUDGET_GATE | int (e.g., 8000) | Override minimum spend before trip |

See benchmarks/ for setup, configs, and reproduction instructions.

- Multi-trial SWE-bench: 333 runs (3 trials × 3 conditions × 37 instances) confirming non-invasiveness within ±4% noise band
- Local model validation: 240 runs across 4 open-weight models (Llama, Phi, Qwen, Gemma) + 568 MINT runs on Qwen3:4B
- Terminal-Bench: terminal-based agent tasks; command-line tool loops where spirals manifest as repeated failed commands
- SWE-bench Pro: harder, contamination-resistant variant of SWE-bench
- Cross-model validation: 7 models total (GPT-4o-mini, Claude Haiku 4.5, Gemini 2.5 Flash + Llama 3.2:3B, Phi-4-Mini, Qwen3:4B, Gemma4:E4B)

- 37 SWE-bench instances: a larger sample would improve statistical power (n=3 trials gives limited degrees of freedom for t-tests).
- No causal intervention: the harness currently kills spiraling tasks. Redirect/repair is on the roadmap.
- Physics-inspired, not physics-equivalent: terms like "Renormalization Group" and "Lyapunov stability" are used as structural inspirations. The mathematical mapping is analogical, not isomorphic.
- Custom benchmark scale: the 20-task local battery is smaller than standard benchmarks. The self-sabotage finding (mean +17.5pp, median +12.5pp) is consistent across 4 models but requires larger-scale replication.

| Parameter | Default | Description |
| --- | --- | --- |
| token_budget | 100,000 | Hard ceiling on cumulative tokens |
| ratio_threshold | 2.0 | Growth ratio above which a turn counts as "escalating" (domain-tuned: airline=2.0, retail=2.5, telecom=2.0) |
| window | 3 | Consecutive escalating turns before circuit breaker trips |
| warmup_turns | 3 | Turns used to establish baseline (no monitoring during warmup) |
| budget_gate | 8,000 | Minimum cumulative tokens before the monitor can trip (retail: 12,000) |
| lambda_ | 1.0 | Error weighting in the Lyapunov energy function |

Environment variable overrides (highest precedence, for threshold sweeps):

| Env Var | Description |
| --- | --- |
| HARNESS_RATIO_THRESHOLD | Override ratio_threshold (e.g., 2.5) |
| HARNESS_BUDGET_GATE | Override budget_gate (e.g., 12000) |

Tuning tips:

- More aggressive (catch spirals earlier): ratio_threshold=1.8, window=2
- More conservative (fewer false positives): ratio_threshold=2.5, window=3
- High-value tasks: increase budget_gate to 20K+ to let expensive tasks run longer
- Complex domains (retail, multi-tool): start with ratio_threshold=2.5

- Lyapunov stability: V(k) = S(k) + λθ(k) models token consumption as a dynamical system. ΔV ≥ 0 for W consecutive steps → unstable.
- Renormalization Group (RG): message compression via coarse-graining; eliminates high-frequency noise, preserves scale-invariant task objectives.
- Vector Symbolic Architecture (VSA): domain policies bound to high-dimensional bipolar vectors (10,000-d, i8), enabling constant-time drift detection outside the LLM context window.
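
The instability rule can be written out in a few lines. This is a sketch of the rule taken literally, not the Rust monitor. It assumes S(k) is the per-turn growth ratio and θ(k) is the per-turn error count (consistent with lambda_ being described as an error weighting), and the function names are hypothetical.

```python
def energy(s: float, theta: float, lam: float = 1.0) -> float:
    """V(k) = S(k) + lambda * theta(k)."""
    return s + lam * theta


def first_unstable_step(s_values, theta_values, lam=1.0, window=3):
    """Return the 1-based step at which dV >= 0 has held for `window`
    consecutive steps, or None if that never happens."""
    streak = 0
    prev = None
    for k, (s, theta) in enumerate(zip(s_values, theta_values), start=1):
        v = energy(s, theta, lam)
        if prev is not None:
            # dV(k) = V(k) - V(k-1); a non-negative step extends the streak.
            streak = streak + 1 if v - prev >= 0 else 0
            if streak >= window:
                return k
        prev = v
    return None


# Assumed inputs: S(k) as growth ratios, theta(k) as error counts.
ratios = [1.0, 0.9, 1.1, 1.6, 2.4, 3.1]
errors = [0, 0, 0, 1, 1, 2]
trip_step = first_unstable_step(ratios, errors)

# A trajectory whose energy keeps decreasing never trips.
settling = first_unstable_step([1.2, 1.1, 1.0, 0.9, 0.8], [0, 0, 0, 0, 0])
```

Here the energy rises at steps 3, 4, and 5, so the rule fires at step 5. Note that under this literal reading a perfectly flat trajectory also satisfies ΔV ≥ 0; the ratio threshold and budget gate described in the configuration section add further conditions before GrowthRatioGuard trips.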

Implements the framework from:

Empirical Lyapunov Stability: Growth-Ratio Energy Functions as Leading Indicators of Agent Task Failure. Vishal Verma, 2026.

Full ablation, multi-trial validation, local-model results, and failure taxonomy. Key results reproduced in Benchmarks above.

If you use this library or refer to these findings in your research, please cite the preprint:

@misc{verma2026empirical,
  author       = {Verma, Vishal},
  title        = {Empirical Lyapunov Stability: Growth-Ratio Energy Functions as Leading Indicators of Agent Task Failure},
  month        = jun,
  year         = 2026,
  publisher    = {Zenodo},
  version      = {1.0.0},
  doi          = {10.5281/zenodo.20722987},
  url          = {https://doi.org/10.5281/zenodo.20722987}
}

Based on the theoretical framework from:

The Fluid Dynamics of Multi-Agent AI: Resolving d'Alembert's Paradox of Generative Workflows. Vishal Verma, 2026.

See CONTRIBUTING.md for dev setup, code style, and PR guidelines.

- Adaptive threshold: auto-tune τ based on task complexity signal from early turns
- Causal intervention: instead of killing spiraling tasks, redirect them (prompt injection, tool restriction)
- Streaming support: token-level monitoring for streaming LLM responses
- Multi-model validation: 7 models validated (GPT-4o-mini, Claude Haiku 4.5, Gemini 2.5 Flash + 4 local models via Ollama)
- Dashboard / observability: optional lightweight UI for monitoring energy trajectories in real time

See SECURITY.md. Do not open public issues for security reports.

Split-core licensing:

| Component | License | Notes |
| --- | --- | --- |
| Rust Core (src/) | BSL 1.1 | Free for non-commercial + ARR < $1M. Converts to Apache 2.0 on May 26, 2030. |
| Python SDK (python/) | Apache 2.0 | Fully permissive. |

See LICENSE.md for full details.
