Source: dev.to · Topic: ai-agents

Stop Asking AI for Answers. Start Asking If the Evidence Is Ready.

Vassiliy Lakhonin's Agenda Intelligence MD project shifts the focus of AI agents from generating answers to assessing whether evidence is ready for human trust. The open-source Python package provides a runtime for evidence-readiness and trust-routing in high-stakes AI-assisted decisions, such as regulated procurement and clinical review. It surfaces evidence gaps and decision-readiness rather than producing polished summaries.

Published Jun 29, 2026 · 5 min read

Most AI agents are optimized to produce an answer.

But in serious workflows, the answer is not the hard part.

The hard part is knowing whether that answer is supported well enough for a human to trust it, act on it, or escalate it.

That is the problem I am working on with Agenda Intelligence MD:

An evidence-readiness and trust-routing runtime for high-stakes AI-assisted decisions.

GitHub: vassiliylakhonin/agenda-intelligence-md

Summarization is useful.

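But many real-world decisions are not blocked by the lack of a summary. They are blocked by uncertainty about whether the evidence behind an answer is strong enough to act on.

This matters in workflows like regulated procurement and clinical review.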

In those settings, a polished AI-generated memo can be dangerous if it hides evidence gaps.

Agenda Intelligence MD is built around a different idea:

The next layer of agent infrastructure is not better summarization. It is knowing when an AI-generated brief is not ready to be trusted.

Agenda Intelligence MD turns messy input packs into structured human-review packets.

The inputs can be things like:

The output is not just a summary.

It is a structured review layer that surfaces evidence gaps, owner actions, and a decision-readiness route.

The goal is not to replace human judgment.

The goal is to make the review surface clearer before a human makes a decision.

A normal summarizer asks:

“What does this document say?”

Agenda Intelligence MD asks:

“Is this document ready to support a decision?”

That distinction changes the architecture.

Instead of treating the AI output as the final deliverable, the project treats it as something that must pass through a readiness layer.

For example, a vendor might claim that their AI product is safe for regulated enterprise use.

A summarizer can compress that claim into a nice paragraph.

Agenda Intelligence MD is designed to ask a more useful set of questions:

That is the difference between generating text and routing trust.
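To make the contrast concrete, here is a minimal sketch of a readiness check on a single vendor claim. The `Claim` and `Evidence` types, the `self_attested` flag, and the route labels are hypothetical names chosen for illustration. They are not taken from the Agenda Intelligence MD codebase, and the real project may ask different questions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str
    self_attested: bool  # assumed flag: True when the vendor is the only source


@dataclass
class Claim:
    text: str
    evidence: list[Evidence] = field(default_factory=list)


def route_claim(claim: Claim) -> str:
    """Return a trust route for one claim instead of a summary of it."""
    if not claim.evidence:
        return "no_evidence"
    if all(e.self_attested for e in claim.evidence):
        return "needs_independent_evidence"
    return "ready_for_review"


# Synthetic example: a safety claim backed only by the vendor's own material.
claim = Claim(
    text="Our AI product is safe for regulated enterprise use.",
    evidence=[Evidence(source="vendor whitepaper", self_attested=True)],
)
print(route_claim(claim))
```

A summarizer would return a paragraph here. The sketch returns a route, which is the part a reviewer can act on.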

The project is implemented as a Python package with multiple delivery surfaces around one core service layer.

It includes a CLI, an MCP integration, an HTTP interface, and experimental A2A-style agent routing.

This makes it usable in several different modes.

You can inspect it locally through the CLI.

You can integrate it into an agent workflow through MCP.

You can expose structured behavior over HTTP.

You can experiment with A2A-style agent routing.

The interesting part is not just that these interfaces exist. It is that they point toward the same product idea: evidence-readiness should be a reusable layer, not a one-off prompt.
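One way to picture "one core service layer, several delivery surfaces" is a single function wrapped by thin adapters. The sketch below is an assumption about the general shape, not the project's actual code: `assess_readiness`, `cli_adapter`, `http_adapter`, and the brief fields are all hypothetical.

```python
from __future__ import annotations

import argparse
import json


def assess_readiness(brief: dict) -> dict:
    """Core service: the single place where readiness logic lives (illustrative)."""
    claims = brief.get("claims", [])
    gaps = [c["text"] for c in claims if not c.get("sources")]
    return {"ready": bool(claims) and not gaps, "gaps": gaps}


def cli_adapter(argv: list[str]) -> str:
    """Thin CLI surface: parse arguments, call the core, return JSON text."""
    parser = argparse.ArgumentParser(prog="readiness-sketch")
    parser.add_argument("brief_json", help="brief as an inline JSON string")
    args = parser.parse_args(argv)
    return json.dumps(assess_readiness(json.loads(args.brief_json)))


def http_adapter(request_body: bytes) -> tuple[int, str]:
    """Thin HTTP-style surface: request bytes in, (status, JSON body) out."""
    try:
        brief = json.loads(request_body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "invalid JSON"})
    return 200, json.dumps(assess_readiness(brief))


# Synthetic brief with one unsupported claim.
brief = {"claims": [{"text": "SOC 2 certified", "sources": []}]}
print(cli_adapter([json.dumps(brief)]))
print(http_adapter(json.dumps(brief).encode()))
```

Because both adapters delegate to the same function, the readiness verdict cannot drift between surfaces.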

After installing the package, the basic local flow looks like this:

pip install agenda-intelligence-md

agenda-intelligence doctor
agenda-intelligence validate-brief examples/agenda-brief.json
agenda-intelligence score examples/agenda-brief.json --evidence examples/source/evidence-pack.json
agenda-intelligence weekly-delta examples/strategic-infrastructure-bankability/status.synthetic.md

The commands are designed to answer practical questions:

That last question is the most important one.

Because in real decision workflows, “what is missing?” is often more valuable than “what is the answer?”

One of the current discovery wedges for the project is AI vendor evidence-readiness for regulated procurement.

Imagine a buyer reviewing an AI vendor for an enterprise or regulated environment.

The buyer has:

A normal AI assistant can summarize the vendor.

But a buyer does not only need a summary.

They need a review packet:

That is the kind of workflow Agenda Intelligence MD is designed to support.

It is not trying to be the decision-maker.

It is trying to prepare the decision surface.

The repository also includes vertical profiles and demo surfaces for several high-stakes workflows, including:

These are not generic chatbot personalities.

They are structured reasoning surfaces for evidence-heavy review workflows.

The pattern is:

input pack -> structured review packet -> evidence gaps -> owner actions -> decision-readiness route
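The pattern above can be sketched as a small pipeline. This is a hedged illustration under stated assumptions: the `ReviewItem` and `Packet` types, the field names, and the route labels are invented for the example and do not reflect the repository's schemas.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReviewItem:
    claim: str
    sources: list[str]
    owner: str


@dataclass
class Packet:
    items: list[ReviewItem]
    gaps: list[str] = field(default_factory=list)
    owner_actions: dict[str, list[str]] = field(default_factory=dict)
    route: str = "unrouted"


def build_packet(input_pack: list[dict]) -> Packet:
    # Step 1: input pack -> structured review packet
    items = [
        ReviewItem(d["claim"], d.get("sources", []), d.get("owner", "unassigned"))
        for d in input_pack
    ]
    packet = Packet(items=items)
    for item in items:
        if not item.sources:
            # Step 2: record the evidence gap
            packet.gaps.append(item.claim)
            # Step 3: turn the gap into an action for a named owner
            packet.owner_actions.setdefault(item.owner, []).append(
                f"Provide evidence for: {item.claim}"
            )
    # Step 4: decision-readiness route (labels are assumptions)
    packet.route = "hold_for_evidence" if packet.gaps else "ready_for_human_review"
    return packet


# Synthetic input pack; the claims and file name are made up.
input_pack = [
    {"claim": "Model is audited annually", "sources": ["audit-report.pdf"], "owner": "security"},
    {"claim": "No customer data is retained", "owner": "legal"},
]
packet = build_packet(input_pack)
print(packet.route)
print(packet.owner_actions)
```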

That pattern is useful because many high-stakes workflows fail in the handoff between AI output and human responsibility.

Agenda Intelligence MD focuses on that handoff.

This project is intentionally bounded.

It is not:

The scoring is heuristic.

It evaluates structure, source coverage, evidence labeling, and decision-readiness signals.

It does not prove that a claim is true.
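A heuristic score of this kind might look like the sketch below. The four signal names follow the list above, but the weights, the required fields, and the way each signal is computed are assumptions made for illustration. The project's actual scoring may differ.

```python
REQUIRED_FIELDS = ("title", "claims", "recommendation")  # assumed brief fields
WEIGHTS = {  # assumed weights, chosen only so they sum to 1.0
    "structure": 0.25,
    "source_coverage": 0.35,
    "evidence_labeling": 0.25,
    "decision_signals": 0.15,
}


def score_brief(brief: dict) -> dict:
    """Score the shape of a brief. This never checks whether a claim is true."""
    claims = brief.get("claims", [])
    n = len(claims) or 1  # avoid division by zero for empty briefs
    signals = {
        # Share of required fields that are present.
        "structure": sum(f in brief for f in REQUIRED_FIELDS) / len(REQUIRED_FIELDS),
        # Share of claims that cite at least one source.
        "source_coverage": sum(bool(c.get("sources")) for c in claims) / n,
        # Share of claims that carry an evidence label.
        "evidence_labeling": sum("evidence_label" in c for c in claims) / n,
        # Whether the brief states its open questions at all.
        "decision_signals": 1.0 if "open_questions" in brief else 0.0,
    }
    total = sum(WEIGHTS[name] * value for name, value in signals.items())
    return {"signals": signals, "score": total}


# Synthetic brief: well structured, but one of two claims has no source.
brief = {
    "title": "Vendor review",
    "recommendation": "hold",
    "claims": [
        {"text": "Annual audit completed", "sources": ["doc-a"], "evidence_label": "documented"},
        {"text": "No data retention", "sources": [], "evidence_label": "asserted"},
    ],
    "open_questions": ["Who verified the audit?"],
}
result = score_brief(brief)
print(result["signals"])
print(result["score"])
```

Note that a brief can score well on every signal and still contain a false claim. The score measures how reviewable the packet is, nothing more.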

That boundary matters.

The point is not to say:

“The AI is right.”

The point is to say:

“Here is what the AI-assisted packet can support, here is what it cannot support, and here is where a human needs to review.”

MCP and A2A are interesting because they push agent systems toward composable infrastructure.

But composability also increases risk.

If agents can call tools, route tasks, and generate structured outputs, then they also need a way to communicate uncertainty, missing evidence, and escalation requirements.

Otherwise, agent systems become very good at moving unsupported claims through a workflow faster.

Agenda Intelligence MD is an experiment in making the trust layer explicit.

Not hidden in a prompt.

Not buried in a paragraph.

Not left to the final reviewer to reconstruct manually.

Instead, the runtime exposes readiness, gaps, and routing as structured outputs.
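As a rough sketch of what "structured outputs" could mean in an agent-to-agent handoff, the example below wraps a payload in an envelope that a downstream agent must inspect before acting. The `ReadinessEnvelope` type, its fields, and the readiness labels are hypothetical. They are not part of MCP, A2A, or the project's published schemas.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ReadinessEnvelope:
    payload: str
    readiness: str  # assumed labels: "ready" or "not_ready"
    gaps: list[str] = field(default_factory=list)
    escalate_to: Optional[str] = None


def downstream_agent(message: str) -> str:
    """Refuse to act on a payload whose envelope reports gaps or low readiness."""
    env = ReadinessEnvelope(**json.loads(message))
    if env.readiness != "ready" or env.gaps:
        target = env.escalate_to or "human reviewer"
        return f"held: escalate to {target}"
    return "accepted"


# Synthetic handoff: the upstream agent flags a gap instead of hiding it.
envelope = ReadinessEnvelope(
    payload="Recommend approving the vendor.",
    readiness="not_ready",
    gaps=["No independent security audit on file"],
    escalate_to="procurement lead",
)
message = json.dumps(asdict(envelope))
print(downstream_agent(message))
```

The design choice is that uncertainty travels with the payload as data, so the receiving agent does not need to infer it from prose.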

I started from a simple observation:

A lot of AI work focuses on making outputs more fluent.

But in serious workflows, fluency is not the bottleneck.

The bottleneck is whether the output is usable for a decision.

A beautiful memo with missing evidence is still a weak memo.

A confident recommendation with unclear source coverage is still risky.

A summary that does not show what it cannot support is not enough.

I wanted a system that treats evidence gaps as first-class objects.

You may find the project interesting if you are working on:

The repo is especially relevant if you are asking:

How do we make AI-assisted workflows more reviewable before they become more autonomous?

If you open the repository, I would suggest looking at four areas:

1. The CLI flow. Start with the examples and validation commands.

2. The schemas. The schemas show what the project treats as structured review output.

3. The MCP integration. This is useful if you are thinking about agent-tool interoperability.

4. The vertical profiles. These show how the same evidence-readiness pattern can be adapted to different domains.

I do not think every AI agent needs to make more decisions.

I think many AI agents need to become better at saying:

That is less flashy than autonomous decision-making.

But it is much closer to what many real organizations need.

The future of AI infrastructure will not only be about agents that can act.

It will also be about systems that know when not to act yet.

That is the layer Agenda Intelligence MD is exploring.

GitHub: vassiliylakhonin/agenda-intelligence-md

If this direction is interesting to you, I would appreciate your reactions, issues, critiques, or architecture reviews.
