# I got tired of vibe investing, so I built an AI committee that shows its work

> Source: <https://dev.to/campnoumiracle2022/i-got-tired-of-vibe-investing-so-i-built-an-ai-committee-that-shows-its-work-44md>
> Published: 2026-06-29 19:34:16+00:00

We have vibe coding now — so I guess we have vibe investing too. Most "AI stock picker" tools work the same way: you feed in a ticker, and out comes **"BUY — confidence 87%"** with no way to see *why*. You're not analyzing anything; you're trusting a vibe. If I can't inspect the reasoning, I can't trust the call, and I definitely can't learn from it.

So I built the opposite: a multi-agent committee where the output isn't a score — it's an auditable trail from raw evidence to a final trade thesis. Bull and bear analysts argue the ticker out, a risk manager signs off, and the final decision is *required to cite the specific evidence it rests on*. I called it VerumTrade (Latin *verum*, "truth"). It's open source (Apache-2.0), and this post is about how it's put together, not a pitch.

The design constraint that drove everything: **at every step, you should be able to read the reasoning and inspect the evidence it was built on.** No black box. That turned out to map naturally onto a multi-agent graph, where each stage produces structured output the next stage consumes.

The pipeline runs six stages:

Each arrow in that chain is a structured handoff, not a vibe. The debate step was the one that surprised me most — forcing an explicit adversarial pass (a Bear agent whose only job is to attack the thesis) catches a lot of motivated reasoning that a single "analyst" agent happily glosses over.

I'll be upfront: VerumTrade started from [TradingAgents](https://github.com/TauricResearch/TradingAgents), an excellent open-source multi-agent trading framework. The committee, the bull/bear debate, the risk discussion — that lineage is theirs, and credit where it's due.

What bugged me about most of these systems (mine included, early on) is that the "reasoning" is just *printed*. You get nice markdown reports, but the final BUY/SELL/HOLD is effectively text-extracted, and nothing structurally ties the verdict to the facts that produced it. So I added the layer that was missing:

`id`

, supporting/contradicting fact IDs, a confidence, and a falsifier — not free-floating prose.`rationale_evidence_ids`

field — a non-empty list of the evidence IDs the call rests on. If the model can't point to what justifies the trade, validation fails.`stop_loss`

and `take_profit`

are required numeric fields; limit/stop prices are checked for coherence. No "BUY, idk, maybe set a stop somewhere."The point isn't that the predecessor is bad — it's that "show your work" should mean *structured, linked, and validated*, not *printed*.

**A two-tier LLM setup.** Running every agent on a frontier model is slow and expensive; running everything on a cheap model is unreliable on the steps that matter. So routine extraction and summarization run on a fast/cheap tier, while the debate and risk judgment run on a stronger tier. This kept cost sane without gutting quality on the decisions that actually move the recommendation.

**Provider independence.** I started on one provider and immediately regretted hardcoding it. The pipeline now runs against OpenAI-compatible endpoints generally — I've run full pipelines on Qwen and other backends by overriding the base URL. If you're building anything multi-agent, decouple from a single vendor early; retrofitting it later is painful.

The piece I'm most proud of is a **crowding / macro-pullback awareness** check — a guard that flags when a thesis is leaning on a crowded, macro-sensitive setup that looks great right up until it doesn't. It came directly from watching the naive version confidently recommend names that were one Fed headline away from unwinding.

There's a web app, a CLI, and a plain Python API. The programmatic entry point is about as minimal as I could make it:

``` python
from verumtrade import run_pipeline

# Returns the full state plus a structured trade decision
result = run_pipeline(ticker="MU")
print(result.decision.rationale)       # the human-readable thesis
print(result.decision.rationale_evidence_ids)  # the evidence IDs the call rests on
print(result.traces)                   # evidence -> debate -> decision, step by step
```

The output isn't just a verdict — `result.traces`

is the whole reasoning chain, and every decision points back at the evidence that justifies it.

This is a **research and decision-support** tool, not an oracle and not financial advice. Market data can be delayed or wrong, LLM outputs can be wrong, and trading involves real risk of loss. I treat its output as a structured second opinion — something to challenge, not obey.

It's early and I'm actively building. If the architecture is interesting to you, or you want to poke holes in the multi-agent design, the repo is here: [https://github.com/muye1202/VerumTrade](https://github.com/muye1202/VerumTrade)

I'd genuinely like feedback on the evidence-graph and the "decision must cite its evidence" constraint — what would *you* want to see in the trace to actually trust a recommendation? That question is most of why I open-sourced it.
