I've been building PR Focus, a Chrome extension that helps developers triage GitHub pull requests. One of the first decisions I had to make was: how do I actually sort PRs by priority?
The obvious answer is "use AI to score the risk". But I didn't want to rely 100% on an LLM.
So I built a hybrid system: deterministic signals (CI status + PR age) form the floor, and the AI risk score is a tiebreaker on top. Failing CI always floats to the top, regardless of what the AI says.
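One way to picture that ordering is as a plain sort comparator. This is a minimal sketch, not the actual PR Focus code: the `PullRequest` shape, the `CI_RANK` table, the whole-day age bucket, and the 0 to 1 risk scale are all assumptions made for illustration. It reads "floor plus tiebreaker" as a lexicographic sort: CI status first, then age, and the AI score only when both deterministic signals tie.

```typescript
// Hypothetical data model; field names are assumptions, not PR Focus's real types.
type CiStatus = "failing" | "pending" | "passing";

interface PullRequest {
  id: number;
  ciStatus: CiStatus;
  ageDays: number;      // whole days since the PR was opened (assumed bucket size)
  aiRiskScore?: number; // assumed 0..1 from the LLM, higher = riskier; may be missing
}

// Lower rank sorts first, so failing CI always leads the list.
const CI_RANK: Record<CiStatus, number> = { failing: 0, pending: 1, passing: 2 };

function comparePriority(a: PullRequest, b: PullRequest): number {
  // 1. Deterministic floor: CI status.
  const byCi = CI_RANK[a.ciStatus] - CI_RANK[b.ciStatus];
  if (byCi !== 0) return byCi;

  // 2. Deterministic floor: older PRs come first.
  const byAge = b.ageDays - a.ageDays;
  if (byAge !== 0) return byAge;

  // 3. Tiebreaker only: higher AI risk first. A missing score counts as 0,
  //    so the ordering still works when the LLM call fails or is skipped.
  return (b.aiRiskScore ?? 0) - (a.aiRiskScore ?? 0);
}

// Returns a new sorted array and leaves the input untouched.
function sortByPriority(prs: PullRequest[]): PullRequest[] {
  return [...prs].sort(comparePriority);
}
```

Under these assumptions, a brand-new PR with failing CI and a risk score of 0.01 still sorts above a ten-day-old passing PR scored 0.99, because the AI score is never consulted until CI status and age are equal.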
I wrote up the full decision, including the trade-offs and what it cost, in my new Build Logs repo.
If you're building dev tools or wrestling with AI reliability, the full log might be useful:
🔗 [Why PR risk scoring is a hybrid, not a pure AI verdict](https://github.com/projekta2/build-logs/blob/main/001-pr-focus-risk-scoring.md)