{"slug": "i-let-the-ai-write-the-report-not-decide-the-alerts", "title": "I let the AI write the report, not decide the alerts", "summary": "A developer built a SOC triage tool called TriageLens that separates detection from AI-generated reporting. The tool uses deterministic TypeScript code for parsing, detection, and risk scoring, with AI only writing the final analyst-style summaries. This architecture ensures consistent, auditable findings every time, eliminating the unreliability of AI-driven verdicts.", "body_md": "I've been building a SOC triage tool called TriageLens, and the whole thing started from one annoyance. Every \"AI security analyst\" demo I tried was just a chatbot with a log pasted into the prompt. Ask it twice, get two different verdicts. For triage that's useless. If the tool says \"brute force, critical\" one run and \"looks fine\" the next, I can't trust either answer.\n\nSo I drew a hard line early. The AI doesn't get to decide what's a finding. It only gets to write the finding up.\n\nParsing, detection, and risk scoring are plain TypeScript. No model involved. The pipeline normalizes Windows Security 4688, Sysmon Event 1, Linux SSH `auth.log`, and generic JSON into one event shape, runs a list of detection rules over those events, and scores the result 0-100. All deterministic. Same logs in, same findings out, every time.\n\nThe AI layer sits at the very end. It takes the structured findings that already exist and turns them into analyst-style prose: a summary, per-finding notes, prioritized next steps. If I swap the provider from the built-in demo one to Ollama to Claude, the findings and the MITRE mapping don't move at all. Only the wording changes.\n\nThat property is the part I actually care about. The detections are auditable. The model is just the writer.\n\nEach detection rule is a pure function that looks at the events and returns evidence strings. Empty array means it didn't fire. 
Here's the one I'm happiest with, the chained one:\n\n```\n{\n  id: 'successful-auth-after-brute-force',\n  title: 'Successful login after brute-force activity',\n  severity: 'critical',\n  techniques: techniques('T1110', 'T1078'),\n  detect: (events) => {\n    const failsByIp = countFailuresByIp(events)\n    const evidence: string[] = []\n    for (const e of events) {\n      if (\n        e.eventId === 'auth-success' &&\n        e.sourceIp &&\n        (failsByIp[e.sourceIp] ?? 0) >= 5\n      ) {\n        evidence.push(\n          `Successful login for \"${e.user}\" from ${e.sourceIp} after ${failsByIp[e.sourceIp]} failures`,\n        )\n      }\n    }\n    return evidence\n  },\n}\n```\n\nOn its own, a pile of failed SSH logons is just noise. Lots of hosts get sprayed all day. What changes the picture is a success from the same IP that just failed a bunch. The brute-force rule alone is `high`. The success-after-brute-force rule is `critical`, mapped to T1110 and T1078, because at that point you're probably looking at a real compromise, not background scanning.\n\nWriting it as a plain function means I can unit test it with a handful of fake events and know it fires on exactly the case I want. No prompt tuning, no \"please respond in JSON.\" `countFailuresByIp` is about six lines and counts `auth-failure` events per source IP. The whole rule file reads top to bottom like a checklist.\n\nMy first version actually did hand the raw logs to the model and ask it to return findings as JSON. It worked in the demo and fell apart the moment I fed it anything weird. Sometimes it invented an event ID that wasn't in the log. Once it confidently flagged a normal `svchost` as a LOLBin. And the JSON would occasionally come back with a trailing comment or markdown fence that broke the parser.\n\nI spent a day trying to prompt my way out of that and then gave up on the approach entirely. 
Moving detection into code wasn't a performance decision, it was a \"I need to be able to trust this\" decision. The model is great at writing the summary. It's bad at being the source of truth.\n\nThe honest part. It only reads EVTX if you've already exported it to JSON. There's no native `.evtx` binary parser yet, which is the next thing on the roadmap, and right now that export step is an annoying manual hop. The rule set is small, seven rules, so it catches the obvious stuff (encoded PowerShell, Office spawning a child process, log clearing, the SSH chain) and misses plenty. I want Sigma import so I'm not the only one writing detections in my own format.\n\nIt also isn't a SIEM and I'm not pretending it is. It's a learning project and a triage aid. It does not replace tuned detection content or a human deciding what matters.\n\nIt runs with zero setup though. `npm install`, `npm run dev`, a sample log is already loaded, click Analyze. The default provider needs no API key, so you can see the whole loop without signing up for anything.\n\nRepo's here if you want to poke at it or tell me which rule is wrong: [https://github.com/TiltedLunar123/triagelens](https://github.com/TiltedLunar123/triagelens)\n\nBuilt with React, TypeScript, Vite, and vitest for the rule tests. 
Happy to take detection ideas, that's the part I most want to grow.", "url": "https://wpnews.pro/news/i-let-the-ai-write-the-report-not-decide-the-alerts", "canonical_source": "https://dev.to/tiltedlunar123/i-let-the-ai-write-the-report-not-decide-the-alerts-593o", "published_at": "2026-05-31 10:45:18+00:00", "updated_at": "2026-05-31 11:12:32.128963+00:00", "lang": "en", "topics": ["ai-tools", "ai-safety", "artificial-intelligence", "ai-products", "ai-agents"], "entities": ["TriageLens", "Windows Security 4688", "Sysmon Event 1", "Linux SSH", "Ollama", "Claude", "MITRE"], "alternates": {"html": "https://wpnews.pro/news/i-let-the-ai-write-the-report-not-decide-the-alerts", "markdown": "https://wpnews.pro/news/i-let-the-ai-write-the-report-not-decide-the-alerts.md", "text": "https://wpnews.pro/news/i-let-the-ai-write-the-report-not-decide-the-alerts.txt", "jsonld": "https://wpnews.pro/news/i-let-the-ai-write-the-report-not-decide-the-alerts.jsonld"}}