I looked for some material that might make it easier to move this forward:
I think the abstraction is useful, but I would read it less as “prediction” and more as consequence-aware action admission.
The concrete unit here seems to be the action-preflight SYLLOG, with ORCA as the broader runtime architecture around it. In that reading, the problem is not agent freedom itself. The problem is unqualified candidate actions entering execution: actions that are underspecified, off-target, too broad, externally consequential, irreversible, or not yet authorized.
So the strongest version of the idea, for me, is something like:
before a candidate action becomes executable,
make the target, scope, missing inputs, constraints, side effects,
reversibility, uncertainty, and likely consequences explicit,
then route the action to proceed / clarify / revise / approve / escalate / block
That framing also makes “precognition” feel less like extra mystical reasoning and more like a reusable preflight / admission contract for agent actions.
A useful decision-tree reading might be:
internal + reversible
→ lightweight trace
useful but underspecified
→ clarify before execution
useful but too broad
→ narrow / revise
external / private / side-effecting / delegated
→ consent / authorization / approval / escalation
risky but repairable
→ generate a safer alternative and re-enter preflight
high-impact or disallowed
→ block before execution
conditions satisfied
→ execute with an audit trace
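To make the branches above easier to inspect, here is a minimal sketch of the same decision tree as a plain function. Everything in it is an assumption for illustration: the `Candidate` fields, the `Route` names, and especially the branch order (hard stops first, then the "fix it" branches) are my reading, not something the SYLLOG defines.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    TRACE = "lightweight trace"
    CLARIFY = "clarify before execution"
    REVISE = "narrow / revise"
    APPROVE = "consent / authorization / approval / escalation"
    SAFER_ALTERNATIVE = "generate a safer alternative and re-enter preflight"
    BLOCK = "block before execution"
    EXECUTE = "execute with an audit trace"


@dataclass
class Candidate:
    # Hypothetical flags; a real preflight would derive them from the action.
    internal: bool = True
    reversible: bool = True
    underspecified: bool = False
    too_broad: bool = False
    needs_authorization: bool = False  # external / private / side-effecting / delegated
    authorized: bool = False
    risky_but_repairable: bool = False
    disallowed: bool = False


def route(c: Candidate) -> Route:
    """Map a candidate action onto one branch of the decision tree."""
    if c.disallowed:
        return Route.BLOCK
    if c.underspecified:
        return Route.CLARIFY
    if c.too_broad:
        return Route.REVISE
    if c.risky_but_repairable:
        return Route.SAFER_ALTERNATIVE
    if c.needs_authorization and not c.authorized:
        return Route.APPROVE
    if c.internal and c.reversible:
        return Route.TRACE
    # Remaining case: consequential, but every condition is satisfied.
    return Route.EXECUTE
```

For example, `route(Candidate(underspecified=True))` lands on `Route.CLARIFY`, while an external action that has already been authorized falls through to `Route.EXECUTE`.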
The thing I like about this framing is that it does not require every preflight step to be a full LLM deliberation. A lot of useful preflight can be cheap and structural:
schema / required fields
→ target / scope / destination
→ side-effect class
→ reversibility / idempotency / compensation
→ consent / authorization / policy
→ SYLLOG only for ambiguous or consequential cases
→ human approval only for high-impact cases
That may be a practical bridge between “be careful” prompts and heavy runtime safety systems.
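A rough sketch of that cheap-first ordering, assuming actions are plain dicts and each check returns either a verdict or `None`. The check functions, the `"deliberate"` sentinel, and the `side_effect` labels are hypothetical names I made up for this example; the `deliberate` callback stands in for whatever heavier preflight (such as the SYLLOG) sits behind the cheap layer.

```python
from typing import Callable, Optional

Action = dict  # e.g. {"tool": "send_email", "args": {...}, "required": [...]}
Check = Callable[[Action], Optional[str]]


def check_required_fields(action: Action) -> Optional[str]:
    """Structural check: are all declared required arguments present?"""
    args = action.get("args", {})
    missing = [name for name in action.get("required", []) if name not in args]
    return "clarify" if missing else None


def check_side_effect(action: Action) -> Optional[str]:
    """Side-effect class check; the labels are assumed to come from a tool registry."""
    effect = action.get("side_effect")
    if effect == "irreversible":
        return "approve"      # high impact: goes to a human
    if effect == "external":
        return "deliberate"   # consequential: hand over to the heavier preflight
    return None


def preflight(action: Action, checks: list, deliberate: Callable[[Action], str]) -> str:
    """Run cheap checks in order; only call `deliberate` when a check asks for it."""
    for check in checks:
        verdict = check(action)
        if verdict == "deliberate":
            return deliberate(action)
        if verdict is not None:
            return verdict
    return "proceed"
```

The design point is only that the expensive step is a callback that most actions never reach.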
# How I would position the SYLLOG
I would separate three things:
| Layer | Role | Example output |
| --- | --- | --- |
| Awareness | Make the candidate action and its likely consequences visible | “This sends data to an external recipient.” |
| Admission | Decide whether the action may enter execution | proceed / clarify / revise / approve / block |
| Enforcement | Make that decision non-optional at runtime | tool wrapper, policy engine, HITL, sandbox, OS-level policy |
The action-preflight SYLLOG seems strongest as the cognitive contract between awareness and admission.
I would not describe it as replacing guardrails, authorization, sandboxing, tracing, or HITL approval. I would describe it as something that can feed those layers with structured evidence:
candidate action
+ intended goal
+ context
+ constraints
+ uncertainty
+ affected entities
+ side effects
+ safer alternatives
+ continuation decision
That also helps avoid a common ambiguity: “guardrail” can mean input filtering, output moderation, tool-call validation, human approval, policy enforcement, or audit logging. The interesting boundary here is narrower:
candidate action → execution
This is different from final-answer moderation. It is also different from post-hoc observability. The point is to prevent the wrong action from becoming executable too early.
# Existing hook points and neighboring layers
I would not treat these as replacements for the SYLLOG. I would treat them as useful integration targets or neighboring layers.
OpenAI Agents SDK: tool guardrails / human review
OpenAI’s Agents docs separate guardrails for input, output, and tool behavior. The tool-guardrail docs are especially relevant because input tool guardrails run before a function tool executes and can skip the call, replace the output, or raise a tripwire: OpenAI Agents SDK guardrails.
The API guide also frames guardrails and human review as mechanisms that can continue, pause, or stop a workflow, and notes that side-effecting tools should be validated close to where the side effect happens: OpenAI guardrails and human review.
Possible mapping:
| SYLLOG / admission concept | OpenAI-ish runtime concept |
| --- | --- |
| proceed | continue / allow tool call |
| clarify | stop before execution and ask |
| revise | replace tool output / modify tool arguments |
| escalate | human review |
| block | tripwire / no execution |
The gap the SYLLOG could fill is: what should the tool guardrail know before making that decision?
LangGraph HITL: approve / edit / reject / respond
LangGraph’s HITL middleware is a useful vocabulary because it treats tool-call review as a runtime interrupt with decisions like `approve`, `edit`, `reject`, and `respond`: LangGraph human-in-the-loop docs.
That maps naturally to a preflight decision tree:
| Preflight state | HITL-style mapping |
| --- | --- |
| safe as-is | approve |
| useful but too broad / wrong args | edit |
| missing information | respond / ask user |
| unsafe or disallowed | reject |
| high impact | interrupt for approval |
This is why I would not reduce the abstraction to allow/block. The useful middle states are important.
Claude Code hooks: cheap pre-tool checks
Claude Code has `PreToolUse` hooks that run before tool execution. Hooks can return permission decisions such as `allow`, `ask`, `deny`, or `defer`, and the docs show examples of blocking destructive shell commands: Claude Code hooks reference.
This is a good example of a low-cost enforcement hook:
agent proposes tool call
→ PreToolUse hook checks command / path / args / policy
→ allow / ask / deny / defer
A SYLLOG could be heavier than this, but it does not have to replace this layer. It can sit above it, or only run when the cheap hook says “ambiguous” or “consequential”.
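As a sketch of how cheap such a check can be, here is a plain function that looks at a proposed shell command and returns one of the decision words. It is deliberately not wired to the real hook input/output format; the tool name `"Bash"`, the `command` key, and the command lists are assumptions for illustration. It also only inspects the first token, so something like `sudo rm -rf` or a chained command would slip through, which is exactly the kind of gap a heavier layer would need to cover.

```python
import shlex

NETWORK_COMMANDS = {"curl", "wget", "ssh", "scp"}  # illustrative, not exhaustive


def _is_recursive_rm(tokens: list) -> bool:
    """True for `rm` invoked with a recursive flag (-r, -R, -rf, --recursive)."""
    if tokens[0] != "rm":
        return False
    for flag in tokens[1:]:
        if flag == "--recursive":
            return True
        # Short flag clusters such as -r, -R, -rf, -fr.
        if flag.startswith("-") and not flag.startswith("--") and "r" in flag.lower():
            return True
    return False


def pre_tool_check(tool_name: str, tool_input: dict) -> str:
    """Return 'allow', 'ask', or 'deny' for a proposed tool call."""
    if tool_name != "Bash":
        return "allow"  # this sketch only inspects shell commands
    try:
        tokens = shlex.split(tool_input.get("command", ""))
    except ValueError:
        return "ask"  # unparseable quoting: do not guess
    if not tokens:
        return "allow"
    if _is_recursive_rm(tokens):
        return "deny"
    if tokens[0] in NETWORK_COMMANDS:
        return "ask"  # reaches outside the machine
    return "allow"
```

In this reading, an `ask` result is the natural place to hand the action to a heavier preflight instead of going straight to the user.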
MCP consent / authorization
The MCP specification says tools represent arbitrary code execution and should be treated with caution. It also says hosts must obtain explicit user consent before invoking tools, and users should understand what each tool does before authorizing it: MCP specification.
This gives a nice connection point:
MCP consent / authorization asks:
“May this tool/action run?”
Action-preflight helps answer:
“What exactly is this action, what does it touch, and what is the user consenting to?”
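One small way to picture the second question is a function that turns a structured action description into the text a user actually consents to. This is only a sketch: the parameter names and the prompt layout are hypothetical, and it does not use any MCP API.

```python
def consent_prompt(tool: str, target: str, data_classes: list, reversible: bool) -> str:
    """Render what the user is being asked to consent to, in plain text."""
    lines = [
        f"Tool: {tool}",
        f"Target: {target}",
        f"Data involved: {', '.join(data_classes) or 'none declared'}",
        f"Reversible: {'yes' if reversible else 'no'}",
        "Allow this action? [y/N]",
    ]
    return "\n".join(lines)
```

Calling `consent_prompt("send_email", "alice@example.com", ["customer list"], False)` produces a five-line prompt that names the recipient and the data before asking for a decision.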
OPA / policy-as-code
Open Policy Agent now explicitly describes AI-agent use cases: enforcing fine-grained policies over which tools an AI agent may call, what parameters are permitted, and how those tools can be used: Open Policy Agent.
I would separate responsibilities like this:
SYLLOG:
candidate action model
uncertainty
consequence awareness
alternatives
human-readable rationale
Policy engine:
deterministic allow / deny / require approval / step-up decision
That makes the SYLLOG useful even when enforcement is handled by something else.
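A minimal sketch of that split, assuming the preflight hands over a structured record and a separate deterministic function makes the call. The `PreflightEvidence` fields loosely follow the list above, but the exact names, the `uncertainty` scale, the `"external_send"` label, and the rules in `policy_decision` are all my assumptions, not SYLLOG or OPA definitions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PreflightEvidence:
    """Structured artifact produced by the cognitive preflight (hypothetical shape)."""
    candidate_action: str
    intended_goal: str
    affected_entities: list = field(default_factory=list)
    side_effects: list = field(default_factory=list)
    uncertainty: float = 0.0  # assumed scale: 0.0 = confident, 1.0 = unknown
    safer_alternatives: list = field(default_factory=list)
    rationale: str = ""


def policy_decision(ev: PreflightEvidence) -> str:
    """Deterministic rules over the evidence; no model call happens here."""
    if "external_send" in ev.side_effects and not ev.affected_entities:
        return "deny"  # sending externally without a known recipient
    if "external_send" in ev.side_effects:
        return "require_approval"
    if ev.uncertainty > 0.5:
        return "step_up"
    return "allow"


evidence = PreflightEvidence(
    candidate_action="send_email(to='alice@example.com')",
    intended_goal="share the weekly report",
    affected_entities=["alice@example.com"],
    side_effects=["external_send"],
    rationale="Report leaves the organization.",
)
# The same record can be serialized and passed to an external policy engine.
payload = json.dumps(asdict(evidence))
```

Here `policy_decision(evidence)` returns `"require_approval"`, and the rationale travels with the decision for whoever has to approve it.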
# Framework issues that suggest this is a recurring integration need
These do not prove that this SYLLOG is the answer. They are useful signals that people building agent frameworks are hitting nearby execution-boundary problems.
OpenAI Agents Python #2970 proposes pre-execution validation for tool calls: tool name, parameters, calling agent/context, target system, validity window, nonce/replay protection, and rejection before execution: issue #2970.
OpenAI Agents Python #2868 proposes per-tool authorization middleware. The issue distinguishes content guardrails from permission checks and proposes decisions like `ALLOW`, `DENY`, `MODIFY`, `DEFER`, and `STEP_UP`: issue #2868.
OpenAI Agents Python #2515 asks for tool-execution governance: policy enforcement, threat detection, audit trails, and tool-call controls beyond input/output guardrails: issue #2515.
AutoGen #7405 proposes a `GuardrailProvider` protocol for pre-execution interception, policy-based approval, audit logging, and argument sanitization: AutoGen issue #7405.
Haystack #10821 asks for automated tool-call policy enforcement beyond human confirmation, including rate limits, argument validation, scope restrictions, and audit logging: Haystack issue #10821.
I would not cite these as “this already exists.” I would cite them as evidence that the hook point is a real need:
model proposes action
→ runtime needs a place to validate / revise / approve / block
→ SYLLOG could provide the reusable cognitive preflight that informs that decision
# Low-cost implementation vocabulary from outside LLM agents
Some of the most useful vocabulary may come from older, non-LLM design disciplines.
Design by Contract
Design by Contract uses preconditions, postconditions, and invariants to specify when an operation may run and what must remain true: Eiffel Design by Contract.
For agent actions, the analogy is:
tool/action preconditions:
- required inputs are present
- target is specified
- scope is bounded
- authority is available
- consent is satisfied
- side effects are classified
tool/action postconditions:
- expected state change is known
- audit trace is emitted
- rollback / compensation is known if applicable
invariants:
- do not leak secrets
- do not mutate outside allowed scope
- do not contact external parties without authorization
This is a useful way to phrase the admission rule:
A tool/action should not execute merely because the model generated it.
It should execute only when its preconditions are satisfied.
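The admission rule can be sketched as a wrapper that refuses to run an action until its named preconditions hold. This covers only three of the preconditions listed above, and the action dict keys (`args`, `required`, `target`, `authorized`) are hypothetical.

```python
from typing import Callable

# Each precondition is a (name, predicate) pair over a plain action dict.
PRECONDITIONS = [
    ("required inputs are present",
     lambda a: all(k in a.get("args", {}) for k in a.get("required", []))),
    ("target is specified", lambda a: bool(a.get("target"))),
    ("authority is available", lambda a: bool(a.get("authorized", False))),
]


def unmet_preconditions(action: dict) -> list:
    """Names of the preconditions that do not hold for this action."""
    return [name for name, holds in PRECONDITIONS if not holds(action)]


def admit(action: dict, run: Callable[[dict], object]) -> object:
    """Execute `run(action)` only when every precondition is satisfied."""
    unmet = unmet_preconditions(action)
    if unmet:
        raise PermissionError("not admitted: " + ", ".join(unmet))
    return run(action)
```

Returning the names of the unmet preconditions, rather than a bare boolean, is what makes a clarify or revise branch possible instead of only a block.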
Source inspection / mistake-proofing
In quality engineering, source inspection / mistake-proofing tries to prevent defects by checking conditions before the process step, rather than only inspecting after the defect is produced. ASQ’s overview of mistake-proofing / poka-yoke is a good general reference: ASQ mistake-proofing.
For agents, this suggests a cheap preflight layer:
do not wait until the bad action has already become a tool call;
make the wrong action harder to admit into execution
Job Hazard Analysis
OSHA’s Job Hazard Analysis guide asks, for each task step, what can go wrong, what the consequence is, how it can happen, what contributes to it, and how likely it is: OSHA Job Hazard Analysis.
A lightweight agent version could be:
| JHA term | Agent equivalent |
| --- | --- |
| worker | agent |
| task step | candidate action |
| tool | API / file / shell / database / email |
| environment | user context / external state |
| hazard | off-target execution, privacy leak, irreversible mutation, wrong recipient |
| control | clarify, narrow, approve, sandbox, block |
This keeps the idea practical without requiring every case to become a large safety-engineering exercise.
Idempotency / compensation / reversibility
Stripe’s idempotency docs are useful for thinking about retry-safe operations: Stripe idempotent requests.
The Saga pattern is useful for thinking about compensating actions and non-compensable pivot steps: Azure Saga pattern.
For agent action classes, I would separate:
read-only
idempotent write
retry-safe write
compensable write
irreversible / non-compensable action
This matters because “risky” is too vague. A file edit with rollback, a SQL `DELETE`, an email send, a payment, and a workflow trigger should not all be handled by the same generic risk label.
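One way to make those classes concrete is an ordered enum with a default handling per class. This is a sketch under my own assumptions: the ordering, the handling strings, and the example classifications (for instance, treating a SQL `DELETE` as irreversible because no backup or open transaction is assumed) are illustrative, and reversibility is only one input to a real decision.

```python
from enum import IntEnum


class EffectClass(IntEnum):
    # Ordered from least to most consequential; the ordering is an assumption.
    READ_ONLY = 0
    IDEMPOTENT_WRITE = 1
    RETRY_SAFE_WRITE = 2
    COMPENSABLE_WRITE = 3
    IRREVERSIBLE = 4


def default_handling(effect: EffectClass) -> str:
    """Pick a default treatment from the effect class alone."""
    if effect == EffectClass.READ_ONLY:
        return "proceed"
    if effect <= EffectClass.RETRY_SAFE_WRITE:
        return "proceed with trace"
    if effect == EffectClass.COMPENSABLE_WRITE:
        return "proceed with a recorded compensation step"
    return "require approval"


# Illustrative classifications; real ones depend on the environment.
EXAMPLES = {
    "SQL SELECT": EffectClass.READ_ONLY,
    "file edit with rollback": EffectClass.COMPENSABLE_WRITE,
    "SQL DELETE": EffectClass.IRREVERSIBLE,  # assuming no backup or transaction
    "email send": EffectClass.IRREVERSIBLE,
}
```

With this in place, a file edit with rollback and an email send get different defaults even though both might have been labeled "risky".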
# Research neighbors, with limits
I would use these as neighboring references, not as substitutes.
AEGIS: pre-execution firewall / audit layer
AEGIS frames the issue as tool calls with real side effects: database queries, shell commands, file read/write, network requests. It argues that post-execution observability can record what happened but cannot prevent side effects before they occur, so it proposes a pre-execution firewall and audit layer: AEGIS paper.
This is close to the enforcement side. The SYLLOG seems closer to the cognitive preflight that can feed such enforcement.
OAP: deterministic pre-action authorization
“Before the Tool Call” / Open Agent Passport frames the gap as a pre-action authorization problem and proposes deterministic policy enforcement before individual tool calls: OAP paper.
This is closer to authorization. The SYLLOG could provide structured action understanding before such authorization decisions.
ToolSafe / TS-Flow: proactive step-level guardrails
ToolSafe studies tool invocation safety at the step level and introduces proactive intervention before unsafe execution: ToolSafe paper.
This is relevant because it treats safety as something that happens during the action trajectory, not only at final output time.
TraceSafe: mid-trajectory evaluation
TraceSafe argues that as LLMs move from chatbots to autonomous agents, the vulnerability surface shifts from final outputs to intermediate execution traces: TraceSafe paper.
This is useful evaluation vocabulary: action-preflight should probably be evaluated at the action / trajectory level, not only by final task success.
ActPlane: tool-layer coverage is not enough
ActPlane points out that tool-call guardrails can miss system actions that bypass the tool layer, while OS sandboxes often lack semantic feedback: ActPlane paper.
This is a useful caution: classify by side effect / state change, not only by tool name.
Capability gates are not authorization
“Capability Gates Are Not Authorization” argues that exposing or hiding tools is not the same as authorizing a particular action with particular values in context: paper.
This connects to the need for per-call action admission.
# A small adapter/eval matrix that might make the idea easier to inspect
A small demo may be more useful than a large benchmark at first.
The Action Preflight quickstart already suggests that `decision.action-preflight-forecast` can be called as a standalone skill and that `outputs.continuation_decision`, `outputs.human_readable`, and `outputs.safer_alternatives` are the main stable outputs to inspect: Action Preflight quickstart.
The external guide also points to freeze / reproducibility material: Action Preflight external guide.
A small matrix might make the behavior easier for framework builders to understand:
Rows:
1. read-only search
2. internal note
3. external email
4. file write inside workspace
5. file write outside workspace
6. SQL SELECT
7. SQL DELETE / UPDATE
8. private-data export
9. workflow trigger
10. delegation to a sub-agent
Suggested columns:
- intended_goal
- candidate_action
- missing_inputs?
- target / destination / scope
- side_effect_class
- reversible / idempotent / compensable?
- consent / authorization needed?
- cheap structural decision
- SYLLOG continuation_decision
- runtime mapping: execute / clarify / revise / approve / block
The main thing to inspect would not only be “does it block scary actions?”. I would look for these behaviors:
| Case | Desired behavior |
| --- | --- |
| missing required input | clarify, do not infer silently |
| too-broad action | narrow / revise |
| wrong target or destination | block or ask |
| private data export | require consent / authorization |
| irreversible action | approval / escalation |
| repairable risky action | safer alternative |
| clean low-risk action | proceed without unnecessary friction |
That would also help show that preflight does not have to be all-or-nothing.
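A matrix like this can start as a few dozen lines. The sketch below uses four of the rows and compares an expected decision against whatever a `decide` function returns. The case fields, the expected values, and `stub_decide` are hypothetical; `stub_decide` is only a stand-in for the preflight under test, not the Action Preflight skill itself.

```python
CASES = [
    {"name": "read-only search", "expected": "proceed"},
    {"name": "external email", "missing_inputs": ["recipient"], "expected": "clarify"},
    {"name": "SQL DELETE / UPDATE", "scope": "unbounded", "expected": "revise"},
    {"name": "private-data export", "private_data": True, "expected": "approve"},
]


def stub_decide(case: dict) -> str:
    """Placeholder for the preflight under test; replace with a real adapter."""
    if case.get("missing_inputs"):
        return "clarify"
    if case.get("scope") == "unbounded":
        return "revise"
    if case.get("private_data"):
        return "approve"
    return "proceed"


def run_matrix(cases: list, decide) -> list:
    """Return (name, expected, observed) for every case."""
    return [(c["name"], c["expected"], decide(c)) for c in cases]


def failures(results: list) -> list:
    """Rows where the observed decision differs from the expected one."""
    return [row for row in results if row[1] != row[2]]
```

Running the same `CASES` against an "always proceed" baseline gives three failures, which is a quick way to show that the middle states carry information that allow/block alone would lose.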
# Cautions I would keep visible for future readers
- Do not classify risk only by tool name
A “safe” tool can still produce an unsafe state change. A “dangerous” side effect can sometimes be reached through a different tool path. This is one reason I would classify by:
state touched
external destination
side effect
reversibility
authority
data flow
not only by:
tool name
- Schema validation is necessary but not sufficient
Strict schemas and required fields are useful. But they do not answer every important question.
recipient: valid email
does not mean:
recipient: correct person authorized for this data
Likewise:
path: valid string
does not mean:
path: within allowed scope and safe to modify
This is one place where the SYLLOG can add value above structural validation.
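The path case is easy to show concretely. A string can pass schema validation and still escape the workspace, so a scope check has to resolve it first. This is a minimal sketch using only `pathlib`; the function name and the `/workspace` root are illustrative, and it checks containment only, not whether the file is safe to modify.

```python
from pathlib import Path


def within_scope(candidate: str, allowed_root: str) -> bool:
    """True if `candidate`, resolved against `allowed_root`, stays inside it."""
    root = Path(allowed_root).resolve()
    # Joining an absolute candidate discards the root, so "/etc/passwd"
    # resolves to itself and fails the containment test below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

Both `"../etc/passwd"` and `"/etc/passwd"` are valid strings, and both are rejected for a `/workspace` root, while `"notes/todo.md"` is accepted.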
- Human approval is useful, but not a universal answer
Human approval should probably be reserved for high-impact, ambiguous, or irreversible cases. If every action asks for approval, users may stop reading the prompts carefully.
So the path I would try first is:
cheap structural checks
→ SYLLOG when ambiguous / contextual / consequential
→ human approval only for high-impact or policy-required cases
- Preflight, authorization, sandboxing, and tracing are complementary
I would keep these separate:
| Mechanism | Main job |
| --- | --- |
| preflight | decide whether candidate action is ready to execute |
| authorization | decide whether this actor may perform this action |
| sandboxing | limit blast radius if execution occurs |
| tracing | record what happened and why |
| HITL | bring a human into uncertain/high-impact branches |
| policy engine | enforce deterministic rules |
The SYLLOG seems useful because it can produce the structured cognitive artifact that the other layers consume.
- Repo metrics should be treated as inspectable material, not as universal proof
The reproducibility docs are useful, but I would avoid overclaiming from them without independent reruns in other stacks. For adoption, I would emphasize the adapter/eval matrix and concrete integration path more than headline numbers.
So my current best reading is:
The action-preflight SYLLOG is not a replacement for guardrails,
authorization, sandboxing, tracing, or HITL.
It is a reusable cognitive contract that can feed those layers.
It makes candidate actions explicit before execution,
then lets the runtime route them to proceed, clarify, revise,
approve, escalate, or block.
That seems like a practical abstraction: not “agents predicting the future,” but agent actions earning admission into execution.