Show HN: An LLM agent that emits typed intent

A developer released an open-source reference implementation of an LLM agent that emits typed intents using a domain vocabulary, separating intent from execution. The project introduces the Structured Intent Format (SIF) as a typed contract, with a deterministic layer that validates and executes intents without exposing the LLM to storage primitives. It includes working examples in legal and vet domains, exploring ontology-driven architectures for business applications.

What this is. A reference implementation of a pattern, intent-execution separation for LLM agents: the model emits typed intent in domain vocabulary and never touches the substrate, while a deterministic layer validates, translates, scopes, and executes it.

The project ships two deliverables, in deliberate priority order:

- Primary: SIF and the intent-execution separation pattern it makes concrete. The typed intent contract, and the deterministic layer beneath it that validates, translates, scopes, and executes. This is the load-bearing contribution; the rest of the repo exists to give it a worked, running instance. See "SIF is the core".
- Secondary: the experimental "application as declarative content" approach. A working, agent-driven business application assembled per domain from three declarative artifacts (an ontology, which is a typed domain model in plain YAML; a set of business rules, plain text the agent reads; and an agent prompt, persona + interaction protocol) over a small `domain.json` descriptor, with no hand-written service layer. Whether a genuinely useful application can emerge from "ontology + rules + prompt" instead of imperative code is the open question; the legal and vet domains are the evidence offered. This builds on SIF and is deliberately lower-priority. Detail: "Where does the business logic live?".

Full statement of the pattern: docs/design-intent-principle.md.

Experimental, non-academic research project. Explores the boundaries of what LLMs can do as a business logic layer in ontology-driven architectures. Every hypothesis here is tested empirically against the example application shipped in this repo: the `legal` demo domain serves as the experimental harness, not as a showcase demo. Conclusions reached only by argument, without a running example to back them up, don't count.

How to read this. If you want to see the design running (file layout, demo domains, REST surface, what the agent does in practice), keep reading this README and explore the `domains/` folder.
If you want the design rationale (the tool-surface problem, the vocabulary discipline, why state machines live where they do), read docs/design-sif.md. Both views describe the same system from opposite ends.

Contents:

- SIF is the core
- See it in action
- Beyond business applications
- How it works
- Where does the business logic live?
- Tech stack
- Prerequisites
- Quickstart
- Configuration
- REST surface
- Adding a new domain
- Tests
- Windows notes
- Design docs
- License

SIF is the core

SIF (Structured Intent Format) is a typed intent contract over ontology entities. It is the load-bearing piece of this project; everything else is built around it.

- The LLM emits structured intents (`find`, `create`, `update`, `delete`, `link`, `unlink`, `transition`) using only the ontology vocabulary of the active domain: classes, properties, relations, transitions, value sets. It never sees SQL, physical table names, user IDs, or any storage primitive.
- The framework validates every intent against the ontology, then routes it to whichever adapter can fulfil it. Adapters live behind the federation SPI (`DataSource` interface) and speak ontology types, not storage primitives.
- An adapter can be SQL, document, key-value, REST, an internal method call, a vector store: anything that can fulfil ontology-level intents. The LLM never knows, and need not know, which adapter served a given entity.

Today two adapters ship: `SqlDataSource` on PostgreSQL (full CRUD plus transitions) and `MongoDataSource` on MongoDB (`find` and `create`; the remaining write verbs, `update` / `delete` / `link` / `unlink` / `transition`, are planned and currently return a recoverable "not supported"). The legal demo exercises the Mongo read path. The storage layer is a private implementation detail behind the SIF contract, not the framework's identity.
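The validation step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's validator: the class name `IntentValidationSketch`, the `Op` record, and the tiny in-memory ontology slice are all hypothetical, and only the seven verbs and the entity and property names are taken from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: vocabulary check for one SIF-style operation. */
final class IntentValidationSketch {

    /** The seven verbs listed in the text. */
    static final Set<String> VERBS =
            Set.of("find", "create", "update", "delete", "link", "unlink", "transition");

    /** Illustrative ontology slice (assumed): entity name -> declared property names. */
    static final Map<String, Set<String>> ONTOLOGY = Map.of(
            "Lawyer", Set.of("barNumber"),
            "MatterNote", Set.of("createdDate", "aboutMatter"));

    /** Minimal op shape for the sketch: verb, entity, equality filters. */
    record Op(String op, String entity, Map<String, String> filters) {}

    /** Returns the problems found; an empty list means the op stays inside the vocabulary. */
    static List<String> validate(Op op) {
        List<String> problems = new ArrayList<>();
        if (!VERBS.contains(op.op())) {
            problems.add("unknown op: " + op.op());
        }
        Set<String> properties = ONTOLOGY.get(op.entity());
        if (properties == null) {
            problems.add("unknown entity: " + op.entity());
            return problems; // property checks need a known entity
        }
        op.filters().keySet().stream().sorted()
                .filter(name -> !properties.contains(name))
                .forEach(name -> problems.add("unknown property: " + op.entity() + "." + name));
        return problems;
    }

    public static void main(String[] args) {
        // In-vocabulary op: no problems reported.
        System.out.println(validate(new Op("find", "Lawyer", Map.of("barNumber", "SBA-2026-1042"))));
        // A SQL-flavoured attempt names neither a declared verb nor a declared entity.
        System.out.println(validate(new Op("select", "lawyers", Map.of())));
    }
}
```

The point of the sketch is that rejection happens on names alone, before any adapter or storage detail is involved.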
Adding a method-call, REST, or vector-store adapter is a matter of writing a new `DataSource` implementation; the SIF surface, the ontology, and the LLM prompt stay unchanged.

That separation buys two things at once:

Trust inversion. The LLM is the natural-language intent router; the framework is the trusted executor. The LLM is sandboxed by the JSON-Schema-bounded SIF grammar: it can express only declared verbs and ontology names, never SQL, shell, URLs, or any storage primitive, so a prompt-injected or hallucinating model has no expressible way to reach the substrate directly. Authorization is a separate concern: because everything below the SIF surface runs deterministically, there is one place after the LLM to enforce it. The demo ships a deliberately simple owner-scoping example there; a real RBAC/IdP plugs in at the same seam (see "Where authorization fits" below). That example is not an authorization model.

Backend neutrality. The LLM expresses what it wants; the framework decides how to fulfil it. No prompt change, no ontology change, no SIF change when an entity moves from a SQL table to a service method behind a REST endpoint.

A single ontology file in the project's compact YAML format is the source of truth for each domain. "Ontology" is informal shorthand throughout: the vocabulary is OWL-inspired, but what ships is a typed domain model in plain YAML, not an OWL ontology in the strict semantic-web sense: no reasoning, no SPARQL, no equivalence/restriction axioms, no `subClassOf`-driven inference. See docs/design-sif.md ("About the word ontology") for what is kept from that tradition and what is dropped.

Two domains ship. `legal`, a law-firm matter manager covering matters, conflict checks, documents, billable time, invoicing and hearings, is the rich proof-of-concept.
`vet`, a veterinary clinic (owners, pets, appointments, visits, treatments, billing), is a deliberately simple, SQL-only second domain: the same engine serving a different ontology, with nothing but declarative content swapped. Both run a single managing identity that sees the whole domain.

The shipping legal demo routes one ontology to two substrates. Most entities (`Lawyer`, `Paralegal`, `Client`, `Matter`, `TimeEntry`, `Invoice`, `Hearing`, `ConflictCheck`) live in PostgreSQL through `SqlDataSource`. The `MatterNote` entity, free-form working memos attached to matters, lives in MongoDB through `MongoDataSource`. The LLM sees one unified ontology and emits the same shape of SIF intent for both:

    {
      "operations": [
        { "op": "find", "entity": "Lawyer", "filters": { "barNumber": "SBA-2026-1042" } },
        { "op": "find", "entity": "MatterNote", "filters": { "createdDate": "2026-06-12" } }
      ]
    }

How the framework dispatches it:

| Entity | Adapter | What the adapter does |
|---|---|---|
| `Lawyer` | `SqlDataSource` | Translates to `SELECT … FROM lawyers WHERE bar_number = ?` against Postgres. When a domain declares a scoping identity, an identity predicate is injected here too; the current single-role demos declare none (see "Where authorization fits"). |
| `MatterNote` | `MongoDataSource` | Translates the intent to a MongoDB query against the `matter_notes` collection. In the demo this collection is unscoped: notes are shared working memos. The adapter can scope a collection per session when its mapping declares an `identityField` (injected into every query); the legal demo's `MatterNote` mapping declares none, so every session sees all notes. |

The LLM never knows one entity lives in a relational store and another in a document store. It emits the same shape of intent for both. The framework picks the adapter per resolved op's entity class via `DataSourceRegistry.findFor(entity)`.
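The per-entity dispatch described above can be pictured with a small lookup table. The sketch below is an assumption-laden illustration, not the project's code: `SketchDataSource`, `SketchRegistry`, and `dispatchFind` are hypothetical names, and the real `DataSource` and `DataSourceRegistry` signatures are not shown in this document.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of per-entity adapter lookup; not the project's actual classes. */
final class AdapterRoutingSketch {

    /** Stand-in for an adapter: it receives entity names and filters, never storage primitives. */
    interface SketchDataSource {
        String describeFind(String entity, Map<String, String> filters);
    }

    /** Maps each ontology class to the adapter that serves it. */
    static final class SketchRegistry {
        private final Map<String, SketchDataSource> byEntity = new HashMap<>();

        void register(String entity, SketchDataSource source) {
            byEntity.put(entity, source);
        }

        Optional<SketchDataSource> findFor(String entity) {
            return Optional.ofNullable(byEntity.get(entity));
        }
    }

    /** Mirrors the legal demo's split: SQL for Lawyer, Mongo for MatterNote. */
    static SketchRegistry demoRegistry() {
        SketchRegistry registry = new SketchRegistry();
        registry.register("Lawyer", (entity, filters) -> "sql find " + entity + " " + filters);
        registry.register("MatterNote", (entity, filters) -> "mongo find " + entity + " " + filters);
        return registry;
    }

    /** Routes one find op; the caller never names a backend. */
    static String dispatchFind(SketchRegistry registry, String entity, Map<String, String> filters) {
        return registry.findFor(entity)
                .map(source -> source.describeFind(entity, filters))
                .orElse("no adapter for " + entity);
    }

    public static void main(String[] args) {
        SketchRegistry registry = demoRegistry();
        System.out.println(dispatchFind(registry, "Lawyer", Map.of("barNumber", "SBA-2026-1042")));
        System.out.println(dispatchFind(registry, "MatterNote", Map.of("createdDate", "2026-06-12")));
    }
}
```

Moving an entity to another backend in this sketch means changing one `register` call; the caller of `dispatchFind` is untouched, which is the property the text attributes to the real registry.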
If tomorrow a third entity moved to a method-call adapter or a REST endpoint, only the adapter implementation would change; the SIF intent, the ontology, and the LLM prompt would stay identical.

Configuration is per-domain. The legal demo's `domain.json` declares two data sources (`primary: sql` for the relational entities, `documents: mongo` for `MatterNote`), and per-adapter mapping files (`mapping.yaml`, `mapping.mongo.yaml`) bind each class to its substrate. Adding a third data source is a config addition plus a third mapping file; the SIF surface, the LLM prompt, and the rest of the ontology stay unchanged.

That is the federation primitive: not "the framework can talk to Postgres or Mongo," but "the framework serves a single ontology from multiple substrates at the same time, transparently to the LLM."

Where this fits in the value story. The heterogeneous case is where ontology most visibly earns its keep: a unified semantic vocabulary above disparate substrates. Ontology also pulls weight even within a single adapter (typed LLM tool surface, pre-execution validation, lifecycle protection, cross-domain prompt reuse, audit framing); see docs/design-sif.md.

LLM agents are reliable with a handful of tools and less so as the count grows: each new tool is another choice the model can pick wrong on, another set of arguments it can hallucinate. SIF keeps the agent's per-turn choice small even as the domain gets bigger: seven op verbs and an ontology of typed names (entities, properties, relationships, transitions) injected into the tool schema as enums. The agent picks slots inside a small, typed grid; it doesn't navigate a sprawling tool catalog. That is the principle the whole architecture rests on (the LLM is an intent router, not an executor), and it is given its full treatment in docs/design-intent-principle.md.
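To make "injected into the tool schema as enums" concrete, here is a minimal sketch of how ontology names could become enum-constrained slots. The schema shape, the class name `ToolSchemaSketch`, and both method names are assumptions for illustration; the project's actual tool-schema builder is not shown in this document.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: turning ontology names into enum-constrained tool-schema slots. */
final class ToolSchemaSketch {

    /** The seven op verbs named in the text. */
    static final List<String> VERBS =
            List.of("find", "create", "update", "delete", "link", "unlink", "transition");

    /** Builds a JSON-Schema-like map for one operation; the exact shape is an assumption. */
    static Map<String, Object> operationSchema(List<String> entityNames) {
        return Map.of(
                "type", "object",
                "properties", Map.of(
                        "op", Map.of("type", "string", "enum", VERBS),
                        "entity", Map.of("type", "string", "enum", List.copyOf(entityNames))),
                "required", List.of("op", "entity"));
    }

    /** Reads the allowed values for one slot back out of the schema. */
    @SuppressWarnings("unchecked")
    static List<String> allowed(Map<String, Object> schema, String slot) {
        Map<String, Object> properties = (Map<String, Object>) schema.get("properties");
        Map<String, Object> slotSchema = (Map<String, Object>) properties.get(slot);
        return (List<String>) slotSchema.get("enum");
    }

    public static void main(String[] args) {
        Map<String, Object> schema = operationSchema(List.of("Lawyer", "Matter", "MatterNote"));
        // The verb list stays fixed; only the entity enum grows with the domain.
        System.out.println(allowed(schema, "op"));
        System.out.println(allowed(schema, "entity"));
    }
}
```

Adding an entity to the domain lengthens one enum rather than adding a tool, which is the "small, typed grid" the paragraph above describes.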
Beyond the two payoffs already described (trust inversion and backend neutrality), the same seam yields, with no per-feature work:

Audit & replay. Every operation is a structured, human-readable intent record. The on-disk session log is the audit trail.

Conversational atomicity. One user intention ("set up the cosigner and link her to my new loan") = one `submit_sif` batch = one transaction at the adapter. No bespoke saga/compensation code per use case.

Refactor-safe. Physical schema changes or moving an entity from a table to a service method touch the adapter layer, not the LLM prompt. The same ontology can target multiple physical schemas (e.g. multi-tenant with isolated DBs) or mixed backends.

Debuggable. When something breaks, the trace shows the LLM's original intent unchanged: the bug is either in the intent or strictly downstream of it. No "did the LLM hallucinate?" guessing.

The fair version of the critique: SIF's modifier vocabulary (`filters`, `aggregate`, `sort`, `limit`, `relations`, `resolve`) looks SQL-shaped. It is, because relational stores serve those modifiers uniformly, so the SQL-flavoured shape is the pragmatic surface for SQL-backed entities. What that observation misses:

SQL is one adapter, not the framework. A method-backed entity and a SQL-backed entity look identical to the LLM. The framework decides which adapter serves which class. The LLM never writes SQL, never reads SQL, never knows SQL is involved.

The verbs are backend-neutral. `find`, `create`, `update`, `delete`, `link`, `unlink`, `transition` are ontology-level intents, not relational operations. A REST adapter fulfils them with HTTP calls; a method-call adapter dispatches them to service methods.

The modifiers degrade gracefully. Non-SQL adapters honour the subset of modifiers their backend supports and reject the rest with a structured `FederationException`, the same recoverable refusal pattern the SQL translator already uses for "cannot build join path."

SIF is deliberately smaller than SQL.
No `GROUP BY`, no `HAVING`, no `UNION`, no window functions, no raw expressions, no subqueries outside `resolve`. SIF is not more expressive than SQL; it is less expressive, in exactly the dimensions where less is safer.

An established pattern in the same direction. GraphQL, OData `$filter`, DynamoDB ExpressionAttributes: all chose smaller-than-SQL typed DSLs for the same reason. The calling layer cannot be fully trusted, so the language it speaks must be one the server can mechanically validate, scope, and execute.

See it in action

The screenshots below are the legal demo running end to end; nothing is mocked. In each, the left pane is the business user's conversation; the right "under the hood" pane is the live SIF trace: the typed intent the LLM emitted, the SQL or Mongo query it was deterministically translated to, and the result handed back.

A matter's structured record lives in PostgreSQL; its free-form working notes live in MongoDB. The user neither knows nor cares which store holds what: they ask one question, and the framework fans a single batch of intent out to both substrates.

Prompt: In a single batch, look up two things for matter MTR-2023-001: the Matter record itself, and its working notes (entity MatterNote, where aboutMatter is MTR-2023-001).

One `submit_sif` batch carries two `find` ops. The trace shows the first op translated to `SELECT … FROM lgl_matters WHERE matter_number = :p1` against Postgres, and the conversation already shows the assembled answer: the matter record together with its three working notes.

Scrolling the same trace down to the second op: a green MONGO find against the `matter_notes` collection with query `{"matterId": "MTR-2023-001"}`, returning the notes from MongoDB. One intent, two substrates, and no SQL or Mongo syntax ever crosses the LLM boundary.

A single natural-language instruction that touches two entities. The agent resolves the matter and both staff members by name, then writes a billable time entry and a follow-up task as one atomic batch.
Prompt: On matter MTR-2023-001, log 3.5 hours of billable time for Derek Okafor reviewing the expert report, and open a task for Sofia Ramirez to assemble the exhibit bundle due 2024-04-12.

The conversation confirms both records were created. The trace opens with the resolution finds, locating the matter and the two staff members by name before any write happens.

The create intent itself: two ops, a `TimeEntry` and a `Task`, submitted together in a single `submit_sif` batch.

The deterministic translation: two `INSERT … VALUES (…, (SELECT lgl_matter_id …), (SELECT lgl_staff_id …))` statements. Foreign keys are resolved by sub-select from the names the user gave, and both inserts commit in one transaction: one instruction, one atomic unit of work.

Lifecycle changes are governed by a declared state machine, not by the model's judgement. Here the user asks for a transition that isn't legal from the matter's current state.

Prompt: Run the Engage transition on matter MTR-2023-001 now.

The LLM emits a `transition` intent, and the engine refuses it: current state 'active' is not in the legal source set (`conflict_check`). Engage is only legal from `conflict_check`, and this matter is already active. The model cannot override the guard; it receives the structured error and explains the refusal back to the user. The LLM proposes; the deterministic layer decides.

Beyond business applications

The shipping demo exercises a business-data domain (`legal`), but the construction is general. Anywhere an agent acts on a typed, structured surface (files, devices, messages, deployments, regulated records) the same shape applies: a typed intent surface declared up front, a deterministic translator from intent to effect, identity injection after the LLM, and, for anything with a side effect, a declared transition bound to a post-action. The LLM only ever emits SIF, never a shell command, an SMTP exchange, or a device API call.

Two concrete intents, in different domains:

Send an email: an action with a side effect.
The model drafts the message field by field with `create`, then fires it with a `transition`; both ops ride in one batch, so they commit together:

    {
      "operations": [
        { "op": "create", "entity": "Email", "data": { "to": "client@acme.com", "subject": "Engagement letter", "body": "Your engagement letter is ready to sign." } },
        { "op": "transition", "entity": "Email", "transition": "send" }
      ]
    }

`send` is a declared transition (draft → sent); its `smtpSend` post-action opens the connection and dispatches the message. The wire protocol, auth, and retries live in code the model never authors. And like the vehicle below, `send` can carry preconditions that run before the message leaves: a recipient allowlist, a content scan that blocks sensitive data in the body or attachments (DLP), or a human-approval gate. The model fills the fields; the framework decides whether the mail is allowed to go out.

Set a vehicle's speed: a safety-gated physical command. `Vehicle` is the entity; the target speed is one of its fields, and applying it is a gated transition. The session is scoped to its own vehicle, so the intent names no VIN: it just writes the setpoint, then asks to enact it. Writing the field is inert, and the model does not get to decide whether enacting is safe:

    {
      "operations": [
        { "op": "update", "entity": "Vehicle", "data": { "targetSpeedKph": 30 } },
        { "op": "transition", "entity": "Vehicle", "transition": "applySpeed" }
      ]
    }

Writing `targetSpeedKph` only records a wish; nothing moves. The `applySpeed` transition is where authority lives: it runs its declared preconditions first, namely `surroundingsClear` (sensor fusion), `withinPostedSpeedLimit` (map and sign data), and `withinDynamicLimits` (traction and road conditions), and only if all pass does the `commandPowertrain` post-action send the setpoint to the drive controller. If a gate fails, the transition is rejected with a structured error the agent reads and reasons about; it cannot override the gate, raise the limit, or reach the controller directly.
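The "preconditions first, post-action only if all pass" rule can be sketched as follows. This is a hypothetical illustration, not code from the repository: the `Vehicle` and `Outcome` records, the `controller` list standing in for the drive controller, and the two simplified gates are all assumptions; only the precondition and post-action names come from the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch of a gated transition: preconditions first, post-action only if all pass. */
final class GatedTransitionSketch {

    /** Assumed vehicle snapshot; the field names are illustrative. */
    record Vehicle(int targetSpeedKph, boolean surroundingsClear, int postedLimitKph) {}

    /** Structured outcome handed back to the agent. */
    record Outcome(boolean applied, String detail) {}

    /** Preconditions in declaration order, keyed by names used in the text. */
    static final Map<String, Predicate<Vehicle>> PRECONDITIONS = new LinkedHashMap<>();
    static {
        PRECONDITIONS.put("surroundingsClear", Vehicle::surroundingsClear);
        PRECONDITIONS.put("withinPostedSpeedLimit", v -> v.targetSpeedKph() <= v.postedLimitKph());
    }

    /** Runs every gate; the controller is touched only when none of them fails. */
    static Outcome applySpeed(Vehicle vehicle, List<Integer> controller) {
        for (Map.Entry<String, Predicate<Vehicle>> gate : PRECONDITIONS.entrySet()) {
            if (!gate.getValue().test(vehicle)) {
                return new Outcome(false, "precondition failed: " + gate.getKey());
            }
        }
        controller.add(vehicle.targetSpeedKph()); // stand-in for the commandPowertrain post-action
        return new Outcome(true, "setpoint sent: " + vehicle.targetSpeedKph());
    }

    public static void main(String[] args) {
        List<Integer> controller = new ArrayList<>();
        System.out.println(applySpeed(new Vehicle(30, true, 50), controller));
        System.out.println(applySpeed(new Vehicle(80, true, 50), controller));
        System.out.println("controller received: " + controller);
    }
}
```

In the sketch, the caller can change `targetSpeedKph` freely, but the only path to the controller runs through `applySpeed`, so a failed gate leaves the controller untouched.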
The agent supplies intent; the framework keeps authority over whether the car moves.

None of these domains ship (the demos run SQL + Mongo), but nothing about them is special-cased: each is a new adapter plus a few declared transitions behind the unchanged SIF surface. For brevity the enacting snippets elide the row-locating filters that `update` and `transition` carry in real SIF; here the target is the entity created in the same batch, or the session's identity-scoped row. For more, framed as where the architecture could go rather than what ships, see docs/design-bounded-agency.md.

Where authorization fits

Authorization is not a pillar of this project, and the framework does not ship an authorization model. What it offers is a seam and a simple demonstration of it.

Because the LLM emits typed intent and never touches the substrate, there is a single deterministic point, after translation and before execution, where authorization can be enforced. That is a genuinely useful place: the model can't read or remove what's applied there, and every operation that passes through is a structured, auditable record (verb, entity, typed args, session identity, transaction) rather than a shell transcript or pixel stream.

The framework provides that seam: a flat post-translation injector scopes a session to the rows it owns (e.g. a client seeing only their own matters). That is a teaching example, deliberately minimal, not a real authorization system. Roles, fine-grained read/write, delegation, and hierarchies are the domain of dedicated engines (Keycloak, OPA, an RBAC/ABAC service), and they plug in at exactly this seam. The framework does not try to reimplement them.

Caveat: the shipped demos don't exercise scoping today. Both legal and vet now run a single managing identity (`CaseManager` / `ClinicManager`) that sees the whole domain: their identity entity is referenced by no other table, so the injector adds no predicate.
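The injector behaviour described above, including the dormant case where no predicate is added, can be sketched like this. Everything here is a hypothetical stand-in: the `Query` record, the `inject` signature, the `matters` table, and the `client_id` column are assumptions, and the `WHERE` detection is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical sketch of post-translation identity scoping; not the project's IdentityInjector. */
final class IdentityInjectorSketch {

    /** A translated, parameterised query as the sketch models it. */
    record Query(String sql, List<Object> params) {}

    /**
     * Appends an owner predicate when the entity declares an identity column.
     * With no identity column the query passes through unchanged.
     */
    static Query inject(Query translated, Optional<String> identityColumn, String sessionUserId) {
        if (identityColumn.isEmpty()) {
            return translated; // dormant case: nothing to scope by
        }
        // Naive assumption: a single top-level WHERE clause, no subqueries.
        boolean hasWhere = translated.sql().toUpperCase(Locale.ROOT).contains(" WHERE ");
        String glue = hasWhere ? " AND " : " WHERE ";
        List<Object> params = new ArrayList<>(translated.params());
        params.add(sessionUserId); // bound as a parameter, never concatenated into the SQL
        return new Query(translated.sql() + glue + identityColumn.get() + " = ?", List.copyOf(params));
    }

    public static void main(String[] args) {
        Query translated = new Query("SELECT * FROM matters WHERE status = ?", List.of("active"));
        System.out.println(inject(translated, Optional.of("client_id"), "u-17"));
        System.out.println(inject(translated, Optional.empty(), "u-17"));
    }
}
```

Because the predicate is added after translation, nothing the model emits can include or exclude it; a dedicated authorization engine would replace the body of `inject`, not its position in the flow.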
The seam, the injector, and the `IdentityScope` mechanism are exactly as described; they're simply dormant in the current single-operator demos. A domain that declares an owner-scoped identity (e.g. a client, a pet owner) reactivates the live demonstration with no code change, only content.

What makes the seam worth having, whichever engine fills it: enforcement is below the LLM. Identity predicates and preconditions are applied by the deterministic layer after the model runs; the LLM neither sees nor can bypass them. That the LLM also can't emit effect code in the first place is the trust-inversion point from "SIF is the core".

There is also an approval seam between resolve and execute: the framework produces a structured plan there that could be gated (an interactive prompt, RBAC plus dry-run, cryptographic signoff). Nothing pauses there today; it is an extension point, not a feature.

How it works

Schema design pipeline (one-time per domain, fully deterministic: no LLM, no API key)

    <domain>.ontology.yaml
      → Planner (deterministic)         → <domain>_schema_plan.yaml
      → Builder (deterministic)         → module_<name>.json (per module)
      → Reconciler (deterministic)      → <domain>_schema.json
      → SchemaInstaller (deterministic) → CREATE TABLE … in Postgres
      → loadSeedSql (deterministic)     → INSERT … from the checked-in <domain>_seed_data.sql fixture

Every step is exposed under `POST /api/domains/{domain}/provision/...`.

SIF Engine (deterministic, LLM-free at runtime)

    SIF operation (ontology vocabulary)
      → SifParser   → typed Find / Create / Update / Link / Transition / ...
      → SifResolver → resolved against the OntologyModel
      → SifExecutor → picks the DataSource per resolved op's entity class
            ↓
    DataSource adapter (per ontology class)
      ─ SqlDataSource   → translator → IdentityInjector → JDBC → Postgres
      ─ MongoDataSource → translator → MongoDB collection (find + create)
      ─ (future) MethodCallDataSource → dispatch to a service method
      ─ (future) RestDataSource       → HTTP request to an external API

The SIF surface and the federation SPI (`DataSource` interface) speak ontology types. Each adapter privately decides how to fulfil the intent. Today `SqlDataSource` (full CRUD plus transitions) and `MongoDataSource` (find + create) ship; the others are concrete extension points the architecture already supports: same interface, same SIF contract, no LLM-facing changes when an adapter is added.

LLM Agent (one consumer of the SIF Engine)

    User question
      → ConversationAgent (Claude)
          ├── system prompt = ontology + business rules + persona
          ├── tool = submit_sif (entity/relation/transition names baked in)
          ├── loop: LLM → tool_use → SifExecutor → tool_result → LLM …
          └── final text reply

The agent speaks ontology vocabulary; everything below the SIF surface runs deterministically.

Where does the business logic live?

This section is the project's second deliverable (after SIF): the experiment of assembling a working application from declarative content (ontology + business rules + agent prompt) rather than from hand-written code. The rest of this section is the evidence for whether that holds up.

Switching domain does not switch applications. The sidebar's domain dropdown swaps the active ontology, its business rules, and the LLM persona, and nothing else. The same engine, REST API, JVM, and codebase serve every domain. Adding a new domain is purely a content addition: four small files describe the domain, and the agent immediately speaks that vocabulary.
Side effects that aren't pure content: the per-domain DB schema must be installed (the provisioner builds and runs it deterministically from the ontology), and any transition handler classes are Java implementations shipping alongside the ontology under `domains/<name>/_generated/`. Those are the only places real code lives.

A working domain in this repo carries very little code. Take `domains/legal/`:

    domains/legal/
    ├── domain.json          : descriptor (data sources, identity entities, file pointers)
    ├── legal.ontology.yaml  : entities, properties, relationships, transitions
    ├── legal.rules          : business rules in plain text (=== SECTION === headers)
    └── legal_prompt.txt     : agent persona + identity protocol

That's the whole per-domain surface: four small files of declarative content. The demos ship ontology-only; the physical schema and mapping are generated by provisioning. Everything else is produced by the framework:

- the physical schema (`CREATE TABLE` statements, FK declarations), generated deterministically from the ontology by the provisioner pipeline;
- the SQL the agent runs at runtime, produced per request by the translator from the resolved op + mapping;
- input validation that catches typos and out-of-vocabulary names before they reach the database, driven entirely by the ontology;
- the audit trail, identity scoping, and transaction control: generic, applied uniformly to every domain;
- demo data is a checked-in `seed_data.sql` fixture loaded deterministically in one transaction; beyond that baseline, records are created by talking to the agent, not by any LLM seed-generation step (the provisioner has no LLM call at all).

The exception is the demonstration transitions. Generated Java handlers under `domains/<domain>/_generated/` are wired to the precondition and post-action names declared in the YAML: `conflictCheckCleared`, `createEngagementLetter`, `notifyClientInvoiceIssued`, and so on.
In the demo these are no-op stubs: preconditions return success, post-actions trace the call and return. They exist to show where imperative business code plugs in (the registered-interceptor escape hatch described in docs/reference-sif-vocabulary.md), not as a worked implementation. Even so, the names that bind to Java live in the ontology; the Java is just the implementation slot behind a declarative entry.

What's worth noting once the pieces come together: the reason it works isn't that the LLM is unusually powerful. It's that:

- most of a domain (structure, lifecycle, constraints) really does fit in declarative form once you have the right primitives;
- the framework does a lot of generic lifting below the surface: type resolution, SQL translation, identity injection, transaction control;
- the LLM never writes code or SQL. It only picks ontology terms out of the typed grid the tool-schema builder injects, so what looks like business reasoning is actually a constrained selection task.

This isn't free. The LLM has cost and non-determinism, prompts need iteration, the ontology format is opinionated, and not every domain fits cleanly. But for the kind of problem this targets (an agent operating on real business data with rules, lifecycles, and audit trails) the amount of new code needed to add a working application stays small. That is the observation worth highlighting.
Tech stack

| Component | Technology |
|---|---|
| Backend | Java 21, Maven multi-module |
| Web framework | Spring Boot 3.3 (only in `ontocortex-app`) |
| Ontology | Compact YAML format, OWL-inspired vocabulary (Apache Jena 5 retained for the legacy TTL parser) |
| Federation SPI | `DataSource` interface in `ontocortex-federation`, backend-neutral |
| Shipping adapters | `SqlDataSource` on PostgreSQL 15 (JDBC) and `MongoDataSource` on MongoDB 7 (find + create); the SIF contract is the same for both |
| LLM | Claude (Anthropic Messages API via the JDK `HttpClient`) |
| Frontend | Vue 3 + Vite + TypeScript |
| Containers | Docker (PostgreSQL + MongoDB); Testcontainers spins up both for integration tests |

Prerequisites

- JDK 21 configured as a Maven toolchain (see `~/.m2/toolchains.xml`)
- Node 20+ for the frontend
- Docker for the bundled Postgres + Mongo and the Testcontainers-backed tests
- An Anthropic API key from console.anthropic.com (https://console.anthropic.com/). Only required for the LLM Agent (`/chat`); SIF and the entire deterministic provisioner pipeline run without it.

Quickstart

The full path from a clean checkout to a working chat is the five steps below; do them in order.

Provisioning is manual and REST-driven. Creating the database tables and loading the demo data is not done by the UI and does not happen automatically when the backend starts. You trigger it yourself by POSTing to the `/provision/` endpoints (step 4). The docker-compose databases start empty; until you provision, the agent answers every question with no data. There is no CLI shim and no "provision" button; any HTTP client works.

Run the commands below from the repo root. The provisioning calls use `curl`, which is cross-platform. On Windows substitute `mvn.cmd` for `mvn` (see "Windows notes").

Step 1: clone the repository.

    git clone https://github.com/gabert/ontocortex.git
    cd ontocortex

Step 2: start the databases.

    docker-compose up -d

Brings up both containers: `ontocortex_postgres` (the primary SQL source) and `ontocortex_mongo` (the documents source for `MatterNote`).
On a fresh volume the Postgres container auto-creates the per-domain databases (`legal_demo`, `vet_demo`) from `docker/postgres-init/`; step 4 then creates the tables inside them and loads the data. The stores otherwise come up empty. If you have an old volume from before this init script existed, recreate it: `docker-compose down -v && docker-compose up -d`.

Step 3: start the backend. Runs on port 8080; leave it running.

    export ANTHROPIC_API_KEY=sk-...                      # only needed for /chat; provisioning is keyless
    mvn -f engine/pom.xml install -DskipTests            # build + install all modules
    mvn -f engine/pom.xml -pl ontocortex-app spring-boot:run

On Windows use `mvn.cmd` and set the key with `$env:ANTHROPIC_API_KEY = "sk-..."` (see "Windows notes"). Without an API key the app still boots and the entire `/provision/` pipeline works; only `/chat` returns HTTP 409 with a clear error.

Step 4: provision the databases. Required, manual. This is the step that's easy to miss. The containers from step 2 are empty, so you must create the tables and load the demo data yourself. The deterministic build artifacts (`schema_plan.yaml`, the per-module `.json` files, `schema.json`, `seed_data.sql`) are already committed in the repo, so for the shipped legal and vet demos you only need the two calls that touch the live databases: create the tables, then load the seed data.

    curl -X POST http://localhost:8080/api/domains/legal/provision/install        # CREATE TABLE …
    curl -X POST http://localhost:8080/api/domains/legal/provision/install/seed   # load seed data

`install` returns the created table names. `install/seed` loads every data source the domain declares in one call: it runs `seed_data.sql` into Postgres and, for a domain with a document store (legal's `MatterNote`), loads `seed.mongo.json` into MongoDB. It returns `{ "inserts": <sql-rows>, "documents": <mongo-docs> }`; for legal expect 6 documents.
There is no separate Mongo seed step and nothing platform-specific; the same `curl` does both stores. Repeat for vet (or any domain you add) by swapping the name in the path (vet is SQL-only, so its `documents` count is 0):

    curl -X POST http://localhost:8080/api/domains/vet/provision/install
    curl -X POST http://localhost:8080/api/domains/vet/provision/install/seed

That's everything a fresh checkout needs; the committed schema artifacts already cover the planning steps. The full pipeline (plan → build → reconcile → install → install/seed, plus install/drop to start over) lives in the "REST surface" table; you only need it if you change a `<domain>.ontology.yaml`.

Step 5: start the frontend. Runs on port 5173; proxies `/api/` to the backend on :8080.

    cd frontend
    npm install
    npm run dev

Open http://localhost:5173. Pick a domain in the sidebar, optionally set `X-User-Id` for identity scoping, and chat. The right pane shows, per turn, the LLM's submitted intent, each SIF operation (including transitions), the generated SQL or MongoDB query, and the row count for each.

Configuration

All runtime config lives in `engine/ontocortex-app/src/main/resources/application.yaml` under the `engine:` key. Spring Boot's relaxed binding accepts environment-variable overrides (`ENGINE_API_KEY`, `ENGINE_MODELS_CHAT`, etc.).
| Key | Purpose |
|---|---|
| `engine.api-key` | Anthropic API key (typically `${ANTHROPIC_API_KEY}`) |
| `engine.models.{chat,analyzer}` | Model IDs per pipeline role (chat agent + error-analysis post-mortem) |
| `engine.chat.{max-tokens,max-retries,retry-delay,max-iterations}` | Conversation runtime |
| `engine.log-dir` | Per-session debug JSONL + error post-mortems |
| `engine.domains-dir` | Where domain content folders live |
| `engine.data-sources.<name>.{dbname,user,password,host,port}` | Per-data-source DB credentials |

REST surface

| Method + path | Description |
|---|---|
| `GET /api/domains` | List available domains |
| `GET /api/domains/{domain}` | Domain summary |
| `GET /api/domains/{domain}/schema` | Reconciled `schema.json` |
| `GET /api/domains/{domain}/schema/description` | Human-readable schema dump |
| `POST /api/domains/{domain}/provision/plan` | Run the deterministic planner |
| `POST /api/domains/{domain}/provision/build` | Deterministic builder |
| `POST /api/domains/{domain}/provision/reconcile` | Merge module builds into `schema.json` |
| `POST /api/domains/{domain}/provision/install` | `CREATE TABLE …` |
| `POST /api/domains/{domain}/provision/install/seed` | Seed every data source: `seed_data.sql` → Postgres, `seed.mongo.json` → Mongo |
| `POST /api/domains/{domain}/provision/install/drop` | Drop every table |
| `POST /api/domains/{domain}/sif` | Execute a SIF batch (no LLM; `X-User-Id` header for identity scoping) |
| `POST /api/domains/{domain}/chat` | Talk to the agent (needs API key; `X-User-Id` for identity scoping) |
| `DELETE /api/domains/{domain}/chat` | Clear conversation history for this user |

Adding a new domain

- Create `domains/<name>/` with:
  - `domain.json`: descriptor (name, data sources, file pointers)
  - `<name>.ontology.yaml`: ontology in the project's compact YAML format (see docs/reference-ontology-yaml-format.md)
  - `<name>.rules`: business rules (plain text with `=== SECTION ===` headers)
  - `<name>_prompt.txt`: agent persona / system prompt fragment
- Add a `data-sources.<name>` block to `application.yaml`.
- Drive the provisioner pipeline via the REST endpoints above.
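For a new domain, the full provisioning sequence can also be driven from Java instead of `curl`. The sketch below only builds the requests, in the pipeline order given in this README, using the JDK's `java.net.http` API; the class name `ProvisionRequestsSketch`, the `requestsFor` helper, and the `mydomain` placeholder are hypothetical. Sending each request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` is left out so the example has no network side effects.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

/** Hypothetical helper: builds the provisioning POSTs for one domain, in pipeline order. */
final class ProvisionRequestsSketch {

    /** Pipeline order as listed in the README: plan, build, reconcile, install, install/seed. */
    static final List<String> STEPS = List.of("plan", "build", "reconcile", "install", "install/seed");

    /** Creates one body-less POST per step against the documented /provision/ paths. */
    static List<HttpRequest> requestsFor(String baseUrl, String domain) {
        return STEPS.stream()
                .map(step -> HttpRequest.newBuilder(
                                URI.create(baseUrl + "/api/domains/" + domain + "/provision/" + step))
                        .POST(HttpRequest.BodyPublishers.noBody())
                        .build())
                .toList();
    }

    public static void main(String[] args) {
        // "mydomain" is a placeholder; the base URL matches the backend port used in the quickstart.
        for (HttpRequest request : requestsFor("http://localhost:8080", "mydomain")) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

The order matters because each step consumes the previous step's artifact; a caller that actually sends these would want to stop at the first non-success response.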
The framework discovers the new domain automatically on the next request: no code changes, no restart for content edits beyond reloading the affected `application.yaml`.

Tests

    mvn -f engine/pom.xml test

All 431 tests pass with Docker running. 62 of them are Testcontainers-backed integration tests (`SifExecutorTest`, `SchemaInstallerTest`, `StiIntegrationTest`, `DemoDomainProvisioningTest`, `MongoDataSourceIntegrationTest`, and the 11-step `EndToEndTest`), which spin up real Postgres + MongoDB containers; the remaining 369 are pure unit tests with no external dependency. All LLM-using code (`BaseAgent` retry, `ConversationAgent` tool-use loop, `AgentPipeline` rollback) is unit-tested against a fake `LlmClient`; there are no live API calls in CI.

Windows notes

Use PowerShell and `mvn.cmd` (not the bash `mvn` shell script). The bash environment on Windows often picks up the wrong `JAVA_HOME` and a different trust store, which surfaces as toolchain mismatches or PKIX TLS errors when fetching dependencies. Maven 3.9+ with the JDK 21 toolchain config already in `engine/pom.xml` works out of the box from cmd or PowerShell. For Git over HTTPS the same PKIX wall can appear; use SSH for `origin` to sidestep it.

Design docs

In-depth design notes and reference specs live in docs/; see docs/README.md for an annotated index that tags each doc as reference or design.

License

Copyright 2026 Robert Gallas.