[ARTICLE · art-34373] src=github.com ↗ pub= topic=large-language-models verified=true sentiment=· neutral

Protocol spec for row separated key value format

A new lightweight data format called Record-Separated Key-Value (RSKV) has been proposed for structured data exchange between large language models and software systems. Designed to be easy for LLMs to generate and humans to inspect, RSKV uses visible delimiters instead of braces or indentation, aiming to reduce syntax errors common with JSON, YAML, or XML. The format includes Python utilities for SQLite interop and is hosted on GitHub under the Symbol Grounding Framework.

Published Jun 19, 2026 · 5 min read

Record-Separated Key-Value Format — a lightweight, line-oriented format for LLM-mediated structured data exchange.

RSKV is a structured transcript format: easy for LLMs to generate, easy for humans to inspect, and easy for small programs to parse. It represents data as named sets, records, and key: value cells, using visible delimiters instead of braces, indentation, quoting, or nested syntax.

Repository folder: rskv-spec

RSKV is designed for the boundary between language models and software systems.

A minimal RSKV document looks like this:

#SET: users
#SCHEMA: id:int, name:str, plan:str
id: 1
name: Alice
plan: pro
---ROW---
id: 2
name: Bob
plan: free
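As a rough illustration of how little machinery the format needs, the sketch below writes a list of records in the layout shown above. The function name dump_set is hypothetical and not part of the repository, and value escaping is deliberately left out to keep the example short.

```python
from typing import Iterable, Mapping, Optional

ROW_SEP = "---ROW---"


def dump_set(name: str, records: Iterable[Mapping[str, Optional[str]]]) -> str:
    """Serialize records (str -> str or None) as one RSKV set.

    Hypothetical helper for illustration; it does not escape values.
    """
    lines = [f"#SET: {name}"]
    for index, record in enumerate(records):
        if index > 0:
            # The separator goes between records, not after the last one.
            lines.append(ROW_SEP)
        for key, value in record.items():
            # None is written as the explicit null marker \N.
            lines.append(f"{key}: " + ("\\N" if value is None else str(value)))
    return "\n".join(lines) + "\n"


print(dump_set("users", [
    {"id": "1", "name": "Alice", "plan": "pro"},
    {"id": "2", "name": "Bob", "plan": "free"},
]))
```

Keys that are absent from a record are simply not written, which is how the format expresses a missing value.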

The core model is simple:

  • #SET: name starts a named collection.
  • key: value adds a field to the current record.
  • ---ROW--- separates records.
  • #SCHEMA: optionally declares field types and column order.
  • #META: optionally records provenance or operational metadata.
  • \N represents explicit null.
  • An empty value represents an empty string.
  • A missing key represents an absent value.
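The core model can be sketched as a short line-by-line parser. This is a minimal illustration, not the parser from the repository: parse_rskv is a hypothetical name, #SCHEMA: and #META: lines are skipped, escape sequences are not decoded, cells that appear before any #SET: line are ignored, and stripping a single space after the colon is an assumption about whitespace handling.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def parse_rskv(text: str) -> Dict[str, List[Record]]:
    """Parse RSKV text into {set_name: [record, ...]} (illustrative sketch)."""
    sets: Dict[str, List[Record]] = {}
    records: Optional[List[Record]] = None  # records of the current set
    current: Optional[Record] = None        # record currently being filled
    for line in text.splitlines():
        if line.startswith("#SET:"):
            records = sets.setdefault(line[len("#SET:"):].strip(), [])
            current = None
        elif line.startswith("#"):
            continue  # #SCHEMA: and #META: are ignored in this sketch
        elif line.strip() == "---ROW---":
            current = None  # the next cell starts a new record
        elif ":" in line and records is not None:
            key, _, value = line.partition(":")  # split on the first colon only
            if current is None:
                current = {}
                records.append(current)
            if value.startswith(" "):
                value = value[1:]
            # \N is explicit null; an empty value stays an empty string.
            current[key.strip()] = None if value == "\\N" else value
    return sets
```

A missing key never appears in the resulting dict, so callers can tell absent, empty, and null apart with an ordinary membership test.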

LLMs are often better at emitting repeated local patterns than maintaining global syntax state across deeply nested JSON, YAML, or XML.

RSKV leans into that strength. It avoids braces, commas, quotes, indentation semantics, and required nesting. The result is a format intended to be:

  • LLM-friendly
  • Human-readable
  • Streamable
  • Diffable
  • Grep-able
  • Parser-simple
  • Schema-flexible
  • Suitable for lightweight ETL and model-to-application handoff

RSKV is not intended to replace JSON, CSV, Parquet, Protocol Buffers, or databases. It is a text-first interchange format for sparse, reviewable, LLM-facing structured data.

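File                 Purpose
rskv_spec.md         Formal specification (the normative format)
claims.md            Claims and rationale (the design argument)
essay.md             Explanatory essay (the motivation)
rskv_to_sqlite.py    Converts an RSKV file to a SQLite database
sqlite_to_rskv.py    Converts a SQLite database to an RSKV file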

Use this prompt fragment when asking an LLM to emit RSKV:

Output only RSKV. Do not use Markdown, JSON, code fences, bullets, or commentary.

Start each collection with:
#SET: set_name

Write one field per line:
key: value

Separate records with:
---ROW---

Use #SCHEMA: after #SET: when field names and types are known.
Use #META: after #SCHEMA: only when provenance is useful.

Use \N for null.
Use an empty value after colon-space for an empty string.
Omit unknown or not-applicable fields.
Escape newlines as \n, backslashes as \\, a literal # at the start of a value as \#, and a literal ---ROW--- value as \---ROW---.

Example output:

#SET: people
#SCHEMA: id:int, name:str, role:str, notes:str
#META: source=example, version=1
id: 1
name: Alice
role: engineer
notes: Works on data pipelines
---ROW---
id: 2
name: Bob
role: analyst
notes: \N
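The escaping rules in the prompt fragment can be sketched as a pair of helpers. The names escape_value and unescape_value are hypothetical, and the treatment of unrecognized backslash sequences (left unchanged) is an assumption; the normative rules are in rskv_spec.md.

```python
from typing import Optional


def escape_value(value: str) -> str:
    """Escape a non-null string for use as an RSKV value (sketch)."""
    # Backslashes first, so the backslashes added for \n are not doubled.
    escaped = value.replace("\\", "\\\\").replace("\n", "\\n")
    if escaped.startswith("#") or escaped == "---ROW---":
        escaped = "\\" + escaped
    return escaped


def unescape_value(raw: str) -> Optional[str]:
    """Decode an RSKV value; returns None for the null marker \\N."""
    if raw == "\\N":
        return None
    if raw == "\\---ROW---":
        return "---ROW---"
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            if nxt == "n":
                out.append("\n")
                i += 2
                continue
            if nxt == "\\":
                out.append("\\")
                i += 2
                continue
            if nxt == "#" and i == 0:
                out.append("#")
                i += 2
                continue
        out.append(raw[i])  # assumption: unknown sequences pass through
        i += 1
    return "".join(out)
```

Because backslashes are doubled on the way out, a string that literally contains a backslash followed by N round-trips as text and is not confused with the null marker.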

This folder includes two small Python utilities for SQLite interop.

python rskv_to_sqlite.py input.rskv output.db
python sqlite_to_rskv.py input.db output.rskv

Exact command-line options may vary depending on the script version. Run the scripts directly or inspect their argument handling for supported flags.
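For readers who want to see the general shape of such a conversion, here is a minimal sketch using Python's standard sqlite3 module. It is not the code from rskv_to_sqlite.py: load_records is a hypothetical helper, every column is created as TEXT, and both explicit null and missing keys end up as SQL NULL, which loses the distinction RSKV makes between them.

```python
import sqlite3
from typing import Dict, List, Optional


def load_records(conn: sqlite3.Connection, table: str,
                 records: List[Dict[str, Optional[str]]]) -> None:
    """Create a TEXT-only table and insert parsed RSKV records (sketch)."""
    columns: List[str] = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)  # keep first-seen column order
    if not columns:
        return

    def quote(name: str) -> str:
        # Quote identifiers so unusual set or field names are safe in SQL.
        return '"' + name.replace('"', '""') + '"'

    col_defs = ", ".join(quote(c) + " TEXT" for c in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {quote(table)} ({col_defs})")
    col_list = ", ".join(quote(c) for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {quote(table)} ({col_list}) VALUES ({placeholders})",
        # dict.get returns None for missing keys, which sqlite3 stores as NULL.
        [[record.get(c) for c in columns] for record in records],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
load_records(conn, "people", [
    {"id": "1", "name": "Alice"},
    {"id": "2", "name": "Bob", "notes": None},
])
```

Sparse records work without special handling here because the column list is the union of all keys seen.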

RSKV is strongest for:

  • LLM-generated structured output
  • Sparse records
  • Multi-set documents
  • Human-reviewable intermediate data
  • Prompt and context exchange
  • Lightweight ETL
  • Logs and structured transcripts
  • Database staging

RSKV is less natural for:

  • Deep inline object graphs
  • Compact binary transport
  • Dense analytics storage
  • High-throughput schema-first RPC contracts

Those cases can still be handled through conventions such as embedded json fields, base64 fields, URI references, normalized sets, strict schemas, or downstream conversion to databases and columnar formats.
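The embedded json and base64 conventions can be illustrated with a short sketch. The helper names and the field names address_json and blob_b64 are hypothetical, since the text above does not fix a naming scheme. JSON strings that contain backslash escapes would additionally need RSKV backslash escaping, which is omitted here.

```python
import base64
import json
from typing import Any


def json_cell(key: str, obj: Any) -> str:
    """Render a nested object as a single-line JSON value (sketch)."""
    # json.dumps emits one line by default; colons inside the JSON are safe
    # because parsers split on the first colon only.
    return f"{key}: {json.dumps(obj, sort_keys=True)}"


def base64_cell(key: str, data: bytes) -> str:
    """Render binary data as a base64 text value (sketch)."""
    return f"{key}: {base64.b64encode(data).decode('ascii')}"


print(json_cell("address_json", {"city": "Oslo", "geo": {"lat": 59.9}}))
print(base64_cell("blob_b64", b"\x00\xff"))
```

A consumer reverses the step with json.loads or base64.b64decode on the value part of the cell.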

A set is a named collection of records.

#SET: tickets

Records are separated by ---ROW---.

ticket_id: 1001
status: open
---ROW---
ticket_id: 1002
status: closed

A cell is one key: value line. Parsers split on the first : only.

summary: User reported error: timeout after login

Empty, null, and missing values are distinct:

name:        empty string
name: \N     explicit null
             missing key means absent
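The first-colon rule and the three value states can be sketched in a few lines. parse_cell is a hypothetical name, and stripping exactly one space after the colon is an assumption about whitespace handling.

```python
from typing import Optional, Tuple


def parse_cell(line: str) -> Tuple[str, Optional[str]]:
    """Split one cell on the first colon; map \\N to None (sketch)."""
    key, sep, value = line.partition(":")  # later colons stay in the value
    if not sep:
        raise ValueError(f"not a key: value cell: {line!r}")
    if value.startswith(" "):
        value = value[1:]
    return key.strip(), (None if value == "\\N" else value)


record = dict([
    parse_cell("summary: User reported error: timeout after login"),
    parse_cell("name: "),
    parse_cell("nickname: \\N"),
])
print(record["summary"])     # value keeps its inner colon
print(record["name"] == "")  # empty string
print(record["nickname"])    # explicit null
print("email" in record)     # missing key means absent
```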

Schemas are optional and advisory.

#SCHEMA: id:int, name:str, active:bool, created_at:datetime
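Because schemas are advisory, a consumer can treat them as hints rather than constraints. The sketch below parses a #SCHEMA: line and applies best-effort casts. The function names, the set of accepted boolean literals, and the choice to leave unknown types such as datetime as strings are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


def parse_schema(line: str) -> List[Tuple[str, str]]:
    """Parse '#SCHEMA: id:int, name:str' into ordered (name, type) pairs."""
    fields = []
    for part in line[len("#SCHEMA:"):].split(","):
        if not part.strip():
            continue
        name, _, type_name = part.strip().partition(":")
        fields.append((name.strip(), type_name.strip() or "str"))
    return fields


# Assumed casts; types not listed here are left as raw strings.
CASTS: Dict[str, Callable[[str], Any]] = {
    "int": int,
    "str": str,
    "bool": lambda v: v.strip().lower() in ("true", "1", "yes"),
}


def coerce(record: Dict[str, Optional[str]],
           schema: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Apply schema casts where possible; never reject a record."""
    typed: Dict[str, Any] = dict(record)
    for name, type_name in schema:
        value = record.get(name)
        cast = CASTS.get(type_name)
        if value is None or cast is None:
            continue  # null, missing, or unknown type: leave untouched
        try:
            typed[name] = cast(value)
        except ValueError:
            pass  # advisory schema: keep the raw string on failure
    return typed
```

Keeping the raw string on a failed cast matches the advisory intent: a malformed value is preserved for review instead of aborting the load.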

Metadata is optional and applies to the current set.

#META: source=crm, version=2026-06-19, owner=data-eng
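A #META: line in the form shown above can be read as comma-separated key=value pairs. This sketch assumes that values contain no commas and that pairs without an equals sign are skipped; parse_meta is a hypothetical name.

```python
from typing import Dict


def parse_meta(line: str) -> Dict[str, str]:
    """Parse '#META: k=v, k2=v2' into a dict (sketch, no comma escaping)."""
    meta: Dict[str, str] = {}
    for part in line[len("#META:"):].split(","):
        key, sep, value = part.strip().partition("=")
        if sep:  # ignore fragments without '='
            meta[key.strip()] = value.strip()
    return meta


print(parse_meta("#META: source=crm, version=2026-06-19, owner=data-eng"))
```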

RSKV 1.0 is an initial specification draft intended for experimentation, review, and implementation.

The current collection includes:

  • The formal specification
  • A claims/rationale document
  • An explanatory essay
  • SQLite import/export utilities

  • essay.md: start here for the motivation.
  • claims.md: read this for the design argument.
  • rskv_spec.md: read this for the normative format.
  • rskv_to_sqlite.py and sqlite_to_rskv.py: inspect these for practical interop.

RSKV is one component of the broader Symbol Grounding Framework (SGF) project.

SGF is a stack of languages, grammars, protocols, and tooling for grounded machine meaning:

  • Core Lexicon: sense-disambiguated concepts grounded in ~65 semantic primes.
  • Synapses: hub-and-spoke event structures with 15 fixed semantic roles for representing who did what to whom.
  • HFF Wire Protocol: a versioned, machine-to-machine message format that lets services exchange grounded semantics without prior integration contracts.
  • AFP: an act protocol (INFORM, QUERY, COMMAND, etc.) with receiver sovereignty.
  • Omega: a formal governance grammar with 13 primitives and a deterministic Safety Kernel.
  • WML: a workflow-map language for composing AI software from primitives.
  • RSKV: a lightweight record-separated key-value format for LLM-facing structured data exchange, prompt/output capture, structured transcripts, and simple ETL handoff.

Within SGF, RSKV is best understood as an interchange and tooling format rather than a semantic representation layer. It provides a simple, human-readable way to move structured records between LLMs, scripts, databases, examples, test fixtures, and documentation.

RSKV can be used to:

  • Capture LLM outputs in a parseable text format.
  • Store example records for SGF components.
  • Exchange lightweight structured data between scripts.
  • Convert records to and from SQLite.
  • Represent test fixtures for workflows, protocols, and semantic transformations.
  • Provide a readable staging format before data is converted into stricter SGF representations.

See also:

Formal RFC specification for SGF

Add a license file before public reuse or distribution.

Suggested options:

  • MIT for permissive software/spec reuse
  • Apache-2.0 for permissive reuse with patent language
  • CC-BY-4.0 for documentation-oriented reuse

Maintained as part of the Symbol Grounding Framework work.

Repository: SymbolGroundingFramework/SGF-manifest
