{"slug": "til-streaming-data-in-go-with-iter-and-yield", "title": "TIL: Streaming Data in Go with iter and yield", "summary": "A developer building RagPack, a Go library for chunking files for embedding, used the iter package introduced in Go 1.23 to create a streaming parser interface. The iter.Seq2 type allows parsers for various file formats (CSV, PDF, DOCX, etc.) to yield parsed units and errors one at a time, enabling a common ingestion loop that handles early termination efficiently. This design keeps memory usage flat for streaming formats and simplifies adding new parsers.", "body_md": "While building [RagPack](https://github.com/eozsahin1993/ragpack), a library that chunks files for embedding, I needed a common way to stream parsed content from multiple file formats. RagPack supports CSV, PDF, DOCX, HTML, XLSX, Markdown, JSON and more. Each format has its own parser, but the ingester that consumes them should not care which one it is talking to. I needed a shared contract. In Java I would have reached for an `Iterator<T>` or an `InputStream`, but in Go the answer turned out to be the `iter` package, introduced in Go 1.23.\n\nThe `iter` package introduces two types. `Seq[V]` yields a single value at a time, and `Seq2[K, V]` yields a pair:\n\n```\ntype Seq[V any]     func(yield func(V) bool)\ntype Seq2[K, V any] func(yield func(K, V) bool)\n```\n\n`Seq2` is the right fit here because each iteration naturally produces two things: a parsed unit and any read error. This matches Go's standard `(value, error)` convention and lets the caller handle errors inline without wrapping them in a struct.\n\nThat made `iter.Seq2[Unit, error]` a natural return type for the `Parser` interface:\n\n```\ntype Parser interface {\n    Parse(ctx context.Context, r io.ReadCloser) iter.Seq2[Unit, error]\n}\n```\n\nEvery sub-parser, `CSVParser`, `PDFParser`, `DocxParser`, `HTMLParser` and so on, implements this one method. 
The ingester does not need to know which format it is dealing with.\n\nHere is what a parser implementation looks like:\n\n```\nfunc (p *Parser) Parse(_ context.Context, r io.ReadCloser) iter.Seq2[Unit, error] {\n    return func(yield func(Unit, error) bool) {\n        defer r.Close()\n\n        reader := bufio.NewReader(r)\n        for {\n            line, err := reader.ReadString('\\n')\n            if err != nil && err != io.EOF {\n                yield(Unit{}, err)\n                return\n            }\n            // ReadString may return a final unterminated line together\n            // with io.EOF, so emit any data before checking for EOF.\n            if line != \"\" && !yield(Unit{Text: strings.TrimRight(line, \"\\n\")}, nil) {\n                return\n            }\n            if err == io.EOF {\n                return\n            }\n        }\n    }\n}\n```\n\nThe `if !yield(...) { return }` part is the key. If the caller breaks out of the loop early, `yield` returns `false` and we stop reading. No wasted work.\n\nBecause all parsers return the same type, the ingester ranges over any of them the same way:\n\n```\nfor unit, err := range parser.Parse(ctx, file) {\n    if err != nil {\n        // handle error, then stop: unit is a zero value here\n        break\n    }\n    embed(unit)\n}\n```\n\nSwap in a different parser and the loop does not change. That is one big win. Memory was also on our minds when designing this. For streaming formats like CSV, JSON, or plain text, yielding one unit at a time keeps the footprint flat no matter how large the file is. 
For formats like PDF the picture is more nuanced: the underlying parser has to load the full file before it can parse it, so the flat-memory benefit does not fully apply there.\n\nHappy coding!", "url": "https://wpnews.pro/news/til-streaming-data-in-go-with-iter-and-yield", "canonical_source": "https://dev.to/emrecodes/til-streaming-data-in-go-with-iter-and-yield-c93", "published_at": "2026-07-04 03:06:54+00:00", "updated_at": "2026-07-04 03:48:59.485132+00:00", "lang": "en", "topics": ["developer-tools", "machine-learning", "large-language-models"], "entities": ["RagPack", "Go", "iter", "Seq2", "CSVParser", "PDFParser", "DocxParser", "HTMLParser"], "alternates": {"html": "https://wpnews.pro/news/til-streaming-data-in-go-with-iter-and-yield", "markdown": "https://wpnews.pro/news/til-streaming-data-in-go-with-iter-and-yield.md", "text": "https://wpnews.pro/news/til-streaming-data-in-go-with-iter-and-yield.txt", "jsonld": "https://wpnews.pro/news/til-streaming-data-in-go-with-iter-and-yield.jsonld"}}