TIL: Streaming Data in Go with iter and yield


[ARTICLE · art-47588] src=dev.to ↗ pub=2026-07-04T03:06Z topic=developer-tools verified=true sentiment=↑ positive

A developer building RagPack, a Go library for chunking files for embedding, used the iter package introduced in Go 1.23 to create a streaming parser interface. The iter.Seq2 type allows parsers for various file formats (CSV, PDF, DOCX, etc.) to yield parsed units and errors one at a time, enabling a common ingestion loop that handles early termination efficiently. This design keeps memory usage flat for streaming formats and simplifies adding new parsers.

Published Jul 4, 2026 · 2 min read

While building RagPack, a library that chunks files for embedding, I needed a common way to stream parsed content from multiple file formats. RagPack supports CSV, PDF, DOCX, HTML, XLSX, Markdown, JSON and more. Each format has its own parser, but the ingester that consumes them should not care which one it is talking to. I needed a shared contract. In Java I would have reached for an Iterator<T> or an InputStream, but in Go the answer turned out to be the iter package, introduced in Go 1.23.

The iter package introduces two types. Seq[V] yields a single value at a time, and Seq2[K, V] yields a pair:

type Seq[V any]     func(yield func(V) bool)
type Seq2[K, V any] func(yield func(K, V) bool)

Seq2 is the right fit here because each iteration naturally produces two things: a parsed unit and any read error. This matches Go's standard (value, error) convention and lets the caller handle errors inline without wrapping them in a struct.

That made iter.Seq2[Unit, error] a natural return type for the Parser interface:

type Parser interface {
    Parse(ctx context.Context, r io.ReadCloser) iter.Seq2[Unit, error]
}

Every sub-parser (CSVParser, PDFParser, DocxParser, HTMLParser and so on) implements this one method. The ingester does not need to know which format it is dealing with.

Here is what a parser implementation looks like:

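// Parser here is assumed to be a concrete line-based parser type in its own
// format-specific package, not the interface shown above.
func (p *Parser) Parse(_ context.Context, r io.ReadCloser) iter.Seq2[Unit, error] {
    return func(yield func(Unit, error) bool) {
        defer r.Close()

        reader := bufio.NewReader(r)
        for {
            line, err := reader.ReadString('\n')
            if err != nil && err != io.EOF {
                yield(Unit{}, err)
                return
            }
            // ReadString can return data together with io.EOF when the
            // last line has no trailing newline, so yield it before stopping.
            if line != "" && !yield(Unit{Text: strings.TrimRight(line, "\n")}, nil) {
                return
            }
            if err == io.EOF {
                return
            }
        }
    }
}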

The if !yield(...) { return } part is the key. If the caller breaks out of the loop early, yield returns false and we stop reading. No wasted work.

Because all parsers return the same type, the ingester ranges over any of them the same way:

for unit, err := range parser.Parse(ctx, file) {
    if err != nil {
        // handle error, then stop: unit is the zero value here
        break
    }
    embed(unit)
}

Swap in a different parser and the loop does not change. That is one big win.

Memory was also on our minds when designing this. For streaming formats like CSV, JSON, or plain text, yielding one unit at a time keeps the footprint flat no matter how large the file is. For formats like PDF it is more nuanced, since the underlying parser has to load the full file before it can parse it.

Happy coding!


Read original on dev.to → dev.to/emrecodes/til-streaming-data-in-go-with-i…
