38/60 Days System Design Questions

wpnews.pro

Source: dev.to · Published: 2026-06-13T16:24Z · Topic: large-language-models

A developer poses a system design question about processing a 150K-word document with a 128K-token LLM, listing four strategies: fixed-size chunking, sliding window, progressive summarization, and truncation. They ask which approach would be best for a 200-page legal contract where the answer could be anywhere, hinting that one method has a hidden failure mode.

Read: 1 min · Views: 16 · Published: Jun 13, 2026

Your LLM has a 128K-token context window.

Your document has 150K words.
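The two numbers use different units, so it helps to convert words to tokens before comparing them. A rough sketch, assuming the common rule of thumb of about 0.75 English words per token (a heuristic, not a property of any particular tokenizer; the names `WORDS_PER_TOKEN` and `estimate_tokens` are made up for illustration):

```python
# Rough token estimate for English prose.
# ASSUMPTION: ~0.75 words per token is a heuristic; real counts depend
# on the tokenizer and on the text itself.
WORDS_PER_TOKEN = 0.75
CONTEXT_WINDOW = 128_000  # tokens, from the post


def estimate_tokens(word_count: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Convert a word count to an approximate token count."""
    return round(word_count / words_per_token)


doc_tokens = estimate_tokens(150_000)
overflow = doc_tokens - CONTEXT_WINDOW
print(doc_tokens, overflow)  # 200000 72000
```

Under that assumption the document is around 200K tokens, roughly 72K more than the window holds, before counting the prompt or the model's output.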

Something has to give. What do you do?

A) Chunk the document into fixed-size pieces and embed each one — retrieve the top-k at query time.

B) Use a sliding window — process the document in overlapping chunks, stitch the outputs together.

C) Summarize each section progressively — feed the running summary forward as context.

D) Truncate to the most recent tokens and hope the answer is near the end.
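To make the four options concrete, here is a minimal sketch of the data handling behind each one. Everything in it is illustrative: the function names are hypothetical, a plain list stands in for real tokenizer output, and `summarize` is a placeholder for whatever model call you would use. Embedding, top-k retrieval, and output stitching are left out.

```python
from typing import Callable, List, Sequence


def fixed_chunks(tokens: Sequence[str], size: int) -> List[List[str]]:
    """Option A: split into non-overlapping fixed-size pieces."""
    return [list(tokens[i:i + size]) for i in range(0, len(tokens), size)]


def sliding_windows(tokens: Sequence[str], size: int, overlap: int) -> List[List[str]]:
    """Option B: overlapping windows; each shares `overlap` tokens with the next."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):  # last window reached the end
            break
    return windows


def progressive_summary(sections: Sequence[str],
                        summarize: Callable[[str, str], str]) -> str:
    """Option C: feed the running summary forward with each new section.

    `summarize(summary_so_far, section)` is a hypothetical stand-in
    for an LLM call.
    """
    summary = ""
    for section in sections:
        summary = summarize(summary, section)
    return summary


def truncate_to_recent(tokens: Sequence[str], limit: int) -> List[str]:
    """Option D: keep only the last `limit` tokens."""
    return list(tokens[-limit:]) if limit > 0 else []


if __name__ == "__main__":
    toy = list("abcdefghij")  # ten one-character "tokens"
    print(fixed_chunks(toy, 4))
    print(sliding_windows(toy, 4, 2))
    print(truncate_to_recent(toy, 4))
    print(progressive_summary(["a", "b", "c"], lambda s, sec: (s + " " + sec).strip()))
```

Running it on the ten-token toy input shows how much of the original text each option passes along, which is a useful lens for the question that follows.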

Three of these are real strategies teams ship to production. One of them will silently give you wrong answers on a predictable class of questions.

Pick one — and tell me which you'd actually use on a 200-page legal contract where the answer can be anywhere.

I'll drop the full breakdown in the comments — including the failure mode most engineers don't see until they're in production.

Drop your answer 👇
