Open-Sourcing PostAll's Content Formatting Engine: A Gift to the Dev Community

wpnews.pro

PostAll generates content once and ships it to three places: a blog post, a LinkedIn update, and an email newsletter. For the first four months, that meant three separate prompts, three separate LLM calls, and three versions of the same idea that quietly drifted apart from each other.

I rewrote the formatting layer three times trying to fix that. The version I'm open-sourcing today is the fourth — not because I ran out of competitors to out-build, but because I realized formatting infrastructure isn't where PostAll's actual value lives. Generation quality and content strategy are the moat. A renderer that turns structured content into blog HTML, a tweet, and an email is just... useful plumbing. So here it is.

Each output format has its own constraints, and they don't overlap:

<h2>s and <h3>s are what search engines actually crawl.

My first version asked the LLM for all three directly: one prompt for the blog post, a second for the tweet, a third for the email teaser. It worked, technically. But the three outputs drifted: the email would reference a stat the tweet didn't mention, the tone would shift slightly between versions, and every format change meant editing three prompts instead of one.

That's the actual bug. Not "formatting is hard" — maintaining three sources of truth for one idea is hard.

The pattern that solved this isn't new — it's the same idea compilers use. Parse your input into an intermediate representation once, then run format-specific renderers against that one representation. Generate the content once, structure it into blocks, and let each renderer decide how to express those blocks.

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class BlockType(Enum):
    HEADING = "heading"
    PARAGRAPH = "paragraph"
    LIST_ITEM = "list_item"

@dataclass
class ContentBlock:
    type: BlockType
    text: str
    level: Optional[int] = None  # heading level (1-3) — blog SEO needs this, social doesn't care
    metadata: dict = field(default_factory=dict)

This is the whole "data model" — deliberately small. I tried a richer schema early on (nested blocks, inline spans as their own objects) and immediately regretted it. The renderers got more complex than the problem justified.

PostAll's generation step already outputs markdown, so the parser just needs to handle the subset of markdown that actually shows up in those outputs:

import re

def parse_markdown(raw: str) -> list[ContentBlock]:
    """Turn raw LLM markdown output into a list of ContentBlock objects.
    Intentionally narrow — it only covers what PostAll's prompts actually produce.
    """
    blocks = []
    for chunk in raw.strip().split("\n\n"):
        chunk = chunk.strip()
        if not chunk:
            continue

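        heading_match = re.match(r"^(#{1,3})\s+(.*)", chunk)
        if heading_match:
            level = len(heading_match.group(1))
            blocks.append(ContentBlock(BlockType.HEADING, heading_match.group(2).strip(), level=level))
            # Body text often follows the heading with no blank line; keep it
            # instead of silently dropping the rest of the chunk.
            chunk = chunk[heading_match.end():].strip()
            if not chunk:
                continue

        if chunk.startswith(("- ", "* ")):
            for line in chunk.splitlines():
                # Remove only the leading bullet marker; lstrip("-* ") would also
                # eat the opening ** of an item that starts with bold text.
                blocks.append(ContentBlock(BlockType.LIST_ITEM, re.sub(r"^\s*[-*]\s+", "", line).strip()))
            continue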

        blocks.append(ContentBlock(BlockType.PARAGRAPH, chunk))

    return blocks

This is not a general-purpose markdown parser, and I'd point you at mistune or markdown-it-py if you needed one. It's a parser for exactly the shape of content PostAll generates: narrow on purpose, because narrow is what made it maintainable.

This is where the constraints from each format actually get handled:

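from html import escape

class BlogRenderer:
    """SEO-structured HTML. Heading levels are preserved: that hierarchy
    is what gets crawled, so the renderer never flattens it."""

    def render(self, blocks: list[ContentBlock]) -> str:
        parts = []
        in_list = False
        for block in blocks:
            is_item = block.type == BlockType.LIST_ITEM
            if is_item and not in_list:
                parts.append("<ul>")  # open a list before its first item
            elif in_list and not is_item:
                parts.append("</ul>")  # close the list once the items end
            in_list = is_item
            text = escape(block.text)  # keep stray <, > and & from breaking the markup
            if block.type == BlockType.HEADING:
                parts.append(f"<h{block.level}>{text}</h{block.level}>")
            elif block.type == BlockType.PARAGRAPH:
                parts.append(f"<p>{text}</p>")
            elif is_item:
                parts.append(f"<li>{text}</li>")
        if in_list:
            parts.append("</ul>")  # the content ended while a list was still open
        return "\n".join(parts)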

class SocialRenderer:
    """Flattens everything into one post and truncates at a word boundary,
    so a word is never cut in half (the sentence itself may be)."""

    def __init__(self, char_limit: int = 280):
        self.char_limit = char_limit

    def render(self, blocks: list[ContentBlock]) -> str:
        flat = " ".join(
            b.text for b in blocks if b.type in (BlockType.HEADING, BlockType.PARAGRAPH)
        )
        return self._truncate(flat, self.char_limit)

    def _truncate(self, text: str, limit: int) -> str:
        if len(text) <= limit:
            return text
        cutoff = text[: limit - 1].rsplit(" ", 1)[0]  # back off to the last full word
        return cutoff + "…"

EmailRenderer follows the same interface but outputs table-based layout with inline styles instead of semantic HTML (more on why below). I left it out of the post for length; it's in the repo.

Here's the whole thing running end to end:

raw_llm_output = """## Why Caching Matters
Caching cuts your API costs and your latency at the same time.

## The Tradeoff
Stale data is the price you pay for that speed."""

blocks = parse_markdown(raw_llm_output)

blog_html = BlogRenderer().render(blocks)
tweet = SocialRenderer(char_limit=120).render(blocks)

print(tweet)

One generation step. Two outputs, structurally consistent, neither one a re-prompt of the other.

A few things bit me building this, and they'll probably bite you too if you extend it:

Inline formatting spanning a truncation point breaks. If a bold span opens before the cutoff and its closing ** lands after it, you ship literal asterisks instead of bold text. I added a tag-balance check that backs the cutoff off word-by-word until it lands outside any open inline marker. It's not in the snippet above: it's a genuinely annoying 15 lines, and it's the part of the repo I'd most welcome a cleaner PR for.
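To make the back-off idea concrete, here is a minimal sketch of what such a check could look like. It is not the repo's implementation: the names `has_open_marker` and `truncate_balanced` are hypothetical, and it assumes only `**` and backtick markers matter and that a marker is "open" when it appears an odd number of times.

```python
def has_open_marker(text: str, markers: tuple[str, ...] = ("**", "`")) -> bool:
    """Return True if any inline marker occurs an odd number of times.
    Assumption: markers are never nested or escaped."""
    return any(text.count(marker) % 2 == 1 for marker in markers)


def truncate_balanced(text: str, limit: int) -> str:
    """Truncate at a word boundary, then keep dropping words until
    no inline marker is left unclosed."""
    if len(text) <= limit:
        return text
    words = text[: limit - 1].split(" ")
    words.pop()  # the last piece may be a partial word, so drop it
    # Back off one word at a time while a marker is still open
    while words and has_open_marker(" ".join(words)):
        words.pop()
    return " ".join(words).rstrip() + "…"


sample = "Caching is **really very important** for latency and cost"
print(truncate_balanced(sample, 30))  # → Caching is…
```

The cutoff at 30 characters lands inside the bold span, so the sketch backs off past the opening `**` rather than shipping a dangling marker. Text with no spaces at all collapses to just the ellipsis, which a production version would probably want to handle separately.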

Outlook's Windows desktop client still renders HTML with Word's engine, not a browser engine. Semantic tags like <section> get silently ignored. EmailRenderer outputs table-based layout with every style inlined. It's ugly to write, but it's the only thing that renders consistently across Gmail, Outlook, and Apple Mail.
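For a rough idea of what "table-based layout with every style inlined" means in practice, here is a small sketch. It is a stand-in, not the EmailRenderer from the repo: the class name `EmailRendererSketch` and all style values are made up for illustration, and the `ContentBlock` model is repeated only so the snippet runs on its own.

```python
from dataclasses import dataclass
from enum import Enum
from html import escape
from typing import Optional


class BlockType(Enum):
    HEADING = "heading"
    PARAGRAPH = "paragraph"
    LIST_ITEM = "list_item"


@dataclass
class ContentBlock:
    type: BlockType
    text: str
    level: Optional[int] = None


class EmailRendererSketch:
    """Hypothetical stand-in for the repo's EmailRenderer: one table,
    one row per block, every style inlined on the cell."""

    # Illustrative values only; swap in whatever your template needs
    STYLES = {
        BlockType.HEADING: "font-family:Arial,sans-serif;font-size:20px;font-weight:bold;padding:16px 0 8px 0;",
        BlockType.PARAGRAPH: "font-family:Arial,sans-serif;font-size:15px;line-height:22px;padding:0 0 12px 0;",
        BlockType.LIST_ITEM: "font-family:Arial,sans-serif;font-size:15px;line-height:22px;padding:0 0 6px 16px;",
    }

    def render(self, blocks: list[ContentBlock]) -> str:
        rows = []
        for block in blocks:
            text = escape(block.text)
            if block.type == BlockType.LIST_ITEM:
                text = "&bull; " + text  # a bullet entity instead of <ul>/<li>
            rows.append(f'<tr><td style="{self.STYLES[block.type]}">{text}</td></tr>')
        return (
            '<table role="presentation" width="100%" cellpadding="0" cellspacing="0" border="0">\n'
            + "\n".join(rows)
            + "\n</table>"
        )


email_html = EmailRendererSketch().render([
    ContentBlock(BlockType.HEADING, "Why Caching Matters", level=2),
    ContentBlock(BlockType.LIST_ITEM, "Lower cost & lower latency"),
])
print(email_html)
```

Nothing here relies on a stylesheet or a semantic tag, which is the whole point: each cell carries its own styling, so a client that strips `<style>` blocks or ignores `<section>` still has everything it needs.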

I tried asking the LLM to "improve" each format directly, once per format, instead of using the renderer. Quality went up slightly. Voice consistency went down immediately — the exact problem this whole architecture exists to prevent. If you're tempted to skip the renderer "just this once," don't. That's how you end up rewriting this for a fourth time, like I did.

This is running in production, formatting roughly 12,000 pieces of content a month across blog, social, and email, with zero additional LLM calls per format. Parsing and rendering a typical 800-word article into all three outputs runs in single-digit milliseconds — it's pure string processing, no I/O.

I thought about whether to open-source this longer than I expected to. The honest answer: this layer isn't PostAll's differentiator. The generation quality, the prompt strategy, the content pipeline around it: that's the part I'm not open-sourcing. The renderer is infrastructure, not strategy, and infrastructure gets better when more people poke at it.

I also genuinely don't have time to build a PDF renderer, an RSS renderer, or a Slack-message renderer myself. If this is useful to you, extending it is a five-minute job: implement one render() method against the same ContentBlock list.
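As an illustration of how small a new renderer can be, here is a sketch of a Slack-style one. Everything in it is an assumption rather than code from the repo: `Renderer` and `SlackRendererSketch` are hypothetical names, and the data model is repeated so the snippet is self-contained. It leans on two facts about Slack's mrkdwn format: bold is written with single asterisks, and there is no heading syntax.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Protocol


class BlockType(Enum):
    HEADING = "heading"
    PARAGRAPH = "paragraph"
    LIST_ITEM = "list_item"


@dataclass
class ContentBlock:
    type: BlockType
    text: str
    level: Optional[int] = None


class Renderer(Protocol):
    """The only contract a renderer has to meet (hypothetical name)."""

    def render(self, blocks: list[ContentBlock]) -> str: ...


class SlackRendererSketch:
    def render(self, blocks: list[ContentBlock]) -> str:
        lines = []
        for block in blocks:
            if block.type == BlockType.HEADING:
                # No heading syntax in Slack, so a heading becomes one bold line
                lines.append("*" + block.text.replace("**", "") + "*")
            elif block.type == BlockType.LIST_ITEM:
                lines.append("• " + self._bold(block.text))
            else:
                lines.append(self._bold(block.text))
        return "\n".join(lines)

    @staticmethod
    def _bold(text: str) -> str:
        # Markdown **bold** becomes Slack's single-asterisk *bold*
        return re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)


renderer: Renderer = SlackRendererSketch()  # matches structurally, no base class needed
message = renderer.render([
    ContentBlock(BlockType.HEADING, "The Tradeoff", level=2),
    ContentBlock(BlockType.PARAGRAPH, "Stale data is the **price** you pay."),
    ContentBlock(BlockType.LIST_ITEM, "Set a TTL"),
])
print(message)
```

The renderer never touches the parser or the other renderers. It only reads the block list, which is why adding a format does not risk changing the existing outputs.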

It's MIT licensed. No catch.

The repo includes the full EmailRenderer, the tag-balance truncation logic I skipped above, and a test suite covering the markdown edge cases that broke me during development. Link's in the comments below; I'd rather you find it there than trust a link I typed into an article.

What output format would you actually want from something like this — Slack messages, RSS, PDF? I'm planning the next renderer based on whatever gets the most replies here.

source & further reading

dev.to, original article
