A flat per-call endpoint for summarize / classify / extract in your n8n and Make automations


Source: dev.to · Published: 2026-06-28T13:00Z · Topic: developer-tools


A developer introduced Modelis, an OpenAI-compatible gateway that charges a flat per-call price for bounded-output tasks like summarization, classification, and extraction, making it suitable for high-volume automations in n8n and Make. The service caps output at ~1024 tokens and auto-routes requests to a fitting model, ensuring predictable costs. The developer also released an open-source adapter for local use.

2 min read · Published Jun 28, 2026

If you run automations that summarize, classify, or pull fields out of text at volume, the LLM step is where per-token pricing turns budgeting into a guessing game: one batch of long inputs and the bill spikes. For these bounded-output jobs, a flat price per call fits better than a per-token frontier model. Here is how I wire it into n8n / Make, and when not to.

Automation runs are repetitive and high-volume, and the outputs are short by nature: a summary, a label, a few extracted fields. I route them through Modelis, an OpenAI-compatible gateway that auto-routes each request to a fitting model and charges a flat price per call with output capped at ~1024 tokens. Because the output is bounded, each run costs the same and your monthly total stays predictable no matter the input size.
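To make the budgeting argument concrete, here is a minimal sketch of the arithmetic. All prices are hypothetical placeholders chosen for illustration, not Modelis or any vendor's actual rates; the point is only that per-token cost grows with input length while a flat per-call cost depends on the call count alone.

```python
# Hypothetical prices for illustration only; substitute your real rates.
INPUT_PRICE_PER_1K = 0.003   # assumed per-token model: USD per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.015  # assumed per-token model: USD per 1,000 output tokens
FLAT_PRICE_PER_CALL = 0.002  # assumed flat gateway price: USD per call


def per_token_cost(input_tokens, output_tokens,
                   in_price_per_1k=INPUT_PRICE_PER_1K,
                   out_price_per_1k=OUTPUT_PRICE_PER_1K):
    """Cost of one call when both input and output tokens are billed."""
    return (input_tokens / 1000 * in_price_per_1k
            + output_tokens / 1000 * out_price_per_1k)


def flat_cost(calls, price_per_call=FLAT_PRICE_PER_CALL):
    """Cost of a batch under flat pricing: input size does not appear at all."""
    return calls * price_per_call


# Same 1,000 calls with the same short output, but very different input sizes.
short_inputs = sum(per_token_cost(500, 100) for _ in range(1000))
long_inputs = sum(per_token_cost(20_000, 100) for _ in range(1000))

print(f"per-token, short inputs: ${short_inputs:.2f}")
print(f"per-token, long inputs:  ${long_inputs:.2f}")
print(f"flat, either batch:      ${flat_cost(1000):.2f}")
```

Whether the flat price is actually cheaper depends on your real rates and typical input length; the sketch only shows which variable each bill depends on.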

It is a standard OpenAI-compatible POST /v1/chat/completions. Use an HTTP Request node:

Method: POST

URL: https://modelis-auto-chat.p.rapidapi.com/v1/chat/completions

Headers: x-rapidapi-host: modelis-auto-chat.p.rapidapi.com, x-rapidapi-key: YOUR_KEY, content-type: application/json

Body: {"model":"modelis-auto","messages":[{"role":"user","content":"Label sentiment (positive/negative/neutral): {{ $json.text }}"}]}

The curl equivalent:

curl --request POST \
  --url https://modelis-auto-chat.p.rapidapi.com/v1/chat/completions \
  --header 'content-type: application/json' \
  --header 'x-rapidapi-host: modelis-auto-chat.p.rapidapi.com' \
  --header 'x-rapidapi-key: YOUR_KEY' \
  --data '{"model":"modelis-auto","messages":[{"role":"user","content":"Summarize in 2 sentences: ..."}]}'
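If you want to test the same call outside n8n or Make, the request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl command above; the function names `build_request` and `send` are hypothetical helpers introduced here, and the response is assumed to be JSON in the usual OpenAI chat-completions shape.

```python
import json
import urllib.request

MODELIS_URL = "https://modelis-auto-chat.p.rapidapi.com/v1/chat/completions"
RAPIDAPI_HOST = "modelis-auto-chat.p.rapidapi.com"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": "modelis-auto",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        MODELIS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-rapidapi-host": RAPIDAPI_HOST,
            "x-rapidapi-key": api_key,
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (assumed OpenAI-style)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Usage would be `send(build_request("Summarize in 2 sentences: ...", "YOUR_KEY"))`. Keeping the builder separate from the network call makes it easy to inspect the payload before spending a call.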

If you would rather use a built-in OpenAI node that expects an Authorization: Bearer key and a custom base URL, run the tiny open-source adapter next to your workflow runner:

npx modelis-openai      # local proxy on 127.0.0.1:8787, MIT, ~120 lines

Then point the node at http://127.0.0.1:8787/v1 with model modelis-auto.
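For a quick check that the adapter is reachable before wiring up a node, a request against the local proxy might look like the sketch below. It assumes the adapter serves the usual OpenAI path `/chat/completions` under the base URL and accepts your key as a Bearer token, which is what an OpenAI-style node would send; `build_adapter_request` is a hypothetical helper name.

```python
import json
import urllib.request

ADAPTER_BASE_URL = "http://127.0.0.1:8787/v1"  # local proxy started by npx modelis-openai


def build_adapter_request(prompt: str, api_key: str,
                          base_url: str = ADAPTER_BASE_URL) -> urllib.request.Request:
    """Build an OpenAI-style request for the local adapter.

    Assumption: the adapter exposes /chat/completions under the base URL
    and reads the key from the Authorization: Bearer header.
    """
    body = {
        "model": "modelis-auto",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it; if the call fails with a connection error, the proxy is probably not running on that port.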

Prompts that fit this pattern:

Summarize in 2 sentences: ...

Label sentiment (positive/negative/neutral): ...

Return JSON with {name, email, company} from: ...

All produce short outputs, so the flat per-call price keeps high-volume runs cheap to reason about.
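Short outputs are also easy to validate before the next automation step consumes them. The sketch below assumes the response follows the OpenAI chat-completions shape (`choices[0].message.content`), which is what an OpenAI-compatible gateway would normally return; the helper names are hypothetical. It checks the sentiment label against the allowed set and pulls the `{name, email, company}` object out of the extraction reply even if the model adds text around it.

```python
import json

ALLOWED_LABELS = {"positive", "negative", "neutral"}
REQUIRED_FIELDS = ("name", "email", "company")


def message_text(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat response (assumed shape)."""
    return response["choices"][0]["message"]["content"]


def parse_label(response: dict) -> str:
    """Normalize a sentiment reply and reject anything outside the allowed set."""
    label = message_text(response).strip().lower().rstrip(".")
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label


def parse_extraction(response: dict) -> dict:
    """Extract the JSON object from the reply and require all expected fields."""
    text = message_text(response)
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    record = json.loads(text[start:end + 1])
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {field: record[field] for field in REQUIRED_FIELDS}
```

Raising on a bad label or a missing field lets the workflow branch to a retry or a manual-review step instead of passing malformed data downstream.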

Long-form generation (articles, whole files, large code) will hit the ~1024-token cap and get truncated. Keep a high-output model for those. Use this for the short, structured outputs that automations actually need.
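One way to catch truncation in a workflow is to inspect the response before using it. In the OpenAI chat-completions format, `finish_reason` is `"length"` when generation stopped at the token limit; whether this gateway populates that field the same way is an assumption worth verifying against a real response. The helper names and the 1024 default below are illustrative.

```python
OUTPUT_TOKEN_CAP = 1024  # approximate cap stated in the article


def was_truncated(response: dict) -> bool:
    """True if an OpenAI-style response reports it stopped at the token limit.

    Assumption: the gateway sets finish_reason like the OpenAI API does.
    """
    return response["choices"][0].get("finish_reason") == "length"


def fits_flat_endpoint(expected_output_tokens: int,
                       cap: int = OUTPUT_TOKEN_CAP) -> bool:
    """Rough pre-check: send a task here only if its output should fit under the cap."""
    return expected_output_tokens <= cap


# Example: a two-sentence summary fits; a full article does not.
print(fits_flat_endpoint(80))
print(fits_flat_endpoint(4000))
```

In n8n or Make the same check can drive an IF or router step that sends oversized jobs to a high-output model.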

I built the adapter. I am most curious which extraction and classification tasks the routing handles well versus badly. If you point an automation at it, I would love to hear how it routed.


Read original on dev.to → dev.to/chenxiao5580cmd/a-flat-per-call-endpoint-…
