If you're using AI agents like Manus AI, Claude, or ChatGPT with API access, you've probably noticed something frustrating: every task gets the same expensive model, regardless of complexity.
A simple "rename this variable" task burns the same credits as "analyze this 50-page legal document." That's like hiring a senior architect to hang a picture frame.
After burning through my monthly Manus credits in just 2 weeks, I decided to build a solution.
The core idea is simple: analyze task complexity BEFORE execution, then route to the appropriate model tier.
Here's the decision tree:

    Task Input → Complexity Analyzer → Score (1-10)
                                          ↓
    Score >= 8  → Opus/GPT-4 (expensive, high quality)
    Score 4-7   → Sonnet/GPT-4o (balanced)
    Score <= 3  → Flash/GPT-4o-mini (cheap, fast)

The scoring considers multiple factors:
| Factor | Weight | Examples |
|---|---|---|
| Token count | 20% | Long prompts = higher complexity |
| Domain keywords | 25% | "analyze", "research", "compare" = high |
| Output requirements | 25% | Code generation, multi-step = high |
| Context dependency | 15% | References previous work = higher |
| Creativity demand | 15% | "brainstorm", "innovate" = high |
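
One way to read the weights in the table is as a weighted sum that gets stretched onto the 1-10 scale. The sketch below is only an illustration of that idea: it assumes each factor has already been rated between 0 and 1 (how those ratings are produced is left open), and the names `WEIGHTS` and `weighted_score` are hypothetical.

```python
from typing import Dict

# Weights copied from the factor table above (they sum to 1.0).
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS: Dict[str, float] = {
    "token_count": 0.20,
    "domain_keywords": 0.25,
    "output_requirements": 0.25,
    "context_dependency": 0.15,
    "creativity_demand": 0.15,
}


def weighted_score(ratings: Dict[str, float]) -> int:
    """Map per-factor ratings in [0, 1] to an integer score in [1, 10]."""
    total = 0.0
    for factor, weight in WEIGHTS.items():
        rating = ratings.get(factor, 0.0)    # missing factors count as 0
        rating = max(0.0, min(1.0, rating))  # clamp out-of-range ratings
        total += weight * rating
    # total lies in [0, 1]; stretch it onto the 1-10 scale.
    return 1 + round(total * 9)


if __name__ == "__main__":
    # A task that only trips the keyword factor, at full strength.
    print(weighted_score({"domain_keywords": 1.0}))
```

With this scheme no single factor can push a task into the top tier on its own, which is the point of spreading the weight across five factors.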
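
    import re

    HIGH_COMPLEXITY_KEYWORDS = [
        "analyze", "research", "compare", "synthesize",
        "architect", "design system", "debug complex",
    ]
    LOW_COMPLEXITY_KEYWORDS = ["rename", "format", "list", "simple", "quick"]


    def count_tokens(text: str) -> int:
        # Rough stand-in (an assumption, not a real tokenizer): one token
        # per whitespace-separated word. Swap in your provider's tokenizer.
        return len(text.split())


    def route_task(task_description: str) -> str:
        # Simplified version: only token count and keywords are scored;
        # the other factors from the table are left out.
        score = 0
        text = task_description.lower()

        tokens = count_tokens(task_description)
        if tokens > 2000:
            score += 2
        elif tokens > 500:
            score += 1

        # Match at word starts only, so "format" does not fire inside
        # "information" while "architect" still covers "architecture".
        for kw in HIGH_COMPLEXITY_KEYWORDS:
            if re.search(rf"\b{re.escape(kw)}", text):
                score += 2
        for kw in LOW_COMPLEXITY_KEYWORDS:
            if re.search(rf"\b{re.escape(kw)}", text):
                score -= 1

        # Clamp to the 1-10 scale used by the decision tree.
        score = max(1, min(10, score))

        if score >= 8:
            return "opus"    # Most expensive, highest quality
        elif score >= 4:
            return "sonnet"  # Balanced
        else:
            return "flash"   # Cheapest, fastest
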
After implementing this system in my Manus AI workflow:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Monthly credit usage | 100% in 14 days | 100% in 30+ days | 2x+ duration |
| Simple task cost | Same as complex | 70% cheaper | -70% |
| Complex task quality | Baseline | Same or better | No degradation |
| Average response time | 8-12s | 3-8s (simple tasks faster) | -40% |
The key insight: ~60% of daily tasks are simple enough for the cheapest model tier, but without routing, they all consume premium credits.
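
A quick back-of-the-envelope check shows how much of the longer credit lifetime the simple tasks alone can explain. The sketch below plugs in the 60% share and the 70% discount from this post. It assumes every task cost the same before routing and that only simple tasks get cheaper, which are simplifications; `blended_cost` is a hypothetical helper name.

```python
def blended_cost(simple_share: float, simple_discount: float) -> float:
    """Relative spend after routing, where 1.0 is the unrouted baseline.

    Assumes equal per-task cost before routing and that only the simple
    share of tasks becomes cheaper.
    """
    discounted = simple_share * (1.0 - simple_discount)
    unchanged = 1.0 - simple_share
    return discounted + unchanged


relative_cost = blended_cost(simple_share=0.60, simple_discount=0.70)
days = 14 / relative_cost  # 14 days was the pre-routing credit lifetime
print(f"relative cost: {relative_cost:.2f}, credits last about {days:.0f} days")
# prints: relative cost: 0.58, credits last about 24 days
```

Under these assumptions the cheap tier alone stretches 14 days to roughly 24. The rest of the observed 30+ days would have to come from somewhere else, for example tasks that moved from the premium tier to the balanced one.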
I packaged this into a skill called Credit Optimizer that works as a pre-processing layer.
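
A pre-processing layer of this kind can be sketched as a wrapper that asks a router for a tier and only then hands the task to that tier's executor. This is not the Credit Optimizer's actual code, just a minimal illustration: `make_preprocessor`, `Router`, and `Executor` are hypothetical names, and the executors here are stand-ins for real model API calls.

```python
from typing import Callable, Dict

# Hypothetical types: a router maps a task to a tier name, and each tier
# has an executor that would call the corresponding model.
Router = Callable[[str], str]
Executor = Callable[[str], str]


def make_preprocessor(router: Router, executors: Dict[str, Executor]) -> Executor:
    """Wrap tier executors so every task is routed before it runs."""

    def run(task: str) -> str:
        tier = router(task)
        if tier not in executors:
            raise KeyError(f"no executor registered for tier {tier!r}")
        return executors[tier](task)

    return run


if __name__ == "__main__":
    # Toy router and executors for demonstration only.
    def toy_router(task: str) -> str:
        return "opus" if "analyze" in task.lower() else "flash"

    run = make_preprocessor(
        toy_router,
        {
            "opus": lambda task: f"[opus] {task}",
            "flash": lambda task: f"[flash] {task}",
        },
    )
    print(run("Rename this variable"))
    print(run("Analyze this 50-page legal document"))
```

Keeping the router and the executors as plain callables means the routing logic can be tested without spending any credits.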
The architecture is model-agnostic: it works with any AI service that offers multiple model tiers.
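
To make the model-agnostic claim concrete, here is a minimal sketch that keeps the score thresholds fixed and swaps only a lookup table per provider. The provider keys, `TIER_MODELS`, `tier_for_score`, and `pick_model` are hypothetical, and the model names are the labels from the decision tree earlier in this post, not exact API model identifiers.

```python
from typing import Dict

# Tier labels per provider. Keys and structure are assumptions made for
# this sketch; the model names are labels, not API identifiers.
TIER_MODELS: Dict[str, Dict[str, str]] = {
    "openai": {"high": "GPT-4", "mid": "GPT-4o", "low": "GPT-4o-mini"},
    "mixed": {"high": "Opus", "mid": "Sonnet", "low": "Flash"},
}


def tier_for_score(score: int) -> str:
    """Apply the thresholds from the decision tree (>= 8, 4-7, <= 3)."""
    if score >= 8:
        return "high"
    if score >= 4:
        return "mid"
    return "low"


def pick_model(provider: str, score: int) -> str:
    """Look up the model label for a provider and a complexity score."""
    return TIER_MODELS[provider][tier_for_score(score)]


if __name__ == "__main__":
    print(pick_model("openai", 9))
    print(pick_model("mixed", 2))
```

Adding another service is then a matter of adding one more entry to the table, as long as it exposes at least three tiers.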
Why not route everything to the cheapest model? Because quality matters. Complex tasks genuinely need powerful models. The optimizer ensures you get the RIGHT model for each task: not always the cheapest, not always the most expensive.
The Credit Optimizer is available at creditopt.ai — it includes:
I'm working on:
Have you built something similar? I'd love to hear about different approaches to AI cost optimization. Drop a comment below or find me on creditopt.ai.