CHE MCP — Building Argentina's First National MCP Ecosystem: 5-Stage Classifier, WMA Online Learning, 748 Datasets

A developer in Bahía Blanca, Argentina built CHE MCP, the country's first national MCP ecosystem that connects AI agents to 80+ official Argentine data sources through a single server. The system uses a 5-stage classifier with a Weighted Majority Algorithm for online learning, achieving 95.45% accuracy on MCPAgentBench, and serves 748 Parquet datasets with natural language to SQL conversion.

Argentina just got its first national MCP ecosystem, and it was built from Bahía Blanca. CHE MCP is an intelligent gateway that connects any AI agent with real-time Argentine data: dollar exchange rates, weather, football, tax compliance (ARCA), inflation, public transit. That is 80+ official data sources through a single MCP server.

Why does this matter? Because right now, if you want your AI to answer "¿cuánto está el dólar blue?" ("what is the blue dollar rate?"), you either Google it yourself or install 80 different MCP servers. CHE MCP solves that with a gateway that understands natural language in Spanish and routes queries automatically.

A query such as "dolar blue hoy" ("blue dollar today") passes through five stages in order:

1. Keyword matching: 3,000+ keywords across 182 classified domains.
2. WMA weighted routing: a Weighted Majority Algorithm that learns from every query.
3. Semantic embeddings: 384-dimensional vectors (all-MiniLM-L6-v2) with a Jaccard fallback.
4. Data Node search: DuckDB SQL over 748 Parquet datasets, plus NL-to-SQL.
5. LLM fallback: an external endpoint (optional, configurable).

The response comes back in a form like "Dólar blue: $1,245 / $1,265 (compra/venta)", that is, the buy and sell rates.

The Weighted Majority Algorithm (WMA) is an online learning system embedded directly in the router. Every domain starts with equal weight (1.0). When a query succeeds, the winning domain is reinforced (+0.1). When it fails, the domain is penalized (−0.1). Weights are bounded to [0.1, 5.0] and persisted to disk, so the router starts warm and improves with every query.

Benchmark: 95.45% Top-First-Score accuracy on MCPAgentBench (66 diverse queries).

The Data Node serves 748 Parquet datasets from datos.gob.ar (Argentina's open data portal), compressed 9.92× with Zstd (404 MB vs 3.92 GB as CSV). It converts natural language to SQL:

- User: "¿Cuánto aumentó la inflación en 2024?" ("How much did inflation rise in 2024?")
- Generated SQL, run on DuckDB: `SELECT AVG(valor) FROM indice_precios_consumidor WHERE fecha BETWEEN '2024-01-01' AND '2024-12-31'`
- Result: 117.8% annual

Every query runs behind SQL injection guardrails, read-only enforcement, a 5-second timeout, and a 1,000-row result limit.

| Pattern | Implementation |
|---|---|
| 3-tier cache | In-memory LRU (200 entries) → disk (atomic writes) → live CKAN |
| Circuit breaker | Per-dataset, 3-failure threshold, 60s cooldown, serves stale data |
| Request collapsing | Concurrent identical queries share a single upstream fetch |
| Predictive pre-fetch | Top-10 hot datasets refresh every 15 minutes |
| Rate limiting | Token bucket per API key, 100 req/min, noisy neighbor isolation |

The Model Context Protocol is undergoing its biggest architectural update in July 2026: mandatory Streamable HTTP transport and a stateless architecture. CHE MCP was architected for this from day one.

Built from Bahía Blanca, Argentina 🇦🇷 with [Gentle AI](https://github.com/Gentleman-Programming/gentle-ai)'s SDD orchestration + [Engram](https://github.com/Gentleman-Programming/engram) persistent memory.

Full technical documentation: [github.com/Albano-schz/che-mcp-docs](https://github.com/Albano-schz/che-mcp-docs)

What questions do you have about building MCP ecosystems at national scale?
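For readers who want to see the weight update in code, here is a minimal sketch of the rule as the post describes it: start at 1.0, add 0.1 on success, subtract 0.1 on failure, clamp to [0.1, 5.0], and persist to disk. Note that this is an additive update, not the multiplicative update of the textbook Weighted Majority Algorithm. The class name `DomainWeights`, the JSON file format, and the way `rank` multiplies a keyword score by the learned weight are all assumptions made for illustration, not CHE MCP's actual code.

```python
import json
from pathlib import Path


class DomainWeights:
    """Additive per-domain weights (hypothetical helper, not CHE MCP's API)."""

    INITIAL, STEP, LOW, HIGH = 1.0, 0.1, 0.1, 5.0

    def __init__(self, path=None):
        # Optional JSON file so the router can start warm (format is assumed).
        self.path = Path(path) if path else None
        self.weights = {}
        if self.path and self.path.exists():
            self.weights = json.loads(self.path.read_text())

    def get(self, domain):
        # Unseen domains start with the same weight.
        return self.weights.get(domain, self.INITIAL)

    def update(self, domain, success):
        # Reinforce on success, penalize on failure, then clamp to the bounds.
        delta = self.STEP if success else -self.STEP
        new = min(self.HIGH, max(self.LOW, self.get(domain) + delta))
        self.weights[domain] = round(new, 6)  # avoid float drift like 1.2000000000000002
        if self.path:
            self.path.write_text(json.dumps(self.weights))
        return self.weights[domain]

    def rank(self, candidates):
        # Assumption: combine a stage-1 score with the learned weight by multiplying.
        return sorted(candidates, key=lambda d: candidates[d] * self.get(d), reverse=True)
```

Usage: call `update("finanzas", True)` after a query routed to a hypothetical `finanzas` domain succeeds, and pass the candidate scores from the keyword stage to `rank` to get domains in routing order.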
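The circuit breaker row in the table (per-dataset, 3-failure threshold, 60s cooldown, serves stale data) can be sketched roughly as below. This is an illustration under assumptions: the class name `DatasetBreaker`, the `loader` callback, and the choice to count consecutive failures are hypothetical, and the real implementation may differ.

```python
import time


class DatasetBreaker:
    """Per-dataset circuit breaker that falls back to the last good value."""

    def __init__(self, threshold=3, cooldown=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock      # injectable so the cooldown can be tested
        self.failures = {}      # dataset id -> consecutive failures
        self.opened_at = {}     # dataset id -> time the breaker opened
        self.stale = {}         # dataset id -> last successfully fetched value

    def is_open(self, dataset):
        opened = self.opened_at.get(dataset)
        return opened is not None and self.clock() - opened < self.cooldown

    def fetch(self, dataset, loader):
        # While open, skip the upstream entirely and serve stale data.
        if self.is_open(dataset):
            return self._stale(dataset)
        try:
            value = loader(dataset)
        except Exception:
            count = self.failures.get(dataset, 0) + 1
            self.failures[dataset] = count
            if count >= self.threshold:
                self.opened_at[dataset] = self.clock()
            return self._stale(dataset)
        # Success closes the breaker and refreshes the stale copy.
        self.failures[dataset] = 0
        self.opened_at.pop(dataset, None)
        self.stale[dataset] = value
        return value

    def _stale(self, dataset):
        if dataset in self.stale:
            return self.stale[dataset]
        raise LookupError(f"no cached copy of {dataset!r}")
```

Keeping one failure counter per dataset means a single broken upstream resource does not stop the other datasets from refreshing.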
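The rate limiting row (token bucket per API key, 100 req/min) could look something like the following sketch. The class name `TokenBucketLimiter`, the burst capacity equal to the per-minute rate, and the continuous refill are assumptions; the post does not specify them.

```python
import time


class TokenBucketLimiter:
    """One token bucket per API key, so a noisy key only drains its own bucket."""

    def __init__(self, rate_per_min=100, clock=time.monotonic):
        self.capacity = float(rate_per_min)      # assumed burst size
        self.refill_per_sec = rate_per_min / 60.0
        self.clock = clock                       # injectable for testing
        self.buckets = {}                        # api key -> (tokens, last refill time)

    def allow(self, api_key):
        now = self.clock()
        # New keys start with a full bucket.
        tokens, last = self.buckets.get(api_key, (self.capacity, now))
        # Refill in proportion to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[api_key] = (tokens, now)
        return allowed
```

Because each key has its own bucket, one client exhausting its 100 requests has no effect on the budget of any other key.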