# How an AI Terminal Assistant Became My Team's Most Productive Engineer - Opencode + Claude + MCP

> Source: <https://dev.to/velumal09/how-an-ai-terminal-assistant-became-my-teams-most-productive-engineer-opencode-claude-mcp-362i>
> Published: 2026-06-24 18:31:54+00:00

It was 11pm on a Tuesday. A cache migration in our production environment had just caused thousands of authentication failures for two of our largest enterprise customers. Our VP of Product wanted answers. Our support team was fielding escalations. And our engineers were alt-tabbing between the AWS console, Datadog, GitHub, Azure DevOps, and PagerDuty, trying to piece together what happened.

Three weeks later, when we needed to attempt the same change again, an engineer typed this into a terminal:

"Review the ADO change ticket, compare the MOP against the actual ElastiCache configuration in prod region, check the K8s config repo for how Redis env vars are wired on the Green cluster, and tell me if this approach avoids the token validation failure that caused the previous customer impact."

**Fourteen seconds later**, the system had pulled the work item, queried AWS ElastiCache across four regions, read the Kubernetes configuration from GitHub, cross-referenced the deployment patches, and delivered a precise technical assessment including a risk it identified that the team hadn't documented: in-flight tokens during the 30–60 second Global Accelerator propagation window.

That system is **OpenCode**, an AI-powered CLI assistant connected to our entire operational stack through MCP (the Model Context Protocol). It has fundamentally changed how a 20-person platform engineering team manages infrastructure serving thousands of enterprise tenants and processing millions of authentication requests daily.

OpenCode is deceptively simple in concept. A terminal application on an engineer's laptop. You type questions or tasks in plain English. It responds with answers pulled from **live production systems**.

```
  Engineer (terminal)
        │
        ▼
    OpenCode (Claude AI)
        │
        ▼
    MCP Servers
  ╱   │   │   │   │   ╲
 ▼    ▼   ▼   ▼   ▼    ▼
AWS   DD  GH ADO  PD   RD

 AWS = Amazon Web Services (prod + non-prod)
 DD  = Datadog (logs, metrics, monitors)
 GH  = GitHub (repos, PRs, code)
 ADO = Azure DevOps (tickets, sprints, wikis)
 PD  = PagerDuty (incidents, schedules)
 RD  = Rundeck (jobs, executions)
```

The magic is in those MCP servers. Each one is a lightweight connector to a backend platform. When you ask a question, the AI doesn't guess — it makes **real API calls** against **real systems** and works with **real data**.

Ask *"what's our AWS spend this month?"* — it queries Cost Explorer. Ask *"which tenant generates the most provisioning traffic?"* — it aggregates Datadog logs. Ask *"what did that PR change in the K8s config repo?"* — it reads the actual file diff from GitHub. Ask all three in the same sentence and it does them in parallel.
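The parallel fan-out can be pictured with a small sketch. This is not OpenCode's internal code. It only assumes that each tool call is an independent, I/O-bound request, so the three can be awaited together. The three coroutine names are hypothetical stand-ins for MCP tool calls, and the answers are placeholders rather than real data.

```python
import asyncio

# Hypothetical stand-ins for MCP tool calls. Real ones would reach Cost Explorer,
# Datadog, and GitHub through their MCP servers.
async def aws_month_to_date_spend() -> dict:
    await asyncio.sleep(0.01)  # simulate network latency
    return {"source": "aws", "answer": "<spend from Cost Explorer>"}

async def datadog_top_tenant() -> dict:
    await asyncio.sleep(0.01)
    return {"source": "datadog", "answer": "<tenant with the most provisioning logs>"}

async def github_pr_diff() -> dict:
    await asyncio.sleep(0.01)
    return {"source": "github", "answer": "<file diff from the PR>"}

async def answer_all() -> list:
    # gather() runs the three coroutines concurrently and returns their
    # results in the order the calls were passed in.
    return await asyncio.gather(
        aws_month_to_date_spend(),
        datadog_top_tenant(),
        github_pr_diff(),
    )

results = asyncio.run(answer_all())
for result in results:
    print(result["source"], "->", result["answer"])
```

Because the calls do not depend on each other, total latency is roughly that of the slowest call rather than the sum of all three.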

No pre-built dashboards. No saved queries. No runbooks to follow. You just **ask**.

The entire configuration is a single JSON file. Each MCP server gets a block: here's the server binary, here are the credentials, connect.

``` json
{
  "mcpServers": {
    "aws-prod": {
      "command": "aws-mcp-server",
      "env": { "AWS_PROFILE": "prod" }
    },
    "datadog": {
      "command": "datadog-mcp-server",
      "env": {
        "DD_API_KEY": "...",
        "DD_APP_KEY": "..."
      }
    },
    "github": {
      "command": "github-mcp-server",
      "env": { "GITHUB_TOKEN": "..." }
    }
  }
}
```

The AI model never sees the credentials. It calls tools by name (*"search logs in Datadog"* or *"describe EKS clusters"*), and the MCP server handles authentication, pagination, error handling, and response formatting.
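To make that separation concrete, here is a minimal sketch of a tool wrapper that keeps the credential and the pagination cursors on the server side. Everything in it is an assumption for illustration: `make_search_tool`, `fake_fetch`, the `DEMO_API_TOKEN` variable, and the two-page backend are hypothetical, not part of any real MCP server.

```python
import os
from typing import Callable, Optional

# Hypothetical backend call: (token, query, cursor) -> (items, next_cursor or None).
FetchPage = Callable[[str, str, Optional[str]], tuple[list[str], Optional[str]]]

def make_search_tool(fetch_page: FetchPage, token_env: str) -> Callable[[str], list[str]]:
    """Build a tool whose caller never handles the credential or the cursors."""
    def search_logs(query: str) -> list[str]:
        token = os.environ[token_env]  # read on the server side only
        items: list[str] = []
        cursor: Optional[str] = None
        while True:
            page, cursor = fetch_page(token, query, cursor)
            items.extend(page)
            if cursor is None:  # backend reports no further pages
                return items
    return search_logs

# Fake two-page backend used only for this demonstration.
def fake_fetch(token: str, query: str, cursor: Optional[str]):
    if token != "demo-token":
        raise PermissionError("bad token")
    if cursor is None:
        return [f"{query}: line 1", f"{query}: line 2"], "page-2"
    return [f"{query}: line 3"], None

os.environ["DEMO_API_TOKEN"] = "demo-token"
search_logs = make_search_tool(fake_fetch, "DEMO_API_TOKEN")
print(search_logs("status:error"))
```

The caller passes only a query string and gets back the merged pages. The token never appears in the arguments or the return value.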

**Adding a new system takes about ten minutes.** Write a config block, provide credentials, restart.
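As a rough sketch of that "write a config block" step, the helper below adds one server entry to a config shaped like the JSON above. The `add_mcp_server` helper, the `pagerduty-mcp-server` binary name, and the `PAGERDUTY_API_KEY` variable are hypothetical names chosen for illustration.

```python
import json

def add_mcp_server(config: dict, name: str, command: str, env: dict) -> dict:
    """Return a copy of the config with one more server block."""
    servers = dict(config.get("mcpServers", {}))
    if name in servers:
        raise ValueError(f"server {name!r} is already configured")
    servers[name] = {"command": command, "env": dict(env)}
    return {**config, "mcpServers": servers}

existing = {
    "mcpServers": {
        "github": {"command": "github-mcp-server", "env": {"GITHUB_TOKEN": "..."}}
    }
}

# Binary and variable names below are assumptions, not real package names.
updated = add_mcp_server(
    existing, "pagerduty", "pagerduty-mcp-server", {"PAGERDUTY_API_KEY": "..."}
)
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary instead of mutating the input, so a failed edit cannot leave a half-written config in memory.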

Here's something that changes how you think about AI assistants: **you can create focused sessions with a single purpose**.

Right now, as I write this article, I have an OpenCode session that's been running for days as a **documentation advisor**. It's reviewed my architecture docs, drafted technical articles, generated formal roadmap documents, and is tracking project milestones. When I start a new conversation about something unrelated, I can tell the session: *"This session is reserved for documentation work only"* — and it keeps me focused.

This pattern works for any focused workstream:

| Session | Purpose | Tools Used |
|---|---|---|
| Documentation Advisor | Article drafting, roadmap generation, technical writing | Doc Agent, GitHub, web search |
| Incident Responder | Active incident investigation and RCA | Datadog, GitHub, PagerDuty, AWS |
| Cost Analyst | Monthly spend review, waste identification | AWS (Cost Explorer, EC2, RDS, S3) |
| Sprint Planner | Ticket creation, backlog grooming, capacity planning | Azure DevOps, GitHub |
| Security Reviewer | Code review, vulnerability assessment | GitHub, AWS (IAM, SecurityHub) |

Each session maintains context across the entire conversation. The AI remembers what you discussed 3 hours ago. It builds on previous findings. It doesn't start from zero every time.

Beyond focused sessions, you can create **sub-agents** — specialized configurations trained for specific domains:

A documentation agent (the Doc Agent) generates formal documents: postmortems, RCA reports, roadmaps, technical specs. It knows document templates and formatting standards, and it outputs polished Word/PDF files.

I used this to generate formal migration roadmaps, architecture documents, and execution playbooks — all properly formatted, ready to share with leadership.

An Azure DevOps agent creates, updates, and queries work items. It understands your project structure, sprint cadence, and ticket hierarchy (Epic → Feature → Task).

One prompt: *"Create a Feature under the cleanup Epic with 8 tasks — one per batch"* — and 9 tickets exist with proper hierarchy, descriptions, and assignments.

An AWS agent queries across all regions and all services: cost analysis, resource inventory, security posture review. It runs in read-only mode with separate IAM profiles for prod vs. non-prod.

An incident agent is connected to PagerDuty, Datadog, and GitHub. When an alert fires, it pulls the monitor definition, searches logs, checks recent deployments, and synthesizes findings. This is the agent that eventually became **FRIDAY**, but more on that later.

A GitHub agent handles code review, PR analysis, and repository search. It reads actual code and configs, not summaries. When someone asks *"what changed in the proxy config last week?"*, it reads every commit.

Any system with an API can become an MCP server. The pattern is:

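``` python
# A minimal custom MCP server (simplified), using FastMCP from the MCP Python SDK:
import os
import requests
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-search")
API_KEY = os.environ["API_KEY"]  # assumed variable name, set via the "env" config block

@mcp.tool()
def query_my_system(query: str) -> str:
    """Query our internal API"""
    response = requests.get(
        "https://internal-api.company.com/search",
        params={"q": query},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,  # never let a tool call hang indefinitely
    )
    response.raise_for_status()  # surface HTTP errors to the caller
    return response.text  # JSON text, matching the declared str return type

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```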

This isn't a proof of concept. Here's what production operational work looks like with OpenCode:

Finance asked: *"What does each customer cost us?"* In shared infrastructure, where a single proxy pod serves all tenants, the conventional answer is *"we can't really tell you."*

We asked OpenCode. One session:

Output: a **361-line Word document** with every number traced to an API response. Not estimates. Not SWAGs. Production telemetry.

Across two AWS accounts and four regions:

Output: an Excel workbook, color-coded, with subtotals. Combined waste: ~$3,200/month plus the $97K overlap.

When the retry was planned, one prompt produced a full technical assessment:

**Total time: 14 seconds.**

The first time I used OpenCode during a live incident, I realized something: **the AI was doing incident investigation faster and more consistently than our engineers.**

Not because it's smarter — because it doesn't context-switch.

A human investigating an incident opens:

That's **15-45 minutes** for an experienced engineer. More for a junior. And the cognitive overhead of switching between tools while sleep-deprived leads to missed signals and wrong conclusions.

With OpenCode, the same investigation is one conversation:

"PagerDuty alert fired on proxy 5xx errors in EU. Check Datadog for error rates by backend and affected tenants. Check GitHub for any recent deployments to the primary EU cluster. What changed?"

**90 seconds to 3 minutes.** Every time. No context switching. No missed signals. No investigating the wrong region.

[FRIDAY](https://dev.to/velumal09/how-i-built-an-autonomous-incident-investigation-agent-that-reduced-mttr-by-65-42ae) is essentially **OpenCode's incident investigation pattern, extracted into a Lambda that runs without a human typing the questions**.

The evolution:

| Stage | System | Human Involvement | Response Time |
|---|---|---|---|
| Before | 5 dashboards + manual investigation | 100% human | 15-45 minutes |
| OpenCode | AI-assisted investigation (human asks) | Human types the prompt | 90 seconds |
| FRIDAY | Autonomous investigation (webhook triggers) | Human reads the findings | 90 seconds (automated) |

Same tools. Same reasoning pattern. Same output format. But no human in the loop for the investigation phase — the on-call engineer wakes up to **finished analysis** instead of a raw alert.
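A heavily simplified sketch of that shape follows: a Lambda-style handler that receives a webhook and runs a fixed sequence of read-only checks. This is not FRIDAY's actual code. The payload layout assumes the PagerDuty V3 webhook format (`event.event_type`, `event.data.title`) delivered as an API Gateway-style `body` string, and `STUB_TOOLS` and `investigate` are hypothetical stand-ins for the real Datadog and GitHub calls.

```python
import json

# Hypothetical read-only investigators; real ones would call Datadog and GitHub.
STUB_TOOLS = {
    "monitor_definition": lambda title: f"<monitor behind: {title}>",
    "search_logs": lambda title: ["<matching log line>"],
    "recent_deployments": lambda: ["<most recent deployment>"],
}

def investigate(title: str, tools: dict) -> dict:
    """Run the same steps an engineer would ask for, in a fixed order."""
    return {
        "incident": title,
        "monitor": tools["monitor_definition"](title),
        "logs": tools["search_logs"](title),
        "deployments": tools["recent_deployments"](),
    }

def lambda_handler(event: dict, context, tools: dict = STUB_TOOLS) -> dict:
    # Assumed delivery shape: the webhook JSON arrives as a string in event["body"].
    payload = json.loads(event["body"])
    webhook_event = payload.get("event", {})
    if webhook_event.get("event_type") != "incident.triggered":
        return {"statusCode": 204, "body": ""}  # ignore acknowledgements, resolves, etc.
    title = webhook_event.get("data", {}).get("title", "unknown incident")
    return {"statusCode": 200, "body": json.dumps(investigate(title, tools))}

sample = {"body": json.dumps({
    "event": {"event_type": "incident.triggered",
              "data": {"title": "proxy 5xx errors in EU"}}
})}
response = lambda_handler(sample, None)
print(response["statusCode"], response["body"])
```

Passing the tools in as a dictionary keeps the handler testable without network access, which matters for code that runs unattended.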

Results after months in production:

Once FRIDAY proved that an AI agent could reliably investigate production incidents (read-only), the natural question was: **can it also fix things?**

Not incidents, which require human judgment in the moment, but **vulnerability remediation**: the routine security fixes that follow a predictable pattern.

JARVIS is designed to handle that 80% — the routine fixes where the remediation is well-understood and the verification is automatable. Human approval gates at every stage. Automatic rollback if anything breaks.
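One way to picture approval gates combined with automatic rollback is the sketch below. It is an illustration under assumptions, not JARVIS's design: the `Stage` type, the stage names, and the choice to halt without rollback when a human declines are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    apply: Callable[[], None]
    rollback: Callable[[], None]

def run_with_gates(stages, approve, verify) -> str:
    """Apply stages in order. Undo completed stages in reverse if anything fails."""
    completed = []
    for stage in stages:
        if not approve(stage.name):  # human approval gate before every stage
            return f"halted before {stage.name}"
        try:
            stage.apply()
            completed.append(stage)
            if not verify(stage.name):
                raise RuntimeError(f"verification failed after {stage.name}")
        except Exception:
            for done in reversed(completed):  # most recent change is undone first
                done.rollback()
            return f"rolled back at {stage.name}"
    return "completed"

log = []

def make_stage(name: str) -> Stage:
    return Stage(
        name,
        apply=lambda: log.append(f"apply {name}"),
        rollback=lambda: log.append(f"rollback {name}"),
    )

stages = [make_stage("open_pr"), make_stage("deploy")]
outcome = run_with_gates(
    stages,
    approve=lambda name: True,            # stand-in for a human clicking "approve"
    verify=lambda name: name != "deploy"  # simulate a failed post-deploy check
)
print(outcome, log)
```

In this run the deploy is applied, fails verification, and both stages are rolled back in reverse order.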

Here's something that surprised even me: **a complex tenant cleanup operation was largely driven through OpenCode sessions — by someone who didn't write the rake task.**

The rake task existed. But executing it required understanding:

OpenCode handled all of this conversationally:

"What's the current status of Phase 2? Check the Rundeck execution and the Datadog dashboard."

"Create a task under the cleanup Feature for Phase 4 execution. Include the batch count, estimated timeline, and dependencies."

"The runner seems stuck. Check processes on the worker for any rake tasks. What's happening?"

"The DBA says the events database CPU spiked. Pull the top queries from the RDS monitoring dashboard. Cross-reference the account IDs with our cleanup CSV."

Each of these would normally require logging into 2-3 systems, running manual queries, and synthesizing results. With OpenCode, it's a conversation.
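The cross-referencing step in the last prompt is mechanically simple, which is part of the point. A minimal sketch, assuming the cleanup CSV has an `account_id` column and the top queries have already been reduced to a list of account IDs (both assumptions, since the article does not show either format):

```python
import csv
import io

def accounts_in_cleanup(top_query_account_ids, cleanup_csv_text, id_column="account_id"):
    """Return the account IDs that appear both in the hot queries and the cleanup CSV."""
    reader = csv.DictReader(io.StringIO(cleanup_csv_text))
    cleanup_ids = {row[id_column].strip() for row in reader}
    return sorted(set(top_query_account_ids) & cleanup_ids)

# Made-up sample data for illustration only.
cleanup_csv = "account_id,batch\n1001,1\n1002,1\n1003,2\n"
hot_accounts = ["1002", "2000", "1003"]

print(accounts_in_cleanup(hot_accounts, cleanup_csv))
```

Any overlap suggests the cleanup batches are contributing to the database load. An empty result points the investigation elsewhere.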

When I joined the team, understanding the full platform took months. Multiple microservices. AWS regions. EKS clusters. Proxy backends. RabbitMQ queues. Aurora databases. Redis caches.

No single engineer understands all of it. The knowledge is distributed across dozens of people, hundreds of documents, and thousands of configuration files.

OpenCode changed how new team members (and existing ones exploring unfamiliar areas) learn the platform:

"How does the push notification service work? What's its architecture? Where does it run, what does it depend on?"

The AI reads the K8s config repo, checks which clusters the service is deployed to, reads the deployment YAML for dependencies (RabbitMQ queues, SNS topics, Redis), and synthesizes a technical overview — **from live configuration, not stale documentation**.
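A toy version of the "read the deployment YAML for dependencies" step might look like the sketch below. It assumes the manifest has already been parsed into a dictionary and that dependencies can be recognized from environment variable names. The keyword table, variable names, and service name are hypothetical, and a real config repo would need richer rules.

```python
# Hypothetical mapping from env var keywords to the dependency they imply.
KEYWORDS = {"REDIS": "Redis", "RABBITMQ": "RabbitMQ", "SNS": "SNS"}

def dependencies_from_deployment(manifest: dict) -> list:
    """Infer dependencies from the env var names of every container in a Deployment."""
    found = set()
    containers = manifest["spec"]["template"]["spec"]["containers"]
    for container in containers:
        for env in container.get("env", []):
            for keyword, label in KEYWORDS.items():
                if keyword in env["name"].upper():
                    found.add(label)
    return sorted(found)

# Made-up manifest, shaped like a parsed Kubernetes Deployment.
manifest = {
    "kind": "Deployment",
    "metadata": {"name": "push-notifications"},
    "spec": {"template": {"spec": {"containers": [{
        "name": "app",
        "env": [
            {"name": "REDIS_HOST", "value": "..."},
            {"name": "RABBITMQ_URL", "value": "..."},
            {"name": "LOG_LEVEL", "value": "info"},
        ],
    }]}}},
}

print(dependencies_from_deployment(manifest))
```

The model does this kind of reading without a hand-written keyword table, but the input is the same: the configuration that is actually deployed.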

"What happens when a user logs in via SAML? Trace the request path from the browser through the proxy to the backend services."

It reads the proxy backend configuration from GitHub, identifies the routing rules, checks which services handle SAML assertions, and traces the dependency chain — all from actual config files and Datadog service maps.

This isn't replacing documentation. It's **making the infrastructure self-documenting**. The source of truth isn't a wiki page someone wrote 18 months ago — it's the live configuration that the AI reads in real-time.

Every engineer has pasted error messages into ChatGPT. That's not what this is.

| | ChatGPT | OpenCode + MCP |
|---|---|---|
| Data source | General training data | Live production systems via API |
| Specificity | "A 401 error usually means..." | "Your API gateway generated 1.3 million of them yesterday" |
| Infrastructure | Doesn't know your systems | Queries your actual AWS, Datadog, GitHub |
| Freshness | Training cutoff | Real-time data |
| Hallucination | Common for specifics | Can't hallucinate API responses |
| Action | Suggests what to do | Does it (queries, aggregates, cross-references) |

The model doesn't need to be told which tools to use. Ask *"is the cache migration approach safe?"* and it independently decides to: read the ADO ticket, query ElastiCache, read the K8s config, compare env var wiring, and synthesize. The engineer didn't specify any of those steps.

The uncomfortable truth is that most of what this system does **isn't hard**. Any senior engineer can query AWS Cost Explorer, aggregate Datadog logs, read a GitHub PR, and review an ADO ticket.

The hard part is doing all of them **in the same mental context, in the same hour, without losing the thread**.

An engineer investigating the cache migration opens AWS in one tab, Datadog in another, GitHub in a third, ADO in a fourth, terminal in a fifth. Copy cache endpoint addresses, paste into GitHub search, cross-reference with K8s config, check ADO for the deployment timeline, look at Datadog for the error spike. Context switches. Tab switches. Copy-paste. Scroll. Search. Repeat.

The system doesn't eliminate the need for engineering judgment. The engineer still decides whether the approach is safe, whether the risk is acceptable, whether the cost model makes sense. What the system eliminates is **the mechanical overhead of gathering the information needed to make those decisions**.

That overhead, across a 20-person team managing multi-region production infrastructure for a global identity platform, adds up to something significant.

We're six MCP servers in. The gaps are obvious: DNS management, direct Kubernetes cluster access for kubectl operations, Confluence for documentation. Each one is a JSON config block and a credential away from being connected.

But the more interesting trajectory isn't more connectors — it's more autonomy:

```
┌────────────────────────────────────────────────────────┐
│  THE PROGRESSION                                        │
│                                                         │
│  Stage 1: Answer questions (OpenCode — today)           │
│     "What caused the 401 errors?"                       │
│                                                         │
│  Stage 2: Investigate autonomously (FRIDAY — live)      │
│     PagerDuty webhook → full analysis in 90 seconds     │
│                                                         │
│  Stage 3: Remediate autonomously (JARVIS — designing)   │
│     Vulnerability finding → PR → deploy → verify        │
│                                                         │
│  Stage 4: Predict and prevent (future)                  │
│     Detect anomaly → correlate → alert before impact    │
└────────────────────────────────────────────────────────┘
```

Each stage builds trust for the next. Read-only first. Then write-path with approval gates. Then proactive monitoring. Then autonomous prevention.

The technology is ready for all of it. The trust model is what needs to catch up. We run AWS in read-only mode for a reason. But the trajectory is clear.
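A read-only posture can also be checked in the connector itself, as a second layer on top of IAM. The sketch below is a coarse illustration, not how the team's AWS MCP server works: it assumes actions are named in the IAM `service:Operation` style and treats only `Describe`, `Get`, and `List` operations as safe. Some `Get` operations return sensitive data, so the IAM policy attached to the profile should remain the real enforcement point.

```python
READ_ONLY_PREFIXES = ("Describe", "Get", "List")

def is_read_only(action: str) -> bool:
    """True for actions like 'ec2:DescribeInstances', False for anything else."""
    _, _, operation = action.partition(":")
    return operation.startswith(READ_ONLY_PREFIXES)

def guarded_call(action: str, call):
    """Run call() only if the action name looks read-only."""
    if not is_read_only(action):
        raise PermissionError(f"{action} is not allowed in read-only mode")
    return call()

print(is_read_only("ec2:DescribeInstances"))
print(is_read_only("ec2:TerminateInstances"))
print(guarded_call("s3:ListBuckets", lambda: ["<bucket names>"]))
```

Moving to a write path with approval gates would mean widening this allowlist deliberately, one action at a time.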

The beauty of this setup is that you can hook up as many tools as you want and create sub-agents for each of them, or just have one agent connected to everything. There's no right answer. Some engineers on my team prefer a single session with all 6 MCP servers connected: they ask about AWS costs, then pivot to a GitHub PR review, then check a PagerDuty schedule, all in one conversation. Others prefer focused agents: an AWS-only session for cost analysis, a Datadog-only session for incident investigation, a GitHub-only session for code review. The system doesn't impose a pattern; it adapts to how you think.

Start with one MCP server. Connect your observability platform, or your ticketing system, or your cloud provider, whichever one you spend the most time context-switching into. Once you see the AI pull live data from it in a conversation, you'll immediately know which system to connect next. Within a week, you'll wonder how you ever operated without it.

*I'm Vinothsingh Elumalai, a Platform Engineering leader building AI-native operations at enterprise scale. I lead infrastructure for a global IAM/SSO platform serving millions of users across multiple AWS regions. This article is the origin story of everything in my AI-Native SRE series.*

*Connect with me on LinkedIn — I write about the intersection of AI, DevOps, and the future of platform engineering.*