# From a Simple Web App to a Production-Style Platform: My DevOps Learning Journey

> Source: <https://dev.to/shashank0701byte/from-a-simple-web-app-to-a-production-style-platform-my-devops-learning-journey-29km>
> Published: 2026-06-13 22:40:57+00:00

**When I started building SystemCraft, my goal wasn't to learn Kubernetes, GitOps, monitoring, or cloud-native architecture.**

I just wanted to build a system design interview platform.

Fast forward a few months, and that simple web application evolved into something much bigger:

- CI/CD Pipelines
- Dockerized Deployments
- Kubernetes
- Helm Charts
- ArgoCD GitOps
- Prometheus Monitoring
- Grafana Dashboards
- AlertManager
- Auto Scaling
- Security Scanning

This article is the story of how that happened and what I learned along the way.

**The Original Idea**

SystemCraft was designed to solve a problem I noticed while preparing for system design interviews.

**Most preparation resources are passive:**

- Reading blogs
- Watching videos
- Looking at architecture diagrams

But real system design interviews are interactive.

You need to make decisions, justify trade-offs, adapt to changing requirements, and explain your reasoning.

I wanted to create a platform where engineers could:

- Design architectures visually
- Receive AI-powered feedback
- Simulate real interview scenarios
- Learn through iteration

**The first version was straightforward:**

*Next.js
↓
MongoDB
↓
Gemini API*

**The Docker Phase**

My first step was containerization.

I created a Dockerfile and containerized the entire application.

At first, I thought Docker was the hard part.

I quickly learned it wasn't.

Building containers is easy.

Operating containers reliably is the real challenge.

**Questions started appearing:**

- How do I deploy updates?
- How do I manage multiple replicas?
- How do I scale?
- How do I monitor failures?

Docker solved packaging.

It didn't solve operations.

**Building a Real CI Pipeline**

The next step was automation.

I didn't want deployments to depend on manual commands.

I created a GitHub Actions pipeline that would automatically:

*Lint & Typecheck
↓
Playwright E2E Tests
↓
Docker Build
↓
Trivy Security Scan
↓
Kubernetes Validation
↓
Deployment*

**One lesson became obvious:**

Automation isn't about speed.

It's about consistency.

The pipeline catches mistakes long before they reach production.
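
The property that makes this work is that the pipeline fails fast: a stage only runs if every stage before it passed, so a failed scan means deployment never starts. Here is a minimal TypeScript sketch of that ordering. It is a toy model, not the actual GitHub Actions workflow; `Stage`, `runPipeline`, and the stage identifiers are hypothetical names, and the failing scan is simulated.

```typescript
// Toy model of a fail-fast pipeline. Stage names mirror the steps above;
// the `run` functions are placeholders for the real checks.
type Stage = { name: string; run: () => boolean };

type PipelineResult = { passed: string[]; failedAt: string | null };

function runPipeline(stages: Stage[]): PipelineResult {
  const passed: string[] = [];
  for (const stage of stages) {
    if (!stage.run()) {
      // Stop immediately: later stages (including deployment) never run.
      return { passed, failedAt: stage.name };
    }
    passed.push(stage.name);
  }
  return { passed, failedAt: null };
}

const pipelineStages: Stage[] = [
  { name: "lint-and-typecheck", run: () => true },
  { name: "playwright-e2e", run: () => true },
  { name: "docker-build", run: () => true },
  { name: "trivy-scan", run: () => false }, // simulate a failing scan
  { name: "kubernetes-validation", run: () => true },
  { name: "deployment", run: () => true },
];

const pipelineResult = runPipeline(pipelineStages);
// pipelineResult.failedAt is "trivy-scan";
// "deployment" never appears in pipelineResult.passed.
```

Every commit goes through the same stages in the same order, which is where the consistency comes from.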

**Security Wasn't Optional**

One of the most valuable additions was Trivy.

Initially I wasn't thinking much about container security.

Then I started scanning images and realized how many vulnerabilities can exist inside dependencies you didn't even know you had.

**Every build now goes through:**

*Docker Build
↓
Trivy Scan
↓
Deployment*

This simple addition completely changed how I think about shipping software.
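
To make the gate concrete, here is a small TypeScript sketch that reads a scan report and decides whether a build may ship. It assumes a simplified subset of Trivy's JSON report shape (`Results`, `Vulnerabilities`, `Severity`); the `blockingFindings` helper, the sample report, and the placeholder CVE IDs are all made up for illustration.

```typescript
// Simplified subset of a Trivy JSON report; real reports carry more fields.
type Severity = "UNKNOWN" | "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

interface Vulnerability {
  VulnerabilityID: string;
  PkgName: string;
  Severity: Severity;
}

interface ScanReport {
  Results?: { Target: string; Vulnerabilities?: Vulnerability[] }[];
}

// Collect every finding whose severity is in the blocking set.
function blockingFindings(report: ScanReport, blockOn: Severity[]): Vulnerability[] {
  const findings: Vulnerability[] = [];
  for (const target of report.Results ?? []) {
    for (const vuln of target.Vulnerabilities ?? []) {
      if (blockOn.indexOf(vuln.Severity) !== -1) findings.push(vuln);
    }
  }
  return findings;
}

// Made-up report with placeholder IDs, for illustration only.
const sampleReport: ScanReport = {
  Results: [
    {
      Target: "app-image",
      Vulnerabilities: [
        { VulnerabilityID: "CVE-0000-0001", PkgName: "openssl", Severity: "CRITICAL" },
        { VulnerabilityID: "CVE-0000-0002", PkgName: "busybox", Severity: "LOW" },
      ],
    },
    { Target: "package-lock.json" }, // a target with no findings
  ],
};

const blockers = blockingFindings(sampleReport, ["HIGH", "CRITICAL"]);
const shouldDeploy = blockers.length === 0; // false: one CRITICAL finding
```

In a real pipeline you usually don't need custom code for this: Trivy's `--severity` and `--exit-code` options can fail the job directly.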

**Enter Kubernetes**

Eventually a single container stopped being enough.

I wanted:

- Multiple replicas
- Self-healing workloads
- Rolling updates
- Horizontal scaling

Kubernetes provided all of that.

**But Kubernetes introduced new challenges:**

- YAML management
- Service discovery
- Resource limits
- Health checks
- Configuration management

The complexity increased significantly.

At the same time, I started understanding why Kubernetes became the industry standard.
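
Horizontal scaling is a good example of what Kubernetes takes off your hands. The Horizontal Pod Autoscaler picks a replica count with a simple ratio: `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, skipping changes when the ratio is within a small tolerance of 1. Here is that rule as a TypeScript sketch. The function name, the min/max bounds, and the CPU numbers are assumptions for illustration, not SystemCraft's actual settings.

```typescript
// Sketch of the HPA scaling rule. The 0.1 tolerance matches the
// Kubernetes default; all other inputs here are illustrative.
function desiredReplicas(
  currentReplicas: number,
  currentMetric: number, // e.g. average CPU utilization in percent
  targetMetric: number, // e.g. target CPU utilization in percent
  minReplicas: number,
  maxReplicas: number,
  tolerance = 0.1,
): number {
  const ratio = currentMetric / targetMetric;
  // Close enough to the target: leave the replica count alone.
  if (Math.abs(ratio - 1) <= tolerance) return currentReplicas;
  const desired = Math.ceil(currentReplicas * ratio);
  // Clamp to the configured bounds.
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}

const scaleUp = desiredReplicas(2, 90, 60, 1, 10); // 3
const scaleDown = desiredReplicas(3, 30, 60, 1, 10); // 2
const steady = desiredReplicas(4, 63, 60, 1, 10); // 4 (within tolerance)
const capped = desiredReplicas(5, 200, 50, 1, 10); // 10 (clamped to max)
```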

**Helm Changed Everything**

Managing raw Kubernetes manifests quickly became painful.

I introduced Helm charts to template deployments and environments.

Instead of maintaining multiple copies of manifests, I could parameterize everything:

- Image versions
- Replica counts
- Resource limits
- Environment variables

Deployment became much more manageable.
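
The idea behind the parameterization is one set of defaults plus a small override per environment. Helm does this with values files, where later files override earlier ones. The TypeScript sketch below models that merge; the `ChartValues` shape, the `renderValues` helper, the image repository, and every number in it are hypothetical placeholders, not the real chart.

```typescript
// Hypothetical values for a chart: defaults plus per-environment overrides.
interface ChartValues {
  image: { repository: string; tag: string };
  replicaCount: number;
  resources: { limits: { cpu: string; memory: string } };
  env: Record<string, string>;
}

type EnvOverrides = {
  image?: Partial<ChartValues["image"]>;
  replicaCount?: number;
  resources?: { limits?: Partial<ChartValues["resources"]["limits"]> };
  env?: Record<string, string>;
};

// Overrides win key by key; anything not overridden keeps its default.
function renderValues(base: ChartValues, overrides: EnvOverrides): ChartValues {
  return {
    image: { ...base.image, ...overrides.image },
    replicaCount: overrides.replicaCount ?? base.replicaCount,
    resources: {
      limits: { ...base.resources.limits, ...overrides.resources?.limits },
    },
    env: { ...base.env, ...overrides.env },
  };
}

const defaults: ChartValues = {
  image: { repository: "ghcr.io/example/systemcraft", tag: "latest" }, // placeholder
  replicaCount: 1,
  resources: { limits: { cpu: "500m", memory: "256Mi" } },
  env: { NODE_ENV: "development" },
};

const production = renderValues(defaults, {
  image: { tag: "1.4.2" },
  replicaCount: 3,
  resources: { limits: { memory: "512Mi" } },
  env: { NODE_ENV: "production" },
});
// production keeps the default repository and CPU limit,
// but uses tag "1.4.2", 3 replicas, and a 512Mi memory limit.
```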

**Discovering GitOps with ArgoCD**

This was probably the biggest mindset shift.

Originally deployment looked like:

*GitHub Actions
↓
kubectl apply*

**After learning GitOps:**

*Git Commit
↓
Git Repository
↓
ArgoCD
↓
Kubernetes Cluster*

The cluster state became fully declarative.

Git became the source of truth.

Rollback became dramatically easier.

Auditing changes became trivial.

I finally understood why so many engineering teams are adopting GitOps workflows.
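
At its core, a GitOps controller keeps comparing the desired state in Git with the live state in the cluster and works out what has to change. The TypeScript sketch below models one such comparison. It is a deliberately tiny model, not how ArgoCD is implemented; `ClusterState`, `planSync`, and the workload names and versions are hypothetical.

```typescript
// Each workload is reduced to an image and a replica count for this sketch.
type ClusterState = Record<string, { image: string; replicas: number }>;

type SyncAction = { kind: "create" | "update" | "delete"; workload: string };

// Compare desired (Git) with live (cluster) and list the needed actions.
function planSync(desired: ClusterState, live: ClusterState): SyncAction[] {
  const actions: SyncAction[] = [];
  for (const workload of Object.keys(desired)) {
    const want = desired[workload];
    const have = live[workload];
    if (want === undefined) continue;
    if (have === undefined) {
      actions.push({ kind: "create", workload });
    } else if (have.image !== want.image || have.replicas !== want.replicas) {
      actions.push({ kind: "update", workload });
    }
  }
  for (const workload of Object.keys(live)) {
    // Present in the cluster but not in Git: a candidate for pruning.
    if (desired[workload] === undefined) actions.push({ kind: "delete", workload });
  }
  return actions;
}

// A rollback is just Git pointing at the previous version again.
const desiredFromGit: ClusterState = {
  web: { image: "web:1.4.1", replicas: 3 },
};
const liveCluster: ClusterState = {
  web: { image: "web:1.4.2", replicas: 3 },
  "debug-shell": { image: "busybox", replicas: 1 }, // applied by hand, not in Git
};

const syncPlan = planSync(desiredFromGit, liveCluster);
// syncPlan: update "web", delete "debug-shell"
```

Because the plan is derived entirely from what is in Git, reverting a commit is enough to roll back, and the Git history doubles as the audit log.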

**Monitoring: The Missing Piece**

For a long time I only cared whether the application worked.

**Then I realized:**

If something breaks in production, how would I know?

That question led me to Prometheus and Grafana.

I instrumented the application and started tracking:

- API latency
- Request volume
- Error rates
- Resource utilization
- Application health

Suddenly I could see what the system was actually doing.

Monitoring transformed troubleshooting from guessing into observing.
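
Two of those signals, latency percentiles and error rate, are easy to misread until you compute them by hand once. The TypeScript sketch below does that over a batch of request samples using the nearest-rank percentile method. The sample data is made up and the helper names are hypothetical; in a Prometheus setup these values normally come from counters and histograms queried on the server side, not from application code like this.

```typescript
interface RequestSample {
  durationMs: number;
  status: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)] ?? NaN;
}

// Share of requests that ended in a server error (5xx).
function errorRate(samples: RequestSample[]): number {
  if (samples.length === 0) return 0;
  return samples.filter((s) => s.status >= 500).length / samples.length;
}

// Made-up data: 20 requests, the two slowest ones failed.
const durations = [
  40, 55, 60, 62, 70, 75, 80, 85, 90, 95,
  100, 110, 120, 130, 150, 180, 200, 250, 400, 3300,
];
const samples: RequestSample[] = durations.map((durationMs, i) => ({
  durationMs,
  status: i >= 18 ? 500 : 200,
}));

const p50 = percentile(samples.map((s) => s.durationMs), 50); // 95
const p95 = percentile(samples.map((s) => s.durationMs), 95); // 400
const failures = errorRate(samples); // 0.1
```

The median looks healthy here while the tail does not, which is why P95 is the number to watch instead of the average.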

**Adding Alerting**

Monitoring is useful.

Alerting is essential.

I integrated AlertManager so that operational issues could be detected automatically.

This forced me to think about:

- Error thresholds
- SLOs
- Availability targets
- Incident response

Topics I previously associated only with large companies.
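
An availability target becomes much less abstract once you turn it into an error budget. The TypeScript sketch below shows the arithmetic. The 99.9% target, the 30-day window, the 0.5% observed error rate, and the paging threshold are illustrative assumptions, not SystemCraft's real SLOs, and the function names are hypothetical.

```typescript
// Minutes of full downtime an availability target allows over a window.
function errorBudgetMinutes(sloTarget: number, windowDays: number): number {
  return (1 - sloTarget) * windowDays * 24 * 60;
}

// How fast the budget is being spent: 1 means exactly on budget,
// 5 means the budget would be gone in a fifth of the window.
function burnRate(observedErrorRate: number, sloTarget: number): number {
  return observedErrorRate / (1 - sloTarget);
}

const budgetMinutes = errorBudgetMinutes(0.999, 30); // about 43.2 minutes
const currentBurn = burnRate(0.005, 0.999); // about 5

// Hypothetical alert rule: page when the budget burns faster than 2x.
const pagingThreshold = 2;
const shouldPage = currentBurn > pagingThreshold; // true
```

Alerting on burn rate instead of a raw error count ties the alert to the availability target itself.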

**Testing Scalability**

Eventually I wanted to understand how the platform behaved under load.

I simulated 500 concurrent users.

The results were revealing.

**Single Container**

| Metric | Value |
| --- | --- |
| Requests | 23,381 |
| Throughput | ~155 req/s |
| P95 Latency | 3.33s |

The Node.js process became saturated.

Performance degraded rapidly.

**Kubernetes with HPA**

| Metric | Value |
| --- | --- |
| Requests | 61,026 |
| Throughput | ~351 req/s |
| P95 Latency | 861ms |

By distributing traffic across multiple pods, latency dropped dramatically while throughput more than doubled.

This was the first time I could actually see the benefits of horizontal scaling in practice.
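
Putting the reported numbers side by side makes the size of the improvement explicit. The TypeScript snippet below only does arithmetic on the figures above; the variable names are mine, and the throughput values are the rounded ones reported.

```typescript
// Reported results from the two 500-user runs.
const singleContainer = { throughputRps: 155, p95Ms: 3330 };
const withHpa = { throughputRps: 351, p95Ms: 861 };

// Round to two decimal places for readability.
const round2 = (x: number): number => Math.round(x * 100) / 100;

// Throughput with HPA relative to the single container.
const throughputGain = round2(withHpa.throughputRps / singleContainer.throughputRps); // 2.26

// Fraction by which P95 latency dropped.
const latencyReduction = round2(1 - withHpa.p95Ms / singleContainer.p95Ms); // 0.74
```

So throughput rose about 2.26x and P95 latency fell by roughly 74%.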

**Current Architecture**

Today the deployment flow looks like this:

*Developer
↓
GitHub
↓
GitHub Actions
↓
Docker Build
↓
Trivy Scan
↓
GHCR
↓
ArgoCD
↓
Kubernetes
↓
Prometheus
↓
Grafana
↓
AlertManager*

What started as a simple web application became a complete cloud-native platform.

**What I Learned**

A few lessons stood out throughout this journey.

**What's Next**

The next phase of my learning journey involves:

- AWS
- Terraform
- Infrastructure as Code
- Distributed Load Testing
- Platform Engineering

**I'm currently building an open-source load testing tool called Loadster, inspired by the challenges I encountered while testing SystemCraft.**
***

**Check out the site Live:** [https://system-craft-kohl.vercel.app/](https://system-craft-kohl.vercel.app/)

If you liked the article, drop a like, and maybe check out the GitHub repo and help me make it even better.
