{"slug": "stop-reading-documentation-start-reading-github-issues", "title": "Stop Reading Documentation. Start Reading GitHub Issues.", "summary": "A developer has created a checklist for mining GitHub issues to extract UX research findings from engineering conversations. The framework involves coding each issue with metadata such as user persona, product stage, and challenge type, turning raw anecdotes into a defensible dataset. The approach claims to reveal patterns invisible to casual reading, such as identifying the highest-priority UX problems by mapping multiple issues to the same stage and challenge type.", "body_md": "Reading the docs isn't enough. The most valuable developer feedback lives inside GitHub issues, bug reports, and feature discussions. In this article, I share the **checklist and coding framework I use to mine repository issues**, uncover real developer pain points, and turn engineering conversations into meaningful UX research findings.\n\nReading GitHub issues without a coding framework produces impressions. Coding them produces findings.\n\nWhen you code an issue, you are making an explicit analytical decision about what type of signal it contains. You are saying: *\"This issue is evidence of a feedback gap at Stage 4 (model loading), filed by an ML engineer with a new user experience level, in KServe v0.11, and it directly answers my research question about what information engineers need during model loading that the product currently does not provide.\"*\n\nThat sentence, assembled from your coding decisions, is a finding. Multiply it across 100 issues and you have a research study.\n\nWithout coding, you have 100 anecdotes. With coding, you have a dataset.\n\n**It makes patterns visible.** When you code 20 issues and notice that 14 of them map to the same stage and the same challenge type, you have found your highest-priority UX problem. 
You would never see that pattern by reading casually.\n\n**It makes your findings defensible.** \"Engineers struggle with KServe\" is an opinion. \"18 issues across v0.10–v0.13 filed by engineers in their first deployment show identical feedback gap patterns at Stage 4, with an average of 14 comments per issue\" is a finding.\n\n**It separates your role from the engineer's role.** Engineers read GitHub issues as bug reports. You read them as evidence of design decisions. The coding framework is the analytical lens that makes your UX reading possible.\n\nBefore any analysis, capture the foundational data for every issue. This establishes the quantitative baseline for your research report.\n\n```\nIssue Metadata Checklist\n─────────────────────────────────────────────\n[ ] Issue URL / Number        e.g. #1234\n[ ] Issue creation date\n[ ] Issue category            Bug | Question | Feature Request | Docs\n[ ] Labels                    e.g. InferenceService, Control Plane, Kubernetes\n[ ] Current state             Open | Closed | Stale | Merged\n[ ] Resolution type           Code Fix | Docs Update | Workaround | No resolution\n[ ] Total comment count\n[ ] Unique commenter count\n[ ] Ping-pong count           Back-and-forth before root cause found\n[ ] Brief issue summary\n```\n\nThe \"ping-pong count\" — the number of back-and-forth diagnostic comments between the user and maintainers before the root cause was identified — is a particularly powerful metric. High ping-pong means the product gave the user no diagnostic signal. That is a UX failure in the product itself.\n\nUnderstanding user demographics is essential in any research project because it tells you whose problems to prioritise. In interviews or surveys, you simply ask. 
In GitHub mining, you have to infer from signals embedded in the issue itself.\n\nIn the context of KServe, we are primarily trying to distinguish between three main personas.\n\nThe language engineers use tells you their world immediately.\n\n| Persona | Typical keywords | What they focus on |\n|---|---|---|\n| Data Scientist / ML Engineer | PyTorch, TensorFlow, HuggingFace, weights, predictor, artifact, S3, inputs/outputs | The model itself — getting a Python script to serve predictions |\n| Platform / DevOps Engineer | CRDs, Istio, Knative, ingress, RBAC, service account, Helm, multi-cluster, HPA | Infrastructure, networking, security, cluster stability |\n| Application Developer | REST API, gRPC, JSON payload, curl, SDK, timeout, 503 error, endpoint | Consuming the model — integrating the endpoint into a larger application |\n\nKServe issues typically fall into three abstraction layers:\n\n`kubectl describe` outputs, cluster events, Helm values files, or Istio configurations — they speak in Kubernetes YAML\n\nQuick triage question: Is this person treating KServe as an infrastructure component or as a model-delivery tool? Infrastructure → Platform/DevOps. 
Model-delivery → Data Scientist/ML Engineer.\n\n```\nDemographics Checklist\n─────────────────────────────────────────────\n[ ] Inferred user persona\n      Data Scientist / ML Engineer\n      Platform / DevOps Engineer\n      Application Developer\n      ML-expert / K8s-novice  ← the hardest edge case\n      Unclear\n\n[ ] Experience level\n      New user (first issues, plain English, no version info)\n      Experienced (provides full env, logs, rules out causes)\n      Unclear\n\n[ ] Deployment environment\n      Local (Kind / Minikube)\n      Cloud managed (EKS / GKE / AKS)\n      On-premises / bare metal\n\n[ ] Deployment scale\n      Single-cluster\n      Multi-cluster\n      Multi-tenant\n\n[ ] Deployment method\n      Helm | Kustomize | ArgoCD / GitOps | Direct kubectl\n```\n\nThe hardest edge case: ML-experienced / Kubernetes-novice engineers. They write technically confident issues about model formats or serving runtimes — but are completely confused about Istio or Knative. Always code these as a separate category — they reveal a completely different class of UX failure.\n\nThe key insight for non-technical UX researchers: **you do not need to understand the technical content of an issue to identify its UX signal.** The signals are in the language, not the configuration.\n\nHere is how to read any issue in under two minutes.\n\nScan for time words before you read anything else.\n\n| Time word | Severity | What it means |\n|---|---|---|\n| \"minutes\" | Low | Minor gap, quickly resolved |\n| \"hours\" | Medium | Significant friction, real work lost |\n| \"days\" | High | Severe friction, deadline impact |\n| \"weeks\" / \"months\" | Critical | Product-level failure |\n| \"gave up\" / \"switching to X\" | Abandonment | User is leaving |\n\n**Real example:** *\"I've spent the last three days trying to figure out why my model stays in Unknown status.\"*\n\nYou do not need to know what Unknown status means. 
\"Three days\" tells you this is a high-severity finding.\n\nEvery friction-revealing issue has the word \"but\" at a specific moment. Everything before \"but\" is what the engineer did correctly. Everything after is where the product failed them.\n\n**Real example:** \"I followed the quickstart exactly **but** the webhook never became Ready.\"\n\nBefore \"but\" = user followed instructions. After \"but\" = product gave instructions that led to failure. That is a documentation UX finding, not a technical bug.\n\n**Real example:** \"The status shows Ready **but** every curl request returns 503.\"\n\nThis is the \"misleading success\" friction type: the product declared success when the user's actual goal was completely unmet. It is one of the most trust-destroying UX failures possible.\n\nPhrases such as \"I thought\" and \"I assumed\" directly reveal a **mental model gap**: the engineer built an incorrect picture of how the system works, and reality contradicted it.\n\n**Real example:** *\"I thought Transformer meant a language model component like BERT. Turns out it's just data preprocessing.\"*\n\nThis is not a bug. It is a naming failure. \"Transformer\" means attention-based neural architecture in the ML world. KServe uses it to mean \"a component that preprocesses data before the model.\" Every ML engineer who encounters this name builds the wrong mental model from it.\n\n**Real example:** *\"I assumed KServe would track model versions automatically — like a proper ML serving platform should.\"*\n\nThis is scope confusion: the engineer's expectation of what KServe is does not match what it actually is. 
That mismatch is a design communication failure, not a user error.\n\nEach \"I had to\" signals a missing workflow step, something the engineer needed that the product should have provided.\n\n**Real example:** *\"I had to write a polling script to check when the InferenceService became Ready, because there's no built-in wait command.\"*\n\nCount the \"I had to\" chains in an issue. 
An issue with four of them put four separate manual burdens on an engineer for a task that should have been automated.\n\nScan for capitalized tool names: Knative, Istio, Prometheus, MLflow, Argo, Triton, HuggingFace. Count them.\n\nWhen an engineer writes \"sorry if this is a basic question\" — that is not politeness. That is evidence the product made a competent person feel responsible for the product's communication failure.\n\nOnce you have your basic metadata and demographics, here is the full coding structure to apply to every issue.\n\nMap every issue to one of 8 stages. This tells you where in the journey the product is losing engineers.\n\n```\nDeployment Stage Checklist\n─────────────────────────────────────────────\n[ ] Stage 1 · Setup         Installing KServe and dependencies\n                             Signals: \"webhook not ready\" · \"quickstart fails\"\n\n[ ] Stage 2 · Storage       Getting the trained model accessible\n                             Signals: \"access denied\" · \"model not found\" · \"storageUri\"\n\n[ ] Stage 3 · Configuration Writing the InferenceService YAML spec\n                             Signals: \"minimum config?\" · \"deprecated field\" · \"required fields?\"\n\n[ ] Stage 4 · Loading       Applying config and waiting for model to load\n                             Signals: \"Unknown status\" · \"how long?\" · \"no logs\" · \"OOMKilled\"\n\n[ ] Stage 5 · Network       Reaching the deployed endpoint\n                             Signals: \"Ready but 503\" · \"connection refused\" · \"EXTERNAL-IP pending\"\n\n[ ] Stage 6 · Inference     Sending requests and getting predictions back\n                             Signals: \"400 error\" · \"what format?\" · \"V1 vs V2 protocol\"\n\n[ ] Stage 7 · Hardening     Making the deployment production-reliable\n                             Signals: \"zero downtime update\" · \"autoscaling conflict\" · \"SLA\"\n\n[ ] Stage 8 · Day-2 ops     Updating, monitoring, governing over time\n     
                        Signals: \"rollback\" · \"update my model\" · \"60 models across teams\"\n\n[ ] Cross-stage?            Root cause in Stage X, discovered at Stage Y\n                             (delayed discovery = highest severity finding)\n```\n\nThe most important finding: Stages 4 and 5 consistently produce the highest issue volume in KServe. The product is completely silent at the two moments when engineers are most anxious and most blind.\n\n```\nUsability Challenge Checklist\n─────────────────────────────────────────────\n[ ] U1 · Learnability breakdown\n         Cannot figure out how to do the task the first time\n         Signal: \"how do I\" · already answered in docs · confused by concepts\n\n[ ] U2 · Error recovery failure\n         Hits error, can't understand it, doesn't know which log to check\n         Signal: pastes cryptic error, \"stuck for days\", tries random things\n\n[ ] U3 · Feedback & visibility gap\n         System gives no signal — Unknown, Pending, complete silence\n         Signal: \"nothing happens\" · \"how long should this take?\" · \"no logs\"\n\n[ ] U4 · Configuration complexity\n         Too many fields, unclear defaults, no minimum viable spec\n         Signal: \"is all of this needed?\" · \"which fields are required?\"\n\n[ ] U5 · Mental model mismatch\n         Expectation contradicts how system actually works\n         Signal: \"I expected\" · \"I thought\" · \"this makes no sense\"\n\n[ ] U6 · Workaround proliferation\n         User invented their own solution to fill a product gap\n         Signal: \"I wrote a script\" · \"I had to\" · shared snippets in comments\n\nDeveloper Friction Checklist\n─────────────────────────────────────────────\n[ ] F1 · Invisible wall        System silent — nothing to debug\n[ ] F2 · Misleading success    \"Ready\" but goal completely unmet\n[ ] F3 · Hidden prerequisite   Required knowledge never communicated until failure\n[ ] F4 · Terminology confusion Word means something different in 
this context\n[ ] F5 · Broken feedback loop  Can't tell if a change had any effect\n[ ] F6 · Forced context switch Must configure Istio/Knative to complete one KServe task\n[ ] F7 · Documentation gap     Knows what they want, can't find how to do it\n[ ] F8 · Accumulated friction  5–6 small frictions in sequence → abandonment signal\n\nSystem Challenge Checklist\n─────────────────────────────────────────────\n[ ] Ownership ambiguity    KServe says \"that's Istio\", Istio says \"that's KServe\"\n[ ] Abstraction leakage    InferenceService was meant to hide Knative/Istio; it doesn't\n[ ] Observability gap      Logs scattered across 4+ components; no unified view\n[ ] Role boundary collision ML engineer task structurally requires platform engineer action\n[ ] Upgrade path fragility  Every version upgrade risks production breakage\n\nEnvironmental Challenge Checklist\n─────────────────────────────────────────────\n[ ] Managed K8s divergence   EKS, GKE Autopilot, OpenShift behave differently\n[ ] Corporate proxy / air-gap No public internet; private registry; air-gapped\n[ ] GPU & hardware           OOMKilled, VRAM insufficient, driver mismatch\n[ ] Org security policy      OPA, Gatekeeper, PodSecurityAdmission blocking KServe\n[ ] On-premises / hybrid     No managed LoadBalancer, NFS storage, bare metal\n[ ] Regulated / compliance   HIPAA, SOC2, GDPR, data residency requirements\n```\n\nVersion tracking is what transforms your research from a snapshot into a longitudinal UX health report.\n\n```\nVersion Tracking Checklist\n─────────────────────────────────────────────\n[ ] KServe version (exact)         e.g. 0.11.2\n[ ] Previous version (if upgrade)  e.g. 0.10 → 0.11\n[ ] Kubernetes version             e.g. 
1.27\n[ ] Cloud provider                 EKS / GKE / AKS / On-prem / Local\n[ ] Version stated by:             User upfront | Maintainer had to ask | Never provided\n[ ] Upgrade experience:            Better | Same | Worse | New regression introduced\n[ ] Chronic pain signal:           Same issue present in prior version? Yes / No\n```\n\nThe chronic problem list, meaning the friction points that appear in the top 3 across three or more versions, is your most powerful finding. A problem that survived three release cycles is not a bug. It is an architectural decision.\n\n```\nLLM Inference Checklist\n─────────────────────────────────────────────\n[ ] Is this an LLM issue?          Yes | No | Hybrid\n[ ] LLM model family               Llama | Mistral | Qwen | Gemma | Custom\n[ ] LLM runtime                    vLLM | TGI | OpenAI-compatible | Custom\n[ ] Capability attempted:\n      Basic inference\n      Streaming tokens (SSE)\n      Multi-GPU / tensor parallelism\n      LoRA / adapter serving\n      HuggingFace Hub authentication\n      OpenAI API compatibility\n      Quantisation (GPTQ, AWQ)\n\n[ ] LLM-specific challenge:\n      GPU OOM / VRAM insufficient\n      Model loading with no progress signal\n      Streaming failure through gateway\n      HuggingFace auth failing in cluster\n      Runtime version lag behind vLLM/TGI ecosystem\n      No LLM-specific metrics (token throughput, TTFT)\n\n[ ] Innovation lag signal:\n      Date of capability request: ___________\n      Date KServe released support: ___________\n      Gap (days): ___________\n```\n\nFor each issue, record which research question it provides evidence for. 
This anchors your mining to your study goals.\n\n```\nResearch Question Mapping\n─────────────────────────────────────────────\nCurrent-state questions (what is broken today)\n[ ] RQ1 · First deployment challenges across roles and experience levels\n[ ] RQ2 · Workflow gaps between deployed model and reliable production\n[ ] RQ3 · Observability and debugging challenges by stage\n[ ] RQ4 · LLM deployment challenges vs classical ML serving\n[ ] RQ5 · Environmental factors shaping deployment experience\n[ ] RQ6 · How challenges evolved across versions (your unique longitudinal contribution)\n[ ] RQ7 · Design changes that would most reduce friction\n\nUX improvement questions (what should be designed differently)\n[ ] UX1 · Time-to-first-inference reduction\n[ ] UX3 · Model loading progress visibility (highest volume finding)\n[ ] UX4 · Self-service diagnostic experience\n[ ] UX9 · LLM mental model bridge (vLLM/HuggingFace → KServe)\n[ ] UX11 · Environment validator / dependency pre-flight checker\n```\n\nOnce you spot a pattern across multiple issues, record it here. One template per pattern — not per issue.\n\n```\nUX Finding Template\n─────────────────────────────────────────────────────────────\nFinding type:       ___________________________________________\nAffected users:     Role · Experience level · Version band\nDeployment stage:   ___________________________________________\nEvidence:           N issues · Date range · e.g. \"14 issues, 2022–2024\"\nBest quote:         Under 25 words — your strongest evidence\n─────────────────────────────────────────────────────────────\nUX finding statement:\n\"Engineers [doing X] cannot [accomplish Y] because [design gap Z],\n which means [impact on time / confidence / adoption].\"\n─────────────────────────────────────────────────────────────\nSeverity:           Low | Medium | High | Critical\nChronic?            
Present across ___ versions\nDesign recommendation: ________________________________________\nResearch question answered: ___________________________________\n\nMining Session Completion Checklist\n─────────────────────────────────────────────\nIssue coverage\n[ ] 50+ general deployment issues coded across all 8 stages\n[ ] 30+ LLM inference issues coded and version-tracked\n[ ] 10+ issues per major version band (v0.10, v0.11, v0.12, v0.13+)\n[ ] 15+ upgrade issues with before/after UX delta recorded\n[ ] Top 30 most-commented issues reviewed (sort:comments-desc)\n[ ] 10+ abandoned issues (open 6+ months, last message unanswered)\n[ ] 15+ success cases (quickly-resolved — positive signal baseline)\n[ ] All enhancement/feature-request labels reviewed\n[ ] All competitive tool mentions captured (BentoML, Seldon, Ray Serve)\n\nBaseline measurement\n[ ] Metrics table filled per version band:\n      Issue count | Avg comments | 7-day resolution rate | Emotional language %\n[ ] Top-3 friction points per version — chronic problem list built\n[ ] LLM innovation lag calculated for 5+ capabilities\n[ ] Version reporting rate: % of issues that include version upfront\n\nQuality\n[ ] 15% of issues coded by a second researcher (inter-rater reliability)\n[ ] Every research question has at least 3 issues as evidence\n[ ] One finding statement written per significant pattern found\n```\n\nThe biggest barrier to studying developer tools as a UX researcher is the assumption that you need to understand the code to understand the problem. This framework removes that barrier entirely.\n\nWhen you code an issue, you are not evaluating the correctness of someone's Kubernetes configuration. You are recording what the issue reveals about the **human experience of using the product**. \"Status shows Unknown for 20 minutes\" tells you everything you need regardless of whether you understand what Unknown means technically. 
The product left a user without feedback during its most critical operation. That is a UX finding independent of any technical knowledge.\n\nThe coding sheet converts anecdote into pattern. Instead of \"users seem to struggle with deployment,\" you can say \"14 of 20 issues sampled from v0.11 show feedback gap failures at Stage 4, with an average of 18 comments per issue, suggesting loading state communication is the highest-priority improvement target for this version band.\"\n\nThat is a product roadmap argument. The coding sheet built it.\n\nThe research questions embedded in the coding sheet map directly to design recommendations with evidence behind them. A contributor who wants to make a meaningful impact on user experience now has specific, evidence-backed targets — not \"improve docs\" but \"add granular status conditions per loading phase that distinguish between 10 failure modes currently all reporting as Unknown.\"\n\nWhen researchers share findings publicly in CNCF blog posts, KubeCon talks, or articles like this one, the coding framework makes the research reproducible. Other researchers can apply the same checklist to a different version or a different tool and compare results. That cumulative body of evidence is what eventually changes product direction.\n\nGitHub issues are not a bug tracker. They are a longitudinal, naturalistic record of where real engineers encounter the gap between what a product promises and what it delivers.\n\nA coding sheet is the analytical framework that transforms that record into research. Without it, you are reading. With it, you are studying.\n\nThe framework I built for KServe — covering demographics, deployment stages, usability challenges, friction types, mental model gaps, system challenges, environmental barriers, version tracking, and LLM inference — did not emerge from theory. 
It emerged from reading hundreds of issues and asking the same question every time: what is the UX researcher's reading of this, beyond what the engineer sees?\n\nThe answer is always the same: engineers see symptoms. The coding sheet helps you see the design decisions that caused them.\n\nStart with the \"but\" sentence. Work backwards to the design failure. Code it. Repeat 100 times. Then write the research report.\n\nThis coding framework was developed as part of a UX research study on ML model deployment in KServe. If you are working on similar research in the cloud-native or MLOps space, I would love to hear your thoughts.", "url": "https://wpnews.pro/news/stop-reading-documentation-start-reading-github-issues", "canonical_source": "https://dev.to/priya_sajja_c336921bbda87/stop-reading-documentation-start-reading-github-issues-1a6k", "published_at": "2026-06-03 22:24:28+00:00", "updated_at": "2026-06-03 22:41:20.237276+00:00", "lang": "en", "topics": ["ai-research", "ai-tools", "ai-products", "ai-infrastructure", "mlops"], "entities": ["KServe"], "alternates": {"html": "https://wpnews.pro/news/stop-reading-documentation-start-reading-github-issues", "markdown": "https://wpnews.pro/news/stop-reading-documentation-start-reading-github-issues.md", "text": "https://wpnews.pro/news/stop-reading-documentation-start-reading-github-issues.txt", "jsonld": "https://wpnews.pro/news/stop-reading-documentation-start-reading-github-issues.jsonld"}}