[ARTICLE · art-22080] src=dev.to pub= topic=artificial-intelligence verified=true sentiment=· neutral

PR descriptions from hell: why I stopped chasing perfect AI automation

A developer abandoned the pursuit of perfect AI-generated pull request descriptions after testing multiple approaches, including OpenAI's API, local models like CodeLlama, and a niche code-specific service. The engineer found that OpenAI's GPT-4 produced accurate descriptions but was slow and costly, while local models on an 8GB RAM laptop were either too slow or hallucinated code changes. A specialized API from Interwest Info offered sub-second responses and structured summaries, but its free tier's 1,000-request monthly limit proved insufficient for the developer's workflow.

4 min read · published Jun 5, 2026

I got tired of writing pull request descriptions. Every single PR needs a summary of what changed, why, how to test it. And no matter how disciplined I tried to be, I'd either rush it or forget details. So I thought: "Let's automate this with AI."

What followed was a rabbit hole of API keys, local models, and false starts. Here's what I learned.

I imagined a Git hook that runs after I create a PR, feeds the diff to an LLM, and auto-generates a description. Simple, right? I started with OpenAI's API because it's the obvious choice.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_pr_description(diff_text):
    # openai>=1.0 replaced openai.ChatCompletion.create with this client method
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a senior developer. Summarize the following git diff as a PR description. Focus on intent, changes, and testing notes."},
            {"role": "user", "content": diff_text}
        ]
    )
    return response.choices[0].message.content
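The function above takes `diff_text` as given. As a minimal sketch of where that text could come from, the helper below shells out to `git diff` with the three-dot form, which compares the current branch against its merge base with the target branch. The names `collect_branch_diff`, `truncate_diff`, and the `MAX_DIFF_CHARS` budget are illustrative assumptions, not part of the original setup.

```python
import subprocess

MAX_DIFF_CHARS = 12_000  # assumed budget; tune to the model's context window


def truncate_diff(diff_text: str, max_chars: int = MAX_DIFF_CHARS) -> str:
    """Cut an oversized diff and mark it, so the model knows it is partial."""
    if len(diff_text) <= max_chars:
        return diff_text
    return diff_text[:max_chars] + "\n[diff truncated]\n"


def collect_branch_diff(base: str = "main") -> str:
    """Return the changes on the current branch since it forked from `base`."""
    result = subprocess.run(
        ["git", "diff", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    )
    return truncate_diff(result.stdout)
```

Capping the diff size also puts a ceiling on what a single request can cost with any pay-per-token API.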

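It worked. The descriptions were actually good. But after a week I noticed a few problems: each request took 2-5 seconds, the per-token cost added up, and every diff was sent to a third-party cloud.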

So I started looking for alternatives.

I tried running a smaller model locally with Ollama. The idea was to keep everything on my machine, zero cost per request.

ollama run codellama:7b

I wrote a wrapper that reads the diff and pipes it to the local model:

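import subprocess

def local_summarize(diff_text):
    prompt = f"Summarize this diff as a PR description:\n\n{diff_text}"
    # Pass the prompt as a single argument; check=True raises if ollama exits
    # with an error instead of silently returning an empty string.
    result = subprocess.run(
        ['ollama', 'run', 'codellama:7b', prompt],
        capture_output=True, text=True, check=True
    )
    return result.stdout.strip()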

This was a dead end for me. My laptop's 8GB RAM made the model crawl – each response took 30 seconds. The small model also hallucinated facts about the code. "Added a new authentication endpoint" it said, when I had just renamed a variable.

I tried quantized versions, larger models, even Mistral. Same story: either too slow or inaccurate. I don't have a GPU at home. Local is not an option for me until I upgrade hardware.
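For anyone repeating the local experiment on better hardware, here is a hedged sketch of the same call through Ollama's local HTTP API instead of a subprocess. It assumes the Ollama server is running on its default port, and the 60-second timeout is an arbitrary choice so a slow model cannot hang the workflow. The function names `build_payload` and `local_summarize_http` are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_payload(diff_text: str, model: str = "codellama:7b") -> dict:
    """Build the JSON body for a single non-streaming generation request."""
    return {
        "model": model,
        "prompt": f"Summarize this diff as a PR description:\n\n{diff_text}",
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def local_summarize_http(diff_text: str, timeout_s: float = 60.0) -> str:
    """POST the diff to the local server; raises if it takes longer than timeout_s."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(diff_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        body = json.load(response)
    return body["response"].strip()
```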

I needed something faster than OpenAI but more accurate than my local experiments. That's when I stumbled on a niche service with models fine-tuned specifically for code tasks: https://ai.interwestinfo.com/. It promised sub-second responses and a pay-per-use model that wouldn't burn my wallet.

I was skeptical – another AI wrapper? But the API was refreshingly simple. No chat completions, no system prompt wizardry. They had a /summarize endpoint that expected a diff and returned a structured summary.

import requests

API_URL = "https://ai.interwestinfo.com/api/v1/summarize"
API_KEY = "my-key-here"  # from their dashboard

def summarize_diff(diff_text):
    payload = {
        "diff": diff_text,
        "format": "pr"  # or "changelog", "release_notes"
    }
    headers = {"Authorization": f"Bearer {API_KEY}"}
    response = requests.post(API_URL, json=payload, headers=headers, timeout=10)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()

diff = """
+ new_feature(): adds logging for user actions
- old_debug(): removed deprecated function
"""
result = summarize_diff(diff)
print(result['summary'])  # "Added new feature for user action logging; removed deprecated debug function."

The speed was impressive – under 500ms per request. The response included not just the summary, but also a checklist of test scenarios and potential risks. That was smarter than plain text.

Did it solve all my problems? Not quite. The free tier had a 1,000-request monthly limit, which I hit in two weeks. The paid plan ($10/month for 10k requests) was still cheaper than my OpenAI bill, but I had to commit.
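A quick back-of-the-envelope check on those numbers, assuming usage stays as steady as it was in the two weeks it took to exhaust the free tier:

```python
FREE_TIER_REQUESTS = 1_000   # free-tier monthly limit
DAYS_TO_EXHAUST = 14         # "hit in two weeks"
PAID_PRICE_USD = 10.0        # paid plan price per month
PAID_REQUESTS = 10_000       # paid plan monthly quota

requests_per_day = FREE_TIER_REQUESTS / DAYS_TO_EXHAUST
projected_monthly = requests_per_day * 30  # assumes a 30-day month at the same pace
cost_per_request = PAID_PRICE_USD / PAID_REQUESTS

print(f"{requests_per_day:.0f} requests/day")
print(f"{projected_monthly:.0f} requests/month projected")
print(f"${cost_per_request:.3f} per request on the paid plan")
```

That works out to roughly 71 requests a day and about 2,100 a month, which fits inside the paid quota with room to spare, at a tenth of a cent per request if the whole quota were used.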

Every approach has its own set of trade-offs. Here's my honest assessment:

| Approach | Speed | Cost | Privacy | Accuracy |
|---|---|---|---|---|
| OpenAI (GPT-4) | Slow (2-5s) | High (pay per token) | Low (data sent to cloud) | Very high |
| Local (7B) | Very slow (15-30s) | Zero (free) | High (local) | Medium |
| Specialized API (Interwest) | Fast (<1s) | Low ($10/mo) | Medium (data sent but claims no logging) | High (for code tasks) |
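The speed figures are easy to reproduce for your own setup. The sketch below is one way to do it: a small timing harness that accepts any summarizer function taking a diff string. The name `measure_latency` and the default of five runs are assumptions for illustration.

```python
import statistics
import time
from typing import Callable


def measure_latency(summarize: Callable[[str], object], diff_text: str, runs: int = 5) -> dict:
    """Call `summarize` several times and report wall-clock latency in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        summarize(diff_text)
        timings.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(timings),
        "max_s": max(timings),
        "runs": runs,
    }
```

Passing `generate_pr_description`, `local_summarize`, or `summarize_diff` with the same diff gives comparable numbers. The median is less sensitive than the mean to a single slow first call, such as a local model being loaded into memory.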

For me, the specialized service won for now. But I'm keeping an eye on newer small models like Llama 3.2 3B, which might run decently on a laptop one day.

If I had to start over, I'd first ask: Do I really need AI for this? Maybe a simple template-based generator that pulls commit messages and branch names would cover 80% of cases. I could have saved myself the integration work.
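To make the template idea concrete, here is a minimal sketch with no AI involved: it derives a title from the branch name and lists the commit subjects. The section names and the `render_description` and `describe_current_branch` helpers are my own assumptions about what such a template might contain.

```python
import subprocess


def render_description(branch: str, commit_subjects: list[str]) -> str:
    """Build a PR description from a branch name and commit subject lines."""
    # "feature/add-user-logging" -> "Add user logging"
    title = branch.split("/")[-1].replace("-", " ").replace("_", " ").strip().capitalize()
    changes = "\n".join(f"- {subject}" for subject in commit_subjects) or "- (no commits found)"
    return f"{title}\n\nChanges:\n{changes}\n\nTesting:\n- TODO: describe how to verify\n"


def describe_current_branch(base: str = "main") -> str:
    """Collect the branch name and commit subjects from git, then render them."""
    branch = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    log = subprocess.run(
        ["git", "log", "--reverse", "--format=%s", f"{base}..HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return render_description(branch, [line for line in log.splitlines() if line])
```

The output is only as good as the commit messages, which is probably where the remaining 20% of cases comes from.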

Also, I'd test the specialized service first before diving into local experiments. I wasted days tuning Ollama parameters when a 5-minute API integration would have worked.

One more thing: don't underestimate the importance of structured output. A plain-text paragraph is fine, but a JSON response with sections like changes, impact, testing makes the result actually usable in automation.
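As a sketch of why sections help, the function below turns a sectioned summary into a PR body and drops any section that came back empty. The dictionary shape is hypothetical: it assumes each key holds either a string or a list of strings, which is not a documented response format of any particular service, and the sample values are made up.

```python
def render_sections(summary: dict) -> str:
    """Turn a sectioned summary into a PR body, skipping empty sections."""
    parts = []
    for key, heading in (("changes", "Changes"), ("impact", "Impact"), ("testing", "Testing")):
        items = summary.get(key) or []
        if isinstance(items, str):
            items = [items]  # accept a single string as a one-item section
        if items:
            bullets = "\n".join(f"- {item}" for item in items)
            parts.append(f"{heading}:\n{bullets}")
    return "\n\n".join(parts)


# Made-up sample data for illustration only.
example = {
    "changes": ["Added logging for user actions", "Removed deprecated old_debug()"],
    "impact": "Log volume increases",
    "testing": [],
}
print(render_sections(example))
```

With a plain-text paragraph, the same filtering would need string parsing; with keyed sections it is a dictionary lookup.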

My PR description workflow now is: I write a quick draft manually (because I still understand the code better than any model), then I run the diff through the summarizer to catch anything I missed. It's a collaboration, not a replacement.

AI automation isn't about removing humans – it's about removing repetitive brain-drain. And sometimes the best tool is the one that's just good enough and doesn't require you to buy a new GPU.

What's your setup for code documentation? Are you using local models, cloud APIs, or just raw willpower?
