I used to be that developer who commits with messages like "fixed bug" or "updated stuff" – and I hated myself for it. Every pull request required a frantic rewrite of commit history. So I decided to automate the process with AI. My goal: generate meaningful, conventional commit messages directly from my diff, without (too much) embarrassment.
Spoiler: it wasn't as simple as slapping a prompt in front of GPT. Here's the rollercoaster I went through, the dead ends, and the surprisingly elegant solution I eventually landed on.
I was working on a medium-sized project with a dozen contributors. We enforced Conventional Commits (feat:, fix:, etc.), but I kept getting lazy. My brain simply didn't want to switch from code mode to prose mode after every diff. I needed a tool that could:
I didn't want a full CI pipeline – I wanted a local script I could run before git commit.
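# Requires jq: it embeds the staged diff in the JSON request body, which curl reads from stdin
git diff --cached \
  | jq -Rs '{model: "gpt-4", messages: [{role: "user", content: ("Generate a conventional commit message for this diff:\n\n" + .)}]}' \
  | curl -s -X POST https://api.openai.com/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer $OPENAI_KEY" --data-binary @-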
Simple, right? The output was… verbose. It treated the diff like a novel and wrote a paragraph. Worse, it ignored the conventional prefix and just wrote random sentences. I tried adding more instructions, but then it would hallucinate features that weren't there. Total garbage.
I copied prompts from a well-known blog post – few-shot examples with proper Conventional Commits. That worked about 60% of the time. But for small changes (such as a typo fix) it still generated refactor: instead of fix:. And the token cost was high because I included 10 examples every time.
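For context, few-shot prompting with a chat API usually means placing example diff/message pairs in the messages array ahead of the real diff. Here is a minimal sketch of that shape. The example pairs, the system prompt wording, and the buildFewShotMessages name are all made up for illustration; they are not the prompts from the blog post I copied.

```javascript
// Hypothetical example pairs, for illustration only.
const EXAMPLES = [
  { diff: '-  retrun user;\n+  return user;', message: 'fix: correct typo in return statement' },
  { diff: '+export function exportCsv(rows) { /* ... */ }', message: 'feat(export): add CSV export' },
];

// Build a chat-completions message list: one system prompt, then each example
// as a user/assistant turn, then the real diff as the final user turn.
function buildFewShotMessages(diff, examples = EXAMPLES) {
  const messages = [
    { role: 'system', content: 'Write one Conventional Commits message for the given diff.' },
  ];
  for (const example of examples) {
    messages.push({ role: 'user', content: example.diff });
    messages.push({ role: 'assistant', content: example.message });
  }
  messages.push({ role: 'user', content: diff });
  return messages;
}

const messages = buildFewShotMessages('-const a = 1;\n+const a = 2;');
console.log(messages.length); // 1 system + 2 per example + 1 final user turn
```

Every example is resent on every call, which is why the token cost grows with the number of examples.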
I ran Ollama with Llama 3. It was slow (20 seconds per message) and often wrote commit messages in the style of a 19th-century novel: "In this commit, improvements were made to the authentication module…" Absolutely unusable for a team.
After two weeks of frustrating iterations, I settled on a hybrid technique that combines three parts:
Here's the core of my implementation in Node.js (using the OpenAI SDK – but you can swap in any API that supports chat completions):
import { execSync } from 'child_process';
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Read the staged diff (no color codes, so the model sees plain text)
function getDiff() {
  const output = execSync('git diff --cached --no-color', { encoding: 'utf-8' });
  // Truncate to first 250 lines to save tokens and keep focus
  return output.split('\n').slice(0, 250).join('\n');
}

function validateMessage(msg) {
  // Conventional Commit regex: type, optional (scope), then a one-line description
  return /^(feat|fix|chore|docs|refactor|test|style|perf)\(?.*?\)?: .{1,72}$/.test(msg);
}

async function generateCommitMessage(diff) {
  const systemPrompt = `You are a git commit message generator.
First, classify the diff into one of these types: feat, fix, chore, docs, refactor, test, style, perf.
Then write a one-line commit message in the format: type(scope): description
Scope is optional. Keep the description under 50 characters.
Do not add extra commentary.`;

  // Retry up to three times until the model returns a message that validates
  for (let attempt = 0; attempt < 3; attempt++) {
    const response = await openai.chat.completions.create({
      model: 'gpt-4o-mini', // cheaper and faster
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: `Diff:\n\`\`\`diff\n${diff}\n\`\`\`` }
      ],
      max_tokens: 100,
      temperature: 0.2,
    });
    // content can be null, so guard before trimming
    const message = (response.choices[0].message.content ?? '').trim();
    if (validateMessage(message)) return message;
    console.warn(`Attempt ${attempt + 1} failed validation: ${message}`);
  }

  // Fallback: generic fix: prefix plus the second line of the raw diff
  return `fix: ${diff.split('\n')[1]?.trim()?.slice(0, 50) || 'minor change'}`;
}

async function main() {
  const diff = getDiff();
  if (!diff) { console.log('No staged changes'); process.exit(0); }
  const message = await generateCommitMessage(diff);
  console.log('Suggested commit message:');
  console.log(message);
}

main().catch(console.error);
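To get a feel for what the validation step accepts, here is a small check you can run on its own. LOOSE_PATTERN is copied from validateMessage. STRICT_PATTERN is a tighter variant I'm sketching as an assumption (it is not part of the script): it requires the scope to be a proper parenthesized token, so a lazy message that merely starts with a type word no longer slips through.

```javascript
// Copied from validateMessage in the script above.
const LOOSE_PATTERN = /^(feat|fix|chore|docs|refactor|test|style|perf)\(?.*?\)?: .{1,72}$/;

// Hypothetical stricter variant: the optional scope must be "(token)" with no
// spaces or nested parentheses, and the description must start with a non-space.
const STRICT_PATTERN = /^(feat|fix|chore|docs|refactor|test|style|perf)(\([^()\s]+\))?: \S.{0,71}$/;

const samples = [
  'feat(auth): add token refresh',
  'fix: typo',
  'fixed bug: handle null user', // starts with "fix", so the loose pattern lets it through
  'Updated stuff',
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), {
    loose: LOOSE_PATTERN.test(sample),
    strict: STRICT_PATTERN.test(sample),
  });
}
```

Both patterns reject multi-line output, because . does not match a newline and $ only matches at the end of the string. That is what forces a retry when the model writes a paragraph.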
I run this as a Git alias:
git config --global alias.aimsg '!node ~/scripts/ai-commit.mjs'
Then git aimsg prints a suggestion. I copy/paste it into my actual commit. (Automated commit hooks are dangerous – I trust my brain more than an LLM for the final decision.)
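If the copy/paste step gets tedious, one possible middle ground keeps the manual review: pass the suggestion to git commit together with --edit, which opens the editor with the message prefilled instead of committing it unseen. This is a sketch, not something in my script; buildCommitArgs and commitWithReview are hypothetical helper names.

```javascript
// Build the argument list for `git commit`. --edit makes git open the editor
// even though --message already supplies the text.
function buildCommitArgs(message) {
  return ['commit', '--edit', '--message', message];
}

// Hypothetical helper: run git with the suggested message prefilled.
// stdio: 'inherit' lets the editor take over the terminal.
// The dynamic import keeps this snippet usable from both .mjs and .cjs files.
async function commitWithReview(message) {
  const { spawnSync } = await import('node:child_process');
  const result = spawnSync('git', buildCommitArgs(message), { stdio: 'inherit' });
  return result.status === 0;
}

console.log(buildCommitArgs('fix(auth): handle expired token'));
```

The final say stays with me: I can rewrite the text in the editor, and git aborts the commit if I save an empty message.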
fix: in the wrong place.
I sacrificed offline capability for reliability. If you're in a remote cabin without internet, my script won't help. You could swap in a local model, but expect lower accuracy.
I also deliberately didn't use --amend or automatic commits. Why? Because AI makes mistakes. If I auto-commit a message like "fix: removed console.log" when I actually changed business logic, that's a lie in history. Manual review is cheap insurance.
fix: typo yourself. The overhead of running the script isn't worth it.
I'd explore a two-model approach: use a small, fast model (like GPT-4o-mini) for type classification and a cheaper summarization model (t5-small) for the description. That would cut costs further and allow more offline flexibility.
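The two-model idea boils down to two independent steps glued together. Here is a sketch of the glue only, with both models injected as functions so either side can be swapped out. I haven't built this; composeCommitMessage and the stub models are placeholders.

```javascript
const TYPES = ['feat', 'fix', 'chore', 'docs', 'refactor', 'test', 'style', 'perf'];

// classify and summarize are injected async functions (diff -> string),
// so each one can be backed by a different model, remote or local.
async function composeCommitMessage(diff, { classify, summarize }) {
  // The two steps do not depend on each other, so run them concurrently
  const [type, description] = await Promise.all([classify(diff), summarize(diff)]);
  if (!TYPES.includes(type)) {
    throw new Error(`Unknown commit type: ${type}`);
  }
  // Collapse whitespace and cap the description at 50 characters
  const subject = description.replace(/\s+/g, ' ').trim().slice(0, 50);
  return `${type}: ${subject}`;
}

// Stub models for demonstration only; real ones would call an API or a local model
const stubs = {
  classify: async () => 'fix',
  summarize: async () => 'handle   expired token in refresh flow\n',
};

composeCommitMessage('...staged diff...', stubs).then((message) => console.log(message));
```

Because the classifier can only return one of the known types (anything else throws), the prefix is never left to a free-form text generator.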
Also, I'd build a simple TUI (terminal UI) that shows the diff and the suggestion side by side, letting me edit before committing. But that's a weekend project I keep putting off.
This approach works for me, but everyone's workflow is different. Have you tried automating commit messages? Did you end up with a similar pipeline, or do you swear by handwritten messages? I'd love to hear how you handle this (or why you think it's a bad idea in the first place).