I Built a Git Commit Message Generator with AI (Here's What I Learned)

A developer built an AI-powered Git commit message generator using OpenAI's GPT-4o-mini model and a hybrid technique combining classification, validation, and fallback logic. The tool enforces Conventional Commits format by first classifying the diff type, then generating a one-line message, and validating it against a regex. After experimenting with simple prompts, few-shot learning, and local models like Llama 3, the developer settled on a three-part approach that includes truncating diffs to 250 lines and retrying up to three times before falling back to a simple fix message.

I used to be that developer who commits with messages like "fixed bug" or "updated stuff", and I hated myself for it. Every pull request required a frantic rewrite of commit history. So I decided to automate the process with AI. My goal: generate meaningful, conventional commit messages directly from my diff, without too much embarrassment. Spoiler: it wasn't as simple as slapping a prompt in front of GPT. Here's the rollercoaster I went through, the dead ends, and the surprisingly elegant solution I eventually landed on.

I was working on a medium-sized project with a dozen contributors. We enforced Conventional Commits (`feat:`, `fix:`, etc.), but I kept getting lazy. My brain simply didn't want to switch from code mode to prose mode after every diff. I needed a tool that could turn my staged diff into a conventional commit message. I didn't want a full CI pipeline; I wanted a local script I could run before `git commit`.

My first attempt was a one-liner:

```bash
# jq (assumed to be installed) wraps the diff in a JSON payload and escapes it safely
git diff --cached \
  | jq -Rs '{model: "gpt-4", messages: [{role: "user", content: ("Generate a conventional commit message for this diff:\n" + .)}]}' \
  | curl -s -X POST https://api.openai.com/v1/chat/completions \
      -H "Authorization: Bearer $OPENAI_KEY" \
      -H "Content-Type: application/json" \
      -d @-
```

Simple, right? The output was… verbose. It treated the diff like a novel and wrote a paragraph. Worse, it ignored the conventional prefix and just wrote random sentences. I tried adding more instructions, but then it would hallucinate features that weren't there. Total garbage.

Next, I copied prompts from a known blog post: few-shot examples with proper Conventional Commits. That worked about 60% of the time. But for small changes (a typo fix, say) it still generated `refactor:` instead of `fix:`. And the token cost was high because I included 10 examples every time.

Then I ran Ollama with Llama 3. It was slow (20 seconds per message) and often wrote commit messages in the style of a 19th-century novel: "In this commit, improvements were made to the authentication module…" Absolutely unusable for a team.
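Going back to the few-shot attempt for a moment: the cost problem is easier to see when the request is written out. The following is only a rough sketch of how such a message array might be assembled. The two example pairs and the `buildFewShotMessages` helper are hypothetical, not the prompts from the blog post I copied.

```javascript
// Hypothetical example pairs, invented for illustration only.
const EXAMPLES = [
  { diff: '- const retries = 3;\n+ const retries = 5;', message: 'chore: raise retry limit to 5' },
  { diff: '- retrun value;\n+ return value;', message: 'fix: correct typo in return statement' },
];

// Build a chat-completions message array: one system prompt, then every
// example as a user/assistant pair, then the real diff as the last user turn.
function buildFewShotMessages(diff, examples = EXAMPLES) {
  const messages = [
    { role: 'system', content: 'Write a one-line Conventional Commits message for the diff.' },
  ];
  for (const example of examples) {
    messages.push({ role: 'user', content: example.diff });
    messages.push({ role: 'assistant', content: example.message });
  }
  messages.push({ role: 'user', content: diff });
  return messages;
}

const messages = buildFewShotMessages('+ console.log("hi");');
console.log(messages.length); // 6: 1 system + 2 per example + 1 final user turn
```

Every example is resent with every request, so with ten examples each call carries 22 messages before the model has written a single word.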
After two weeks of frustrating iterations, I settled on a hybrid technique that combines three parts: classification, validation, and fallback logic. Here's the core of my implementation in Node.js using the OpenAI SDK (but you can swap in any API that supports chat completions):

```javascript
import { execSync } from 'child_process';
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

function getDiff() {
  const output = execSync('git diff --cached --no-color', { encoding: 'utf-8' });
  // Truncate to first 250 lines to save tokens and keep focus
  return output.split('\n').slice(0, 250).join('\n');
}

function validateMessage(msg) {
  // Conventional Commit regex: type, optional (scope), then a 1-72 character description
  return /^(feat|fix|chore|docs|refactor|test|style|perf)(\([^)]+\))?: .{1,72}$/.test(msg);
}

async function generateCommitMessage(diff) {
  const systemPrompt = `You are a git commit message generator.
First, classify the diff into one of these types: feat, fix, chore, docs, refactor, test, style, perf.
Then write a one-line commit message in the format: type(scope): description
Scope is optional. Keep the description under 50 characters.
Do not add extra commentary.
`;

  for (let attempt = 0; attempt < 3; attempt++) {
    const response = await openai.chat.completions.create({
      model: 'gpt-4o-mini', // cheaper and faster
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: `Diff:\n\`\`\`diff\n${diff}\n\`\`\`` },
      ],
      max_tokens: 100,
      temperature: 0.2,
    });
    const message = response.choices[0].message.content.trim();
    if (validateMessage(message)) return message;
    console.warn(`Attempt ${attempt + 1} failed validation: ${message}`);
  }

  // Fallback: just concatenate type and first line of diff summary
  return `fix: ${diff.split('\n')[1]?.trim().slice(0, 50) || 'minor change'}`;
}

async function main() {
  const diff = getDiff();
  if (!diff) {
    console.log('No staged changes');
    process.exit(0);
  }
  const message = await generateCommitMessage(diff);
  console.log('Suggested commit message:');
  console.log(message);
}

main().catch(console.error);
```

I run this as a Git alias:

```bash
git config --global alias.aimsg '!node ~/scripts/ai-commit.mjs'
```

Then `git aimsg` prints a suggestion. I copy/paste it into my actual commit. Automated commit hooks are dangerous; I trust my brain more than an LLM for the final decision.

… `fix:` in the wrong place.

I sacrificed offline capability for reliability. If you're in a remote cabin without internet, my script won't help. You could swap in a local model, but expect lower accuracy.

I also deliberately avoided `--amend` and automatic commits. Why? Because AI makes mistakes. If I auto-commit a message like "fix: removed console.log" when I actually changed business logic, that's a lie in history. Manual review is cheap insurance.

… `fix: typo` yourself. The overhead of running the script isn't worth it.

I'd explore a two-model approach: use a small, fast model like GPT-4o-mini for type classification and a cheaper summarization model (such as t5-small) for the description. That would cut costs further and allow more offline flexibility.
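I haven't built the two-model version, so treat the following as a sketch of the idea rather than working tooling. `classify` and `summarize` are hypothetical async functions standing in for the two models (no real API is called), and `twoModelMessage` is a name I made up for illustration. The fallbacks mirror the ones in my script above: `fix` as the default type and "minor change" as the default description.

```javascript
// Sketch only: `classify` and `summarize` are hypothetical stand-ins for the
// two models. Swap in real calls to whatever APIs or local models you use.
const TYPES = ['feat', 'fix', 'chore', 'docs', 'refactor', 'test', 'style', 'perf'];

async function twoModelMessage(diff, classify, summarize) {
  // Stage 1: ask the classifier for a type, falling back to "fix" when the
  // answer is not one of the allowed Conventional Commits types.
  const rawType = (await classify(diff)).trim().toLowerCase();
  const type = TYPES.includes(rawType) ? rawType : 'fix';

  // Stage 2: ask the summarizer for a description and keep one short line.
  const summary = (await summarize(diff)).trim().split('\n')[0].slice(0, 50);

  return `${type}: ${summary || 'minor change'}`;
}

// Usage with stub models in place of real ones
twoModelMessage(
  '+ export function login() {}',
  async () => 'Feat\n',
  async () => 'add login helper\nextra commentary',
).then(console.log); // feat: add login helper
```

Because the type comes from a fixed list and the description is cut to a single line, the combined message always has the right shape, even when either model rambles.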
Also, I'd build a simple TUI (terminal UI) that shows the diff and the suggestion side-by-side, letting me edit before committing. But that's a weekend project I keep pushing off.

This approach works for me, but everyone's workflow is different. Have you tried automating commit messages? Did you end up with a similar pipeline, or do you swear by handwritten messages? I'd love to hear how you handle this (or why you think it's a bad idea in the first place).