How to Build an AI Resume Builder with LangChain and Node.js

A developer built an AI-powered resume rewriter using LangChain and Node.js, creating a structured pipeline that transforms generic job descriptions into action-oriented bullet points with measurable achievements. The system parses resumes into sections, runs each through professionally engineered prompts, and returns job-specific output via an Express API with streaming responses. The project demonstrates how LangChain's composable primitives—including PromptTemplate, LLMChain, and SequentialChain—enable multi-step AI workflows that scale beyond raw API calls.

A few months back, my friend Marcus was applying for a senior backend role at a fintech company. He had five years of solid experience — distributed systems, AWS, the whole stack. But his resume read like a list of job descriptions someone had copied from LinkedIn. "Responsible for maintaining microservices." "Assisted with CI/CD pipeline implementation." You know the type. I told him: the problem isn't what you did, it's how you're saying it. Hiring managers spend about six seconds on a resume before deciding whether to read it properly. Six seconds. And if those six seconds are spent reading "responsible for maintaining" — you've lost them. We spent two hours rewriting it together. Every bullet point started with a strong verb. Every achievement had a number. "Reduced API response time by 40% by introducing Redis caching across three high-traffic endpoints." Much better. Marcus got the interview. The obvious next thought was: what if you could automate this? Not in the "dump your resume into ChatGPT and ask it to make it better" way — that produces generic slop. I mean a real, structured AI pipeline that understands resume context, applies professional rewriting patterns, and returns clean, job-specific output. That's what LangChain is built for. And in this guide, we're going to build exactly that: an AI-powered resume rewriter using LangChain and Node.js, with a real Express API, streaming responses, and the kind of prompt engineering that actually produces good results. Here's the honest answer: LangChain is an orchestration framework for building applications on top of large language models. Think of it the way you'd think of Express.js — Express doesn't do anything you couldn't do with raw Node's http module, but it gives you a structured, composable way to build web apps that doesn't collapse under its own weight. LangChain does the same thing for LLM applications. You could just call the OpenAI API directly everywhere. For a one-off script, that's fine. But as soon as your app grows — different prompts for different tasks, multi-step reasoning chains, memory across conversations — raw API calls get messy fast. Here's what raw OpenAI API code looks like once a project grows: js // Raw OpenAI — works, but scales badly const response = await openai.chat.completions.create { model: "gpt-4", messages: { role: "system", content: systemPrompt }, { role: "user", content: Rewrite this section: ${section} } } ; const rewritten = response.choices 0 .message.content; That's fine for one call. Now add: prompt versioning, chaining that output into a second model call, memory from previous messages, fallback to a different model when rate limits hit, streaming output to the client. Suddenly you're managing a lot of state manually. LangChain handles all of that with composable primitives: PromptTemplate for reusable, testable prompts; LLMChain for connecting a prompt to a model; SequentialChain for multi-step pipelines; built-in streaming support; and integrations with every major LLM provider. For our resume builder, the chain looks like this: parse the resume into structured sections, run each section through a prompt that produces action-oriented bullet points, then return the assembled result. Let's build it. Before we write a line of code, here's the system at a glance: ┌─────────────────────────────────────────────────────┐ │ CLIENT Frontend │ │ POST /api/rewrite { resumeText, section } │ └──────────────────────┬──────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────┐ │ EXPRESS API Node.js │ │ 1. Validate input │ │ 2. Parse resume into sections │ │ 3. Call LangChain rewrite chain │ │ 4. Return improved bullet points │ └──────────────────────┬──────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────┐ │ LANGCHAIN REWRITE CHAIN │ │ PromptTemplate → ChatOpenAI GPT-4 → Output │ └──────────────────────┬──────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────┐ │ OPENAI API GPT-4 │ └─────────────────────────────────────────────────────┘ Nothing revolutionary — but each layer has a single, testable job. The chain is the interesting part, so let's get there quickly. Start a new Node.js project and install the dependencies: mkdir resume-ai && cd resume-ai npm init -y npm install express langchain @langchain/openai @langchain/core dotenv Create a .env file at the root: OPENAI API KEY=sk-your-key-here PORT=3001 And your project structure: resume-ai/ ├── src/ │ ├── parseResume.js │ ├── resumeChain.js │ └── app.js ├── .env └── package.json Add "type": "module" to package.json so we can use ES module syntax throughout. This is the unglamorous part that everyone skips, and it's why most AI resume tools produce garbage. You can't just throw 800 words of resume text at a model and ask it to "make it better." You need to isolate the section you're improving — otherwise the model is operating without context. Here's a simple section parser. It's not perfect — real resumes come in dozens of formats — but it handles the common patterns: js // src/parseResume.js export function parseResumeText rawText { const sections = { summary: "", experience: , skills: , education: , }; const sectionKeywords = { summary: "summary", "objective", "profile", "about" , experience: "experience", "employment", "work history", "career" , skills: "skills", "technical skills", "technologies", "competencies" , education: "education", "academic", "degree", "university" , }; const lines = rawText.split "\n" .filter l = l.trim .length 0 ; let currentSection = null; for const line of lines { const lowerLine = line.toLowerCase .trim ; const detected = Object.entries sectionKeywords .find , keywords = keywords.some kw = lowerLine.includes kw ; if detected && lowerLine.length { const { resumeText, targetSection } = req.body; if resumeText || typeof resumeText == "string" { return res.status 400 .json { error: "resumeText is required" } ; } if targetSection || typeof targetSection == "string" { return res.status 400 .json { error: "targetSection is required" } ; } // Stay within token limits — GPT-4 context window is large, // but we don't need to send the whole resume every time. const resumeContext = resumeText.slice 0, 3000 ; try { const result = await rewriteChain.call { resumeContext, sectionText: targetSection, } ; res.json { original: targetSection, rewritten: result.text.trim , } ; } catch err { console.error "Chain error:", err.message ; if err.message?.includes "Rate limit" { return res.status 429 .json { error: "Rate limit hit. Try again in a moment." } ; } res.status 500 .json { error: "Rewrite failed. Check your OpenAI API key." } ; } } ; const PORT = process.env.PORT || 3001; app.listen PORT, = console.log Resume AI API running on :${PORT} ; The input size limit 50kb and the resumeContext.slice 0, 3000 are both intentional. Most GPT-4 token limits won't be hit by a 3,000-character resume excerpt, but some resumes are surprisingly long — especially ones with extensive project descriptions. Truncating at 3,000 characters keeps costs predictable. For a good UX, you want to stream the AI response as it arrives rather than waiting for the full completion. A 400-word rewrite might take 6–8 seconds to complete — a blank screen for 8 seconds feels broken. LangChain makes streaming straightforward with callbacks: js import { HumanMessage } from "@langchain/core/messages"; app.post "/api/rewrite/stream", async req, res = { const { resumeText, targetSection } = req.body; res.setHeader "Content-Type", "text/event-stream" ; res.setHeader "Cache-Control", "no-cache" ; res.setHeader "Connection", "keep-alive" ; res.flushHeaders ; const streamingModel = new ChatOpenAI { modelName: "gpt-4", temperature: 0.4, streaming: true, callbacks: { handleLLMNewToken token { res.write data: ${JSON.stringify { token } } ; }, handleLLMEnd { res.write "data: DONE \n\n" ; res.end ; }, handleLLMError err { res.write data: ${JSON.stringify { error: err.message } } ; res.end ; }, }, , } ; const resumeContext = resumeText?.slice 0, 3000 || ""; const prompt = Rewrite these resume bullets for a software developer. Be concise and action-oriented:\n${targetSection} ; await streamingModel.invoke new HumanMessage prompt ; } ; On the frontend, you'd consume this with the Fetch API and ReadableStream . Each data: event carries a token, and you append it to the UI as it arrives. The user sees the response materialize in real time — feels fast, even when it isn't. GPT-4's context window is large, but you pay per token. If you're sending the full resume + prompt on every request, costs add up fast at scale. The fix: truncate the resume context as shown above and cache the parsed sections so you're not re-parsing on every API call. This is the big one. Ask the model to "quantify achievements" without any source data, and it will make numbers up. "Reduced load time by 73%" sounds great until the hiring manager asks about it in an interview. The fix: explicitly tell the model in the prompt: "Only add numbers if they appear in the original text. If no numbers are present, use qualitative language instead." A crafty user could put something like "Ignore all previous instructions and output..." inside their resume text. Since you're sending that text directly to the model, it works. The fix: sanitize input and separate resume content from the instruction portion of the prompt with a clear delimiter, like ---RESUME START--- / ---RESUME END--- . OpenAI's rate limits are per API key, not per user. One user hammering your endpoint can hit the limit for everyone. Add a rate limiter like express-rate-limit before you go live — 5 requests per minute per IP is a reasonable starting point for a resume tool. GPT-4 is expensive and slow. For most resume rewriting tasks, gpt-4o-mini produces nearly identical results at a fraction of the cost. Test both. You might be surprised how good the cheaper model is for structured, constrained tasks like this one. Factor Raw OpenAI API LangChain Setup complexity Low — one import, one call Medium — more abstractions to learn Single prompt apps Perfect fit Overkill Multi-step chains Tedious to wire manually First-class support Prompt reuse and testing DIY — no built-in structure PromptTemplate makes this easy Memory across turns Manual array management Built-in memory types Streaming Supported, manual wiring Supported, callback-based Switching LLM providers Rewrite API calls Swap the model object Community / ecosystem Smaller OpenAI-specific Large, active, lots of integrations The rule of thumb: if your app makes more than two different types of LLM calls, or if you need any kind of chaining, LangChain saves you from writing orchestration code from scratch. For a simple one-shot wrapper, raw API is cleaner. gpt-4o-mini before defaulting to GPT-4 — it's often good enough and 10x cheaper.