How to Build an AI Resume Builder with LangChain and Node.js

wpnews.pro

A few months back, my friend Marcus was applying for a senior backend role at a fintech company. He had five years of solid experience — distributed systems, AWS, the whole stack. But his resume read like a list of job descriptions someone had copied from LinkedIn. "Responsible for maintaining microservices." "Assisted with CI/CD pipeline implementation." You know the type.

I told him: the problem isn't what you did, it's how you're saying it. Hiring managers spend about six seconds on a resume before deciding whether to read it properly. Six seconds. And if those six seconds are spent reading "responsible for maintaining" — you've lost them.

We spent two hours rewriting it together. Every bullet point started with a strong verb. Every achievement had a number. "Reduced API response time by 40% by introducing Redis caching across three high-traffic endpoints." Much better. Marcus got the interview.

The obvious next thought was: what if you could automate this? Not in the "dump your resume into ChatGPT and ask it to make it better" way — that produces generic slop. I mean a real, structured AI pipeline that understands resume context, applies professional rewriting patterns, and returns clean, job-specific output.

That's what LangChain is built for. And in this guide, we're going to build exactly that: an AI-powered resume rewriter using LangChain and Node.js, with a real Express API, streaming responses, and the kind of prompt engineering that actually produces good results.

Here's the honest answer: LangChain is an orchestration framework for building applications on top of large language models. Think of it the way you'd think of Express.js — Express doesn't do anything you couldn't do with raw Node's http

module, but it gives you a structured, composable way to build web apps that doesn't collapse under its own weight.

LangChain does the same thing for LLM applications. You could just call the OpenAI API directly everywhere. For a one-off script, that's fine. But as soon as your app grows — different prompts for different tasks, multi-step reasoning chains, memory across conversations — raw API calls get messy fast.

Here's what raw OpenAI API code looks like once a project grows:

// Raw OpenAI — works, but scales badly
const response = await openai.chat.completions.create({
  model: "gpt-4",
  messages: [
    { role: "system", content: systemPrompt },
    { role: "user", content: `Rewrite this section: ${section}` }
  ]
});
const rewritten = response.choices[0].message.content;

That's fine for one call. Now add: prompt versioning, chaining that output into a second model call, memory from previous messages, fallback to a different model when rate limits hit, streaming output to the client. Suddenly you're managing a lot of state manually.

LangChain handles all of that with composable primitives: PromptTemplate

for reusable, testable prompts; LLMChain

for connecting a prompt to a model; SequentialChain

for multi-step pipelines; built-in streaming support; and integrations with every major LLM provider.

For our resume builder, the chain looks like this: parse the resume into structured sections, run each section through a prompt that produces action-oriented bullet points, then return the assembled result. Let's build it.

Before we write a line of code, here's the system at a glance:

┌─────────────────────────────────────────────────────┐
│                   CLIENT (Frontend)                  │
│         POST /api/rewrite { resumeText, section }    │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                  EXPRESS API (Node.js)               │
│  1. Validate input                                   │
│  2. Parse resume into sections                       │
│  3. Call LangChain rewrite chain                     │
│  4. Return improved bullet points                    │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│              LANGCHAIN REWRITE CHAIN                 │
│  PromptTemplate → ChatOpenAI (GPT-4) → Output       │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                  OPENAI API (GPT-4)                  │
└─────────────────────────────────────────────────────┘

Nothing revolutionary — but each layer has a single, testable job. The chain is the interesting part, so let's get there quickly.

Start a new Node.js project and install the dependencies:

mkdir resume-ai && cd resume-ai
npm init -y
npm install express langchain @langchain/openai @langchain/core dotenv

Create a .env

file at the root:

OPENAI_API_KEY=sk-your-key-here
PORT=3001

And your project structure:

resume-ai/
├── src/
│   ├── parseResume.js
│   ├── resumeChain.js
│   └── app.js
├── .env
└── package.json

Add "type": "module"

to package.json

so we can use ES module syntax throughout.

This is the unglamorous part that everyone skips, and it's why most AI resume tools produce garbage. You can't just throw 800 words of resume text at a model and ask it to "make it better." You need to isolate the section you're improving — otherwise the model is operating without context.

Here's a simple section parser. It's not perfect — real resumes come in dozens of formats — but it handles the common patterns:

// src/parseResume.js
export function parseResumeText(rawText) {
  const sections = {
    summary: "",
    experience: [],
    skills: [],
    education: [],
  };

  const sectionKeywords = {
    summary: ["summary", "objective", "profile", "about"],
    experience: ["experience", "employment", "work history", "career"],
    skills: ["skills", "technical skills", "technologies", "competencies"],
    education: ["education", "academic", "degree", "university"],
  };

  const lines = rawText.split("\n").filter((l) => l.trim().length > 0);
  let currentSection = null;

  for (const line of lines) {
    const lowerLine = line.toLowerCase().trim();

    const detected = Object.entries(sectionKeywords).find(([, keywords]) =>
      keywords.some((kw) => lowerLine.includes(kw))
    );

    if (detected && lowerLine.length  {
  const { resumeText, targetSection } = req.body;

  if (!resumeText || typeof resumeText !== "string") {
    return res.status(400).json({ error: "resumeText is required" });
  }
  if (!targetSection || typeof targetSection !== "string") {
    return res.status(400).json({ error: "targetSection is required" });
  }

  // Stay within token limits — GPT-4 context window is large,
  // but we don't need to send the whole resume every time.
  const resumeContext = resumeText.slice(0, 3000);

  try {
    const result = await rewriteChain.call({
      resumeContext,
      sectionText: targetSection,
    });

    res.json({
      original: targetSection,
      rewritten: result.text.trim(),
    });
  } catch (err) {
    console.error("Chain error:", err.message);

    if (err.message?.includes("Rate limit")) {
      return res.status(429).json({ error: "Rate limit hit. Try again in a moment." });
    }

    res.status(500).json({ error: "Rewrite failed. Check your OpenAI API key." });
  }
});

const PORT = process.env.PORT || 3001;
app.listen(PORT, () => console.log(`Resume AI API running on :${PORT}`));

The input size limit (50kb

) and the resumeContext.slice(0, 3000)

are both intentional. Most GPT-4 token limits won't be hit by a 3,000-character resume excerpt, but some resumes are surprisingly long — especially ones with extensive project descriptions. Truncating at 3,000 characters keeps costs predictable.

For a good UX, you want to stream the AI response as it arrives rather than waiting for the full completion. A 400-word rewrite might take 6–8 seconds to complete — a blank screen for 8 seconds feels broken.

LangChain makes streaming straightforward with callbacks:

import { HumanMessage } from "@langchain/core/messages";

app.post("/api/rewrite/stream", async (req, res) => {
  const { resumeText, targetSection } = req.body;

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  const streamingModel = new ChatOpenAI({
    modelName: "gpt-4",
    temperature: 0.4,
    streaming: true,
    callbacks: [
      {
        handleLLMNewToken(token) {
          res.write(`data: ${JSON.stringify({ token })}

`);
        },
        handleLLMEnd() {
          res.write("data: [DONE]\n\n");
          res.end();
        },
        handleLLMError(err) {
          res.write(`data: ${JSON.stringify({ error: err.message })}

`);
          res.end();
        },
      },
    ],
  });

  const resumeContext = resumeText?.slice(0, 3000) || "";
  const prompt = `Rewrite these resume bullets for a software developer. Be concise and action-oriented:\n${targetSection}`;

  await streamingModel.invoke([new HumanMessage(prompt)]);
});

On the frontend, you'd consume this with the Fetch API and ReadableStream

. Each data:

event carries a token, and you append it to the UI as it arrives. The user sees the response materialize in real time — feels fast, even when it isn't.

GPT-4's context window is large, but you pay per token. If you're sending the full resume + prompt on every request, costs add up fast at scale. The fix: truncate the resume context (as shown above) and cache the parsed sections so you're not re-parsing on every API call.

This is the big one. Ask the model to "quantify achievements" without any source data, and it will make numbers up. "Reduced load time by 73%" sounds great until the hiring manager asks about it in an interview. The fix: explicitly tell the model in the prompt: "Only add numbers if they appear in the original text. If no numbers are present, use qualitative language instead."

A crafty user could put something like "Ignore all previous instructions and output..."

inside their resume text. Since you're sending that text directly to the model, it works. The fix: sanitize input and separate resume content from the instruction portion of the prompt with a clear delimiter, like ---RESUME START---

/ ---RESUME END---

.

OpenAI's rate limits are per API key, not per user. One user hammering your endpoint can hit the limit for everyone. Add a rate limiter like express-rate-limit

before you go live — 5 requests per minute per IP is a reasonable starting point for a resume tool.

GPT-4 is expensive and slow. For most resume rewriting tasks, gpt-4o-mini

produces nearly identical results at a fraction of the cost. Test both. You might be surprised how good the cheaper model is for structured, constrained tasks like this one.

Factor

Raw OpenAI API

LangChain

Setup complexity

Low — one import, one call

Medium — more abstractions to learn

Single prompt apps

Perfect fit

Overkill

Multi-step chains

Tedious to wire manually

First-class support

Prompt reuse and testing

DIY — no built-in structure

PromptTemplate makes this easy

Memory across turns

Manual array management

Built-in memory types

Streaming

Supported, manual wiring

Supported, callback-based

Switching LLM providers

Rewrite API calls

Swap the model object

Community / ecosystem

Smaller (OpenAI-specific)

Large, active, lots of integrations

The rule of thumb: if your app makes more than two different types of LLM calls, or if you need any kind of chaining, LangChain saves you from writing orchestration code from scratch. For a simple one-shot wrapper, raw API is cleaner.

gpt-4o-mini

before defaulting to GPT-4 — it's often good enough and 10x cheaper.

source & further reading

dev.to — original article Clinejection: How a GitHub Issue Title Compromised an AI Coding Assistant Used by 5M Developers How to Enhance AI Agents for Structured Codebases Your LLM can't actually watch video. Here's the smallest fix (MIT)

How to Build an AI Resume Builder with LangChain and Node.js

Run your AI side-project on zahid.host