Source: dev.to · topic: developer-tools

PDF::Make - PDF Generation, Extraction and Modification.

A developer built PDF::Make, a Perl toolkit for generating, extracting, and modifying PDF files. The toolkit provides low-level PDF object manipulation and a high-level builder API, and supports post-processing existing PDFs by extracting structured text and drawing annotations. A practical example demonstrates creating a PDF and then highlighting matched terms using the extract_structured method.

3 min read · published Jun 28, 2026

I’ve always been fascinated by PDFs. They look simple on the surface, just a document you can open anywhere, but underneath they’re a full layout engine, object graph, drawing model, and archival format all at once. I enjoy that mix of precision and complexity, and that is exactly what led me to build PDF::Make (and yes, I had some help from the Claude LLM). I wanted a fully featured toolkit that could both generate PDFs and let me inspect and edit them programmatically.

At the low level, PDF::Make exposes the raw building blocks of the format: PDF objects, pages, the drawing canvas, a parser/reader, and import/merge primitives. This is the layer you reach for when you need fine-grained control or want to work with the structure of a document directly.

For everyday document creation, PDF::Make::Builder sits on top of that foundation and provides a higher-level API. It handles the boilerplate of page setup, fonts, text flow, and layout so you can produce a polished PDF in just a few lines of Perl.

The same toolkit is also designed for post-processing. You can open an existing PDF, extract structured text along with its coordinates, and then draw annotations or overlays back onto the page, making it straightforward to build review, QA, or markup workflows on top of documents you didn’t originally generate.
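To make "structured text along with its coordinates" concrete, here is a small sketch in plain Perl. It assumes the nesting that the highlight script later in this post walks (blocks containing lines containing words, each word carrying text, x0, y0, x1, y1). The sample values and the flatten_words helper are hypothetical and not part of PDF::Make.

```perl
use strict;
use warnings;

# Hypothetical sample mirroring the blocks -> lines -> words nesting that the
# highlight script in this post walks; the coordinates are made up.
my $blocks = [
    {
        lines => [
            {
                words => [
                    { text => 'PDF::Make', x0 => 72,  y0 => 700, x1 => 130, y1 => 712 },
                    { text => 'builds',    x0 => 134, y0 => 700, x1 => 168, y1 => 712 },
                ],
            },
        ],
    },
    {
        lines => [
            { words => [ { text => 'highlight.', x0 => 72, y0 => 680, x1 => 125, y1 => 692 } ] },
        ],
    },
];

# Hypothetical helper: flatten the nesting into one list of word hashes,
# tolerating blocks or lines whose child list is missing.
sub flatten_words {
    my ($blocks) = @_;
    return map { @{ $_->{words} || [] } }
           map { @{ $_->{lines} || [] } }
           @{ $blocks || [] };
}

my @words = flatten_words($blocks);

# Print each word with its bounding-box corners.
printf "%-12s (%g, %g) - (%g, %g)\n", @{$_}{qw/text x0 y0 x1 y1/} for @words;
```

Working on a flat word list like this can make filtering and sorting simpler than three nested loops, at the cost of losing the block and line grouping.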

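This post shows a practical two-step flow: first create a source PDF with PDF::Make::Builder, then reopen it, extract the structured text, and highlight the matching words.

Script: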

#!/usr/bin/perl
use strict;
use warnings;

use PDF::Make::Builder;

my $pdf = PDF::Make::Builder->new(
    file_name => 'source_demo.pdf',
    configure => {
        text => {
            font => { family => 'Helvetica', size => 12, colour => '#222222' },
        },
    },
);

$pdf->add_page(page_size => 'Letter')
    ->add_h1(text => 'PDF::Make blog demo')
    ->add_text(text => 'PDF::Make builds and edits PDF files directly from Perl.')
    ->add_text(text => 'In the next step we extract text coordinates and highlight matches.')
    ->add_text(text => 'Target terms: PDF::Make, extract_structured, highlight.')
    ->add_text(text => 'This line repeats PDF::Make so multiple boxes are drawn around matches.')
    ->save;

print "Created source_demo.pdf\n";

That gives us a baseline document to post-process.

Now we open the existing PDF, call extract_structured page by page, and draw a red box around every word that matches the search pattern:

#!/usr/bin/perl
use strict;
use warnings;
use PDF::Make::Builder;

my $in  = $ARGV[0] // 'source_demo.pdf';
my $out = $ARGV[1] // 'source_demo_highlighted.pdf';
my $re  = $ARGV[2] // 'PDF::Make';

my $b = PDF::Make::Builder->open_existing($in, file_name => $out);
my $pad = 1.5;
my $page_count = $b->page_count;
for my $idx (0 .. $page_count - 1) {
    my $res = $b->extract_structured($in, page => $idx, invisible => 1);
    my $blocks = $res->data || [];

    $b->open_page($idx + 1);
    my $canvas = $b->page->canvas;

    for my $block (@$blocks) {
        my $lines = $block->{lines} || [];
        for my $line (@$lines) {
            my $words = $line->{words} || [];
            for my $w (@$words) {
                my $text = $w->{text} // '';
                next unless $text =~ /$re/i;

                my ($x0, $y0, $x1, $y1) = @{$w}{qw/x0 y0 x1 y1/};
                my $rx  = $x0 - $pad;
                my $ry  = $y0 - $pad;
                my $rw  = ($x1 - $x0) + (2 * $pad);
                my $rh  = ($y1 - $y0) + (2 * $pad);
                $canvas->q->w(0.8)->RG(1, 0, 0)->re($rx, $ry, $rw, $rh)->S->Q;
            }
        }
    }
}

$b->save;
print "Created $out\n";
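Two pieces of the loop above can be pulled out and checked without touching a PDF. The sketch below is plain Perl. The literal_matcher and padded_rect helpers are hypothetical names, not part of PDF::Make. The script above treats its third argument as a regex, which is a reasonable choice. If you would rather match a literal term, escaping it with \Q...\E keeps input such as 'C++' from being parsed as a pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a literal search term into a case-insensitive
# regex. \Q...\E escapes metacharacters such as + or ( so the term is matched
# as text rather than interpreted as a pattern.
sub literal_matcher {
    my ($term) = @_;
    return qr/\Q$term\E/i;
}

# Hypothetical helper: grow a word box (x0, y0, x1, y1) by $pad on every side
# and return it as (x, y, width, height), the argument order the script above
# passes to the canvas re call.
sub padded_rect {
    my ($word, $pad) = @_;
    my ($x0, $y0, $x1, $y1) = @{$word}{qw/x0 y0 x1 y1/};
    return ($x0 - $pad, $y0 - $pad, ($x1 - $x0) + 2 * $pad, ($y1 - $y0) + 2 * $pad);
}

my $re   = literal_matcher('PDF::Make');
my $word = { text => 'pdf::make,', x0 => 72, y0 => 700, x1 => 130, y1 => 712 };

if ($word->{text} =~ $re) {
    my @rect = padded_rect($word, 1.5);
    print "rect: @rect\n";    # prints "rect: 70.5 698.5 61 15"
}
```

Because the match is a substring match on each extracted word, a word with trailing punctuation such as "PDF::Make," still matches, and the box covers the whole word including the comma.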

I would recommend reading the full documentation on CPAN to get the most out of the toolkit: PDF::Make and PDF::Make::Builder.
