The Discoverable Evidence of AI-Assisted Software Porting

A developer used OpenAI's Codex AI assistant to port the williamcotton.com website from a custom language called Web Pipe to Rust, with the AI successfully recreating the site and integrating Contentful for dynamic content. The process was documented through discoverable evidence in Codex's log files, revealing the AI's step-by-step reasoning and code generation.

June 25th, 2026

We start with a simple instruction without a lot of detail:

    copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs

    the goal is to recreate williamcotton.com entirely in rust

And Codex is off to the races. It "thinks" for some time, checks the subfolder with my copy of this website, written (https://github.com/williamcotton/williamcotton.com/blob/main/app.wp) in a language of my own called Web Pipe (https://github.com/williamcotton/webpipe), curls the actual website, and then it gets to work.

Ah, the familiar green and red text that shows my work being done for me. Copies of Google Chrome come and go with brief glimpses of my website. And then finally, finished!

Only it has just hardcoded all of the pages instead of wiring up to Contentful. So some further prodding is required:

    you've made a critical mistake - the articles are hard coded but should come from contentful instead

And now the real thinking begins. I drink some coffee and in the meantime I fire up another terminal window:

    $ cd ~/.codex
    $ grep "I've done a cargo init in wmct-copy-codex-rust/src/main.rs" . -R
    ./2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl:{"timestamp":"2026-06-26T12:06:00.155Z","type":"response_item","payload":{"type":"message","role":"user","content":[{"type":"input_text","text":"copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust"}],"internal_chat_message_metadata_passthrough":{"turn_id":"019f03d2-94c2-7900-bba1-9363baf4f8d5"}}}
    ./2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl:{"timestamp":"2026-06-26T12:06:00.155Z","type":"event_msg","payload":{"type":"user_message","message":"copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust","images":[],"local_images":[],"text_elements":[]}}

What do we have here? It looks like a couple of lines from a JSONL file, one being an "event message" and the other being a "response item". The response item has a "turn_id". Interesting.

Oh, but Codex has finally finished! I can tell it works because I load this very article's URL, http://localhost:1234/articles/the-discoverable-evidence-of-ai-assisted-software-porting, on the running instance and see the then-burgeoning article in question.

I look at the code, and in typical fashion, the entire application is in just a single main.rs file. Reading through the code I can see that it successfully ported the key parts. It most definitely relied on the render_rich_text function, which takes a recursive Contentful-provided JSON tree of nodes and returns fully formatted HTML ready for consumption by someone's browser of choice. It's also made sure it uses HTMX. It has tests.

Let's look around a bit again.

    $ cd ~/.codex
    $ grep "I've done a cargo init in wmct-copy-codex-rust/src/main.rs" . -R
    Binary file ./logs_2.sqlite matches
    ./sessions/2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl:{"timestamp":"2026-06-26T12:06:00.155Z","type":"response_item","payload":{"type":"message","role":"user","content":[{"type":"input_text","text":"copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust"}],"internal_chat_message_metadata_passthrough":{"turn_id":"019f03d2-94c2-7900-bba1-9363baf4f8d5"}}}
    ./sessions/2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl:{"timestamp":"2026-06-26T12:06:00.155Z","type":"event_msg","payload":{"type":"user_message","message":"copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust","images":[],"local_images":[],"text_elements":[]}}
    Binary file ./state_5.sqlite matches
    ./history.jsonl:{"session_id":"019f03d0-f7d1-7931-a089-3cb1c1f627cd","ts":1782475560,"text":"copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust"}
    Binary file ./state_5.sqlite-wal matches

Interesting! We see some SQLite databases with the matches now as well. We'll take a look at those in a second. We still see our "response item" and our "event message" but we also see a match in a JSONL file that seems to contain our message history. And this one has a session_id. And this session_id matches part of the name of the JSONL file.

Maybe it's time to take a closer look at this rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl file.

    $ cat ./sessions/2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl | wc
         477   65760 1462884

Not a lot of lines but quite a number of characters. I take a look at the file in a text editor.
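A detail worth noting before opening it: the session id in that filename has the shape of a version 7 UUID (its third group starts with 7), and a UUIDv7 carries a Unix timestamp in milliseconds in its first 48 bits. The sketch below decodes that timestamp. The function name is hypothetical, and treating Codex's ids as UUIDv7 is an inference from their shape, not something the logs state.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uuid7_timestamp(value: str) -> datetime:
    """Return the UTC time stored in the top 48 bits of a version 7 UUID."""
    parsed = uuid.UUID(value)
    if parsed.version != 7:
        raise ValueError(f"{value} is not a version 7 UUID")
    millis = parsed.int >> 80  # 128 bits total, keep the leading 48
    return EPOCH + timedelta(milliseconds=millis)


# Ids copied from the grep output above.
session_id = "019f03d0-f7d1-7931-a089-3cb1c1f627cd"
turn_id = "019f03d2-94c2-7900-bba1-9363baf4f8d5"

print(uuid7_timestamp(session_id).isoformat())  # 2026-06-26T12:04:14.417000+00:00
print(uuid7_timestamp(turn_id).isoformat())     # 2026-06-26T12:06:00.130000+00:00
```

If that reading is right, the session began at 12:04:14 UTC, which lines up with the 07-04-14 in the filename for a machine five hours behind UTC, and the turn id lands a few milliseconds before the 12:06:00.155Z timestamp logged for the first prompt.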
I come across data:image/png;base64 - well this explains at least some of the file size. Let's poke around in a more civilized manner.

    $ cat ./sessions/2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl \
        | jq -sr 'map(.type) | group_by(.) | map({type: .[0], count: length})'
    [
      { "type": "event_msg", "count": 125 },
      { "type": "response_item", "count": 349 },
      { "type": "session_meta", "count": 1 },
      { "type": "turn_context", "count": 2 }
    ]

Alright, so mainly "response items".

    $ cat ./sessions/2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl \
        | jq -sr 'map(.payload.type) | group_by(.) | map({type: .[0], count: length})'
    [
      { "type": null, "count": 3 },
      { "type": "agent_message", "count": 46 },
      { "type": "custom_tool_call", "count": 9 },
      { "type": "custom_tool_call_output", "count": 9 },
      { "type": "function_call", "count": 114 },
      { "type": "function_call_output", "count": 114 },
      { "type": "message", "count": 50 },
      { "type": "patch_apply_end", "count": 9 },
      { "type": "reasoning", "count": 52 },
      { "type": "task_complete", "count": 2 },
      { "type": "task_started", "count": 2 },
      { "type": "token_count", "count": 63 },
      { "type": "user_message", "count": 2 },
      { "type": "web_search_call", "count": 1 },
      { "type": "web_search_end", "count": 1 }
    ]

And here we mainly see function calling. Now let's take a look at the messages we typed into Codex itself.
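First, for anyone without jq on hand, the two tallies above can be approximated with the Python standard library. This is a minimal sketch: the function name and the sample lines are illustrative, and it assumes each line of the rollout file is one JSON object with a top-level type and an optional payload.type, as the grep output suggests.

```python
import json
from collections import Counter
from typing import Iterable


def tally_types(lines: Iterable[str]) -> tuple[Counter, Counter]:
    """Count top-level `type` values and nested `payload.type` values."""
    outer, inner = Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        outer[record.get("type")] += 1
        payload = record.get("payload")
        # Records whose payload has no "type" are counted under None,
        # which is what jq reports as null.
        inner[payload.get("type") if isinstance(payload, dict) else None] += 1
    return outer, inner


# Made-up sample lines in the same shape as the rollout file.
sample = [
    '{"type": "session_meta", "payload": {"id": "example"}}',
    '{"type": "event_msg", "payload": {"type": "user_message"}}',
    '{"type": "response_item", "payload": {"type": "message"}}',
    '{"type": "response_item", "payload": {"type": "function_call"}}',
]
outer, inner = tally_types(sample)
print(dict(outer))
print(dict(inner))
```

Passing an open file handle instead of the sample list streams the real log one line at a time. The three null entries in the jq output are consistent with the one session_meta and two turn_context records having no payload.type, though that is an inference from the counts.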
    $ cat ./sessions/2026/06/26/rollout-2026-06-26T07-04-14-019f03d0-f7d1-7931-a089-3cb1c1f627cd.jsonl \
        | jq 'select(.payload.type == "user_message")'
    {
      "timestamp": "2026-06-26T12:06:00.155Z",
      "type": "event_msg",
      "payload": {
        "type": "user_message",
        "message": "copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust",
        "images": [],
        "local_images": [],
        "text_elements": []
      }
    }
    {
      "timestamp": "2026-06-26T12:15:38.716Z",
      "type": "event_msg",
      "payload": {
        "type": "user_message",
        "message": "you've made a critical mistake - the articles are hard coded but should come from contentful instead",
        "images": [],
        "local_images": [],
        "text_elements": []
      }
    }

Our foray into the land of JSONL is probably complete for now. Let's poke around in these SQLite databases.

    $ sqlite3 ./logs_2.sqlite '.tables'
    _sqlx_migrations  logs
    $ sqlite3 ./logs_2.sqlite '.schema logs'
    CREATE TABLE logs (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        ts INTEGER NOT NULL,
        ts_nanos INTEGER NOT NULL,
        level TEXT NOT NULL,
        target TEXT NOT NULL,
        feedback_log_body TEXT,
        module_path TEXT,
        file TEXT,
        line INTEGER,
        thread_id TEXT,
        process_uuid TEXT,
        estimated_bytes INTEGER NOT NULL DEFAULT 0
    );
    CREATE INDEX idx_logs_ts ON logs(ts DESC, ts_nanos DESC, id DESC);
    CREATE INDEX idx_logs_thread_id ON logs(thread_id);
    CREATE INDEX idx_logs_thread_id_ts ON logs(thread_id, ts DESC, ts_nanos DESC, id DESC);
    CREATE INDEX idx_logs_process_uuid_threadless_ts ON logs(process_uuid, ts DESC, ts_nanos DESC, id DESC) WHERE thread_id IS NULL;

Ok, so there's a logs table. What does this look like?
    $ sqlite3 -header -column ./logs_2.sqlite \
        'SELECT * FROM logs LIMIT 1;'
    id         ts          ts_nanos  level  target                            feedback_log_body                   module_path                       file                                                                                                            line  thread_id  process_uuid                                    estimated_bytes
    ---------  ----------  --------  -----  --------------------------------  ----------------------------------  --------------------------------  --------------------------------------------------------------------------------------------------------------  ----  ---------  ----------------------------------------------  ---------------
    107247977  1781611464  18308000  TRACE  hyper_util::client::legacy::pool  idle interval checking for expired  hyper_util::client::legacy::pool  /Users/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hyper-util-0.1.20/src/client/legacy/pool.rs  806              pid:10447:d93df5c9-186c-40c3-b9be-e7126cb2c42b  213

Oh boy, this is going to take some deeper diving to figure out. Let's see if we can even find where our text is stored in this database and in what format. After some poking I settled on this query:

    $ sqlite3 -header -column ./logs_2.sqlite "
    SELECT rowid, level, target
    FROM logs
    WHERE feedback_log_body LIKE '%I''ve done a cargo init in wmct-copy-codex-rust/src/main.rs%';
    "
    id         level  target
    ---------  -----  ----------------------------------------
    136150688  TRACE  codex_api::endpoint::responses_websocket
    136152594  TRACE  codex_api::endpoint::responses_websocket
    136154340  TRACE  codex_api::endpoint::responses_websocket
    136155560  TRACE  codex_api::endpoint::responses_websocket
    136156345  TRACE  codex_api::endpoint::responses_websocket
    136159063  TRACE  codex_api::endpoint::responses_websocket
    136161668  TRACE  codex_api::endpoint::responses_websocket
    136162148  TRACE  codex_api::endpoint::responses_websocket
    136163302  TRACE  codex_api::endpoint::responses_websocket
    136164592  TRACE  codex_api::endpoint::responses_websocket

So we at least know that this is related to web sockets in some manner? The "feedback_log_body" is gigantic so we'll just take a little peek.
    $ sqlite3 -noheader ./logs_2.sqlite "
    SELECT substr(feedback_log_body, 1, 200)
    FROM logs
    WHERE feedback_log_body LIKE '%I''ve done a cargo init in wmct-copy-codex-rust/src/main.rs%'
    LIMIT 1;
    "
    session_loop{thread_id=019f03d0-f7d1-7931-a089-3cb1c1f627cd}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019f03db-68d2-7cd3-9b82-e8b81ba57791" codex.op="user_input"}:turn{ote

Great, it's some custom log format that we're definitely not going to attempt to parse at this point. But we can poke around a little bit more:

    $ sqlite3 -noheader ./logs_2.sqlite "
    SELECT feedback_log_body
    FROM logs
    WHERE feedback_log_body LIKE '%I''ve done a cargo init in wmct-copy-codex-rust/src/main.rs%'
    LIMIT 1;
    " > /tmp/log.txt

And then after manually searching through the text for my prompt string in a text editor I find some JSON.

    {
      "type": "message",
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust"
        }
      ],
      "internal_chat_message_metadata_passthrough": {
        "turn_id": "019f03d2-94c2-7900-bba1-9363baf4f8d5"
      }
    }

Ah, but this is a different kind of message. Let's see if it's contained in the JSONL we were looking at before:

    {
      "timestamp": "2026-06-26T12:06:00.155Z",
      "type": "response_item",
      "payload": {
        "type": "message",
        "role": "user",
        "content": [
          {
            "type": "input_text",
            "text": "copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs \n\nthe goal is to recreate williamcotton.com entirely in rust"
          }
        ],
        "internal_chat_message_metadata_passthrough": {
          "turn_id": "019f03d2-94c2-7900-bba1-9363baf4f8d5"
        }
      }
    }

So the same payload but a different container. Let's change tack now and look at this other database. This time we'll write some Python to help us search through all of the tables.
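First, though, the manual hunt through /tmp/log.txt can be roughed out in code. A minimal sketch, assuming the log body is free-form text with complete JSON objects embedded in it: it tries to decode JSON at every opening brace and keeps the outermost objects whose raw text contains the search string. The function name and the sample blob are hypothetical, and rescanning on every failed brace could be slow on a very large body.

```python
import json
from typing import Iterator


def json_objects_containing(text: str, needle: str) -> Iterator[dict]:
    """Yield outermost JSON objects embedded in `text` that mention `needle`."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            obj, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            # Not JSON here (e.g. a span like turn{...}); try the next brace.
            pos = text.find("{", pos + 1)
            continue
        if isinstance(obj, dict) and needle in text[pos:end]:
            yield obj
        # Resume after this object so its nested objects are not re-yielded.
        pos = text.find("{", end)


# Made-up blob imitating the span-prefixed log format.
blob = (
    'session_loop{thread_id=abc}:turn{x} body='
    '{"type": "message", "content": [{"type": "input_text", "text": "find me"}]}'
    ' trailing text'
)
for found in json_objects_containing(blob, "find me"):
    print(found["type"])  # message
```

The match is made against the raw text, so a search string containing characters that JSON escapes (quotes, newlines) would need to be escaped the same way first.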
    import sqlite3

    needle = "I've done a cargo init in wmct-copy-codex-rust/src/main.rs"

    db = sqlite3.connect("state_5.sqlite")
    cur = db.cursor()

    # Collect every table name up front so the cursor can be reused below.
    tables = [r[0] for r in cur.execute("SELECT name FROM sqlite_master WHERE type='table'")]

    for table in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        cols = cur.execute(f"PRAGMA table_info('{table}')").fetchall()
        for _, col, typ, *_ in cols:
            if typ.upper() == "TEXT":
                try:
                    sql = f'SELECT rowid, "{col}" FROM "{table}" WHERE "{col}" LIKE ?'
                    for rowid, value in cur.execute(sql, (f"%{needle}%",)):
                        print(f"{table}.{col} rowid={rowid}")
                        print(value[:200])
                        print()
                except Exception:
                    # Skip tables that cannot be queried this way (e.g. no rowid).
                    pass

With our output being:

    threads.title rowid=375
    copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs

    the goal is to recreate williamcotton.com entirely in rust

    threads.first_user_message rowid=375
    copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs

    the goal is to recreate williamcotton.com entirely in rust

    threads.preview rowid=375
    copy williamcotton.com - I've done a cargo init in wmct-copy-codex-rust/src/main.rs

    the goal is to recreate williamcotton.com entirely in rust

Taking a look at the threads table, here is its schema, extracted from a bunch of triggers and other ephemera and cleaned up a bit:
    CREATE TABLE threads (
        id TEXT PRIMARY KEY,
        rollout_path TEXT NOT NULL,
        created_at INTEGER NOT NULL,
        updated_at INTEGER NOT NULL,
        source TEXT NOT NULL,
        model_provider TEXT NOT NULL,
        cwd TEXT NOT NULL,
        title TEXT NOT NULL,
        sandbox_policy TEXT NOT NULL,
        approval_mode TEXT NOT NULL,
        tokens_used INTEGER NOT NULL DEFAULT 0,
        has_user_event INTEGER NOT NULL DEFAULT 0,
        archived INTEGER NOT NULL DEFAULT 0,
        archived_at INTEGER,
        git_sha TEXT,
        git_branch TEXT,
        git_origin_url TEXT,
        cli_version TEXT NOT NULL DEFAULT '',
        first_user_message TEXT NOT NULL DEFAULT '',
        agent_nickname TEXT,
        agent_role TEXT,
        memory_mode TEXT NOT NULL DEFAULT 'enabled',
        model TEXT,
        reasoning_effort TEXT,
        agent_path TEXT,
        created_at_ms INTEGER,
        updated_at_ms INTEGER,
        thread_source TEXT,
        preview TEXT NOT NULL DEFAULT '',
        recency_at INTEGER NOT NULL DEFAULT 0,
        recency_at_ms INTEGER NOT NULL DEFAULT 0
    );

So now let's take a look at just the first row and not the 375th row:

    $ sqlite3 ./state_5.sqlite -header -column 'SELECT title, first_user_message, preview FROM threads LIMIT 1;'
    title                                                                                                                                                        first_user_message                                                                                                                                           preview
    -----------------------------------------------------------------------------------------------------------------------------------------------------------  -----------------------------------------------------------------------------------------------------------------------------------------------------------  -----------------------------------------------------------------------------------------------------------------------------------------------------------
    write an html to markdown converter - use recursive descent to make an ast - first things first, parse to an AST, we can make the AST to markdown step next  write an html to markdown converter - use recursive descent to make an ast - first things first, parse to an AST, we can make the AST to markdown step next  write an html to markdown converter - use recursive descent to make an ast - first things first, parse to an AST, we can make the AST to markdown step next

This has nothing to do with our current session! In fact this looks like the very first prompt I put into Codex.

    write an html to markdown converter - use recursive descent to make an ast - first things first, parse to an AST, we can make the AST to markdown step next

Let's count the number of rows in this table:

    $ sqlite3 ./state_5.sqlite 'SELECT count(*) FROM threads;'
    375

And the count is the row number of the message we were looking for! Alas, at least for this approach, we've only got the first message from the prompt. But at least we still have our JSONL files to look through!

Now that we've poked around in the stored changes, which can help describe the intent of using such tools, let's take a look at the code itself to see what we can find.

    fn main() -> std::io::Result<()> {
        let host = env::var("HOST").unwrap_or_else(|_| "127.0.0.1".to_string());
        let port = env::var("PORT").unwrap_or_else(|_| "3000".to_string());
        let address = format!("{host}:{port}");
        let listener = TcpListener::bind(&address)?;

        println!("Serving williamcotton.com Rust copy at http://{address}");

        for stream in listener.incoming() {
            match stream {
                Ok(stream) => {
                    thread::spawn(|| {
                        if let Err(error) = handle_client(stream) {
                            eprintln!("request failed: {error}");
                        }
                    });
                }
                Err(error) => eprintln!("connection failed: {error}"),
            }
        }

        Ok(())
    }

Well, part of our prompt ended up in the source code, as there is an explicit mention of this being a "Rust copy". Let's see what else we can find.

A very important consideration in this code comparison is that Web Pipe is a very different kind of language than Rust. It's a DSL. It is probably closer to a declarative configuration language than anything else. It certainly isn't a general purpose programming language. But we still see fingerprints all over the code. For example, the code continues to use HTMX and keeps the very same HTML.
Rust code:

    fn blog_layout(title: &str, description: Option<&str>, content: &str) -> String {
        base_layout(title, description, content, true)
    }

    fn base_layout(title: &str, description: Option<&str>, content: &str, blog_scripts: bool) -> String {
        let meta_tags = description
            .filter(|value| !value.is_empty())
            .map(|value| format!("<meta name=\"description\" content=\"{}\">", escape_html(value)))
            .unwrap_or_default();
        let head_extras = if blog_scripts {
            r#"
        <script src="https://cdn.jsdelivr.net/npm/htmx.org@2.0.6/dist/htmx.min.js"></script>
        <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/highlight.min.js"></script>
        <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/languages/fsharp.min.js"></script>
        <script src="/hljs-wp.js"></script>
    "#
        } else {
            ""
        };
        let footer_scripts = if blog_scripts {
            r#"
        <script>try{hljs.highlightAll();}catch(e){}</script>
    "#
        } else {
            ""
        };

        format!(
            r##"<!DOCTYPE html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1"/>
        <link rel="preload" href="LigaMenlo-Regular.woff" as="font" type="font/woff" crossorigin>
        <title id="page-title">{}</title>
        <link rel="stylesheet" href="/app.css"/>
        {}{}
      </head>
      <body>
        <div id="app">
          <div class="sitewrapper">
            <header>
              <h1><a href="/" hx-get="/htmx/" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/">williamcotton.com</a></h1>
              <nav>
                <a href="/about" hx-get="/htmx/about" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/about">About</a>
                <a href="/bio" hx-get="/htmx/bio" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/bio">Bio</a>
                <a href="/contact" hx-get="/htmx/contact" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/contact">Contact</a>
              </nav>
            </header>
            <div class="content" id="main-content">
              {}
            </div>
            <footer>
              <p>&copy; 2025 William Cotton</p>
            </footer>
          </div>
        </div>
        {}
      </body>
    </html>"##,
            escape_html(title),
            meta_tags,
            head_extras,
            content,
            footer_scripts
        )
    }

    fn render_article_list_item(article: &ArticleSummary) -> String {
        let slug = escape_html(&article.slug);
        let title = escape_html(&article.title);
        let date = if article.formatted_date.is_empty() {
            String::new()
        } else {
            format!("  <p class=\"published-date\">{}</p>\n", escape_html(&article.formatted_date))
        };

        format!(
            r##"<article>
      <h2><a href="/articles/{slug}" hx-get="/htmx/articles/{slug}" hx-target="#main-content" hx-swap="innerHTML show:window:top" hx-push-url="/articles/{slug}">{title}</a></h2>
    {date}{body}
    </article>"##,
            body = article.body_preview
        )
    }

Web Pipe code:

    handlebars baseLayout = `
    <!DOCTYPE html>
    <html lang="en">
      <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1"/>
        <link rel="preload" href="LigaMenlo-Regular.woff" as="font" type="font/woff" crossorigin>
        <title id="page-title">{{>title}}</title>
        <link rel="stylesheet" href="/app.css"/>
        {{>metaTags}}
        {{>headExtras}}
      </head>
      <body>
        <div id="app">
          <div class="sitewrapper">
            <header>
              <h1><a href="/" hx-get="/htmx/" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/">williamcotton.com</a></h1>
              <nav>
                <a href="/about" hx-get="/htmx/about" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/about">About</a>
                <a href="/bio" hx-get="/htmx/bio" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/bio">Bio</a>
                <a href="/contact" hx-get="/htmx/contact" hx-target="#main-content" hx-swap="innerHTML" hx-push-url="/contact">Contact</a>
              </nav>
            </header>
            <div class="content" id="main-content">
              {{> @partial-block}}
            </div>
            <footer>
              <p>© 2025 William Cotton</p>
            </footer>
          </div>
        </div>
        {{>footerScripts}}
      </body>
    </html>
    `

    handlebars blogLayout = `
    {{#>baseLayout}}
      {{#*inline "metaTags"}}{{#if description}}<meta name="description" content="{{description}}">{{/if}}{{/inline}}
      {{#*inline "headExtras"}}
        <script src="https://cdn.jsdelivr.net/npm/htmx.org@2.0.6/dist/htmx.min.js"></script>
        <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/highlight.min.js"></script>
        <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.11.1/languages/fsharp.min.js"></script>
        <script src="/hljs-wp.js"></script>
      {{/inline}}
      {{#*inline "footerScripts"}}
        <script>try{hljs.highlightAll();}catch(e){}</script>
      {{/inline}}
      {{> @partial-block}}
    {{/baseLayout}}
    `

    handlebars articleListItem = `
    <article>
      <h2><a href="/articles/{{fields.slug}}" hx-get="/htmx/articles/{{fields.slug}}" hx-target="#main-content" hx-swap="innerHTML show:window:top" hx-push-url="/articles/{{fields.slug}}">{{fields.title}}</a></h2>
      {{#if fields.formattedDate}}<p class="published-date">{{fields.formattedDate}}</p>{{/if}}
      {{#if fields.bodyPreview}}
        {{{fields.bodyPreview}}}
      {{/if}}
    </article>
    `

It even keeps the same names: "blog_layout", "base_layout" and "article_list_item". You can probably guess that the CSS is directly copied from one project to another just based on the class and id names on the DOM elements.

There are all sorts of other similarities, from the exact same error messages to the order of the route handler definitions, test fixtures and more. With a careful eye the list grows considerably.

Now of course the port from one language to another could be instructed to use Tailwind, a different set of error messages, a different approach to syntax highlighting, and many other changes to help obfuscate the source. But the intent of these changes would still be captured in the evidence gleaned from our ~/.codex directory. We haven't even started to look at an AGENTS.md file or a docs directory filled with instructions for an LLM to follow.

What we do know is that the paper trail for porting from one codebase to another using an LLM is most probably something that can be discovered.
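The class-and-id comparison described above can also be made mechanical. A rough sketch, assuming both sources embed their HTML as plain text: it pulls class, id, and hx-* attribute values out of each source with a regular expression and reports how much the two sets overlap. The function names and sample snippets are hypothetical, and a regex is a blunt instrument for HTML, so treat the score as a screening aid and not as proof of copying.

```python
import re

# Matches class="...", id="..." and hx-*="...", tolerating the \" escaping
# that appears when HTML is embedded in an ordinary Rust string literal.
ATTR = re.compile(r'\b(class|id|hx-[a-z-]+)=\\?"([^"\\]*)')


def fingerprints(source: str) -> set[tuple[str, str]]:
    """Return the set of (attribute, value) pairs found in the source text."""
    return set(ATTR.findall(source))


def overlap(a: str, b: str) -> float:
    """Jaccard similarity of the two fingerprint sets, from 0.0 to 1.0."""
    fa, fb = fingerprints(a), fingerprints(b)
    if not fa and not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


# Made-up snippets standing in for the two codebases.
port = r'<div class="content" id="main-content"> <p class=\"published-date\">'
original = '<div class="content" id="main-content"> <p class="published-date"> <span class="extra">'

print(sorted(fingerprints(port)))
print(overlap(port, original))  # 0.75
```

Attribute values that contain template placeholders, such as the article URLs, will differ between the two syntaxes and so count against the score, which makes the figure a conservative one.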