When software developers and AI agents share the learning

wpnews.pro

Before Tobi Lütke ran Shopify, he learned programming through Germany’s apprenticeship system, the way people have learned trades forever: in a shared workshop, watching people who already knew what they were doing. More recently, describing Shopify’s River, he reached for a related word: Lehrwerkstatt, a teaching workshop where “the whole shop floor is the classroom.”

X has been agog over the numbers around River, Shopify’s Slack-native AI agent. In total, 5,938 Shopify employees worked with River across 4,450 different Slack channels, and River now coauthors roughly one in eight merged pull requests across the company. It’s a big deal, but understanding why it works is the most important part.

River can read code, run tests, open pull requests, query the data warehouse, inspect production traces, and sometimes push back on a plan it thinks is bad. Great. Lots of companies will have clever coding agents someday soon. Some already do.

The interesting part is that River doesn’t work alone; it works where everyone can see it.

I’ve already argued that agents reward explicit, consistent, well-documented software. They like the “boring” stuff, such as schemas, tests, conventions, clean setup instructions, and codebases that don’t require a deep retrospective with the one engineer who remembers why the build script has to run twice. Dropping an agent into a messy repo is mostly an efficient audit of your engineering discipline. Agents hold up a mirror to our engineering practices.

This is where Shopify comes off looking good. Without all the engineering pre-work, River wouldn’t be a success. In early 2024, the company says it had many repositories, bespoke development environments, and slow feedback loops. It then made two unpopular but critically important choices: It moved to a monorepo called World, and it built dev environments, continuous integration, and production images on Nix as one reproducible substrate.

Shopify recognized that “code is going to be increasingly written with AI, and our infrastructure needs to be the substrate for that.” But the company did more than insist on legible code: It started to create shared memory of that code across the company.

River has one design constraint that every enterprise architect should pay attention to: It only works in public Slack channels. No direct messages. No private groups. You summon River where other people can watch, join, search, and learn. That sounds like a small product choice, but it’s not. It’s the operating model, kind of like open sourcing code development within Slack.

Because of this design constraint, every River session becomes a visible transcript. Shopify can then mine those transcripts, see recurring patterns, and feed them back into River’s skills, prompts, and defaults. One engineer’s hard-won fix at two o’clock becomes the next engineer’s starting point at four o’clock. The model doesn’t need to be retrained for the company to get smarter, and developers don’t need to go out of their way to document things. The work just has to leave a trace.

That’s the Lehrwerkstatt, productized. Everyone gets to watch the agent work.

Now compare that with how most enterprises are deploying AI. One developer works with a private chatbot in a private IDE in a private window that no one else will ever see. Multiply that by a few thousand. Each person discovers a clever way to investigate a flaky test, explain a troublesome service boundary, or avoid a migration trap. Then the session closes, and the discovery dies. Sure, the developer may go faster, but the company is no better off than it was yesterday.

One mistake enterprises have made with knowledge management is treating documentation as something people write after the work. This rarely works. Few employees (developers or otherwise) want to undertake the tedium of documenting what they already did. Not unless someone is paying them to do it.

River suggests a better pattern: The work itself creates the documentation.

Not every transcript is useful, of course. Most probably aren’t. But the useful ones can become skills, defaults, examples, runbooks, repo instructions, or links that help the next person avoid starting from zero. Shopify says River sessions are searchable and reproducible, and the company feeds patterns from those sessions back into River’s skills, prompts, and defaults. That’s not a chatbot; it’s a learning loop.

This is where the usual “AI will make developers more productive” framing feels too small. The more interesting claim is that AI can make software organizations more teachable. However, this won’t happen by default. The shop floor needs to be institutionalized or the enterprise will remain an atomized collection of productivity silos.

This is where agents.md is useful, but only if properly used. agents.md describes itself as a README for agents and says it’s now used by more than 60,000 open source projects. How should a developer use it? GitHub, based on analysis of more than 2,500 repositories, gives some clear guidance: Put commands early, be specific, provide real examples, and set explicit boundaries.

In other words, write down what matters.

But don’t mistake the file for the capability. ETH Zurich researchers recently tested whether repository-level context files actually help coding agents and found that they often reduce task success while increasing inference cost by more than 20%. InfoQ summarized their finding this way: LLM-generated context files often hurt, and human-written ones should focus on non-inferable details, such as custom tools, unusual build commands, and highly specific project constraints.

That’s the enterprise opportunity.

Public GitHub projects often don’t have much non-inferable domain knowledge to encode, but enterprise software is filled with it: why the pricing service can’t be called during checkout in a certain region, which legacy API looks dead but still supports a major customer, or why the data model says one thing but revenue recognition says another. That’s the context worth preserving, rather than directory maps an agent can discover or generic coding preferences. That’s what the shop-floor version of agents.md looks like: not a static file that someone auto-generates and forgets, but rather the residue of observed work. Agents struggle, humans correct, patterns emerge, and only the durable lessons become instructions.
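To make "only the durable lessons become instructions" concrete, here is a minimal Python sketch. It is not Shopify’s implementation: the `Correction` record, its `inferable` flag, and the `min_sessions` threshold are all hypothetical names chosen for illustration. The idea is simply that a lesson gets promoted only if it recurs across distinct sessions and is something an agent couldn’t discover from the repo on its own.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """One human correction observed in an agent session (hypothetical record)."""
    session_id: str
    lesson: str      # normalized one-line lesson
    inferable: bool  # True if an agent could discover this from the repo itself


def durable_lessons(corrections, min_sessions=2):
    """Return lessons seen in at least `min_sessions` distinct sessions,
    skipping anything an agent could infer on its own."""
    # Deduplicate per session so one noisy session cannot promote a lesson alone.
    seen = {(c.session_id, c.lesson) for c in corrections if not c.inferable}
    counts = Counter(lesson for _, lesson in seen)
    return sorted(lesson for lesson, n in counts.items() if n >= min_sessions)


# Illustrative data only; none of these sessions or lessons are real.
corrections = [
    Correction("s1", "legacy billing API still serves a major customer", False),
    Correction("s2", "legacy billing API still serves a major customer", False),
    Correction("s2", "legacy billing API still serves a major customer", False),
    Correction("s3", "tests live in the tests/ directory", True),
    Correction("s4", "tests live in the tests/ directory", True),
    Correction("s5", "pricing service is off limits during checkout", False),
]

promoted = durable_lessons(corrections)
print(promoted)
```

In this made-up data, the directory hint recurs but is inferable, and the pricing lesson appears in only one session, so only the legacy API lesson would be a candidate for the instructions file. A human would still decide whether it belongs there.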

If all this sounds great (and it should), then it’s worth a word of warning: You probably won’t be able to copy Shopify, any more than you could have (or should have) copied Google. You’re not Shopify. Most companies shouldn’t wake up Monday and announce a monorepo migration, a Nix conversion, and a Slack-only agent because River sounds cool. That approach has worked for Shopify, but it doesn’t mean it will work for you. The useful approach for any company that isn’t Shopify is to ask different questions: Where does agent work happen in your company and who learns from it? If the answers are “in private” and “nobody,” you’ve got problems.

I’m not saying that every agent session belongs in a public channel. You absolutely should *not* dump customer data, security incidents, HR issues, or privileged production context into a companywide AI water cooler. Boundaries still matter. In some cases, they matter more because agents can move faster and touch more systems than humans do, as I’ve warned.

But the principle survives the caveats: Agent work should be inspectable, reusable, and improvable where appropriate. The organization should be able to see the path from question to tool call to failed attempt to correction to pull request to reusable knowledge.
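One way to picture that path, along with the boundaries around it, is as a small data structure. The Python sketch below is an assumption-laden illustration, not a description of River: the step kinds are taken from the sentence above, while the `SessionTrace` class and the sensitive tags are hypothetical.

```python
from dataclasses import dataclass, field

# Step kinds taken from the path described in the text.
STEP_KINDS = ("question", "tool_call", "failed_attempt",
              "correction", "pull_request", "lesson")

# Hypothetical tags for content that should stay out of shared channels.
SENSITIVE_TAGS = frozenset(
    {"customer_data", "security_incident", "hr", "privileged_production"}
)


@dataclass
class Step:
    kind: str
    summary: str

    def __post_init__(self):
        if self.kind not in STEP_KINDS:
            raise ValueError(f"unknown step kind: {self.kind}")


@dataclass
class SessionTrace:
    session_id: str
    tags: frozenset = frozenset()
    steps: list = field(default_factory=list)

    def add(self, kind, summary):
        self.steps.append(Step(kind, summary))

    def shareable(self):
        """A session is shareable only if it carries no sensitive tag."""
        return not (self.tags & SENSITIVE_TAGS)

    def path(self):
        """Ordered step kinds, so a reader can follow how the work unfolded."""
        return [step.kind for step in self.steps]


# Illustrative session; the summaries are invented.
trace = SessionTrace("s1")
trace.add("question", "why is this test flaky?")
trace.add("tool_call", "ran the test suite")
trace.add("failed_attempt", "retry wrapper did not help")
trace.add("correction", "engineer pointed at a shared fixture")
trace.add("pull_request", "isolated the fixture")

private_trace = SessionTrace("s2", tags=frozenset({"hr"}))
print(trace.path(), trace.shareable(), private_trace.shareable())
```

The design choice worth noting is that the failed attempt and the correction are kept as first-class steps rather than discarded, since those are the parts the next developer is most likely to learn from.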

For years, developer experience mostly meant removing friction for individuals: faster setup, better docs, nicer APIs, etc. Those are all still good. But agentic development adds a new requirement: shared learning. A great developer experience now has to answer other questions: Can the next developer benefit from the last agent session? Can the agent explain not just what it changed, but what it learned? Can a private breakthrough become a team asset without creating a surveillance nightmare? And no, visibility isn’t surveillance, and the goal is not to grade every keystroke or turn developers into content producers for the corporate memory machine. The goal is to make valuable work observable enough that it compounds.

This is a management problem as much as a tools problem. Developers will use agents because agents help them get work done. At this point, you’d struggle to get them to stop. Still, they won’t voluntarily produce beautiful organizational memory as a side effect unless the workflow makes it natural. You need to make the shared shop floor the golden path, as I’ve argued in various ways for years.

In the River story, humans are still the teachers. The organization is still responsible for deciding what counts as good work. The system still needs judgment, taste, security, cost control, and review. The magic happens when all this work is done in the open where the organization can learn from the teaching.

That’s the real promise of agentic coding inside enterprises. Not that every developer gets a private genius, but rather that every developer can tap into collective genius. Lütke learned his trade in a room where the craft was visible, and apprentices learned by watching the work. The companies that win the agent era will rebuild that room for software.

In short, the smartest thing your AI can do isn’t to code faster. It’s to work in public.

Source: infoworld.com (original article)
