{"slug": "i-lead-ai-agents-every-day-here-are-5-shifts-no-standard-tells-you-how-to-make", "title": "I Lead AI Agents Every Day - Here Are 5 Shifts No Standard Tells You How to Make", "summary": "A Google DeepMind safety lead announced a $10 million investment in multi-agent safety, citing the lack of an established research field. A developer running multi-agent systems in production outlined five critical operational shifts not covered by the new PMI standard for AI in project work, including writing boundary files that define autonomous versus escalated decisions, judging results cold without context on the path, and replacing headcount questions with capability mapping.", "body_md": "A Google DeepMind safety lead said this week that they're putting $10M behind multi-agent safety because \"there just isn't really a field of research for multi-agent safety yet.\"\n\nI read that and laughed, because I'm already running the thing the research field doesn't exist for yet. Most of us are. You spin up a couple of agents, hand them work, and somewhere in there you quietly become a manager of workers that don't think like workers.\n\nTwo days before that, PMI published the first official standard for AI in project work. It's a solid document. It also leaves the entire \"how do you actually do this on a Tuesday\" layer to you. So here's my Tuesday layer: five shifts I had to make, each one learned by getting it wrong first.\n\nMy first instinct with an agent was the same as with a person: here's work, go.\n\nThat broke the first time an agent made a reasonable decision on something that turned out to be irreversible. It wasn't the agent's fault. I never told it which decisions were one-way doors.\n\nSo now the first artifact I write isn't a task list. It's a boundary file. 
Something like this lives next to the work:\n\n```\n# decision-boundaries.yml\nautonomous:\n  - reformat, refactor, rename within a module\n  - anything reversible with a git revert\nescalate:\n  - schema changes, public API shape\n  - deletes, migrations, anything touching prod data\n  - spend over $0 or any external send\non_unsure: stop_and_ask\n```\n\nThat file does more for me than any standup. Leadership moved from assigning the work to defining what may be decided without me.\n\nI used to review work I'd seen get built. I knew the steps, so \"looks right\" was usually safe.\n\nThen I started getting finished diffs with no memory of how they came to be. \"Looks right\" stopped being safe. The code was clean and the reasoning under it was wrong in a way you only catch if you go digging.\n\nThe skill now is judging a result cold, with zero context on the path. Ethan Mollick wrote this week about a model holding twelve hours of focus on one spec. When the attention window outlasts mine, my job isn't checking steps. It's scoping the spec so tightly the steps don't need a babysitter.\n\n\"How many engineers do I need\" is a question I catch myself asking and kill.\n\nThe real one: what mix of people and agents produces this outcome, and what's the human-only core I'd never hand off? The plan turned into a capability map with a deliberately protected center.\n\nGergely Orosz's June job-market analysis lands in the same place from the data side: the roles that compound are where judgment about AI systems is the scarce input, not execution on a known stack. Capability planning is that judgment pointed at your own team.\n\nStandup tells you something broke. Which means it tells you late.\n\nWorkers that fail unpredictably need the alarm built up front. 
I keep a short tripwire list, each one a single sentence: if this observable crosses this line, halt and ping me, and here's who owns the ping.\n\n```\n# tripwires.yml\n- watch: test_pass_rate\n  trip: \"< 100% on touched files\"\n  action: halt + page me\n- watch: files_changed\n  trip: \"> 20 in one task\"\n  action: pause for scope review\n```\n\nIt feels too simple to matter. It has saved me from more bad mornings than any dashboard I've built.\n\nThis is the one that's actually a promotion.\n\nOwnership used to mean the outcome is mine. It still is. The level changed. I don't own the deliverable directly anymore. I own the system that makes it: people, agents, and the rules between them. That's the only level that scales.\n\nBoris Cherny, who runs Claude Code, said this week he hasn't written a line of code himself in eight months. People hear a flex. I hear the shift in one sentence: stopped producing the work, started owning the system that produces it. Bigger job, not a smaller one.\n\nI'm not clean on all five. Solid on three, shaky on two, and the shaky ones cost me the most.\n\nRate yourself one to five on each, fast. The two you score lowest are the two behaviors that move you this quarter. 
Which one did you make first, and which are you still avoiding?", "url": "https://wpnews.pro/news/i-lead-ai-agents-every-day-here-are-5-shifts-no-standard-tells-you-how-to-make", "canonical_source": "https://dev.to/itskondrat/i-lead-ai-agents-every-day-here-are-5-shifts-no-standard-tells-you-how-to-make-1pg4", "published_at": "2026-06-12 07:09:26+00:00", "updated_at": "2026-06-12 07:42:22.691159+00:00", "lang": "en", "topics": ["ai-agents", "ai-safety", "ai-tools", "ai-products", "ai-research"], "entities": ["Google DeepMind", "PMI"], "alternates": {"html": "https://wpnews.pro/news/i-lead-ai-agents-every-day-here-are-5-shifts-no-standard-tells-you-how-to-make", "markdown": "https://wpnews.pro/news/i-lead-ai-agents-every-day-here-are-5-shifts-no-standard-tells-you-how-to-make.md", "text": "https://wpnews.pro/news/i-lead-ai-agents-every-day-here-are-5-shifts-no-standard-tells-you-how-to-make.txt", "jsonld": "https://wpnews.pro/news/i-lead-ai-agents-every-day-here-are-5-shifts-no-standard-tells-you-how-to-make.jsonld"}}