How do we grasp the implications of exponentialism? This edition explores this, as well as the era of efficient AI, people management, and our craving for shared experiences.
Edition #44 of Implications.
This edition explores forecasts and implications around:
**(1) grasping exponentialism, (2) the era of efficient AI, (3) talent density scoring and impact-based hiring, (4) what the entertainment world must learn from the Knicks and (5) some surprises at the end, as always.**

If you’re new, here’s the rundown on what to expect. This ~monthly analysis is written for founders + investors I work with, colleagues, and a select group of subscribers. I aim for quality, density, and provocation vs. frequency and trendiness. We don’t cover news; we explore the implications of what’s happening. My goal is to ignite discussion, socialize edges that may someday become the center, and help all of us connect dots.

If you missed the big annual analysis or more recent editions of Implications, check out recent analysis and archives here. A few recommendations based on reader engagement:

We humans are designed to forget. Continuously replaying the details of every fight, fear, and trauma would undoubtedly cripple life’s requirement to let go, learn from mistakes, give and get second chances, and expand our minds beyond the confines of the past. Our malleable memory is a feature, not a bug. What are the implications of remembering everything?

Some of the world’s greatest artists and most ambitious storytellers are modernizing their craft and raising the bar of world-class storytelling through a series of workflows and techniques that I’ve started calling Precision Generative Workflows (PGW) with friends and artists. These techniques are modern brushes and chisels. These new workflows don’t start with prompts and they don’t create slop. Instead they enable more creative risk-taking without compromising creative control.

Originality is the primary ingredient of timeless creations. What does this mean for the demand side of entertainment? As our feeds become filled with pirated IP violations and copycat scenes, what will we crave more of?
Navigating Cambrian Explosions
One of our peculiar human tendencies is always thinking “this time is different” when, in fact, history does repeat itself. However, there are undoubtedly periods of exponential change like “Cambrian explosions,” periods when unparalleled evolution happens incredibly fast. At the risk of falling for humanity’s natural narcissism about the importance of OUR lifetimes, I do believe we’re living in such a time.
“How do we grasp exponentialism?” As you examine the data about AI capabilities, cost curves, and certainly frontier technology, the only thing more striking than how fast (and non-linear) everything is growing is how wrong our forecasts have been. I was speaking to an Anthropic investor recently about how even their ambitious internal forecasts have nonetheless routinely underestimated results in terms of compute requirements and revenue generated. Linear growth models are falling short, and we humans are simply struggling to grasp the implications of this new technology. Even one of the companies at the center of this moment, with better data than anyone else in the world, is having a hard time grasping just how fast it is all happening! What should we take away here? Let’s acknowledge that technological advancement and societal change are growing at an accelerating, exponential rate … and let’s find creative ways to grasp what this could actually mean.

“How do we become comfortable with the inevitable?” As we do begin to grasp the implications of exponentialism, I see friends in leadership roles across industries struggling with realizations that are becoming clear faster than they’re ready for them. Whether it is artist friends watching parts of their process completely refactored or lawyer friends questioning the business model of their firms, they are trying to become comfortable with inevitable changes without the extended periods of socialization that humans typically require. It dawned on me: This is why I dedicate time to this monthly analysis. If science fiction is a prototype for the future, discussion of the implications is the accelerant for our readiness.
Let’s dive into Edition #44
The Era of Efficient AI
While everyone is pontificating about the world’s insatiable appetite for compute (and the dizzying infrastructure investments that are pouring gasoline on every company involved in this complex industrial stack), we need to consider the practical possibility that most of what we need from AI will soon come from very cheap and likely “local” models that run on our own computers and phones. We may look back at this early period of artificial intelligence and realize that by using frontier models in the cloud, we were essentially hiring PhDs for every task just because we could. Fast-emerging efficient AI practices will help us avoid malpractice-by-overspecification: you don’t send a cardiac surgeon to take a patient’s blood pressure, not because the surgeon would fail, but because the surgeon is slower, scarcer, and more expensive, and the nurse’s reading is indistinguishable. It is far more prudent to allocate the right talent for the right job. Brian Armstrong, Founder/CEO of Coinbase and NewLimit, recently noted, “demand for intelligence is near infinite, but 80 percent of workloads will be running on 99 percent cheaper models within 12-18 months. The other 20 percent of workloads will still run on latest gen models where IQ maxing is important (scientific breakthroughs, etc).” The same week, Clem Delangue, Co-founder/CEO of HuggingFace, an open collaborative platform for model builders, shared some recent research from Stanford suggesting that, in Delangue’s words, “local models can answer 71.3 percent of real-world chat and reasoning queries accurately, up from 23.2 percent in 2023. Obviously at a fraction of the cost and energy consumption of frontier APIs. The obvious conclusion: you don’t need a frontier model for most tasks. The future is multi-model: local, open-source, smaller and cheaper for the majority of workloads, frontier APIs when no other choices!”
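For readers who want to see what Armstrong’s 80/99 split implies in aggregate, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every workload costs the same on a frontier model today and that total demand stays flat; neither assumption comes from the quote, and the function name is my own.

```python
def blended_cost(cheap_share: float, cheap_discount: float) -> float:
    """Relative spend after moving `cheap_share` of workloads to models that are
    `cheap_discount` cheaper. 1.0 means "everything still runs on frontier models".

    Illustrative assumption: all workloads cost the same on a frontier model
    and overall demand does not change.
    """
    if not (0.0 <= cheap_share <= 1.0 and 0.0 <= cheap_discount <= 1.0):
        raise ValueError("shares and discounts must be between 0 and 1")
    frontier_share = 1.0 - cheap_share
    return frontier_share + cheap_share * (1.0 - cheap_discount)


relative_spend = blended_cost(cheap_share=0.80, cheap_discount=0.99)
frontier_portion_of_spend = (1.0 - 0.80) / relative_spend

print(f"relative spend: {relative_spend:.3f}")                   # 0.208
print(f"frontier share of spend: {frontier_portion_of_spend:.1%}")  # 96.2%
```

Under those simplifying assumptions, total spend falls to roughly a fifth of today’s level, and nearly all of what remains is the 20 percent of work that still runs on frontier models.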
We’re seeing early evidence of “efficient AI” policies, like Uber blowing through their allocated token budget in four months and allocating monthly budgets to engineers as a result. I suspect the “ROI on tokens” will become a metric for humans; what can YOU achieve within a given allocation of compute?
The Stanford research introduced a fascinating metric called “intelligence per watt” that helps determine the most efficient path for AI queries and tasks, and also makes us realize just how much overkill (assigning PhD-level costs to accomplish very simple tasks) is happening in the cloud today.
Indeed, we’re entering an era of Efficient AI, where we (and often AI agents on our behalf) will route our queries and actions to the right models.
An efficiency ecosystem will emerge. The orchestration layer for models and agents, where routing decisions are made, will become mission-critical software. We will see inference players like BaseTen, Modal, and OpenRouter among others get into the efficient-routing business, supporting the routing of tasks to models that can accomplish the work most efficiently. Platforms like HuggingFace, which host and nourish the open source model ecosystem, will also thrive in a world where local models reign. We will also see companies like Ramp, long devoted to spend efficiency, ushering in the era of autonomous intelligent spend (aka “thinking money,” as Ramp calls it) where your business costs optimize automagically. We’ll get into a world where the local models that live on our devices become the default first stop for AI-driven operations, and calls to frontier models in the cloud only happen when deemed necessary (and with decreasing frequency). Clearly, this is the strategy for Apple, where all the innovation and investment has gone into the full stack of tech for local AI processing and the company seems completely comfortable handing off to frontier models only when absolutely necessary. Perhaps Apple is looking around the corner and realizing that our everyday consumer needs from AI will soon be fulfilled locally, cheaply, and privately? Of course, frontier models will unlock so many new possibilities (curing disease, pioneering new understandings in physics, etc), but the “PhD-level model for every task” era is coming to an end.
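To make the “local first, frontier only when necessary” idea a bit more tangible, here is a minimal Python sketch of what a routing decision could look like. Everything in it is hypothetical: the `Answer` type, the confidence score, the 0.8 threshold, and the two toy models are stand-ins I made up, not how any of the companies named above actually route requests.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Answer:
    text: str
    confidence: float  # 0.0 to 1.0; how this gets scored is an assumption
    model: str


Model = Callable[[str], Answer]


def route(query: str, local: Model, frontier: Model, min_confidence: float = 0.8) -> Answer:
    """Try the cheap local model first; escalate only if it is not confident enough."""
    draft = local(query)
    if draft.confidence >= min_confidence:
        return draft
    return frontier(query)


# Toy stand-ins for illustration only. Real versions would call an on-device
# runtime and a cloud API respectively.
def toy_local(query: str) -> Answer:
    # Pretend the small model is only confident on short, simple queries.
    confidence = 0.95 if len(query.split()) <= 8 else 0.4
    return Answer(f"local answer to: {query}", confidence, "local")


def toy_frontier(query: str) -> Answer:
    return Answer(f"frontier answer to: {query}", 0.99, "frontier")


simple = route("What time zone is Lisbon in?", toy_local, toy_frontier)
hard = route(
    "Draft a proof sketch for this conjecture and list every assumption it depends on",
    toy_local,
    toy_frontier,
)
print(simple.model, hard.model)  # local frontier
```

The interesting design question is the escalation test itself. A confidence threshold is the simplest option; a real orchestration layer might also weigh cost, latency, privacy, or energy use.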
The contrarian view here is that if you could give every worker the type of superhuman levels of intelligence and capabilities available from frontier models, why wouldn’t you (costs and energy consumption impact aside)? The 71.3% figure for work performed by local models cited by Delangue may be true and economically trivial simultaneously. If local models handle 71% of queries at 1% cost, that 71% is being commoditized toward zero marginal value. As a result, the economic surplus concentrates in the scarce 20–29% — meaning the economic opportunity is actually pushed to the frontier models. But every company — and industry — functions like a machine with many parts. And some of these parts are simply designed to operate within a specified range of capability and possibility that are, today, being run inefficiently.
People management in the age of AI & talent density
Leaders have shifted from celebrating how big their teams are to flexing how small their workforce can be. Modern companies now aspire to talent density over hiring. People have always been and will remain the most important part of a company. But as people offload their mundane and repetitive tasks to compute, the quality and talent density of your team matters most.
How do you measure “impact per person” within a company? More importantly, how do you optimize the potential of people in this new world? If technology raises the bar for what humans are capable of (and what is expected of humans on the job), HR may be the next big disruption as every company reimagines their approach to measuring and developing people. What might the future of AI-native HR look like?
Talent Density Scoring: Are you doing the work of one person or 2.3 people? Are you instrumenting your function and hacking your assigned “job to be done” using agents that you have orchestrated to not only complete your job, but scale yourself, take on more adjacent responsibilities, and improve the performance itself? “Talent density” is the measure of what each person is uniquely capable of, and I expect a whole new era of scoring systems and mechanisms to understand who is doing what. Some products like Macroscope have started to do this in engineering organizations, while others like Windmill are leveraging AI to transform management in a way that reveals and develops talent density.

Cost of Job: Every role now has an additional “cost” beyond the salary in the form of consumption of tokens. While most people are just tinkering with new agentic workflows to complete their jobs, there is a small cohort of early adopters who are consuming unfathomable amounts of tokens to automate vast swaths of work as well as develop new ideas that expand their roles (and contribute to their talent density). We can expect new ways to measure the operating expenses of a business that incorporate token usage on a per role basis. We can also expect efforts to quantify and compare the costs of models vs. humans for certain functions. Just the other day I came across an effort online to quantify the cost per hour of top performing models (scores below reflecting performance across popular benchmarks), and it was wild to see which ones were above or below minimum wage…
Impact-Based Hiring: As tools to measure impact coincide with efforts to refactor “screen jobs” across functions with AI, I expect to see more hiring with compensation tied to impact. While sales has long been impact-based, what other functions can be reimagined when impact can be more accurately measured? Quality assurance? Lead generation? Marketing? What are the job types and the tools that enable people to be hired and entirely compensated based on the impact they make to an organization? Of course, some people will try to game whatever system is used to allocate their compensation, in this case by using huge amounts of tokens to inflate their own impact. To guard against that, companies may look at the ratio of “work produced to tokens used” to see which employees are optimizing their impact in efficient and sustainable ways.

Product (and company) surface area expansion: Finally, when you demonstrate higher ROI on each person you employ, I expect companies to start expanding their markets, their lineups of features and products, and their aspirations by hiring MORE people. This is classic “Jevons Paradox,” the economic theory suggesting that, as technological advancements increase the efficiency of a resource, overall consumption of that resource increases rather than decreases. When efficiency drops the cost of use, it sparks higher demand. Should this not apply to people?
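As a rough illustration of the “work produced to tokens used” ratio mentioned under Impact-Based Hiring, here is a small Python sketch. The `impact` units, the two contributors, and all the numbers are invented for the example; how a company would actually measure impact is the hard, unsolved part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contributor:
    name: str
    impact: float     # output of whatever impact measure a company trusts (hypothetical units)
    tokens_used: int  # tokens consumed over the same period


def impact_per_million_tokens(c: Contributor) -> float:
    """Impact delivered per one million tokens consumed."""
    if c.tokens_used <= 0:
        raise ValueError("tokens_used must be positive")
    return c.impact / (c.tokens_used / 1_000_000)


# Made-up numbers: B shows more raw impact but burns far more tokens to get there.
team = [
    Contributor("A", impact=120.0, tokens_used=40_000_000),
    Contributor("B", impact=150.0, tokens_used=300_000_000),
]

by_raw_impact = sorted(team, key=lambda c: c.impact, reverse=True)
by_efficiency = sorted(team, key=impact_per_million_tokens, reverse=True)

print([c.name for c in by_raw_impact])  # ['B', 'A']
print([c.name for c in by_efficiency])  # ['A', 'B']
```

The two rankings disagree, which is the point: a compensation system that looks only at raw impact rewards B, while one that also weighs the ratio surfaces A.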
Shared Experiences & The Pursuit of Togetherness
By a total fluke, my son and I landed some unsold Knicks tickets about 15 minutes before game four of the finals. Lesson learned: if you’d like to attend a game but don’t mind watching on TV, wait until the very last moment and then make an offer to brokers with unsold inventory! Anyways, what struck me most about the experience was the miraculous way all New Yorkers united for a few hours after this now legendary comeback. As we traversed the streets, strangers were high-fiving each other, police were chanting “Knicks in five” alongside teenagers and senior citizens, drivers were yelling “Go Knicks” out of their car windows as they were stuck in traffic, and there was even a random pickle vendor giving out free pickles to every passerby (pic below). What struck me about this moment was the role that shared entertainment experiences play in bringing people together...and how desperately we seek “togetherness” in this age of increased division and isolation. We humans WANT things in common. We eagerly play the name game to identify mutual friends when we meet people, we often discuss TV shows and films we have seen with our friends, we want to go to theaters to experience films with a fanbase that we identify with, and we are willing to hug complete strangers after a shared emotional experience. As the entertainment industry adopts new technology that enables hyper-personalization and virtual worlds that we immerse ourselves in, let’s pay attention to what humans want more of: shared experiences and togetherness.
Ideas, Missives & Mentions
Finally, here’s a set of ideas and worthwhile mentions (and stuff I want to keep out of web-scraper reach) intended for those I work with (free for founders in my portfolio, and colleagues…ping me!) and a smaller group of subscribers. We’ll cover a few things that caught my eye and have stayed on my mind as an investor, technologist, and product leader (including culture hacks to get distribution, EQ>IQ, unexpected AI conundrums, and how art is STILL just story). Subscriptions go toward organizations I support including COOP Careers and the Museum of Modern Art. Thanks again for following along, and to those who have reached out with ideas and feedback.
Subscribe to Implications, by Scott Belsky to keep reading this post and get 7 days of free access to the full post archives.