Everyone's Building Jarvis. Nobody's Even Close.

A Swiss Army knife is a terrible knife.

It's a terrible screwdriver, a terrible bottle opener, and a terrible saw. The only thing a Swiss Army knife is genuinely good at is being small enough to carry around, and "small enough to carry" is not a strong engineering thesis.

But that's exactly what everyone in AI is building right now: the everything assistant. The one that reads your email, manages your calendar, writes your code, books your flights, and files your taxes. The pitch sounds incredible, but the product is a mediocre version of six different tools duct-taped together.

I know because I built one.

I spent about three months on a product called Skippy. It used OCR screen capture, email ingestion, calendar sync, and pattern recognition to become an AI assistant that understood your whole digital life. It attracted a scary amount of interest from investors and Reddit before I'd even launched. People were asking for beta access non-stop.

And I shelved it. Not because it failed, but because the more I used it the more I realized that a tool trying to do everything just ends up doing nothing well enough to actually rely on. I was playing with automations, making reservations, ordering food on DoorDash, and every single integration felt like a worse version of the thing it was replacing. You sacrifice everything for the benefit of having everything, and the benefit isn't actually that great.

If you don't believe me, go look at Killed by Google sometime. Google Inbox. Google Allo. Google Hangouts. Google Wave. Google Stadia. Products backed by billions of dollars and thousands of engineers. Dead. If Google with functionally infinite resources can't sustain multi-feature products, what makes a solo dev stitching 15 libraries together in Bali think they're going to pull it off? The tools that actually win do one thing exceptionally well, so well you stop noticing they exist. That's the product thesis nobody in the personal AI space seems to accept, and honestly I think it's because the Jarvis fantasy is just too seductive to let go of.

And while we're on the topic of things people don't want to hear, quick reality check for the r/LocalLLaMA crowd: your MacBook is not a datacenter.

Open-source models are genuinely impressive for what they are, I'm not going to pretend otherwise. But the gap between a 70B open model and frontier production models from Anthropic or OpenAI is not a crack, it's a chasm. There's a reason the GPU shortage exists and there's a reason inference at scale costs what it costs.

I actually went through a phase where I thought there had to be models you could run locally that would be comparable. So I built a home lab: RTX 5090, RTX 6000 PRO, 256GB DDR5, 128TB NAS, 42U rack, the whole setup. I use local models for experimentation and fine-tuning constantly, but what I don't do is pretend they're competitive with frontier intelligence at tasks where output quality actually matters. That's not pessimism, that's just what the benchmarks say. If it were actually possible to match frontier quality on consumer hardware, companies like Anthropic wouldn't exist and NVIDIA wouldn't have the market cap it does.

This one's going to piss some people off, but it needs saying.

Most developers using AI to write code are getting worse at their jobs, not better. And that's coming from someone who uses AI to write code every single day.

What good AI-assisted development looks like is basically pair programming, which has been around since the beginning of time. You direct, you review, you push back when the model suggests something dumb. You understand every line that ships and the AI just accelerates your judgment rather than replacing it.

What's actually happening is people type "build me a todo app with auth" into Cursor, tab-accept whatever comes out, run npm run dev, take a screenshot, and post it to Reddit as something they "built."

That's not engineering. That's pulling a slot machine lever and hoping for 7s.

These vibe coders (and I use the term loosely) can't debug their own code because it was never their code. They don't understand the architecture, they can't explain the state management pattern, and when production breaks at 2 AM they're completely lost because they vibed with the output and never actually directed it. Go on Reddit for fifteen minutes and you'll see people pushing AI slop for days. They didn't even change the default Claude color scheme. You click buttons and they don't work, and they can't fix it because they don't understand what's causing the bug.

AI is meant to speed up your pace of development. Not replace the need to understand what you built.

The orchestrators are the ones who will still have jobs in five years. They use AI more aggressively than the vibe coders actually, but they understand every line. They refactor. They question the model's choices. They treat AI as a power tool, not a replacement for skill. Prompt engineering is a skill of its own, and leveraging other skills to prompt engineer more effectively is a skill of its own too. People underestimate this hard.

And here's the thing that ties all of this together, the part that nobody's really talking about.

There's a book series called Expeditionary Force where an alien Elder AI named Skippy can literally manipulate wormholes. Omniscient-level intelligence. But it has a fatal design flaw: it only answers exactly what you ask.

Ask "is there danger ahead?" and Skippy says no. Because you didn't ask about danger to the left, or danger arriving in thirty seconds. The answer was technically correct and also catastrophically incomplete.

Sound familiar?

Ask Claude a question and you'll get a brilliant answer, scoped precisely to what you asked. But it won't mention the related problem from last week, it won't connect the dots to the bug you introduced three months ago, it won't anticipate what you actually need versus what you literally typed. And that's not because the model is bad at reasoning, it's because it has no memory. No continuity. No accumulated understanding of you or your work. Every session starts completely from zero. Real reasoning isn't just answering your question, it's answering the pieces surrounding it too, and AI can't do that without knowing what you've been working on, what's gone wrong before, what you actually care about.

That's what I built TrueMemory to fix, by enabling persistent memory that survives across sessions. It has an encoding gate inspired by how biological memory works: it evaluates novelty, salience, and prediction error before deciding what to store. It's not a vector dump or a conversation log, it's a system that watches your workflow and decides what matters the same way a brain does.
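To make the encoding-gate idea concrete, here is a minimal sketch in Python. It is not TrueMemory's implementation, which is described in the paper; it only illustrates the general shape of a gate that scores an event on novelty, salience, and prediction error and stores it only if the combined score clears a threshold. The class name EncodingGate, the weights, the threshold, and the use of cosine similarity for novelty are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class EncodingGate:
    # Hypothetical weights and threshold, chosen only for illustration.
    w_novelty: float = 0.4
    w_salience: float = 0.3
    w_error: float = 0.3
    threshold: float = 0.5
    memory: list = field(default_factory=list)

    def novelty(self, vec):
        """1 minus the highest similarity to anything stored, clamped to [0, 1]."""
        if not self.memory:
            return 1.0
        closest = max(cosine(vec, m) for m in self.memory)
        return min(1.0, max(0.0, 1.0 - closest))

    def score(self, vec, salience, prediction_error):
        """Weighted sum of the three signals (salience and error assumed in [0, 1])."""
        return (self.w_novelty * self.novelty(vec)
                + self.w_salience * salience
                + self.w_error * prediction_error)

    def observe(self, vec, salience, prediction_error):
        """Store the event only if its combined score clears the threshold."""
        keep = self.score(vec, salience, prediction_error) >= self.threshold
        if keep:
            self.memory.append(vec)
        return keep


gate = EncodingGate()
first = gate.observe([1.0, 0.0], salience=0.8, prediction_error=0.6)   # new and important
repeat = gate.observe([1.0, 0.0], salience=0.2, prediction_error=0.1)  # already known, routine
other = gate.observe([0.0, 1.0], salience=0.5, prediction_error=0.2)   # unrelated to memory
print(first, repeat, other, len(gate.memory))
```

In this toy version the first event is stored, the routine repeat is dropped, and the unrelated event is stored because nothing like it is in memory yet. A plain vector dump would have kept all three.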

The architecture and benchmarks are in my arXiv paper. The bottleneck in AI right now isn't intelligence. It's that your model forgets you exist every time you close the tab. Everyone's building Jarvis and nobody's even close, because they keep building the mouth and the hands and nobody's building the brain.

Josh Adler builds persistent memory systems for AI. Research: arXiv:2605.04897. More at joshadler.com.

source & further reading

dev.to (original article)
