Zen and the Art of Machine Learning Research

temperament over talent

So you want to do AI research? It’s true that no one really teaches you how. Not directly, anyway.

It turns out the process of becoming a great researcher is not unlike learning to meditate:

I.

The way to get started is pretty simple, through some combination of

(a) reading and learning, and

(b) building stuff.

You can’t only do one. You’ll become a researcher through this combination.

There’s an old Zen saying that goes something like this –

on days we find insight, we sit.

on days we do not find insight, we sit.

Doing research is basically like this. Scientific insights can come seemingly at random. Most days they will not come. An important trait for success is just putting in the time and effort. Like any other pursuit (music, sports, sales, etc.), becoming world-class takes a tremendous amount of discipline.

Noam Shazeer makes a nice hat-tip to the inherent randomness of successful research ideas in the SwiGLU paper:

“We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence.”

A related point: it’s possible to read too many papers. If you want to solve a problem, the tried-and-true path is to attempt a solution, hit a bottleneck, try to get past it yourself, and only reach for the literature when you’ve run out of ideas.

II.

Fine, but what should I work on?

If you’re just starting out, here’s my honest answer: I don’t think the exact topic matters much. That said, I would warn you against choosing things that have been popular for less than six months. AI moves fast, but the fundamental ideas haven’t changed in forty years. If you want to make a career out of this, I wouldn’t advise you to think too hard about the concepts of 2026: harnesses, agents, context engineering, etc. These will change.

Instead, you’ll learn more by going back to the basics. Learn what cross-entropy is, and compute it by hand for a small distribution. Deeply understand SVD, to the point where you can start to visualize it in your head. Don’t think too much about RL for coding specifically; instead, learn the ideas behind policy gradients, why they’re useful, and why they’ve been popular for decades.
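To make "compute it by hand" concrete, here is a minimal sketch in plain Python. The two distributions `p` and `q` are made-up numbers chosen so the arithmetic is easy to check on paper; they are not from any dataset.

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i), measured in nats."""
    # Terms with p_i == 0 contribute nothing to the sum, so they are skipped.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    """Entropy H(p) is the cross-entropy of p with itself."""
    return cross_entropy(p, p)

# Hypothetical three-class example: true distribution p, model prediction q.
p = [0.5, 0.25, 0.25]
q = [0.25, 0.5, 0.25]

h_pq = cross_entropy(p, q)  # 1.75 * ln(2): cost of describing p using q
h_p = entropy(p)            # 1.50 * ln(2): the best achievable cost
kl = h_pq - h_p             # 0.25 * ln(2): KL(p || q), the price of being wrong

# With a one-hot label, cross-entropy reduces to -log of the predicted
# probability of the correct class, which is the usual classification loss.
one_hot_loss = cross_entropy([0.0, 1.0, 0.0], q)  # -ln(0.5) = ln(2)
```

Working the same numbers on paper (every probability here is a power of two, so each log is a multiple of ln 2) is a quick way to confirm you understand each term.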

One more meta-comment: if the best possible outcome of your research project is a higher score on an existing benchmark, you are not going deep enough. Often, existing datasets won’t test new interesting capabilities.

Jason Wei makes a similar point: An underrated but occasionally make-or-break skill in AI research (that didn’t really exist ten years ago) is the ability to find a dataset that actually exercises a new method you are working on.

As for a concrete suggestion, I can’t make one; that has to come to you. Go deep, focus on the basics, and don’t chase benchmarks. Stay in the water and the ideas will come.

III.

in the beginner’s mind there are many possibilities; in the expert’s mind there are few

– Suzuki

Something often repeated in Silicon Valley these days is that experience in AI research might actually be counterproductive to good research intuition today. I’ve observed parts of this up close; many researchers from the pre-scaling era remain interested in designing methods that work at a small scale but will obviously fail when tested at scale.

One really impressive thing about OpenAI is that most of the people running the company (on the technical side, at least) are under 35. Many of the important decision-makers behind ChatGPT are under 30. One thing we can take away from this is that since AI is such a nascent field (ChatGPT is less than four years old!) no one has a huge advantage, because no one has been working on it for very long.

In short, holding on to ideas for too long can actually be counterproductive. Stay open-minded and refuse to let ego cloud your judgement.

IV.

Inspiration strikes when you least expect it.

Here are two examples from history:

The discovery of the structure of the benzene ring famously came in a dream: the structure had never been seen before, but was imagined as a snake biting its own tail.

Ozempic basically comes from lizards. The class of drugs it belongs to traces back to exendin-4, a peptide that mimics the human hormone GLP-1 and was first found in the venom of the Gila monster, a desert lizard that eats just a few times a year. Somehow we figured out how to make this work for humans too.

One important takeaway is that to do good research, you must do things other than research. Most of my personal “aha moments” happened away from the keyboard, especially when going on walks.

Darwin, Tesla, Feynman, Aristotle. Many great thinkers of history proclaimed the outsized benefits of stretching your legs and going for a little stroll. Even if you don’t do research, you should probably go on more walks.

V.

Even when inspiration strikes, nature may not be benevolent: even with a perfect implementation, our idea might just not be true in some fundamental sense. Or perhaps it is, or seems to be. When the results come in, how should we react?

Another principle we can borrow from Zen is (experimental) equanimity.

When analyzing an experiment, we can channel the following mentality:

Did it go well? Great!

Did it go poorly? Also great!

Both outcomes teach you the same amount of information. In fact, it’s often possible to learn more from a string of negative results than a single positive result. “Wow, it’s still not working – incredible!” Now that’s a healthy attitude for research.

The converse of this is that you shouldn’t get that excited about good results. In fact, most good results come from a bug: it’s not that the idea worked, it’s that you measured incorrectly and convinced yourself it did. Everyone wants their ideas to work – and this is a good thing! – but one thing all experienced researchers share is extreme skepticism, especially in the face of outcomes that seem too good to be true. Unfortunately, they almost always are.

VI.

A flower does not think of competing with the flower beside it. It just blooms.

Research is extremely outcome-driven. Especially in academia, it’s easy to look at others’ successes on paper and get emotional about them.

People succeed for different reasons. Some people get lucky. The academic reviewing process, in particular, is neither consistent nor fair. When new research comes out in your area that you admire, ask yourself the following question:

Am I operating at the proper level of depth to have made this insight myself?

Now there are two possible outcomes. If the answer is yes – great. Your process is sound, but you didn’t make this finding; you were busy, you were doing something else, but you could’ve.

And if the answer is no – then take this as motivation to go deeper.

VII.

before enlightenment, chop wood, carry water. after enlightenment, chop wood, carry water.

Many successful projects involve hundreds of hours of gruntwork behind the scenes. Andrej Karpathy labeled a nontrivial portion of ImageNet by hand. The creators of SWE-bench, who were ahead of their time in many ways, spent hundreds of hours painstakingly filtering GitHub data to get a small, tractable set of GitHub issues useful for evaluation.

If you look at the careers of great researchers, they likely spent lots of time working in obscurity before finding success. Get used to this. The more ambitious and forward-thinking an idea, the more work it may be to thoroughly implement and evaluate. This difficulty is a feature, not a bug.

VIII.

Colin Raffel, an amazing researcher whom I deeply respect, once mentioned that he thinks many ideas fail not because they’re bad ideas, but because the code has a bug that the researcher never found.

In general this is a really difficult problem, especially in the world of LLMs. A modern deep learning software stack is extremely complicated, and bugs can lie anywhere: in training, in inference, in harnesses, in data.
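One cheap guard against silent bugs is to compare a logged number against a value you can derive by hand. A common example: a freshly initialized classifier that predicts roughly uniformly over C classes should start with a cross-entropy loss near ln(C). The sketch below assumes that near-uniform initialization; the helper names, the 10% tolerance, and the vocabulary size are hypothetical choices for illustration.

```python
import math

def expected_initial_loss(num_classes):
    """Cross-entropy (in nats) of a uniform guess over num_classes outcomes."""
    return math.log(num_classes)

def check_metric(name, observed, expected, rel_tol=0.1):
    """Return a warning string if observed strays from expected, else None."""
    if math.isclose(observed, expected, rel_tol=rel_tol):
        return None
    return f"{name}: observed {observed:.3f}, expected about {expected:.3f}"

# Hypothetical numbers: a 50,000-token vocabulary and two possible first losses.
vocab_size = 50_000
expected = expected_initial_loss(vocab_size)  # ln(50000), roughly 10.82

ok = check_metric("initial_loss", 10.9, expected)          # within tolerance
suspicious = check_metric("initial_loss", 7.2, expected)   # worth investigating
```

A first loss far below ln(C) is not automatically a bug, but it is exactly the kind of number that deserves an explanation before you move on (for example, labels leaking into the inputs, or a loss averaged over the wrong dimension).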

If something looks wrong, you cannot move on. You can and should log many metrics and strive to understand all of them. If some of the metrics look different than you expected, you need to figure out why, because something may be wrong. I’ve tweeted before that one of the most important traits in a researcher is healthy paranoia. Be paranoid!

IX.

One practical point is that most experiments that involve deep learning take too long. Training models can take weeks or months. These days, evaluating a model on a single task can take multiple days.

Especially when coding with agents, our instinct may be to spin up many experiments in parallel and let them all run at a slow cadence. Although simple parallelization helps to some degree, context switching is a harmful pattern.

It is of paramount importance that you design ergonomic research workflows that support fast experimental feedback. Shorten cold-start times for training, and make small evals that return results quickly. I really admire Keller Jordan’s nanoGPT speedrun as an example of how much we can learn from fast iteration cycles.
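As one way to build a "small eval that returns results quickly", here is a minimal sketch that draws a fixed, seeded subset of an eval set and reports accuracy with a binomial standard error. The function names, the subset size of 200, and the stand-in results are assumptions for illustration, not part of any particular eval harness.

```python
import math
import random

def quick_eval_subset(examples, n, seed=0):
    """Pick a fixed random subset so repeated quick evals are comparable."""
    rng = random.Random(seed)  # local RNG; leaves global random state alone
    if n >= len(examples):
        return list(examples)
    return rng.sample(examples, n)

def accuracy_with_stderr(correct_flags):
    """Accuracy and its binomial standard error for a list of booleans."""
    n = len(correct_flags)
    acc = sum(correct_flags) / n
    stderr = math.sqrt(acc * (1 - acc) / n)
    return acc, stderr

# Hypothetical eval set of 10,000 example ids; evaluate on 200 of them.
full_eval = list(range(10_000))
subset = quick_eval_subset(full_eval, 200, seed=0)

# Stand-in for real model outputs: pretend 150 of the 200 were correct.
flags = [True] * 150 + [False] * 50
acc, stderr = accuracy_with_stderr(flags)  # 0.75 with stderr of about 0.031
```

The tradeoff is visible in the standard error: with 200 examples, differences of a point or two are noise, so a subset like this is for catching large regressions quickly, not for final numbers.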

(This said, at the end of the day, some results take an unavoidably long time. When you can, maintaining state over multiple days and understanding last week’s experiments when they finish today is an incredibly useful skill.)

X.

Coding agents help you move faster, but they make two problems worse: we have a harder time understanding basic details, and we context switch more often. A good researcher actively works to fight against both forces.

Codex can write a training script for you; it can even execute the script, babysit it while it’s running, interpret the results, and send them to you in an email. But maybe it ran into an error and shortened the system prompt without asking you. Maybe it shortened sequence lengths to get eval running in a reasonable time. Maybe it ran the wrong config because you didn’t specify.

From an engineering perspective, these are all small errors with an easy fix. But from a scientific one, they’re grave: small omissions like this can materially change important results of papers and are therefore not acceptable. Beware dragons. Even if you didn’t write the code, if you want to understand your results, you need to understand the system that produced them. I’ll level with you – this is hard! It’s tempting to outsource understanding to the machine. For many applications, it’s faster. But doing good science requires learning how the entire system works, so that you can be sure observations about it are true. There’s no easy way around this.
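One small habit that helps here is to diff the configuration you intended against the configuration that actually ran, rather than trusting a summary. The sketch below is a minimal version of that idea; the config keys and values are hypothetical, and it assumes the run writes out its effective settings as a flat dictionary.

```python
MISSING = "<missing>"  # marker for a key that is absent on one side

def diff_configs(intended, actual):
    """Return {key: (intended_value, actual_value)} for every mismatch."""
    diffs = {}
    for key in sorted(set(intended) | set(actual)):
        want = intended.get(key, MISSING)
        got = actual.get(key, MISSING)
        if want != got:
            diffs[key] = (want, got)
    return diffs

# Hypothetical example: what you asked for vs. what the run reported.
intended = {"max_seq_len": 8192, "system_prompt": "full", "eval_set": "test"}
actual = {
    "max_seq_len": 2048,      # silently shortened to make the eval faster
    "system_prompt": "full",
    "eval_set": "test",
    "num_shots": 0,           # a setting you never specified
}

diffs = diff_configs(intended, actual)
for key, (want, got) in diffs.items():
    print(f"{key}: intended {want!r}, ran with {got!r}")
```

An empty result does not prove the run was correct, but a non-empty one tells you exactly which results you cannot yet trust.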

XI.

TLDR: Talent isn’t all that it takes to become a successful researcher. Temperament is greatly underrated. Stay curious and persistent, remain thoughtful and meticulous, and the ideas will come.

source & further reading

blog.jxmo.io (original article)