The Dopamine Revolution

There’s a pull to working with LLMs that most productive tools don’t have. Not the utility kind, the “I need to accomplish something” kind, but the other kind where you find yourself opening it again not because you have a task but because the last session felt good and you want to see what happens next. I kept noticing this before I really understood what was driving it.

Most tools I’ve used that work this way have been consumer products: social feeds, games, the comment section. The productive side of the stack has a different quality to it. Stack Overflow has a tone that makes you feel slightly judged for asking, documentation assumes you already know what you need, and colleagues have a threshold below which questions stop feeling welcome. None of them pull you back the way consumer products do, and I hadn’t expected something I was using to debug code to behave differently.

Why Does Working With an LLM Feel Different?

Part of it is patience, which sounds minor until you’ve spent years working around the alternative. The model doesn’t imply you should already know this, doesn’t make a correction that carries the weight of a judgment. A mid-level engineer I worked with kept asking questions about Kafka consumer group behavior, not to use the knowledge immediately, just to understand what was happening when consumer lag spiked. She’d had those questions for months, never asked them because the engineers who knew were busy and it didn’t feel like the right use of their time. With the LLM she just kept going. Two days later she had a mental model she’d been deferring for years. No one was waiting on her to stop.

“Intermittent reinforcement constitutes an important condition of action. Many significant features of behavior can be explained only by reference to the properties of schedules.”

— C.B. Ferster & B.F. Skinner, Schedules of Reinforcement (1957)

But patience alone doesn’t explain it. Textbooks are patient and nobody opens one because the last session felt good. The other part is variance: most output is competent, but occasionally it’s genuinely good in a way you didn’t see coming, an explanation that reframes something you’d been thinking about wrong, an approach cleaner than what you had. Variable ratio schedules produce more persistent behavior than reliable ones do. It’s the same structure behind slot machines and the social feed. The reward isn’t guaranteed, and that’s precisely why you keep checking.

What the Pull Actually Costs

The mechanism isn’t what’s new. Variable reinforcement goes back decades. What’s different is where it now applies. Consumer products have deliberately exploited this for years. Productive tools haven’t worked this way, at least not in my experience, partly because the medium didn’t support it. The thing delivering the variable reward is now something you’re using to build and learn, not something designed to capture your attention.

The effect, practically, is a lower threshold for starting. Engineers pick up areas they’d avoided before, not because access was blocked but because trying used to be unrewarding: the people who knew were busy, the docs assumed too much. Now there’s something patient that’s occasionally brilliant, and that changes what people attempt. At the team level, that same lowered threshold is what finally moves adoption decisions that good arguments couldn’t.

But I think the efficiency gains and the reward loop are running on separate tracks, even when they point in the same direction. The boilerplate case is real: scaffolding that used to take an afternoon takes minutes now, and agents don’t develop the fatigue that makes the third hour of a debugging session feel like the sixth. Those gains are genuine. But you can also spend ninety minutes getting to a conclusion you had after ten, feel good about it because two of those exchanges were genuinely sharp, and call that a net win. The variable reward doesn’t care about the efficiency math. It just keeps you at the terminal. Naming the actual constraint (am I here because this is working, or because two exchanges felt good) is the kind of check that’s easy to skip.

“Conditions of learning that make performance improve rapidly often fail to support long-term retention and transfer, whereas conditions that create challenges and slow the rate of apparent learning often optimize long-term retention and transfer.”

— Elizabeth Bjork & Robert Bjork, “Making Things Hard on Yourself, But in a Good Way” (2011)

The part I keep turning over is what that costs at the level of understanding. Skinner had a line about this: “The real question is not whether machines think but whether men do.” The research on desirable difficulties argues that the difficulty itself can be the mechanism: the struggle is often how the learning gets encoded. The engineer who uses LLM output as a starting point and investigates from there probably builds real understanding faster than before. The one who ships the first answer without knowing why it works is accumulating a different kind of debt. The tool gives the same encouragement to both, and I’m genuinely uncertain which pattern dominates as LLMs become the default environment for engineers who’ve never had to work without them.