There's a thing that does the thing. It writes the code, drafts the email, reads the document, answers the question. It's the part everyone watches, because it's the impressive part. And there's another thing that holds the thing. It decides where the work happens, what the thing is allowed to touch, what it knows when it starts, and what happens to the result when it finishes. Almost nobody watches that part. It's the one that matters most.
The thing that does the thing is the model. The thing that holds it is the harness. Once you can see the harness, you stop being surprised by how differently the same model behaves from one place to the next, and you start being able to do something about it.
The model reasons, the harness acts
That's the cleanest definition I've found, and it isn't mine. The phrase has settled into the industry over the past year, and Anthropic published a piece in November called "Effective harnesses for long-running agents" that treats it as a first-class engineering problem. The idea is simple once you hear it. The model thinks. It can't actually do anything on its own. It can't open a file, call another system, remember yesterday, or check whether its own work was any good. Something has to wrap around it and turn its reasoning into action. That wrapper is the harness.
When the model decides to read a file, the harness is what decides whether the read is allowed, what comes back, and how much of it fits into the next prompt. When the model finishes a step, the harness decides whether that step gets saved, tested, thrown away, or handed to the next session. The model supplies the intelligence. The harness supplies everything else, and everything else turns out to be most of the experience.
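To make that division of labor concrete, here is a minimal sketch in Python. None of it comes from a real product: the `Harness` class, its `allowed_root` and `max_chars` settings, and the `finish_step` method are hypothetical names invented for illustration, and the character limit is only a crude stand-in for a prompt budget. The point is that every decision in it belongs to the wrapper, not to the model.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Harness:
    """Hypothetical harness: gates file reads and trims results to a budget."""

    allowed_root: Path
    max_chars: int = 200  # assumed stand-in for how much fits in the next prompt
    saved_steps: list = field(default_factory=list)

    def read_file(self, requested: str) -> str:
        root = self.allowed_root.resolve()
        target = (root / requested).resolve()
        # Decide whether the read is allowed at all.
        if not target.is_relative_to(root):
            return "DENIED: path is outside the allowed directory"
        if not target.is_file():
            return "ERROR: no such file"
        text = target.read_text(encoding="utf-8")
        # Decide how much of the result goes back to the model.
        if len(text) > self.max_chars:
            return text[: self.max_chars] + "\n[truncated by harness]"
        return text

    def finish_step(self, result: str, passed_checks: bool) -> bool:
        # Decide whether a finished step is saved or thrown away.
        if passed_checks:
            self.saved_steps.append(result)
        return passed_checks
```

The model would only ever see the string that `read_file` returns. Whether that string is the file, a slice of it, or a refusal is a choice made entirely outside the model.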
The same model, in three different harnesses
Here's the part that took me a while to internalize. The model you use in a plain chat window, the model running inside Claude Code in my terminal, and the model running on a schedule while I'm asleep can all be the exact same model. They don't feel like the same tool. They barely feel like the same category of tool. The difference isn't the intelligence. It's the harness around it.
Drop a model into a chat box and you get a smart conversation that forgets everything when you close the tab. Put that model in a coding harness with access to a file system, a terminal, and a test runner, and it builds software across hours of work. Give it a harness that runs on a timer and reaches into your email and your calendar, and it does work you never sat down to ask for. Same engine. The harness decided what kind of tool it became.
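A rough way to picture this in code: one model function, three configurations around it. This is a sketch under stated assumptions, not how any real product is wired. The `model` function is a stub standing in for a real model call, `HarnessConfig` and its fields are hypothetical, and the `keeps_state` values for the coding and scheduled harnesses are my own illustrative guesses.

```python
from dataclasses import dataclass
from typing import Callable


def model(prompt: str) -> str:
    """Stub standing in for a real model call; identical in every harness."""
    return f"response to: {prompt}"


@dataclass(frozen=True)
class HarnessConfig:
    """Hypothetical description of what surrounds the model."""

    name: str
    tools: tuple[str, ...]  # what the model is allowed to reach
    keeps_state: bool       # whether anything survives the session (assumed)
    trigger: str            # "user" or "timer"


CHAT = HarnessConfig("chat", (), False, "user")
CODING = HarnessConfig("coding", ("filesystem", "terminal", "test_runner"), True, "user")
SCHEDULED = HarnessConfig("scheduled", ("email", "calendar"), True, "timer")


def run(config: HarnessConfig, task: str, engine: Callable[[str], str] = model) -> str:
    # The harness, not the model, decides which tools are described in the prompt.
    tools = ", ".join(config.tools) or "none"
    return engine(f"[tools: {tools}] {task}")
```

Calling `run(CHAT, task)`, `run(CODING, task)`, and `run(SCHEDULED, task)` sends the same task to the same `model` function. Only the surrounding configuration differs.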
This is why two people can use what is technically the same model and have completely different opinions about how useful it is. One of them is typing into a bare chat box. The other has wrapped the model in connectors, saved context, and a workflow that fits their actual job. They're not arguing about the model. They're describing two different harnesses and crediting the model for the gap.
You've been staring at the model the whole time
Once you have the word, you start seeing harnesses everywhere, and you notice that the companies building them are competing on the harness far more than on the raw model.
Anthropic is building Claude's harness: the connectors, the workspaces, the skills, the agent that runs on its own, the version that lives in your terminal. None of that is the model. It's the structure around the model that decides where it can work and what it can reach. OpenAI is building the same kind of thing around ChatGPT, with its own shape and its own choices. The frontier models themselves are converging. They leapfrog each other by a few months at a time, and for most real work they've become close enough that the choice between them is rarely the thing that decides whether you get value. The harness is. The labs know this, which is why so much of what they ship now is harness and not model.
So when someone asks me which model they should use, my honest answer is that it's usually the less interesting half of the question. The more interesting half is what you're going to wrap around it.
I build my own harnesses
The part most people don't realize is that you don't have to wait for a lab to hand you one. You can build your own, around the same model everyone else is using, shaped to exactly how you work. I do this constantly now, and it's changed more about my output than any single model upgrade has.
The clearest example is the Command Center I wrote about a few weeks ago, the queue I built so I could capture work the instant I think of it and feed it to Claude one focused job at a time. That's a harness. The model didn't change. I changed where it does the work, how the jobs are sequenced, and how the context survives after a session ends. The result is a tool that fits my day in a way no out-of-the-box chat window ever could.
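The shape of that idea fits in a few dozen lines. What follows is not the Command Center itself, only a minimal sketch of the pattern it describes: capture jobs immediately, run them one at a time, and keep context on disk so it outlives the session. The `JobQueue` class, the JSON state file, and the `engine` callable (a stand-in for whatever actually calls the model) are all assumptions made for illustration.

```python
import json
from collections import deque
from pathlib import Path
from typing import Callable, Optional


class JobQueue:
    """Hypothetical one-job-at-a-time queue whose state survives restarts."""

    def __init__(self, state_file: Path):
        self.state_file = state_file
        if state_file.exists():
            data = json.loads(state_file.read_text(encoding="utf-8"))
        else:
            data = {"pending": [], "done": []}
        self.pending = deque(data["pending"])
        self.done = data["done"]

    def _save(self) -> None:
        data = {"pending": list(self.pending), "done": self.done}
        self.state_file.write_text(json.dumps(data, indent=2), encoding="utf-8")

    def capture(self, task: str) -> None:
        # Record the job the moment it comes up; nothing runs yet.
        self.pending.append(task)
        self._save()

    def run_next(self, engine: Callable[[str, list], str]) -> Optional[str]:
        # Run exactly one job, handing the engine the results of earlier jobs.
        if not self.pending:
            return None
        task = self.pending[0]
        notes = [entry["result"] for entry in self.done]
        result = engine(task, notes)
        # Remove the job only after it succeeded, so a failure does not lose it.
        self.pending.popleft()
        self.done.append({"task": task, "result": result})
        self._save()
        return result
```

Because the queue is reloaded from the state file each time it is constructed, a new session picks up where the last one stopped, with the earlier results available as context.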
It shows up in smaller places too. When I shape an engagement document, I'm not typing into an empty box. I've wired the model to the connectors that hold our client context, our past work, and our capabilities, so the relevant material is already in reach before I ask for anything. When I build a presentation now, I run it through a builder we put together that knows our brand, our layouts, and our voice, so the model is producing inside guardrails instead of guessing. In every one of those cases, the model is the commodity. The harness is the thing I actually made, and it's the thing that makes the output mine.
None of this is exotic engineering. A few years ago, building a harness around a model was something only a research lab could do. Now the tooling has gotten good enough that an operator who understands their own workflow can assemble a real one in a few evenings. The barrier moved from "can you build this" to "do you understand your work well enough to shape it."
Where the difference lives now
If the models are converging, and I think they are, then the question of who gets more out of AI stops being a question about access to the best model. Everyone is getting the same engine. The difference moves to the harness: where you point the model, what you let it touch, what context you put in front of it, and what you do with what comes back.
That's a more hopeful place to compete than it sounds, because the harness is the part you can actually control. You can't out-build Anthropic's model. You can absolutely build a better harness for your own work than the generic one you were handed, because you understand your work and the generic harness doesn't.
So the next time the impressive part, the thing doing the thing, dazzles you, look past it to the thing holding it. That's where the leverage is. The model is going to keep getting better on its own. The harness is the part that's yours to build.