Going autonomous

"Going autonomous" means the agent doesn't need you in the room anymore. It starts itself on a schedule or an event. It does its work. It recovers from the ordinary errors on its own. It only bothers you when something genuinely requires human judgment. This section is about the ladder from "works when I run it" to "runs while I'm asleep", and the specific things you need in place at each rung.

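To make "recovers from the ordinary errors on its own, only bothers you when it has to" concrete, here is a minimal sketch of an unattended run wrapper. Everything in it is an assumption made for illustration: `TransientError`, `run_unattended`, and the `notify_human` callback are hypothetical names, and the trigger (a schedule or an event) is assumed to live outside this function.

```python
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for an ordinary, retryable failure (e.g. a timeout)."""


def run_unattended(
    task: Callable[[], str],
    notify_human: Callable[[str], None],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Run the task once, retry ordinary errors, escalate everything else."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TransientError as exc:
            if attempt == max_attempts:
                # Retries are exhausted: this is no longer an "ordinary" error.
                notify_human(f"gave up after {attempt} attempts: {exc}")
                return None
            # Exponential backoff before the next attempt: 1s, 2s, 4s, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
        except Exception as exc:
            # Anything unrecognized goes straight to a human, with no retry.
            notify_human(f"needs human judgment: {exc!r}")
            return None
    return None
```

The design choice worth noticing is the split between the two `except` branches: errors the agent knows how to wait out get retried quietly, and anything it doesn't recognize is escalated instead of retried. Which errors count as "ordinary" for your task is a judgment call this sketch doesn't make for you.
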
Before you read this section.

You should already have a working Level 3 agent (see the autonomy spectrum). That means an agent that does real work when a human runs it, with human-in-the-loop approvals for risky steps. If you haven't got that yet, the rest of this section is premature. Get Level 3 working on a task you care about, then come back.

What you actually need to go autonomous.

Going autonomous isn't one change. It's a stack of six layers working together. Miss any one of them and the whole thing falls over at 3am.

~ the six layers of a truly autonomous agent ~

Each layer assumes the one below it works. If the model is unreliable, nothing above matters. If the harness is flaky, scheduling it won't help. Build bottom-up; skip nothing.
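
One way to hold yourself to "build bottom-up; skip nothing" is to check the layers in order and stop at the first one that fails. This is only a sketch: `first_failing_layer` is a hypothetical helper, the layer names in the usage note are the three this paragraph happens to mention (not the full six), and the checks are placeholders you would replace with real ones.

```python
from typing import Callable, List, Optional, Tuple

# A layer is a name plus a zero-argument check that returns True when it is healthy.
Layer = Tuple[str, Callable[[], bool]]


def first_failing_layer(layers: List[Layer]) -> Optional[str]:
    """Walk the layers bottom-up; return the first failing name, or None if all pass."""
    for name, check in layers:
        if not check():
            # Layers above this one are not checked: they depend on it working.
            return name
    return None
```

For example, `first_failing_layer([("model", model_ok), ("harness", harness_ok), ("scheduling", scheduling_ok)])` returns `"harness"` if the model check passes and the harness check fails, and never runs the scheduling check at all. The three `*_ok` functions are assumed here, not provided.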

What to read in this section.

These four pages each cover one layer (or one cluster of layers). Read them in order; they build on each other.

Is autonomy even worth it for your task?

Most tasks shouldn't be autonomous. Autonomy has real costs (infrastructure, monitoring, risk), and for many workflows a Level 3 agent with a human in the loop is a better solution. Here's the test.

~ autonomy: worth it vs not ~

The progression, in one picture.

Zoomed out, the journey from "first session" to "fully autonomous" looks like this:

~ the progression that actually works ~

Skip any step and you end up with the classic autonomy failure: the agent runs confidently for a day, then does something unexpected while nobody is watching. The steps aren't optional. They're how you earn trust with an agent, the same way you earn trust with a new employee.