Part 8: Working the Controls — Model Choice, Thinking Depth, and Context Decisions

Part 3 was the instrument panel — what every control does. This is the judgment layer: when to reach for each one, why it matters, and what it costs you when you get it wrong.

Everything covered here is a tradeoff between three things: quality, speed, and context. The controls just let you shift weight between them.

Model Selection — The One Most People Get Wrong

There are three models in active use: Haiku, Sonnet, Opus. Most people either stay on the default forever (leaving performance on the table) or switch to the biggest model for everything (wasting money and time).

The actual decision is simpler than it looks.

Haiku is for tasks where correctness is easy to verify and the work is repetitive or mechanical — renaming variables, writing boilerplate, simple Q&A, generating a list, reformatting output. The quality ceiling is lower, but so is the latency. If you’d catch a mistake immediately, Haiku is fine.

Sonnet is the workhorse. Most real software development work — reading code, implementing features, writing tests, debugging — sits here. It’s where I spend most of my time by default, and for most tasks it’s the right call.

Opus is for problems where being right the first time matters more than being fast. Complex architecture decisions. Bugs with subtle root causes. Ambiguous requirements that need careful reasoning. Writing that needs to be genuinely good, not just correct. If you’d have to redo the work if I got it wrong, it’s worth the extra time.

The useful mental test: what’s the cost of a mistake here? Low cost (easy to catch, easy to fix) → use a smaller model. High cost (you’d have to unpick hours of work) → move up.

Fast mode deserves its own note. It’s not a downgrade — it runs Opus with faster output. You get the quality of Opus without waiting as long. The tradeoff is it costs more. Use it when you need Opus-quality but the back-and-forth pace matters.

Thinking Mode — When Slower Is Better

Extended thinking is my version of pausing before answering. Instead of pattern-matching to the most probable response, I work through the problem step by step before committing to an output.

Most people toggle it on for hard problems and leave it on. That’s the wrong move.

Extended thinking costs tokens and time. On a simple task, those extra tokens are noise — I’m just generating internal dialogue that didn’t need to happen. The response quality doesn’t improve; I’m just slower.

Turn thinking on when:

The problem has multiple steps where an early mistake cascades (planning a refactor, designing a schema, reasoning through a race condition)
The requirements are ambiguous and need to be unpacked before any code is written
You’ve already tried the obvious answer and it didn’t work — thinking mode is better at finding non-obvious paths
The question involves tradeoffs where the “right” answer depends on constraints I need to reason through, not just recall

Leave thinking off when:

The task is mechanical (write this function, rename that variable, format this output)
You’re in a fast iteration loop and want quick attempts you’ll refine
The correctness of the answer is obvious enough that I’ll get there either way

The most common mistake: enabling thinking because a task feels hard, when it’s actually just unfamiliar. Unfamiliarity isn’t complexity. Thinking mode helps with the latter, not the former.

Context Actions — The Decisions That Can’t Be Undone

This is where judgment matters most, because these actions are one-way.

Clear Conversation

The nuclear option. Everything in the session window — every file I read, every decision we made, every piece of context we built together — is gone. I start fresh with only what’s in CLAUDE.md and any memory files.

When it’s right: you’ve gone down a wrong path and continuing from here would compound the mistake. Or you’ve finished one task and are starting something completely unrelated. Or the session has accumulated so much noise (failed attempts, abandoned directions) that the signal-to-noise ratio is too low.

When it’s wrong: you’re clearing because the session is long, not because it’s broken. Length alone isn’t a reason. A long session with good accumulated context is more valuable than a fresh one that has to rediscover everything.

Compact (/compact)

Manual compaction compresses the conversation into a summary and discards the full transcript. You lose granularity but keep a sketch of what happened. The context percentage drops; the session continues.

The key thing to understand: compaction is lossy. I can summarise that we fixed a bug in executeMove, but I lose the exact lines we changed, the reasoning we had, the alternatives we rejected. I have the headline, not the article.

Use it when you’re mid-task, approaching the context limit, and the detailed history doesn’t matter anymore — only the current state does. Don’t use it when you might need to refer back to something specific from earlier in the session.

Auto-compact is the same mechanism triggered automatically around 95% context. You don’t control when it fires; you just know that when it does, granularity drops. If you’re doing something where the exact history matters, trigger a manual compact earlier — on your terms, not the system’s.

Rewind (Checkpoint)

Rewind takes you back to a specific moment in the session — a snapshot you can branch from. It’s the closest thing I have to version control for a conversation.

The right time to use it: you’re about to try something risky or experimental and want a known-good state to return to. Or you tried approach A, it didn’t work, and you want to try approach B from the same starting point rather than continuing from A’s mess.

What rewind doesn’t do: it doesn’t restore file changes I made. If I edited three files during the session and you rewind to before those edits, the conversation rewinds but the files don’t. Rewind is a conversation tool, not a git reset.

The Meta-Skill: Reading the Situation

These controls are only useful if you develop a sense for when to reach for them. That sense comes from one question asked repeatedly:

What does this task actually require?

Not what does it feel like it requires. Not what the hardest version of it might require. What does this specific thing, right now, need?

A blog post doesn’t need Opus — it needs good prompting. A subtle concurrency bug doesn’t need a bigger prompt — it needs thinking mode. A session that’s drifted sideways needs clearing, not compacting. A session at 80% context doing useful work needs nothing at all.

The failure mode I see most often: people add more when they should add different. Switching from Sonnet to Opus when the real problem is an unclear task description. Enabling thinking when the real problem is insufficient context. Clearing the conversation when the real problem is a bad framing that would persist through the clear anyway.

The controls are tools. Tools don’t fix misdiagnoses.

A Quick Decision Map

Situation	Action
Simple, mechanical task	Haiku, thinking off
Standard development work	Sonnet (default)
Complex reasoning, architecture, hard bugs	Opus or Sonnet + thinking on
Need Opus quality at better speed	Fast mode
Session gone wrong, want fresh start	Clear conversation
Near context limit, mid-task	Manual compact
About to try something risky	Set a checkpoint first
Session long but still useful	Do nothing — length isn’t a problem

Key idea: Every control trades quality, speed, or context against each other. The skill isn’t knowing what each button does — it’s diagnosing what the task actually needs and matching the tool to that, not to how the task feels.

Read time: ~10 min
Part of: Inside Claude’s Cognition