You write CLAUDE.md. You put the rule right at the top. Rule 0. You test it in the next session — the AI reaches for grep anyway.
This isn’t a writing problem. It’s an architecture problem. Rules operate at instruction time. Execution happens somewhere else entirely.
The Gap Between Instruction and Execution
When an AI agent starts executing a sub-step mid-task, it isn’t re-reading your constitution. It’s pattern-matching the sub-intent — “find all usages of X” — against the tools it has used for that intent thousands of times. For a symbol search, that pattern resolves to grep or find. It has for the entire decade this intent has existed.
Your rule says: use codegraph_search. The execution momentum says: use grep. Momentum wins. Every time.
The failure isn’t disobedience. It’s that the rule fires at the wrong layer. You need the redirect to happen at tool-selection time, not at system-prompt read time.
The Right Protocol for a Rename
Here is what a structural-tools-first rename looks like:
Before touching any file:
codegraph_impact(symbol: "fixLocale")
→ shows every file with a call edge to this symbol
→ that is your exact change list, not an approximation
Make the changes.
After:
codegraph reindex
codegraph_search("fixLocale")
→ assert 0 results
This is a verifiable, repeatable protocol. It doesn’t rely on remembering a rule. Each step produces an output the next step consumes. If step 3 returns non-zero, you’re not done — no judgment required.
Contrast with the grep path: grep -rn "fixLocale" returns matches, you edit them, you re-grep to verify. It works. But it operates on text, not structure. It finds the string fixLocale in a comment, a log message, a test description, a doc file. It misses a dynamic call through a variable. The signal-to-noise is worse, and you won’t know which category a given match falls into without reading it.
Three 2026 Mechanisms That Actually Enforce This
1. Pre-task routing with plan_turn
trace-mcp’s plan_turn takes a natural-language task description and returns the correct tool sequence before execution starts:
plan_turn("rename fixLocale to fillLabels across the repo")
→ { targets: [...], nextSteps: ["codegraph_impact", "edit files", "reindex", "codegraph_search"] }
This intercepts intent before momentum builds. The agent classifies the task, gets the tool sequence, then executes. Grep never enters the picture because the route was set before any tool was picked up.
2. Claude Code hooks at the Bash layer
Claude Code hooks can match and block specific Bash patterns before they run. A hook that intercepts grep -rn ".*\.js" and returns a redirect message:
"Use codegraph_search for symbol lookups — grep is for string literals only."
This is enforcement at the tool call layer, not the instruction layer. The rule becomes load-bearing infrastructure rather than a text suggestion. You can’t forget a hook the way you can forget a rule.
3. Session waste auditing
trace-mcp’s get_optimization_report detects “Bash grep instead of search” as a named waste pattern, with token-savings estimates. Run it after each session and the pattern becomes visible as data, not just a violation to correct manually:
get_optimization_report()
→ { patterns: [{ type: "grep_instead_of_search", occurrences: 2, savings_estimate: "~400 tokens" }] }
When the behavior is measured, it can be tuned. Rules in CLAUDE.md are not measured. They are either followed or not, invisibly.
Why Protocols Beat Rules
A rule says: when you need to find usages, use codegraph.
A protocol says: step 1 of any rename is codegraph_impact. It returns the change list. Proceed to step 2.
The difference:
| Rule | Protocol | |
|---|---|---|
| Fires at | Instruction time | Tool-selection time |
| Enforced by | Agent memory | Step dependencies |
| Verifiable? | No | Yes — each step has output |
| Fails when | Execution momentum | The step genuinely has no output |
| Recoverable? | Only if agent notices | Yes — step 3 catches step 2 misses |
Rules are for things that require judgment. Protocols are for things that don’t. “Find usages before renaming” doesn’t require judgment — it requires a step. Put it in a step.
The Index Decay Problem
There’s a related trap: structural tools only work if the index is current. If you run codegraph_search("fixLocale") on a stale index, it returns the old symbol and you think the rename failed. If you run it after reindexing, it returns zero and you know you’re done.
The protocol handles this by making reindex an explicit step between edit and verify. Without it, the verify step produces false negatives and the whole point of using a structural tool evaporates.
This is also why keeping a manual index — a CODEMAP.md, a symbols file, a hand-maintained reference — is actively harmful past a certain project size. It provides false confidence without the reindex guarantee. An AI collaborator reads a line in CODEMAP.md, trusts it, and navigates to the wrong line number. The stale file is worse than no file.
The Practical Takeaway
For any rename or move in a codebase with a structural index:
codegraph_impactfirst — get the exact blast radius as a structured list, not a grep- Edit all files in that list
reindexcodegraph_search(old_name)— assert zero results- Run tests — confirm behavior unchanged
Put this in your project’s agent instructions as a numbered protocol, not a rule sentence. The AI executes numbered steps. It interprets rule sentences.
And if you want the enforcement to be automatic: write the hook. A three-line Claude Code hook that intercepts grep -rn on source files and redirects to codegraph_search will do more work than any amount of CLAUDE.md text.