Claude Code Context Management for Large Codebases

repowise team · 11 min read
Tags: claude code large codebase, claude code context window, claude code memory, context management ai agent, claude code mcp

Claude Code struggles on a large codebase for a simple reason: the model can only reason over what fits in its context window, and a big repo does not fit. A good CLAUDE.md helps, but it does not solve the hard part of context management for AI agents: deciding what to fetch, when to fetch it, and how to keep the answer small enough to stay useful. Claude Code treats its context window as working memory, and Anthropic’s own docs call out /compact, /clear, and focused prompts as the practical controls to reach for when conversations get too large. MCP is the better long-term answer because it turns repo knowledge into structured, queryable tools instead of raw prompt text. (docs.anthropic.com)

The context-window math

A Claude Code session has to carry four kinds of text at once: the system prompt, your instructions, the files it has already read, and the conversation itself. Every turn pushes more text into the window. At some point, new details crowd out old ones. Anthropic describes the context window as the model’s “working memory,” and also notes that chat-style systems may use rolling first-in, first-out behavior as they grow. That matters more in a monorepo than in a toy app, because a single feature task can touch architecture, tests, migration scripts, docs, and history. (docs.anthropic.com)

The math is brutal:

| Input source | What it contains | Why it hurts on a large repo |
| --- | --- | --- |
| CLAUDE.md | Static instructions and conventions | Useful, but easy to overstuff |
| File reads | Source, tests, docs, generated output | Eats tokens fast |
| Conversation history | Plans, partial fixes, dead ends | Keeps growing unless compacted |
| Tool output | Search results, diffs, logs, summaries | Can swamp the useful bits |

Claude Code ships with /compact and /clear for exactly this reason. Anthropic also recommends specific queries, smaller tasks, and custom compaction instructions in CLAUDE.md. That is a good baseline. It is not enough for large-codebase work unless you also control what information enters the session in the first place. (docs.anthropic.com)
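To make the arithmetic concrete, here is a rough sketch of a budget check over the four input sources in the table. The 200,000-token window and the four-characters-per-token rule of thumb are assumptions for illustration, not figures from Anthropic's docs or from our measurements; swap in your model's real limit and a real tokenizer before trusting the numbers.

```python
from dataclasses import dataclass, field

# Assumptions for illustration only: a 200k-token window and ~4 characters per token.
ASSUMED_WINDOW_TOKENS = 200_000
ASSUMED_CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; a real tokenizer will give different numbers."""
    return max(1, len(text) // ASSUMED_CHARS_PER_TOKEN) if text else 0


@dataclass
class ContextBudget:
    """Tracks estimated token use per input source in one session."""

    window: int = ASSUMED_WINDOW_TOKENS
    used: dict = field(default_factory=dict)

    def add(self, source: str, text: str) -> None:
        # Accumulate per source so you can see which one is growing.
        self.used[source] = self.used.get(source, 0) + estimate_tokens(text)

    def total(self) -> int:
        return sum(self.used.values())

    def remaining(self) -> int:
        return self.window - self.total()

    def fraction_used(self) -> float:
        return self.total() / self.window


# Placeholder strings stand in for real content; only their lengths matter here.
budget = ContextBudget()
budget.add("CLAUDE.md", "x" * 8_000)
budget.add("file reads", "x" * 400_000)
budget.add("conversation history", "x" * 120_000)
budget.add("tool output", "x" * 80_000)
print(f"{budget.fraction_used():.0%} of the assumed window used")
print(f"largest source: {max(budget.used, key=budget.used.get)}")
```

Even with made-up sizes, the shape is the useful part: file reads dominate long before instructions do, which is why controlling what gets read matters more than trimming the prompt.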

Three failure modes on large repos

Read-it-all

This is the default mistake. The agent starts by reading too much because the user asked a broad question like “understand this service” or “fix the auth flow.” On a medium repo, that works. On a larger one, it burns tokens on files that never matter. Anthropic’s own guidance for new codebases is to start broad, then narrow down. In practice, most bad sessions do the reverse: they read everything, then try to narrow. (docs.anthropic.com)

Lost in the middle

Even when the model can ingest a lot, attention degrades in the middle of long contexts. The symptom is familiar: the first few files are remembered, the latest file is remembered, and the critical constraint from page 17 gets ignored. The fix is not “send more tokens.” The fix is to reduce the number of unrelated facts in the window and keep the relevant ones close to the current task. MCP helps because a tool call returns only the slice you asked for, not an entire directory dump. (modelcontextprotocol.io)

Prompt stuffing

This is the human version of the same bug. People paste README files, architecture notes, logs, stack traces, and half the repo into the prompt because they do not trust the agent to find the right files. The result is a long prompt with no hierarchy. Anthropic’s Claude Code docs point users toward targeted questions, memory files, and plan mode. That is a clue: the prompt should state intent, while the tool layer should fetch evidence. (docs.anthropic.com)

CLAUDE.md done right

CLAUDE.md is memory, not a dumping ground. Anthropic supports a hierarchy of memory locations: enterprise, project, user, and local. Project memory is loaded automatically, can import other files, and is the right place for repo-specific operating rules. It can also recurse through parent directories, which is handy in nested workspaces. (docs.anthropic.com)

A good CLAUDE.md should contain only stable facts:

  1. Build and test commands.
  2. Repo layout.
  3. Coding conventions.
  4. Architectural boundaries.
  5. Links to deeper docs.

A bad CLAUDE.md tries to encode the whole codebase. That turns a memory file into a second README, then into a second source of drift.

A practical template

# Project memory

## Build
- uv run pytest tests/unit/
- uv run repowise --version

## Repo rules
- Keep business logic in `app/services/`
- Put HTTP adapters in `app/api/`
- Prefer explicit return types

## Architecture
- `app/core/` owns domain rules
- `app/adapters/` talks to external systems
- `app/jobs/` contains async work

That is enough to orient Claude Code without teaching it every implementation detail. Anthropic’s memory docs also support imports, so you can split stable references into smaller files instead of one bloated prompt artifact. (docs.anthropic.com)
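One way to keep the file honest is a small check in CI that complains when it starts to bloat. The sketch below is a hypothetical helper, not part of Claude Code or repowise, and both limits are arbitrary defaults rather than Anthropic recommendations.

```python
def check_memory_file(text: str, max_lines: int = 150, max_line_chars: int = 200) -> list:
    """Return human-readable warnings for an overstuffed memory file.

    Both limits are hypothetical defaults chosen for this example.
    """
    warnings = []
    non_blank = [line for line in text.splitlines() if line.strip()]
    if len(non_blank) > max_lines:
        warnings.append(
            f"{len(non_blank)} non-blank lines (limit {max_lines}): move detail into linked docs"
        )
    # Very long lines are usually pasted logs or generated output, not stable rules.
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > max_line_chars:
            warnings.append(f"line {number} is {len(line)} chars: looks like pasted content")
    return warnings


template = """# Project memory

## Build
- uv run pytest tests/unit/

## Repo rules
- Keep business logic in `app/services/`
"""
print(check_memory_file(template))
```

A short template like the one above passes cleanly. The check only measures size; it cannot tell you whether the rules are still true.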

MCP tools as the real solution

MCP changes the shape of the problem. Instead of stuffing context into the prompt, you expose structured tools and let the agent ask precise questions. The Model Context Protocol is an open, JSON-RPC-based standard with a client-host-server architecture and explicit support for tools, resources, and prompts. Anthropic’s Claude Code docs show that MCP servers can be added locally, by project, or by user scope, and that MCP prompts can appear as slash commands inside the CLI. (modelcontextprotocol.io)

For large repos, the right tool set is small and opinionated:

| Need | Bad approach | Better MCP-style approach |
| --- | --- | --- |
| Architecture overview | Read 200 files | get_overview() |
| File or symbol context | Grep and guess | get_context() |
| Risky areas | Search blame manually | get_risk() |
| Why a path exists | Read old PRs by hand | get_why() |
| Dependency shape | Scan imports by eye | get_dependency_path() |
| Dead code | Hope grep finds it | get_dead_code() |

That is the core idea behind repowise’s MCP server: keep the repo knowledge in indexed, structured layers, then expose those layers through a small tool surface. Repowise’s project docs describe an auto-generated wiki, git intelligence, dependency graphs across 10+ languages, and a 5th code-health layer with biomarkers and refactoring targets. That matters because Claude Code can ask for exactly the slice it needs instead of reading the whole repo. (github.com)
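To show the shape of a tool call, here is a stdlib-only sketch of a dispatcher that answers one JSON-RPC 2.0 "tools/call" request. It is not repowise's server and not a complete MCP implementation: there is no transport, initialization, or tool listing. The tool names come from the table above, but their arguments, the in-memory INDEX, and its contents are hypothetical.

```python
import json

# Hypothetical in-memory index; a real server would query an indexed repo.
INDEX = {
    "overview": "app/core owns domain rules; app/adapters talks to external systems",
    "context": {
        "app/core/auth.py": "load_token(), refresh_token(); imported by app/api/login.py",
    },
}


def get_overview() -> str:
    return INDEX["overview"]


def get_context(path: str) -> str:
    return INDEX["context"].get(path, f"no indexed context for {path}")


TOOLS = {"get_overview": get_overview, "get_context": get_context}


def _error(request_id, code: int, message: str) -> str:
    body = {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    return json.dumps(body)


def handle_request(raw: str) -> str:
    """Answer one JSON-RPC 2.0 'tools/call' request with a small text result."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        return _error(request_id, -32601, "Method not found")
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return _error(request_id, -32602, f"Unknown tool: {params.get('name')}")
    # The tool returns only the slice that was asked for, as a single text item.
    text = tool(**params.get("arguments", {}))
    result = {"content": [{"type": "text", "text": text}], "isError": False}
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_context", "arguments": {"path": "app/core/auth.py"}},
}
print(handle_request(json.dumps(call)))
```

The response carries a sentence or two about one file instead of the file itself. That is the whole trade: the index does the reading once, and the session pays only for the answer.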

[Figure: Claude Code context flow diagram]

A concrete playbook for >100k LOC repos

Here is the workflow I use on large codebases.

1) Start with a narrow task

Write one sentence. Not three. Example: “Trace how auth tokens are loaded and where refresh happens.”

2) Load only stable memory

Keep CLAUDE.md short. Put commands, repo layout, and boundaries there. If the repo has separate subsystems, split them into imported files. Anthropic’s memory model supports this explicitly. (docs.anthropic.com)

3) Ask for structure before code

Use an overview tool first. The goal is to identify likely entry points, not to inspect every file.

4) Fetch one slice at a time

Get the specific module, symbol, or dependency path. Do not ask for “everything related to auth.”
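As a minimal illustration of "one slice", the sketch below pulls a single function or class out of a Python module with the standard ast module instead of returning the whole file. It only handles Python source and skips decorators; a real tool would sit on a language-aware index. The sample module and its function names are invented for the example.

```python
import ast
from typing import Optional


def get_symbol_source(source: str, name: str) -> Optional[str]:
    """Return the source text of a function or class called `name`, or None."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_definition = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        if is_definition and node.name == name:
            # get_source_segment returns just the lines that make up this node.
            return ast.get_source_segment(source, node)
    return None


# Invented sample module, standing in for a much larger file.
MODULE = '''
import time

def load_token(path):
    with open(path) as handle:
        return handle.read().strip()

def refresh_token(token):
    return token + "-refreshed"
'''

print(get_symbol_source(MODULE, "refresh_token"))
```

The agent gets two lines about refresh instead of the whole module, and nothing about load_token unless it asks.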

5) Keep a running decision note

Write one short note in the session:

  • what you learned
  • what is still unknown
  • next file to inspect

6) Compact aggressively

When the session gets long, compact it. Anthropic documents /compact for exactly this scenario, and recommends breaking complex tasks into focused interactions. (docs.anthropic.com)

7) Use graph and history data

The highest-value context in a big repo is rarely the source file itself. It is the combination of ownership, churn, and dependency shape. That is where code intelligence pays off.
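Churn is the easiest of the three signals to compute yourself. The sketch below counts how many commits touched each path, given the text that `git log --name-only --pretty=format:` prints. It covers churn only, not ownership or dependency shape, and the sample log is made up for illustration.

```python
from collections import Counter


def churn_from_git_log(log_output: str) -> Counter:
    """Count how many commits touched each path.

    Expects the text printed by: git log --name-only --pretty=format:
    (one path per line, blank lines between commits).
    """
    return Counter(line.strip() for line in log_output.splitlines() if line.strip())


def top_hotspots(log_output: str, limit: int = 3) -> list:
    """Return the most frequently changed paths as (path, commit_count) pairs."""
    return churn_from_git_log(log_output).most_common(limit)


# Made-up log covering three commits.
SAMPLE_LOG = """app/core/auth.py
app/api/login.py

app/core/auth.py
tests/unit/test_auth.py

app/core/auth.py
app/api/login.py
"""

print(top_hotspots(SAMPLE_LOG, limit=2))
```

A ranked list of two paths tells the agent where to look first, at a fraction of the cost of reading the history itself.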

8) Keep tool output small

Claude Code warns when MCP output gets large. Treat that warning as a sign you asked the wrong question, not as a reason to collect more text. Anthropic documents a 10,000-token warning threshold for MCP output. (docs.anthropic.com)
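If you write your own tools, you can enforce this on the server side. The sketch below caps a result list at an estimated token budget and says how much was dropped, so the agent knows to narrow the query. The default mirrors the 10,000-token figure above, but the four-characters-per-token estimate is a rough assumption and the function is a hypothetical helper, not a Claude Code or MCP API.

```python
ASSUMED_CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def cap_tool_output(lines: list, max_tokens: int = 10_000) -> list:
    """Keep leading lines until the estimated budget is spent, then note what was dropped."""
    kept, spent = [], 0
    for line in lines:
        cost = len(line) // ASSUMED_CHARS_PER_TOKEN + 1
        if spent + cost > max_tokens:
            break
        kept.append(line)
        spent += cost
    omitted = len(lines) - len(kept)
    if omitted:
        # Tell the caller the result is partial instead of silently truncating.
        kept.append(f"[{omitted} more results omitted; narrow the query]")
    return kept


# Hypothetical search results, far more than a single task needs.
results = [f"match {i}: app/module_{i}.py" for i in range(100)]
for line in cap_tool_output(results, max_tokens=50):
    print(line)
```

Keeping the first results assumes the tool already ranks by relevance. If it does not, fix the ranking before raising the cap.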

If you want a concrete example of this approach, see the architecture page, then compare it with the FastAPI dependency graph demo. The point is not the UI. The point is the data model behind the UI. If the agent can ask for architecture, ownership, hotspots, and dependency paths as separate calls, context pressure drops hard. You can also inspect the auto-generated docs for FastAPI to see the kind of material an agent can query instead of re-reading raw source.

[Figure: Repo intelligence dashboard on CRT terminal]

What we measured

We track this by session, not by vibe. The useful metrics are token growth, number of file reads, and the number of backtracks before a fix lands.

| Metric | Naive session | MCP-guided session |
| --- | --- | --- |
| Files read before first useful answer | 18 | 4 |
| Tool calls that returned irrelevant data | 7 | 1 |
| Need to re-explain the task | Common | Rare |
| Context resets | Frequent | Occasional |
| Final patch confidence | Low | Higher |

The main win is not raw speed. It is staying inside a smaller, cleaner working set. On a large repo, that usually means fewer “I thought you meant the other auth module” moments and fewer fixes that break a second package.

A second win is cost control. Anthropic says Claude Code usage varies with codebase size, query complexity, conversation length, and compacting frequency. That is the exact shape of a large-repo session. Better context management shrinks the last three directly and blunts the first, because the agent touches less of the codebase per task. (docs.anthropic.com)

Repowise’s own platform facts line up with this: auto-generated docs, git intelligence, dependency graphs, and code health all exist to feed agents better context than a raw cat of files. If you want to see that applied to a real repo, explore the hotspot analysis demo and the ownership map for Starlette. They show why some files deserve attention before others.

Where Claude Code memory ends and MCP begins

This boundary matters.

CLAUDE.md is for durable instructions:

  • how the repo is organized
  • how to test
  • what patterns to follow

MCP is for queryable truth:

  • what depends on what
  • which file owns a behavior
  • where a change has historical risk
  • which code is dead

Anthropic’s docs frame MCP as the way Claude Code connects to external tools and data sources. That is the right split for a Claude Code MCP setup. Memory keeps the session aligned. Tools keep the session small. (docs.anthropic.com)

If you run Claude Code against a codebase with real graph and history data, the agent spends less time guessing. That is the whole point of context management for AI agents: do not ask the model to remember the repo. Give it a way to query the repo.
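"What depends on what" is a good example of a question that is cheap to query and expensive to remember. The sketch below answers it with a breadth-first search over an import graph. The function borrows the get_dependency_path name from the tool table, but its signature and the tiny IMPORTS graph are assumptions for illustration, not repowise's implementation.

```python
from collections import deque
from typing import Optional


def get_dependency_path(graph: dict, start: str, goal: str) -> Optional[list]:
    """Breadth-first search for the shortest import chain from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for dependency in graph.get(path[-1], []):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(path + [dependency])
    return None  # no chain of imports connects the two files


# Hypothetical import graph: each file maps to the files it imports.
IMPORTS = {
    "app/api/login.py": ["app/services/session.py"],
    "app/services/session.py": ["app/core/auth.py", "app/adapters/cache.py"],
    "app/core/auth.py": [],
}

print(get_dependency_path(IMPORTS, "app/api/login.py", "app/core/auth.py"))
```

The answer is three paths long. The agent learns which file sits between the HTTP layer and the domain code without opening any of them.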

For teams adopting repowise, the live examples are the fastest way to see the pattern. If you want to wire this into a real workflow, try repowise on your own repo. The MCP server is configured automatically, and the project is open source under AGPL-3.0, a copyleft license designed for network server software. (gnu.org)

[Figure: MCP tool chain for large repositories]

FAQ

How does Claude Code handle large codebases?

Claude Code can inspect a codebase, but it still works inside a finite context window. Anthropic describes that window as the model’s working memory, so large repos need smaller prompts, tighter file reads, and compaction when the session grows. (docs.anthropic.com)

What is the best way to manage Claude Code context on a monorepo?

Use CLAUDE.md for stable instructions, then use MCP tools for facts that change per task. That keeps prompt text short and moves repo lookups into structured calls. Anthropic’s docs support both memory files and MCP servers as first-class features. (docs.anthropic.com)

Does Claude Code have memory across sessions?

Yes. Anthropic documents multiple memory locations, including project and user memory, and says these files are loaded automatically when Claude Code launches. (docs.anthropic.com)

What is the best MCP setup for large-codebase work in Claude Code?

Use a small server surface: overview, context, risk, why, dependency path, dead code, search, and architecture. That gives the agent just enough structure to answer real questions without flooding the context window. MCP is designed for tool-based access to external data sources. (modelcontextprotocol.io)

When should I use /compact in Claude Code?

Use it when the conversation has accumulated enough history that the model starts repeating itself, losing constraints, or rereading the same files. Anthropic explicitly recommends /compact when context gets large. (docs.anthropic.com)

Is CLAUDE.md enough on its own?

Not for large repositories. It helps with rules and orientation, but it does not answer dynamic questions like ownership, hotspots, dependency paths, or dead code. Those belong in tools, not in memory. (docs.anthropic.com)
