AI CONTEXT & MCP GUIDE

AI Context & MCP: How repowise Cuts Agent Context Tokens by 96%

How repowise indexes your repo once and serves it to AI coding agents through nine task-shaped MCP tools that collapse search, read, and reason into a single curated call.

  • 96% fewer context tokens to load (2,391 vs 64,039, ~27x) at answer parity
  • 89% fewer file reads across paired benchmarks, at answer parity
  • 70% fewer tool calls on the flask48 and sklearn48 benchmark suites
  • 9 task-shaped MCP tools over one endpoint, open source under AGPL-3.0

By Raghav Chamadiya · Updated June 2026 · 12 min

TL;DR

repowise indexes your repo once and exposes it to AI coding agents through nine task-shaped MCP tools, so the agent calls one tool and gets curated context — docs, ownership, history, risk — instead of grepping and re-reading files. On paired runs (same model, same harness, with versus without repowise), loading context took 2,391 tokens instead of 64,039 — 96% fewer, roughly 27x — at answer parity, plus 89% fewer file reads and 70% fewer tool calls. It is open source under AGPL-3.0, self-hostable, and works with Claude Code, Cursor, Cline, and Codex.

DEFINITION

AI context over MCP is repowise serving your indexed codebase to AI agents through the Model Context Protocol. Instead of pasting files into a prompt, the agent calls a task-shaped tool — get_answer, get_context, get_risk — and receives curated, grounded context: documentation, ownership, git history, and risk in one round-trip, with a staleness envelope on every response.

Figure: repowise MCP server exposing nine codebase tools to an AI coding agent. One index, nine task-shaped tools; the agent calls get_context once instead of grepping and re-reading files.

Why does AI agent context matter?

Your agent burns thousands of tokens grepping, reading, and re-reading the same files to reconstruct context it should have been handed. Entity-by-entity tools force the model to do the retrieval itself: search, open, scroll, repeat.

The problem is fragmented context — the agent sees file fragments, never the whole picture. Each fragment costs tokens, and the model pays again on every retry when the first read missed.

  • Token cost compounds: raw exploration of a single task can spend 64,039 tokens just loading context, before any reasoning.
  • Retries multiply reads: a missed grep means another search, another open, another scroll.
  • No enrichment: raw file reads carry no ownership, no history, no risk, no "why."

This is the wedge. repowise is reproducible (deterministic: the same input yields the same answer), dual-audience (the same index serves agents and humans), and private (self-hostable, with source processed transiently). It is not a public-repo wiki; it is a context layer you host yourself.

How does the repowise MCP server work?

One index, nine task-shaped tools, curated answers — not raw dumps. The pipeline is four steps.

| Step | What happens |
| --- | --- |
| Index | repowise parses the repo into a graph, reads git history, and builds the wiki. Code is processed transiently, never persisted. |
| Connect | Register the MCP endpoint in Claude Code, Cursor, Cline, Codex, or any MCP client. One URL, nine tools. |
| Call | The agent calls a task-shaped tool and gets a curated answer in one round-trip instead of grepping and re-reading. |
| Trust | Every response carries a _meta staleness envelope, so the agent knows when the index is current and when to verify. |

Task-shaped, not entity-shaped. Most MCP servers mirror data entities — one file, one symbol, one diff — which forces long sequential chains. repowise tools are shaped around the task the agent is doing. get_answer collapses search, read, and reason into one cited round-trip. get_context absorbs what would otherwise be five or six calls — docs, signatures, ownership, freshness, callers, metrics — into one.

The nine tools, by what only they answer

| Tool | What only this tool answers |
| --- | --- |
| get_overview | One-time architecture orientation: the architecture map, key modules, entry points, and git health on an unfamiliar repo. |
| get_answer | Synthesised Q&A with citations and a calibrated confidence; low confidence returns ranked best_guesses instead of a guess. |
| get_context | A triage card for files, modules, or symbols: summary, signatures, ownership, freshness, the hotspot bit, governing decisions. |
| get_symbol | Exact source bytes for one symbol with live-verified line bounds; no offset math, no 800-line file read. |
| search_codebase | Concept search over the wiki when you know the concept but not the file; returns ranked pages with snippets and a search_method flag. |
| get_risk | What history says about touching these files: hotspot score, dependents, co-change partners, owners, and a PR directive when you pass changed_files. |
| get_why | Decision archaeology: why the code is shaped this way, falling back to git history when no ADRs exist. |
| get_dead_code | A tiered cleanup plan: unreachable files, unused exports, and zombie packages by confidence tier, pure graph and SQL. |
| get_health | Biomarkers and per-file scores across three signals (defect, maintainability, performance), the same signals a merge-gate judges on. |

Figure: repowise get_context triage card with signatures, ownership, and hotspot flag. One get_context call replaces five or six, and the skeleton view is roughly 37% of a full file read.
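
The get_answer row above says a low-confidence result returns ranked best_guesses rather than a guess. A minimal sketch of how a client might branch on that, assuming a response dict in which the confidence key and its "high"/"low" values are hypothetical names (only best_guesses appears in this guide), and assuming the guesses are file paths:

```python
from typing import Any


def next_step(response: dict[str, Any]) -> str:
    """Decide what to do with a get_answer response.

    Assumed shape (hypothetical except for best_guesses):
      {"confidence": "high" | "low", "best_guesses": [str, ...]}
    """
    if response.get("confidence") == "high":
        # High confidence: the guide describes this as citable without a re-read.
        return "use answer"
    guesses = response.get("best_guesses", [])
    if guesses:
        # Low confidence: verify the top-ranked candidate against source.
        return f"verify {guesses[0]}"
    return "fall back to search"


# Hypothetical sample response; the paths are made up for illustration.
sample = {"confidence": "low", "best_guesses": ["src/app.py", "src/cli.py"]}
print(next_step(sample))  # verify src/app.py
```

The point of the branch is that a low-confidence response still narrows the search: the agent verifies one ranked candidate instead of starting a fresh grep.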

The staleness envelope. An index that lies is worse than no index. Every MCP response carries a _meta envelope with index_age_days, the indexed_commit, and a stale_warning that fires only when the index has actually diverged from HEAD. Silence means current, so the agent trusts a verified response without re-reading; the only re-read triggers are an approximate-bounds flag, a stale_warning, or low confidence.
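
The re-read rule above can be sketched as a small predicate. The _meta, index_age_days, indexed_commit, and stale_warning names come from this guide; the approximate_bounds and confidence keys, and the sample values, are assumptions made for illustration.

```python
from typing import Any


def needs_reread(response: dict[str, Any]) -> bool:
    """Return True when the agent should verify the response against source."""
    meta = response.get("_meta", {})
    if meta.get("stale_warning"):
        return True  # the index has diverged from HEAD
    if response.get("approximate_bounds"):  # assumed key name
        return True  # line bounds were not live-verified
    if response.get("confidence") == "low":  # assumed key name and value
        return True
    return False  # silence means current


# Hypothetical envelopes; the commit hash and warning text are made up.
fresh = {"_meta": {"index_age_days": 0, "indexed_commit": "abc1234"}}
stale = {
    "_meta": {
        "index_age_days": 9,
        "indexed_commit": "abc1234",
        "stale_warning": "index is behind HEAD",
    }
}
print(needs_reread(fresh), needs_reread(stale))  # False True
```

Note that index_age_days alone never triggers a re-read in this sketch, which matches the guide's statement that the warning fires only on actual divergence from HEAD.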

How does it help you?

Fewer tokens, fewer retries, grounded answers, and provenance your editor never sees.

  • Fewer tokens and retries: 96% less context to load and 70% fewer tool calls — the budget goes to reasoning, not file archaeology.
  • Grounded answers: get_answer returns citations and a calibrated confidence; a high-confidence answer is content-grounded and can be cited without a verification read.
  • Provenance: get_risk surfaces who owns the code, how risky it is to touch, and what changes with it.

The same index that feeds agents also answers who owns this, how risky is this change, and why is it shaped this way — and repowise keeps the generated CLAUDE.md and a managed AGENTS.md current so the orientation files never drift.

Walkthrough: connect your agent

Step 1 — Install and index. Run pip install repowise, then repowise init to build the graph, git, health, and wiki layers. Code is processed transiently and never persisted.

pip install repowise
repowise init     # index the repo + register the MCP server

Figure: repowise init indexing a repository and writing the MCP server config. One command indexes the repo and wires up the MCP server.

Step 2 — Register the MCP endpoint. repowise init writes .mcp.json and auto-registers the server for Claude Code. For Cursor, Cline, or Codex, drop the same mcpServers block pointing at repowise mcp <project>.
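
For clients that need the block added by hand, a rough sketch of what an mcpServers entry launching repowise mcp <project> could look like, built in Python so it can be printed as JSON. The server label "repowise" and the placeholder path are assumptions; the file repowise init actually writes may differ.

```python
import json


def mcp_servers_block(project: str) -> dict:
    """Build an mcpServers entry that launches `repowise mcp <project>`."""
    return {
        "mcpServers": {
            "repowise": {  # server label: an assumption, any name should work
                "command": "repowise",
                "args": ["mcp", project],
            }
        }
    }


# "/path/to/project" is a placeholder, not a real path.
print(json.dumps(mcp_servers_block("/path/to/project"), indent=2))
```

Where each client reads this block from varies, so check the client's own MCP configuration docs before pasting it.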

Step 3 — Call a task-shaped tool. Ask the agent a "how does X work" question and it calls get_answer or get_context, getting a curated answer in one round-trip instead of grepping and re-reading.
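
Under the hood, an MCP client issues a JSON-RPC 2.0 request with the tools/call method. A minimal sketch of that request for get_answer follows; the "question" argument name and the sample question are hypothetical, since this guide does not list the tool's parameters.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, one per call


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "question" is a hypothetical argument name for get_answer.
request = tool_call("get_answer", {"question": "How does request routing work?"})
print(json.dumps(request, indent=2))
```

In practice the agent's MCP client builds and sends this for you; the sketch only shows that one question maps to one request.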

Step 4 — Trust the staleness envelope. Every response carries a _meta envelope with index_age_days, the indexed_commit, and a stale_warning that fires only when the index has diverged from HEAD. Silence means current.

Proof: the paired token benchmark

Each stat below is reproducible on your own repo, and the headline is a paired, same-model-same-harness comparison — not an estimate.

| Metric | Result | Method |
| --- | --- | --- |
| Context tokens to load | 2,391 vs 64,039: 96% fewer (~27x) | Paired runs, same model + harness, with vs without repowise |
| File reads | 89% fewer | Across benchmarks, at answer parity |
| Tool calls | 70% fewer | flask48 and sklearn48 benchmark suites |
| Task-shaped MCP tools | 9 | One endpoint, every MCP client calls the same set |
| Defect-validated health score | ROC AUC 0.74 | Surfaced to agents via get_health, calibrated on a real defect corpus |
| License and deployment | AGPL-3.0, pip install, self-host | Code processed transiently, never persisted |

The headline number comes from the paired benchmark: 2,391 versus 64,039 context tokens (96% fewer), measured with the same model and the same harness, with answer quality held at parity.
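
The two headline figures follow directly from the token counts. A quick check of the arithmetic, using only the numbers reported above:

```python
with_repowise = 2_391      # context tokens loaded with repowise
without_repowise = 64_039  # context tokens loaded without it

reduction = 1 - with_repowise / without_repowise
ratio = without_repowise / with_repowise

print(f"{reduction:.1%} fewer tokens")  # 96.3% fewer tokens
print(f"{ratio:.1f}x")                  # 26.8x
```

Rounded, that gives the 96% and ~27x quoted throughout this guide.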

FREQUENTLY ASKED

Questions, answered

How much does repowise cut token usage?

On paired runs (same model, same harness, with versus without repowise), loading context used 2,391 tokens instead of 64,039 — a 96% reduction, roughly 27x fewer — at answer parity. Across benchmarks that is 89% fewer file reads and 70% fewer tool calls. This is a measured, paired, same-model-same-harness comparison, not an estimate.

Which agents and editors does it work with?

Any MCP-compatible client. repowise is agent-neutral: Claude Code (one-command setup via repowise init), Cursor, Cline, and Codex, plus any tool that speaks the Model Context Protocol. The same nine tools are exposed over a single MCP endpoint, so you are not locked to one editor.

Does my code leave my machine?

It does not have to. repowise is open source under AGPL-3.0 and self-hostable: you can run it locally with your own LLM key, or fully offline via Ollama. Code is processed transiently and never persisted.

What is the difference between RAG and MCP here?

RAG is a retrieval technique (embed, search, stuff the prompt). MCP is the transport the agent uses to call tools. repowise uses retrieval inside get_answer but exposes it through MCP, so the agent calls a task-shaped tool and gets a curated, cited answer in one round-trip instead of you stuffing embeddings into a prompt.

How does the agent know when the index is stale?

Every MCP response carries a _meta envelope with index_age_days, the indexed_commit, and a stale_warning that appears only when the index has actually diverged from HEAD. Silence means the index is current. This is honesty as a feature: the agent is told exactly when to verify against source rather than trusting a snapshot blindly.

Is this the same index a human can use?

Yes. The repo is indexed once and serves both audiences: AI agents via MCP and humans via the web UI and generated CLAUDE.md and AGENTS.md files. The same graph, git history, wiki, decisions, and defect-validated health score feed every consumer.

Last reviewed: June 2026
