◈Agent Booster v0.2.25Free · MIT

agent-booster — diagnostic

Inspiration

Standing on the shoulders of sharp thinkers.

Reuven Cohen

Agentic Engineer · Founder @ Cognitum.One

“The going rate for a single developer running Claude Code using a swarm style development is around $2.5k/day or $75k/month via Anthropic enterprise API.”
“The biggest cost in agentic development isn't the model. It's the constant replay of context. Most autonomous coding systems keep resending the same architecture documents, ADRs, source files, tool definitions, and conversation history over and over. That's where the money goes.”
“We're not optimizing models. We're optimizing information flow.”

Cut Claude Code Costs by 3x–15x — infographic by Reuven Cohen

Reuven's post articulated the problem precisely. Agent Booster is our open-source implementation of that insight — AST-level symbol routing, semantic vector search, and MCP integration built directly into your coding workflow. The concept of operating at the AST and semantic level rather than treating code as raw text comes directly from this framing.

Quickstart

Up and running in two commands.

Step 1 — Install

# includes embeddings + file watcher

pip install agent-booster[full]

Step 2 — Start

# detects Claude/Cursor/Codex, wires hooks, indexes, starts daemon

booster start

Detects which AI tools are present (Claude Code, Cursor, Windsurf, Codex), wires each one automatically, indexes the project, and starts a background daemon that keeps the model warm and auto-re-indexes on every file save. Fully reversible with booster remove claude.

That's it — then track savings

booster gain

Compatibility

Works with every major AI coding tool.

◈

Claude Code

booster init claude

⊙

Cursor

booster init cursor

◭

Windsurf

booster init windsurf

◎

OpenAI Codex

booster init codex

Each command shows exactly what files will change and asks for confirmation before writing anything. Run booster remove <platform> to cleanly undo.

What's new

v0.2.16 – v0.2.25

Daemon, verbosity modes, and output token tracking.

Three releases shipped together. The result: booster start is the only command you need, search is instant after the first run, and re-indexing costs nothing on unchanged files.

🛑v0.2.25

Output token tracking

booster-stop.py fires on every Claude Code session end and captures actual output tokens from the stop event. Stores baseline vs. actual in .booster/stats.db. booster gain now shows real savings — not estimates.

🔇v0.2.24

Verbosity modes

booster verbosity lite|full|ultra injects a conciseness block into CLAUDE.md, AGENTS.md, .cursorrules, and .windsurfrules. booster verbosity off removes it. Cuts output token count by 30–75% across all AI coding tools.

🗜️v0.2.24

Memory compression

booster compress rewrites every file in memory/ through claude-haiku to strip filler and cut token count by ~60%. booster compress --dry-run previews savings without writing. Keeps project memory lean as it grows.

⚡v0.2.18

Background daemon

booster start launches a persistent Unix socket process that keeps the embedding model loaded. search_context drops from 2–3 s cold-start to ~50 ms. Daemon survives editor restarts — it's not tied to any terminal.

◎v0.2.17

File watcher

watchdog monitors the project for writes. Changed files are re-indexed within 2 seconds of a save — no manual booster index during a coding session. Daemon handles this automatically.

≋v0.2.16

Delta indexing

SHA-256 hash and mtime stored per file in the SQLite index. Full re-index skips unchanged files entirely. Large repos that took seconds now finish in milliseconds. Use --force to override.

◈v0.2.16

Asymmetric embeddings

Index-time vectors use a passage: prefix; query-time vectors use query:. Follows the E5 paper's asymmetric retrieval approach. Retrieval accuracy improves meaningfully over symmetric embeddings, especially for short function names.

✦v0.2.18

booster start does everything

One command bootstraps the full stack: detects installed AI tools (Claude Code, Cursor, Windsurf, Codex), wires each one that isn't already wired, indexes the project, builds embeddings, and starts the daemon. On subsequent runs it just wakes the daemon.

Full changelog

Every commit, diff, and release note lives in the GitHub repo. PRs welcome.

View commits →

Under the hood

How it's actually built.

We're not optimizing models.
We're optimizing information flow.

A workflow that costs $2,500/day with brute-force context replay can often be reduced by several multiples — while maintaining comparable output quality. Every token you don't send is a token you don't pay for.

pip install agent-booster[full]

View on GitHub

Embeddings, file watcher, and daemon all included in the [full] extra.