Context Compression

for local agents reading structured files

Make exact data reads cost less.

Context Compression sits behind your AI tool. When an agent reads JSON, JSONL, CSV, or TSV, Context Compression can substitute a lower-token representation of the file, but only after proving that the smaller representation decodes to the same parsed data.
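To make "the same parsed data" concrete, here is a minimal sketch of the difference between byte equality and parsed equality. It uses only the Python standard library; the helper names `parse_json` and `parse_csv` are illustrative, and the repo's own verifier may compare data differently.

```python
import csv
import io
import json


def parse_json(text: str):
    """Parse JSON text into plain Python objects."""
    return json.loads(text)


def parse_csv(text: str):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


# Two JSON texts that differ in whitespace only.
pretty = '{\n  "id": 1,\n  "tags": ["a", "b"]\n}'
compact = '{"id":1,"tags":["a","b"]}'

# Two CSV texts that differ in quoting and line endings only.
quoted = '"name","age"\r\n"Ada","36"\r\n'
bare = "name,age\nAda,36\n"

bytes_same = pretty == compact                          # the texts differ
json_same = parse_json(pretty) == parse_json(compact)   # the parsed data matches
csv_same = parse_csv(quoted) == parse_csv(bare)         # the parsed rows match

print(bytes_same, json_same, csv_same)
```

The texts differ, yet every row, key, and value survives parsing unchanged. That is the kind of equality a smaller representation has to satisfy.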

measured savings

Cut the noisy tokens before they reach the model.

On 28 real files, Context Compression cut 17.5M tokens down to 15.2M. That is 2,324,959 tokens gone, and every selected read still decoded back to the exact same parsed data.

raw tokens: 17,496,442 (original structured files in the corpus)

optimized tokens: 15,171,483 (selected after round-trip verification)

tokens removed: 2,324,959 (advanced benchmark candidates, May 22, 2026)

corpus savings: 13.3% (with raw fallback included)
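The headline figures follow from the two token totals reported above. This short check reproduces the arithmetic; the variable names are illustrative.

```python
# Token totals as reported for the 28-file corpus.
raw_tokens = 17_496_442
optimized_tokens = 15_171_483

# Absolute and relative savings across the whole corpus.
tokens_removed = raw_tokens - optimized_tokens
savings_pct = 100 * tokens_removed / raw_tokens

print(tokens_removed)          # 2324959
print(round(savings_pct, 1))   # 13.3
```

Because the corpus figure is a ratio of token totals, large files weigh more than small ones. That is one way a single dataset can show much higher savings than the corpus as a whole.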

SQuAD QA rows 72.7%
Titanic rows 50.5%
LogHub logs 41.5%
GitHub metadata 29.1%

when to use it

Use it when the data must stay exact.

The sweet spot is simple: an agent needs the real rows or keys, but the original file format spends too many tokens on repeated notation.

Repetitive structured files

Exports, logs, benchmark data, metadata, and tables where brackets, keys, quoting, and delimiters dominate the read.

Exact follow-up work

Tasks where the model may need a specific row, field, or value later. A prose summary would be too lossy.

Local agent workflows

Codex, Claude Code, Pi, Hermes, MCP, OpenClaw, and generic agents can stay on their normal file-read path.

Not semantic compression

It does not decide what matters. The selected representation must decode to the same parsed rows and keys.

Not arbitrary shell rewriting

jq, grep, head, pipes, flags, and raw-shell intent stay untouched.

Not a quality claim

The repo proves deterministic savings and equality checks. Answer parity across model families has not yet been established.
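The "not arbitrary shell rewriting" boundary could be enforced with a conservative guard like the one below. This is a hypothetical sketch, not the repo's hook logic: `simple_cat_target`, `SUPPORTED_SUFFIXES`, and `SHELL_SYNTAX` are names invented for illustration, and the real adapter may accept or reject different commands.

```python
import shlex
from pathlib import PurePosixPath
from typing import Optional

# Assumption: the supported formats map to these file extensions.
SUPPORTED_SUFFIXES = {".json", ".jsonl", ".csv", ".tsv"}

# Any of these characters suggests pipes, redirects, globs, or substitution.
SHELL_SYNTAX = set("|&;<>$`*?(){}[]~!\n")


def simple_cat_target(command: str) -> Optional[str]:
    """Return the file path if the command is exactly `cat <file>`, else None."""
    if any(ch in SHELL_SYNTAX for ch in command):
        return None  # leave anything with shell syntax untouched
    try:
        argv = shlex.split(command)
    except ValueError:
        return None  # unbalanced quotes and similar
    if len(argv) != 2 or argv[0] != "cat":
        return None  # other tools, extra flags, or multiple files
    path = argv[1]
    if path.startswith("-"):
        return None  # a flag, not a file
    if PurePosixPath(path).suffix.lower() not in SUPPORTED_SUFFIXES:
        return None  # unsupported format
    return path


print(simple_cat_target("cat data/export.json"))
print(simple_cat_target("cat data.json | jq ."))
print(simple_cat_target("head -n 5 data.csv"))
```

The guard errs on the side of doing nothing: a file name containing a shell metacharacter is also left alone, which costs a missed optimization but never changes what a command means.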

how it works

A selector, a verifier, and a fallback.

Adapters stay thin. They ask the selector for a decision, then read only the verified path. If the proof fails, the agent receives the original file.
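A thin adapter of this kind might look like the sketch below. It assumes a selector callable that returns a mapping with a `read_path` entry and a boolean `verified` flag. The `verified` key, the `read_for_agent` function, and the file names are hypothetical and not taken from the repo.

```python
import tempfile
from pathlib import Path
from typing import Callable, Mapping


def read_for_agent(raw_path: Path, select: Callable[[Path], Mapping]) -> str:
    """Return the verified sidecar text, or the original file on any doubt."""
    try:
        decision = select(raw_path)
        read_path = Path(decision["read_path"])
        if decision.get("verified") is True and read_path.is_file():
            return read_path.read_text(encoding="utf-8")
    except Exception:
        pass  # any selector failure falls through to the raw file
    return raw_path.read_text(encoding="utf-8")


def broken_selector(path: Path) -> Mapping:
    raise RuntimeError("selector crashed")


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "rows.json"
    sidecar = Path(tmp) / "rows.sidecar.txt"
    raw.write_text('[{"id": 1}, {"id": 2}]', encoding="utf-8")
    sidecar.write_text("id\n1\n2\n", encoding="utf-8")

    verified = read_for_agent(
        raw, lambda p: {"read_path": str(sidecar), "verified": True}
    )
    unverified = read_for_agent(
        raw, lambda p: {"read_path": str(sidecar), "verified": False}
    )
    crashed = read_for_agent(raw, broken_selector)

print(verified == "id\n1\n2\n", unverified == crashed)
```

The adapter contains no compression logic of its own. Every failure mode (an unverified decision, a missing sidecar, a crashing selector) ends with the agent reading the original file.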

01 Catch a whole-file read

The adapter handles supported local JSON, JSONL, CSV, and TSV reads.

02 Try reversible candidates

The selector counts model tokens for safe views and benchmark-only advanced views.

03 Verify parsed equality

A sidecar wins only when it decodes back to the same parsed source data.

04 Return sidecar or raw

The report records hashes, token counts, selection policy, and the trusted read_path.
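The four steps can be sketched end to end for one simple case: a JSON array of uniform objects rewritten as a keys-plus-rows view. Everything here is an illustrative assumption rather than the repo's implementation. `rough_tokens` is a crude stand-in for a real model tokenizer, the report field names are invented, and the sketch returns text directly instead of writing a sidecar file and a `read_path`.

```python
import hashlib
import json
import re


def rough_tokens(text: str) -> int:
    """Stand-in token counter: word runs plus individual punctuation marks."""
    return len(re.findall(r"\w+|[^\w\s]", text))


def encode_table(records) -> str:
    """Candidate view: store the keys once, then one value list per record."""
    keys = list(records[0])
    if any(list(r) != keys for r in records):
        raise ValueError("records do not share the same keys in the same order")
    rows = [[r[k] for k in keys] for r in records]
    return json.dumps({"keys": keys, "rows": rows}, separators=(",", ":"))


def decode_table(text: str):
    """Invert encode_table back into a list of objects."""
    view = json.loads(text)
    return [dict(zip(view["keys"], row)) for row in view["rows"]]


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select(raw_text: str) -> dict:
    """Pick the candidate only if it round-trips and costs fewer tokens."""
    source = json.loads(raw_text)
    report = {
        "selected": "raw",
        "raw_sha256": sha256(raw_text),
        "raw_tokens": rough_tokens(raw_text),
        "optimized_tokens": rough_tokens(raw_text),
        "read_text": raw_text,
    }
    try:
        candidate = encode_table(source)
    except (ValueError, TypeError, LookupError, AttributeError):
        return report  # data does not fit this view: raw fallback
    round_trips = decode_table(candidate) == source
    candidate_tokens = rough_tokens(candidate)
    if round_trips and candidate_tokens < report["raw_tokens"]:
        report.update(
            selected="table",
            optimized_tokens=candidate_tokens,
            optimized_sha256=sha256(candidate),
            read_text=candidate,
        )
    return report


records = [{"id": i, "name": f"user{i}", "active": i % 2 == 0} for i in range(50)]
raw_text = json.dumps(records, indent=2)
report = select(raw_text)
mixed_report = select(json.dumps([{"a": 1}, {"b": 2}]))

print(report["selected"], report["raw_tokens"], report["optimized_tokens"])
print(mixed_report["selected"])
```

The equality check compares parsed data, not text, and it runs on every candidate before that candidate can be selected. A candidate that is smaller but fails the check is never returned.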

install

Free. MIT open source. Keep the fallback on.

Clone it, point it at your worst file, and keep the raw fallback on. Start with a local checkout, run the selector, then wire the matching adapter into your runtime.

Local checkout

No hosted service is required. The selector runs on local files and writes sidecars under the project cache.

git clone https://github.com/saminkhan1/context-compression
cd context-compression
python3 -m venv .venv
.venv/bin/python -m pip install -r requirements.txt
chmod +x run-hook.sh

Codex hook

Use the Bash PreToolUse hook for simple whole-file cat reads. Runtime hooks default to the safer candidate tier.

[features]
hooks = true

[[hooks.PreToolUse]]
matcher = "Bash"

[[hooks.PreToolUse.hooks]]
type = "command"
command = "/absolute/path/to/context-compression/run-hook.sh"

Manual proof

Run the selector directly when you want the full JSON report before enabling transparent rewrites.

.venv/bin/python selector.py \
  --cwd "$PWD" \
  --model gpt-5.4-mini \
  --adapter manual \
  --include-candidates \
  --verify-report \
  sample-repetitive.json

Evidence gate

Use the repo checks before repeating benchmark or product claims on your own data.

.venv/bin/python -m unittest discover -s tests
.venv/bin/python scripts/verify_evidence.py --full-tests
python3 scripts/verify_clean_install.py