open format specification v1.0

.codeindex

An open format for structured code documentation.
Committed to git, shipped in packages, queried by AI agents.

.codeindex
├── index.json
├── files.jsonl
├── symbols.jsonl
└── texts.jsonl

AI agents waste tokens finding code

They grep, read files, grep again, backtrack. Thousands of tokens burned locating a function — or worse, they miss it and hallucinate.

Without .codeindex
grep -r "class Config" src/
cat src/main.py | head -50
grep -r "def __init__" src/main.py
cat src/main.py
... 4 tool calls, ~3000 tokens
With .codeindex
search("Config", scope: ["symbol"])
→ Config · class · src/main.py:22-45
→ Config.__init__ · method · :23-30
  sig: def __init__(self, path: str, debug: bool = False)
... 1 tool call, ~200 tokens

Build and query with codeix

Index your codebase in seconds. Your AI agent finds the right function on the first try — no scanning, no guessing, no wasted tokens.

terminal
codeix build               # parse source, write .codeindex
codeix serve               # MCP server + file watcher
codeix -r ~/project build  # build from a specific dir

MCP Tools

discovery
explore
Project structure: metadata, subprojects, and files grouped by directory.
search
search
Unified full-text search across symbols, files, and texts. FTS5, BM25-ranked, with scope/kind/path filters.
lookup
get_file_symbols
List all symbols in a file, ordered by line number.
lookup
get_children
Get direct children of a class, struct, or module.
graph
get_callers
Find all call sites and references to a symbol.
graph
get_callees
Find all symbols that a function calls or references.

Languages

PythonRustJavaScriptTypeScriptGoJavaCC++RubyC#VueSvelteAstroHTML

Ship your index with your package

Include .codeindex in your package and every developer who depends on you gets instant navigation of your API — no setup, no re-indexing.

Author
Build the index
Run once on your source tree. Deterministic output, clean diffs.
codeix build
Publish
Ship with your code
Commit .codeindex to your repo or include it in your published package.
.codeindex
Consumer
Query instantly
AI agents navigate your API without parsing a single file.
search()
Git repos
npm
PyPI
crates.io

An open format anyone can read

Plain JSONL. Git-friendly diffs. No binary blobs, no proprietary databases. Any tool that can parse JSON can consume a .codeindex.

language-agnostictool-agnosticdeterministicgit-friendlyhuman-readable
symbols.jsonl
"file""src/main.py""name""os""kind""import""line"11"file""src/main.py""name""Config""kind""class""line"2245"file""src/main.py""name""Config.__init__""kind""method""line"2330"parent""Config""sig""def __init__(self, path: str, debug: bool = False)""file""src/main.py""name""main""kind""function""line"4860"sig""def main(args: list[str]) -> int"
index.json
Manifest: version, project name, detected languages
files.jsonl
One line per source file: path, language, hash, line count
symbols.jsonl
Functions, classes, imports — with signatures and parent relationships
texts.jsonl
Comments, docstrings, string literals — searchable prose

Install

All channels install the same single binary. No runtime dependencies.

install
npx codeix            # npm
uvx codeix            # Python
cargo install codeix  # Rust
brew install codeix   # Homebrew
.mcp.json
  "mcpServers"    "codeindex"      "command" "npx"      "args""codeix"