System Overview
prx is a single Rust binary with a busybox-style architecture. Every subcommand shares common infrastructure — tree-sitter parsing, token counting, JSON output, content hashing — but each command is a self-contained module. The binary can be invoked as prx <subcommand> or via hardlinks named after each subcommand.
Binary Architecture
prx uses clap::Command::multicall(true) to dispatch subcommands. This means the same binary can be invoked as prx search or as a hardlink named prx-search — both routes hit the same handler.
Subcommand dispatch goes through a Rust enum:
#![allow(unused)]
fn main() {
enum Commands {
Search(SearchArgs),
Read(ReadArgs),
Find(FindArgs),
Edit(EditArgs),
Diff(DiffArgs),
// ...
}
}
Each command lives in src/commands/ as its own module. Shared infrastructure lives in the src/ root modules, imported by any command that needs it.
Module Layout
src/
├── main.rs # CLI entry point, clap dispatch
├── lib.rs # Library surface (public API)
├── output.rs # JSON envelope, error formatting
├── tokens.rs # Token counting (tokenizers crate)
├── hash.rs # Content hashing (xxh3)
├── walk.rs # File walking (ignore crate)
├── workspace.rs # Shared utilities
├── fallback.rs # Graceful fallback to Unix tools
│
├── commands/ # Subcommand handlers
│ ├── search.rs # prx search
│ ├── read.rs # prx read
│ ├── find.rs # prx find
│ ├── edit.rs # prx edit
│ ├── diff.rs # prx diff
│ ├── batch.rs # prx batch
│ ├── context.rs # prx context
│ ├── impact.rs # prx impact
│ ├── index.rs # prx index
│ ├── init.rs # prx init
│ ├── mcp.rs # prx mcp
│ ├── outline.rs # prx outline
│ ├── exists.rs # prx exists
│ ├── stats.rs # prx stats
│ └── run.rs # prx run
│
├── search/ # Search engine
│ ├── fusion.rs # RRF fusion, adaptive alpha
│ ├── graph.rs # Import graph
│ ├── semantic.rs # Model2Vec embedding search
│ ├── literal.rs # Regex/literal search
│ ├── structural.rs # ast-grep pattern search
│ ├── tokenize.rs # Identifier tokenization
│ └── symbols.rs # Symbol index
│
├── chunking/ # Code chunking
│ └── treesitter.rs # Tree-sitter AST chunking
│
├── ranking/ # Result ranking
│ ├── boosting.rs # Definition boost, stem matching, coherence
│ ├── penalties.rs # Noise penalties, saturation decay
│ ├── proximity.rs # Import graph proximity boost
│ └── weighting.rs # Alpha weight resolution
│
├── index/ # Index management
│ ├── dense.rs # Model2Vec embeddings
│ ├── sparse.rs # BM25 sparse matrix
│ └── bloom.rs # Bloom filter for exists
│
├── parsing/ # Tree-sitter integration
│ ├── imports.rs # Import extraction (10 language families)
│ ├── languages.rs # Language detection, grammar loading
│ ├── outline.rs # Symbol extraction
│ ├── snap.rs # Structural snapping
│ └── strip.rs # Comment stripping
│
└── runner/ # prx run parsers
├── mod.rs # Runner framework, tool detection
├── cargo_test.rs
├── pytest.rs
├── go_test.rs
└── ... # 22 parsers total
Shared Infrastructure
Tree-sitter Parsing (src/parsing/)
AST parsing for 15 languages, with grammars compiled directly into the binary. No runtime grammar loading. Tree-sitter powers chunking, --snap, --skeleton, --outline, syntax validation, structural search, and import extraction. Language grammars are C code compiled via the cc crate at build time.
Token Counting (src/tokens.rs)
Two modes: fast (byte_count / 4) for general use, and exact (cl100k_base tokenizer) when --budget is active. The tokenizer vocabulary is embedded via include_bytes! and loaded lazily on first use. Commands select results greedily until the token budget is exhausted.
JSON Output (src/output.rs)
Every command returns a standardized JSON envelope. Errors go to stdout as structured JSON — never to stderr. The --plain flag bypasses the envelope for human-readable output. Command handlers never write to stdout directly; all output goes through this module.
Content Hashing (src/hash.rs)
xxh3 128-bit hashing via the xxhash-rust crate. Runs at ~30 GB/s, making it cheaper to recompute than to cache. Every response that includes file content includes a hash, enabling agents to skip re-reads when nothing has changed.
File Walking (src/walk.rs)
Built on the ignore crate (from ripgrep). Respects .gitignore and .prxignore. Skips binary files (null byte in first 8KB) and files over 1MB. Used by search, find, and index commands.
Data Flow
A typical search query follows this path:
- CLI parses args, dispatches to
Commands::Search - File walker discovers files, respecting
.gitignore - Tree-sitter chunks each file (1500-char, syntax-aware boundaries)
- If semantic mode: embed chunks via Model2Vec (lookup + mean pool + normalize)
- If semantic mode: embed query, run cosine similarity against chunk vectors
- If literal mode: regex match against chunk text
- BM25 scores computed (if hybrid or sparse mode)
- RRF fusion combines scores from active retrievers
- Reranking pipeline applies boosts and penalties
- Budget enforcement selects top results greedily until token limit is reached
- Results serialized as JSON and written to stdout
Import Graph and Project Intelligence
The import graph (search/graph.rs) captures file-level dependency edges extracted via tree-sitter AST queries across 10 language families. Edges are resolved by suffix matching with proximity-based disambiguation. The graph is persisted as imports.bin.
Two commands consume the import graph:
prx contextassembles a module context package: stats, documentation, entrypoints, file skeletons, and 1-hop import edges.prx impactwalks the import graph backwards to find dependents. Supports symbol-level narrowing.
Both commands work without a persisted index, building the graph on-the-fly with a warning.
MCP Server (src/commands/mcp.rs)
Compiled in by default (controlled by the mcp Cargo feature). Exposes all prx tools as MCP tools over stdio transport using the rmcp crate. Async runtime via tokio, linked only when the mcp feature is active. The core binary without mcp or watch is fully synchronous.
Feature Flags
| Feature | Dependencies | Purpose |
|---|---|---|
default | ["mcp"] | Includes MCP server by default |
mcp | rmcp, tokio | MCP stdio server |
watch | notify, tokio | File watching for persistent index |
Key Architectural Decisions
These decisions are settled. They reflect deliberate tradeoffs, not defaults.
| # | Decision | Rationale |
|---|---|---|
| 1 | Single binary, busybox-style | clap multicall. prx search or hardlink prx-search. Zero install friction — download one file, run it. |
| 2 | Model weights embedded in binary | include_bytes! with float16 potion-retrieval-32M model (~32 MB). No internet required, works in sandboxes and air-gapped environments. |
| 3 | Pure Rust Model2Vec inference | No ONNX Runtime dependency. Inference is tokenize + lookup + mean pool + normalize (~50 lines). ONNX Runtime dropped x86_64 macOS support; pure Rust works everywhere. |
| 4 | JSON output by default | Agents parse structured data, not column-aligned text. --plain flag for human fallback. Errors in stdout, never stderr. |
| 5 | Tree-sitter for structural code parsing | Powers chunking, –snap, –skeleton, –outline, syntax validation, structural search. Import extraction uses tree-sitter AST queries (10 language families). No LSP server required. |
| 6 | Token budgets, not truncation | --budget N returns the best N tokens of results, ranked by relevance. Not head -N arbitrary cutoff. |
| 7 | Dry-run edits by default | prx edit previews changes. --apply commits. Agents see what will change before it happens. |
| 8 | Content hashes in every response | Enables cheap “has this changed?” checks. Eliminates ~50% of redundant file re-reads. |
| 9 | No daemon for basic usage | All commands work statelessly. Optional prx index --watch for warm caching. |
| 10 | 6-stage reranking pipeline | Definition boost, stem matching, file coherence, import graph proximity, noise penalties, saturation decay. Quality comes from ranking, not just retrieval. |
| 11 | BM25 with compound identifier tokenization | camelCase/snake_case splitting without stemming. Code identifiers are semantically distinct — “HTTPResponse” and “HTTP” mean different things. |
| 12 | RRF fusion with adaptive alpha | Symbol queries (Foo::bar) lean BM25 (alpha=0.3). Natural language queries stay balanced (alpha=0.5). Auto-detected. |
| 13 | Parallel indexing via rayon | All 5 indexing stages run in parallel. No shared mutable state, no Arc, no Mutex — pure par_iter on thread-safe immutable data. 7.6x speedup on 10-core (11K files: 410s → 54s). |
| 14 | Zero-copy memory-mapped embeddings | embeddings.bin is mmap’d via memmap2 and cast to &[f32] with bytemuck::cast_slice (zero allocation, zero deserialization). OS page cache keeps index warm across queries. Falls back to owned Array2<f32> if mmap fails. |
Error Handling
All errors are written to stdout as structured JSON:
{
"version": "0.2.0",
"command": "read",
"status": "error",
"error": {
"code": "file_not_found",
"message": "File not found: src/auth.ts",
"suggestion": "Use `prx find` to discover files."
}
}
stderr is reserved for RUST_LOG debug logging only. Exit codes: 0 for success, 1 for errors, 2 for usage errors.
When prx fails internally, the fallback system catches the error, runs the equivalent Unix tool, and returns results in the same JSON envelope with "fallback": true.