What Is semble_rs?
semble_rs is a Rust CLI built by johunsang for AI coding agents. It is one of the stronger tools in that category for agentic IDE users who need semantic code search, repo maps, and log compression. The project claims up to 99% token reduction, a 747x shrink on `tree` output, and a single binary that runs on CPU with no daemon, no API key, and no GPU.
Quick Overview
| Attribute | Details |
|---|---|
| Type | AI Coding Agents |
| Best For | Agentic IDE users |
| Language/Stack | Rust 1.75+, tree-sitter, BM25, Model2Vec static embeddings |
| License | MIT |
| GitHub Stars | N/A as of Feb 2026 |
| Pricing | Open-Source |
| Last Release | N/A |
Who Should Use semble_rs?
- Solo developers using Claude Code, Codex, or Cursor who need to ask intent-based questions about a codebase without setting up a hosted index.
- Platform and backend engineers working in Rust, Python, or JavaScript monorepos who want `deps` and `impact` before editing shared modules.
- CI owners and release engineers who want to compress build and test output into something a model or human can scan fast.
- Korean-speaking teams that care about code search across English and 한글 project names, comments, or identifiers.
Not ideal for:
- Teams that need a multi-repo SaaS search layer with enterprise permissions and org-wide analytics.
- Users who only need exact-text search and are already happy with `ripgrep` or `grep`.
- Teams that do not want to download a local model or run commands against a checked-out repository.
Key Features of semble_rs
- Hybrid retrieval: `semble_rs` combines BM25 with Model2Vec static embeddings and merges candidates with RRF (reciprocal rank fusion). It then reranks with definition hits, identifier-stem matching, file coherence, and noise penalties, which is why it handles intent queries better than literal search.
- AST chunking with tree-sitter: Code is split on syntax boundaries instead of arbitrary line counts. That keeps functions, classes, and related declarations together, which reduces the chance that an agent gets a half-broken snippet.
- Gitignore-aware repo tree: `tree` prints a compact codebase map using the same index as search, so `.git/`, `target/`, and `node_modules/` do not explode context. The repo reports reductions from 9x to 747x depending on project size.
- Build and CI log digesting: `digest` auto-detects cargo, pnpm, npm, yarn, bun, tsc, pytest, go test, Gradle, ruff, mypy, clang, gcc, cmake, make, swiftc, and GitHub Actions. It preserves failures, tracebacks, file locations, and panic stacks while collapsing repetitive progress noise.
- Dependency and impact analysis: `deps` shows what a file imports and defines, while `impact` shows what is likely to be affected if that file changes. Optional Graphviz `--dot` output is useful when you want a visual graph instead of raw text.
- Agent-friendly output modes: `--outline`, `--group`, `--compact`, `--json`, and `--json --strip` let you tune for token budget, precision, or pipeline integration. The docs explicitly recommend `--outline` first, `--compact` second, and JSON only when the chunk body is required.
- Local-only runtime: `semble_rs` ships as a single Rust binary and runs on macOS, Linux, and Windows. On first run it downloads the default `minishlab/potion-code-16M` embedder, which is about 60 MB.
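To make the AST chunking feature above concrete, here is a minimal sketch of splitting code on syntax boundaries. It is not how semble_rs does it: the tool uses tree-sitter, while this example swaps in Python's standard-library `ast` module so it runs without extra dependencies. The `chunk_by_syntax` helper and the sample source are hypothetical.

```python
import ast

# Hypothetical sample file; names such as issue_token are never resolved,
# because ast.parse only checks syntax.
SOURCE = '''\
import os

def login(user, password):
    token = issue_token(user)
    return token

class Session:
    def __init__(self, token):
        self.token = token

    def is_valid(self):
        return bool(self.token)
'''


def chunk_by_syntax(source: str) -> list[dict]:
    """Return one chunk per top-level function or class definition."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(
                {"name": node.name, "start": node.lineno,
                 "end": node.end_lineno, "text": text}
            )
    return chunks


chunks = chunk_by_syntax(SOURCE)
for chunk in chunks:
    print(f'{chunk["name"]}: lines {chunk["start"]}-{chunk["end"]}')
```

Each chunk holds a whole definition, so the `Session` class stays together with both of its methods instead of being cut at an arbitrary line count.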
semble_rs vs Alternatives
| Tool | Best For | Key Differentiator | Pricing |
|---|---|---|---|
| semble_rs | AI agents that need semantic code search plus log compression | Hybrid BM25 + embeddings, AST chunking, tree, digest, deps, and impact in one local binary | Open-Source |
| ripgrep | Fast exact-text and regex search | Blazing-fast literal search, but no semantic ranking or repo summarization | Open-Source |
| ast-grep | Syntax-aware search and codemods | Structural pattern matching for source transformations, not agent context packing | Open-Source |
| Sourcegraph | Enterprise cross-repo code intelligence | Hosted indexing, org-scale permissions, and multi-repo search | Paid |
Pick ripgrep when you already know the string or regex and just want the fastest possible grep replacement. Pick ast-grep when you need structural code rewrites or AST-based queries rather than a semantic agent loop.
Use Sourcegraph when your problem is cross-repository discovery with enterprise access control, not local token reduction. Use semble_rs when you want the shortest path from question to relevant chunks, and pair it with Claude Context Mode when you need tighter prompt shaping for Claude-specific workflows.
If you run multiple agents in parallel, OpenSwarm is the better coordination layer, while semble_rs supplies the repo-grounded facts those agents read. That combination works well when one process plans, another inspects dependencies, and a third compresses CI output.
How semble_rs Works
semble_rs starts by indexing a local repository with gitignore-aware rules and parsing supported languages with tree-sitter. It stores syntax-aware chunks, then scores them with a hybrid retrieval pipeline that mixes BM25 with Model2Vec static embeddings before fusing candidates using RRF.
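The fusion step can be illustrated with a short sketch of reciprocal rank fusion. This is the textbook formula, not code taken from semble_rs: the constant `k=60` is a commonly used default and an assumption here, and the file names are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists; each hit contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by name for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical candidate lists from a lexical and a semantic retriever.
bm25_ranking = ["auth/login.rs", "auth/token.rs", "docs/README.md"]
embedding_ranking = ["auth/session.rs", "auth/login.rs", "auth/token.rs"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

Because RRF only looks at rank positions, the BM25 and embedding scores never need to be normalized onto a common scale. Files that both retrievers surface, like `auth/login.rs` here, rise above files that only one of them found.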
The design choice that matters most is the static embedder. There is no transformer forward pass at query time, so latency stays low on CPU and the binary can operate inside a normal developer shell instead of a service cluster. The repo reports about 150 ms for a 22-file project and about 10 s for 1,600 files, which is good enough for iterative agent loops.
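A toy sketch shows why a static embedder is cheap at query time: embedding a text reduces to a table lookup plus an average. The three-dimensional vectors below are invented for illustration, and mean pooling is assumed as the combination step; a real static model ships a much larger learned table.

```python
import math

# Hypothetical token vectors, invented for this example.
TOKEN_VECTORS = {
    "auth": [0.9, 0.1, 0.0],
    "login": [0.8, 0.2, 0.1],
    "token": [0.7, 0.3, 0.0],
    "render": [0.0, 0.1, 0.9],
    "button": [0.1, 0.0, 0.8],
}


def embed(text):
    """Average the vectors of known tokens; no neural network is run."""
    vectors = [TOKEN_VECTORS[t] for t in text.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(3)]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = embed("auth login")
chunks = {"login_handler": "login token", "ui_widget": "render button"}
scores = {name: cosine(query, embed(text)) for name, text in chunks.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

The query never mentions `token`, yet `login_handler` still scores far above `ui_widget` because its averaged vector points in a similar direction.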
The search pipeline is tuned for code, not generic prose. Reranking favors exact symbol definitions, identifier stems, and file coherence, while noise penalties reduce the chance that comments or accidental keyword matches dominate the result set.
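The reranking idea can be sketched in a few lines. The bonus, penalty, and suffix list below are hypothetical values chosen for the example, not the weights semble_rs uses, and the sketch covers only definition hits and a noise penalty, leaving out identifier stems and file coherence.

```python
import re

# Hypothetical weights, for illustration only.
DEFINITION_BONUS = 0.5
NOISE_PENALTY = 0.3
NOISY_SUFFIXES = (".md", ".lock", ".txt")


def rerank(candidates, query_terms):
    """Boost chunks that define a queried symbol; demote noisy file types."""
    rescored = []
    for path, text, base_score in candidates:
        score = base_score
        for term in query_terms:
            # Matches e.g. "fn login_user" or "class LoginForm".
            pattern = rf"\b(?:fn|def|class|struct)\s+\w*{re.escape(term)}\w*"
            if re.search(pattern, text):
                score += DEFINITION_BONUS
        if path.endswith(NOISY_SUFFIXES):
            score -= NOISE_PENALTY
        rescored.append((path, score))
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Made-up candidates: the README starts with the higher retrieval score.
candidates = [
    ("README.md", "login instructions for new users", 0.45),
    ("src/auth.rs", "fn login_user(name: &str) -> Token { todo!() }", 0.40),
]
ranked = rerank(candidates, ["login"])
print(ranked)
```

The README mentions the keyword but defines nothing, so the chunk containing the actual definition overtakes it after reranking.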
```bash
git clone https://github.com/johunsang/semble_rs.git
cd semble_rs
cargo install --path .
semble_rs search 'auth flow' ./my-project --outline
semble_rs tree ./my-project --symbols
cargo build 2>&1 | semble_rs digest
```
The first three lines clone the repository and install the binary locally. The `search` and `tree` commands show the two views an agent typically needs: ranked chunks for a question and a symbol-level map of the repo. The final line demonstrates the `digest` pipeline, which is the part that saves the most tokens when build output turns into wall-of-text failure logs.
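The principle behind log digesting, keeping failure lines and collapsing everything else, can be sketched as follows. This is a simplified stand-in, not the `digest` implementation: the regular expression, the omission marker, and the sample log are all made up for the example.

```python
import re

# Hypothetical signal pattern; a real digester would be tool-specific.
FAILURE_PATTERN = re.compile(r"error|failed|panic|traceback", re.IGNORECASE)


def digest(log_text):
    """Keep lines that look like failures; replace other runs with a count."""
    kept, skipped = [], 0
    for line in log_text.splitlines():
        if FAILURE_PATTERN.search(line):
            if skipped:
                kept.append(f"... {skipped} line(s) omitted ...")
                skipped = 0
            kept.append(line)
        else:
            skipped += 1
    if skipped:
        kept.append(f"... {skipped} line(s) omitted ...")
    return "\n".join(kept)


# Made-up sample input, not real tool output.
LOG = """\
[1/4] building module a
[2/4] building module b
[3/4] building module c
error: undefined symbol 'user' at src/auth.py:12
[4/4] cleaning up
"""

summary = digest(LOG)
print(summary)
```

On a log with thousands of progress lines and a handful of errors, the same loop produces output whose size tracks the number of failures rather than the length of the build.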
Pros and Cons of semble_rs
Pros:
- Single binary, no daemon: easy to drop into local workflows, CI scripts, or agent runners without standing up infrastructure.
- Strong token savings: the docs show `tree` reductions up to 747x and `digest` reductions up to 98.9% on real GitHub Actions logs.
- Code-aware ranking: BM25, embeddings, AST chunks, and reranking together produce better results than plain `grep` on intent queries.
- Useful beyond search: `deps`, `impact`, and `digest` make it more than a search utility.
- Cross-platform support: macOS, Linux, and Windows are explicitly supported.
Cons:
- Local repository required: it is not a hosted cross-repo index, so you need a checkout or a shallow clone.
- First-run model download: the default embedding model is roughly 60 MB, which is fine for most laptops but still a startup cost.
- Semantic ranking can return leads, not facts: the `plan` command is explicitly described as a guardrail, so low-confidence results still need review.
- Agent-oriented output takes practice: `--outline`, `--group`, `--compact`, and JSON modes are useful, but the best mode depends on the task.
- Not a rewrite engine: it helps you find and compress code, but it does not replace codemod tools like `ast-grep`.
Getting Started with semble_rs
```bash
git clone https://github.com/johunsang/semble_rs.git
cd semble_rs
cargo install --path .
semble_rs tree . --symbols
semble_rs search 'how is auth handled' . --outline
```
On first run, semble_rs downloads the default `minishlab/potion-code-16M` model from Hugging Face unless you override it with `--model` or `SEMBLE_MODEL_PATH`. Start with `tree` to understand repo shape, then use `search` with `--outline` for discovery and `--compact` when you need exact matching lines.
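For agent runners that shell out to the CLI, the outline-first workflow can be wrapped in a small helper. This is a sketch under assumptions: `build_search_command` and the mode names are hypothetical, and only the subcommand and flags shown in this article are used. The helper builds the argument list without running anything, so it works even where semble_rs is not installed.

```python
import shlex

# Mode names are hypothetical; the flags are the ones described in this article.
MODE_FLAGS = {
    "outline": ["--outline"],
    "compact": ["--compact"],
    "json": ["--json"],
    "json-strip": ["--json", "--strip"],
}


def build_search_command(query, repo=".", mode="outline"):
    """Return an argv list for a search, defaulting to the cheapest mode."""
    return ["semble_rs", "search", query, repo, *MODE_FLAGS[mode]]


cmd = build_search_command("how is auth handled")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd)` avoids shell quoting problems with free-text queries. Switching `mode` to `"compact"` or `"json"` follows the escalation order the docs recommend.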
Verdict
semble_rs is the strongest option for agent-driven local code search when you need semantic retrieval, repo mapping, and CI-log compression in one binary. Its main strength is massive token reduction without a server, and its main caveat is that it depends on a local checkout plus an initial model download. Recommend it if you live inside Claude Code, Codex, or Cursor and want faster code discovery.



