Retrieval at Scale | Drop for 2026-03-29

TL;DR

Three noteworthy updates since your Mar 19 drop: (1) SPLADE‑Code introduces learned‑sparse retrievers tailored for code, reporting state‑of‑the‑art code retrieval under 1B params with sub‑ms inverted‑index latency; (2) SuperKMeans claims 4–7× faster k‑means on CPU (vs FAISS) and up to 4× on GPU (vs cuVS) for vector index training, with comparable centroid quality; (3) GraphER proposes graph‑based enrichment and reranking that plugs into standard vector stores (no KG needed) to lift multi‑hop/fragmented‑evidence RAG.

SPLADE‑Code: learned‑sparse retrieval purpose‑built for code

Key facts and current state of the topic
- Code search is still dominated by dense retrievers; learned sparse retrieval (LSR, e.g., SPLADE) is attractive for CPU‑friendly inverted‑index serving but remains underexplored for code.
Important context and background information
- Code has long tokens, multilingual identifiers, and domain drift, all of which strain both sparsity and latency budgets in production systems.
Recent developments or changes
- New work introduces SPLADE‑Code (600M–8B params), reporting an MTEB‑Code score of 75.4 with a model under 1B params and 79.0 at 8B, along with sub‑millisecond query latency on 1M‑passage indices. If you maintain lexical/LSR tiers for developer or policy/code assets, this is a strong candidate to A/B against dense baselines. (arxiv.org)
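
To make the inverted-index serving path concrete, here is a minimal sketch of how learned-sparse vectors are typically scored. It is not SPLADE-Code's implementation: the term weights below are hand-written placeholders standing in for model output, and the names `build_inverted_index` and `search` are hypothetical. The point is only that a query touches the postings lists of its own non-zero terms, which is what keeps CPU latency low.

```python
from collections import defaultdict

def build_inverted_index(doc_vectors):
    """Map each term to a postings list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, vec in doc_vectors.items():
        for term, weight in vec.items():
            if weight > 0:  # sparse: only non-zero terms are stored
                index[term].append((doc_id, weight))
    return index

def search(index, query_vector, top_k=3):
    """Score documents by sparse dot product, visiting only query terms."""
    scores = defaultdict(float)
    for term, q_weight in query_vector.items():
        for doc_id, d_weight in index.get(term, ()):
            scores[doc_id] += q_weight * d_weight
    # Highest score first; ties broken by doc_id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

# Placeholder weights (assumption): a real LSR model would produce these.
docs = {
    "parse_json.py": {"json": 1.8, "parse": 1.5, "load": 0.6},
    "http_client.py": {"http": 1.9, "request": 1.4, "json": 0.4},
    "csv_reader.py": {"csv": 2.0, "parse": 0.9, "read": 0.7},
}
index = build_inverted_index(docs)
query = {"parse": 1.0, "json": 1.2}
results = search(index, query)
for doc_id, score in results:
    print(f"{doc_id}\t{score:.2f}")
```

In an A/B against a dense baseline, the quantity to watch alongside quality is the number of non-zero terms per query and per document, since postings-list length drives the cost of the inner loop above.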

SuperKMeans: faster k‑means for building vector indexes

Key facts and current state of the topic
- K‑means underpins IVF‑style partitioning and many quantizers; build time at billion scale often gates freshness and reindexing SLAs.
Important context and background information
- FAISS and cuVS are common baselines for centroid training; faster clustering directly accelerates IVF/RQ/PQ index builds.
Recent developments or changes
- “SuperKMeans” reports up to 7× faster CPU clustering than FAISS/Scikit‑Learn and up to 4× faster GPU clustering vs cuVS, with comparable centroid quality on vector‑search tasks. Consider evaluating as a drop‑in for IVF/RQ codebook training to shorten index‑build windows. (arxiv.org)
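
For readers who want to see where k-means sits in an IVF build, the following is a minimal sketch using plain Lloyd's iterations in pure Python. It is not SuperKMeans, FAISS, or cuVS; the function names (`kmeans`, `build_ivf`, `ivf_search`) and the toy data are assumptions for illustration. The training loop compares every vector to every centroid on each iteration, which is the step a faster clustering routine would shorten.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], v))

def kmeans(vectors, k, iters=10, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            buckets[nearest(centroids, v)].append(v)
        for i, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids

def build_ivf(vectors, centroids):
    """Inverted lists: centroid index -> ids of the vectors assigned to it."""
    lists = {i: [] for i in range(len(centroids))}
    for vid, v in enumerate(vectors):
        lists[nearest(centroids, v)].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, nprobe=1):
    """Scan only the nprobe closest partitions instead of the whole set."""
    order = sorted(range(len(centroids)), key=lambda i: sq_dist(centroids[i], query))
    candidates = [vid for i in order[:nprobe] for vid in lists[i]]
    return min(candidates, key=lambda vid: sq_dist(vectors[vid], query))

# Toy data (assumption): two well-separated 2-D blobs.
vectors = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids = kmeans(vectors, k=2)
lists = build_ivf(vectors, centroids)
best = ivf_search((9.5, 10.2), vectors, centroids, lists)
print(sorted(centroids), best)
```

When evaluating a replacement trainer, holding `build_ivf` and `ivf_search` fixed and swapping only the `kmeans` step gives a clean comparison of build time and downstream recall.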

GraphER: graph‑based enrichment and reranking that fits standard vector stores

Key facts and current state of the topic
- RAG on complex questions often needs evidence aggregation beyond nearest neighbors; heavy knowledge‑graph builds are costly and brittle.
Important context and background information
- Prior graph‑augmented RAG improves multi‑hop QA but typically requires curated KGs or bespoke indices that complicate ops.
Recent developments or changes
- GraphER proposes offline object‑level enrichment plus lightweight graph‑style reranking at query time. It requires no KG and is retriever‑agnostic; the authors report gains across multiple retrieval benchmarks with negligible latency overhead. Useful for multi‑hop/fragmented‑signal queries without replatforming your candidate store. (arxiv.org)
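
The summary above does not spell out GraphER's scoring rule, so the following is a generic sketch of graph-style reranking over retrieved candidates, not the authors' method. It assumes that offline enrichment has attached a set of attributes to each object (the `enriched` dict here is made up), links candidates that share an attribute, and blends each retriever score with the mean score of its neighbors. The names `build_graph`, `graph_rerank`, and the `alpha` parameter are hypothetical.

```python
def build_graph(enriched):
    """Link candidates that share at least one enrichment attribute."""
    ids = list(enriched)
    graph = {i: set() for i in ids}
    for pos, a in enumerate(ids):
        for b in ids[pos + 1:]:
            if enriched[a] & enriched[b]:  # non-empty set intersection
                graph[a].add(b)
                graph[b].add(a)
    return graph

def graph_rerank(scores, graph, alpha=0.5):
    """Blend each retriever score with the mean score of its graph neighbors."""
    reranked = {}
    for doc, score in scores.items():
        neighbor_scores = [scores[n] for n in graph.get(doc, ()) if n in scores]
        # Candidates with no neighbors get zero support (a design choice here).
        support = sum(neighbor_scores) / len(neighbor_scores) if neighbor_scores else 0.0
        reranked[doc] = (1 - alpha) * score + alpha * support
    # Highest blended score first; ties broken by id for determinism.
    return sorted(reranked.items(), key=lambda kv: (-kv[1], kv[0]))

# Made-up enrichment attributes and first-stage retriever scores (assumptions).
enriched = {
    "a": {"acme", "merger"},
    "b": {"merger", "2024"},
    "c": {"weather"},
    "d": {"acme"},
}
scores = {"a": 0.9, "b": 0.5, "c": 0.6, "d": 0.4}

graph = build_graph(enriched)
ranked = graph_rerank(scores, graph)
print([doc for doc, _ in ranked])
```

In this toy run the isolated candidate "c" falls below connected candidates it originally outranked, which is the behavior you would want when evidence is spread across several related chunks. Because the sketch only needs candidate ids, scores, and stored attributes, it sits after any first-stage retriever without changing the vector store.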