Retrieval at Scale | Drop for 2026-06-05

TL;DR

Faiss v1.14.2 lands major ANN features (SuperKMeans, SVS Vamana coarse quantizer, TurboQuant CPU path, Metal backend, IVF early‑stop, cuVS filtered search), improving build speed and recall/latency levers on CPU/GPU.
Milvus 3.0-beta debuts a lake-native architecture (External Collection, Snapshots) plus multi-vector EmbList + DISKANN and richer query/TTL/text features, which looks promising for fresher, disk-oriented retrieval at scale.
Vespa’s May 2026 update adds “total‑” effort knobs, group pinning, in‑memory doc IDs, a raw‑text query operator, and easy embedding integrations—useful for predictable relevance and ops at high QPS.
FAVOR (arXiv) proposes a filter‑agnostic, selectivity‑aware HNSW method that reports 1.3–5× QPS at 95% recall across varying filter selectivity.
Elastic Stack 9.4.2 is a security patch; upgrade recommended for Lucene‑based vector/lexical stacks.

Key facts and current state of the topic
- Faiss remains a primary ANN workhorse; its recent cycles focused on verification speed (Panorama), binary/multi‑bit RaBitQ, and GPU/interop. (github.com)
Important context and background information
- Faster clustering shortens index builds, while better quantization and early termination raise the recall achievable under a fixed latency budget, especially for filtered or hybrid retrieval pipelines. (github.com)
Recent developments or changes
- 1.14.2 (May 21, 2026) adds SuperKMeans (faster k-means), support for SVS Vamana as an IVF coarse quantizer, a CPU TurboQuant path, a Metal GPU backend for IndexFlat on Apple Silicon, IVF early-stop, database-parallel flat search, persistent HNSW locks for incremental adds, pip wheels via scikit-build, and filtered search for cuVS indexes. Plan A/B tests against your current PQ/RaBitQ/IVF settings. (github.com)
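For the A/B comparison suggested above, a minimal recall@k harness might look like the sketch below. It is pure Python and does not call Faiss. The helper names (`exact_top_k`, `recall_at_k`, `mean_recall`) are hypothetical. In a real test the candidate ID lists would come from each index configuration's `index.search(queries, k)` results, and the ground truth from an exact flat index.

```python
import random

def exact_top_k(query, vectors, k):
    """Brute-force top-k by squared L2 distance; serves as ground truth."""
    dists = [(sum((q - v) ** 2 for q, v in zip(query, vec)), i)
             for i, vec in enumerate(vectors)]
    dists.sort()
    return [i for _, i in dists[:k]]

def recall_at_k(truth_ids, candidate_ids, k):
    """Fraction of the true top-k that the candidate list recovered."""
    return len(set(truth_ids[:k]) & set(candidate_ids[:k])) / k

def mean_recall(truth, candidates, k):
    """Average recall@k over a batch of queries."""
    return sum(recall_at_k(t, c, k) for t, c in zip(truth, candidates)) / len(truth)

# Toy data: a deterministic random corpus and query set (illustrative only).
rng = random.Random(0)
corpus = [[rng.random() for _ in range(8)] for _ in range(500)]
queries = [[rng.random() for _ in range(8)] for _ in range(20)]
k = 10

truth = [exact_top_k(q, corpus, k) for q in queries]

# Stand-in for a lossy index: it only "sees" the first half of the corpus.
# Replace this with the IDs returned by each configuration under test.
lossy = [exact_top_k(q, corpus[:250], k) for q in queries]

baseline_recall = mean_recall(truth, truth, k)   # exact search scores 1.0
variant_recall = mean_recall(truth, lossy, k)    # lossy stand-in scores lower
```

Comparing recall at a matched latency (or latency at a matched recall) keeps the comparison fair when a new quantizer or early-stop setting shifts both at once.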

Key facts and current state of the topic
- Milvus 3.0‑beta (May 9, 2026) extends Milvus into the data‑lake ecosystem and deepens retrieval features while remaining the kernel for Zilliz Lakebase. (github.com)
Important context and background information
- Zero-copy queries over Parquet/Iceberg reduce ETL, and snapshots give batch and serving jobs a consistent view of the data; both matter for freshness and large A/B tests. Multi-vector support matters for late-interaction and patch-level embeddings, where each entity carries a variable number of vectors. (github.com)
Recent developments or changes
- Highlights: External Collection (zero‑copy lake queries), Snapshots (MVCC‑style stable views), EmbList + DISKANN for variable‑length multi‑vectors, server‑side aggregation and multi‑field ORDER BY, nullable vectors, custom tokenizer/synonym resources, per‑entity TTL, explicit force‑merge, and a new manifest‑based Storage V3. Evaluate on lake‑resident catalogs and token‑heavy embeddings. (github.com)
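To illustrate why variable-length multi-vectors matter for late interaction, here is a minimal sketch of ColBERT-style MaxSim scoring in pure Python. It is not Milvus's API, and it is an assumption that EmbList search uses a comparable aggregation. The function names and toy data are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_vecs, doc_vecs):
    """For each query vector take its best match among the document's
    vectors, then sum those best scores (late-interaction MaxSim)."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def rank(query_vecs, docs):
    """Score every document and return (score, doc_id) pairs, best first."""
    scored = [(maxsim(query_vecs, vecs), doc_id) for doc_id, vecs in docs.items()]
    return sorted(scored, reverse=True)

# Toy example: documents carry different numbers of vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],  # three token vectors
    "b": [[1.0, 0.0]],                          # one token vector
}
ranking = rank(query, docs)
```

Because each document contributes a different number of vectors, the store has to index and retrieve lists of embeddings per entity rather than a single fixed-size vector.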

Key facts and current state of the topic
- Vespa’s May newsletter introduces features targeting retrieval quality and operational predictability for hybrid/vector stacks. (blog.vespa.ai)
Important context and background information
- Cluster‑size–independent “total‑” parameters stabilize effort as nodes scale; pinning avoids pagination drift; in‑memory IDs and a raw‑text operator simplify high‑QPS and lexical paths. (blog.vespa.ai)
Recent developments or changes
- New: “totalTargetHits/total‑max‑hits/totalKeepRankCount/totalRerankCount,” search group pinning, in‑memory document IDs, a text() operator for raw‑text matching, near‑matching‑aware ranking, plus easy embedding integrations (OpenAI, Mistral, Voyage) and index backups. Use these to keep p95/p99 and relevance steady under autoscaling. (blog.vespa.ai)
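The arithmetic behind cluster-size-independent budgets can be sketched as follows. This is an illustration only: the even split with rounding up is an assumption, not Vespa's documented behavior, and both function names are hypothetical.

```python
import math

def per_node_budget(total, nodes):
    """Hypothetical even split of a cluster-wide budget across content
    nodes, rounded up so the total is never undershot."""
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return math.ceil(total / nodes)

def cluster_effort_fixed_per_node(per_node, nodes):
    """Total work when every node uses the same fixed per-node budget."""
    return per_node * nodes

# With a fixed per-node budget, total effort grows as the cluster scales out.
fixed = [cluster_effort_fixed_per_node(100, n) for n in (2, 4, 8)]

# With a fixed cluster-wide budget, per-node effort shrinks instead.
split = [per_node_budget(1000, n) for n in (2, 4, 8)]
```

Under this assumption, autoscaling changes how the work is divided but not how much total work a query does, which is what keeps tail latency and result quality comparable across cluster sizes.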

Key facts and current state of the topic
- Filtered ANN often degrades under selective predicates; many solutions are tailored to specific filters or selectivity regimes. (arxiv.org)
Important context and background information
- FAVOR unifies selectivity estimation and execution: it reshapes distances via an exclusion metric and routes queries to brute-force search when selectivity is very low and to HNSW traversal otherwise. (arxiv.org)
Recent developments or changes
- The paper (May 8, 2026) reports 1.3–5× QPS at Recall@10 = 95% versus strong baselines across arbitrary filters. It is worth piloting against ACORN-style traversal or strict post-filtering. (arxiv.org)
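The routing idea described above can be sketched in a few lines. This is a simplified illustration, not the FAVOR algorithm: it omits the exclusion metric entirely and substitutes over-fetch plus post-filtering on the ANN path. The `threshold` value, the function names, and the `ann_search` callable (a stand-in for an HNSW index) are all assumptions.

```python
import math
import random

def l2(a, b):
    """Squared L2 distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def estimate_selectivity(metadata, predicate, sample_size=200, seed=0):
    """Estimate the fraction of items passing the filter from a sample."""
    ids = list(metadata)
    if not ids:
        return 0.0
    rng = random.Random(seed)
    sample = ids if len(ids) <= sample_size else rng.sample(ids, sample_size)
    return sum(1 for i in sample if predicate(metadata[i])) / len(sample)

def brute_force_filtered(query, vectors, metadata, predicate, k):
    """Exact search restricted to items that pass the filter."""
    hits = [(l2(query, vectors[i]), i) for i in vectors if predicate(metadata[i])]
    return [i for _, i in sorted(hits)[:k]]

def filtered_search(query, vectors, metadata, predicate, k, ann_search,
                    threshold=0.05):
    """Route to brute force when few items match, else to the ANN index."""
    sel = estimate_selectivity(metadata, predicate)
    if sel < threshold:
        return "brute_force", brute_force_filtered(query, vectors, metadata,
                                                   predicate, k)
    # Over-fetch in proportion to 1/selectivity, then post-filter.
    # This may still return fewer than k results.
    fetch = min(len(vectors), math.ceil(k / sel) * 2)
    candidates = ann_search(query, fetch)
    return "ann", [i for i in candidates if predicate(metadata[i])][:k]

def make_exact_ann(vectors):
    """Stand-in for an HNSW index: exact unfiltered nearest neighbours."""
    def search(query, n):
        return [i for _, i in sorted((l2(query, v), i)
                                     for i, v in vectors.items())[:n]]
    return search

# Toy corpus: 100 one-dimensional points; only ids 0 and 50 are "red".
vectors = {i: [float(i)] for i in range(100)}
metadata = {i: {"color": "red" if i % 50 == 0 else "blue"} for i in range(100)}
ann = make_exact_ann(vectors)

rare = filtered_search([49.0], vectors, metadata,
                       lambda m: m["color"] == "red", 1, ann)
common = filtered_search([49.0], vectors, metadata,
                         lambda m: m["color"] == "blue", 1, ann)
```

A pilot would swap `make_exact_ann` for the production index and sweep `threshold` to find where brute force stops paying off on the real filter distribution.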

Key facts and current state of the topic
- Elasticsearch underpins many hybrid (lexical + vector) systems; point releases frequently include security fixes. (elastic.co)
Important context and background information
- Keeping Lucene/Elasticsearch current reduces risk in candidate generation and ranking tiers handling user data and metadata filters. (elastic.co)
Recent developments or changes
- 9.4.2 (May 28, 2026) is a recommended upgrade addressing potential vulnerabilities; review the advisory and roll through managed/self‑hosted clusters. (elastic.co)