Retrieval at Scale | Drop for 2025-12-02

TL;DR

Since Nov 20, 2025:
- Faiss main added multi‑bit RaBitQ and integrated PANORAMA into HNSW/Flat for faster verification.
- Qdrant 1.16.1 improves batch‑query speed and on‑disk behavior, building on ACORN and inline HNSW from 1.16.
- Weaviate 1.34.1 ships stability fixes and zstd‑compressed backups; 1.34’s ACORN‑by‑default and batching previews remain.
- Amazon OpenSearch Service now supports OpenSearch 3.3 and introduces Agentic Search with vector‑search enhancements.
- A new paper studies SPLADE at billion scale with practical pruning strategies.

Faiss main: multi‑bit RaBitQ + PANORAMA integrated into HNSW/Flat

Key facts and current state of the topic
- RaBitQ is a high‑accuracy binary quantizer; Faiss previously shipped 1‑bit RaBitQ variants, along with GPU CAGRA/IVF improvements. PANORAMA accelerates ANN verification with learned orthogonal transforms. (github.com)
Important context and background information
- Verification often dominates ANN latency; finer‑grained quantization helps probe more candidates under fixed budgets. Bringing these into Faiss main enables immediate source builds and signals what’s likely in the next release. (github.com)
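
To make the verification cost concrete, here is a minimal sketch of early-abandon distance verification in plain Python. It is not PANORAMA and not Faiss code: it only illustrates the general idea of stopping a distance computation once the partial sum already exceeds the current k-th best. The function name `topk_l2_early_abandon` and the `chunk` parameter are hypothetical.

```python
import heapq

def topk_l2_early_abandon(query, candidates, k, chunk=4):
    """Return the k nearest candidates as (squared_distance, index), ascending.

    Illustrative only: distances are accumulated chunk by chunk, and a
    candidate is dropped as soon as its partial sum reaches the current
    k-th best distance.
    """
    heap = []  # max-heap emulated with negated distances: (-dist, idx)
    for idx, vec in enumerate(candidates):
        # Current k-th best distance, or +inf while the heap is not full.
        bound = -heap[0][0] if len(heap) == k else float("inf")
        partial = 0.0
        abandoned = False
        for start in range(0, len(query), chunk):
            for q, v in zip(query[start:start + chunk], vec[start:start + chunk]):
                diff = q - v
                partial += diff * diff
            if partial >= bound:
                abandoned = True  # cannot enter the top-k, skip the rest
                break
        if abandoned:
            continue
        if len(heap) < k:
            heapq.heappush(heap, (-partial, idx))
        else:
            heapq.heapreplace(heap, (-partial, idx))
    return sorted((-neg_dist, idx) for neg_dist, idx in heap)
```

The earlier a candidate can be abandoned, the less verification work it costs, which is why transforms that concentrate most of the distance in the leading dimensions pair well with this pattern.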
Recent developments or changes
- Nov 19–21: Faiss main added multi‑bit RaBitQ (2–9 bits) and integrated PANORAMA as IndexHNSWFlatPanorama and IndexFlatL2Panorama; scalar‑quantizer optimizations also landed. Consider A/B tests against your current PQ/RaBitQ setup at your target recall. Until a release is tagged, using these features requires building Faiss from main. (github.com)
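
For the A/B comparison, a minimal harness sketch in plain Python follows. It assumes you have already collected, per configuration, the result id lists and a mean latency from your own benchmark runs; it does not call Faiss. The names `recall_at_k` and `compare_configs` and the config labels in the usage note are hypothetical.

```python
def recall_at_k(ground_truth, results, k):
    """Mean fraction of the true top-k ids recovered in each result list."""
    total = 0.0
    for truth, found in zip(ground_truth, results):
        total += len(set(truth[:k]) & set(found[:k])) / k
    return total / len(ground_truth)

def compare_configs(ground_truth, runs, k, target_recall):
    """Rank configurations: those meeting the recall target first, then by latency.

    `runs` maps a config label to (result_lists, mean_latency_ms).
    Returns rows of (label, recall, latency_ms, meets_target).
    """
    report = []
    for label, (results, latency_ms) in runs.items():
        recall = recall_at_k(ground_truth, results, k)
        report.append((label, recall, latency_ms, recall >= target_recall))
    report.sort(key=lambda row: (not row[3], row[2]))
    return report
```

A call might look like `compare_configs(gt, {"pq": (pq_ids, 2.0), "rabitq_4bit": (rq_ids, 3.0)}, k=10, target_recall=0.95)`, where exact ground truth comes from a brute-force index over the same data.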

Qdrant 1.16.1: faster batch scans and sturdier on‑disk ops (on top of ACORN + inline HNSW)

Key facts and current state of the topic
- Qdrant 1.16 introduced ACORN‑style filtered ANN and “inline storage” that embeds quantized vectors in HNSW nodes to cut random I/O. (github.com)
Important context and background information
- Filtered ANN and on‑disk search are common pain points at scale; incremental upgrades matter for tail‑latency and stability. (github.com)
Recent developments or changes
- Nov 25: 1.16.1 improves batch queries up to 3× on full scans (single read per point), actively migrates RocksDB data to Gridstore at startup for more predictable performance, and fixes several WAL/consensus edge cases. Validate upgrade paths; some users reported startup panics when skipping intermediate versions. (github.com)
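
The "single read per point" idea can be sketched as follows. This is an illustration of the access pattern only, not Qdrant's implementation: each stored vector is fetched once and scored against every query in the batch, instead of being re-read per query. `batch_full_scan` and the `read_point` callback are hypothetical names, and dot product is an assumed scoring function.

```python
import heapq

def batch_full_scan(read_point, point_ids, queries, k):
    """Score all queries against each point using one storage read per point.

    Returns, per query, the top-k (score, point_id) pairs in descending order.
    """
    tops = [[] for _ in queries]  # one min-heap of (score, point_id) per query
    for pid in point_ids:
        vec = read_point(pid)  # single read, shared by the whole batch
        for qi, query in enumerate(queries):
            score = sum(a * b for a, b in zip(query, vec))
            heap = tops[qi]
            if len(heap) < k:
                heapq.heappush(heap, (score, pid))
            elif (score, pid) > heap[0]:
                heapq.heapreplace(heap, (score, pid))
    return [sorted(heap, reverse=True) for heap in tops]
```

With B queries over N points, the per-query loop order would issue B × N reads, while this order issues N, which is where a gain on I/O-bound full scans would come from.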

Weaviate 1.34.1: maintenance release (zstd‑backups, internal gRPC) atop ACORN‑by‑default and batching/SPFresh previews

Key facts and current state of the topic
- Weaviate 1.34 made ACORN the default filter strategy and added Flat‑index Rotational Quantization (1‑bit/8‑bit, preview), server‑side dynamic batching (preview), and the SPFresh vector index. (weaviate.io)
Important context and background information
- These features target filtered‑ANN stability, memory footprint for multi‑vector/late‑interaction, and fresher on‑disk search. (weaviate.io)
Recent developments or changes
- Nov 27: 1.34.1 adds zstd compression for backups, an internal gRPC server mirroring REST cluster APIs, and multiple fixes (e.g., batch/vectorizer and S3 multipart uploads). Good for production hardening while you pilot RQ + batching. (github.com)

Amazon OpenSearch Service: OpenSearch 3.3 support + Agentic Search

Key facts and current state of the topic
- Managed OpenSearch picked up 3.3, continuing the 3.x vector/AI feature cadence (GPU builds, quantization, filtered‑ANN improvements in prior 3.x). (docs.opensearch.org)
Important context and background information
- Many large retrieval stacks rely on OpenSearch/Lucene; service‑level upgrades ease adoption without self‑hosting. (aws.amazon.com)
Recent developments or changes
- Nov 24–25: Service now supports 3.3 with vector‑search enhancements (e.g., batch processing for semantic highlighter) and introduces Agentic Search (natural‑language → DSL query planning), useful for complex, filter‑heavy retrieval and hybrid pipelines. Evaluate on representative traffic and filters. (aws.amazon.com)

SPLADE at billion scale: effectiveness/efficiency trade‑offs with practical pruning

Key facts and current state of the topic
- Learned sparse retrieval (SPLADE) keeps inverted‑index efficiency with strong relevance, but query/doc expansions can stress posting lists at scale. (arxiv.org)
Important context and background information
- Prior work introduced FLOPS/DF‑FLOPS‑style regularizations; this study adds large‑scale evidence and pruning tactics. (arxiv.org)
Recent developments or changes
- Nov 27: A new arXiv paper evaluates BM25 vs. SPLADE vs. Expanded‑SPLADE on collections ranging from tens of millions to billions of titles. It proposes document‑centric pruning, top‑k query‑term selection, and boolean term‑thresholding: helpful knobs for keeping SPLADE latencies closer to BM25 while preserving its relevance gains. (arxiv.org)
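
As a rough illustration of the three knobs, here is a sketch over sparse term-weight dictionaries. The paper's exact definitions are not reproduced in this drop, so the interpretations below are assumptions: document-centric pruning keeps the highest-weight terms per document, query-term selection keeps the top-k query terms, and term-thresholding requires a minimum number of matching terms before scoring. All function names are hypothetical.

```python
def prune_document(doc_weights, max_terms):
    """Document-centric pruning (assumed form): keep the max_terms heaviest terms."""
    ranked = sorted(doc_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:max_terms])

def select_query_terms(query_weights, top_k):
    """Top-k query-term selection: keep the top_k heaviest query terms."""
    ranked = sorted(query_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_k])

def passes_term_threshold(query_weights, doc_weights, min_matches):
    """Boolean term-thresholding (assumed form): require min_matches shared terms."""
    matches = sum(1 for term in query_weights if term in doc_weights)
    return matches >= min_matches

def sparse_score(query_weights, doc_weights):
    """Dot product over the terms shared by query and document."""
    return sum(w * doc_weights[t] for t, w in query_weights.items() if t in doc_weights)
```

In an inverted index, the first knob shortens posting lists at index time, while the other two reduce how many lists are traversed and how many candidates are scored at query time.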