
Vector Databases With the Best Filtering on Metadata (2026)

6 tools compared
Top Picks

You build a RAG pipeline, embed your documents, and vector search works beautifully — until someone asks "show me contracts from 2024 with values over $50,000." Pure semantic search returns vaguely related documents from any year and any value. You need to combine vector similarity with structured metadata filters, and suddenly the choice of vector database matters a lot more than the benchmarks suggested.

Metadata filtering in vector databases is the feature that separates toy prototypes from production systems. In theory, every vector database supports filtering — you can attach metadata (key-value pairs) to your vectors and filter queries by those attributes. In practice, the implementation differences are massive. Some databases apply filters before the vector search (pre-filtering), some apply them after (post-filtering), and some use a hybrid approach. The strategy determines both accuracy and performance: pre-filtering is exact but can be slow on large datasets, post-filtering is fast but may return fewer results than requested, and hybrid approaches trade off between the two.

The common mistake is benchmarking vector databases on pure similarity search and assuming filter performance will scale proportionally. It doesn't. A database that searches 10 million vectors in 5ms might take 500ms when you add three metadata filters — because filtering forces it to scan more candidates to fill the result set. The databases on this list handle this problem differently, and the right choice depends on your filter complexity, dataset size, and latency requirements.

We evaluated these databases specifically on their filtering capabilities: filter expressiveness (what types of conditions are supported — equality, range, set membership, nested objects, geo-spatial?), filter performance (how does latency change as filter selectivity increases?), hybrid search support (can you combine vector similarity with keyword/BM25 search and metadata filters in a single query?), and index behavior under filters (does the database maintain vector index efficiency when filters are applied?). Browse our AI search and RAG tools category for more options in this space.

Full Comparison

Qdrant: High-performance vector database for AI applications

💰 Free tier with 1GB cluster, managed cloud from ~$25/mo

Qdrant is built from the ground up for scenarios where metadata filtering is as important as vector similarity — and it shows. Written in Rust for raw performance, Qdrant's filtering architecture uses dedicated payload indexes on metadata fields, allowing the database to narrow candidates using structured conditions before running the expensive vector distance calculations. This pre-filtering approach means that highly selective filters (matching only 0.1% of your dataset) actually make queries faster, not slower — the opposite of what happens in databases that post-filter.

The filter expressiveness is the most comprehensive on this list. Qdrant supports equality, range, set membership (must/should/must_not), geo-spatial radius and bounding box filters, full-text match on string payloads, nested object filtering, and boolean combinations of all of the above. You can build queries like "find vectors where category IN ['finance', 'legal'] AND date > '2024-06-01' AND location within 50km of New York AND full-text contains 'quarterly report'" — all applied as pre-filters before the vector search runs.
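The example query above can be expressed as a Qdrant filter. The sketch below shows the JSON filter body a Qdrant search request accepts (field names like `category`, `date`, `location`, and `description` are hypothetical, and the date range assumes a datetime payload index on that field); the radius is in meters:

```python
# Hypothetical payload fields; this is the filter body Qdrant's search
# endpoint accepts. All conditions under "must" are ANDed together and
# applied as pre-filters before the vector comparison step.
qdrant_filter = {
    "must": [
        {"key": "category", "match": {"any": ["finance", "legal"]}},
        {"key": "date", "range": {"gt": "2024-06-01T00:00:00Z"}},
        {"key": "location", "geo_radius": {
            "center": {"lat": 40.7128, "lon": -74.0060},  # New York
            "radius": 50_000,                             # meters
        }},
        {"key": "description", "match": {"text": "quarterly report"}},
    ]
}

search_body = {
    "vector": [0.1] * 768,   # your query embedding
    "filter": qdrant_filter,
    "limit": 10,
}
```

The same structure maps one-to-one onto the `Filter`/`FieldCondition` models in Qdrant's Python client.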

Qdrant also provides quantization options (scalar, product, binary) that reduce memory usage without significantly impacting filtered search performance — critical for production deployments where you need both filtering capability and cost efficiency. The query planner automatically decides whether to use payload indexes, vector indexes, or a combination based on the filter selectivity, so you don't need to manually optimize query strategies.

The free tier offers a 1GB cluster, and the Starter plan at $25/month provides 4GB with automatic backups. Self-hosting is free with no feature restrictions.

Vector Search · Payload Filtering · Quantization · Hybrid Search · Multi-Cloud Deployment · Horizontal Scaling · REST & gRPC APIs · Snapshot & Backup

Pros

  • Payload indexes enable true pre-filtering — selective filters speed up queries instead of slowing them down
  • Most expressive filter language: range, geo-spatial, full-text, nested objects, and boolean combinations
  • Rust implementation delivers consistently low latency even under complex multi-condition filters
  • Query planner automatically optimizes filter strategy based on selectivity — no manual tuning needed
  • Free self-hosted option with identical features to cloud — no artificial limitations

Cons

  • Smaller ecosystem and fewer managed cloud regions compared to Pinecone or Weaviate
  • Native BM25 hybrid search is newer and less battle-tested than Weaviate's implementation
  • Payload indexes consume additional memory — complex metadata schemas increase RAM requirements

Our Verdict: Best overall for metadata filtering — the pre-filtering architecture, filter expressiveness, and Rust performance make it the top choice when filtering is a primary requirement.

Weaviate: The AI-native vector database developers love

💰 Free 14-day sandbox trial. Flex plan from $45/mo (pay-as-you-go). Plus plan from $280/mo (annual). Enterprise Cloud with custom pricing. Open-source self-hosted option available.

Weaviate is the strongest choice when you need hybrid search — combining vector similarity, BM25 keyword search, and metadata filtering in a single query. While Qdrant excels at pure metadata filtering, Weaviate's architecture was designed from the start to unify multiple search paradigms. A single query can weight vector similarity at 0.7 and keyword relevance at 0.3, then apply metadata filters on top — all resolved in one pass through the index.
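A query like the one described above can be sketched in Weaviate's GraphQL API. In the snippet below, the class name `Document` and its properties are hypothetical; `alpha` weights the vector score, so `alpha: 0.7` means 70% vector similarity and 30% BM25 keyword relevance:

```python
# Hedged sketch of a Weaviate GraphQL hybrid query: vector + BM25 + filters
# resolved in one pass. Class name and property names are placeholders.
hybrid_query = """
{
  Get {
    Document(
      hybrid: { query: "quarterly report", alpha: 0.7 }
      where: {
        operator: And
        operands: [
          { path: ["year"], operator: GreaterThan, valueInt: 2023 }
          { path: ["department"], operator: Equal, valueText: "engineering" }
        ]
      }
      limit: 10
    ) {
      title
      _additional { score }
    }
  }
}
"""
```

The same query can be issued through the Python client's `collection.query.hybrid(...)` method with an equivalent filter object.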

Weaviate's filtering uses an inverted index approach similar to traditional search engines (Elasticsearch, Lucene), which means filtering on high-cardinality fields (like user IDs or timestamps with millions of unique values) performs predictably. The filter syntax supports equality, range, boolean operators, geo-distance, and cross-reference filtering (filtering by properties of linked objects — e.g., "find documents where the author's department is 'engineering'"). This cross-reference capability is unique among vector databases and valuable for applications with relational data models.

Weaviate also provides named vectors — the ability to store multiple vector representations per object (text embedding, image embedding, summary embedding) and query any combination with shared metadata filters. For multimodal applications where you need to filter by metadata across different embedding types, this is a significant architectural advantage.

The Sandbox tier is free (14-day trial), the Flex plan starts at $45/month (pay-as-you-go), and self-hosting is free under the BSD-3 license.

Vector & Semantic Search · Hybrid Search · Built-in RAG · Automatic Vectorization · Reranking · Multi-Tenancy · Multi-Modal Search · Flexible Deployment Options · RBAC & Security · Real-Time Data Sync

Pros

  • Best hybrid search implementation — vector + BM25 keyword + metadata filters in a single configurable query
  • Inverted index approach handles high-cardinality metadata fields without performance degradation
  • Cross-reference filtering lets you filter by properties of linked objects — unique among vector DBs
  • Named vectors support multimodal filtering across text, image, and other embedding types simultaneously
  • Mature self-hosted option with active open-source community and extensive documentation

Cons

  • Pure metadata filtering performance trails Qdrant's payload index approach on highly selective filters
  • Higher memory overhead than Qdrant due to the inverted index structures alongside vector indexes
  • Cloud pricing can be unpredictable on the Flex plan — usage-based billing requires monitoring

Our Verdict: Best for hybrid search combining vector similarity, keyword relevance, and metadata filters — the go-to choice when your queries need all three search paradigms together.

Pinecone: The vector database to build knowledgeable AI

💰 Free Starter tier; Standard from $50/mo; Enterprise from $500/mo

Pinecone takes the managed-service approach to metadata filtering — you get solid filtering capabilities without managing infrastructure, tuning indexes, or worrying about cluster configuration. For teams that want to build RAG applications with metadata filtering and focus on the application logic rather than database operations, Pinecone removes the operational burden entirely.

Pinecone's metadata filtering supports equality, range ($gt, $gte, $lt, $lte), set membership ($in, $nin), existence ($exists), and boolean combinations ($and, $or). You can attach up to 40KB of metadata per vector and filter on string, number, boolean, and string-list types. For most production use cases — filtering by document type, date range, user permissions, or category — these filter types cover the requirements. Where Pinecone falls short compared to Qdrant is in advanced filter types: no geo-spatial filtering, no full-text search on metadata, and no nested object filtering.
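Pinecone filters are MongoDB-style dictionaries. A minimal sketch of the operators listed above (field names like `doc_type` and `value_usd` are hypothetical):

```python
# Pinecone filter expressing: doc_type in {contract, invoice}
# AND year >= 2024 AND value_usd > 50,000.
pinecone_filter = {
    "$and": [
        {"doc_type": {"$in": ["contract", "invoice"]}},
        {"year": {"$gte": 2024}},
        {"value_usd": {"$gt": 50_000}},
    ]
}

# Passed alongside the query embedding, e.g.:
# index.query(vector=query_embedding, top_k=10,
#             filter=pinecone_filter, include_metadata=True)
```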

Pinecone's serverless architecture means filtering performance scales automatically with query load — there's no capacity planning for filtered queries. The database maintains separate metadata indexes alongside the vector index, so filter application doesn't require scanning the full vector set. For applications with moderate filter complexity (2-5 conditions per query) on datasets under 100 million vectors, Pinecone's filtering performance is competitive with self-hosted alternatives while eliminating ops overhead.

The Starter plan is free (2GB storage, 2M write units/month). The Standard plan at $50/month adds all cloud regions, backup/restore, and dedicated read nodes.

Serverless Vector Database · Low-Latency Similarity Search · Hybrid Search · Integrated Inference · Pinecone Assistant · Multi-Cloud Deployment · Bring Your Own Cloud (BYOC) · Dedicated Read Nodes · Namespace Support · Enterprise Security

Pros

  • Zero infrastructure management — metadata filtering works out of the box with automatic scaling
  • Separate metadata indexes ensure filters don't degrade vector search performance on moderate datasets
  • Serverless architecture scales filtered query throughput automatically without capacity planning
  • Simple, well-documented filter syntax that's easy to integrate into application code
  • Generous free tier (2GB, 2M writes/month) lets you test filtering behavior before committing

Cons

  • No geo-spatial filtering, full-text search on metadata, or nested object support — limited expressiveness
  • Performance can degrade with highly complex boolean filter combinations on large datasets
  • Vendor lock-in — no self-hosted option means you're dependent on Pinecone's infrastructure and pricing

Our Verdict: Best for teams that want production-ready metadata filtering with zero ops overhead — trades some filter expressiveness for operational simplicity.

Milvus: High-performance, cloud-native vector database built for scalable AI applications

💰 Open source (free, Apache 2.0). Managed cloud (Zilliz Cloud) offers Free tier with 5 GB storage, Standard and Dedicated plans from $99/mo

Milvus is the vector database built for massive scale — billions of vectors — where metadata filtering must remain performant even as the dataset grows far beyond what fits in memory. Milvus achieves this through partition-based filtering: you can partition your data by a metadata field (e.g., tenant ID, region, date range), and filtered queries only scan the relevant partitions. For multi-tenant applications or time-series data where most queries target a specific partition, this approach dramatically reduces the search space.
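The payoff of partition-based filtering is easy to see in a toy simulation (this is plain Python to illustrate the idea, not the Milvus API; real Milvus routes queries to partitions at the segment level):

```python
# Toy illustration of partition-based filtering: vectors are grouped by a
# partition key (tenant_id), so a tenant-filtered query scans only one
# bucket instead of the full collection.
from collections import defaultdict

partitions = defaultdict(list)  # tenant_id -> list of (id, vector)
for i in range(1000):
    partitions[f"tenant_{i % 10}"].append((i, [float(i)] * 4))

def filtered_search(tenant_id, query_vec, top_k=5):
    # Scan one partition (100 rows) instead of all 1,000.
    candidates = partitions[tenant_id]
    scored = sorted(
        candidates,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[1], query_vec)),
    )
    return [vec_id for vec_id, _ in scored[:top_k]]

hits = filtered_search("tenant_3", [42.0] * 4)
```

With ten tenants, every filtered query touches a tenth of the data; the same principle is what keeps Milvus's filtered latency flat as the total collection grows.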

The filtering system supports boolean expressions with arithmetic operators, string matching, array operations (ARRAY_CONTAINS, ARRAY_LENGTH), JSON field access for semi-structured metadata, and IN/NOT IN for set operations. Milvus also supports multiple vector fields per entity — you can store a text embedding and an image embedding in the same record and query either (or both) with shared metadata filters. The hybrid search implementation combines sparse vectors (BM25) and dense vectors with metadata filtering in a single query.
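Milvus expresses these filters as boolean expression strings rather than nested objects. A hedged sketch (field names `category`, `price`, `tags`, and `meta` are hypothetical):

```python
# Milvus filter expression combining set membership, a numeric range,
# an array operation, and JSON field access on semi-structured metadata.
expr = (
    'category in ["finance", "legal"] '
    'and price > 50000 '
    'and ARRAY_CONTAINS(tags, "contract") '
    'and meta["region"] == "us-east"'
)

# Passed as the expr argument of a search, e.g.:
# collection.search(data=[query_vec], anns_field="embedding",
#                   param={"metric_type": "L2"}, limit=10, expr=expr)
```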

Milvus's architecture separates storage and compute, allowing you to scale query nodes independently of data nodes. This matters for filtered search workloads because complex filters are CPU-intensive while vector search is memory-intensive — scaling them independently means you can optimize cost for your specific filter patterns. The Zilliz Cloud managed service starts free (5GB) and scales to enterprise tiers.

The open-source version is free under Apache 2.0 with no feature restrictions. For teams operating at scale (100M+ vectors), Milvus's partitioning and distributed architecture handle filtered queries more gracefully than single-node alternatives.

Billion-Scale Vector Search · Multiple Index Types · GPU Acceleration · Hybrid Search · Hot/Cold Storage Tiering · Multi-Language SDKs · Cloud-Native Architecture · Data Persistence & Replication

Pros

  • Partition-based filtering enables massive scale — queries only scan relevant data partitions, not the full dataset
  • Multi-vector fields let you filter across text and image embeddings with shared metadata conditions
  • JSON field filtering supports semi-structured metadata without requiring a fixed schema upfront
  • Separated storage and compute allows independent scaling of filter processing and vector search resources
  • Apache 2.0 open source with no feature restrictions — Zilliz Cloud available for managed hosting

Cons

  • Operational complexity is highest on this list — distributed architecture requires Kubernetes for production self-hosting
  • Partition key selection is critical — poor partitioning choices can negate the filtering performance benefits
  • Smaller community and fewer tutorials specifically about metadata filtering patterns compared to Qdrant or Pinecone

Our Verdict: Best for large-scale applications (100M+ vectors) where partition-based metadata filtering keeps performance predictable as the dataset grows beyond single-node capacity.

Chroma: The open-source AI-native vector database for search and retrieval

💰 Free tier with $5 credits, Team $250/mo with $100 credits, Enterprise custom pricing. Usage-based: $2.50/GiB written, $0.33/GiB/mo storage

Chroma is the vector database optimized for getting started quickly — and for many RAG prototypes and lightweight production applications, its metadata filtering is more than sufficient. The filtering API uses a simple where clause with $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin operators and boolean $and/$or combinations. It also supports where_document for filtering on the original document text (substring matching), which is a convenient shortcut that other databases don't offer natively.

Chroma's strength for metadata filtering is developer experience, not raw performance. The Python-first API means you can add metadata filters to your queries with minimal code changes — no query language to learn, no index configuration to manage, no cluster settings to tune. For developers building AI applications who need basic metadata filtering (filter by document type, source, date, or user) and don't need geo-spatial queries or complex nested filters, Chroma removes all the friction.
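A minimal sketch of those where clauses (metadata keys like `doc_type` and `year` are hypothetical):

```python
# Chroma where clause: boolean combination of metadata conditions, plus a
# where_document substring match on the original document text.
where = {
    "$and": [
        {"doc_type": {"$eq": "contract"}},
        {"year": {"$gt": 2023}},
    ]
}
where_document = {"$contains": "quarterly report"}

# Applied directly in a query call, e.g.:
# results = collection.query(query_texts=["contract renewal terms"],
#                            n_results=5, where=where,
#                            where_document=where_document)
```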

The trade-off is clear: Chroma's filtering performance doesn't scale to large datasets the way Qdrant, Milvus, or even Pinecone does. The in-memory architecture means filtered queries on datasets larger than available RAM will page to disk, and there are no dedicated metadata indexes to optimize filter evaluation. For prototypes, internal tools, and applications with datasets under 1 million vectors and simple filter requirements, Chroma is the fastest path to a working product. For production systems with complex filter patterns or large datasets, plan to migrate to Qdrant or Pinecone.

The Starter plan is free with $5 in cloud credits. The Team plan at $250/month adds SOC 2 compliance and dedicated support.

Vector, Full-Text & Hybrid Search · Simple Pythonic API · Built-In Embedding Functions · Chroma Cloud (Serverless) · Web & GitHub Crawling · MCP Integration · Copy-on-Write Collections · Embedding Adapters

Pros

  • Simplest metadata filtering API — Python-native where clauses require no query language knowledge
  • where_document filtering on original text content is a unique convenience for RAG applications
  • Fastest setup to working prototype — embed, add metadata, and query with filters in minutes
  • Open-source with self-hosting option for data-sensitive applications
  • Lightweight enough to run locally for development and testing alongside your application

Cons

  • No dedicated metadata indexes — filter performance degrades significantly on datasets over 1M vectors
  • No geo-spatial filtering, no nested object support, no full-text search on metadata fields
  • In-memory architecture means datasets must fit in available RAM to avoid a significant performance penalty

Our Verdict: Best for rapid prototyping and lightweight RAG applications — the fastest path to working metadata filtering, but plan to migrate for complex production requirements.

Zilliz: Enterprise-grade managed vector database built on Milvus for AI applications

💰 Free tier with $100 credits. Serverless pay-per-operation. Standard from $99/month. Enterprise custom pricing.

Zilliz is the fully managed cloud service built on Milvus, offering the same powerful metadata filtering capabilities without the operational complexity of self-hosting a distributed vector database. For teams that need Milvus-grade filtering at scale but don't want to manage Kubernetes clusters, Zilliz provides the same partition-based filtering, multi-vector support, and JSON field queries through a managed API.

Zilliz's filtering inherits all of Milvus's capabilities: boolean expressions, arithmetic operators, array operations, JSON path access, and partition-level data isolation. The managed platform adds automatic index optimization — Zilliz tunes the metadata indexes and vector indexes based on your query patterns, so you don't need to manually configure partition keys or index types. For teams without dedicated infrastructure engineers, this automatic tuning is the difference between a performant filtered search and a slow one.

The serverless tier handles the scaling automatically: as your filtered query volume increases, Zilliz provisions compute resources to maintain latency targets. The free tier includes 5GB of storage and 2.5M vCUs (vector compute units) per month — enough for meaningful testing of metadata filtering patterns. Paid plans scale based on storage and compute consumption.

Zilliz also provides Pipelines — pre-built ingestion and query workflows that handle embedding generation, metadata extraction, and filtered search in a single API call. For applications that need to ingest documents, extract metadata from them automatically, and make them searchable with filters, Pipelines reduce the integration work significantly.

AutoIndex & Cardinal Search Engine · Hybrid Retrieval · Tiered Storage · Multi-Cloud Deployment · Dynamic Schema · Multi-Tenant Partition Keys · Serverless Clusters · Enterprise Security

Pros

  • Full Milvus filtering capabilities without Kubernetes complexity — partition-based, JSON, arrays, and boolean expressions
  • Automatic index optimization tunes metadata and vector indexes based on actual query patterns
  • Serverless scaling maintains filter query latency automatically as volume increases
  • Pipelines feature automates ingestion, metadata extraction, and filtered search in a single API
  • Free tier with 5GB storage is sufficient for testing complex metadata filtering strategies

Cons

  • More expensive than self-hosted Milvus at scale — managed convenience has a price premium
  • Vendor lock-in to Zilliz's infrastructure, though migration to self-hosted Milvus is straightforward
  • Newer platform with fewer community resources and case studies compared to Pinecone or Qdrant Cloud

Our Verdict: Best for teams that want Milvus's powerful metadata filtering with managed infrastructure — the operational ease of Pinecone with the filtering depth of Milvus.

Our Conclusion

Quick Decision Guide

If metadata filtering performance is your top priority, Qdrant delivers the most optimized filtering engine — its payload indexes and pre-filtering architecture are purpose-built for complex filter queries at scale.

If you need hybrid search (vector + keyword + filters), Weaviate provides the most complete hybrid search implementation with native BM25, vector search, and metadata filtering in a single query.

If you want managed simplicity with good filtering, Pinecone offers the easiest path to production with solid metadata filtering and zero infrastructure management.

If you need extreme scale (billions of vectors) with filters, Milvus handles the largest datasets with partition-level filtering that keeps performance predictable at scale.

If you're prototyping or building lightweight RAG, Chroma gets you started fastest with a simple filtering API and minimal setup — but plan to migrate for production workloads with complex filter requirements.

What to Watch

The biggest trend in vector database filtering is AI-native query planning — databases that automatically choose the optimal filter strategy based on the query, dataset, and filter selectivity. Qdrant's query planner already does this to some extent, and Weaviate is investing heavily in this area. Expect every database to ship intelligent filter optimization by late 2026.

Also watch for multi-vector filtering — querying across multiple vector fields (text embeddings + image embeddings) with shared metadata filters. Milvus and Weaviate are leading here, enabling queries like "find products that look like this image AND match this text description AND are under $50."

For related guides, check our AI search and RAG platforms category.

Frequently Asked Questions

What is metadata filtering in a vector database?

Metadata filtering lets you attach structured data (key-value pairs like category, date, price, region) to your vectors and then narrow search results based on those attributes. Instead of searching all 10 million vectors for similarity, you can search only vectors where region='US' and date > '2024-01-01'. This combines the power of semantic search with the precision of traditional database queries.

What's the difference between pre-filtering and post-filtering?

Pre-filtering removes vectors that don't match your metadata conditions before running the vector similarity search. This guarantees exact filter results but can be slower because the vector index may not be optimized for the filtered subset. Post-filtering runs the vector search first, then removes results that don't match filters — faster but may return fewer results than requested. Most modern databases use a hybrid approach that balances accuracy and speed.
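The difference is easy to demonstrate with a toy brute-force example (plain Python, no real database; a production system would use an ANN index rather than scanning every row):

```python
# Toy contrast between pre- and post-filtering over a tiny corpus.
import random

random.seed(0)
# 100 rows of (id, 4-dim vector, metadata); half tagged year 2023, half 2024.
data = [(i, [random.random() for _ in range(4)], {"year": 2023 + i % 2})
        for i in range(100)]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match(meta):
    return meta["year"] == 2024

query, k = [0.5] * 4, 10

# Pre-filtering: drop non-matching rows first, then rank.
# Always returns k results (as long as k rows match).
pre = sorted((row for row in data if match(row[2])),
             key=lambda r: dist(r[1], query))[:k]

# Post-filtering: rank everything, take the top k, THEN filter.
# May return fewer than k results when non-matching rows crowd the top k.
post = [row for row in sorted(data, key=lambda r: dist(r[1], query))[:k]
        if match(row[2])]
```

Here `pre` always fills the requested result count, while `post` keeps only whichever of the overall top 10 happen to match the filter, which is exactly the underfilled-results problem described above.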

Which vector database has the best filtering performance?

Qdrant consistently leads in filtering benchmarks due to its dedicated payload indexes and Rust-based implementation. It applies filters before the expensive vector comparison step, so highly selective filters (matching <1% of vectors) actually speed up the query. Weaviate is a close second with its inverted index approach to filtering. Pinecone performs well for simple filters but can slow down with complex boolean combinations.

Can I do hybrid search (vector + keyword + filters) in one query?

Yes — Weaviate, Qdrant, and Milvus all support combining vector similarity search, BM25 keyword search, and metadata filtering in a single query. Weaviate has the most mature hybrid search implementation with configurable weighting between vector and keyword scores. Pinecone supports hybrid search with sparse-dense vectors. Chroma currently focuses on vector + filter without native BM25 keyword support.