# Best Vector Databases in 2026: What’s Free, What’s Paid, and What’s Fast
Updated: January 2026
Vector databases are the default storage layer for modern AI search: RAG, agent memory, semantic search, recommendations, and multimodal retrieval.
But in 2026, “vector database” can mean three very different things:
- Purpose-built vector DBs (Milvus, Qdrant, Weaviate, Pinecone)
- Vector search inside what you already run (Postgres + pgvector, Redis, MongoDB, OpenSearch, Elasticsearch)
- Managed vector search platforms (Azure AI Search, Vertex AI Vector Search, Databricks Mosaic AI Vector Search)
This guide breaks down the best vector databases of 2026, which ones are free vs paid, and what benchmarks suggest about real performance.
It also includes a practical code guide you can adapt to your stack.
## Table of contents
- How to choose a vector database in 2026
- Top vector databases comparison table
- Performance comparison (benchmarks that matter)
- Deep dive: best picks by category
- Code guide: ingest + search (copy-ready examples)
- Final checklist before you commit
## How to choose a vector database in 2026

Most teams pick a database for “speed”, then discover that the real constraints are:
- Filters (tenant_id, access control, time ranges, tags)
- Ingestion under load (updates while serving queries)
- Memory footprint (compression, quantization, storage tiering)
- Hybrid retrieval (keyword + vector + rerank)
- Ops overhead (backups, scaling, upgrades, observability)
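To make the memory bullet concrete, here is a back-of-the-envelope calculation for raw float32 vectors versus int8-quantized ones. The numbers are illustrative only; real indexes add graph and metadata overhead on top:

```python
# Approximate raw storage for 1M embeddings at 384 dimensions.
n_vectors = 1_000_000
dim = 384

float32_bytes = n_vectors * dim * 4   # 4 bytes per float32 component
int8_bytes = n_vectors * dim * 1      # 1 byte per component after int8 quantization

print(f"float32: {float32_bytes / 1024**3:.2f} GiB")  # ~1.43 GiB
print(f"int8:    {int8_bytes / 1024**3:.2f} GiB")     # ~0.36 GiB
```

A 4x reduction like this is often the difference between one node and a cluster, which is why quantization options show up in the comparison table below.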
Here is the standard retrieval pipeline most production systems end up with: embed the query, run a filtered vector search over stored embeddings, optionally rerank the candidates, then generate or display results.
Your database choice should match where your complexity lives: ops, filters, hybrid search, or scale.
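That pipeline can be sketched end to end. Everything here is hypothetical glue code: a toy `embed` function and an in-memory list stand in for a real model and database, just to show the shape of embed → filtered search (reranking and generation omitted):

```python
# Toy retrieval pipeline: embed -> filtered vector search.
# The "database" is an in-memory list of (id, vector, metadata) tuples.

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model: hash characters into a tiny 4-d unit vector.
    v = [0.0] * 4
    for i, ch in enumerate(text):
        v[i % 4] += ord(ch)
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def vector_search(db, query_vec, top_k=5, filter_topic=None):
    # Apply the metadata filter first, then rank by dot-product similarity.
    candidates = [
        (doc_id, sum(a * b for a, b in zip(vec, query_vec)), meta)
        for doc_id, vec, meta in db
        if filter_topic is None or meta["topic"] == filter_topic
    ]
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]

db = [
    ("doc-1", embed("refund policy for annual subscriptions"), {"topic": "billing"}),
    ("doc-2", embed("how to reset your password"), {"topic": "account"}),
]
hits = vector_search(db, embed("refund for yearly plan"), filter_topic="billing")
print(hits[0][0])  # "doc-1" (the only billing doc survives the filter)
```

Notice that the filter runs inside the search, not after it; how efficiently a database does exactly that step is one of the biggest differentiators below.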
## Top vector databases comparison table (2026)
This table focuses on what matters in real projects: cost model, hosting, and “why teams pick it”.
| Database / Service | Free? | Best for | Strengths | Tradeoffs |
|---|---|---|---|---|
| Pinecone | Paid (managed-first) | Fast production launch with minimal ops | Managed experience, easy scaling, clean API | Cost for convenience, managed constraints |
| Milvus (OSS) / Zilliz Cloud (managed) | Milvus: Yes (OSS) / Zilliz: Paid + tiers | Large-scale vector retrieval, enterprise workloads | Strong scaling story, rich ecosystem | Self-hosting adds operational complexity |
| Qdrant | Yes (OSS) + paid cloud | Filter-heavy apps, practical production RAG | Great developer ergonomics, strong filtering, optional quantization | Bench your workload and index choices |
| Weaviate | Yes (OSS) + paid cloud | Teams wanting OSS plus managed options | Flexible collections, compression options (PQ and more) | Configuration choices matter a lot |
| Postgres + pgvector | Yes (OSS) | “Keep it simple” when you already use Postgres | Relational + vectors together, familiar tooling | At big scale, specialized DBs can win on ops/perf |
| MongoDB Atlas Vector Search | Paid (managed) + plans | Vectors next to operational documents | Document model, integrated search workflows | Costs and performance depend on cluster choices |
| Redis Vector Search | Depends on edition/license + paid cloud | Ultra-low latency, “index next to cache” patterns | Multiple vector index types (FLAT, HNSW, SVS-VAMANA) | Licensing and memory economics require planning |
| OpenSearch | Yes (OSS) | Search teams adding vectors to an existing search stack | knn_vector field type, mature search ecosystem | More tuning, search-engine style ops |
| Elasticsearch | Paid distributions (varies) | Enterprise search with vectors built in | dense_vector field for kNN workflows | Cost and licensing depend on deployment model |
| Azure AI Search (Vector + Hybrid) | Paid (managed) | Hybrid retrieval (keyword + vector) at enterprise scale | Built-in hybrid patterns (RRF), tight Azure integration | Azure-native workflow, index-centric model |
| Vertex AI Vector Search | Paid (managed) | Large-scale managed ANN on Google Cloud | Built on ScaNN, high-scale managed indexing | GCP-native stack considerations |
| Databricks Mosaic AI Vector Search | Paid (managed) | Organizations already living in Databricks | Governance + platform integration | Best if you are already committed to Databricks |
| Chroma | Yes (OSS) + cloud offerings | Local-first development and prototypes | Simple dev experience, popular in AI tooling | Confirm scaling and ops needs for production |
| LanceDB | Yes (OSS) + cloud offerings | Embedded, local tables, offline workflows | Apache 2.0 OSS, good local persistence story | Evaluate for multi-tenant, high concurrency production |
| Vespa | Yes (OSS) | Search + recommendation systems with complex ranking | Powerful ranking and retrieval stack | Steeper learning curve than “pure VDBs” |
Tip: if you need a “one sentence rule”, use this:
- If you want the lowest ops burden, pick a managed vector DB or managed search platform.
- If you want maximum control and predictable infra, pick an OSS vector DB.
- If your company already runs Postgres, MongoDB, Elastic/OpenSearch, or Redis deeply, start there and prove you need to add another system.
## Performance comparison (benchmarks that matter)
Vector performance is not one number. The same database can win on one workload and lose on another depending on:
- index type and parameters
- target recall
- filter selectivity and complexity
- ingestion happening at the same time as queries
- hardware and cost constraints
A useful public reference point is the VDBBench leaderboard, which reports metrics like P99 latency and QPS under defined setups.
The values below are taken from its leaderboard view for a fixed monthly cost and dataset size.
### 1) Latency and QPS at a fixed budget
Scenario: “Vector Search Latency and QPS at $1,000 Monthly Cost” on a 1M dataset
| System (as listed) | P99 latency (ms) | QPS |
|---|---|---|
| ZillizCloud-8cu-perf | 2.5 | 9704.42 |
| Milvus-16c64g-sq8 | 2.2 | 3465.17 |
| OpenSearch-16c128g-force_merge | 7.2 | 3055.01 |
| ElasticCloud-8c60g-force_merge | 11.3 | 1925.3 |
| QdrantCloud-16c64g | 6.4 | 1242.43 |
| Pinecone-p2.x8-1node | 13.7 | 1146.53 |
What to take from this:
- This is one workload slice, not a universal ranking.
- Budget constraints can reshuffle winners compared to “unlimited hardware” benchmarks.
- P99 latency matters more than average latency for user-facing apps.
Reference: VDBBench leaderboard
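When you benchmark your own workload, compute P99 from raw latency samples instead of relying on averages. A minimal sketch using a nearest-rank percentile (your load-testing tool likely reports this for you; this just shows why the tail dominates):

```python
def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample >= p% of all samples."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, int(round(p / 100 * len(s))) - 1))
    return s[k]

# One slow outlier barely moves the mean but defines the tail.
latencies_ms = [3.1, 2.8, 3.0, 2.9, 41.7, 3.2, 2.7, 3.0, 3.1, 2.9]
print(percentile(latencies_ms, 50))  # 3.0 (median looks healthy)
print(percentile(latencies_ms, 99))  # 41.7 (the outlier IS your P99)
```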
### 2) Streaming performance (search while ingesting)
If you update your index constantly (new docs, chat history, product catalog changes), streaming performance is often the deciding factor.
This table summarizes “Streaming Performance” values shown for a 10M dataset under constant ingestion.
| System (as listed) | Static QPS | QPS @ 500 rows/s ingestion | QPS @ 1000 rows/s ingestion |
|---|---|---|---|
| ZillizCloud (8cu-perf) | 3957 | 2119 | 1860 |
| Pinecone (p2.x8-1node) | 1131 | 367.4 | 369.7 |
| OpenSearch (16c128g) | 505.7 | 161.7 | 149.7 |
| QdrantCloud (16c64g) | 446.9 | 393.8 | 347.6 |
| Milvus (16c64g-sq8) | 437.2 | 306 | 156 |
| ElasticCloud (8c60g) | 376.4 | 61.67 | 61.82 |
Reference: VDBBench leaderboard
## Deep dive: best picks by category

### Best managed-first vector DB
- Pinecone: great when you want production speed and minimal cluster work.
### Best open-source vector DBs (general purpose)
- Milvus: strong scaling story and ecosystem, good for big deployments.
- Qdrant: excellent for metadata filtering and practical production usage.
- Weaviate: flexible collections and compression options like PQ to reduce memory usage.
### Best “use what you already run” options
- Postgres + pgvector: best when “one database” simplicity matters.
- MongoDB Atlas Vector Search: best if your app is document-first.
- Redis Vector Search: best for low-latency retrieval near cache and sessions.
- OpenSearch / Elasticsearch: best for search teams that already operate a search stack.
### Best managed search platforms with vectors
- Azure AI Search: great for hybrid retrieval (keyword + vector), enterprise patterns.
- Vertex AI Vector Search: strong managed ANN on Google Cloud (ScaNN-based).
- Databricks Mosaic AI Vector Search: ideal if Databricks is your data home.
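Reciprocal rank fusion (RRF), the hybrid-ranking pattern mentioned above, is easy to sketch yourself if your stack returns separate keyword and vector result lists. This is a generic illustration, not any platform's implementation; `k=60` is the commonly cited default, not a tuned value:

```python
def rrf(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: score(id) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc-3", "doc-1", "doc-7"]
vector_hits = ["doc-1", "doc-5", "doc-3"]
print(rrf([keyword_hits, vector_hits]))  # docs appearing in both lists rise to the top
```

Because RRF only looks at ranks, it needs no score normalization between the keyword and vector sides, which is why managed platforms favor it.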
### Best local-first / embedded
- Chroma: easy dev workflow and popular in AI tooling.
- LanceDB: strong local persistence story and OSS.
## Code guide: ingest + search (copy-ready examples)
Below is a minimal, practical setup:
- Generate embeddings (one-time per chunk).
- Store vectors + metadata.
- Query by vector, apply filters, return top-k.
### Step 0: Create embeddings (example in Python)

Use any embedding model you like. This example uses sentence-transformers.

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

texts = [
    "Refund policy for annual subscriptions",
    "How to reset your password",
    "Troubleshooting login issues",
]
vectors = model.encode(texts, normalize_embeddings=True).tolist()
```
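Because `normalize_embeddings=True` returns unit-length vectors, cosine similarity reduces to a plain dot product. A dependency-free illustration with toy 2-d unit vectors standing in for real embeddings:

```python
def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Toy unit vectors (0.6^2 + 0.8^2 = 1), standing in for real embeddings.
v1 = [0.6, 0.8]
v2 = [0.8, 0.6]
print(dot(v1, v1))            # ~1.0 (identical vectors)
print(round(dot(v1, v2), 2))  # 0.96 (similar direction)
```

This is why several databases below let you pick a dot-product metric for normalized vectors: it is cheaper than full cosine and gives the same ranking.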
### Pinecone (managed)

```python
# pip install pinecone
import os

from pinecone import Pinecone, ServerlessSpec, CloudProvider, AwsRegion, VectorType

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Create the index once; dimension=384 matches all-MiniLM-L6-v2.
if not pc.has_index("docs-index"):
    pc.create_index(
        name="docs-index",
        dimension=384,
        spec=ServerlessSpec(cloud=CloudProvider.AWS, region=AwsRegion.US_EAST_1),
        vector_type=VectorType.DENSE,
    )
idx = pc.Index("docs-index")

# Upsert vectors with metadata
idx.upsert(
    vectors=[
        ("doc-1", vectors[0], {"topic": "billing"}),
        ("doc-2", vectors[1], {"topic": "account"}),
        ("doc-3", vectors[2], {"topic": "account"}),
    ],
    namespace="kb",
)

# Query with an optional metadata filter
query_vec = model.encode(["refund for yearly plan"], normalize_embeddings=True).tolist()[0]
res = idx.query(
    vector=query_vec,
    top_k=5,
    include_metadata=True,
    filter={"topic": {"$eq": "billing"}},
    namespace="kb",
)
print(res)
```
### Qdrant (OSS or Cloud)

```python
# pip install qdrant-client
import os

from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams, PointStruct, Filter, FieldCondition, MatchValue

# For local dev: QdrantClient(":memory:") or QdrantClient(path="qdrant.db")
client = QdrantClient(
    url=os.environ.get("QDRANT_URL", "http://localhost:6333"),
    api_key=os.environ.get("QDRANT_API_KEY"),
)

collection = "kb"
if not client.collection_exists(collection):
    client.create_collection(
        collection_name=collection,
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

# Note: Qdrant point IDs must be unsigned integers or UUIDs, not arbitrary strings.
client.upsert(
    collection_name=collection,
    points=[
        PointStruct(id=1, vector=vectors[0], payload={"topic": "billing"}),
        PointStruct(id=2, vector=vectors[1], payload={"topic": "account"}),
        PointStruct(id=3, vector=vectors[2], payload={"topic": "account"}),
    ],
)

query_vec = model.encode(["refund for yearly plan"], normalize_embeddings=True).tolist()[0]
hits = client.search(
    collection_name=collection,
    query_vector=query_vec,
    limit=5,
    query_filter=Filter(
        must=[FieldCondition(key="topic", match=MatchValue(value="billing"))]
    ),
)
for h in hits:
    print(h.id, h.score, h.payload)
```
### Weaviate (OSS or Cloud)

```python
# pip install -U weaviate-client
import os

import weaviate
from weaviate.classes.config import Configure
from weaviate.classes.init import Auth

weaviate_url = os.environ["WEAVIATE_URL"]
weaviate_api_key = os.environ["WEAVIATE_API_KEY"]

with weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url,
    auth_credentials=Auth.api_key(weaviate_api_key),
) as client:
    # Create a collection (example uses a built-in vectorizer option;
    # you can also import your own vectors)
    if not client.collections.exists("Movie"):
        client.collections.create(
            name="Movie",
            vector_config=Configure.Vectors.text2vec_weaviate(),
        )

    movies = client.collections.use("Movie")
    with movies.batch.fixed_size(batch_size=200) as batch:
        batch.add_object(properties={
            "title": "Refund policy",
            "description": "Refund policy for annual subscriptions",
            "topic": "billing",
        })

    # Search example (vectorize the query depending on your configured vectorizer).
    # For custom vectors, store your own embedding and query by that embedding.
```
### Postgres + pgvector (free and very practical)

```sql
-- 1) Install extension (varies by hosting)
CREATE EXTENSION IF NOT EXISTS vector;

-- 2) Table with an embedding column
CREATE TABLE IF NOT EXISTS kb_docs (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    topic TEXT NOT NULL,
    embedding vector(384) NOT NULL
);

-- 3) Create an ANN index (choose HNSW or IVFFlat)
-- HNSW example:
CREATE INDEX IF NOT EXISTS kb_docs_embedding_hnsw
    ON kb_docs
    USING hnsw (embedding vector_cosine_ops);

-- 4) Query: order by cosine distance (lower is more similar)
-- Replace :query_embedding with your 384-d vector
SELECT id, content, topic
FROM kb_docs
WHERE topic = 'billing'
ORDER BY embedding <=> :query_embedding
LIMIT 5;
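From application code, pgvector accepts vectors as text literals like `'[0.1,0.2,0.3]'`. Driver adapters such as `pgvector.psycopg`'s `register_vector` handle this for you; if you bind parameters by hand, a small helper (hypothetical, not part of pgvector) can format them:

```python
def to_pgvector_literal(vec: list[float]) -> str:
    """Format a Python list as a pgvector text literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"

print(to_pgvector_literal([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]

# Usage sketch with a parameterized query (cursor not shown):
#   cur.execute("SELECT id FROM kb_docs ORDER BY embedding <=> %s::vector LIMIT 5",
#               (to_pgvector_literal(query_embedding),))
```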
### OpenSearch (if you already run it)

```json
PUT my-index
{
  "settings": { "index": { "knn": true } },
  "mappings": {
    "properties": {
      "embedding": {
        "type": "knn_vector",
        "dimension": 384
      },
      "topic": { "type": "keyword" },
      "content": { "type": "text" }
    }
  }
}
```
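Once documents are indexed, a kNN search looks like the sketch below (query vector truncated for brevity). Note that the `filter` clause inside the `knn` query is only supported for certain engines (Lucene and Faiss in recent OpenSearch versions); verify it against your cluster before relying on it:

```json
GET my-index/_search
{
  "size": 5,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.12, -0.03, ...],
        "k": 5,
        "filter": { "term": { "topic": "billing" } }
      }
    }
  }
}
```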
## Final checklist before you commit
- Do you need strict multi-tenancy? Confirm filter performance and isolation strategy.
- Do you ingest continuously? Benchmark “search while ingesting”, not just static search.
- How big is your metadata? Some systems are fast until filters get complex.
- Do you need hybrid search? If yes, a search platform (Azure AI Search / Elastic / OpenSearch) can simplify.
- Do you want to run it yourself? If not, managed-first usually wins time-to-production.
If you want one safe, boring recommendation:
- Enterprise managed: Azure AI Search (hybrid) or Vertex AI Vector Search (GCP) depending on your cloud.
- Managed vector DB: Pinecone or a managed Milvus option.
- Open source: Qdrant, Weaviate, or Milvus.
- Simplest path: Postgres + pgvector if your scale is not extreme and you want fewer moving parts.
Happy benchmarking.
## Author update
I will add live benchmarks as newer vector DB versions ship. If you want a comparison on your workload shape, share the index size and query pattern.

