Definition
A vector database is a purpose-built database system designed to store, index, and query high-dimensional vector embeddings at scale. Unlike traditional databases that search based on exact matches or keyword similarity, vector databases perform approximate nearest neighbor (ANN) search, finding the vectors most similar to a query vector across millions or billions of stored embeddings in milliseconds.
Vector databases use specialized indexing algorithms such as HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index) to make similarity search fast and efficient; Pinecone maintains one of the most thorough learning resources on vector search and embeddings, and Weaviate's documentation offers a practical open-source perspective. Vector databases also support metadata filtering, allowing queries that combine semantic similarity with traditional attribute-based filters, such as "find the most relevant support article published in the last 30 days."
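The combination of similarity search and metadata filtering can be illustrated with a minimal sketch. This toy example uses exact brute-force cosine similarity over an in-memory list (a real vector database would use an ANN index like HNSW); the record fields and the `days_old` filter are hypothetical, chosen to mirror the "last 30 days" query above.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy store: each record holds an embedding plus metadata for filtering.
store = [
    {"id": "a1", "vector": [0.9, 0.1, 0.0], "days_old": 10},
    {"id": "a2", "vector": [0.8, 0.2, 0.1], "days_old": 90},
    {"id": "a3", "vector": [0.1, 0.9, 0.2], "days_old": 5},
]

def query(vector, top_k=2, max_days_old=30):
    # Apply the metadata filter first, then rank survivors by similarity.
    candidates = [r for r in store if r["days_old"] <= max_days_old]
    candidates.sort(key=lambda r: cosine(vector, r["vector"]), reverse=True)
    return [r["id"] for r in candidates[:top_k]]

print(query([1.0, 0.0, 0.0]))  # → ['a1', 'a3']  ('a2' is filtered out by age)
```

Note that the filter runs before ranking, so a stale-but-similar article ("a2") never reaches the result set; this is the pre-filtering pattern most vector databases expose.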
Why It Matters for Product Managers
Vector databases are a critical infrastructure decision for any product building AI-powered search, recommendations, or RAG features. The choice of vector database affects query latency, cost, scalability, and the quality of similarity results, all of which directly impact the user experience. PMs need to understand the trade-offs between managed services like Pinecone and Weaviate versus self-hosted options like Milvus or pgvector.
From a product roadmap perspective, vector databases determine how quickly the team can iterate on AI features. A well-chosen vector database makes it easy to update embeddings, add new content sources, and experiment with different retrieval strategies. A poor choice creates ongoing operational burden and limits the team's ability to improve AI feature quality over time.
How It Works in Practice
- Evaluate requirements. Determine the scale (number of vectors), query latency needs, update frequency, metadata filtering requirements, and budget constraints for the AI feature being built.
- Select the database. Choose between managed cloud services (Pinecone, Weaviate Cloud, Qdrant Cloud), self-hosted options (Milvus, Qdrant, Chroma), or extensions to existing databases (pgvector for PostgreSQL).
- Design the schema. Define the vector dimensions, metadata fields, and namespace or collection structure. Plan how content updates will be reflected in the vector store.
- Implement the indexing pipeline. Build an automated pipeline that generates embeddings for new or updated content and upserts them into the vector database along with relevant metadata.
- Optimize query performance. Tune index parameters, implement re-ranking strategies, and add metadata pre-filtering to balance search quality against latency and cost requirements.
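The indexing-pipeline step above can be sketched as follows. This is a simplified, in-memory sketch: `embed()` is a hypothetical stand-in for a real embedding model call, and the content-hash check shows one common way an upsert pipeline avoids re-embedding unchanged documents.

```python
import hashlib

def embed(text):
    # Hypothetical stand-in for an embedding model; returns a fake 8-dim vector.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

index = {}  # doc_id -> {"hash": ..., "vector": ..., "metadata": ...}

def upsert(doc_id, text, metadata):
    # Re-embed only when the content actually changed, keyed by a content hash.
    content_hash = hashlib.sha256(text.encode()).hexdigest()
    existing = index.get(doc_id)
    if existing and existing["hash"] == content_hash:
        return "skipped"
    index[doc_id] = {"hash": content_hash, "vector": embed(text), "metadata": metadata}
    return "upserted"

print(upsert("doc1", "How to reset a password", {"category": "support"}))  # upserted
print(upsert("doc1", "How to reset a password", {"category": "support"}))  # skipped
```

The hash check matters at scale: embedding calls are typically the most expensive part of the pipeline, so skipping unchanged content keeps re-indexing runs cheap and fast.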
Common Pitfalls
- Over-engineering the vector database choice for a small-scale use case. For products with fewer than a million vectors, simpler solutions like pgvector often suffice and avoid the complexity of a separate system.
- Not planning for embedding updates. When the embedding model changes or content is updated, the entire vector store may need to be re-indexed, which requires a clear migration strategy.
- Ignoring metadata filtering needs upfront, which leads to expensive re-architecture when product requirements inevitably demand filtering by date, category, or user permissions.
- Treating vector search results as the final ranking. Adding a re-ranking step after initial retrieval often significantly improves relevance at minimal additional latency cost.
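The re-ranking pitfall above can be made concrete with a two-stage sketch. The store, vectors, and the recency penalty here are toy assumptions; in production the second stage would typically be a cross-encoder or a business-rule score, but the structure (broad cheap retrieval, then a richer score over few candidates) is the same.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

store = [
    {"id": "d1", "vector": [0.9, 0.1], "days_old": 200},
    {"id": "d2", "vector": [0.8, 0.3], "days_old": 3},
    {"id": "d3", "vector": [0.2, 0.9], "days_old": 1},
]

def retrieve(query_vec, k=2):
    # Stage 1: fast similarity search returns a broad candidate set.
    return sorted(store, key=lambda r: dot(query_vec, r["vector"]), reverse=True)[:k]

def rerank(query_vec, candidates, k=1):
    # Stage 2: a richer (toy) score over the small candidate set,
    # blending similarity with a recency penalty.
    def score(r):
        return dot(query_vec, r["vector"]) - 0.001 * r["days_old"]
    return sorted(candidates, key=score, reverse=True)[:k]

candidates = retrieve([1.0, 0.0])          # d1 ranks first on raw similarity
print(rerank([1.0, 0.0], candidates))      # but d2 wins after re-ranking
```

Because re-ranking only touches the handful of retrieved candidates, it adds little latency while correcting cases where raw vector similarity alone picks a stale result.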
Related Concepts
Vector databases store and index Embeddings to enable the fast similarity search that Retrieval-Augmented Generation (RAG) pipelines depend on. They serve as the retrieval layer between user queries and the Large Language Model (LLM) that generates the final response.