The VectorDatabase class serves as the abstract foundation for all vector database implementations in Superlinked. It defines the interface that concrete vector database implementations must follow to provide persistent storage and retrieval of vector embeddings.
Architecture
The VectorDatabase class ensures that any concrete implementation provides a connector to the underlying vector database through the _vdb_connector property.
Attributes
_vdb_connector: An abstract property that concrete implementations must override to return a VDBConnector instance for the specific database type.
Inheritance Hierarchy
The VectorDatabase class serves as the base for all vector database implementations:
Inheritance Chain: VectorDatabase → ABC + Generic
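The relationship between the abstract base and a concrete implementation can be sketched as follows. This is a minimal illustration, not Superlinked's source code. Only the VectorDatabase name, the _vdb_connector property, and the ABC + Generic bases come from the description above. The VDBConnector stub, its describe method, the VDBConnectorT type variable, and the ToyConnector and ToyVectorDatabase classes are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from typing import Generic, TypeVar


class VDBConnector(ABC):
    """Stand-in for the connector base class (assumed shape, illustration only)."""

    @abstractmethod
    def describe(self) -> str:
        """Return a short, human-readable description of the backend."""


# Hypothetical type variable tying a database class to its connector type.
VDBConnectorT = TypeVar("VDBConnectorT", bound=VDBConnector)


class VectorDatabase(ABC, Generic[VDBConnectorT]):
    """Abstract base: every subclass must expose a connector."""

    @property
    @abstractmethod
    def _vdb_connector(self) -> VDBConnectorT:
        """Return the connector for the specific database type."""


class ToyConnector(VDBConnector):
    """Hypothetical connector that talks to nothing outside the process."""

    def describe(self) -> str:
        return "toy in-process connector"


class ToyVectorDatabase(VectorDatabase[ToyConnector]):
    """Hypothetical concrete database that overrides the abstract property."""

    def __init__(self) -> None:
        self._connector = ToyConnector()

    @property
    def _vdb_connector(self) -> ToyConnector:
        return self._connector
```

Because _vdb_connector is abstract, calling VectorDatabase() directly raises a TypeError, while ToyVectorDatabase() can be instantiated because it overrides the property.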
Available Implementations
Production Databases
- QdrantVectorDatabase - Qdrant vector database integration
- RedisVectorDatabase - Redis Stack vector search
- MongoDBVectorDatabase - MongoDB Atlas Vector Search
Development & Testing
- InMemoryVectorDatabase - In-memory storage for development
- TopKVectorDatabase - Top-K optimization wrapper
Vector Database Features
Vector databases provide several key capabilities:
Persistent Storage
- Store vector embeddings with associated metadata
- Maintain data durability across application restarts
- Scale storage capacity based on data volume
Similarity Search
- Efficient nearest neighbor search algorithms
- Support for various distance metrics (cosine, euclidean, etc.)
- Optimized indexing for fast retrieval
Filtering and Querying
- Combine vector similarity with metadata filtering
- Support complex query conditions
- Handle large-scale concurrent queries
Performance Optimization
- Index management for search performance
- Memory and disk optimization strategies
- Clustering and sharding capabilities
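To make the similarity search and metadata filtering capabilities concrete, here is a small brute-force sketch in plain Python. It does not use any Superlinked or database API. The Record layout and the cosine_similarity and filtered_search functions are hypothetical names chosen for this example. Production databases typically use approximate nearest neighbor indexes rather than the linear scan shown here.

```python
import math
from typing import Any, Callable

# Hypothetical record layout: (id, vector, metadata).
Record = tuple[str, list[float], dict[str, Any]]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filtered_search(
    records: list[Record],
    query: list[float],
    predicate: Callable[[dict[str, Any]], bool],
    k: int,
) -> list[tuple[str, float]]:
    """Keep records whose metadata passes the predicate, then rank by similarity."""
    scored = [
        (cosine_similarity(query, vector), record_id)
        for record_id, vector, metadata in records
        if predicate(metadata)
    ]
    # Highest similarity first; the sort is stable, so ties keep insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(record_id, score) for score, record_id in scored[:k]]


records: list[Record] = [
    ("a", [1.0, 0.0], {"lang": "en"}),
    ("b", [0.0, 1.0], {"lang": "en"}),
    ("c", [1.0, 0.1], {"lang": "de"}),
]
english_hits = filtered_search(records, [1.0, 0.0], lambda m: m["lang"] == "en", k=2)
```

Here record "c" is close to the query but is excluded by the metadata filter, so english_hits contains only "a" and "b", in that order.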
Implementation Requirements
Concrete vector database implementations must provide:
- Connection Management: Establish and maintain connections to the database
- Vector Operations: Store, update, and retrieve vector embeddings
- Search Functionality: Perform similarity searches with filtering
- Index Management: Create and maintain search indices
- Error Handling: Graceful handling of database errors and timeouts
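Three of the requirements above (connection management, vector operations, and error handling) can be illustrated with a toy in-process store. This is a sketch under stated assumptions, not how any Superlinked implementation is written. ToyVectorStore, upsert, get, and with_retry are hypothetical names, and similarity search and index management are left out to keep the example short.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ToyVectorStore:
    """In-process stand-in for a database client (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}
        self._connected = False

    # Connection management: open on enter, always close on exit.
    def __enter__(self) -> "ToyVectorStore":
        self._connected = True
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self._connected = False

    def _require_connection(self) -> None:
        if not self._connected:
            raise ConnectionError("store is not connected")

    # Vector operations: upsert covers both storing and updating.
    def upsert(self, vector_id: str, vector: list[float]) -> None:
        self._require_connection()
        self._vectors[vector_id] = list(vector)

    def get(self, vector_id: str) -> list[float]:
        self._require_connection()
        return self._vectors[vector_id]


def with_retry(operation: Callable[[], T], attempts: int = 3, delay_s: float = 0.0) -> T:
    """Error handling: retry transient failures, re-raise after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise
            time.sleep(delay_s)
    raise ValueError("attempts must be at least 1")
```

A caller would wrap operations as `with ToyVectorStore() as store: with_retry(lambda: store.upsert("a", [1.0, 2.0]))`. Retrying only ConnectionError and TimeoutError is a design choice for this sketch: other exceptions, such as a missing key, are treated as non-transient and propagate immediately.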
Database Selection Guide
Production Workloads
Qdrant: Excellent for high-performance vector search with advanced filtering capabilities. Supports both cloud and self-hosted deployments.
Redis: Great for applications already using Redis, providing fast in-memory vector search with persistence options.
MongoDB: Ideal for applications with existing MongoDB infrastructure, offering integrated document and vector search.
Development & Testing
InMemoryVectorDatabase: Perfect for development, testing, and prototyping. No external dependencies required.
TopKVectorDatabase: Useful when only the top-K most similar results are needed, which reduces memory use.
Best Practices
Database Configuration
Connection Limits: Configure appropriate connection limits and timeouts based on your expected query volume and latency requirements.
Performance Tuning
Index Strategy: Choose appropriate indexing strategies based on your vector dimensions, data size, and query patterns. Each database provides different indexing algorithms optimized for specific use cases.
Data Management
Backup Strategy: Implement regular backup procedures for production vector databases to prevent data loss and enable disaster recovery.
Integration Pattern
Vector databases integrate into the Superlinked pipeline as storage backends:
- Vector Generation: Spaces transform data into vectors
- Index Organization: Indices organize vectors for efficient querying
- Storage: VectorDatabase implementations persist vectors
- Retrieval: Query operations search stored vectors
- Results: Matching vectors are returned with metadata