Vector Databases
Store, index, and query embeddings at scale using SimplerLLM's unified vector database interface.
What Are Vector Databases?
Vector databases are specialized databases designed to store and efficiently search high-dimensional vectors (embeddings). They enable:
- Similarity Search: Find semantically similar content in milliseconds (the comparison behind this is sketched below)
- Scalability: Handle millions of vectors efficiently
- Metadata Filtering: Combine vector similarity with traditional filters
- Real-time Updates: Add, update, or delete vectors on the fly
- RAG Applications: Power retrieval-augmented generation systems
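Under the hood, "similar" usually means a high cosine similarity between embedding vectors. A minimal, dependency-free sketch of the comparison a vector database runs at scale (the helper name is illustrative, not part of SimplerLLM):

import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))  # ≈ 0.7071

A vector database computes scores like this against millions of stored vectors, using an index to avoid comparing the query to every vector.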
Supported Vector Databases
Local (In-Memory)
Simple in-memory vector storage
- No external dependencies
- Perfect for prototyping and small datasets
- Fast for development
Qdrant
Production-ready vector database
- Scalable and performant
- Advanced filtering capabilities
- Cloud or self-hosted
Quick Start: Local Vector Database
The local vector database is perfect for development and testing:
from SimplerLLM.vectors.vector_db import VectorDB
from SimplerLLM.language.embeddings import EmbeddingsOpenAI
# Create embeddings instance
embeddings = EmbeddingsOpenAI()
# Create local vector database
vector_db = VectorDB.create(provider='local', embeddings_instance=embeddings)
# Add documents
documents = [
    "SimplerLLM makes AI development easy",
    "Python is a popular programming language",
    "Vector databases enable semantic search"
]

# Store documents with embeddings
for doc in documents:
    embedding = embeddings.generate_embeddings(doc)
    vector_db.add(
        vector=embedding,
        metadata={'text': doc}
    )

# Search for similar documents
query = "How to build AI applications?"
query_embedding = embeddings.generate_embeddings(query)

results = vector_db.search(query_embedding, top_k=2)
for result in results:
    print(f"Text: {result['metadata']['text']}")
    print(f"Score: {result['score']:.4f}\n")
Qdrant Vector Database
Qdrant is a production-ready vector database with advanced features:
Setup
First, install the Qdrant client:
pip install qdrant-client
Run Qdrant locally with Docker:
docker run -p 6333:6333 qdrant/qdrant
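Before wiring SimplerLLM to the server, you can sanity-check that it is reachable with the qdrant-client directly; a minimal sketch:

from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
print(client.get_collections())  # lists existing collections if the server is up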
Basic Usage
from SimplerLLM.vectors.vector_db import VectorDB
from SimplerLLM.language.embeddings import EmbeddingsOpenAI
# Create embeddings instance
embeddings = EmbeddingsOpenAI()
# Create Qdrant vector database
vector_db = VectorDB.create(
    provider='qdrant',
    embeddings_instance=embeddings,
    collection_name='my_documents',
    url='http://localhost:6333'  # Qdrant server URL
)
# Add documents
documents = [
    {"text": "Machine learning is a subset of AI", "category": "AI"},
    {"text": "Python is great for data science", "category": "Programming"},
    {"text": "Vector search enables semantic similarity", "category": "Database"}
]

for doc in documents:
    embedding = embeddings.generate_embeddings(doc['text'])
    vector_db.add(
        vector=embedding,
        metadata=doc
    )

# Search with metadata filtering
query = "artificial intelligence applications"
query_embedding = embeddings.generate_embeddings(query)

results = vector_db.search(
    query_embedding,
    top_k=3,
    filter_dict={'category': 'AI'}  # Filter by category
)

for result in results:
    print(f"Text: {result['metadata']['text']}")
    print(f"Category: {result['metadata']['category']}")
    print(f"Score: {result['score']:.4f}\n")
Advanced Features
Batch Operations
Add multiple vectors efficiently. The loop below issues one add() call per vector; for a true single-request batch, see the qdrant-client sketch after the example:
from SimplerLLM.vectors.vector_db import VectorDB
from SimplerLLM.language.embeddings import EmbeddingsOpenAI
embeddings = EmbeddingsOpenAI()
vector_db = VectorDB.create(
    provider='qdrant',
    embeddings_instance=embeddings,
    collection_name='my_documents',
    url='http://localhost:6333'
)
# Prepare batch data
documents = [
    "Document 1 about machine learning",
    "Document 2 about data science",
    "Document 3 about artificial intelligence"
]

# Generate embeddings (one API call per document)
vectors = [embeddings.generate_embeddings(doc) for doc in documents]

# Add the vectors (one add() call per vector)
for i, (vector, doc) in enumerate(zip(vectors, documents)):
    vector_db.add(
        vector=vector,
        metadata={'id': i, 'text': doc}
    )
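If the wrapper only exposes per-vector add(), you can still insert all points in a single request by talking to Qdrant directly. A sketch using qdrant-client, assuming the collection already exists with a matching vector size:

from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct
from SimplerLLM.language.embeddings import EmbeddingsOpenAI

embeddings = EmbeddingsOpenAI()
client = QdrantClient(url="http://localhost:6333")

documents = [
    "Document 1 about machine learning",
    "Document 2 about data science",
    "Document 3 about artificial intelligence"
]
vectors = [embeddings.generate_embeddings(doc) for doc in documents]

# One network round trip for all points instead of one per vector
client.upsert(
    collection_name="my_documents",
    points=[
        PointStruct(id=i, vector=vec, payload={'text': doc})
        for i, (vec, doc) in enumerate(zip(vectors, documents))
    ]
)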
Metadata Filtering
Combine vector similarity with metadata filters:
# Add documents with rich metadata
docs_with_metadata = [
    {
        "text": "Introduction to machine learning",
        "author": "John Doe",
        "date": "2024-01-15",
        "tags": ["ML", "beginner"]
    },
    {
        "text": "Advanced deep learning techniques",
        "author": "Jane Smith",
        "date": "2024-02-20",
        "tags": ["DL", "advanced"]
    }
]

for doc in docs_with_metadata:
    embedding = embeddings.generate_embeddings(doc['text'])
    vector_db.add(vector=embedding, metadata=doc)

# Search with multiple filters
query_embedding = embeddings.generate_embeddings("learning algorithms")

results = vector_db.search(
    query_embedding,
    top_k=5,
    filter_dict={
        'author': 'John Doe',
        'tags': ['ML']
    }
)
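For reference, a filter_dict like the one above corresponds conceptually to a native Qdrant Filter with must conditions (the exact translation depends on SimplerLLM's implementation); sketched with qdrant-client:

from qdrant_client.models import Filter, FieldCondition, MatchValue, MatchAny

qdrant_filter = Filter(
    must=[
        # Both conditions must hold for a point to be returned
        FieldCondition(key="author", match=MatchValue(value="John Doe")),
        FieldCondition(key="tags", match=MatchAny(any=["ML"]))  # matches any listed tag
    ]
)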
Updating and Deleting Vectors
# Update existing vector
vector_db.update(
vector_id="doc_123",
vector=new_embedding,
metadata={'text': 'Updated document text', 'updated_at': '2024-03-01'}
)
# Delete vector
vector_db.delete(vector_id="doc_456")
# Delete by filter
vector_db.delete_by_filter({'category': 'outdated'})
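Updates and deletes require stable vector IDs. If your ingestion pipeline does not track them, one common pattern (generic, not SimplerLLM-specific) is deriving a deterministic ID from the document content, so re-ingesting the same document overwrites rather than duplicates:

import hashlib

def stable_id(text: str) -> str:
    """Deterministic ID from content (hypothetical helper for illustration)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]

print(stable_id("Updated document text"))  # same input always yields the same ID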
Real-World Example: RAG System
Build a Retrieval-Augmented Generation (RAG) system:
from SimplerLLM.vectors.vector_db import VectorDB
from SimplerLLM.language.embeddings import EmbeddingsOpenAI
from SimplerLLM.language.llm import LLM, LLMProvider
class RAGSystem:
    def __init__(self):
        # Initialize components
        self.embeddings = EmbeddingsOpenAI()
        self.vector_db = VectorDB.create(
            provider='qdrant',
            embeddings_instance=self.embeddings,
            collection_name='knowledge_base'
        )
        self.llm = LLM.create(
            provider=LLMProvider.OPENAI,
            model_name="gpt-4o"
        )

    def add_knowledge(self, documents):
        """Add documents to the knowledge base."""
        for doc in documents:
            embedding = self.embeddings.generate_embeddings(doc)
            self.vector_db.add(
                vector=embedding,
                metadata={'text': doc}
            )

    def query(self, question, top_k=3):
        """Answer a question using RAG."""
        # 1. Find relevant documents
        query_embedding = self.embeddings.generate_embeddings(question)
        results = self.vector_db.search(query_embedding, top_k=top_k)

        # 2. Build context from results
        context = "\n\n".join([
            f"Document {i+1}: {r['metadata']['text']}"
            for i, r in enumerate(results)
        ])

        # 3. Generate answer with context
        prompt = f"""Based on the following context, answer the question.

Context:
{context}

Question: {question}

Answer:"""
        answer = self.llm.generate_response(prompt=prompt)
        return answer, results
# Usage
rag = RAGSystem()
# Add knowledge to the system
knowledge = [
    "SimplerLLM is a Python library for working with LLMs",
    "SimplerLLM supports multiple providers including OpenAI and Anthropic",
    "Vector databases enable semantic search in RAG systems"
]
rag.add_knowledge(knowledge)

# Query the system
question = "What is SimplerLLM?"
answer, sources = rag.query(question)

print(f"Answer: {answer}\n")
print("Sources used:")
for i, source in enumerate(sources, 1):
    print(f"{i}. {source['metadata']['text']}")
Performance Optimization
1. Choose the Right Index
Different index types (HNSW, IVF, etc.) trade search speed against recall and memory. Configure based on your dataset size and latency requirements.
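SimplerLLM's wrapper may not expose index tuning, but with Qdrant you can set HNSW parameters when creating the collection yourself. A sketch with qdrant-client (the parameter values are illustrative, not recommendations):

from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, HnswConfigDiff

client = QdrantClient(url="http://localhost:6333")
client.create_collection(
    collection_name="my_documents",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    hnsw_config=HnswConfigDiff(
        m=32,              # edges per node: higher = better recall, more memory
        ef_construct=256   # build-time beam width: higher = better index, slower build
    )
)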
2. Batch Operations
Use batch operations when adding or updating multiple vectors to reduce overhead.
3. Filter Early
Pass metadata filters into the search call (filter_dict in the examples above) rather than post-filtering results in Python, so the database can prune the search space.
4. Monitor Memory Usage
Vector databases can be memory-intensive. Monitor usage and scale appropriately for production.
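A back-of-envelope estimate for raw vector storage (index overhead excluded):

num_vectors = 1_000_000
dims = 1536                                # e.g., OpenAI text-embedding-3-small
bytes_total = num_vectors * dims * 4       # 4 bytes per float32
print(f"~{bytes_total / 1024**3:.1f} GB")  # ~5.7 GB before any index overhead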
Best Practices
Development vs Production
- Development: Use local vector database for rapid prototyping and testing
- Production: Use Qdrant or other production databases for scalability and reliability
- Consistency: Use the same embedding model throughout your application (a quick dimension check is sketched after this list)
- Indexing: Create appropriate indexes on metadata fields used for filtering
- Backups: Regularly backup your vector database in production environments
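One cheap guard for the consistency point above, assuming generate_embeddings returns a plain list of floats as the earlier examples suggest (expected_dim is whatever your model produces):

from SimplerLLM.language.embeddings import EmbeddingsOpenAI

expected_dim = 1536  # set to your embedding model's output dimension
embeddings = EmbeddingsOpenAI()

embedding = embeddings.generate_embeddings("dimension sanity check")
assert len(embedding) == expected_dim, (
    f"Got dim {len(embedding)}, expected {expected_dim}; did the embedding model change?"
)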
Error Handling
from SimplerLLM.vectors.vector_db import VectorDB
from SimplerLLM.language.embeddings import EmbeddingsOpenAI
embeddings = EmbeddingsOpenAI()
try:
    # Create vector database
    vector_db = VectorDB.create(
        provider='qdrant',
        embeddings_instance=embeddings,
        url='http://localhost:6333'
    )

    # Add vector
    embedding = embeddings.generate_embeddings("Sample text")
    vector_db.add(vector=embedding, metadata={'text': 'Sample text'})

    # Search
    results = vector_db.search(embedding, top_k=5)
    print(f"Found {len(results)} results")

except ConnectionError as e:
    print(f"Failed to connect to vector database: {e}")
    print("Make sure Qdrant is running on localhost:6333")
except ValueError as e:
    print(f"Invalid parameters: {e}")
except Exception as e:
    print(f"Error: {e}")
Choosing a Vector Database
Decision Guide
Use Local when:
- Prototyping or development
- Working with small datasets (<10k vectors)
- No persistence required
Use Qdrant when:
- Building production applications
- Need to scale beyond memory limits
- Require advanced filtering and search
- Need persistence and reliability
Next Steps
📚 Additional Resources
- Qdrant Documentation
- Introduction to Vector Databases
- Check provider documentation for specific configuration options