Reliable LLM
ReliableLLM provides automatic failover between primary and secondary LLM providers, ensuring your application stays operational even when one provider fails.
Why Use ReliableLLM?
LLM APIs can fail for various reasons: rate limits, network issues, temporary outages, or service disruptions. ReliableLLM automatically handles these scenarios by failing over to a backup provider, ensuring your application remains resilient.
Key Benefits
- Automatic Failover: Seamlessly switches to secondary provider if primary fails
- Built-in Retry Logic: Exponential backoff with configurable retry attempts
- Provider Validation: Validates both providers during initialization
- Transparent Operation: Returns which provider was used for debugging
- Zero Downtime: Keeps your application running even during provider issues
How It Works
ReliableLLM implements a simple yet effective failover pattern:
1. Try Primary Provider
Attempts to generate a response using the primary LLM provider
2. Automatic Failover
If the primary fails, automatically switches to the secondary provider
3. Return Response
Returns the response along with metadata about which provider was used
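To make the pattern concrete, here is a minimal, illustrative sketch of the failover loop. This is not ReliableLLM's actual implementation; it only assumes two objects exposing a generate_response(prompt=...) method that raises an exception on failure.

import time

def generate_with_failover(primary, secondary, prompt,
                           max_retries=3, initial_delay=1.0, exponential_base=2.0):
    # Try each provider in order; within a provider, retry with exponential backoff
    for provider in (primary, secondary):
        for attempt in range(max_retries):
            try:
                return provider.generate_response(prompt=prompt)
            except Exception:
                # delay = initial_delay * exponential_base ** attempt
                if attempt < max_retries - 1:
                    time.sleep(initial_delay * exponential_base ** attempt)
    raise RuntimeError("Both providers failed after all retries")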
Basic Usage
Creating a ReliableLLM instance is straightforward; all you need is two LLM providers:
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm.reliable import ReliableLLM
# Create primary and secondary LLM instances
primary_llm = LLM.create(
    provider=LLMProvider.OPENAI,
    model_name="gpt-4o"
)

secondary_llm = LLM.create(
    provider=LLMProvider.ANTHROPIC,
    model_name="claude-3-5-sonnet-20241022"
)
# Create ReliableLLM with automatic failover
reliable_llm = ReliableLLM(primary_llm, secondary_llm)
# Generate response - automatically fails over if needed
response = reliable_llm.generate_response(
    prompt="Explain machine learning in simple terms"
)
print(response)
How This Helps
If OpenAI is down or rate-limited, ReliableLLM automatically switches to Anthropic Claude without your code needing to handle the error. Your users won't experience downtime.
Configuration Options
ReliableLLM supports several configuration options to customize its behavior:
With Retry Configuration
reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    max_retries=3,          # Number of retry attempts
    initial_delay=1.0,      # Initial delay between retries (seconds)
    exponential_base=2.0    # Exponential backoff multiplier
)
With Validation
# Validate both providers during initialization
reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    validate_on_init=True  # Test both providers at creation time
)
Parameter Reference
primary_llm (required)
The primary LLM instance to use first
secondary_llm (required)
The backup LLM instance to use if primary fails
max_retries (optional, default: 3)
Maximum number of retry attempts per provider
initial_delay (optional, default: 1.0)
Initial delay in seconds before first retry
exponential_base (optional, default: 2.0)
Multiplier for exponential backoff (delay = initial_delay * base^attempt); a worked example follows this list
validate_on_init (optional, default: False)
Test both providers with a simple request during initialization
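For intuition, here is a quick worked example of the backoff formula with the default settings, assuming retry attempts are counted from zero:

initial_delay, exponential_base, max_retries = 1.0, 2.0, 3
delays = [initial_delay * exponential_base ** attempt for attempt in range(max_retries)]
print(delays)  # [1.0, 2.0, 4.0] -> wait 1s, then 2s, then 4s before giving up on that provider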
Advanced Usage
Getting Provider Information
With return_metadata=True, ReliableLLM also returns which provider and model served the request:
response, provider_used, model_used = reliable_llm.generate_response(
    prompt="What is quantum computing?",
    return_metadata=True
)
print(f"Response: {response}")
print(f"Provider: {provider_used.name}") # e.g., "OPENAI"
print(f"Model: {model_used}") # e.g., "gpt-4o"
Multiple Failover Providers
You can chain multiple ReliableLLM instances for multi-level failover:
# Create three different providers
openai_llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")
anthropic_llm = LLM.create(provider=LLMProvider.ANTHROPIC, model_name="claude-3-5-sonnet-20241022")
gemini_llm = LLM.create(provider=LLMProvider.GEMINI, model_name="gemini-1.5-pro")
# First level: OpenAI -> Anthropic
reliable_llm_1 = ReliableLLM(openai_llm, anthropic_llm)
# Second level: (OpenAI/Anthropic) -> Gemini
reliable_llm_2 = ReliableLLM(reliable_llm_1, gemini_llm)
# Will try OpenAI -> Anthropic -> Gemini
response = reliable_llm_2.generate_response(
    prompt="Explain neural networks"
)
Pro Tip
Use providers with different strengths and pricing models. For example: OpenAI (fast, expensive) → Anthropic (balanced) → Gemini (cost-effective) → Ollama (free, local).
With Custom Retry Logic
# Aggressive retry strategy
reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    max_retries=5,          # More retry attempts
    initial_delay=0.5,      # Shorter initial delay
    exponential_base=1.5    # Slower backoff growth
)
# Conservative retry strategy
reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    max_retries=2,          # Fewer retries
    initial_delay=2.0,      # Longer initial delay
    exponential_base=3.0    # Faster backoff growth
)
Use Cases
Production Applications
Ensure your production services remain available even when one LLM provider experiences issues. Critical for customer-facing applications.
Cost Optimization
Use an expensive, high-quality model as primary and a cheaper model as backup. Most requests use the primary, but you have a fallback if needed.
Rate Limit Management
Automatically switch to a secondary provider when hitting rate limits on the primary, distributing load across multiple providers.
Hybrid Cloud/Local Deployment
Use cloud providers as primary and local models (Ollama) as fallback for privacy-sensitive operations or offline scenarios.
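As a sketch of that last setup, assuming your SimplerLLM version exposes LLMProvider.OLLAMA and an Ollama server is running locally (the model name here is illustrative):

cloud_llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")
local_llm = LLM.create(provider=LLMProvider.OLLAMA, model_name="llama3.1")  # assumed local model

reliable_llm = ReliableLLM(cloud_llm, local_llm)  # cloud first, local fallback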
Complete Example: Production Setup
Here's a production-ready example with logging and error handling:
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm.reliable import ReliableLLM
import logging
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def create_reliable_llm():
    """Create a production-ready ReliableLLM instance"""
    try:
        # Primary: OpenAI GPT-4o (high quality)
        primary_llm = LLM.create(
            provider=LLMProvider.OPENAI,
            model_name="gpt-4o",
            temperature=0.7
        )
        logger.info("Primary LLM (OpenAI) initialized")

        # Secondary: Anthropic Claude (reliable backup)
        secondary_llm = LLM.create(
            provider=LLMProvider.ANTHROPIC,
            model_name="claude-3-5-sonnet-20241022",
            temperature=0.7
        )
        logger.info("Secondary LLM (Anthropic) initialized")

        # Create ReliableLLM with validation
        reliable_llm = ReliableLLM(
            primary_llm,
            secondary_llm,
            max_retries=3,
            initial_delay=1.0,
            exponential_base=2.0,
            validate_on_init=True
        )
        logger.info("ReliableLLM initialized successfully")
        return reliable_llm

    except Exception as e:
        logger.error(f"Failed to initialize ReliableLLM: {e}")
        raise

def generate_with_fallback(reliable_llm, prompt):
    """Generate a response with automatic failover"""
    try:
        response, provider, model = reliable_llm.generate_response(
            prompt=prompt,
            return_metadata=True
        )
        logger.info(f"Successfully generated response using {provider.name}/{model}")
        return response
    except Exception as e:
        logger.error(f"All providers failed: {e}")
        return None

# Usage
if __name__ == "__main__":
    # Initialize ReliableLLM
    reliable_llm = create_reliable_llm()

    # Generate responses
    prompts = [
        "Explain machine learning",
        "What is cloud computing?",
        "Describe the benefits of microservices"
    ]

    for prompt in prompts:
        response = generate_with_fallback(reliable_llm, prompt)
        if response:
            print(f"\nPrompt: {prompt}")
            print(f"Response: {response[:200]}...")
        else:
            print(f"\nFailed to generate response for: {prompt}")
Error Handling
ReliableLLM handles errors automatically, but you should still implement proper error handling:
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm.reliable import ReliableLLM
try:
    # Create ReliableLLM
    primary = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")
    secondary = LLM.create(provider=LLMProvider.ANTHROPIC, model_name="claude-3-5-sonnet-20241022")
    reliable_llm = ReliableLLM(primary, secondary)

    # Generate response
    response = reliable_llm.generate_response(
        prompt="Your prompt here"
    )
    print(f"Success: {response}")

except ValueError as e:
    # Configuration errors (invalid models, missing API keys, etc.)
    print(f"Configuration error: {e}")

except Exception as e:
    # Both providers failed
    print(f"All providers failed: {e}")
    # Implement your fallback strategy here:
    # - Queue for later processing
    # - Return a cached response
    # - Show a user-friendly error message
When Both Providers Fail
If both primary and secondary fail after all retries, consider these strategies:
- Queue the request for later processing
- Return a cached or default response (see the sketch after this list)
- Show a user-friendly error message
- Log the failure for monitoring and alerting
- Try a third provider (if configured)
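As one illustration, here is a minimal sketch of the cached-response strategy. The in-memory dict and the generate_or_cached helper are just for demonstration; swap in Redis, a database, or whatever store you already run.

response_cache = {}

def generate_or_cached(reliable_llm, prompt):
    try:
        response = reliable_llm.generate_response(prompt=prompt)
        response_cache[prompt] = response  # remember the last good answer
        return response
    except Exception:
        # Both providers failed after all retries: fall back to the cache
        return response_cache.get(prompt, "Service temporarily unavailable. Please try again later.")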
Best Practices
1. Choose Complementary Providers
Select providers with different strengths and infrastructure to maximize reliability
2. Use Similar Model Capabilities
Ensure primary and secondary models have comparable capabilities to maintain consistent output quality
3. Monitor Provider Usage
Track which provider is being used to identify patterns and potential issues (see the sketch after this list)
4. Test Failover Regularly
Periodically test your failover mechanism to ensure it works when needed
5. Configure Appropriate Retry Settings
Balance between retry attempts and response time based on your use case
6. Implement Monitoring and Alerts
Set up monitoring to alert when failover occurs frequently, indicating primary provider issues
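For best practice 3, a lightweight way to track provider usage is to count the metadata returned by generate_response. This is only a sketch (the generate_and_track helper is illustrative); wire the counter into whatever metrics or alerting system you already use.

from collections import Counter

provider_usage = Counter()

def generate_and_track(reliable_llm, prompt):
    response, provider, model = reliable_llm.generate_response(
        prompt=prompt,
        return_metadata=True
    )
    provider_usage[provider.name] += 1  # e.g. Counter({"OPENAI": 95, "ANTHROPIC": 5})
    return response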
What's Next?
- Structured Output → Generate validated JSON responses with Pydantic
- Async Support → Use async/await for better performance