Reliable LLM

ReliableLLM provides automatic failover between primary and secondary LLM providers, ensuring your application stays operational even when one provider fails.

Why Use ReliableLLM?

LLM APIs can fail for various reasons: rate limits, network issues, temporary outages, or service disruptions. ReliableLLM automatically handles these scenarios by failing over to a backup provider, ensuring your application remains resilient.

Key Benefits

  • Automatic Failover: Seamlessly switches to secondary provider if primary fails
  • Built-in Retry Logic: Exponential backoff with configurable retry attempts
  • Provider Validation: Validates both providers during initialization
  • Transparent Operation: Returns which provider was used for debugging
  • Zero Downtime: Keeps your application running even during provider issues

How It Works

ReliableLLM implements a simple yet effective failover pattern (a conceptual sketch follows the steps below):

1. Try Primary Provider

Attempts to generate a response using the primary LLM provider

2. Automatic Failover

If the primary fails, automatically switches to the secondary provider

3. Return Response

Returns the response along with metadata about which provider was used
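
The whole pattern can be pictured as a try/except around two providers. The sketch below is conceptual only, not SimplerLLM's actual implementation; it simply mirrors the generate_response call documented on this page.

# Conceptual sketch of the failover pattern (not the library's source code)
def generate_with_failover(primary, secondary, prompt):
    try:
        # Try the primary provider first.
        return primary.generate_response(prompt=prompt), "primary"
    except Exception:
        # On any failure (rate limit, outage, network error), fail over.
        return secondary.generate_response(prompt=prompt), "secondary"

# The caller gets the response plus a label for which provider produced it.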

Basic Usage

Creating a ReliableLLM instance is straightforward; you just need two LLM providers:

from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm.reliable import ReliableLLM

# Create primary and secondary LLM instances
primary_llm = LLM.create(
    provider=LLMProvider.OPENAI,
    model_name="gpt-4o"
)

secondary_llm = LLM.create(
    provider=LLMProvider.ANTHROPIC,
    model_name="claude-3-5-sonnet-20241022"
)

# Create ReliableLLM with automatic failover
reliable_llm = ReliableLLM(primary_llm, secondary_llm)

# Generate response - automatically fails over if needed
response = reliable_llm.generate_response(
    prompt="Explain machine learning in simple terms"
)

print(response)

How This Helps

If OpenAI is down or rate-limited, ReliableLLM automatically switches to Anthropic Claude without your code needing to handle the error. Your users won't experience downtime.

Configuration Options

ReliableLLM supports several configuration options to customize its behavior:

With Retry Configuration

reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    max_retries=3,           # Number of retry attempts
    initial_delay=1.0,       # Initial delay between retries (seconds)
    exponential_base=2.0     # Exponential backoff multiplier
)

With Validation

# Validate both providers during initialization
reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    validate_on_init=True    # Test both providers at creation time
)

Parameter Reference

primary_llm (required)

The primary LLM instance to use first

secondary_llm (required)

The backup LLM instance to use if primary fails

max_retries (optional, default: 3)

Maximum number of retry attempts per provider

initial_delay (optional, default: 1.0)

Initial delay in seconds before first retry

exponential_base (optional, default: 2.0)

Multiplier for exponential backoff (delay = initial_delay * base^attempt); a short sketch of the resulting delay schedule follows this parameter reference

validate_on_init (optional, default: False)

Test both providers with a simple request during initialization
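
To make the backoff formula concrete, this small sketch prints the delay schedule implied by the default settings. The numbers are computed directly from the formula above, not output captured from the library.

# Illustrative only: delays implied by
# delay = initial_delay * exponential_base ** attempt
initial_delay = 1.0       # defaults from the parameter reference above
exponential_base = 2.0
max_retries = 3

for attempt in range(max_retries):
    delay = initial_delay * exponential_base ** attempt
    print(f"Retry {attempt + 1}: wait {delay:.1f}s")   # 1.0s, 2.0s, 4.0s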

Advanced Usage

Getting Provider Information

ReliableLLM returns metadata about which provider was used:

response, provider_used, model_used = reliable_llm.generate_response(
    prompt="What is quantum computing?",
    return_metadata=True
)

print(f"Response: {response}")
print(f"Provider: {provider_used.name}")  # e.g., "OPENAI"
print(f"Model: {model_used}")              # e.g., "gpt-4o"

Multiple Failover Providers

You can chain multiple ReliableLLM instances for multi-level failover:

# Create three different providers
openai_llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")
anthropic_llm = LLM.create(provider=LLMProvider.ANTHROPIC, model_name="claude-3-5-sonnet-20241022")
gemini_llm = LLM.create(provider=LLMProvider.GEMINI, model_name="gemini-1.5-pro")

# First level: OpenAI -> Anthropic
reliable_llm_1 = ReliableLLM(openai_llm, anthropic_llm)

# Second level: (OpenAI/Anthropic) -> Gemini
reliable_llm_2 = ReliableLLM(reliable_llm_1, gemini_llm)

# Will try OpenAI -> Anthropic -> Gemini
response = reliable_llm_2.generate_response(
    prompt="Explain neural networks"
)

Pro Tip

Use providers with different strengths and pricing models. For example: OpenAI (fast, expensive) → Anthropic (balanced) → Gemini (cost-effective) → Ollama (free, local).

With Custom Retry Logic

# Aggressive retry strategy
reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    max_retries=5,          # More retry attempts
    initial_delay=0.5,      # Shorter initial delay
    exponential_base=1.5    # Slower backoff growth
)

# Conservative retry strategy
reliable_llm = ReliableLLM(
    primary_llm,
    secondary_llm,
    max_retries=2,          # Fewer retries
    initial_delay=2.0,      # Longer initial delay
    exponential_base=3.0    # Faster backoff growth
)

Use Cases

Production Applications

Ensure your production services remain available even when one LLM provider experiences issues. Critical for customer-facing applications.

Cost Optimization

Use an expensive, high-quality model as primary and a cheaper model as backup. Most requests use the primary, but you have a fallback if needed.

Rate Limit Management

Automatically switch to a secondary provider when hitting rate limits on the primary, distributing load across multiple providers.

Hybrid Cloud/Local Deployment

Use cloud providers as primary and local models (Ollama) as fallback for privacy-sensitive operations or offline scenarios.
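
The hybrid setup looks just like the basic example, with a local model as the secondary. This is a sketch that assumes your installed SimplerLLM version exposes LLMProvider.OLLAMA and that a model such as llama3 is already pulled locally.

from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm.reliable import ReliableLLM

# Cloud primary, local fallback (assumes LLMProvider.OLLAMA is available
# in your installed version and that the "llama3" model is pulled locally).
cloud_llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")
local_llm = LLM.create(provider=LLMProvider.OLLAMA, model_name="llama3")

reliable_llm = ReliableLLM(cloud_llm, local_llm)
response = reliable_llm.generate_response(prompt="Summarize this internal report")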

Complete Example: Production Setup

Here's a production-ready example with logging and error handling:

from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm.reliable import ReliableLLM
import logging

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def create_reliable_llm():
    """Create a production-ready ReliableLLM instance"""
    try:
        # Primary: OpenAI GPT-4 (high quality)
        primary_llm = LLM.create(
            provider=LLMProvider.OPENAI,
            model_name="gpt-4o",
            temperature=0.7
        )
        logger.info("Primary LLM (OpenAI) initialized")

        # Secondary: Anthropic Claude (reliable backup)
        secondary_llm = LLM.create(
            provider=LLMProvider.ANTHROPIC,
            model_name="claude-3-5-sonnet-20241022",
            temperature=0.7
        )
        logger.info("Secondary LLM (Anthropic) initialized")

        # Create ReliableLLM with validation
        reliable_llm = ReliableLLM(
            primary_llm,
            secondary_llm,
            max_retries=3,
            initial_delay=1.0,
            exponential_base=2.0,
            validate_on_init=True
        )
        logger.info("ReliableLLM initialized successfully")

        return reliable_llm

    except Exception as e:
        logger.error(f"Failed to initialize ReliableLLM: {e}")
        raise

def generate_with_fallback(reliable_llm, prompt):
    """Generate response with automatic failover"""
    try:
        response, provider, model = reliable_llm.generate_response(
            prompt=prompt,
            return_metadata=True
        )

        logger.info(f"Successfully generated response using {provider.name}/{model}")
        return response

    except Exception as e:
        logger.error(f"All providers failed: {e}")
        return None

# Usage
if __name__ == "__main__":
    # Initialize ReliableLLM
    reliable_llm = create_reliable_llm()

    # Generate responses
    prompts = [
        "Explain machine learning",
        "What is cloud computing?",
        "Describe the benefits of microservices"
    ]

    for prompt in prompts:
        response = generate_with_fallback(reliable_llm, prompt)
        if response:
            print(f"\nPrompt: {prompt}")
            print(f"Response: {response[:200]}...")
        else:
            print(f"\nFailed to generate response for: {prompt}")

Error Handling

ReliableLLM handles errors automatically, but you should still implement proper error handling:

from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm.reliable import ReliableLLM

try:
    # Create ReliableLLM
    primary = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")
    secondary = LLM.create(provider=LLMProvider.ANTHROPIC, model_name="claude-3-5-sonnet-20241022")

    reliable_llm = ReliableLLM(primary, secondary)

    # Generate response
    response = reliable_llm.generate_response(
        prompt="Your prompt here"
    )

    print(f"Success: {response}")

except ValueError as e:
    # Configuration errors (invalid models, missing API keys, etc.)
    print(f"Configuration error: {e}")

except Exception as e:
    # Both providers failed
    print(f"All providers failed: {e}")
    # Implement your fallback strategy here
    # - Queue for later processing
    # - Return cached response
    # - Show user-friendly error message

When Both Providers Fail

If both the primary and secondary providers fail after all retries, consider these strategies (a sketch combining the caching and logging approaches follows the list):

  • Queue the request for later processing
  • Return a cached or default response
  • Show a user-friendly error message
  • Log the failure for monitoring and alerting
  • Try a third provider (if configured)
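
A minimal sketch of the caching and logging strategies combined, written as a thin wrapper around generate_response; the in-memory dict is a placeholder you would swap for your own cache or queue.

import logging

logger = logging.getLogger(__name__)
response_cache = {}  # placeholder; use a real cache or store in production

def generate_or_cached(reliable_llm, prompt, default="Service is busy, please try again later."):
    try:
        response = reliable_llm.generate_response(prompt=prompt)
        response_cache[prompt] = response           # remember the last good answer
        return response
    except Exception as e:
        logger.error(f"All providers failed: {e}")  # log for monitoring and alerting
        return response_cache.get(prompt, default)  # cached answer or friendly default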

Best Practices

1. Choose Complementary Providers

Select providers with different strengths and infrastructure to maximize reliability

2. Use Similar Model Capabilities

Ensure primary and secondary models have comparable capabilities to maintain consistent output quality

3. Monitor Provider Usage

Track which provider is being used to identify patterns and potential issues (a tracking sketch follows these practices)

4. Test Failover Regularly

Periodically test your failover mechanism to ensure it works when needed

5. Configure Appropriate Retry Settings

Balance between retry attempts and response time based on your use case

6. Implement Monitoring and Alerts

Set up monitoring to alert when failover occurs frequently, indicating primary provider issues
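
For practices 3 and 6, a simple counter around the metadata-returning call shown under Advanced Usage is often enough to spot frequent failover. This is a minimal sketch; the reporting could feed a dashboard or alerting system instead of a print.

from collections import Counter

provider_usage = Counter()

def tracked_generate(reliable_llm, prompt):
    # Uses the return_metadata option documented under Advanced Usage.
    response, provider, model = reliable_llm.generate_response(
        prompt=prompt,
        return_metadata=True
    )
    provider_usage[provider.name] += 1  # e.g. {"OPENAI": 42, "ANTHROPIC": 3}
    return response

# Inspect the counts periodically; a growing share of secondary-provider
# responses usually means the primary provider needs attention.
print(dict(provider_usage))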

Need More Help?

Check out our full documentation, join the Discord community, or browse example code on GitHub.