Routing and Fallbacks

MultiRoute sits between your application and multiple AI model providers, routing each request to an appropriate model while handling timeouts, retries, and failover on your behalf.

This guide explains the core concepts so you can design for reliability without hard-coding provider-specific logic into your app.

Routing Basics

When your application sends a request to a /v1 endpoint, MultiRoute:

  1. Identifies the project and API key.
  2. Looks up the active routing configuration for that project and endpoint.
  3. Selects a model and provider according to your configured priorities and weights.
  4. Applies timeouts and retries as needed.
  5. Returns the response (or an error) to your application.
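
Step 3 above can be pictured with a minimal sketch of priority-and-weight selection. This is not MultiRoute's actual implementation: the `Route` type, its field names, and the convention that a lower priority number wins are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One candidate target. All field names here are hypothetical."""
    provider: str
    model: str
    priority: int   # assumption: lower number = preferred tier
    weight: float   # relative share of traffic within its tier


def select_route(routes, rng=random):
    """Pick a route from the best priority tier, weighted by `weight`."""
    if not routes:
        raise ValueError("no routes configured")
    # Keep only the routes in the most preferred tier.
    best = min(r.priority for r in routes)
    tier = [r for r in routes if r.priority == best]
    # Weighted random choice among the routes in that tier.
    return rng.choices(tier, weights=[r.weight for r in tier], k=1)[0]
```

Under these assumptions, lower-priority routes receive no traffic while a preferred tier exists; they only become candidates once the preferred routes are removed from the list, for example after being marked unhealthy.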

Routing decisions can depend on:

Model Selection Strategies

Common strategies for choosing models include:

These strategies are expressed in configuration (via /v1/config or the dashboard routing UI), not in application code. This keeps your application logic simple and makes it easy to change models over time.

Timeouts

Every request to an underlying provider is subject to a timeout. The timeout limits how long MultiRoute will wait for a response before treating the attempt as failed.
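
To illustrate what "treating the attempt as failed" means, here is a minimal sketch of a per-attempt timeout. MultiRoute applies timeouts for you; the names `call_with_timeout` and `AttemptTimedOut` are hypothetical and exist only to show the concept.

```python
import concurrent.futures


class AttemptTimedOut(Exception):
    """Raised when one attempt exceeds its timeout (hypothetical name)."""


def call_with_timeout(fn, timeout_s):
    """Run fn() and give up waiting after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The attempt counts as failed even though the underlying call
        # may still finish later in its worker thread.
        raise AttemptTimedOut(f"no response within {timeout_s}s") from None
    finally:
        # Do not block on the worker; we have already stopped waiting.
        pool.shutdown(wait=False)
```

Note that giving up on an attempt does not necessarily cancel the work on the provider side, which is one reason timed-out requests can still have effects.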

Timeouts can be:

Choosing timeouts involves a tradeoff between:

In general:

Retries

When a request to a provider fails with a retryable error (such as a transient network issue), MultiRoute may retry the request according to your configured policy.

Retries are usually:

Retries add resilience, but also:

Use retries primarily for transient failures. They are not a substitute for fixing invalid requests or resolving quota limits, which will fail again on every attempt.
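
The distinction between retryable and non-retryable errors can be sketched as follows. This is an illustrative retry loop with exponential backoff, not MultiRoute's actual policy engine; the exception classes, parameter names, and default values are assumptions.

```python
import time


class RetryableError(Exception):
    """Transient failure, e.g. a dropped connection (hypothetical class)."""


class NonRetryableError(Exception):
    """Permanent failure, e.g. an invalid request (hypothetical class)."""


def call_with_retries(fn, max_attempts=3, base_delay_s=0.1, sleep=time.sleep):
    """Call fn(), retrying only on RetryableError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RetryableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait 0.1s, 0.2s, 0.4s, ... before the next attempt.
            sleep(base_delay_s * 2 ** (attempt - 1))
        # Any other exception, including NonRetryableError, propagates
        # immediately because retrying would not help.
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.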

Failover and Fallbacks

If a provider or model is repeatedly failing or unavailable, MultiRoute can fail over to an alternate model according to your routing configuration.

Fallback behavior typically includes:

You control:

For critical workloads, configure at least one viable fallback model so that an outage at a single provider does not interrupt your service.
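
Conceptually, failover walks an ordered list of targets until one succeeds. The sketch below shows that idea under stated assumptions: the function and exception names are hypothetical, and MultiRoute's real behavior is driven by your routing configuration rather than by code like this.

```python
class ProviderUnavailable(Exception):
    """A target could not serve the request (hypothetical class)."""


class AllTargetsFailed(Exception):
    """Every configured target failed; carries the per-target errors."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} target(s) failed")
        self.errors = errors


def call_with_fallbacks(targets, request):
    """Try each (name, callable) pair in order; return the first success."""
    errors = []
    for name, call in targets:
        try:
            # Report which target answered, which is useful for logging.
            return name, call(request)
        except ProviderUnavailable as exc:
            errors.append((name, exc))
    raise AllTargetsFailed(errors)
```

With only one target in the list there is nothing to fall back to, which is why at least one alternate is worth configuring for critical paths.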

Consistency and Idempotency

Retries and failovers can result in the same logical request being sent more than once, possibly to different providers. To make this safe:

MultiRoute focuses on maximizing availability and responsiveness; your application should account for this when building workflows that must be strictly consistent.
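
One common application-side technique is to guard side effects with an idempotency key, so a duplicated request does not repeat the effect. The sketch below is an assumption about how your application might do this, not a MultiRoute feature; `OnceStore` and `run_once` are hypothetical names, and the in-memory dictionary stands in for a durable store.

```python
import uuid


class OnceStore:
    """Remembers results by idempotency key (in-memory, single-threaded)."""

    def __init__(self):
        self._results = {}

    def run_once(self, key, action):
        """Run action() the first time a key is seen; replay it afterwards."""
        if key in self._results:
            return self._results[key]
        result = action()
        self._results[key] = result
        return result


def new_idempotency_key():
    """Generate one key per logical request, reused across its retries."""
    return str(uuid.uuid4())
```

The key must be created once per logical request and reused for every retry of it; generating a fresh key per attempt would defeat the purpose.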

Observing Routing Behavior

To understand how routing and fallbacks behave in practice:

Combining clear routing configuration with good observability gives you a predictable and debuggable system, even as you evolve your model mix over time.