Models and reliability
Models and reliability
Authentication, provider selection, wire APIs, resilience, rate limits, usage metrics, quota, and billing.
How this volume fits
flowchart TD Auth[Auth/providers] --> Routing[Model API routing] Routing --> Usage[Usage/quota metrics] Routing --> Resilience[Retries/rate limits/concurrency] Resilience --> Fallback[Fallback and cancellation]Pages
| Page | Why read it | File |
|---|---|---|
| Models, providers, and authentication workflows | Auth manager, login, GitHub tokens, BYOK/custom providers, offline mode, model selection, and effort. | models-providers-auth.md |
| Model API routing and provider wire formats | Routing to Chat Completions, Responses, WebSocket Responses, and Anthropic Messages APIs. | model-api-routing.md |
| Rate limits, concurrency, retries, and error recovery | Retry policy, rate-limit recovery, auto-mode switching, queue pauses, concurrency limits, fallback, and cancellation. | resilience-rate-limits-concurrency.md |
| Usage, quota, and billing metrics | /usage, assistant.usage, session.usage_info, premium/AI-unit metrics, token details, and billing/quota errors. | usage-quota-billing-metrics.md |
Reading guidance
- Auth/provider selection decides where calls go.
- Routing, usage, and resilience describe the lifecycle of each model call.