
Model API routing and provider wire formats

This document explains how the extracted @github/copilot CLI bundle decides which model API shape to call after a model is selected. It complements models-providers-auth.md, which focuses on authentication, provider configuration, model selection, and offline mode, and resilience-rate-limits-concurrency.md, which covers retries, rate limits, fallback, and concurrency.

The short version: the CLI normalizes every agent turn into internal messages and tools, then dispatches through a small set of provider adapters. The selected adapter is determined by either GitHub Copilot model metadata or BYOK/custom-provider configuration.

Executive summary

  • In the default GitHub Copilot path, the CLI lists models from the Copilot API and uses each model’s supported_endpoints metadata to choose among Anthropic Messages, OpenAI Responses, WebSocket Responses, or Chat Completions.
  • In BYOK/custom-provider mode, COPILOT_PROVIDER_TYPE chooses the provider family: openai, azure, or anthropic.
  • For openai and azure BYOK providers, COPILOT_PROVIDER_WIRE_API chooses completions or responses; the default is completions.
  • For anthropic BYOK providers, COPILOT_PROVIDER_WIRE_API is ignored and the runtime always uses Anthropic Messages.
  • Regardless of wire format, the adapter normalizes provider responses back into a ChatCompletion-like internal shape before session events, tool handling, telemetry, and UI rendering continue.

Source anchors

app.js is bundled and minified, so the semantic aliases below are analysis names, not identifiers from source. The minified names are version-specific lookup anchors for the analyzed @github/copilot 1.0.48 artifact.

| Area | Semantic alias | Minified anchor | Approx. location | Role |
| --- | --- | --- | --- | --- |
| Provider env parser | `loadCustomProviderConfigFromEnv(...)` | `pQo(...)` | app.js 6597 | Reads `COPILOT_PROVIDER_*`, `COPILOT_MODEL`, token limits, provider type, wire model, and wire API. |
| Provider config validator | `validateCustomProviderConfig(...)` | `mQo(...)` | app.js 6597 | Validates model presence, provider type, wire API, token limits, and GPT-5 response-wire recommendations. |
| Provider type default | `getProviderType(config)` | `Lcr(...)` | app.js 3437 | Defaults provider type to `openai`. |
| Wire API default | `getWireApi(config)` | `sle(...)` | app.js 3437 | Defaults wire API to `completions`. |
| Model runtime factory | `createModelRuntime(...)` | `vM(...)` | app.js 3473 | Chooses built-in Copilot, Anthropic, OpenAI, or custom provider runtime. |
| Custom provider router | `createCustomProviderRuntime(...)` | `_vs(...)` | app.js 3473 | Routes BYOK `openai`, `azure`, and `anthropic` providers to the right adapter. |
| GitHub Copilot endpoint router | `CopilotEndpointRouter` | `O3e` | app.js 3472 | Reads `supported_endpoints` and chooses Anthropic Messages, Responses, WebSocket Responses, or Chat Completions. |
| Chat Completions adapter | `ChatCompletionsAdapter` | `U3`, `M3e`, `Emt` | app.js 3439, 3472 | Calls `chat.completions.create(...)` with messages, tools, and streaming options. |
| Responses HTTP adapter | `ResponsesHttpAdapter` | `yU`, `xmt` | app.js 3460, 3470 | Calls `responses.create(...)` with instructions, input, tools, reasoning, and text config. |
| Responses WebSocket adapter | `ResponsesWebSocketAdapter` | `Pmt` | app.js 3470 | Sends a `response.create` event over a WebSocket Responses session when enabled. |
| Anthropic Messages adapter | `AnthropicMessagesAdapter` | `lle`, `vmt` | app.js 3457, 3472 | Calls `messages.create(...)` or `messages.stream(...)` with system, messages, tools, thinking, and beta headers. |
| Custom OpenAI provider | `CustomOpenAIProviderFactory` | `Amt` | app.js 3437 | Creates an OpenAI-compatible client from BYOK base URL, API key, bearer token, headers, and model metadata. |
| Azure provider | `AzureProviderFactory` | `L3e` | app.js 3472 | Creates Azure OpenAI clients using versionless `/openai/v1` or versioned `/openai` plus `api-version` and deployment routing. |
| Custom Anthropic provider | `CustomAnthropicProviderFactory` | `w3e`, `Mcr` | app.js 3437 | Creates an Anthropic client from BYOK base URL and API key or bearer token. |
| Copilot API client wrapper | `CopilotApiClientWrapper` | `Tmt`, `wmt` | app.js 3457 | Adds Copilot integration, auth, HMAC, session, interaction, feature-assignment, and API-version headers. |

High-level routing flow

```mermaid
flowchart TD
Start["Agent turn ready for model call"] --> ProviderMode{"Custom provider config?"}
ProviderMode -->|No| Copilot["GitHub Copilot model path"]
Copilot --> ListModels["List models and read supported_endpoints"]
ListModels --> EndpointChoice{"Endpoint metadata"}
EndpointChoice -->|/v1/messages| CopilotAnthropic["Anthropic Messages adapter"]
EndpointChoice -->|/responses| CopilotResponses["Responses HTTP adapter"]
EndpointChoice -->|ws:/responses and enabled| CopilotWs["Responses WebSocket adapter"]
EndpointChoice -->|fallback| CopilotChat["Chat Completions adapter"]
ProviderMode -->|Yes| Byok["BYOK custom provider path"]
Byok --> ProviderType{"COPILOT_PROVIDER_TYPE"}
ProviderType -->|anthropic| ByokAnthropic["Anthropic Messages adapter"]
ProviderType -->|azure| AzureWire{"COPILOT_PROVIDER_WIRE_API"}
ProviderType -->|openai or unset| OpenAiWire{"COPILOT_PROVIDER_WIRE_API"}
AzureWire -->|responses| AzureResponses["Azure + Responses adapter"]
AzureWire -->|completions or unset| AzureChat["Azure + Chat Completions adapter"]
OpenAiWire -->|responses| OpenAiResponses["OpenAI-compatible Responses adapter"]
OpenAiWire -->|completions or unset| OpenAiChat["OpenAI-compatible Chat Completions adapter"]
```
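The two-branch decision above can be sketched as a small routing function. This is an analysis-level reconstruction, not the bundle's code: the function and type names are invented, and the precedence of `ws:/responses` over `/responses` is an assumption consistent with the flowchart.

```typescript
// Hypothetical reconstruction of adapter routing; names are analysis
// aliases, not the minified identifiers in app.js.
type WireApi = "completions" | "responses";
type ProviderType = "openai" | "azure" | "anthropic";

interface CustomProviderConfig {
  baseUrl: string;
  providerType?: ProviderType;
  wireApi?: WireApi;
}

type Adapter =
  | "anthropic-messages"
  | "responses-http"
  | "responses-websocket"
  | "chat-completions";

function chooseAdapter(
  custom: CustomProviderConfig | undefined,
  supportedEndpoints: string[],
  websocketEnabled: boolean
): Adapter {
  if (custom) {
    // BYOK path: provider type defaults to "openai", wire API to "completions".
    const type = custom.providerType ?? "openai";
    if (type === "anthropic") return "anthropic-messages"; // wire API ignored
    return (custom.wireApi ?? "completions") === "responses"
      ? "responses-http"
      : "chat-completions";
  }
  // Default Copilot path: route on the model's supported_endpoints metadata.
  if (supportedEndpoints.includes("/v1/messages")) return "anthropic-messages";
  if (supportedEndpoints.includes("ws:/responses") && websocketEnabled)
    return "responses-websocket";
  if (supportedEndpoints.includes("/responses")) return "responses-http";
  return "chat-completions";
}
```

Note that the BYOK branch never consults endpoint metadata: provider type and wire API alone determine the adapter.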

Selection inputs

| Input | Applies to | Effect |
| --- | --- | --- |
| `--model` | All paths | Sets the requested session model; overrides `COPILOT_MODEL`. |
| `COPILOT_MODEL` | All paths | Default model when `--model` is not provided. In BYOK mode it can set both the model ID and the wire model. |
| GitHub Copilot `/models` metadata | Default Copilot path | Provides model names, capabilities, token limits, and `supported_endpoints`. |
| `COPILOT_PROVIDER_BASE_URL` | BYOK path | Activates custom-provider mode and bypasses GitHub Copilot model routing for model calls. |
| `COPILOT_PROVIDER_TYPE` | BYOK path | Chooses `openai`, `azure`, or `anthropic`; default is `openai`. |
| `COPILOT_PROVIDER_WIRE_API` | BYOK OpenAI/Azure | Chooses `completions` or `responses`; default is `completions`. |
| `COPILOT_PROVIDER_MODEL_ID` | BYOK path | Internal model identity used for token limits and model-specific behavior. |
| `COPILOT_PROVIDER_WIRE_MODEL` | BYOK path | Model or deployment name sent to the provider API. Defaults to `COPILOT_MODEL` or the model ID. |
| `COPILOT_PROVIDER_AZURE_API_VERSION` | BYOK Azure | Switches Azure from the versionless `/openai/v1` route to a versioned `/openai` route with `api-version`. |
| `COPILOT_PROVIDER_MAX_PROMPT_TOKENS`, `COPILOT_PROVIDER_MAX_OUTPUT_TOKENS` | BYOK path | Override built-in/default token limits. |
| Feature flags | Default Copilot path | Can enable WebSocket Responses and influence model-specific behavior. |

The validator warns when a GPT-5-family BYOK model uses the default Chat Completions wire API, because those models are expected to work better with COPILOT_PROVIDER_WIRE_API=responses.
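That recommendation check can be illustrated with a hedged sketch. The function name, the model-family predicate, and the message wording are assumptions based on the observed warning, not the bundle's exact logic:

```typescript
// Sketch of the validator's GPT-5 wire-API recommendation.
// Returns a warning string, or null when the configuration is fine.
function gpt5WireApiWarning(modelId: string, wireApi?: string): string | null {
  const isGpt5Family = /^gpt-5/i.test(modelId); // assumed family predicate
  const effectiveWireApi = wireApi ?? "completions"; // default wire API
  if (isGpt5Family && effectiveWireApi === "completions") {
    return "GPT-5-family models are expected to work better with COPILOT_PROVIDER_WIRE_API=responses";
  }
  return null;
}
```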

Default GitHub Copilot path

When no custom provider is configured, model calls go through the GitHub Copilot API path. The runtime builds a Copilot API client wrapper, lists models, and then chooses an adapter from the selected model’s endpoint metadata.

```mermaid
sequenceDiagram
autonumber
participant Session as Session runtime
participant Client as Copilot API client wrapper
participant Models as Copilot /models API
participant Router as Endpoint router
participant Adapter as Provider adapter
participant API as Model API endpoint
Session->>Client: create with OAuth token or HMAC key
Client->>Models: GET /models with Copilot headers
Models-->>Client: model metadata and supported_endpoints
Client-->>Router: selected model info
Router->>Router: choose /v1/messages, /responses, ws:/responses, or /chat/completions
Session->>Adapter: system prompt, conversation, tools, options
Adapter->>API: provider-specific request
API-->>Adapter: stream or response
Adapter-->>Session: normalized ChatCompletion-like result
```

The Copilot API client wrapper adds headers such as:

  • Authorization: Bearer ... for OAuth-token paths, or HMAC-related headers for HMAC paths;
  • Copilot-Integration-Id;
  • User-Agent and editor/client version headers;
  • X-GitHub-Api-Version;
  • interaction/session headers such as X-Interaction-Id, X-Interaction-Type, X-Agent-Task-Id, X-Parent-Agent-Id, and X-Client-Session-Id when context exists;
  • feature-assignment context headers when available.
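A simplified illustration of that header assembly follows. The header names come from the list above; the `RequestContext` type, the helper itself, and the literal values marked illustrative are assumptions, not extracted constants:

```typescript
// Illustrative header assembly for the Copilot API client wrapper.
interface RequestContext {
  oauthToken?: string;
  interactionId?: string;
  sessionId?: string;
}

function buildCopilotHeaders(ctx: RequestContext): Record<string, string> {
  const headers: Record<string, string> = {
    "Copilot-Integration-Id": "copilot-cli",   // illustrative value
    "X-GitHub-Api-Version": "2025-04-01",      // illustrative value
    "User-Agent": "copilot-cli/1.0.48",        // illustrative value
  };
  // OAuth-token paths get a bearer header; HMAC paths would add
  // HMAC-related headers here instead.
  if (ctx.oauthToken) headers["Authorization"] = `Bearer ${ctx.oauthToken}`;
  // Interaction/session headers are only attached when context exists.
  if (ctx.interactionId) headers["X-Interaction-Id"] = ctx.interactionId;
  if (ctx.sessionId) headers["X-Client-Session-Id"] = ctx.sessionId;
  return headers;
}
```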

Endpoint selection matrix

| Path | Selector | Endpoint tag reported internally | Request shape | Notes |
| --- | --- | --- | --- | --- |
| GitHub Copilot Anthropic Messages | Model metadata includes `/v1/messages` | `/v1/messages` | `system`, `messages`, `tools`, `tool_choice`, `thinking`, `output_config`, `max_tokens` | Uses the Anthropic Messages adapter against the Copilot API client wrapper, not necessarily the public Anthropic URL. |
| GitHub Copilot Responses HTTP | Model metadata includes `/responses` | `/responses` | `instructions`, `input`, `tools`, `parallel_tool_calls`, `reasoning`, `text`, `store: false` | Used for Responses-capable models and native file/tool-search style items. |
| GitHub Copilot Responses WebSocket | Model metadata includes `ws:/responses` and the feature path is enabled | `ws:/responses` | WebSocket `response.create` event with Responses-style fields | Used for streaming Responses when tools are present and the WebSocket path is enabled. Falls back to HTTP Responses on early failure. |
| GitHub Copilot Chat Completions | Fallback when newer endpoint metadata is absent | `/chat/completions` | `messages`, `tools`, completion options | Streaming sets `stream: true` and `stream_options.include_usage: true`. |
| BYOK OpenAI-compatible Chat | `COPILOT_PROVIDER_TYPE=openai` and `COPILOT_PROVIDER_WIRE_API=completions` or unset | `/chat/completions` | OpenAI Chat Completions-compatible payload | Default BYOK route. Works with OpenAI-compatible servers such as Ollama, vLLM, and Foundry Local when they expose a compatible `/v1` API. |
| BYOK OpenAI-compatible Responses | `COPILOT_PROVIDER_TYPE=openai` and `COPILOT_PROVIDER_WIRE_API=responses` | `/responses` | OpenAI Responses-compatible payload | Recommended by the CLI for GPT-5-family BYOK models. |
| BYOK Azure Chat | `COPILOT_PROVIDER_TYPE=azure` and `COPILOT_PROVIDER_WIRE_API=completions` or unset | `/chat/completions` | Azure OpenAI Chat Completions route | Uses versionless `/openai/v1` unless an Azure API version is set. |
| BYOK Azure Responses | `COPILOT_PROVIDER_TYPE=azure` and `COPILOT_PROVIDER_WIRE_API=responses` | `/responses` | Azure/OpenAI Responses route | Uses the Azure client wrapper with the Responses adapter. |
| BYOK Anthropic Messages | `COPILOT_PROVIDER_TYPE=anthropic` | `/v1/messages` | Anthropic Messages payload | Ignores `COPILOT_PROVIDER_WIRE_API`; uses Anthropic SDK headers including `anthropic-version`. |

Payload mapping by adapter

Chat Completions

The Chat Completions adapter keeps the system prompt as a system-role message at the front of messages.

```mermaid
flowchart LR
Internal["system prompt + conversation + tools"] --> ChatPayload["messages array"]
ChatPayload --> Options["temperature/top_p/frequency_penalty/presence_penalty"]
Options --> Tools["tools and tool_choice"]
Tools --> ChatApi["chat.completions.create"]
```

Observed request fields include:

| Field | Meaning |
| --- | --- |
| `model` | Selected model or BYOK wire model. |
| `messages` | Internal system/user/assistant/tool messages converted to chat format. |
| `tools` | Function or custom tool definitions when available. |
| `tool_choice` | Optional requested tool policy. |
| `reasoning_effort` | Used when a reasoning effort is configured for compatible models. |
| `thinking_budget` | Alternative budget-style option used by some model paths. |
| `max_tokens` | Output-token cap from provider config, model metadata, or defaults. |
| `stream` | Set to `true` for streaming mode. |
| `stream_options.include_usage` | Set for streaming usage reporting. |

The adapter returns a ChatCompletion-style object directly, with additional Copilot fields such as reasoning text, annotations, or usage preserved when present.
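As a sketch, the streaming-relevant parts of that payload can be assembled like this. The helper name is invented and the types are simplified stand-ins for the SDK's own types:

```typescript
// Minimal sketch of a Chat Completions request matching the field table above.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface ChatCompletionsRequest {
  model: string;
  messages: ChatMessage[];
  max_tokens: number;
  stream?: boolean;
  stream_options?: { include_usage: boolean };
}

function buildChatCompletionsRequest(
  model: string,
  systemPrompt: string,
  conversation: ChatMessage[],
  maxTokens: number,
  streaming: boolean
): ChatCompletionsRequest {
  const request: ChatCompletionsRequest = {
    model,
    // The system prompt stays a system-role message at the front of messages.
    messages: [{ role: "system", content: systemPrompt }, ...conversation],
    max_tokens: maxTokens,
  };
  if (streaming) {
    request.stream = true;
    request.stream_options = { include_usage: true };
  }
  return request;
}
```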

OpenAI Responses

The Responses adapter separates the system prompt from the rest of the conversation.

```mermaid
flowchart LR
Internal["internal messages"] --> Split["split first system message"]
Split --> Instructions["instructions"]
Split --> Input["input items"]
Input --> ResponseApi["responses.create"]
Instructions --> ResponseApi
Tools["tools + tool search + custom tools"] --> ResponseApi
```

Observed request fields include:

| Field | Meaning |
| --- | --- |
| `model` | Selected model or BYOK wire model. |
| `instructions` | System prompt text. |
| `input` | User, assistant, tool, function-call, custom-tool, reasoning, and output items converted to Responses input format. |
| `tools` | Responses-style tool definitions. |
| `parallel_tool_calls` | Enabled when tools are present. |
| `reasoning` | Reasoning configuration derived from effort and model settings. |
| `text` | Optional response text configuration. |
| `store` | Set to `false`. |
| `include` | Includes `reasoning.encrypted_content` in observed requests. |
| `stream` | Set for streaming HTTP Responses. |

The adapter maps Responses output items back to assistant messages, tool calls, finish reasons, and usage tokens.
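A minimal sketch of that normalization, assuming simplified item shapes (`message` items carrying `output_text` parts, and `function_call` items), might look like:

```typescript
// Hedged sketch of normalizing Responses output items into a
// ChatCompletion-like assistant message. Item shapes are simplified
// approximations of the Responses wire format.
type OutputItem =
  | { type: "message"; content: { type: "output_text"; text: string }[] }
  | { type: "function_call"; name: string; call_id: string; arguments: string };

function normalizeResponsesOutput(items: OutputItem[]) {
  let text = "";
  const toolCalls: { id: string; name: string; arguments: string }[] = [];
  for (const item of items) {
    if (item.type === "message") {
      // Concatenate output_text parts into assistant content.
      for (const part of item.content) {
        if (part.type === "output_text") text += part.text;
      }
    } else if (item.type === "function_call") {
      toolCalls.push({ id: item.call_id, name: item.name, arguments: item.arguments });
    }
  }
  return {
    role: "assistant" as const,
    content: text,
    tool_calls: toolCalls,
    // Finish reason mirrors chat-completion semantics downstream.
    finish_reason: toolCalls.length > 0 ? "tool_calls" : "stop",
  };
}
```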

WebSocket Responses

The WebSocket Responses path is a streaming optimization for Responses-capable Copilot models.

```mermaid
sequenceDiagram
autonumber
participant Session as Session runtime
participant Adapter as WebSocket Responses adapter
participant WS as WebSocket connection
participant API as Copilot Responses service
Adapter->>WS: open or reuse connection
Adapter->>API: response.create event
API-->>Adapter: response.created / deltas / output item events
API-->>Adapter: response.completed
Adapter-->>Adapter: update previous_response_id
Adapter-->>Session: normalized result
```

If the WebSocket request fails before meaningful streaming output arrives, the adapter disables that WebSocket attempt and falls back to HTTP Responses for the remaining retry path.
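The fallback policy reduces to a try/catch around the WebSocket attempt. The sketch below is synchronous for clarity (the real path is streaming and asynchronous), and the function names are invented:

```typescript
// Sketch of "fail early on WebSocket → fall back to HTTP Responses".
function callWithWebSocketFallback<T>(
  viaWebSocket: () => T,
  viaHttp: () => T
): { result: T; usedWebSocket: boolean } {
  try {
    return { result: viaWebSocket(), usedWebSocket: true };
  } catch {
    // Early WebSocket failure: disable the attempt and retry over HTTP
    // for the remainder of the retry path.
    return { result: viaHttp(), usedWebSocket: false };
  }
}
```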

Anthropic Messages

The Anthropic Messages adapter maps the system prompt into system and converts the rest into Anthropic messages.

```mermaid
flowchart LR
Internal["system prompt + conversation + tools"] --> AnthropicSplit["system + messages"]
AnthropicSplit --> Thinking["thinking / output_config"]
Thinking --> Beta["beta/cache-control/advisor headers"]
Beta --> MessagesApi["messages.create or messages.stream"]
```

Observed request fields include:

| Field | Meaning |
| --- | --- |
| `model` | Selected Claude-family model or BYOK wire model. |
| `max_tokens` | Output-token cap, adjusted for thinking budgets when needed. |
| `system` | System prompt text, sometimes augmented by advisor-system guidance. |
| `messages` | Conversation converted to Anthropic message blocks. |
| `tools` | Anthropic tool definitions; custom tools and deferred-loading metadata are handled where supported. |
| `tool_choice` | Optional tool policy. |
| `temperature` | Temperature chosen by reasoning/thinking configuration. |
| `thinking` | Anthropic thinking configuration for reasoning-capable paths. |
| `output_config` | Additional output/effort configuration for adaptive reasoning paths. |

The bundled Anthropic SDK layer also adds anthropic-version: 2023-06-01. Additional beta headers can be added for features such as cache control, deferred tools, or advisor behavior.
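A hedged sketch of the payload, including the `max_tokens` adjustment for thinking budgets mentioned above. The exact margin the CLI applies is not statically confirmed; the `+ 1024` headroom below is only an assumption reflecting that Anthropic requires `max_tokens` to exceed `thinking.budget_tokens`:

```typescript
// Simplified Anthropic Messages request builder; types stand in for the SDK's.
interface AnthropicMessage {
  role: "user" | "assistant";
  content: string;
}

interface AnthropicRequest {
  model: string;
  system: string;
  messages: AnthropicMessage[];
  max_tokens: number;
  thinking?: { type: "enabled"; budget_tokens: number };
}

function buildAnthropicRequest(
  model: string,
  system: string,
  messages: AnthropicMessage[],
  maxTokens: number,
  thinkingBudget?: number
): AnthropicRequest {
  const request: AnthropicRequest = { model, system, messages, max_tokens: maxTokens };
  if (thinkingBudget !== undefined) {
    // Keep max_tokens above the thinking budget; the 1024-token headroom
    // is an illustrative assumption, not the bundle's exact rule.
    request.max_tokens = Math.max(maxTokens, thinkingBudget + 1024);
    request.thinking = { type: "enabled", budget_tokens: thinkingBudget };
  }
  return request;
}
```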

BYOK provider details

OpenAI-compatible providers

OpenAI-compatible BYOK mode is activated by COPILOT_PROVIDER_BASE_URL with COPILOT_PROVIDER_TYPE=openai or no provider type. The factory creates an OpenAI-compatible client with the configured base URL, API key, bearer token, and custom request headers.

```mermaid
flowchart TD
Env["COPILOT_PROVIDER_BASE_URL"] --> OpenAI["OpenAI-compatible client"]
OpenAI --> Wire{"COPILOT_PROVIDER_WIRE_API"}
Wire -->|completions or unset| Chat["Chat Completions adapter"]
Wire -->|responses| Responses["Responses adapter"]
```

Use COPILOT_PROVIDER_WIRE_MODEL when the model name sent to the provider differs from the semantic model ID used by the CLI for limits and model-specific behavior.
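The fallback chain for the wire model follows from the selection-inputs table: the name sent on the wire falls back from `COPILOT_PROVIDER_WIRE_MODEL` to `COPILOT_MODEL` to the internal model ID. The env-reading helper below is hypothetical:

```typescript
// Sketch of wire-model resolution for BYOK providers.
function resolveWireModel(env: Record<string, string | undefined>): string | undefined {
  return (
    env["COPILOT_PROVIDER_WIRE_MODEL"] ?? // explicit wire/deployment name wins
    env["COPILOT_MODEL"] ??               // otherwise the session default model
    env["COPILOT_PROVIDER_MODEL_ID"]      // finally the internal model identity
  );
}
```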

Azure OpenAI providers

Azure BYOK mode is activated with COPILOT_PROVIDER_TYPE=azure. The runtime creates an Azure-aware client and then still chooses Chat Completions or Responses from COPILOT_PROVIDER_WIRE_API.

```mermaid
flowchart TD
Azure["Azure provider config"] --> Versioned{"COPILOT_PROVIDER_AZURE_API_VERSION?"}
Versioned -->|No| V1["versionless /openai/v1 route"]
Versioned -->|Yes| Legacy["/openai route + api-version query"]
V1 --> Wire{"wire API"}
Legacy --> Deployment["deployment-aware route"]
Deployment --> Wire
Wire -->|completions| AzureChat["Chat Completions"]
Wire -->|responses| AzureResponses["Responses"]
```

Observed authentication options include API key, bearer token, and Azure managed identity fallback. For the versioned Azure path, the SDK can insert /deployments/{model-or-deployment} before eligible paths such as /chat/completions.
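The two Azure routes can be reconstructed illustratively. The exact path-joining rules inside the SDK are assumptions, and the example hostname and API version are placeholders:

```typescript
// Illustrative Azure OpenAI route construction for the two paths above.
function azureRequestUrl(
  resourceBase: string, // e.g. "https://example.openai.azure.com" (placeholder)
  path: string,         // e.g. "/chat/completions"
  apiVersion?: string,
  deployment?: string
): string {
  if (!apiVersion) {
    // Versionless next-generation route: /openai/v1 + path, no api-version query.
    return `${resourceBase}/openai/v1${path}`;
  }
  // Versioned route: deployment segment inserted before eligible paths,
  // plus the api-version query parameter.
  const deploymentSegment = deployment ? `/deployments/${deployment}` : "";
  return `${resourceBase}/openai${deploymentSegment}${path}?api-version=${apiVersion}`;
}
```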

Anthropic providers

Anthropic BYOK mode is activated with COPILOT_PROVIDER_TYPE=anthropic. The runtime creates an Anthropic client using the configured base URL and API key or bearer token, then uses the Anthropic Messages adapter.

```mermaid
flowchart TD
Anthropic["Anthropic provider config"] --> Client["Anthropic SDK client"]
Client --> Messages["Anthropic Messages adapter"]
Messages --> Api["messages.create / messages.stream"]
```

The CLI warns that COPILOT_PROVIDER_WIRE_API is ignored for this provider family.

Request lifecycle

```mermaid
sequenceDiagram
autonumber
participant Agent as Agent runtime
participant Prompt as Prompt assembly
participant Adapter as Provider adapter
participant HTTP as HTTP or WebSocket client
participant Session as Session events
Agent->>Prompt: system prompt, messages, tools, attachments, options
Prompt-->>Adapter: normalized internal messages and tool definitions
Adapter->>Adapter: map to selected wire format
Adapter->>HTTP: send non-streaming or streaming request
HTTP-->>Adapter: response objects or streaming deltas
Adapter->>Adapter: normalize usage, finish reason, reasoning, and tool calls
Adapter-->>Session: model_call_success / messages / tool calls
```

All adapters feed the rest of the runtime through a common normalized output shape. That is why tool orchestration, task handling, telemetry, and UI rendering can remain mostly provider-agnostic even when the network API shape differs.

Streaming behavior

| Adapter | Streaming call | Streaming events handled |
| --- | --- | --- |
| Chat Completions | `chat.completions.create(..., stream: true)` | Choice deltas, tool-call deltas, reasoning text, annotations, usage. |
| Responses HTTP | `responses.create(..., stream: true)` | `response.created`, text deltas, reasoning summary/text deltas, function/custom tool argument deltas, output item events, `response.completed`. |
| Responses WebSocket | WebSocket `response.create` event | Same conceptual Responses event stream, plus connection state and `previous_response_id` tracking. |
| Anthropic Messages | `messages.stream(...)` | Content-block starts/stops, text deltas, thinking/reasoning deltas, tool-use deltas, message deltas, usage. |

What is not statically recoverable

Static analysis can identify routing logic and payload shapes, but several values are runtime-dependent:

  • the exact Copilot API base URL for GitHub Enterprise or debug overrides;
  • the exact model metadata returned by /models for a given user, account, policy, and feature-gate state;
  • provider-specific default headers injected by SDKs or network middleware;
  • feature flags controlling WebSocket Responses and model-specific behavior;
  • custom provider behavior behind an OpenAI-compatible endpoint;
  • final prompt content after runtime instructions, hooks, MCP, memory, tools, and session state are applied.

For exact request capture, instrument the code path immediately before the adapter sends the provider request, or capture debug/network logs in a controlled environment with secrets redacted.
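As one hedged approach, a fetch-style wrapper can log each outgoing request with secret headers redacted before delegating to the real transport. The types, function names, and redaction list below are illustrative, not taken from the bundle:

```typescript
// Illustrative request-capture wrapper for a fetch-like client transport.
interface CaptureEntry {
  url: string;
  headers: Record<string, string>;
  body?: string;
}

type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string>; body?: string }
) => Promise<unknown>;

function withRequestCapture(
  fetchImpl: FetchLike,
  log: (entry: CaptureEntry) => void
): FetchLike {
  const SECRET_HEADERS = ["authorization", "x-api-key"]; // illustrative list
  return (url, init) => {
    // Copy headers and redact secrets before logging.
    const headers: Record<string, string> = { ...(init?.headers ?? {}) };
    for (const name of Object.keys(headers)) {
      if (SECRET_HEADERS.includes(name.toLowerCase())) headers[name] = "[redacted]";
    }
    log({ url, headers, body: init?.body });
    // Delegate to the real transport with the original, unredacted init.
    return fetchImpl(url, init);
  };
}
```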