fix(llm): route non-OpenAI Azure deployments via Chat Completions by akirykouski · Pull Request #29775 · anomalyco/opencode

akirykouski · 2026-05-28T17:38:18Z

Issue for this PR

Also likely fixes #12879 (Kimi K2.5 on Azure Foundry rejected with role: "developer"). It has the same root cause: Azure partner deployments are routed to the Responses API instead of Chat Completions. The visible symptom differs: the Responses API emits role: "developer" for system instructions, which Azure Foundry's chat-completions endpoint rejects. After this change those models route through Chat Completions and emit role: "system". Marking as "likely fixes" rather than "closes" because I reproduced only the truncation symptom; the role-validation symptom would need confirmation from someone with a Kimi K2.5 deployment.

Related: #20078 (LM Studio case has the same shape — limit.output ignored — but a different code path; this PR is the Azure-specific half).

Type of change

Bug fix
New feature
Refactor / code improvement
Documentation

What does this PR do?

Azure AI Foundry hosts two distinct families of model deployments:

OpenAI-native (gpt-*, o-series) which support the Responses API
Partner deployments (DeepSeek, Kimi, Llama, etc.) which only support Chat Completions

packages/llm/src/providers/azure.ts currently routes everything through Responses by default. For partner deployments the request gets accepted at the network layer but two things break:

1. max_output_tokens is silently dropped during Azure's internal Responses→Chat translation. The underlying chat call uses its default of 4096, and the response comes back with finish_reason: "length" regardless of what limit.output the user set. This is the truncation symptom; see #29776 ("Azure AI Foundry partner deployments (DeepSeek/Kimi/Llama) capped at 4096 output tokens").
2. System instructions get emitted with role: "developer" (a Responses-API convention). Azure Foundry's chat-completions endpoint only accepts system | user | assistant | tool, so it rejects the whole request with a 422; see #12879 ("Bad request when using Kimi K2.5 on Azure Foundry").
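To make the two failure modes concrete, here is a minimal sketch of how one logical request might be shaped for each API. The `LogicalRequest` type and both builder functions are hypothetical and are not taken from packages/llm; only the wire field names (`input`, `max_output_tokens`, `messages`, `max_tokens`) and the role values follow the public OpenAI-style request formats described above.

```typescript
// Hypothetical, simplified request model used only for this illustration.
interface LogicalRequest {
  model: string;
  system: string;
  prompt: string;
  maxOutput: number; // corresponds to the user's limit.output
}

// Responses-style body: system text travels as role "developer",
// and the output cap is named max_output_tokens.
function toResponsesBody(req: LogicalRequest) {
  return {
    model: req.model,
    input: [
      { role: "developer", content: req.system },
      { role: "user", content: req.prompt },
    ],
    max_output_tokens: req.maxOutput,
  };
}

// Chat Completions-style body: system text travels as role "system",
// and the output cap is named max_tokens.
function toChatBody(req: LogicalRequest) {
  return {
    model: req.model,
    messages: [
      { role: "system", content: req.system },
      { role: "user", content: req.prompt },
    ],
    max_tokens: req.maxOutput,
  };
}

const example: LogicalRequest = {
  model: "DeepSeek-V4-Pro",
  system: "You are a helpful assistant.",
  prompt: "Summarize this file.",
  maxOutput: 16384,
};

console.log(JSON.stringify(toResponsesBody(example), null, 2));
console.log(JSON.stringify(toChatBody(example), null, 2));
```

If the description above is right, a partner deployment only understands the second shape, which would explain both the ignored output cap and the rejected "developer" role.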

Concrete repro of (1) on DeepSeek-V4-Pro (Azure AI Foundry, limit.output: 16384):

| Request path | Actual `tokens.output` | `finish` |
| --- | --- | --- |
| direct curl `/chat/completions` with `max_tokens: 32000` | 14001 | `stop` |
| opencode (default Responses routing) | 4096 | `length` |
| opencode after this change | 14001 | `stop` |

The fix auto-detects by model id: gpt-* / o1-* / o3-* / o4-* go through Responses; everything else uses Chat. useCompletionUrls: true | false remains as an explicit override either direction, so existing configs aren't affected.
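The routing rule described above can be sketched roughly as follows. This is an illustration, not the code in packages/llm/src/providers/azure.ts: `selectAzureApi`, `AzureRoutingOptions`, and the regular expression are hypothetical names, the mapping of `useCompletionUrls: true` to Chat Completions is inferred from the option's name, and case-insensitive matching is an assumption.

```typescript
// Hypothetical names throughout; a sketch of the described behavior only.
type AzureApi = "responses" | "chat";

interface AzureRoutingOptions {
  // Explicit override. Assumed meaning: true forces Chat Completions,
  // false forces the Responses API, undefined means auto-detect.
  useCompletionUrls?: boolean;
}

// Prefixes the PR description lists as OpenAI-native: gpt-*, o1-*, o3-*, o4-*.
// Case-insensitive matching is an assumption of this sketch.
const OPENAI_NATIVE_ID = /^(gpt-|o1-|o3-|o4-)/i;

function selectAzureApi(
  modelId: string,
  options: AzureRoutingOptions = {},
): AzureApi {
  // An explicit override wins in either direction.
  if (options.useCompletionUrls !== undefined) {
    return options.useCompletionUrls ? "chat" : "responses";
  }
  // Otherwise auto-detect from the model id.
  return OPENAI_NATIVE_ID.test(modelId) ? "responses" : "chat";
}

console.log(selectAzureApi("gpt-4o"));
console.log(selectAzureApi("DeepSeek-V4-Pro"));
console.log(selectAzureApi("DeepSeek-V4-Pro", { useCompletionUrls: false }));
```

One design consequence of prefix matching is that a deployment whose id does not start with one of the listed prefixes (for example a custom deployment name for an OpenAI model) would default to Chat Completions unless the override is set.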

How did you verify your code works?

bun test packages/llm/test/provider/ — 150 pass, 0 fail (new test file packages/llm/test/provider/azure.test.ts covers default routing for OpenAI-native ids, default routing for partner ids, o-series, and both useCompletionUrls overrides)
bun --cwd packages/llm run typecheck — clean
Reproduced the truncation symptom (#29776, "Azure AI Foundry partner deployments (DeepSeek/Kimi/Llama) capped at 4096 output tokens") on azure/DeepSeek-V4-Pro and confirmed the fix lifts the 4096 cap: the model now emits about 14k tokens with finish: stop
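For anyone re-checking the truncation symptom on their own deployment, a small helper along these lines could flag results that look capped. The `CompletionResult` shape and the `looksCapped` function are hypothetical; the 4096 default and the sample values come from the repro numbers in this PR description.

```typescript
// Hypothetical result shape; field names are chosen for this sketch only.
interface CompletionResult {
  finish: string; // e.g. "stop" or "length"
  outputTokens: number;
}

// A result looks capped when generation stopped for length at exactly the
// suspected default cap, even though the configured limit is higher.
function looksCapped(
  result: CompletionResult,
  configuredLimit: number,
  suspectedCap = 4096,
): boolean {
  return (
    result.finish === "length" &&
    result.outputTokens === suspectedCap &&
    configuredLimit > suspectedCap
  );
}

// Values from the repro table, with limit.output set to 16384.
const before: CompletionResult = { finish: "length", outputTokens: 4096 };
const after: CompletionResult = { finish: "stop", outputTokens: 14001 };

console.log(looksCapped(before, 16384));
console.log(looksCapped(after, 16384));
```

This is a heuristic only: a response that legitimately ends at exactly 4096 tokens would also be flagged.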

Screenshots / recordings

N/A (no UI changes).

Checklist

I have tested my changes locally
I have not included unrelated changes in this PR

Azure AI Foundry hosts two distinct families: OpenAI-native deployments (gpt-*, o-series) which speak the Responses API, and partner deployments (DeepSeek, Kimi, Llama, etc.) which only speak Chat Completions. The Azure provider routed every model through Responses by default. For partner deployments this works at the network layer but Azure silently drops max_output_tokens during the Responses-to-Chat translation, capping the underlying call at the chat default (4096 tokens) and producing premature "finish_reason: length" truncations regardless of the user's configured limit.output. Auto-detect by model id so the common case Just Works while keeping useCompletionUrls as an explicit override either direction.

github-actions · 2026-05-28T17:38:28Z

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

Open an issue describing the bug/feature (if one doesn't exist)
Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

akirykouski · 2026-05-28T17:42:28Z

Filed #29776 with the full bug repro and linked it via Closes in the PR body. CI is green now.

rekram1-node · 2026-05-28T20:01:13Z

This is an experimental package that is opt-in via OPENCODE_EXPERIMENTAL_NATIVE_LLM. It shouldn't be hit by users currently, and it is under active development, so it is not ready for PRs yet.

github-actions Bot added the needs:issue label May 28, 2026

akirykouski mentioned this pull request May 28, 2026

Azure AI Foundry partner deployments (DeepSeek/Kimi/Llama) capped at 4096 output tokens #29776

Open

github-actions Bot removed the needs:issue label May 28, 2026

rekram1-node closed this May 28, 2026

akirykouski wants to merge 1 commit into anomalyco:dev from akirykouski:fix/azure-chat-completions-default

2 participants
