
Conversation

@knjiang
Contributor

@knjiang knjiang commented Jan 16, 2026

Summary

This PR adds cross-provider compatibility for Chat Completions, Responses, and Anthropic.

See -> https://github.com/braintrustdata/lingua/actions/runs/21328545339

Parameter mappings

| Feature | Chat Completions | Responses | Anthropic |
| --- | --- | --- | --- |
| Reasoning | reasoning_effort | reasoning.effort | thinking.budget_tokens |
| Structured output | response_format.json_schema | text.format | output_format |
| Tool selection | tool_choice | tool_choice | tool_choice + disable_parallel_tool_use |
| Max tokens | max_tokens / max_completion_tokens | max_output_tokens | max_tokens |
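
To make the mapping concrete, here is a minimal sketch of how a single feature (reasoning) could fan out across the three request shapes. The map_reasoning helper and the effort-to-budget numbers are illustrative placeholders, not the crate's actual API or conversion values.

```rust
use serde_json::{json, Value};

// Illustrative only: project a universal "reasoning effort" into each provider's
// request shape. The token budgets are arbitrary placeholder values.
fn map_reasoning(effort: &str) -> (Value, Value, Value) {
    // Anthropic expresses reasoning as a token budget rather than an effort level.
    let budget_tokens = match effort {
        "low" => 1024,
        "medium" => 8192,
        _ => 32768, // "high" and anything else, purely illustrative
    };
    let chat_completions = json!({ "reasoning_effort": effort });
    let responses = json!({ "reasoning": { "effort": effort } });
    let anthropic = json!({ "thinking": { "type": "enabled", "budget_tokens": budget_tokens } });
    (chat_completions, responses, anthropic)
}
```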

Testing

For each provider pair (A → B) across Chat Completions / Responses / Anthropic, we validate the deserialized Universal payloads (a sketch of the comparison follows the list):

  1. Universal of source payload

    • U₁ = A payload → Universal
  2. Translate across providers and re-canonicalize

    • U₂ = (A payload → Universal → B payload) → Universal
  3. Diff the canonical forms

    • Compare U₁ vs U₂
    • Emit field-level diffs for any lost / added / changed fields
  4. Enforce in CI

    • CI fails on any unexpected diffs
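
A minimal sketch of steps 2 and 3, assuming serde_json: diff_paths is a simplified stand-in for the runner's compare_values, and the conversion function names in the trailing comment are placeholders rather than lingua's actual API.

```rust
use std::collections::BTreeSet;
use serde_json::Value;

// Simplified stand-in for the runner's diffing step: collect the JSON paths
// where the two canonical Universal forms disagree.
fn diff_paths(path: &str, a: &Value, b: &Value, out: &mut Vec<String>) {
    match (a, b) {
        (Value::Object(ma), Value::Object(mb)) => {
            let keys: BTreeSet<&String> = ma.keys().chain(mb.keys()).collect();
            for key in keys {
                let child = format!("{path}.{key}");
                match (ma.get(key), mb.get(key)) {
                    (Some(x), Some(y)) => diff_paths(&child, x, y, out),
                    _ => out.push(child), // field lost or added in translation
                }
            }
        }
        _ if a != b => out.push(path.to_string()), // changed value
        _ => {}
    }
}

// Hypothetical harness (these conversion names are not lingua's real API):
//   let u1 = to_universal(&source_payload)?;                    // U1
//   let u2 = to_universal(&translate(&u1, target_format)?)?;    // U2
//   let mut diffs = Vec::new();
//   diff_paths("", &serde_json::to_value(&u1)?, &serde_json::to_value(&u2)?, &mut diffs);
//   // CI fails if any diff is not covered by expected_differences.json
```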

Expected differences

  • Known provider limitations / intentional lossy mappings are documented in expected_differences.json.
  • Diffs covered by this file are treated as allowed; anything else is flagged as a regression and fails CI.
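
A sketch of how that allow-list check might look; the struct mirrors the pattern/reason entries quoted later in this thread, while the substring matching is an illustrative assumption rather than the repo's implementation.

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct ExpectedDifference {
    pattern: String,
    #[allow(dead_code)]
    reason: String,
}

// Keep only diffs that are NOT covered by expected_differences.json;
// anything left over is treated as a regression and fails CI.
fn unexpected_diffs(diffs: &[String], expected: &[ExpectedDifference]) -> Vec<String> {
    diffs
        .iter()
        .filter(|d| !expected.iter().any(|e| d.contains(&e.pattern)))
        .cloned()
        .collect()
}
```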

@knjiang knjiang changed the title add universla param configs add universal param configs Jan 16, 2026
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch from 7fbb91f to 64f4bad Compare January 21, 2026 18:58
@knjiang knjiang force-pushed the add_anthropic_parameter_test_cases branch from 8d7eba7 to 9d2ccf3 Compare January 21, 2026 18:58
@knjiang knjiang marked this pull request as ready for review January 21, 2026 19:01
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch from 64f4bad to e569a99 Compare January 21, 2026 21:01
@knjiang knjiang force-pushed the add_anthropic_parameter_test_cases branch from 9d2ccf3 to bd8d790 Compare January 21, 2026 21:01
Comment on lines -59 to -62
/// Known request fields for OpenAI Responses API.
/// These are fields extracted into UniversalRequest/UniversalParams.
/// Fields not in this list go into `extras` for passthrough.
const RESPONSES_KNOWN_KEYS: &[&str] = &[
Contributor Author

I moved the Responses handling into responses_adapter.rs.

@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch 3 times, most recently from cf02e72 to 349a05d Compare January 22, 2026 10:06
target_adapter.display_name(),
test_case,
);
let roundtrip_result = compare_values(
Contributor Author

I tightened the runner to be more accurate. Basically, we now compare:

source -> universal

with

source -> universal -> target -> universal

We serialize each Universal form to JSON and diff the two.

@knjiang knjiang changed the title add universal param configs Add universal param configs Jan 22, 2026
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch 5 times, most recently from a75144e to 4e0703c Compare January 23, 2026 00:19
@knjiang knjiang force-pushed the add_anthropic_parameter_test_cases branch 2 times, most recently from f2ba481 to c8a3e48 Compare January 23, 2026 00:19
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch from 4e0703c to 5aa114b Compare January 23, 2026 00:20
@knjiang knjiang force-pushed the add_anthropic_parameter_test_cases branch from c8a3e48 to f2ba481 Compare January 23, 2026 00:20
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch 2 times, most recently from 4e0703c to 561180b Compare January 23, 2026 00:40
@knjiang knjiang force-pushed the add_anthropic_parameter_test_cases branch from f2ba481 to a38f078 Compare January 23, 2026 00:40
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch from 561180b to c94e076 Compare January 23, 2026 01:18
@knjiang knjiang force-pushed the add_anthropic_parameter_test_cases branch from 768934e to 2a5de09 Compare January 23, 2026 01:20
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch 3 times, most recently from 517e490 to 7aaaece Compare January 23, 2026 07:57
@knjiang knjiang force-pushed the add_anthropic_parameter_test_cases branch from 2a5de09 to 768934e Compare January 23, 2026 07:57
@knjiang knjiang changed the base branch from add_anthropic_parameter_test_cases to graphite-base/61 January 23, 2026 08:33
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch from 7aaaece to 473a200 Compare January 23, 2026 16:42
@knjiang knjiang changed the base branch from graphite-base/61 to main January 23, 2026 16:42
@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch 7 times, most recently from 7657b1e to b2a3e48 Compare January 25, 2026 06:54
@knjiang knjiang requested a review from remh January 25, 2026 17:55
}
}

fn normalize_user_content(content: &mut UserContent) {
Contributor Author

I couldn't find a good way to deserialize and compare UserContent, since a single-item text array and a plain string are semantically equivalent.

This is the only normalization I do -> https://github.com/braintrustdata/lingua/blob/main/crates/lingua/src/universal/message.rs#L31

I had an alternative approach where this was normalized in the deserializer, but that broke the roundtrip tests, so it lives here now.
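
For reference, a self-contained sketch of that normalization using simplified stand-ins for lingua's types (the real ones live in the file linked above): a single-item text array collapses into the equivalent plain string before comparison.

```rust
// Simplified stand-ins for lingua's message types, for illustration only.
enum UserContent {
    Text(String),
    Parts(Vec<ContentPart>),
}

enum ContentPart {
    Text { text: String },
    // ...other part kinds (images, etc.) elided
}

// Collapse `[{ "type": "text", "text": s }]` into the equivalent plain string
// so the two semantically identical shapes compare equal.
fn normalize_user_content(content: &mut UserContent) {
    let collapsed = match content {
        UserContent::Parts(parts) => match parts.as_slice() {
            [ContentPart::Text { text }] => Some(text.clone()),
            _ => None,
        },
        UserContent::Text(_) => None,
    };
    if let Some(text) = collapsed {
        *content = UserContent::Text(text);
    }
}
```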

Contributor

just to clarify, this is just for testing right? if so it seems totally fine

Comment on lines +21 to +31
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ChatCompletionResponseMessageExt {
#[serde(flatten)]
pub base: openai::ChatCompletionResponseMessage,
#[serde(skip_serializing_if = "Option::is_none")]
pub reasoning: Option<String>,
/// Encrypted reasoning signature for cross-provider roundtrips (e.g., Anthropic's signature)
#[serde(skip_serializing_if = "Option::is_none")]
pub reasoning_signature: Option<String>,
}

Contributor Author

@knjiang knjiang Jan 25, 2026

Extended the chat completion type with reasoning + reasoning_signature.

I think reasoning_signature will be useful for Anthropic/Gemini. I think reasoning is supported via -> vllm-project/vllm#27755, but reasoning_signature is something I added alongside it.
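
For illustration, a self-contained sketch of the #[serde(flatten)] pattern above with a trimmed-down stand-in for openai::ChatCompletionResponseMessage, showing that the extra fields serialize next to the base fields rather than under a nested object.

```rust
use serde::{Deserialize, Serialize};

// Trimmed-down stand-in for openai::ChatCompletionResponseMessage.
#[derive(Serialize, Deserialize)]
struct BaseMessage {
    role: String,
    content: Option<String>,
}

#[derive(Serialize, Deserialize)]
struct MessageExt {
    #[serde(flatten)]
    base: BaseMessage,
    #[serde(skip_serializing_if = "Option::is_none")]
    reasoning: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    reasoning_signature: Option<String>,
}

fn main() {
    let msg = MessageExt {
        base: BaseMessage { role: "assistant".into(), content: Some("hi".into()) },
        reasoning: Some("...".into()),
        reasoning_signature: None,
    };
    // Prints {"role":"assistant","content":"hi","reasoning":"..."}:
    // the extension fields sit alongside the flattened base fields,
    // and absent options are omitted entirely.
    println!("{}", serde_json::to_string(&msg).unwrap());
}
```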

Contributor

The reason it's added is so that providers other than OpenAI that use chat completions can propagate it?

Comment on lines 20 to 32
{ "pattern": "params.response_format", "reason": "Anthropic doesn't support Text format type" },
{ "pattern": "params.metadata", "reason": "Anthropic only accepts user_id in metadata" },
{ "pattern": "params.parallel_tool_calls", "reason": "Anthropic only supports disable_parallel via tool_choice" },
{ "pattern": "params.tool_choice", "reason": "Anthropic requires tool_choice to express disable_parallel_tool_use" }
],
"errors": [
{ "pattern": "does not support logprobs", "reason": "Anthropic doesn't support logprobs parameter" },
{ "pattern": "does not support top_logprobs", "reason": "Anthropic doesn't support top_logprobs parameter" },
{ "pattern": "does not support frequency_penalty", "reason": "Anthropic doesn't support frequency_penalty parameter" },
{ "pattern": "does not support presence_penalty", "reason": "Anthropic doesn't support presence_penalty parameter" },
{ "pattern": "does not support seed", "reason": "Anthropic doesn't support seed parameter" },
{ "pattern": "does not support store", "reason": "Anthropic doesn't support store parameter" },
{ "pattern": "does not support n > 1", "reason": "Anthropic doesn't support multiple completions" }
Contributor Author

I was having trouble deciding whether we want to fail fast on unsupported parameters or silently drop them when converting to a different provider.

For now, I've gone with failing fast on completely unsupported params.

Contributor

I think right now we silently drop. Maybe we should try this out on some other systems that do translation and see what they do?

Usually people solve stuff like this by having a flag and propagating the choice to the user. E.g., MySQL defaults to flexible type conversions and has a strict mode; Postgres is the opposite.

Contributor Author

ran

curl https://ai-gateway.vercel.sh/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <AI_GATEWAY_KEY>" \
  -d '{
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [
      {
        "role": "user",
        "content": "Write a one-sentence bedtime story about a unicorn."
      }
    ],
    "stream": false,
    "logprobs": true,
    "top_logprobs": 3,
    "frequency_penalty": 0.5
  }'

and the parameters were dropped

Contributor

Yeah, then maybe we should silently drop, and offer a strict mode (this could even be a lingua/universal param) that enforces that parameters are not dropped during translation.
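
A sketch of what such a knob could look like, purely hypothetical and not part of this PR:

```rust
// Hypothetical setting: how to treat parameters the target provider cannot express.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub enum UnsupportedParamMode {
    /// Drop the parameter, optionally recording a warning.
    #[default]
    Drop,
    /// Fail the translation with an error naming the parameter.
    Strict,
}

fn handle_unsupported(param: &str, mode: UnsupportedParamMode) -> Result<(), String> {
    match mode {
        UnsupportedParamMode::Drop => Ok(()), // silently dropped; caller may log it
        UnsupportedParamMode::Strict => Err(format!("target provider does not support {param}")),
    }
}
```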

Comment on lines 14 to 19
/// Required providers for CI: Anthropic <-> ChatCompletions <-> Responses
const REQUIRED_PROVIDERS: &[ProviderFormat] = &[
ProviderFormat::Responses,
ProviderFormat::OpenAI, // ChatCompletions
ProviderFormat::Anthropic,
];
Contributor Author

The goal is to have this automatically run on every provider format; this is temporary while we incrementally make progress.

Contributor

makes sense. Maybe leave this as a comment in the code

@knjiang knjiang requested a review from ankrgyl January 25, 2026 17:59
# Click into the job summary to see the actual coverage report
- name: Post coverage to job summary
if: always()
Contributor

what does if: always() do?

Contributor Author

oops, let me reorder. if: always() just makes this step run even if an earlier step fails.

/// Tool selection strategy (varies by provider)
pub tool_choice: Option<Value>,
/// Number of top logprobs to return (0-20)
pub top_logprobs: Option<i64>,
Contributor

nit: if it's 1-20 can it be a smaller integer type like i8?

Comment on lines 135 to 150
// === Metadata and identification ===
/// Request metadata (user tracking, experiment tags, etc.)
pub metadata: Option<Value>,
Contributor

what is this

Contributor Author

Responses/Chat Completions have it as:

metadata (map, optional)
Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.

Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.

while Anthropic has metadata as an object with one field, user_id.

https://platform.openai.com/docs/api-reference/responses/create#responses_create-metadata
https://platform.claude.com/docs/en/api/messages
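
For illustration, a sketch of the lossy direction of that mapping (OpenAI-style metadata map to Anthropic's single user_id field); the helper and the assumption that the user id lives under a user_id key are hypothetical, not code from this PR.

```rust
use serde_json::{json, Map, Value};

// Hypothetical helper: Anthropic only accepts metadata.user_id, so any other
// keys in an OpenAI-style metadata map have to be dropped (and show up as an
// expected difference in the roundtrip tests).
fn metadata_to_anthropic(metadata: &Map<String, Value>) -> Value {
    match metadata.get("user_id") {
        Some(user_id) => json!({ "user_id": user_id }),
        None => json!({}),
    }
}
```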

Contributor

ahh ok. mind linking these in the comment

/// Example: OpenAI Chat extras stay in `provider_extras[ProviderFormat::OpenAI]`
/// and are only merged back when converting to OpenAI Chat, not to Anthropic.
#[serde(skip)]
pub provider_extras: HashMap<ProviderFormat, Map<String, Value>>,
Contributor

It's slightly weird that this is not nested in params, at least to me. What was the rationale behind that?

Contributor

@ankrgyl ankrgyl left a comment

Looks pretty good and straightforward to me.

  • It's a little out of date, but it would be useful to write some TypeScript examples (e.g. in examples/typescript/index.ts) or even some Rust examples that show the ergonomics of using parameters, so we can double-check the format.
  • For parameters, I think it would be useful to write a "fuzz"-style tester that, for each provider, generates random values with respect to the OpenAPI spec and then roundtrips them through UniversalParams (there is less entropy in parameters than in raw requests, but this might just be generally useful). A sketch of what that could look like follows this list.
  • Is there a creative way we can port the test cases we have in the proxy/ repo? We have had a bunch of historical challenges with translating reasoning, for example, that are well captured in those tests.
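
A minimal sketch of such a fuzz roundtrip under stated assumptions: it uses proptest, hard-codes a few parameter ranges instead of reading them from the OpenAPI spec, and the to_universal / to_chat_completions names are placeholders rather than lingua's actual API.

```rust
use proptest::prelude::*;
use serde_json::json;

// Hypothetical fuzz roundtrip: random-but-valid parameter values should
// survive source -> Universal -> source unchanged.
proptest! {
    #[test]
    fn chat_completion_params_roundtrip(
        temperature in 0.0f64..2.0,
        top_p in 0.0f64..1.0,
        top_logprobs in 0i64..=20,
        max_tokens in 1i64..=4096,
    ) {
        let payload = json!({
            "model": "gpt-4o",
            "messages": [{ "role": "user", "content": "hi" }],
            "temperature": temperature,
            "top_p": top_p,
            "logprobs": true,
            "top_logprobs": top_logprobs,
            "max_tokens": max_tokens,
        });
        // Placeholder conversion names: substitute the crate's real functions.
        //   let universal = to_universal(&payload)?;
        //   let back = to_chat_completions(&universal)?;
        //   prop_assert_eq!(payload, back);
        prop_assert!(payload.is_object());
    }
}
```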

@knjiang knjiang force-pushed the add_universal_params_between_completions_responses_and_anthropic branch from b2a3e48 to 0ffbf84 Compare January 26, 2026 01:12
@knjiang knjiang requested a review from ankrgyl January 26, 2026 14:42
@ankrgyl
Contributor

ankrgyl commented Jan 27, 2026

Looks pretty good and straightforward to me.

  • It's a little out of date, but it would be useful to write some TypeScript examples (e.g. in examples/typescript/index.ts) or even some Rust examples that show the ergonomics of using parameters, so we can double-check the format.
  • For parameters, I think it would be useful to write a "fuzz"-style tester that, for each provider, generates random values with respect to the OpenAPI spec and then roundtrips them through UniversalParams (there is less entropy in parameters than in raw requests, but this might just be generally useful).
  • Is there a creative way we can port the test cases we have in the proxy/ repo? We have had a bunch of historical challenges with translating reasoning, for example, that are well captured in those tests.

just to clarify, did you address these too?
