
Conversation

@henrypark133
Contributor

Adds provider_request_id and stop_reason columns to the usage log table and makes request_type nullable. Implements StopReason enum tracking in the completions service to capture why streams ended (completed, length, content_filter, client_disconnect, provider_error, timeout, tool_calls, rate_limited), based on the provider's finish_reason or on errors. Exposes these new fields in the usage history API response.
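For reference, a minimal sketch of the StopReason enum based solely on the variants listed above; the actual definition in services::usage may differ and, per the review below, includes additional variants:

// Sketch only: variants taken from the PR description above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    Completed,
    Length,
    ContentFilter,
    ClientDisconnect,
    ProviderError,
    Timeout,
    ToolCalls,
    RateLimited,
}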

@gemini-code-assist
Contributor

Summary of Changes

Hello @henrypark133, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the usage logging capabilities by introducing detailed tracking of why inference requests conclude. By adding provider_request_id and stop_reason to the usage log and exposing these in the API, we gain deeper insights into the lifecycle and outcomes of each inference, improving billing accuracy, debugging, and user understanding of their usage patterns. The changes span database schema, service logic, and API responses, ensuring comprehensive data capture and presentation.

Highlights

  • Database Schema Update: The organization_usage_log table has been updated with new columns: provider_request_id (raw ID from the inference provider), stop_reason (why the inference ended), and the existing request_type column is now nullable as it is deprecated.
  • Stop Reason Tracking: A new StopReason enum has been introduced in the services::usage module to categorize why an inference stream concluded. This includes reasons like completed, length, content_filter, client_disconnect, provider_error, timeout, tool_calls, and rate_limited.
  • Completions Service Integration: The completions service now captures and records the stop_reason, provider_request_id, and response_id for both streaming and non-streaming inference calls. This allows for more granular usage tracking and debugging.
  • Usage History API Enhancement: The usage history API responses (/v1/organizations/{}/usage/history and /v1/workspaces/{}/api-keys/{}/usage/history) now expose the newly tracked stop_reason, response_id, provider_request_id, and inference_id fields, providing richer details to users.
  • Testing Improvements: End-to-end tests have been updated to include assertions for the new stop_reason, response_id, provider_request_id, and inference_id fields, ensuring the correct capture and exposure of this data.



@gemini-code-assist bot left a comment


Code Review

This pull request introduces new fields for detailed usage tracking, specifically stop_reason, response_id, and provider_request_id, across the API, database, and service layers. Database migrations add provider_request_id and stop_reason columns to organization_usage_log, and request_type is made nullable. The StopReason enum is defined to categorize why an inference stream ended, including client disconnects and provider errors. The InterceptStream in the completions service is enhanced to capture finish_reason and errors to determine the stop_reason for usage logging. API routes and usage history endpoints are updated to expose these new fields, and e2e tests are modified to verify their correct population. Review comments highlight the need to re-include request_type in SELECT queries for backward compatibility, suggest increasing the stop_reason column size to VARCHAR(255) to prevent truncation, and request clarification on the consistent use of StopReason::Completed versus StopReason::Stop.
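As a reference for reviewers, here is a rough sketch of the intercept pattern described above. The Chunk type, the trimmed-down StopReason, and the finish_reason mapping are placeholders, not the repository's actual InterceptStream:

use std::pin::Pin;
use std::task::{Context, Poll};

use futures::Stream;

// Placeholder chunk type: only the field relevant to stop-reason tracking.
pub struct Chunk {
    pub finish_reason: Option<String>,
}

// Trimmed-down subset of the StopReason variants, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    Completed,
    Length,
    ContentFilter,
    ToolCalls,
    ProviderError,
}

pub struct InterceptStream<S> {
    inner: Pin<Box<S>>,
    last_finish_reason: Option<String>,
    pub stop_reason: Option<StopReason>,
}

impl<S, E> Stream for InterceptStream<S>
where
    S: Stream<Item = Result<Chunk, E>>,
{
    type Item = Result<Chunk, E>;

    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // Every field is Unpin (the inner stream is boxed), so get_mut is fine here.
        let this = self.get_mut();
        match this.inner.as_mut().poll_next(cx) {
            Poll::Ready(Some(Ok(chunk))) => {
                // Remember the most recent finish_reason reported by the provider.
                if chunk.finish_reason.is_some() {
                    this.last_finish_reason = chunk.finish_reason.clone();
                }
                Poll::Ready(Some(Ok(chunk)))
            }
            Poll::Ready(Some(Err(err))) => {
                // A provider error ends the stream with its own stop reason.
                this.stop_reason = Some(StopReason::ProviderError);
                Poll::Ready(Some(Err(err)))
            }
            Poll::Ready(None) => {
                // Normal end of stream: derive the stop reason from finish_reason.
                if this.stop_reason.is_none() {
                    this.stop_reason = Some(match this.last_finish_reason.as_deref() {
                        Some("length") => StopReason::Length,
                        Some("content_filter") => StopReason::ContentFilter,
                        Some("tool_calls") => StopReason::ToolCalls,
                        _ => StopReason::Completed,
                    });
                }
                Poll::Ready(None)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}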

@henrypark133 marked this pull request as ready for review on December 22, 2025 at 20:51
@claude

claude bot commented Dec 22, 2025

Code Review - Usage Log Database Update

Thank you for this comprehensive enhancement to usage tracking. The PR adds valuable telemetry with provider_request_id and stop_reason, and makes request_type nullable. I've reviewed the changes with a focus on critical production safety issues.

⚠️ CRITICAL ISSUES

1. Type Mismatch in Non-Streaming Completion (completions/mod.rs:884)

Issue: The code calls StopReason::from_finish_reason(reason) with an &inference_providers::FinishReason enum, but the method signature expects &str:

// Line 878-885: Type mismatch
let stop_reason = response_with_bytes
    .response
    .choices
    .first()
    .and_then(|c| c.finish_reason.as_ref())  // Returns &FinishReason enum
    .map(|reason| crate::usage::StopReason::from_finish_reason(reason))  // Expects &str
    .unwrap_or(crate::usage::StopReason::Completed);

The from_finish_reason method (usage/ports.rs:65) expects a string:

pub fn from_finish_reason(reason: &str) -> Self {
    match reason {
        "stop" => StopReason::Stop,
        "length" => StopReason::Length,
        // ...
    }
}

Impact: This will fail to compile. The compiler error would be something like:

expected &str, found &inference_providers::FinishReason

Fix: Add a conversion method to map the enum directly, or convert the enum to string first:

// Option 1: Direct enum mapping (cleaner)
impl StopReason {
    pub fn from_provider_finish_reason(reason: &inference_providers::FinishReason) -> Self {
        match reason {
            inference_providers::FinishReason::Stop => StopReason::Stop,
            inference_providers::FinishReason::Length => StopReason::Length,
            inference_providers::FinishReason::ContentFilter => StopReason::ContentFilter,
            inference_providers::FinishReason::ToolCalls => StopReason::ToolCalls,
        }
    }
}

// Then use:
.map(|reason| crate::usage::StopReason::from_provider_finish_reason(reason))

// Option 2: Convert to string (less efficient)
.map(|reason| {
    let reason_str = match reason {
        inference_providers::FinishReason::Stop => "stop",
        inference_providers::FinishReason::Length => "length",
        inference_providers::FinishReason::ContentFilter => "content_filter",
        inference_providers::FinishReason::ToolCalls => "tool_calls",
    };
    crate::usage::StopReason::from_finish_reason(reason_str)
})

2. Migration Backward Compatibility - Missing Column in request_type ALTER

Issue: The migration makes request_type nullable but doesn't verify existing data integrity. If the column previously had a NOT NULL constraint with data, the migration will succeed, but:

  1. Old code (if deployed) still expects request_type to be present
  2. New records will have NULL request_type, which could break during a rolling deployment

Impact: During rolling deployment, old API servers reading new records might encounter unexpected NULLs in request_type.

Recommendation:

  • Verify this is a BREAKING CHANGE for rolling deployments
  • Document deployment order: database migration → full cluster rollout (no mixed versions)
  • Or backfill request_type from inference_type before making it nullable
-- Add before ALTER COLUMN:
UPDATE organization_usage_log 
SET request_type = inference_type 
WHERE request_type IS NULL;

✅ POSITIVE OBSERVATIONS

  1. Excellent Privacy Compliance: All logging uses IDs only, no customer data exposed ✓
  2. Comprehensive Test Coverage: E2E tests verify all new fields (stop_reason, provider_request_id, inference_id, response_id) ✓
  3. Atomic Usage Recording: Transaction wraps usage log insert + balance update ✓
  4. Index on provider_request_id: Smart optimization for lookups ✓
  5. Graceful NULL Handling: API responses use skip_serializing_if for optional fields ✓
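
As an illustration of that last point, a hedged sketch of how such optional fields can be skipped in the serialized response; the struct name and field set here are assumptions, not the actual API type:

// Requires serde with the `derive` feature and uuid with the `serde` feature.
use serde::Serialize;
use uuid::Uuid;

#[derive(Serialize)]
struct UsageHistoryEntry {
    inference_id: Uuid,
    // Omitted from the JSON payload entirely when the value is None.
    #[serde(skip_serializing_if = "Option::is_none")]
    response_id: Option<Uuid>,
    #[serde(skip_serializing_if = "Option::is_none")]
    provider_request_id: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    stop_reason: Option<String>,
}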

🔍 MINOR OBSERVATIONS (Not Blocking)

  1. Test Comment Accuracy (e2e_client_disconnect.rs:157): The comment says "client disconnect" but the test's mock stream completes normally. The comment accurately describes the mock behavior, but it could note that a real client disconnect would behave differently.

  2. Stop Reason Mapping Completeness: The streaming path maps errors to stop reasons (from_completion_error), but non-streaming doesn't handle provider errors. If the provider returns an error in non-streaming mode, usage isn't recorded at all. This is probably acceptable since no tokens were consumed, but worth documenting.


📋 ACTION ITEMS

Must Fix Before Merge:

  1. ✅ Fix type mismatch in completions/mod.rs:884 (add from_provider_finish_reason method or convert to string)
  2. ✅ Verify/document rolling deployment safety for request_type nullable change

Nice to Have:

  • Add deployment notes to PR description about database migration order

⚠️ STATUS: ISSUES FOUND - Critical type mismatch must be resolved before merge.

@henrypark133
Contributor Author

  1. Type Mismatch in Non-Streaming Completion (completions/mod.rs:884)

Yes, I missed one of the FinishReason -> StopReason conversions and added a helper for it: "Convert from inference provider's FinishReason enum".

  2. Migration Backward Compatibility - Missing Column in request_type ALTER

The migration runs before any new instances are deployed, so reads and writes from older instances are not affected: after the migration, older instances that still read and write request_type keep working, and new instances that no longer use request_type work as well.

@henrypark133
Contributor Author

@claude review

@claude

claude bot commented Dec 29, 2025

Code Review - Usage Log Database Update (Follow-up)

Thank you for addressing the previous feedback! I've reviewed the updated code and the fixes look good. The type mismatch issue has been properly resolved with the addition of from_provider_finish_reason for enum conversion.

✅ VERIFIED FIXES

  1. Type Mismatch Resolved: The code now correctly handles both:

    • Streaming path: Uses from_provider_finish_reason(&FinishReason) for enum-based finish_reason
    • Non-streaming path: Uses from_finish_reason(&str) for string-based finish_reason

    This is correct because ChatChoice (streaming) has finish_reason: Option<FinishReason> while ChatCompletionResponseChoice (non-streaming) has finish_reason: Option<String>.

  2. Migration Backward Compatibility: The migration includes proper backfill:

    UPDATE organization_usage_log
    SET inference_type = request_type
    WHERE inference_type IS NULL AND request_type IS NOT NULL;

    This ensures existing records maintain their data integrity before request_type becomes nullable.

🔍 OBSERVATIONS (No Action Required)

  1. Privacy Compliance: All logging follows the strict privacy rules in CLAUDE.md - only IDs are logged, no customer data ✓

  2. Comprehensive Test Coverage: E2E tests validate all new fields (stop_reason, provider_request_id, inference_id, response_id) ✓

  3. Database Column Types: New columns are appropriately sized:

    • provider_request_id VARCHAR(255) - sufficient for vLLM chat IDs
    • stop_reason VARCHAR(50) - adequate for enum values (the longest, "client_disconnect", is 17 chars)
    • Partial index on provider_request_id - good performance optimization ✓
  4. ResponseId Integration: The .as_uuid() method was correctly added to extract the inner Uuid from the newtype wrapper for database storage ✓
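
For readers unfamiliar with the newtype pattern referenced in point 4, a hypothetical sketch; the real ResponseId may differ in derives and return type:

use uuid::Uuid;

// Newtype wrapper that keeps response IDs distinct from other UUIDs in type signatures.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ResponseId(Uuid);

impl ResponseId {
    // Expose the inner Uuid, e.g. for binding to a database column.
    pub fn as_uuid(&self) -> Uuid {
        self.0
    }
}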

📋 MINOR RECOMMENDATIONS (Optional)

  1. Stop Reason Documentation: The migration comment lists specific values but the enum has more variants (stop, rate_limited, other). Consider updating the comment:

    COMMENT ON COLUMN organization_usage_log.stop_reason IS
        'Why the inference stream ended: completed, stop, length, content_filter, client_disconnect, provider_error, timeout, tool_calls, rate_limited, or custom value.';
  2. Request Type Column: Since request_type is now deprecated and nullable, consider adding a follow-up task to remove it in a future migration (after sufficient time for rollback safety).


STATUS: APPROVED - All critical issues have been resolved. The PR is ready to merge.

Excellent work on the comprehensive usage tracking enhancements!

@henrypark133 merged commit 2500b16 into main on Dec 29, 2025
1 check passed