@guan404ming
Member

Why

The NMS operator returns a fixed-size output padded with trailing garbage data, which wastes memory and requires manual trimming for ONNX compatibility.
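
For illustration, the fixed-size behavior can be sketched with a small greedy NMS in plain Python. This is not the TVM implementation: `iou`, `greedy_nms`, and `padded_nms` are hypothetical helpers, and `-1` is an assumed stand-in for whatever the unused slots actually contain.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold, max_output):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if len(kept) == max_output:
            break
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


def padded_nms(boxes, scores, iou_threshold, max_output):
    """Fixed-size variant: always max_output slots, plus a valid count."""
    kept = greedy_nms(boxes, scores, iou_threshold, max_output)
    padded = kept + [-1] * (max_output - len(kept))  # -1 marks unused slots
    return padded, len(kept)


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
padded, num_valid = padded_nms(boxes, scores, iou_threshold=0.5, max_output=3)
```

The number of kept boxes depends on the input data, so a fixed-size buffer generally holds more slots than valid detections, and the caller has to use `num_valid` to discard the rest.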

How

  • Add dynamic_strided_slice to trim NMS output to valid detections only
  • Build slice parameters using TE compute to avoid legalization issues
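
A minimal sketch of the trimming step, written in plain Python rather than Relax or TE. It assumes the ONNX layout in which each selected detection is a `[batch_index, class_index, box_index]` row. `build_slice_params` and `strided_slice_2d` are hypothetical stand-ins for the TE-computed parameters and for dynamic_strided_slice; they are not the code in this PR.

```python
def build_slice_params(num_valid, row_width=3):
    """Build (begin, end, strides) that select the first num_valid rows."""
    begin = [0, 0]
    end = [num_valid, row_width]  # end is computed from a runtime value
    strides = [1, 1]
    return begin, end, strides


def strided_slice_2d(rows, begin, end, strides):
    """Apply a 2-D strided slice to a list of rows."""
    selected = rows[begin[0]:end[0]:strides[0]]
    return [row[begin[1]:end[1]:strides[1]] for row in selected]


# Fixed-size output with 5 slots, of which only the first 2 are valid.
padded = [
    [0, 0, 3],
    [0, 1, 7],
    [-1, -1, -1],  # assumed placeholder for trailing garbage
    [-1, -1, -1],
    [-1, -1, -1],
]
num_valid = 2

begin, end, strides = build_slice_params(num_valid)
trimmed = strided_slice_2d(padded, begin, end, strides)
```

Because `end` is derived from the valid count, the output shape is only known at runtime, which is why a dynamic slice is needed instead of a static one.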

@gemini-code-assist
Contributor

Summary of Changes

Hello @guan404ming, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Non-Maximum Suppression (NMS) operator within the Relax framework by implementing dynamic output trimming. This improvement ensures that the NMS output tensors contain only relevant detection data, thereby optimizing memory usage and streamlining integration with ONNX runtime environments. The change removes the burden of manual output trimming for users and aligns the operator's behavior with expected dynamic output standards.

Highlights

  • Dynamic Output Trimming for NMS: The Non-Maximum Suppression (NMS) operator now dynamically trims its output to include only valid detections, eliminating trailing garbage data and improving memory efficiency.
  • ONNX Compatibility: This change directly addresses ONNX compatibility issues by ensuring the NMS operator produces clean, dynamically sized outputs, removing the need for manual post-processing.
  • Legalization Implementation: The _all_class_non_max_suppression legalization function has been updated to use relax.op.dynamic_strided_slice for trimming. Slicing parameters (begin, end, strides) are constructed using TE compute to prevent high-level Relax operator legalization issues.
  • Code Simplification: The previously existing _create_onnx_nms_te helper function, which handled fixed-size output slicing, has been removed as its functionality is now integrated into the main legalization logic.
  • New Test Coverage: New tests (test_all_class_non_max_suppression_legalize_dynamic_trim and test_all_class_non_max_suppression_legalize_e2e) have been added to verify the correct behavior of dynamic trimming and end-to-end execution of the legalized NMS operator.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully implements dynamic output trimming for the NMS operator in Relax, addressing the previously noted TODO regarding fixed-size outputs. The changes correctly leverage dynamic_strided_slice and build slicing parameters using TE compute to ensure ONNX compatibility and improved memory efficiency. The updated documentation clearly reflects the new behavior, and the addition of comprehensive test cases, including end-to-end validation, ensures the correctness and robustness of the new implementation. This is a significant improvement to the NMS operator.

@guan404ming force-pushed the feat/nms-dynamic-output-trimming branch 2 times, most recently from dd4fce6 to 12d6e61 (January 21, 2026 16:22)
@guan404ming force-pushed the feat/nms-dynamic-output-trimming branch from 12d6e61 to 60a45c3 (January 21, 2026 16:55)