-
-
Notifications
You must be signed in to change notification settings - Fork 41
Fix AI summary phrasing #100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @sunithvs, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team, gemini-code-assist here with a summary of this pull request by @sunithvs. The primary goal of this PR is to improve the quality of the AI-generated profile summaries by removing unwanted introductory phrases that the AI model might sometimes include, such as "Here is the summary:" or simply "Summary:". This makes the final output cleaner and more directly usable.
Highlights
- AI Output Cleaning: Implements a mechanism to automatically clean the output of the AI model used for generating profile summaries.
- New Cleaning Function: Introduces a new static method
_clean_summarywithin theAIDescriptionGeneratorclass specifically for this cleaning task. - Regex Implementation: The cleaning logic utilizes regular expressions to identify and remove common introductory patterns at the beginning of the generated text.
Changelog
- modules/ai_generator.py
- Imported the
remodule to enable regular expression operations. - Added a new static method
_clean_summarythat takes a string and removes leading phrases like 'here the summary' or 'summary:' using predefined regex patterns. - Modified the
generate_profile_summarymethod to pass the raw AI response through the_clean_summarymethod before returning the final summary.
- Imported the
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
AI text flows out,
Needs a little polish now,
Strip the intro words.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a utility function _clean_summary to strip common introductory phrases from AI-generated text. The implementation uses regular expressions and applies this cleaning step to the profile summary. The approach is clear and directly addresses the problem statement. The new method is well-defined as a static method with type hints and a docstring.
One area for improvement is the addition of unit tests for the new _clean_summary method to ensure its robustness against various AI outputs.
Also, the PR description mentions pytest -q failing with pytest: command not found. This seems like an environment setup issue on the author's side. It would be good to ensure that the testing environment is correctly configured so that tests (including any new ones for this functionality) can be run successfully before merging.
No specific style guide was provided, so I've considered general Python best practices (like PEP 8). The added code largely adheres to these.
Summary of Findings
- Unit Testing for New Logic: The new
_clean_summaryfunction, which handles string manipulation of AI-generated text, would significantly benefit from dedicated unit tests. These tests should cover various scenarios, including empty strings, strings with and without prefixes, different casings, and edge cases, to ensure the function's robustness and maintainability. This was raised as a 'medium' severity comment. - Import Order (PEP 8): The
import restatement (line 6) could be grouped with other standard library imports (likeimport jsonon line 1) at the top of the file, with a blank line separating standard library imports from third-party and local application imports. This aligns with PEP 8 style guidelines for import organization. (Severity: low - not commented on directly due to review settings requesting comments for medium+ severity).
Merge Readiness
The core logic of the PR is sound and addresses the stated goal effectively. However, to ensure the robustness and maintainability of the new string manipulation logic, I strongly recommend adding unit tests for the _clean_summary method as suggested in the review comment.
Once the testing aspect is addressed, the PR should be in good shape for merging. As a reviewer, I am not authorized to approve pull requests, so please ensure it undergoes any further necessary review and approval processes within your team.
| def _clean_summary(text: str) -> str: | ||
| """Remove leading phrases like 'here the summary' from the output""" | ||
| patterns = [ | ||
| r"^\s*here(?:'s| is)?\s+(?:the\s+)?summary[:,]?\s*", | ||
| r"^\s*summary[:,]?\s*", | ||
| ] | ||
| cleaned = text | ||
| for pattern in patterns: | ||
| cleaned = re.sub(pattern, "", cleaned, flags=re.IGNORECASE) | ||
| return cleaned.strip() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new _clean_summary method is a good addition for refining the AI-generated text!
To ensure its long-term reliability and handle various outputs from the AI, have you considered adding unit tests for this function? Testing with different inputs would be beneficial, for example:
- Empty string:
"" - String with no matching prefixes:
"This is a normal sentence." - String with one of the target prefixes:
"Here is the summary: Content"or"summary: More content" - Strings with different casing for the prefixes:
"SUMMARY: Content" - Strings where a prefix might be followed by punctuation not currently handled (e.g., a period), to confirm current behavior:
"Here is the summary. Content" - Strings that are similar but shouldn't be stripped because the keyword isn't at the very beginning (after accounting for whitespace):
"A summary of events."
Dedicated unit tests would help catch regressions if the patterns are modified in the future or if new AI output behaviors emerge that need to be handled.
Summary
here the summaryfrom generated profile textTesting
pytest -q(fails:pytest: command not found)