Contributor

@mattkur mattkur commented Jan 22, 2026

Adds gen_cherrypick_prs.py, which cherry-picks PRs that have been merged into main onto the release/<foo> branch.
The script creates the cherry-pick PRs in merge order and stops if it hits a merge conflict.
You can specify a set of PRs to cherry-pick; otherwise the script attempts every PR that is missing.

Written with Copilot.

@mattkur mattkur requested a review from a team as a code owner January 22, 2026 23:33
Copilot AI review requested due to automatic review settings January 22, 2026 23:33
Contributor

Copilot AI left a comment

Pull request overview

This PR adds a Python script to automate the creation of cherry-pick PRs from the main branch to release branches. The script handles the complete workflow of finding merged PRs, cherry-picking them in merge order, and creating corresponding backport PRs on GitHub.

Changes:

  • Adds gen_cherrypick_prs.py script that generates cherry-pick PRs for release branches
  • Supports both manual PR specification and automatic discovery via backport labels
  • Includes conflict detection, duplicate backport checking, and interactive confirmation

queries.append(f'base:{base_branch} "{orig_title}" in:title')

items: List[dict] = []
seen_numbers: set[int] = set()

Copilot AI Jan 22, 2026

The type hint set[int] uses Python 3.9+ syntax. For compatibility with older Python versions, consider Set[int] from the typing module, which is already imported on line 41.

return f"{orange}*IN PROGRESS*{reset}"
return "*NONE*"

backport_map: dict[int, List[BackportInfo]] = {}

Copilot AI Jan 22, 2026

The type hint dict[int, List[BackportInfo]] uses Python 3.9+ syntax. For consistency with the typing imports already present and better compatibility, consider using Dict[int, List[BackportInfo]] from the typing module.
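
Both type-hint comments come down to the same substitution. A minimal sketch of what it might look like, assuming the script has to run on interpreters older than Python 3.9; `BackportInfo` below is a hypothetical stand-in, since its real definition is not shown in this thread:

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class BackportInfo:
    # Hypothetical stand-in; the real class is not shown in this thread.
    number: int
    state: str


# typing.Set / typing.Dict express the same hints as set[int] / dict[int, ...],
# and they are subscriptable on interpreters older than Python 3.9.
seen_numbers: Set[int] = set()
backport_map: Dict[int, List[BackportInfo]] = {}

seen_numbers.add(123)
backport_map.setdefault(123, []).append(BackportInfo(number=456, state="OPEN"))
```

If the script only ever targets Python 3.9 or newer, the built-in generics are fine as written and the change is purely about consistency with the existing typing imports.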

Comment on lines +71 to +72
stderr = getattr(p, "stderr", "") or ""
stdout = getattr(p, "stdout", "") or ""

Copilot AI Jan 22, 2026

The getattr() calls with fallback empty strings are unnecessary. When capture=True, subprocess.run with stdout=subprocess.PIPE and stderr=subprocess.PIPE always sets these attributes. The defensive coding here could mask programming errors.

Suggested change
- stderr = getattr(p, "stderr", "") or ""
- stdout = getattr(p, "stdout", "") or ""
+ stderr = p.stderr or ""
+ stdout = p.stdout or ""
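
A small sketch of the behavior the suggestion relies on, assuming `p` is the `CompletedProcess` returned by `subprocess.run` (the surrounding code is not shown here, so that is an assumption):

```python
import subprocess
import sys

# Run a trivial child process with captured, text-decoded output.
p = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)
stdout = p.stdout or ""
stderr = p.stderr or ""

# Without capture the attributes still exist; they are simply None,
# which is why the `or ""` fallback remains useful.
q = subprocess.run([sys.executable, "-c", "pass"])
uncaptured = q.stdout or ""
```

Plain attribute access therefore never raises on a `CompletedProcess`, and a typo in the attribute name would surface as an `AttributeError` instead of silently yielding an empty string.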

seen_numbers.add(n)
items.append(pr)

rx = re.compile(rf"\bPR\s*#?{pr_number}\b", re.IGNORECASE)

Copilot AI Jan 22, 2026

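The regex pattern interpolates the PR number without escaping it. If pr_number is an integer this is harmless in practice, and the trailing \b already prevents partial-number matches: when searching for PR #123, the pattern \bPR\s*#?123\b does not match "PR #1234", because there is no word boundary between "3" and "4". Wrapping the value in re.escape(str(pr_number)) is still a cheap safeguard should the value ever be something other than a plain integer; the (?:\b|$) alternative behaves the same as \b here.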

Suggested change
- rx = re.compile(rf"\bPR\s*#?{pr_number}\b", re.IGNORECASE)
+ rx = re.compile(
+     rf"\bPR\s*#?{re.escape(str(pr_number))}(?:\b|$)", re.IGNORECASE
+ )
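
A quick way to check how the two patterns treat near-miss numbers. This is only a sketch: `pr_number = 123` and the sample strings are made up for illustration.

```python
import re

pr_number = 123  # hypothetical PR number for illustration

# Pattern as originally written in the PR.
original = re.compile(rf"\bPR\s*#?{pr_number}\b", re.IGNORECASE)
# Pattern from the suggested change.
suggested = re.compile(
    rf"\bPR\s*#?{re.escape(str(pr_number))}(?:\b|$)", re.IGNORECASE
)

samples = ["Backport PR #123", "pr 123: fix", "PR #1234", "PR123"]
# Map each sample to (matched by original, matched by suggested).
results = {
    s: (bool(original.search(s)), bool(suggested.search(s))) for s in samples
}
```

Both patterns reject "PR #1234", since \b cannot match between two digits, so the practical gain from the suggested form is the explicit re.escape.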

Comment on lines +586 to +588
if not confirm("Create this PR on GitHub? [y/N] "):
    print("Aborting by user request. No further PRs will be created.")
    return 0

Copilot AI Jan 22, 2026

When the user declines to create a PR (lines 586-588), the script returns 0 (success) even though some PRs may remain unprocessed. This could be confusing. Consider returning a different exit code or printing a summary of the skipped PRs to make it clear that not all PRs were processed.
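
One possible shape for this, as a sketch only: `create_backports`, `EXIT_ABORTED`, and the injected `confirm` / `create_pr` callables are hypothetical names, not the script's actual structure.

```python
from typing import Callable, List, Tuple

EXIT_OK = 0
EXIT_ABORTED = 3  # hypothetical; any nonzero code distinct from real errors works


def create_backports(
    pr_numbers: List[int],
    confirm: Callable[[str], bool],
    create_pr: Callable[[int], str],
) -> Tuple[int, List[Tuple[int, str]], List[int]]:
    """Return (exit_code, created, skipped) for a batch of PR numbers."""
    created: List[Tuple[int, str]] = []
    for index, n in enumerate(pr_numbers):
        if not confirm("Create this PR on GitHub? [y/N] "):
            # Everything from the declined PR onward is left unprocessed.
            skipped = pr_numbers[index:]
            print(
                "Aborting by user request. Not processed: "
                + ", ".join(f"#{s}" for s in skipped)
            )
            return EXIT_ABORTED, created, skipped
        created.append((n, create_pr(n)))
    return EXIT_OK, created, []
```

A caller could then pass the first element to `sys.exit`, so that wrapping automation can tell a user abort apart from a complete run.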

Comment on lines +482 to +484
green = "\x1b[32m"
orange = "\x1b[38;5;208m"
reset = "\x1b[0m"

Copilot AI Jan 22, 2026

The script uses ANSI color codes directly in the main function. These may not work correctly on all terminals (especially on Windows without ANSI support). Consider checking if the output is a TTY and/or if color is supported before using color codes, or using a library like colorama for cross-platform color support.
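
A minimal sketch of the TTY check using only the standard library. The helper names `use_color` and `paint` are hypothetical, and honoring the `NO_COLOR` environment variable is an added assumption based on the informal convention of that name, not something the script is known to do.

```python
import os
import sys
from typing import TextIO


def use_color(stream: TextIO) -> bool:
    """Emit colors only for an interactive terminal with NO_COLOR unset."""
    if os.environ.get("NO_COLOR"):
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


def paint(text: str, code: str, enabled: bool) -> str:
    """Wrap text in an ANSI color code, or return it unchanged."""
    return f"{code}{text}\x1b[0m" if enabled else text


ORANGE = "\x1b[38;5;208m"
label = paint("*IN PROGRESS*", ORANGE, use_color(sys.stdout))
```

This keeps escape sequences out of piped or redirected output. It does not by itself enable ANSI handling on older Windows consoles; that is where a library such as colorama would come in.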

if not (
    rx.search(hay)
    or url_fragment in hay
    or (title_lc and title_lc in title.lower())

Copilot AI Jan 22, 2026

The title matching on line 160 uses substring matching (title_lc in title.lower()), which could produce false positives. For example, if the original PR title is "Fix bug", it would match any PR with "Fix bug" anywhere in its title like "Fix bug in parser" even if that's a different PR. Consider using more precise matching or requiring the title match to be more substantial (e.g., checking for an exact match or a high similarity threshold).
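
A sketch of one stricter alternative, using `difflib` from the standard library. The `titles_match` helper and the 0.9 threshold are assumptions chosen for illustration, not values taken from the script.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so cosmetic differences do not matter."""
    return " ".join(title.lower().split())


def titles_match(orig_title: str, candidate_title: str, threshold: float = 0.9) -> bool:
    """Match if the titles are equal after normalization, or highly similar."""
    a, b = normalize(orig_title), normalize(candidate_title)
    if not a or not b:
        return False
    return a == b or SequenceMatcher(None, a, b).ratio() >= threshold
```

With this check, "Fix bug" no longer matches "Fix bug in parser". If backport titles carry a prefix such as a branch tag, that prefix would need to be stripped before comparing, or the threshold lowered.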

Comment on lines +477 to +480
"Warning: Could not determine base repo from git remotes. "
"Consider passing --repo OWNER/REPO.",
file=sys.stderr,
)

Copilot AI Jan 22, 2026

After printing a warning about not being able to determine the base repo, the script continues execution even though base_repo might be None. This could cause gh_pr_create to be called with repo=None later, which may or may not work depending on the gh CLI's current directory context. Consider making this a hard error or at least documenting that the script must be run from within a GitHub repository when --repo is not specified.

Suggested change
- "Warning: Could not determine base repo from git remotes. "
- "Consider passing --repo OWNER/REPO.",
- file=sys.stderr,
- )
+ "Error: Could not determine base repo from git remotes. "
+ "Pass --repo OWNER/REPO or run from within a GitHub repository.",
+ file=sys.stderr,
+ )
+ return 1

Contributor

Maybe use raise SystemExit here instead, to stay consistent with the style used elsewhere in the script.
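
For illustration, a sketch of how the hard error could look with `raise SystemExit`. The `require_repo` helper is a hypothetical name; the message text is taken from the suggested change above.

```python
from typing import Optional


def require_repo(base_repo: Optional[str]) -> str:
    """Return the repo slug, or exit with an error message if it is unknown."""
    if not base_repo:
        # SystemExit with a string argument prints it to stderr and
        # exits with status 1 when it reaches the top level.
        raise SystemExit(
            "Error: Could not determine base repo from git remotes. "
            "Pass --repo OWNER/REPO or run from within a GitHub repository."
        )
    return base_repo
```

This removes the need for a separate print to stderr plus `return 1`, and callers deeper than main can use it without threading an exit code back up.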

"--search",
query,
"--limit",
"50",

Copilot AI Jan 22, 2026

The hardcoded --limit 50 passed to gh pr list might miss backport PRs if more than 50 PRs match the search. The script could then incorrectly conclude that a PR hasn't been backported when it actually has. Consider increasing this limit or adding pagination to ensure all matching PRs are found.

Suggested change
- "50",
+ "1000",
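
Raising the limit still leaves a silent cutoff, just a higher one. A sketch of how the script might at least warn when the result set fills the limit; `build_search_cmd`, `hit_limit`, and `search_prs` are hypothetical helpers, and the `--state all` and `--json` arguments are assumptions about what the script passes.

```python
import json
import subprocess
from typing import List

SEARCH_LIMIT = 1000  # value from the suggested change


def build_search_cmd(query: str, limit: int = SEARCH_LIMIT) -> List[str]:
    """Build a `gh pr list` invocation for the given search query."""
    return [
        "gh", "pr", "list",
        "--state", "all",
        "--search", query,
        "--limit", str(limit),
        "--json", "number,title,body,url",
    ]


def hit_limit(items: List[dict], limit: int = SEARCH_LIMIT) -> bool:
    """A full result set suggests there may be more matches than were returned."""
    return len(items) >= limit


def search_prs(query: str, limit: int = SEARCH_LIMIT) -> List[dict]:
    """Run the search and warn if the results may have been truncated."""
    out = subprocess.run(
        build_search_cmd(query, limit), check=True, capture_output=True, text=True
    ).stdout
    items = json.loads(out or "[]")
    if hit_limit(items, limit):
        print(f"Warning: search returned {limit} results; some may be missing.")
    return items
```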

title = str(pr.get("title") or "")
body = str(pr.get("body") or "")
hay = f"{title}\n{body}"
hay_lc = hay.lower()

Copilot AI Jan 22, 2026

Variable hay_lc is not used.

Suggested change
- hay_lc = hay.lower()

Contributor

gurasinghMS commented Jan 23, 2026

This looks great. Thanks for doing it!
I'm wondering if git worktrees could help here. I haven't played around with them much, but this seems like a good fit. Instead of forcing the user to have a clean working tree, we could create one worktree per PR that needs to be created. That way, if any command fails for an unknown reason, the failed state is isolated to that worktree.
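
A rough sketch of the worktree idea, assuming one temporary worktree per backport branch. The `pr_worktree` helper, the `cherrypick-` directory prefix, and the injectable `run` callable are all hypothetical; only the `git worktree add` and `git worktree remove` subcommands themselves come from git.

```python
import os
import shutil
import subprocess
import tempfile
from contextlib import contextmanager
from typing import Callable, Iterator, List

Runner = Callable[[List[str]], None]


def git(args: List[str]) -> None:
    """Default runner: invoke git and raise on failure."""
    subprocess.run(["git", *args], check=True)


@contextmanager
def pr_worktree(branch: str, base: str, run: Runner = git) -> Iterator[str]:
    """Create a throwaway worktree on a new branch and remove it afterwards."""
    parent = tempfile.mkdtemp(prefix="cherrypick-")
    path = os.path.join(parent, "wt")
    try:
        # `git worktree add -b <branch> <path> <start-point>` checks out a new
        # branch in a separate directory, leaving the main checkout untouched.
        run(["worktree", "add", "-b", branch, path, base])
        try:
            yield path
        finally:
            # --force also removes a worktree left dirty by a failed cherry-pick.
            run(["worktree", "remove", "--force", path])
    finally:
        shutil.rmtree(parent, ignore_errors=True)
```

Inside the `with` block, the cherry-pick itself could run as `git -C <path> cherry-pick ...`, so a conflict never touches the directory the user started in.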

orange = "\x1b[38;5;208m"
reset = "\x1b[0m"

def status_label(state: str) -> str:
Contributor

nit: Move this function outside of main

prj = gh_pr_view(n, args.repo)

state = str(prj.get("state", "")).upper()
if state != "MERGED":
Contributor

@gurasinghMS gurasinghMS Jan 23, 2026

You could also populate the not_completed list in this branch instead of failing when the script comes across an unmerged PR.
How about adding a function that takes a list of PR JSON objects and returns two lists: infos and not_completed? That way the behavior stays consistent between the from_backport_label and by_pr_number branches.
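
Something along these lines, as a sketch. The function name `split_by_merge_state` is hypothetical, and it assumes each PR JSON object carries a `state` field like the one read in the snippet above.

```python
from typing import Dict, List, Tuple


def split_by_merge_state(prs: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Partition PR JSON objects into (infos, not_completed) by their state."""
    infos: List[Dict] = []
    not_completed: List[Dict] = []
    for pr in prs:
        # Same normalization as the existing check: missing state counts as unmerged.
        state = str(pr.get("state", "")).upper()
        (infos if state == "MERGED" else not_completed).append(pr)
    return infos, not_completed
```

Both code paths could then call this once and decide separately whether a non-empty `not_completed` list is a warning or an error.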

print("\nDone. Created PRs:")
for n, url in created:
    print(f" #{n}: {url}")

Contributor

Add a checkout back to the branch this script was started from.
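
One way to do that, sketched as a context manager so the checkout also happens when the script exits early. `restore_branch` and the injectable `run` callable are hypothetical names.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List


def git_output(args: List[str]) -> str:
    """Default runner: invoke git and return its trimmed stdout."""
    return subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    ).stdout.strip()


@contextmanager
def restore_branch(run: Callable[[List[str]], str] = git_output) -> Iterator[str]:
    """Remember the current branch and check it out again on exit."""
    start = run(["rev-parse", "--abbrev-ref", "HEAD"])
    if start == "HEAD":
        # Detached HEAD: fall back to the commit hash so it can be restored.
        start = run(["rev-parse", "HEAD"])
    try:
        yield start
    finally:
        run(["checkout", start])
```

Wrapping the body of main in `with restore_branch():` would cover the success path, the conflict path, and the user-abort path with a single change.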
