DRAFT: Critic agent that forks generator settings for PR-style review #1706

xingyaoww · 2026-01-12T20:56:56Z

Summary

Add AgentReviewCritic, a CriticBase implementation that spawns a separate OpenHands agent to review the current git diff.
The critic forks the current agent settings (LLM + agent config) from the running conversation rather than reading critic-specific environment variables.
Includes unit tests for critic output parsing.
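
For readers unfamiliar with what "critic output parsing" involves, here is a minimal sketch, not the PR's implementation. It assumes the reviewer agent ends its reply with a JSON object carrying a `decision` of `pass` or `not_pass` and an optional `summary`. The `ReviewVerdict` type, the `parse_review_output` function, and the JSON field names are all hypothetical stand-ins and are not taken from the SDK.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewVerdict:
    """Hypothetical stand-in for the SDK's result type."""

    passed: bool
    summary: str


def parse_review_output(text: str) -> ReviewVerdict:
    """Return the verdict from the last JSON object that has a "decision" key.

    Anything unparsable is treated as a failed review (fail closed).
    """
    decoder = json.JSONDecoder()
    last = None
    # Try to decode a JSON value at every "{" and keep the last usable one.
    for match in re.finditer(r"\{", text):
        try:
            obj, _end = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and "decision" in obj:
            last = obj
    if last is None:
        return ReviewVerdict(False, "no parsable verdict found")
    decision = str(last["decision"]).strip().lower()
    return ReviewVerdict(decision == "pass", str(last.get("summary", "")))
```

Failing closed means a reviewer that crashes or rambles cannot accidentally approve a diff. Whether the real critic makes the same choice is not stated in this thread.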

Fixes #1704

Checklist

If the PR is changing/adding functionality, are there tests to reflect this?
If there is an example, have you run the example to make sure that it works?
If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
Is the GitHub CI passing?


Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:de8bb36-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-de8bb36-python \
  ghcr.io/openhands/agent-server:de8bb36-python

All tags pushed for this build

ghcr.io/openhands/agent-server:de8bb36-golang-amd64
ghcr.io/openhands/agent-server:de8bb36-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:de8bb36-golang-arm64
ghcr.io/openhands/agent-server:de8bb36-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:de8bb36-java-amd64
ghcr.io/openhands/agent-server:de8bb36-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:de8bb36-java-arm64
ghcr.io/openhands/agent-server:de8bb36-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:de8bb36-python-amd64
ghcr.io/openhands/agent-server:de8bb36-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:de8bb36-python-arm64
ghcr.io/openhands/agent-server:de8bb36-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:de8bb36-golang
ghcr.io/openhands/agent-server:de8bb36-java
ghcr.io/openhands/agent-server:de8bb36-python

About Multi-Architecture Support

Each variant tag (e.g., de8bb36-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., de8bb36-python-amd64) are also available if needed

github-actions · 2026-01-12T21:00:12Z

Coverage Report

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/critic/impl
agent_review.py	91	58	36%	35–36, 38–40, 42–44, 46, 53–56, 58–59, 62–66, 68–74, 76, 79–82, 84–89, 91, 94, 97–99, 104, 116–126, 145–147
TOTAL	15210	4491	70%


xingyaoww · 2026-01-13T17:39:34Z

@OpenHands do a /codereview-roasted and help me address all review comments in this PR.

When you are done, help me write a script in the examples/0 folder that demonstrates this critic working in action. You can refer to the existing delegation example, as well as the example introduced in #1269.

openhands-ai · 2026-01-13T17:39:44Z

I'm on it! xingyaoww can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-01-13T17:45:44Z

Final summary (double-checked)

✅ Checklist against your request

Address all PR review comments found on PR DRAFT: Critic agent that forks generator settings for PR-style review #1706
- Switched parsing/output to use CriticResult directly (no custom output model).
- Added llm: LLM | None field to AgentReviewCritic so callers can pass the LLM explicitly (removes the need for the “extract_llm hack” as the primary path).
Add an example script demonstrating the critic working in action
- Added examples/00_critic/00_agent_review_critic_stop_hook.py showing a Stop-hook gating flow with AgentReviewCritic.

🔍 Conciseness / extraneous changes check

The diff vs origin/main contains only:
- New critic implementation + exports
- New unit tests for parsing
- New example script
No unrelated refactors or drive-by formatting changes outside those files.

📌 What changed (since the last state on main)

New critic: openhands-sdk/openhands/sdk/critic/impl/agent_review.py
- AgentReviewCritic.llm: LLM | None = None
- _parse_output() now returns CriticResult
- evaluate() returns parsed CriticResult directly
Exports wired up
- openhands-sdk/openhands/sdk/critic/__init__.py
- openhands-sdk/openhands/sdk/critic/impl/__init__.py
New tests: tests/sdk/critic/test_agent_review_critic.py
- Updated assertions to match CriticResult
New example: examples/00_critic/00_agent_review_critic_stop_hook.py
- Creates a temp git repo, generates a diff, runs the critic, and uses a Stop hook to gate stopping until approval.
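
The first two steps of the example (create a temporary git repo, generate a diff) can be sketched with the standard library and the `git` CLI. This is an illustrative sketch, not the contents of the example script. The helper names `git` and `make_repo_with_diff`, the file name `app.py`, and the committer identity are assumptions made for the demo.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def make_repo_with_diff(root: Path) -> str:
    """Initialise a repo, commit one file, then modify it and return `git diff`."""
    git(root, "init", "--quiet")
    # Local identity so the commit works on machines without global git config.
    git(root, "config", "user.email", "demo@example.com")
    git(root, "config", "user.name", "Demo")
    target = root / "app.py"
    target.write_text("def add(a, b):\n    return a + b\n")
    git(root, "add", "app.py")
    git(root, "commit", "--quiet", "-m", "initial commit")
    # Introduce an uncommitted change that a reviewer should flag.
    target.write_text("def add(a, b):\n    return a - b\n")
    return git(root, "diff")
```

Calling `make_repo_with_diff(Path(tmp))` inside a `tempfile.TemporaryDirectory()` context yields the unified diff text that a reviewer agent would then be asked to judge.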

🚀 Push status

Changes were pushed to the remote branch openhands/critic-agent-reviewer (PR DRAFT: Critic agent that forks generator settings for PR-style review #1706 updated).


openhands-ai · 2026-01-13T17:51:10Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- [Optional] Docs example

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1706 at branch `openhands/critic-agent-reviewer`

Feel free to include any additional details that might help me get this PR into a better state.


xingyaoww · 2026-01-14T15:52:21Z

examples/00_critic/00_agent_review_critic_stop_hook.py

We should put it under https://github.com/OpenHands/software-agent-sdk/tree/main/examples/01_standalone_sdk

xingyaoww · 2026-01-14T15:55:54Z

examples/00_critic/00_agent_review_critic_stop_hook.py

+
+if __name__ == "__main__":
+    raise SystemExit(main())
+"""


This seems not ideal. Ideally, should a hook also accept a callback function?
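
To make the suggestion concrete, here is one possible shape for a hook that accepts either a shell command or a Python callback. This is only a sketch of the idea under discussion. `StopHook`, `HookDecision`, `HookCallback`, and the payload key `critic_passed` are hypothetical names and do not describe the SDK's existing hook API.

```python
import json
import subprocess
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class HookDecision:
    """Whether the agent may stop, plus a reason to show when it may not."""

    allow: bool
    reason: str = ""


HookCallback = Callable[[dict], HookDecision]


@dataclass
class StopHook:
    """Hypothetical hook whose handler is a shell command or a callable."""

    handler: Union[str, HookCallback]

    def run(self, payload: dict) -> HookDecision:
        if callable(self.handler):
            # In-process path: no script embedded in a string literal.
            return self.handler(payload)
        # Command path: the payload goes to stdin, exit code 0 means "allow".
        proc = subprocess.run(
            self.handler,
            shell=True,
            input=json.dumps(payload),
            capture_output=True,
            text=True,
        )
        return HookDecision(allow=proc.returncode == 0, reason=proc.stderr.strip())


def require_approval(payload: dict) -> HookDecision:
    """Example callback: block stopping until the critic has approved."""
    approved = bool(payload.get("critic_passed", False))
    reason = "" if approved else "critic has not approved the diff"
    return HookDecision(allow=approved, reason=reason)
```

With a callback, the example would no longer need to write a separate hook script to disk, which appears to be the concern raised here.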

xingyaoww · 2026-01-14T15:56:13Z

openhands-sdk/openhands/sdk/critic/impl/agent_review.py

+    def _extract_llm(self, events: Sequence[LLMConvertibleEvent]) -> LLM | None:
+        for event in events:
+            agent = getattr(event, "agent", None)
+            llm = getattr(agent, "llm", None)
+            if isinstance(llm, LLM):
+                return llm
+
+        for event in events:
+            if not isinstance(event, SystemPromptEvent):
+                continue
+            agent = getattr(event, "agent", None)
+            llm = getattr(agent, "llm", None)
+            if isinstance(llm, LLM):
+                return llm
+
+        return None


We don't need this, since self.llm is already there

xingyaoww · 2026-01-14T15:56:30Z

openhands-sdk/openhands/sdk/critic/impl/agent_review.py

+    `not_pass`.
+    """
+
+    llm: LLM | None = None


This should be `llm: LLM`.

It is a required argument.

xingyaoww · 2026-01-14T15:57:07Z

openhands-sdk/openhands/sdk/critic/impl/agent_review.py

+    def _extract_agent(self, events: Sequence[LLMConvertibleEvent]) -> Agent | None:
+        for event in events:
+            agent = getattr(event, "agent", None)
+            if isinstance(agent, Agent):
+                return agent
+
+        for event in events:
+            if not isinstance(event, SystemPromptEvent):
+                continue
+            agent = getattr(event, "agent", None)
+            if isinstance(agent, Agent):
+                return agent
+
+        return None


As with the get planning agent function in openhands.tools.preset, we should reconstruct an Agent instance ourselves.
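
Taken together, the three review comments on agent_review.py point toward a critic that requires an LLM and builds its own reviewer agent, rather than scanning events for one. The sketch below shows that shape with stand-in dataclasses. `LLM`, `Agent`, `build_review_agent`, `AgentReviewCriticSketch`, the `"terminal"` tool name, and the prompt text are all illustrative assumptions, not the SDK's real classes or signatures.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LLM:
    """Stand-in for the SDK's LLM settings; fields are illustrative."""

    model: str
    temperature: float = 0.0


@dataclass(frozen=True)
class Agent:
    """Stand-in for the SDK's Agent."""

    llm: LLM
    tools: tuple[str, ...] = ()
    system_prompt: str = ""


REVIEW_PROMPT = "Review the current git diff and answer pass or not_pass."


def build_review_agent(llm: LLM) -> Agent:
    """Construct a fresh reviewer agent from the caller's LLM settings."""
    # replace() copies the settings so the reviewer does not share the object.
    return Agent(llm=replace(llm), tools=("terminal",), system_prompt=REVIEW_PROMPT)


@dataclass
class AgentReviewCriticSketch:
    # Required field: there is no default, so omitting it fails at construction.
    llm: LLM

    def reviewer(self) -> Agent:
        return build_review_agent(self.llm)
```

Because `llm` has no default, a missing LLM fails at construction time, so the critic never needs to recover one from the event stream.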

Add AgentReviewCritic (forks agent settings)

78cfae6

openhands-ai bot mentioned this pull request Jan 12, 2026

Interleave codereview-roasted with agent task: add critic-agent model that reviews git diff and returns pass/fail #1704

Open

Merge branch 'main' into openhands/critic-agent-reviewer

7248273

Fix example lint/type issues

6180b26

Co-authored-by: openhands <openhands@all-hands.dev>

3 participants
