Fix miscellaneous type hints #45

evanjhoward11 · 2025-12-20T01:46:03Z

Description

I fixed the type hints for the following files:

src/agentunit/adapters/agentops_adapter.py
src/agentunit/core/scenario.py
src/agentunit/datasets/builtins.py
src/agentunit/metrics/builtin.py
src/agentunit/production/integrations.py

Note: src/agentunit/metrics/builtin.py still reports one import-untyped error, raised by the ragas.metrics import (see Test Results).

Fixes #13

Type of Change

Code refactoring
Test coverage improvement

Changes Made

Miscellaneous changes to all the files

Testing

Existing test suite passes (pytest)

Test Configuration

Python version: 3.11.4
Operating System: Windows 11
Relevant adapters tested: N/A

Test Results

$ mypy --follow-imports=silent src/agentunit/adapters/agentops_adapter.py
Success: no issues found in 1 source file

$ mypy --follow-imports=silent src/agentunit/core/scenario.py
Success: no issues found in 1 source file

$ mypy --follow-imports=silent src/agentunit/datasets/builtins.py
Success: no issues found in 1 source file

$ mypy --follow-imports=silent src/agentunit/metrics/builtin.py
src\agentunit\metrics\builtin.py:19: error: Skipping analyzing "ragas.metrics": module is installed, but missing library stubs or py.typed marker  [import-untyped]
src\agentunit\metrics\builtin.py:19: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
Found 1 error in 1 file (checked 1 source file)

$ mypy --follow-imports=silent src/agentunit/production/integrations.py
Success: no issues found in 1 source file
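
The one remaining error above comes from importing ragas.metrics, which mypy reports as installed but lacking stubs or a py.typed marker. One common way to handle that, sketched here under the assumption that ragas is an optional dependency (the PR does not say so), is to guard the import and silence only that error code on that one line. The names `ragas_metrics` and `ragas_available` are illustrative and do not come from the repository.

```python
# Sketch: narrow, per-line suppression for an untyped third-party import.
# Assumption: ragas is optional; `ragas_metrics` is an illustrative name.
try:
    import ragas.metrics as ragas_metrics  # type: ignore[import-untyped]
except ImportError:
    # Fall back to None so callers can check availability explicitly.
    ragas_metrics = None


def ragas_available() -> bool:
    """Return True when the optional ragas package could be imported."""
    return ragas_metrics is not None
```

Scoping the ignore to `import-untyped` keeps other errors on the same line visible, which a bare `# type: ignore` would hide.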

Code Quality

My code follows the project's style guidelines (Ruff, Black)
I have run pre-commit run --all-files and addressed any issues
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings or errors
I have added type hints where appropriate
I have updated docstrings following Google style

Documentation

I have updated relevant documentation (README, docs/, inline comments)
I have added/updated docstrings for new/modified functions
I have updated CHANGELOG.md under [Unreleased]
I have added examples if introducing new features

Checklist

I have read the CONTRIBUTING.md guide
My branch name follows the convention (feature/, fix/, docs/, etc.)
My commit messages follow the conventional commit format
I have tested my changes locally
I have updated the documentation
I have added tests that prove my fix is effective or that my feature works
All new and existing tests pass
I have checked for security vulnerabilities in any new dependencies

Summary by CodeRabbit

New Features
- AgentOps tracing integration with optional tracing toggle for richer run traces
- Scenario execution now runs asynchronously
Refactor
- Broadened factory method options for more flexible integrations
- Introduced structured dataset row format and optional metadata support
- Safer metadata access in metrics collection and clarified type annotations across the codebase

coderabbitai · 2025-12-20T01:46:11Z

Walkthrough

Replaces LangSmith run management with AgentOps tracing across the AgentOps adapter, makes run_scenario async, switches project identifier usage to project_id, broadens factory method option typings to Any, introduces a DatasetRow TypedDict, and adds defensive metadata access and explicit type annotations.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| AgentOps Integration & Tracing Refactor `src/agentunit/adapters/agentops_adapter.py` | Adds `enable_tracing` handling and `self.client`, replaces LangSmith run lifecycle with `agentops.start_trace` / `agentops.update_trace_metadata` / `agentops.end_trace`, converts `run_scenario` to `async`, updates session/message/interaction methods to emit AgentOps metadata, and switches internal usage from `project_name` to `project_id`. |
| Factory Method Type Hints `src/agentunit/core/scenario.py` | Replaces `options: object` with `options: Any` across multiple factory methods (from_autogen, from_haystack, from_llama_index, from_semantic_kernel, from_phidata, from_promptflow, from_openai_swarm, from_anthropic_bedrock, from_mistral_server, from_rasa_endpoint, from_openai_agents, from_crewai). Adds `Any` to TYPE_CHECKING imports and refines name resolution in `from_openai_agents`. |
| Dataset Type Structure `src/agentunit/datasets/builtins.py` | Introduces `DatasetRow` TypedDict (fields: `id`, `query`, `expected_output`, `tools`, `context`, optional `metadata`), updates `_GAIA_L1_SHOPPING` and `_SWE_BENCH_LITE` to `list[DatasetRow]`, and updates `_build_loader(rows: list[DatasetRow])`. |
| Type Safety & Defensive Coding `src/agentunit/metrics/builtin.py`, `src/agentunit/production/integrations.py` | Replaces direct `trace.metadata` access with `getattr(trace, "metadata", {})` in `CostMetric` and `TokenUsageMetric`. Adds explicit local type annotation `baseline_stats: dict[str, dict[str, dict[str, float]]]` in `_calculate_baseline_stats`. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant AgentOpsAdapter as Adapter
  participant ScenarioRunner as Runner
  participant AgentOps as AgentOpsClient

  Client->>Adapter: run_scenario(scenario)
  Note right of Adapter: run_scenario is async
  Adapter->>AgentOps: start_trace(trace_metadata)
  AgentOps-->>Adapter: trace_id
  Adapter->>Runner: execute scenario (trace_id)
  Runner-->>Adapter: scenario result / events
  Adapter->>AgentOps: update_trace_metadata(trace_id, result/metrics)
  alt success
    Adapter->>AgentOps: end_trace(trace_id, status=completed)
  else failure
    Adapter->>AgentOps: end_trace(trace_id, status=failed, details)
  end
  Adapter-->>Client: ScenarioResult (includes scenario_run_id derived from trace_id)
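
The lifecycle in the diagram above (start a trace, run the scenario, attach results, end the trace with a status) can be sketched with a stand-in tracer. `FakeTracer` and `run_with_trace` are hypothetical names, and the method signatures are assumptions made for illustration; they are not the AgentOps API.

```python
from typing import Any, Callable


class FakeTracer:
    """Hypothetical stand-in for a tracing client; not the AgentOps API."""

    def __init__(self) -> None:
        self.events: list[tuple[str, Any]] = []

    def start_trace(self, name: str) -> str:
        self.events.append(("start", name))
        return "trace-1"

    def update_trace_metadata(self, trace_id: str, metadata: dict[str, Any]) -> None:
        self.events.append(("update", metadata))

    def end_trace(self, trace_id: str, status: str) -> None:
        self.events.append(("end", status))


def run_with_trace(tracer: FakeTracer, name: str, work: Callable[[], Any]) -> Any:
    """Start a trace, run the work, and end the trace whether the work succeeds or fails."""
    trace_id = tracer.start_trace(name)
    try:
        result = work()
    except Exception:
        # Failure branch of the diagram: mark the trace failed, then re-raise.
        tracer.end_trace(trace_id, status="failed")
        raise
    # Success branch: record the result before closing the trace.
    tracer.update_trace_metadata(trace_id, {"result": result})
    tracer.end_trace(trace_id, status="completed")
    return result
```

Re-raising after `end_trace` keeps the original error visible to the caller while still closing the trace.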

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Linked Issues check | ⚠️ Warning | Issue #13 specifically requires type hints for DatasetCase in src/agentunit/datasets/base.py, but the PR changes focus on multiple other files with only indirect related changes to builtins.py, leaving the primary objective unaddressed. | Ensure that src/agentunit/datasets/base.py receives explicit type hints for all DatasetCase fields to directly address issue #13's acceptance criteria. |
| Out of Scope Changes check | ⚠️ Warning | The PR modifies five files (agentops_adapter.py, scenario.py, builtins.py, builtin.py, integrations.py) with extensive type hint and signature changes beyond the scope of issue #13 which targets only DatasetCase in base.py. | Clarify whether all modified files are part of issue #13 or if additional scope has been added; alternatively, create separate issues for the out-of-scope changes in adapters, core, metrics, and production modules. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 48.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |
| Title check | ❓ Inconclusive | The title 'Fix miscellaneous type hints' is vague and generic, using non-descriptive language that doesn't convey the specific nature or scope of the changes. | Provide a more descriptive title that specifies the type of changes (e.g., 'Add type hints and improve type safety across multiple modules' or 'Fix type hint compliance in adapters, core, datasets, metrics, and production modules'). |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The description covers the key sections but is sparse; it lists affected files and testing results without providing comprehensive context about the motivation or detailed changes. |

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/agentunit/adapters/agentops_adapter.py (1)
73-87: Remove inconsistent and outdated docstring block.

Lines 73-87 contain a duplicate/outdated docstring that references "LangSmith" instead of "AgentOps" and has incorrect parameter descriptions (e.g., "Langsmith project ID" on line 78). This conflicts with the actual AgentOps docstring above (lines 56-65).
🔎 Proposed fix
-        """
-        Initialize LangSmith adapter.
-
-        Args:
-            api_key: LangSmith API key
-            project_id: Langsmith project ID
-            endpoint: Optional custom LangSmith endpoint
-            enable_tracing: Whether to enable automatic tracing
-            enable_feedback: Whether to collect feedback data
-        """
-        self.api_key = api_key
-        self.project_id = project_id
-        self.default_tags = default_tags or []
-        self.auto_start_session = auto_start_session
-

🧹 Nitpick comments (2)

src/agentunit/adapters/agentops_adapter.py (2)

70-71: Type annotation added for new attributes.

The self.client: Any annotation provides some type information, though Any sacrifices type safety. Consider using a protocol or the actual agentops client type when available for better type checking.

103-103: Document the import suppression or address the underlying issue.

The # type: ignore[import-not-found] comment suppresses mypy's missing import error. This aligns with the PR description noting that ragas import issues remain. Consider documenting why this is acceptable (e.g., optional dependency) in a comment.
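
The suggestion above to replace `self.client: Any` with a protocol could look roughly like the following. `TraceClient`, `RecordingClient`, and `finish` are hypothetical names, and the two method signatures are placeholders; a real protocol would mirror whichever calls the adapter actually makes.

```python
from typing import Any, Protocol


class TraceClient(Protocol):
    """Hypothetical structural type listing the methods an adapter relies on."""

    def start_trace(self, name: str) -> Any: ...

    def end_trace(self, trace: Any, status: str) -> None: ...


class RecordingClient:
    """Toy implementation, used only to show that the protocol is satisfied."""

    def __init__(self) -> None:
        self.ended: list[str] = []

    def start_trace(self, name: str) -> Any:
        return name

    def end_trace(self, trace: Any, status: str) -> None:
        self.ended.append(status)


def finish(client: TraceClient, name: str) -> None:
    """Accept any object with the right methods; no inheritance is required."""
    trace = client.start_trace(name)
    client.end_trace(trace, status="completed")
```

Unlike `Any`, a protocol lets the type checker flag calls to methods the client does not declare.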

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 315a9a4 and cec0c1d.

📒 Files selected for processing (5)

src/agentunit/adapters/agentops_adapter.py (13 hunks)
src/agentunit/core/scenario.py (12 hunks)
src/agentunit/datasets/builtins.py (4 hunks)
src/agentunit/metrics/builtin.py (2 hunks)
src/agentunit/production/integrations.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (3)

src/agentunit/adapters/agentops_adapter.py (4)

src/agentunit/adapters/langsmith_adapter.py (2)

platform (104-106)

run_scenario (355-516)

src/agentunit/production/integrations.py (1)

MonitoringPlatform (33-41)

src/agentunit/core/scenario.py (1)

Scenario (23-278)

src/agentunit/reporting/results.py (1)

ScenarioResult (32-66)

src/agentunit/datasets/builtins.py (1)

src/agentunit/datasets/base.py (1)

DatasetCase (19-27)

src/agentunit/core/scenario.py (2)

src/agentunit/datasets/base.py (1)

name (38-39)

tests/test_framework_adapters.py (1)

flow (105-107)

🔇 Additional comments (11)

src/agentunit/production/integrations.py (1)

100-100: LGTM! Type annotation correctly describes the baseline stats structure.

The explicit type annotation accurately reflects the nested dictionary structure used for categorizing baseline statistics by metric category, metric name, and statistical measures.
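
As a small illustration of why the annotation matters: mypy cannot infer the element types of an empty dict literal, so a nested structure like this one needs them spelled out. The category, metric name, and sample values below are made up for the sketch.

```python
from statistics import mean

# category -> metric name -> statistic name -> value
baseline_stats: dict[str, dict[str, dict[str, float]]] = {}

# Illustrative samples; the names here are not taken from the repository.
samples = {"performance": {"latency_ms": [100.0, 120.0, 140.0]}}

for category, metrics in samples.items():
    baseline_stats[category] = {}
    for metric_name, values in metrics.items():
        baseline_stats[category][metric_name] = {
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
```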

src/agentunit/metrics/builtin.py (2)

146-148: LGTM! Defensive metadata access prevents AttributeError.

The change from direct trace.metadata access to getattr(trace, "metadata", {}) safely handles cases where the trace object may not have a metadata attribute, falling back to an empty dict.

172-174: LGTM! Consistent defensive pattern applied.

The same safe metadata access pattern is correctly applied here, maintaining consistency with CostMetric above.
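
A minimal sketch of the defensive pattern described above, with one variation that is not part of the PR: `getattr(trace, "metadata", {})` handles a missing attribute, while adding `or {}` also handles an attribute that exists but is `None`. The helper name `read_metadata` is illustrative.

```python
from types import SimpleNamespace
from typing import Any


def read_metadata(trace: Any) -> dict[str, Any]:
    """Return trace.metadata, or an empty dict when it is missing or None."""
    # getattr with a default covers a missing attribute; the trailing
    # `or {}` additionally covers an attribute that is present but None.
    return getattr(trace, "metadata", None) or {}


with_meta = SimpleNamespace(metadata={"cost": 0.25})
without_meta = SimpleNamespace()
none_meta = SimpleNamespace(metadata=None)
```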

src/agentunit/datasets/builtins.py (4)

16-24: LGTM! DatasetRow TypedDict correctly structures dataset rows.

The TypedDict definition properly mirrors the DatasetCase dataclass structure from base.py, with appropriate use of NotRequired for the optional metadata field. This provides strong type checking for dataset row dictionaries.

27-27: LGTM! Explicit type annotation improves type safety.

The type annotation list[DatasetRow] provides clear type information for the dataset, enabling better IDE support and type checking.

44-44: LGTM! Consistent type annotation applied.

The same list[DatasetRow] annotation is consistently applied to this dataset.

64-64: LGTM! Function signature properly typed.

The updated signature with list[DatasetRow] parameter type and Iterable[DatasetCase] return type provides clear type contracts for the loader function.

src/agentunit/core/scenario.py (2)

98-98: LGTM! Using Any for **kwargs is more appropriate than object.

The change from **options: object to **options: Any is the correct approach for variadic keyword arguments, providing better flexibility while maintaining type safety at call sites.

Also applies to: 116-116, 131-131, 146-146, 161-161, 176-176, 191-191, 207-207, 222-222, 242-242
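
The practical difference between the two annotations shows up inside the function body rather than at the call site, since both accept any keyword value. A rough sketch, with illustrative function names and an assumed `timeout` option:

```python
from typing import Any


def build_with_object(**options: object) -> str:
    # Each value is typed `object`, so a type checker rejects value-specific
    # operations until the value is narrowed, for example with isinstance.
    timeout = options.get("timeout", 30)
    if isinstance(timeout, int):
        return f"timeout={timeout + 1}"
    return "timeout=unknown"


def build_with_any(**options: Any) -> str:
    # Each value is typed `Any`, so the checker allows the arithmetic directly.
    # The flexibility comes at the cost of the checker no longer verifying it.
    timeout = options.get("timeout", 30)
    return f"timeout={timeout + 1}"
```

With `Any`, a caller passing `timeout="5"` would fail only at runtime, so the trade-off is convenience against checking.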

71-71: Verify the necessity of the additional None check.

The change from getattr(flow, "__name__", "openai-agents-scenario") to getattr(flow, "__name__", None) or "openai-agents-scenario" adds protection against falsy __name__ values (like empty strings). However, this differs from the pattern used in other factory methods (lines 89, 107, 122, etc.).

Is there a specific case where flow.__name__ could be an empty string or other falsy value? If not, consider keeping consistency with other factory methods for maintainability.
#!/bin/bash
# Check if other factory methods in the same file use similar patterns
rg -n "getattr.*__name__" src/agentunit/core/scenario.py
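
To make the behavioural difference concrete, the two forms only diverge when `__name__` exists but is falsy. A small sketch; `resolve_old`, `resolve_new`, and the test objects are illustrative:

```python
from types import SimpleNamespace

DEFAULT = "openai-agents-scenario"


def resolve_old(flow: object) -> str:
    # The default applies only when the attribute is missing entirely.
    return getattr(flow, "__name__", DEFAULT)


def resolve_new(flow: object) -> str:
    # The default also applies when the attribute is present but falsy.
    return getattr(flow, "__name__", None) or DEFAULT


def named_flow() -> None:
    pass


empty_name = SimpleNamespace(__name__="")  # attribute present, but empty
no_name = object()  # plain instances have no __name__
```

For `named_flow` and `no_name` both helpers agree; for `empty_name` the old form returns the empty string while the new form returns the default.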

src/agentunit/adapters/agentops_adapter.py (2)

355-355: LGTM! Explicit type annotation improves clarity.

The agent_counts: dict[str, int] annotation clearly documents the structure of the dictionary being built.

385-385: AgentOpsAdapter.run_scenario is async, but inconsistently implemented across adapters.

The run_scenario method in AgentOpsAdapter is now async, while LangSmithAdapter and WandbAdapter retain sync implementations. This creates API inconsistency: the abstract base class declares it as async, but not all adapters comply. Additionally, the runner's main execution path calls adapter.execute() rather than adapter.run_scenario(), so this breaking change only affects direct callers of the method. Consider either making all adapters async or clarifying whether run_scenario is the intended public interface (if not, it could be internal).
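
The inconsistency described above can be reproduced in miniature. The classes below are hypothetical and mirror only the async signature under discussion, not the real adapter hierarchy.

```python
import asyncio
from abc import ABC, abstractmethod


class BaseAdapter(ABC):
    """Hypothetical base class declaring an async run_scenario."""

    @abstractmethod
    async def run_scenario(self, name: str) -> str: ...


class AsyncAdapter(BaseAdapter):
    async def run_scenario(self, name: str) -> str:
        await asyncio.sleep(0)  # placeholder for real asynchronous work
        return f"ran {name}"


class SyncAdapter(BaseAdapter):
    # Overriding an async method with a plain one still instantiates,
    # but callers that `await` the result fail at runtime.
    def run_scenario(self, name: str) -> str:  # type: ignore[override]
        return f"ran {name}"


async def call(adapter: BaseAdapter, name: str) -> str:
    """Await the adapter the way a caller of the base class would."""
    return await adapter.run_scenario(name)
```

`asyncio.run(call(AsyncAdapter(), "demo"))` completes normally, whereas the same call with `SyncAdapter()` raises `TypeError` because a plain string cannot be awaited.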

src/agentunit/adapters/agentops_adapter.py

coderabbitai · 2025-12-20T01:54:02Z

src/agentunit/adapters/agentops_adapter.py

+            if isinstance(interaction.from_agent, list):
+                unique_agents.update(interaction.from_agent)
+            else:
+                unique_agents.add(interaction.from_agent)
            if interaction.to_agent:
-                unique_agents.add(interaction.to_agent)
+                if isinstance(interaction.to_agent, list):
+                    unique_agents.update(interaction.to_agent)
+                else:
+                    unique_agents.add(interaction.to_agent)


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the agentops_adapter.py file
find . -type f -name "agentops_adapter.py" | head -20

# Check the file structure
fd "agentunit" --type d | head -10

Repository: aviralgarg05/agentunit

Length of output: 125

🏁 Script executed:

#!/bin/bash
# First, let's check if the file exists and get basic info
if [ -f "src/agentunit/adapters/agentops_adapter.py" ]; then
  echo "File found. Checking lines 240-260 for AgentInteraction definition..."
  sed -n '240,260p' src/agentunit/adapters/agentops_adapter.py
  echo ""
  echo "Checking lines 342-350 for the code in review..."
  sed -n '342,350p' src/agentunit/adapters/agentops_adapter.py
else
  echo "File not found at expected location. Searching for it..."
  find . -name "agentops_adapter.py" -type f 2>/dev/null
fi

# Also search for AgentInteraction class definition
echo ""
echo "Searching for AgentInteraction class definition..."
rg "class AgentInteraction" --type py -A 10 2>/dev/null || echo "Pattern not found with rg"

Repository: aviralgarg05/agentunit

Length of output: 2159

Remove unnecessary list handling for from_agent.

The to_agent field is properly typed as AgentID | list[AgentID] to support broadcast messaging, so the list checks for that field are correct. However, from_agent is typed only as AgentID (string), not as a union type. Remove the isinstance check for from_agent being a list, or update its type hint to AgentID | list[AgentID] if list support is intentional.

🤖 Prompt for AI Agents

In src/agentunit/adapters/agentops_adapter.py around lines 342 to 350, the code treats interaction.from_agent as possibly being a list but its type is only AgentID (string); remove the unnecessary isinstance(interaction.from_agent, list) branch and simply call unique_agents.add(interaction.from_agent), unless you intentionally want to support broadcasting from multiple agents—in that case update the type hint for from_agent to AgentID | list[AgentID] and keep the existing list handling; ensure any callers and type annotations are updated to match the chosen approach.

src/agentunit/adapters/agentops_adapter.py

aviralgarg05

Please fix the issues

codecov-commenter · 2025-12-27T07:00:21Z

Codecov Report

❌ Patch coverage is 5.26316% with 36 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/agentunit/adapters/agentops_adapter.py | 11.76% | 15 Missing ⚠️ |
| src/agentunit/datasets/builtins.py | 0.00% | 12 Missing ⚠️ |
| src/agentunit/metrics/builtin.py | 0.00% | 6 Missing ⚠️ |
| src/agentunit/core/scenario.py | 0.00% | 2 Missing ⚠️ |
| src/agentunit/production/integrations.py | 0.00% | 1 Missing ⚠️ |

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)

src/agentunit/adapters/agentops_adapter.py (8)
73-86: Remove duplicate LangSmith code block.

Lines 73-86 contain a duplicate docstring and attribute assignments copied from the LangSmith adapter. The docstring incorrectly states "Initialize LangSmith adapter" and "Langsmith project ID", and the attribute assignments (api_key, project_id, default_tags, auto_start_session) redundantly repeat lines 66-69. This dead code reduces maintainability and creates confusion.
🔎 Proposed fix
-        """
-        Initialize LangSmith adapter.
-
-        Args:
-            api_key: LangSmith API key
-            project_id: Langsmith project ID
-            endpoint: Optional custom LangSmith endpoint
-            enable_tracing: Whether to enable automatic tracing
-            enable_feedback: Whether to collect feedback data
-        """
-        self.api_key = api_key
-        self.project_id = project_id
-        self.default_tags = default_tags or []
-        self.auto_start_session = auto_start_session
-
         # Initialize AgentOps client
         self._initialize_agentops()

380-388: Update docstring and log message to reference AgentOps instead of LangSmith.

The docstring at lines 380-386 states "Run a scenario with LangSmith integration" and the log message at line 389 references "LangSmith", but this is the AgentOps adapter. Update all references to reflect the correct integration.
🔎 Proposed fix
     async def run_scenario(self, scenario: Scenario) -> ScenarioResult:
         """
-        Run a scenario with LangSmith integration.
+        Run a scenario with AgentOps integration.
 
         Args:
             scenario: Scenario to execute
 
         Returns:
-            ScenarioResult: Execution results with LangSmith trace data
+            ScenarioResult: Execution results with AgentOps trace data
         """
-        logger.info(f"Running scenario with LangSmith: {scenario.name}")
+        logger.info(f"Running scenario with AgentOps: {scenario.name}")
 
-        # Start LangSmith run for the scenario
+        # Start AgentOps run for the scenario
         scenario_run_id = None

484-496: Update remaining LangSmith references to AgentOps.

Lines 484, 524, and associated code comments still reference "LangSmith run" when they should reference AgentOps traces. Update these comments for consistency.
🔎 Proposed fix
-            # Update LangSmith run with results
+            # Update AgentOps trace with results
             if scenario_run_id and self.enable_tracing:
                 try:
                     self.agentops.update_trace_metadata(
Apply similar changes to line 524.
Also applies to: 524-534

537-548: Update collect_metrics docstring to reference AgentOps.

The docstring at lines 539-547 states "Collect production metrics from LangSmith" but this is the AgentOps adapter. Update the docstring to reflect the correct integration platform.
🔎 Proposed fix
     def collect_metrics(self, scenario: Any, result: Any, **kwargs) -> ProductionMetrics:
         """
-        Collect production metrics from LangSmith.
+        Collect production metrics from AgentOps.
 
         Args:
             scenario: The scenario being evaluated

612-677: Update establish_baseline docstring and log messages to reference AgentOps.

The docstring at line 616 states "Establish baseline metrics from historical LangSmith data" and log messages reference "LangSmith baseline" and "LangSmith" errors. Update all references to reflect AgentOps integration.
🔎 Proposed fix
     def establish_baseline(
         self, historical_data: list[dict[str, Any]], metrics: list[str], **kwargs
     ) -> BaselineMetrics:
         """
-        Establish baseline metrics from historical LangSmith data.
+        Establish baseline metrics from historical AgentOps data.
 
         Args:
             historical_data: Historical data for baseline calculation
Apply similar updates to log messages at lines 642, 670, etc.

679-697: Update helper method docstring to reference AgentOps.

Line 680 docstring states "Extract metrics from LangSmith runs" but should reference AgentOps for consistency with the adapter's purpose.
🔎 Proposed fix
     def _extract_run_metrics(self, runs):
-        """Extract metrics from LangSmith runs."""
+        """Extract metrics from AgentOps runs."""
         durations = []

713-747: Update create_evaluation_dataset docstring and log messages to reference AgentOps.

The docstring at line 717 and log messages at lines 742 and 746 reference "LangSmith" when they should reference AgentOps.
🔎 Proposed fix
     def create_evaluation_dataset(
         self, name: str, examples: list[dict[str, Any]], description: str | None = None
     ) -> str:
         """
-        Create an evaluation dataset in LangSmith.
+        Create an evaluation dataset in AgentOps.
 
         Args:
             name: Dataset name
Apply similar updates to log messages.

749-776: Critical: run_evaluation uses LangSmith SDK instead of AgentOps SDK.

The run_evaluation method at lines 749-776 imports and calls langsmith.evaluation.evaluate, even though this is the AgentOps adapter and self.client is set to the AgentOps SDK (line 71: self.client = agentops). This will cause a runtime failure. Additionally, the current AgentOps Python SDK does not provide evaluation dataset or evaluation-run APIs (marked as "forthcoming" in their roadmap). Either remove this method or rewrite it completely using AgentOps APIs once they become available. Also fix the duplicate LangSmith docstring in __init__ (lines 73-82).

🧹 Nitpick comments (1)

src/agentunit/adapters/agentops_adapter.py (1)
72-72: Consider making platform a property for consistency with other adapters.

Line 72 assigns self.platform directly as an instance attribute. However, other adapters (LangSmithAdapter and WandbAdapter) implement platform as a @property method returning MonitoringPlatform. For consistency across adapters, consider refactoring this to match the pattern:
🔎 Proposed refactor

Remove line 72 and add a property method after the __init__ method:
-        self.platform = MonitoringPlatform.AGENTOPS
Then add after the __init__ method (around line 98):
    @property
    def platform(self) -> MonitoringPlatform:
        """Return the monitoring platform type."""
        return MonitoringPlatform.AGENT_OPS
Note: Use MonitoringPlatform.AGENT_OPS (with underscore) to match the enum definition in src/agentunit/production/integrations.py.
Based on relevant code snippets showing the pattern used by LangSmithAdapter and WandbAdapter.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cec0c1d and e0c8efd.

📒 Files selected for processing (1)

src/agentunit/adapters/agentops_adapter.py

🧰 Additional context used

🧬 Code graph analysis (1)

src/agentunit/adapters/agentops_adapter.py (4)

src/agentunit/adapters/langsmith_adapter.py (1)

platform (104-106)

src/agentunit/adapters/wandb_adapter.py (1)

platform (117-119)

src/agentunit/production/integrations.py (1)

MonitoringPlatform (33-41)

src/agentunit/reporting/results.py (1)

ScenarioResult (32-66)

🔇 Additional comments (4)

src/agentunit/adapters/agentops_adapter.py (4)

245-245: LGTM! Default to "broadcast" addresses validation concern.

The fallback to_agent or "broadcast" correctly prevents the ValueError("to_agent is required") that would occur with an empty string, and aligns with the broadcast semantics used in other adapters.

348-348: LGTM! Explicit type annotation improves clarity.

Adding the explicit type hint dict[str, int] for agent_counts improves code clarity and helps with type checking.

70-71: LGTM! Good type hint additions.

Extracting enable_tracing from kwargs and adding the explicit type annotation self.client: Any improves code clarity. While Any is broad, it's reasonable for a dynamically-typed client library.

378-378: No changes needed - async signature is correct and required by MultiAgentAdapter abstract method.

The run_scenario method must be async because it implements the abstract method from MultiAgentAdapter (line 372 in src/agentunit/multiagent/__init__.py), which defines async def run_scenario. This is not a premature change—it is the required implementation signature for the base class. No internal callers were found because this is an abstract method implemented by subclasses; external consumers would already be designed to handle async execution when using these adapters.

Likely an incorrect or invalid review comment.

coderabbitai · 2025-12-29T23:51:37Z

src/agentunit/adapters/agentops_adapter.py

+            import agentops  # type: ignore[import-not-found]

            self.agentops = agentops
+            self.client = agentops


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for list_runs usage in the codebase
echo "=== Searching for list_runs usage ==="
rg -n "\.list_runs\(" --type py -C 3

echo ""
echo "=== Searching for self.client usage in agentops_adapter.py ==="
rg -n "self\.client\." src/agentunit/adapters/agentops_adapter.py -C 2

echo ""
echo "=== Check lines 551, 637 specifically ==="
sed -n '549,553p; 635,639p' src/agentunit/adapters/agentops_adapter.py

echo ""
echo "=== Check if AgentOps client is instantiated ==="
rg -n "agentops\.(Client|init|get_session)" --type py -C 2

Repository: aviralgarg05/agentunit

Length of output: 4591

🌐 Web query:

AgentOps Python SDK client API list_runs method

💡 Result:

Do you mean the AgentOps SDK's method signature / example usage for a client.list_runs (to list runs/traces), or are you seeing an error when calling list_runs? Which language (pure agentops Python SDK, or the Public API via HTTP) and which version do you have?

🌐 Web query:

AgentOps Python SDK init method returns client session

💡 Result:

Yes — agentops.init() creates the SDK client and (by default) starts a session and returns a Session object. If you call agentops.init(auto_start_session=False) it will initialize the client without starting/returning a session. The docs also note the init call constructs the underlying Client (accessible via get_client()). [1][2]

Sources:

SDK reference (init, get_client). [1]

Sessions / Starting a session (auto_start_session behavior). [2]

🌐 Web query:

AgentOps SDK API methods list_runs create_dataset create_example

💡 Result:

I couldn't find public AgentOps SDK methods named exactly list_runs, create_dataset, or create_example in the AgentOps docs or SDK source. The official docs show the Python/TS SDK surface and a REST API (you can fetch the OpenAPI spec for full endpoint names), and AgentOps exposes session/run-related endpoints via its REST API. See:

SDK reference / quickstart (Python SDK overview). [1]

REST API / integrations page (mentions OpenAPI spec & session/run auth flow). [2]

If you want, I can:

fetch the AgentOps OpenAPI (JSON) and extract exact endpoint/method names (recommended), or

search the AgentOps GitHub repos for those exact method names and return code examples.

Which would you prefer?

Sources:
[1] https://docs.agentops.ai/v2/usage/sdk-reference
[2] https://docs.agentops.ai/v1/integrations/rest

🏁 Script executed:

#!/bin/bash
# Check the context around agentops.init() to see what it returns
echo "=== Lines around agentops.init() calls ==="
sed -n '105,125p' src/agentunit/adapters/agentops_adapter.py

echo ""
echo "=== Check what self.client is used for ==="
rg -n "self\.client\." src/agentunit/adapters/agentops_adapter.py | head -20

echo ""
echo "=== Check if there's a get_client or session pattern ==="
rg -n "agentops\.(get_client|get_session|Session|Client)" --type py

Repository: aviralgarg05/agentunit

Length of output: 1232

Critical issue: self.client is assigned to the agentops module but code calls non-existent LangSmith API methods.

At lines 103–106, the code assigns the agentops module to both self.agentops and self.client. However, agentops.init() (lines 110, 116) returns a Session object that is never captured. More critically, lines 551, 636, 728, and 735 attempt to call self.client.list_runs(), self.client.create_dataset(), and self.client.create_example() — methods that do not exist in the AgentOps SDK. These are LangSmith API methods (as evidenced by the comment "Query recent runs from LangSmith" at line 550). The adapter appears to be an incompletely adapted copy of the LangSmith adapter and will fail at runtime with AttributeError when these methods are invoked. Either capture the Session object returned by agentops.init() and use AgentOps-compatible APIs, or remove functionality that relies on non-existent methods.

🤖 Prompt for AI Agents

In src/agentunit/adapters/agentops_adapter.py around lines 103 to 106, the adapter assigns the agentops module to self.client but never captures the Session returned by agentops.init(), and later calls LangSmith-specific methods (list_runs, create_dataset, create_example) that do not exist on AgentOps; fix by changing the initialization to store the Session returned by agentops.init() into self.client/self.session and refactor subsequent calls to use AgentOps-compatible APIs on that Session, or remove/replace the LangSmith-specific methods—ensure every call site (lines ~551, 636, 728, 735) either calls a real AgentOps method on the Session or is removed, and update imports/docs accordingly.
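A minimal sketch of the failure mode described above, assuming nothing about the real AgentOps SDK: `fake_agentops` is a hypothetical stand-in module, `AdapterSketch` and `missing_methods` are illustrative names, and the value returned by the stubbed `init` is a placeholder rather than a real Session. It shows why calling LangSmith-style methods on a module object fails, and how a guard could surface the problem early.

```python
import types

# Hypothetical stand-in for the imported `agentops` module (the real SDK is not used here).
fake_agentops = types.ModuleType("agentops")
fake_agentops.init = lambda **kwargs: {"session": "stub"}  # placeholder return value

# Method names the review identifies as LangSmith-specific.
LANGSMITH_ONLY_METHODS = ("list_runs", "create_dataset", "create_example")


def missing_methods(client: object, names: tuple[str, ...]) -> list[str]:
    """Return the names in `names` that `client` does not expose as callables."""
    return [name for name in names if not callable(getattr(client, name, None))]


class AdapterSketch:
    """Illustrative adapter skeleton, not the project's actual class."""

    def __init__(self, module: types.ModuleType) -> None:
        self.agentops = module
        self.client = module  # mirrors the assignment the review flags
        # Capture the value returned by init() instead of discarding it.
        self.session = module.init(api_key="KEY")


adapter = AdapterSketch(fake_agentops)
missing = missing_methods(adapter.client, LANGSMITH_ONLY_METHODS)
print(missing)

try:
    adapter.client.list_runs()  # the module has no such attribute
except AttributeError as exc:
    failure = str(exc)
    print(failure)
```

A check like `missing_methods` could run once at adapter construction so that an unsupported feature is reported up front rather than in the middle of a scenario run.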

coderabbitai · 2025-12-29T23:51:37Z

src/agentunit/adapters/agentops_adapter.py

+                run_id = self.agentops.start_trace(
                    name=f"Scenario: {scenario.name}",
                    run_type="chain",
-                    project_name=self.project_name,
-                    inputs={"scenario": scenario.name, "description": scenario.description},
+                    project_name=self.project_id,
+                    inputs={"scenario": scenario.name},
                    tags=["agentunit", "scenario"],
                )
-                scenario_run_id = str(run.id)
+                scenario_run_id = str(run_id)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

AgentOps Python SDK start_trace update_trace_metadata end_trace API documentation

💡 Result:

Here are the AgentOps Python SDK docs and the key details for start_trace, update_trace_metadata, and end_trace.

start_trace(trace_name: str = None, tags: Union[Dict, List] = None): manually starts a trace and returns a TraceContext, which can also be used as a context manager in a with statement. Use it when auto-start is disabled or you need separate traces. [1][2]

update_trace_metadata(metadata: Dict[str, Any], prefix: str = "trace.metadata") -> bool: updates metadata on the running trace (strings, numbers, booleans, lists), maps common keys to semantic conventions (e.g., tags, agent_name), validates types, and returns True or False. Useful for adding progress, stage, or results mid-run. [1][2][3]

end_trace(trace: TraceContext = None, end_state: str = None, end_prompt: str = None, error_message: str = None): ends a specific trace (or all active traces if trace is omitted); you can set an end_state such as "Success" or "Error" and provide optional end details. Context-manager usage ends the trace automatically on exit (Success or Error, depending on whether an exception was raised). [1][2][4]

Concise examples (from docs):

Manual start/update/end
import agentops
agentops.init(api_key="KEY", auto_start_session=False)
trace = agentops.start_trace("my-workflow", tags=["prod"])
# ... run ...
agentops.update_trace_metadata({"stage":"processing","tags":["in-progress"]})
agentops.end_trace(trace, "Success")
[1][2]

Docs pages:

SDK Reference (start_trace, update_trace_metadata, end_trace). [1]

Manual Trace Control / examples. [2]

Quickstart / updating trace metadata snippet. [3]

Context manager / with start_trace behavior. [4]

Would you like direct links to any of those pages or a full code snippet tailored to your app?

🏁 Script executed:

cd /tmp && find . -name "agentops_adapter.py" -type f 2>/dev/null | head -1

Repository: aviralgarg05/agentunit

Length of output: 48

🏁 Script executed:

# Search for the file in the repository
git ls-files | grep agentops_adapter.py

Repository: aviralgarg05/agentunit

Length of output: 108

🏁 Script executed:

# Check imports and initialization in the adapter file
head -50 src/agentunit/adapters/agentops_adapter.py

Repository: aviralgarg05/agentunit

Length of output: 1388

🏁 Script executed:

# Search for other start_trace calls in the codebase
rg "start_trace" --type py

Repository: aviralgarg05/agentunit

Length of output: 160

🏁 Script executed:

# Check the full __init__ method and how self.agentops is initialized
sed -n '40,150p' src/agentunit/adapters/agentops_adapter.py

Repository: aviralgarg05/agentunit

Length of output: 3978

🏁 Script executed:

# Look for agentops import and initialization
rg "import.*agentops|from.*agentops|self\.agentops\s*=" src/agentunit/adapters/agentops_adapter.py -A 3

Repository: aviralgarg05/agentunit

Length of output: 235

🏁 Script executed:

# Check around line 394-401 for more context
sed -n '380,410p' src/agentunit/adapters/agentops_adapter.py

Repository: aviralgarg05/agentunit

Length of output: 1221

🏁 Script executed:

# Check if there's a custom wrapper around agentops being used
rg "class.*AgentOps|def start_trace|def update_trace_metadata|def end_trace" src/agentunit/adapters/agentops_adapter.py -B 2 -A 5

Repository: aviralgarg05/agentunit

Length of output: 316

🏁 Script executed:

# Check the entire run_scenario method to see how start_trace, update_trace_metadata, end_trace are used
sed -n '370,550p' src/agentunit/adapters/agentops_adapter.py

Repository: aviralgarg05/agentunit

Length of output: 6746

All three AgentOps trace method calls use incorrect parameter signatures incompatible with the official SDK.

Lines 394-401 call start_trace(name=, run_type=, project_name=, inputs=, tags=) but the AgentOps SDK signature is start_trace(trace_name: str = None, tags: Union[Dict, List] = None). The parameters run_type, project_name, and inputs do not exist in the SDK.

Lines 487-491 and 527-531 call update_trace_metadata(trace_id=, metadata=) and end_trace(trace_id=, status_code=) which use non-existent parameters. The correct signatures are:

update_trace_metadata(metadata: Dict[str, Any], prefix: str = "trace.metadata")

end_trace(trace: TraceContext = None, end_state: str = None, ...)

Additionally, the method docstring incorrectly states "Run a scenario with LangSmith integration" (line ~382) despite being in the AgentOpsAdapter class.

Refactor all three method calls to match the official AgentOps SDK signatures, and correct the docstring.

🤖 Prompt for AI Agents

In src/agentunit/adapters/agentops_adapter.py around lines 382 and 394-401 (and also the calls near 487-491 and 527-531), the AgentOps SDK calls use the wrong parameter names and the docstring is wrong. Change the method docstring to reference AgentOps integration (not LangSmith). Refactor the start_trace call to use the SDK signature start_trace(trace_name=..., tags=...), moving project_id, run_type, and inputs into the tags or into a metadata dict passed via update_trace_metadata. Refactor the update_trace_metadata calls to update_trace_metadata(metadata_dict, prefix="trace.metadata"), passing the metadata dict as the first argument. Refactor the end_trace calls to end_trace(trace=trace_context, end_state=...), using an end_state string such as "Success" or "Error" rather than status_code. Ensure the trace passed to end_trace is the TraceContext returned by start_trace, and build metadata and tags as dicts or lists accordingly.
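The refactor requested above might look roughly like the following sketch. It deliberately does not import the AgentOps SDK: `StubTraceSDK` is a hypothetical stand-in whose three methods only mirror the signatures quoted in this review, and `run_scenario_traced`, `project_id`, and `work` are illustrative names rather than the adapter's real interface.

```python
from typing import Any, Callable, Optional, Union


class StubTraceSDK:
    """Hypothetical stand-in mirroring the three signatures quoted in the review."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, Any]] = []

    def start_trace(
        self, trace_name: Optional[str] = None, tags: Union[dict, list, None] = None
    ) -> dict:
        self.calls.append(("start_trace", (trace_name, tags)))
        return {"name": trace_name}  # placeholder for a TraceContext

    def update_trace_metadata(
        self, metadata: dict[str, Any], prefix: str = "trace.metadata"
    ) -> bool:
        self.calls.append(("update_trace_metadata", (metadata, prefix)))
        return True

    def end_trace(self, trace: Any = None, end_state: Optional[str] = None) -> None:
        self.calls.append(("end_trace", (trace, end_state)))


def run_scenario_traced(
    sdk: Any, scenario_name: str, project_id: str, work: Callable[[], Any]
) -> Any:
    """Run `work` inside a trace, passing only parameters the quoted signatures accept."""
    trace = sdk.start_trace(
        trace_name=f"Scenario: {scenario_name}", tags=["agentunit", "scenario"]
    )
    # Values formerly passed as run_type / project_name / inputs move into metadata.
    sdk.update_trace_metadata(
        {"scenario": scenario_name, "project_id": project_id, "run_type": "chain"}
    )
    try:
        result = work()
    except Exception:
        sdk.end_trace(trace, end_state="Error")
        raise
    sdk.end_trace(trace, end_state="Success")
    return result


sdk = StubTraceSDK()
outcome = run_scenario_traced(sdk, "demo", "proj-1", lambda: 42)
print([name for name, _ in sdk.calls])
```

Keeping the trace object returned by `start_trace` and handing it back to `end_trace` avoids the `trace_id=` and `status_code=` keywords that the review reports as unsupported.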

aviralgarg05

Please fix the issues.

evanjhoward11 added 3 commits December 19, 2025 11:13

chore(type hints): update DatasetRow docstring

229b195

chore(type hints): fix NotRequired conflict

717e45f

chore(type hints): fix types hints in 5 more files

cec0c1d

evanjhoward11 mentioned this pull request Dec 20, 2025

Add missing type hints to DatasetCase #13

Open

coderabbitai bot reviewed Dec 20, 2025

View reviewed changes

aviralgarg05 requested changes Dec 20, 2025

View reviewed changes

Chore(type hints): Fix adapters/agentops_adapter.py per PR comments

e0c8efd

coderabbitai bot reviewed Dec 29, 2025

View reviewed changes

aviralgarg05 requested changes Dec 30, 2025

View reviewed changes

