
Conversation

@akiselev
Owner

Motivation

  • Provide per-language test scenarios that exercise the CLI command surface (breakpoints, stepping, stack, threads, context, output, session control) for C, Rust, Go, and Python.
  • Enhance the scenario runner to support common command-based assertions and optional failures so scenarios can express realistic, adapter-dependent behavior.

Description

  • Added an allow_failure flag and output_contains field to CommandExpectation in src/testing/config.rs to allow commands to be marked as permitted to fail and to assert on command outputs.
  • Updated execute_command_step in src/testing/runner.rs to honor allow_failure, validate output_contains against the serialized command response, and print a concise pass message for allowed failures.
  • Extended parse_command in src/testing/runner.rs to support the context/where command (with optional line count) and the output command with --tail/-t and --clear options, mapping them to Command::Context and Command::GetOutput.
  • Added four new YAML scenarios under tests/scenarios/: commands_c.yml, commands_rust.yml, commands_go.yml, and commands_python.yml, each exercising command coverage against the tests/fixtures/simple.* programs and using the new expectation fields.
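To make the new expectation fields concrete, here is a hypothetical scenario step using them. The `output_contains` and `allow_failure` field names come from the description above, but the surrounding step/command schema is assumed, not taken from the repository's actual YAML layout:

```yaml
# Hypothetical sketch of a scenario step; the step/command structure
# is illustrative — only output_contains and allow_failure are from this PR.
- command: "output --tail 10"
  expect:
    success: true
    output_contains: "Hello"     # assert on the serialized command response
- command: "thread 99"
  expect:
    allow_failure: true          # adapter may reject the id; don't fail the run
```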

Testing

  • No automated tests were run as part of this change (no CI or local test suite was invoked).

Codex Task

Copilot AI review requested due to automatic review settings January 21, 2026 04:16
@gemini-code-assist

Summary of Changes

Hello @akiselev, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the debugger's testing framework by introducing a suite of new, language-specific scenarios that validate the functionality of various CLI commands. It also upgrades the underlying scenario runner to provide more sophisticated assertion capabilities, allowing tests to check for specific output content and to gracefully handle commands that are expected to fail. These changes collectively improve the robustness and expressiveness of the debugger's test suite.

Highlights

  • Expanded Command Coverage Testing: New test scenarios have been added for C, Rust, Go, and Python to comprehensively test the debugger's CLI command surface, including breakpoints, stepping, stack, threads, context, output, and session control.
  • Enhanced Scenario Runner Assertions: The scenario runner now supports asserting on the content of command outputs using the new output_contains field in CommandExpectation.
  • Flexible Command Failure Handling: A new allow_failure flag has been introduced in CommandExpectation, enabling scenarios to mark commands that are expected to fail without halting the entire test run.
  • New Debugger Commands Supported: The scenario runner's command parser has been extended to support the context (or where) command with line count, and the output command with --tail/-t and --clear options.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces significant enhancements to the test scenario runner, allowing for more flexible command expectation handling and extending command parsing capabilities. The addition of allow_failure and output_contains fields to CommandExpectation provides better control over test assertions, especially for commands that might not always succeed or whose output needs specific validation. The new context and output command parsing further expands the test runner's ability to simulate CLI interactions. The accompanying language-specific test scenarios for C, Rust, Go, and Python effectively demonstrate and validate these new features, ensuring broader command coverage. Overall, these changes improve the robustness and expressiveness of the testing framework.


if let Some(exp) = expect {
if let Some(expected_substr) = &exp.output_contains {
let output = serde_json::to_string(&value).unwrap_or_default();


medium

The use of unwrap_or_default() on serde_json::to_string(&value) can mask potential serialization errors. If value fails to serialize, output will become an empty string, leading to a TestAssertion error that the output is missing the substring, rather than indicating a serialization issue. This could make debugging test failures more difficult. Consider handling the serde_json::to_string result explicitly to provide a more informative error message if serialization fails.

            let output = serde_json::to_string(&value).map_err(|e| {
                Error::TestAssertion(format!(
                    "Failed to serialize command result for output check: {}",
                    e
                ))
            })?;


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d58beebe5e


Comment on lines +762 to +766
"context" | "where" => {
let lines = if let Some(value) = args.first() {
value.parse().map_err(|_| {
Error::Config(format!("Invalid context line count: {}", value))
})?


P2 Badge Support CLI-style --lines for context

The scenario parser only accepts a positional number for context, so a valid CLI form like context --lines 5 (the CLI definition uses a --lines flag; see src/commands.rs:100-105) will be parsed as a non-numeric argument and fail with Invalid context line count: --lines. This breaks scenarios that mirror the real CLI syntax even though the command is valid for users. Consider handling --lines in the parse_command branch so test scenarios can use the same flags as the CLI.
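One way to act on this suggestion is to accept both the positional and flag forms in the same branch. The sketch below is self-contained and illustrative only: `parse_context_lines`, its argument shape, and the `String` error type are hypothetical stand-ins for the project's `parse_command` branch and `Error::Config`, under the assumption that the branch receives a slice of whitespace-split argument tokens:

```rust
// Hypothetical sketch: accept both `context 5` and `context --lines 5`.
// The function name, argument shape, and error type are illustrative;
// in the real code this logic would live in the "context" | "where" branch.
fn parse_context_lines(args: &[&str]) -> Result<usize, String> {
    match args {
        // No argument: fall back to the default of 5 lines.
        [] => Ok(5),
        // CLI-style flag form: `context --lines 5`.
        ["--lines", value] => value
            .parse()
            .map_err(|_| format!("Invalid context line count: {}", value)),
        ["--lines"] => Err("context --lines requires a number".to_string()),
        // Positional form: `context 5`.
        [value] => value
            .parse()
            .map_err(|_| format!("Invalid context line count: {}", value)),
        _ => Err(format!("Unexpected context arguments: {:?}", args)),
    }
}

fn main() {
    assert_eq!(parse_context_lines(&[]), Ok(5));
    assert_eq!(parse_context_lines(&["7"]), Ok(7));
    assert_eq!(parse_context_lines(&["--lines", "7"]), Ok(7));
    assert!(parse_context_lines(&["--lines", "abc"]).is_err());
    assert!(parse_context_lines(&["--lines"]).is_err());
    println!("ok");
}
```

Rust's slice patterns make the two accepted forms explicit, so scenarios can mirror the real CLI's `--lines` flag without breaking the existing positional shorthand.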



Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.


Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.



pub success: Option<bool>,
/// Substring that should be in the output
pub output_contains: Option<String>,
/// Allow failures without failing the test

Copilot AI Jan 21, 2026


The documentation comment for the allow_failure field is incomplete. It should explain when this flag should be used versus success: false, and clarify the behavior when a command succeeds despite allow_failure: true being set (does it pass or fail?). Consider expanding this to: "Allow the command to fail without failing the test. Unlike success: false which expects failure, this permits either success or failure."

Suggested change
/// Allow failures without failing the test
/// Allow the command to fail without failing the test. Unlike `success: false`
/// which expects failure, this permits either success or failure.

Comment on lines +762 to +771
"context" | "where" => {
let lines = if let Some(value) = args.first() {
value.parse().map_err(|_| {
Error::Config(format!("Invalid context line count: {}", value))
})?
} else {
5
};
Ok(Command::Context { lines })
}

Copilot AI Jan 21, 2026


The context and where command parsing accepts an optional line count parameter, but if a non-numeric value is provided as the first argument, the error message refers to "context line count" regardless of whether the user typed "context" or "where". Consider making the error message dynamic to reflect the actual command used, or accept that "context" is the canonical name shown in error messages.

Comment on lines +762 to +805
"context" | "where" => {
let lines = if let Some(value) = args.first() {
value.parse().map_err(|_| {
Error::Config(format!("Invalid context line count: {}", value))
})?
} else {
5
};
Ok(Command::Context { lines })
}

"output" => {
let mut tail = None;
let mut clear = false;
let mut idx = 0;

while idx < args.len() {
match args[idx] {
"--tail" | "-t" => {
if idx + 1 >= args.len() {
return Err(Error::Config(
"output --tail requires a number".to_string(),
));
}
tail = Some(args[idx + 1].parse().map_err(|_| {
Error::Config(format!("Invalid tail value: {}", args[idx + 1]))
})?);
idx += 2;
}
"--clear" => {
clear = true;
idx += 1;
}
other => {
return Err(Error::Config(format!(
"Unknown output option: {}",
other
)));
}
}
}

Ok(Command::GetOutput { tail, clear })
}

Copilot AI Jan 21, 2026


The newly added command parsing for context/where and output commands lacks unit test coverage. The existing test module includes tests for other command parsers (simple commands, break commands, breakpoint subcommands, print commands), but the new additions are not tested. Consider adding test cases such as test_parse_context_commands() and test_parse_output_commands() to verify correct parsing of line counts, tail options, and clear flags.


Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
