Fix run regressions #102
Conversation
Pull Request Overview
This PR fixes regressions in the CLI run functionality that were introduced when the MCP server was added. The main issue was that the run functionality was refactored to use a pure inner function for both CLI and MCP server, but this broke several CLI behaviors.
- Restores proper newline formatting for remote runs and colored output
- Adds separate sinks for different output formatting needs (ChannelSink vs PlainChannelSink)
- Improves error handling for 422 validation errors in both CLI and MCP contexts
- Reintroduces spinners for CLI runs and ensures proper status monitoring
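A rough sketch of the ChannelSink/PlainChannelSink split described above; the `OutputSink` trait shape and the formatting details here are simplified stand-ins for the crate's actual code:

```rust
use std::sync::mpsc;

/// Shared interface for the two output paths.
trait OutputSink {
    fn send_line(&self, line: String);
}

/// Formats remote log lines ("<ts>: <msg>") before forwarding (CLI-style output).
struct ChannelSink(mpsc::Sender<String>);

impl OutputSink for ChannelSink {
    fn send_line(&self, line: String) {
        let formatted = match line.split_once(": ") {
            Some((ts, msg)) => format!("[{ts}] {msg}"),
            None => line.clone(),
        };
        let _ = self.0.send(formatted);
    }
}

/// Forwards lines untouched: plain text for MCP server responses.
struct PlainChannelSink(mpsc::Sender<String>);

impl OutputSink for PlainChannelSink {
    fn send_line(&self, line: String) {
        let _ = self.0.send(line);
    }
}
```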
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/mock-api-server/main.py | Adds mock 422 error handling for testing invalid parameters |
| tests/integration/features/steps/mcp_steps.py | Adds comprehensive test steps for MCP plain text output and error handling |
| tests/integration/features/steps/cli_steps.py | New CLI-specific test steps for verifying colors, formatting, and spinners |
| tests/integration/features/mcp_app_management.feature | Adds scenarios testing MCP plain text output and error handling |
| tests/integration/features/cli_runs.feature | New feature file testing CLI-specific run behaviors |
| crates/tower-cmd/src/run.rs | Major refactoring with separate sinks, improved error handling, and restored formatting |
| crates/tower-cmd/src/output.rs | Makes format_timestamp public for use in different sinks |
| crates/tower-cmd/src/mcp.rs | Updates MCP run tool to use plain text sink and accept parameters |
| crates/tower-cmd/src/api.rs | Enhances error handling for 4xx/5xx responses |
crates/tower-cmd/src/run.rs
Outdated
```rust
let (output_result, status_result) = tokio::join!(output_task, status_task);

let final_result = output_result.unwrap();
status_result.unwrap(); // This will report crash/success status
```
Copilot AI · Sep 19, 2025
The comment on line 171 is misleading. The unwrap() call will panic if the status task fails, but it won't 'report' anything - it will just crash the program. Consider using proper error handling or updating the comment to reflect the actual behavior.
Suggested change:
```diff
- status_result.unwrap(); // This will report crash/success status
+ if let Err(e) = status_result {
+     eprintln!("Error monitoring app status: {:?}", e);
+     return Err(Error::from(e));
+ }
```
crates/tower-cmd/src/run.rs
Outdated
```rust
if let Some((ts, msg)) = line.split_once(": ") {
    output::log_line(ts, msg, output::LogLineType::Remote);
} else {
    output::write(&format!("{}\n", line));
}
```
Copilot AI · Sep 19, 2025
This timestamp parsing logic is duplicated in multiple sink implementations (StdoutSink, ChannelSink, PlainChannelSink). Consider extracting this into a common helper function to reduce code duplication.
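One way the suggested extraction could look, as a sketch; `split_log_line` and `format_log_line` are hypothetical helper names, not functions from the crate:

```rust
/// Returns (timestamp, message) when the line matches the "<ts>: <msg>"
/// shape that remote runs emit; None for plain lines.
fn split_log_line(line: &str) -> Option<(&str, &str)> {
    line.split_once(": ")
}

/// Example caller: each sink keeps only its own formatting decision,
/// while the parsing lives in one place.
fn format_log_line(line: &str, bracket_ts: bool) -> String {
    match split_log_line(line) {
        Some((ts, msg)) if bracket_ts => format!("[{ts}] {msg}"),
        Some((ts, msg)) => format!("{ts}: {msg}"),
        None => line.to_string(),
    }
}
```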
```rust
        entity: None,
    }))
}
response => unwrap_api_response(async { response }).await,
```
Copilot AI · Sep 19, 2025
Line 168 creates an unnecessary async block just to pass the response to unwrap_api_response. Consider refactoring to handle the success case directly or modify unwrap_api_response to accept the response directly.
Suggested change:
```diff
- response => unwrap_api_response(async { response }).await,
+ response => unwrap_api_response(response).await,
```
bradhe left a comment
Personally, I don't think there's anything critical here that should block the PR. I think there's a bit of follow-up/cleanup we should do later on!
```rust
Ok(response) if response.status.is_client_error() || response.status.is_server_error() => {
    Err(Error::ResponseError(tower_api::apis::ResponseContent {
        tower_trace_id: response.tower_trace_id,
        status: response.status,
        content: response.content,
        entity: None,
    }))
}
```
Wonder if this can be generalized and, as Copilot suggests, this can get pushed into `unwrap_api_response`? Nothing really fancy going on here.
```rust
struct RunRequest {
    #[serde(flatten)]
    common: CommonParams,
    parameters: Option<std::collections::HashMap<String, String>>,
```
nit: I was just thinking that it'd be good to push `use std::collections::HashMap` onto the top of the file. But then I thought there's probably a guideline for when we should `use` something versus directly referencing it, and I'd bet it's about how many times the thing is used in a file?
So, now I'm not sure :) Just thinking out loud, no real action in this comment but perhaps it will inspire conversation elsewhere.
crates/tower-cmd/src/run.rs
Outdated
```rust
impl OutputSink for ChannelSink {
    fn send_line(&self, line: String) {
        let _ = self.0.send(line);
        if let Some((ts, msg)) = line.split_once(": ") {
```
Man this really makes me think that we should be pushing the timestamp into a parameter to send_line on the OutputSink trait, considering we're doing it in two places. Probably want to be able to format the ts differently for different cases...
Could be follow-up.
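A sketch of what that follow-up could look like: the caller parses the timestamp once and passes it through `send_line`, so each sink formats it its own way. The trait change, `CollectingSink`, and `dispatch` are hypothetical, not code from this PR:

```rust
use std::cell::RefCell;

/// Timestamp handed in pre-parsed, so sinks never re-split the line.
trait OutputSink {
    fn send_line(&self, ts: Option<&str>, msg: &str);
}

/// Example sink that just joins the parts back together; a CLI sink could
/// colourize `ts` here instead, and a plain MCP sink could drop it.
struct CollectingSink(RefCell<Vec<String>>);

impl OutputSink for CollectingSink {
    fn send_line(&self, ts: Option<&str>, msg: &str) {
        let line = match ts {
            Some(ts) => format!("{ts}: {msg}"),
            None => msg.to_string(),
        };
        self.0.borrow_mut().push(line);
    }
}

/// The one place the "<ts>: <msg>" shape is parsed.
fn dispatch(sink: &dyn OutputSink, raw: &str) {
    match raw.split_once(": ") {
        Some((ts, msg)) => sink.send_line(Some(ts), msg),
        None => sink.send_line(None, raw),
    }
}
```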
Force-pushed from 08eaa72 to 82762ca
remote run fixes on CLI:
- newlines (!)
- colorized output
- parameter errors correctly reported

local run fixes on CLI:
- error out properly (previously would abort before the status was actually reporting the error)

MCP server:
- new plain text sink instead of using the same between MCP and CLI
- prefix errors with "ERROR: " (probably better than "Oh no!" for an LLM to parse?)
- propagate 422 error from api so that it gets interpreted as an error, and thus able to check parameter validation correctly
…onal logic everywhere
…ied architecture

Enable real-time progress notifications during tool execution and clean up redundant code patterns.

## Real-time Streaming Implementation
- Add RequestContext<RoleServer> parameter to tower_run_local and tower_run_remote
- Implement setup_streaming_output() to forward write() calls to notify_progress()
- Create with_streaming() wrapper functions to abstract the pattern

WHY: Users should see output as it happens during long-running operations, not wait for completion. The SSE infrastructure was designed for this but tools were batching output.

## Remove Redundant Capture Functions
- Delete do_run_local_capture, do_run_remote_capture_plain, do_follow_run_capture_plain
- Fix monitor_output_capture to use global output::write() system properly

WHY: These functions duplicated the global write() capture mechanism and were only used once. The global system already handles capture when CAPTURE_SENDER is set.

## Consolidate Error Handling
- Extract error parsing into dedicated functions: extract_api_error_message, parse_error_response
- Replace nested conditionals with early returns and helper functions

WHY: Error handling logic was scattered and duplicated across handlers. Centralized functions ensure consistent error messages and reduce repetition.

## Validated by Integration Tests
- All 22 test scenarios pass, confirming functionality preserved
- Tests verify both success cases and error handling work correctly
- Streaming behavior validated through existing MCP client integration

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Separate "should we capture" from "where to send" by splitting CAPTURE_SENDER -> (CAPTURE_MODE, CURRENT_SENDER)

So now:
- CAPTURE_MODE: immutable context flag (MCP vs CLI mode)
- CURRENT_SENDER: mutable destination that can change per operation
- Enable concurrent tool executions with client-specific senders
- Update write() and Spinner to use decoupled architecture

Huh? The previous design complected capture context with sender destination. E.g., the previous OnceLock design couldn't handle multiple MCP clients: each tool execution needs output routed to the requesting client, but OnceLock meant all output went to whoever connected first.
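A minimal std-only model of this split. The static names mirror the commit message, but the real crate's wiring, locking, and async handling will differ:

```rust
use std::sync::{mpsc, Mutex, OnceLock};

/// Immutable context flag: set once at startup (true in MCP mode).
static CAPTURE_MODE: OnceLock<bool> = OnceLock::new();

/// Mutable destination: swapped per tool execution, so each MCP client
/// gets its own output rather than whoever connected first.
static CURRENT_SENDER: Mutex<Option<mpsc::Sender<String>>> = Mutex::new(None);

fn set_sender(tx: mpsc::Sender<String>) {
    *CURRENT_SENDER.lock().unwrap() = Some(tx);
}

fn write(line: &str) {
    // Capture only when the context says so AND a sender is installed.
    if *CAPTURE_MODE.get().unwrap_or(&false) {
        if let Some(tx) = CURRENT_SENDER.lock().unwrap().as_ref() {
            let _ = tx.send(line.to_string());
            return;
        }
    }
    // CLI mode (or no sender): fall back to stdout.
    println!("{line}");
}
```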
Collect all output during streaming operations and include it in final responses to ensure MCP integration tests receive expected output.

- Collect output in Arc<Mutex<Vec<String>>> during streaming
- Include captured output in both success and error responses
- Add small delay for message processing completion (unhappy about this but claude suggested it and it fixes the integration test, so committing for now)

Integration tests expected complete output in MCP tool responses, but the previous implementation only sent real-time notifications without including output in the final CallToolResult (oops).
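The collection pattern this commit describes can be sketched like so; `collector` and `final_output` are illustrative names, not the crate's API:

```rust
use std::sync::{Arc, Mutex};

/// Buffer shared between the streaming notifier and the final response:
/// the returned closure pushes each streamed line into the shared Vec.
fn collector() -> (Arc<Mutex<Vec<String>>>, impl Fn(String)) {
    let buf = Arc::new(Mutex::new(Vec::new()));
    let sink = {
        let buf = Arc::clone(&buf);
        move |line: String| buf.lock().unwrap().push(line)
    };
    (buf, sink)
}

/// What would go into the final tool result alongside the notifications.
fn final_output(buf: &Arc<Mutex<Vec<String>>>) -> String {
    buf.lock().unwrap().join("\n")
}
```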
Force-pushed from 82762ca to 83f09e6
- Convert deploy functions to return Results instead of void
- Add proper error variants for DeployApp and DescribeApp API errors
- Update MCP deploy tool to handle and report deploy failures
- Ensure deploy errors are properly propagated up the call stack

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Update test scenarios for better CLI validation coverage
- Enhance mock server with better error responses and endpoints
- Improve test environment setup and configuration
- Add better test step definitions for MCP scenarios
- Update documentation for mock server usage

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Force-pushed from 0d11894 to 91aa8fa
closed in favour of the PR stack that was eventually folded into fix/tow-963-switch-to-global-output-writing (#105)
The MCP server resulted in a bunch of regressions for running an app via the actual CLI — this was because I refactored it to have a pure inner function and a side-effectful outer function, and re-used the inner function in both the MCP server and the CLI. I tried to keep most of the functionality by using a ChannelSink at the time, with the idea that we could still stream logs in. However, fundamentally the behaviour needs to be different.

So:

- remote runs have newlines again (😅)
- there are two different sinks now: `ChannelSink` for the normal formatted lines, and `PlainChannelSink` for no formatting (for MCP server responses)
- 422 errors were not being successfully interpreted as errors by the MCP server, and fundamentally it seemed to be because `unwrap_api_response` from api.rs didn't treat them as errors (which they're not, really — they're 4xx issues). I'm not 100% sure this solution is the best and am interested in other suggestions. However, we want to be able to see that we've passed the wrong parameters to a run, and the MCP server should accordingly return an error when this fails, so now we wrap the call to `unwrap_api_response` with a check to see if there was a client error or server error — and when there is, return an actual Err response that we can match on in the sink logic (see here and here)
- reintroduced spinners (with a boolean to enable or disable them, so as not to spam the MCP server)
- timestamps are colourized again
- the lines appear one by one. Not sure exactly why that was broken before; maybe because it was all on one line they got buffered
- local runs correctly detect when the app errored out locally (this got subtly broken in the refactor: `status_task.abort()` got called before we printed out the failure)
- lots of regression tests so this won't happen again (and of course, the tests caught a bunch of ways my fixes weren't working 🙃)
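A minimal sketch of that 4xx/5xx pre-check using plain std types; `ApiResponse`, `Error`, and `check_status` are stand-ins for the tower_api types and the real wrapper, not the crate's actual signatures:

```rust
/// Stand-in for the tower_api response type.
struct ApiResponse {
    status: u16,
    content: String,
}

#[derive(Debug)]
enum Error {
    Response { status: u16, content: String },
}

/// The check wrapped around unwrap_api_response: any 4xx/5xx (including the
/// 422 validation case) becomes a real Err the sink logic can match on.
fn check_status(response: ApiResponse) -> Result<ApiResponse, Error> {
    if (400..600).contains(&response.status) {
        return Err(Error::Response {
            status: response.status,
            content: response.content,
        });
    }
    Ok(response)
}
```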