Fix run regressions #102

socksy · 2025-09-19T18:50:50Z

The MCP server resulted in a bunch of regression for running an app via the actual CLI — this was because I refactored it to have a pure inner function and a side-effectful outer function, and re-used the inner function in both the MCP server and in the CLI. I tried to keep most of the functionality by using a ChannelSink at the time, with the idea that we could still stream logs in. However, fundamentally the behaviour needs to be different.

So:

remote runs have newlines again (😅)
there's two different sinks now, ChannelSink for the normal formatted lines, and PlainChannelSink for no formatting (for MCP server responses)
422 errors were not being successfully interpreted as errors by the MCP server, and fundamentally it seemed to be because the unwrap_api_response from the api.rs didn't treat them as errors (which they're not really — they're 4xx issues). I'm not 100% sure this solution is the best and am interested in other suggestions

However we want to be able to see we've passed in the wrong parameters to a run, and the MCP server should accordingly return with an error when this fails, so now we wrap the call to unwrap_api_response with a check to see if there was a client error or server error — and when there is, return an actual Err response that we can match for in the sink logic (see here and here)
reintroduce spinners (and pass in a boolean to enable or disable them so as not to spam the MCP server)
timestamps are colourized again
the lines appear one by one. Not sure exactly why that was broken before, maybe the fact it was all on one line they got buffered
local runs correctly detect when the app errored out locally (this got subtly broke in the refactor, status_task.abort(); got called before we printed out the failure)
lots of regression tests so this won't happen again (and of course, the tests caught a bunch of ways my fixes weren't working 🙃)

Copilot

Pull Request Overview

This PR fixes regressions in the CLI run functionality that were introduced when the MCP server was added. The main issue was that the run functionality was refactored to use a pure inner function for both CLI and MCP server, but this broke several CLI behaviors.

Restores proper newline formatting for remote runs and colored output
Adds separate sinks for different output formatting needs (ChannelSink vs PlainChannelSink)
Improves error handling for 422 validation errors in both CLI and MCP contexts
Reintroduces spinners for CLI runs and ensures proper status monitoring

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
tests/mock-api-server/main.py	Adds mock 422 error handling for testing invalid parameters
tests/integration/features/steps/mcp_steps.py	Adds comprehensive test steps for MCP plain text output and error handling
tests/integration/features/steps/cli_steps.py	New CLI-specific test steps for verifying colors, formatting, and spinners
tests/integration/features/mcp_app_management.feature	Adds scenarios testing MCP plain text output and error handling
tests/integration/features/cli_runs.feature	New feature file testing CLI-specific run behaviors
crates/tower-cmd/src/run.rs	Major refactoring with separate sinks, improved error handling, and restored formatting
crates/tower-cmd/src/output.rs	Makes format_timestamp public for use in different sinks
crates/tower-cmd/src/mcp.rs	Updates MCP run tool to use plain text sink and accept parameters
crates/tower-cmd/src/api.rs	Enhances error handling for 4xx/5xx responses

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-09-19T18:52:23Z

crates/tower-cmd/src/run.rs

+    let (output_result, status_result) = tokio::join!(output_task, status_task);
+
+    let final_result = output_result.unwrap();
+    status_result.unwrap(); // This will report crash/success status


The comment on line 171 is misleading. The unwrap() call will panic if the status task fails, but it won't 'report' anything - it will just crash the program. Consider using proper error handling or updating the comment to reflect the actual behavior.

Suggested change

status_result.unwrap(); // This will report crash/success status

if let Err(e) = status_result {

eprintln!("Error monitoring app status: {:?}", e);

return Err(Error::from(e));

}

Copilot · 2025-09-19T18:52:23Z

crates/tower-cmd/src/run.rs

+        if let Some((ts, msg)) = line.split_once(": ") {
+            output::log_line(ts, msg, output::LogLineType::Remote);
+        } else {
+            output::write(&format!("{}\n", line));
+        }


This timestamp parsing logic is duplicated in multiple sink implementations (StdoutSink, ChannelSink, PlainChannelSink). Consider extracting this into a common helper function to reduce code duplication.

Copilot · 2025-09-19T18:52:24Z

crates/tower-cmd/src/api.rs

+                entity: None,
+            }))
+        }
+        response => unwrap_api_response(async { response }).await,


Line 168 creates an unnecessary async block just to pass the response to unwrap_api_response. Consider refactoring to handle the success case directly or modify unwrap_api_response to accept the response directly.

Suggested change

response => unwrap_api_response(async { response }).await,

response => unwrap_api_response(response).await,

bradhe

Personally, I don't think there's anything critical here that should block the PR. I think there's a bit of follow-up/cleanup we should do later on!

bradhe · 2025-09-20T10:34:27Z

crates/tower-cmd/src/api.rs

+        Ok(response) if response.status.is_client_error() || response.status.is_server_error() => {
+            Err(Error::ResponseError(tower_api::apis::ResponseContent {
+                tower_trace_id: response.tower_trace_id,
+                status: response.status,
+                content: response.content,
+                entity: None,
+            }))
+        }


Wonder if this can be generalized and, as Copilot suggests, this can get pushed into update_api_response? Nothing really fancy going on here.

bradhe · 2025-09-20T10:36:07Z

crates/tower-cmd/src/mcp.rs

+struct RunRequest {
+    #[serde(flatten)]
+    common: CommonParams,
+    parameters: Option<std::collections::HashMap<String, String>>,


nit: I was just thinking that it'd be good to push use std::collections::HashMap onto the top of the file. But, then I thought there's probably a guideline for when we should use something versus directly referencing it, and I'd bet it's about how many times the thing is used in a file?

So, now I'm not sure :) Just thinking out loud, no real action in this comment but perhaps it will inspire conversation elsewhere.

bradhe · 2025-09-20T10:38:20Z

crates/tower-cmd/src/run.rs

 impl OutputSink for ChannelSink {
    fn send_line(&self, line: String) {
-        let _ = self.0.send(line);
+        if let Some((ts, msg)) = line.split_once(": ") {


Man this really makes me think that we should be pushing the timestamp into a parameter to send_line on the OutputSink trait, considering we're doing it in two places. Probably want to be able to format the ts differently for different cases...

Could be follow-up.

remote run fixes on CLI: - newlines (!) - colorized output - parameter errors correctly reported local run fixes on CLI: - error out properly (previously would abort before the status was actually reporting the error) MCP server: - new plain text sink instead of using the same between MCP and CLI - prefix errors with "ERROR: " (probably better than "Oh no!" for an LLM to parse?) - propagate 422 error from api so that it gets interpreted as an error, and thus able to check parameter validation correctly

…onal logic everywhere

…ied architecture Enable real-time progress notifications during tool execution and clean up redundant code patterns. ## Real-time Streaming Implementation - Add RequestContext<RoleServer> parameter to tower_run_local and tower_run_remote - Implement setup_streaming_output() to forward write() calls to notify_progress() - Create with_streaming() wrapper functions to abstract the pattern WHY: Users should see output as it happens during long-running operations, not wait for completion. The SSE infrastructure was designed for this but tools were batching output. ## Remove Redundant Capture Functions - Delete do_run_local_capture, do_run_remote_capture_plain, do_follow_run_capture_plain - Fix monitor_output_capture to use global output::write() system properly WHY: These functions duplicated the global write() capture mechanism and were only used once. The global system already handles capture when CAPTURE_SENDER is set. ## Consolidate Error Handling - Extract error parsing into dedicated functions: extract_api_error_message, parse_error_response - Replace nested conditionals with early returns and helper functions WHY: Error handling logic was scattered and duplicated across handlers. Centralized functions ensure consistent error messages and reduce repetition. ## Validated by Integration Tests - All 22 test scenarios pass, confirming functionality preserved - Tests verify both success cases and error handling work correctly - Streaming behavior validated through existing MCP client integration 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

Separate "should we capture" from "where to send" by splitting CAPTURE_SENDER -> (CAPTURE_MODE, CURRENT_SENDER) So now: - CAPTURE_MODE: immutable context flag (MCP vs CLI mode) - CURRENT_SENDER: mutable destination that can change per operation - Enable concurrent tool executions with client-specific senders - Update write() and Spinner to use decoupled architecture Huh? The previous design complected capture context with sender destination. E.G., previous OnceLock design couldn't handle multiple MCP clients - each tool execution needs output routed to the requesting client, but OnceLock meant all output went to whoever connected first.

Collect all output during streaming operations and include it in final responses to ensure MCP integration tests receive expected output. - Collect output in Arc<Mutex<Vec<String>>> during streaming - Include captured output in both success and error responses - Add small delay for message processing completion (unhappy about this but claude suggested it and it fixes the integration test, so committing for now) Integration tests expected complete output in MCP tool responses, but previous implementation only sent real-time notifications without including output in the final CallToolResult (oops).

…there

- Convert deploy functions to return Results instead of void - Add proper error variants for DeployApp and DescribeApp API errors - Update MCP deploy tool to handle and report deploy failures - Ensure deploy errors are properly propagated up the call stack 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

- Update test scenarios for better CLI validation coverage - Enhance mock server with better error responses and endpoints - Improve test environment setup and configuration - Add better test step definitions for MCP scenarios - Update documentation for mock server usage 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>

socksy · 2025-09-29T16:40:15Z

closed in favour of the PR stack that was eventually folded into fix/tow-963-switch-to-global-output-writing (#105)

socksy requested review from bradhe, Copilot, jo-sm, konstantinoscs and sammuti September 19, 2025 18:50

Copilot AI reviewed Sep 19, 2025

View reviewed changes

bradhe approved these changes Sep 20, 2025

View reviewed changes

bradhe self-requested a review September 25, 2025 12:42

socksy marked this pull request as draft September 25, 2025 17:10

socksy force-pushed the fix/cli-run-regressions branch from 08eaa72 to 82762ca Compare September 26, 2025 10:35

socksy marked this pull request as ready for review September 26, 2025 13:06

socksy and others added 16 commits September 26, 2025 18:51

refactor: get rid of timeout (wrong place, wrong time)

8f8cb29

chore: fix tests to actually run CLI properly

9990c34

fix: use global output::write() function instead of sinks and conditi…

45ed31d

…onal logic everywhere

feat(integration tests): add ability to println debug

0bad02c

chore: black fmt

a999a25

chore: clean up imports and cargo fmt

47451dd

chore: change from brittle sleep to a join_handle

8f218c4

fix: runs return correct error type when they fail

d890899

fix: working dir set properly in deploy

e343b03

fix+feat: add --auto-create for apps, fixes mcp's deploy, which hung …

dc8ac31

…there

fix: error handling in mcp server

eb67e3b

refactor: use struct instead of tuple, misc small improvements

83f09e6

socksy force-pushed the fix/cli-run-regressions branch from 82762ca to 83f09e6 Compare September 26, 2025 16:53

socksy and others added 2 commits September 26, 2025 22:30

socksy force-pushed the fix/cli-run-regressions branch from 0d11894 to 91aa8fa Compare September 26, 2025 20:55

chore: cargo fmt

b99c502

socksy mentioned this pull request Sep 26, 2025

fix:use global output::write() function instead of sinks and conditional logic everywhere #105

Merged

socksy closed this Sep 29, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix run regressions #102

Fix run regressions #102

Uh oh!

socksy commented Sep 19, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Sep 19, 2025

Uh oh!

Copilot AI Sep 19, 2025

Uh oh!

Copilot AI Sep 19, 2025

Uh oh!

bradhe left a comment

Uh oh!

bradhe Sep 20, 2025

Uh oh!

bradhe Sep 20, 2025

Uh oh!

bradhe Sep 20, 2025

Uh oh!

socksy commented Sep 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

-    status_result.unwrap(); // This will report crash/success status
+    if let Err(e) = status_result {
+        eprintln!("Error monitoring app status: {:?}", e);
+        return Err(Error::from(e));
+    }

	response => unwrap_api_response(async { response }).await,
	response => unwrap_api_response(response).await,

Fix run regressions #102

Fix run regressions #102

Uh oh!

Conversation

socksy commented Sep 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Sep 19, 2025

Choose a reason for hiding this comment

Uh oh!

bradhe left a comment

Choose a reason for hiding this comment

Uh oh!

bradhe Sep 20, 2025

Choose a reason for hiding this comment

Uh oh!

bradhe Sep 20, 2025

Choose a reason for hiding this comment

Uh oh!

bradhe Sep 20, 2025

Choose a reason for hiding this comment

Uh oh!

socksy commented Sep 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

socksy commented Sep 19, 2025 •

edited

Loading