@LukeButters LukeButters commented Jan 6, 2026

Summary

We often see failures where we are unable to clean up after a script because the "disk space check" has failed. In other words, we don't clean up precisely because the Tentacle is low on disk space.

Since that is insane, this PR changes Tentacle so that it can clean up and check the status of scripts without first checking whether the Tentacle is low on disk space.

This also allows a script to be cancelled and cleaned up when the Tentacle is low on disk space.

See: https://octopusdeploy.slack.com/archives/CG4JP65N2/p1762406103092919?thread_ts=1762404506.174969&cid=CG4JP65N2

AI Summary

  • Fixed catch-22 where low disk space prevented cleanup operations that would free up space
  • Added WorkspaceReadinessCheck enum with explicit Perform/Skip values
  • Updated IScriptWorkspaceFactory.GetWorkspace() to require readiness check specification
  • Updated all call sites to explicitly choose whether disk space validation is needed

Problem

When completing/cleaning up a script, the system was calling GetWorkspace() which always performed a disk space check via CheckReadiness(). When disk space was low, this check would throw an IOException preventing:

  1. Cleanup operations that would actually free up the needed space
  2. Status checks from reading logs to report progress
  3. Cancel operations from accessing workspace to report cancellation status
  4. Any read-only operations that don't require disk space

Solution

Replaced implicit disk space checking with explicit control via a WorkspaceReadinessCheck enum:

  • WorkspaceReadinessCheck.Skip: For cleanup operations (freeing space) and read-only operations (reading logs, checking state)
  • WorkspaceReadinessCheck.Perform: For write operations that require disk space (starting scripts, writing files)

The enum parameter is required (not optional), making it explicit at every call site whether space validation is needed.
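
A minimal sketch of the shape this takes, under simplifying assumptions. The enum name, its two values, and the idea of a required `GetWorkspace` parameter come from this PR; `SketchWorkspace`, `SketchWorkspaceFactory`, the string ticket, the injected `freeBytes` probe, and the byte threshold are hypothetical stand-ins and not Tentacle's actual implementation:

```csharp
using System;
using System.IO;

// Enum named in this PR; callers must pick one value explicitly.
public enum WorkspaceReadinessCheck
{
    Perform,
    Skip
}

// Hypothetical stand-in for the real workspace type.
public sealed class SketchWorkspace
{
    public string Ticket { get; }
    public bool Deleted { get; private set; }

    public SketchWorkspace(string ticket) => Ticket = ticket;

    // Cleanup frees space, so it has to stay reachable when the disk is nearly full.
    public void Delete() => Deleted = true;
}

// Hypothetical factory; the disk probe is injected so the sketch never touches the real disk.
public sealed class SketchWorkspaceFactory
{
    private readonly Func<long> freeBytes;
    private readonly long requiredBytes;

    public SketchWorkspaceFactory(Func<long> freeBytes, long requiredBytes)
    {
        this.freeBytes = freeBytes;
        this.requiredBytes = requiredBytes;
    }

    // No default value for readinessCheck: every call site has to state its intent.
    public SketchWorkspace GetWorkspace(string ticket, WorkspaceReadinessCheck readinessCheck)
    {
        if (readinessCheck == WorkspaceReadinessCheck.Perform)
            CheckReadiness();

        return new SketchWorkspace(ticket);
    }

    private void CheckReadiness()
    {
        if (freeBytes() < requiredBytes)
            throw new IOException("Not enough free disk space to prepare the workspace.");
    }
}
```

With a factory like this, a cleanup path would call `GetWorkspace(ticket, WorkspaceReadinessCheck.Skip)` and then `Delete()`, while a path that starts a script would pass `WorkspaceReadinessCheck.Perform` and let the `IOException` surface when space is short.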

Changes

  • New file: WorkspaceReadinessCheck.cs - Enum with Perform and Skip values
  • IScriptWorkspaceFactory.cs - Changed signature to require WorkspaceReadinessCheck parameter
  • ScriptWorkspaceFactory.cs - Updated implementation to conditionally check readiness based on enum
  • ScriptService.cs - Skip check in CompleteScriptAsync and GetResponse (V1)
  • ScriptServiceV2.cs - Skip check in CompleteScriptAsync and GetResponse; perform check when starting scripts (V2)
  • KubernetesScriptServiceV1.cs - Skip check in CompleteScriptAsync (Kubernetes)
  • ScriptPodLogEncryptionKeyProvider.cs - Perform check when writing keys, skip when reading
  • ScriptPodSinceTimeStore.cs - Perform check (used for both read and write)
  • Test fixtures updated to explicitly specify readiness check behavior

Test plan

  • Verify cleanup succeeds when disk space is below threshold
  • Verify status checks work when disk space is below threshold
  • Verify cancel operations work when disk space is below threshold
  • Verify normal workspace operations (create/prepare) still perform space checks
  • Test all three script service variants (V1, V2, Kubernetes)
  • Confirm workspace cleanup frees up space successfully
  • Verify read-only operations (logs, status) work under low disk space

🤖 Generated with Claude Code

@LukeButters LukeButters requested review from a team as code owners January 6, 2026 05:26
@LukeButters LukeButters force-pushed the luke/skip-readiness-check-on-workspace-cleanup branch 2 times, most recently from aff1dd7 to e8f6549 Compare January 6, 2026 05:50
LukeButters and others added 3 commits January 6, 2026 17:13
… operations

When completing/cleaning up a script, the system was calling GetWorkspace()
which always performed a disk space check. This created a catch-22: when disk
space was low, the check would throw an exception preventing cleanup operations
that would actually free up space. Additionally, read-only operations like
checking status or reading logs were unnecessarily blocked by disk space checks.

Added WorkspaceReadinessCheck enum with Perform/Skip values as a required
parameter to GetWorkspace(). This makes it explicit at each call site whether
disk space validation is needed:
- Skip: For cleanup operations (freeing space) and read-only operations
- Perform: For write operations that require disk space

Changes:
- Created WorkspaceReadinessCheck enum
- Updated IScriptWorkspaceFactory.GetWorkspace() with required enum parameter
- Updated ScriptWorkspaceFactory implementation to conditionally check readiness
- Updated all call sites in ScriptService, ScriptServiceV2, KubernetesScriptServiceV1
- Updated Kubernetes crypto and state store operations
- Updated test fixtures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@LukeButters LukeButters force-pushed the luke/skip-readiness-check-on-workspace-cleanup branch from cbf6e76 to c2230e8 Compare January 6, 2026 06:13
 public interface IScriptWorkspaceFactory
 {
-    IScriptWorkspace GetWorkspace(ScriptTicket ticket);
+    IScriptWorkspace GetWorkspace(ScriptTicket ticket, WorkspaceReadinessCheck readinessCheck);
Contributor Author

I opted for making this a choice the dev must make each time they get the workspace.

Contributor

@evolutionise evolutionise left a comment

LGTM

Happy for it to ship but I wonder whether it would've been simpler to change the behaviour to not throw an exception OR to catch the exception when we're doing a clean up.

@LukeButters
Contributor Author

I am not sure why we do a disk space check on new scripts; maybe a good reason exists, so I left that in place.

If we try/catch, then we won't have a workspace to call delete on, so we can't clean up with that approach.
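
To illustrate the try/catch point, here is a rough sketch that models the pre-PR behaviour described in the summary, where the disk space check ran inside `GetWorkspace()` and threw an `IOException`. All type and method names here (`OldStyleWorkspace`, `OldStyleFactory`, `CleanupAttempt`, the `diskIsLow` flag) are hypothetical and exist only for the illustration:

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a workspace whose Delete() would free disk space.
public sealed class OldStyleWorkspace
{
    public bool Deleted { get; private set; }

    public void Delete() => Deleted = true;
}

// Models the pre-PR behaviour: the check always runs before a workspace is handed out.
public static class OldStyleFactory
{
    public static OldStyleWorkspace GetWorkspace(bool diskIsLow)
    {
        if (diskIsLow)
            throw new IOException("Not enough free disk space.");

        return new OldStyleWorkspace();
    }
}

public static class CleanupAttempt
{
    // Returns true only if cleanup actually ran.
    public static bool TryCleanup(bool diskIsLow)
    {
        try
        {
            var workspace = OldStyleFactory.GetWorkspace(diskIsLow);
            workspace.Delete();
            return true;
        }
        catch (IOException)
        {
            // The exception is thrown before a workspace exists,
            // so there is nothing here to call Delete() on.
            return false;
        }
    }
}
```

Catching the exception stops the failure from propagating, but under low disk space `TryCleanup` still never reaches `Delete()`, which is the situation the explicit `Skip` value avoids.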

@LukeButters LukeButters merged commit 1f238ab into main Jan 7, 2026
51 checks passed
@LukeButters LukeButters deleted the luke/skip-readiness-check-on-workspace-cleanup branch January 7, 2026 23:15