@greynewell greynewell commented Jan 15, 2026

Summary

Implements transparent auto-zipping of repositories for codebase analysis, eliminating the need for manual ZIP creation by agents and users.

Key Features

Automatic ZIP Creation

  • New directory parameter accepts repository paths directly
  • Agents no longer need to run git archive or create ZIPs manually
  • Cross-platform support (Windows, macOS, Linux)

Smart Exclusions

  • Respects .gitignore patterns automatically
  • Excludes sensitive files (.env, *.pem, credentials, keys)
  • Excludes dependencies (node_modules, venv, vendor)
  • Excludes build outputs (dist, build, out)
  • Excludes IDE files (.idea, .vscode)

Resource Management

  • Automatic cleanup of temporary ZIP files after each request
  • Cleanup of stale ZIPs (>24 hours) on server startup
  • No disk clutter from failed operations

Error Handling

  • Clear error messages for common issues (disk space, permissions)
  • 50MB size limit with helpful guidance
  • Validation for non-existent directories and file paths

Backward Compatibility

  • Maintains support for pre-zipped file parameter
  • No breaking changes to existing integrations

Changes

New Files

  • src/utils/zip-repository.ts: Core zipping functionality with gitignore support

Modified Files

  • src/tools/create-supermodel-graph.ts: Updated to support directory parameter
  • src/server.ts: Added cleanup of old ZIPs on startup
  • package.json: Added archiver, ignore, @types/archiver dependencies
  • README.md: Updated documentation with new usage examples

Testing

Comprehensive testing performed:

  • ✅ Auto-zipping test repository (5 files, 1KB)
  • ✅ Auto-zipping MCP codebase itself (26 files, 62.53KB)
  • ✅ Gitignore pattern respect verification
  • ✅ Sensitive file exclusion verification
  • ✅ Error handling (non-existent dirs, file paths, empty dirs)
  • ✅ Cleanup functionality validation

Usage

Before (manual):

git archive -o /tmp/repo.zip HEAD
# Then call tool with file=/tmp/repo.zip

After (automatic):

# Just call tool with directory=/path/to/repo
# Zipping happens transparently

Breaking Changes

None. The file parameter is still supported for backward compatibility.

Related Issues

  • Closes: N/A (proactive improvement)

Co-authored-by

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Summary by CodeRabbit

Release Notes

  • New Features

    • Repository analysis now accepts directory paths with automatic ZIP creation respecting .gitignore patterns.
    • Added query parameter support for processing workflows.
    • Automatic cleanup of temporary ZIP files.
  • Improvements

    • Enhanced error handling with detailed guidance for permission issues and disk space constraints.
    • Improved troubleshooting documentation with debug log details.
  • Documentation

    • Updated usage examples to reflect directory-based input.
    • Expanded troubleshooting section with common issue resolutions.


Implements transparent auto-zipping of repositories for codebase analysis,
eliminating the need for manual ZIP creation by agents and users.

Key Features:
- Automatic ZIP creation from directory paths using 'directory' parameter
- Cross-platform support (Windows, macOS, Linux)
- Respects .gitignore patterns automatically
- Excludes sensitive files (.env, *.pem, credentials, keys)
- Excludes dependencies (node_modules, venv, vendor)
- Excludes build outputs (dist, build, out)
- Automatic cleanup of temporary ZIP files after use
- Cleanup of stale ZIPs (>24h) on server startup
- 50MB size limit with clear error messages
- Comprehensive error handling (disk space, permissions, validation)

Changes:
- Add src/utils/zip-repository.ts with zipRepository() and cleanupOldZips()
- Update src/tools/create-supermodel-graph.ts to support 'directory' param
- Update src/server.ts to cleanup old ZIPs on startup
- Update tool schema to accept 'directory' (recommended) or 'file' (deprecated)
- Update README.md with new usage examples and troubleshooting
- Add dependencies: archiver, ignore, @types/archiver

Backward Compatibility:
- Maintains support for pre-zipped 'file' parameter
- No breaking changes to existing integrations

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

coderabbitai bot commented Jan 15, 2026

Walkthrough

This PR introduces directory-based input for repository analysis, replacing the prior requirement for pre-zipped files. It adds automatic ZIP creation with gitignore support, implements temporary file cleanup mechanisms, and updates documentation to reflect these changes.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation Updates: README.md | Updated usage examples and parameter descriptions to reflect directory-based input instead of pre-zipped files; added automatic features notes and expanded troubleshooting guidance for ZIP-related operations |
| Dependency Additions: package.json | Added archiver and ignore as dependencies, and @types/archiver as a dev dependency to support auto-zipping with gitignore filtering |
| Cleanup Mechanism: src/server.ts | Integrated cleanupOldZips on server startup with a 24-hour threshold to remove stale temporary archives |
| Core Tool Changes: src/tools/create-supermodel-graph.ts | Migrated from required file input to optional directory input; added auto-zipping workflow with gitignore support, input validation (mutual exclusivity), error handling for common failure modes, and cleanup guarantees via finally block |
| ZIP Utility Module: src/utils/zip-repository.ts | New module providing zipRepository() for creating ZIP archives with standard exclusions and .gitignore support, plus cleanupOldZips() for periodic cleanup; includes robust error handling and size enforcement |

Sequence Diagram

sequenceDiagram
    participant Client as Client/API
    participant Tool as create-supermodel-graph<br/>Tool
    participant ZipUtil as zipRepository<br/>Utility
    participant Archiver as Archiver +<br/>Ignore Libs
    participant FileSystem as File System

    Client->>Tool: Request with directory path
    Tool->>Tool: Validate input (directory XOR file)
    alt Directory provided
        Tool->>ZipUtil: zipRepository(directoryPath)
        ZipUtil->>ZipUtil: Build ignore filter<br/>(standards + .gitignore)
        ZipUtil->>Archiver: Create archive
        ZipUtil->>FileSystem: Traverse & add eligible files
        ZipUtil->>FileSystem: Enforce size cap
        Archiver-->>ZipUtil: ZIP created
        ZipUtil-->>Tool: ZipResult {path, cleanup, ...}
    else File provided (deprecated)
        Tool->>Tool: Use file path directly
    end
    Tool->>Tool: Process ZIP content
    Tool->>Client: Return results
    Tool->>FileSystem: cleanup() — remove temp ZIP

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • robotson
  • jonathanpopham

Poem

📦 From ZIP files came a better way,

Auto-archive at the light of day,

Gitignore whispers "skip this file,"

Cleanup sweeps with graceful style,

Directory dreams, now realized at last! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main feature being added: automatic repository zipping with gitignore support, which is the core change across all modified and new files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


@greynewell greynewell requested a review from robotson January 15, 2026 16:36

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/utils/zip-repository.ts`:
- Around line 311-321: The code currently calls fs.stat(fullPath) (variable
fullPath) which follows symlinks and can include files outside rootDir; change
this to use fs.lstat(fullPath) to detect and skip symlinks outright (consistent
with other special-file handling), or if you must follow links, resolve them
with fs.realpath(fullPath) and verify the resolved path is inside rootDir before
including; update the error handling branch accordingly to skip symlinks and log
a warning when a symlink or an out-of-repo resolved path is encountered.
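
A minimal sketch of the second option described above (follow a link only when its target stays inside the repository). The helper names `isInsideRoot` and `shouldInclude` are hypothetical and do not come from the PR; `fs.lstat`, `fs.realpath`, `fs.stat`, and `path.relative` are the standard Node.js calls. To skip symlinks outright instead, return `false` as soon as `isSymbolicLink()` is true.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

// Hypothetical helper: true when `candidate` lies strictly inside `rootDir`.
function isInsideRoot(rootDir: string, candidate: string): boolean {
  const rel = path.relative(rootDir, candidate);
  // "" means the root itself; an absolute result means a different drive on Windows.
  if (rel === "" || path.isAbsolute(rel)) return false;
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}

// Hypothetical helper: decide whether `fullPath` may be added to the archive.
async function shouldInclude(rootDir: string, fullPath: string): Promise<boolean> {
  const stats = await fs.lstat(fullPath); // lstat reports on the link itself
  if (!stats.isSymbolicLink()) return stats.isFile();
  try {
    const realRoot = await fs.realpath(rootDir);
    const realTarget = await fs.realpath(fullPath);
    if (isInsideRoot(realRoot, realTarget)) {
      return (await fs.stat(realTarget)).isFile();
    }
  } catch {
    // Dangling link or unreadable target: fall through and skip it.
  }
  console.warn(`Skipping symlink: ${fullPath}`);
  return false;
}
```

Comparing whole path segments (via `path.relative`) rather than string prefixes avoids treating a sibling such as `/repo-other` as if it were inside `/repo`.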
🧹 Nitpick comments (5)
src/utils/zip-repository.ts (2)

167-181: Potential race condition in size limit enforcement.

You're checking totalSize in the entry event, but entry.stats?.size is the uncompressed size of each file, not the compressed size that ends up in the ZIP. This means:

  1. The actual ZIP file could be smaller than totalSize (when compression helps)
  2. Or slightly larger because of per-entry metadata overhead

There is also a subtle timing issue: archive.abort() does not stop the archive synchronously, so entries that are already being processed may still be written between the call and the moment archiving actually stops. Not a huge deal in practice, but worth knowing.

Consider tracking the number of bytes the archive actually emits, or using the progress event if available.

💡 Alternative approach using actual file size
+  // Track the compressed bytes emitted by the archive. Assuming `output` is the
+  // fs write stream, it never emits 'data'; the archiver instance is the readable side.
+  let compressedSize = 0;
+  archive.on('data', (chunk: Buffer) => {
+    compressedSize += chunk.length;
+    if (compressedSize > maxSizeBytes && !archiveError) {
+      archiveError = new Error(
+        `ZIP size exceeds limit (${formatBytes(maxSizeBytes)}). ` +
+        `Consider excluding more directories.`
+      );
+      archive.abort();
+    }
+  });

193-209: Minor: Redundant error check after promise.

Lines 204-209 check archiveError again after the promise resolves, but if archiveError was set, the promise on lines 193-202 would already have rejected. This code path is unreachable.

That said, it's harmless defensive code - just a tiny bit of dead code. Up to you if you want to clean it up.

🧹 Simplified error handling
   // Wait for output stream to finish
   await new Promise<void>((resolve, reject) => {
     output.on('close', () => {
       if (archiveError) {
         reject(archiveError);
       } else {
         resolve();
       }
     });
     output.on('error', reject);
   });
-
-  // Check for errors during archiving
-  if (archiveError) {
-    // Clean up partial ZIP
-    await fs.unlink(zipPath).catch(() => {});
-    throw archiveError;
-  }

Note: You'd want to add cleanup logic inside the reject path if you remove lines 204-209.

src/tools/create-supermodel-graph.ts (3)

191-209: Error handling is good, but relies on string matching.

The error mapping works, but checking error.message.includes('does not exist') is a bit fragile - if the message text changes in zip-repository.ts, these checks break silently.

A more robust approach would be to use custom error classes or error codes in zipRepository, but honestly for a tool like this, string matching is probably fine. Just something to keep in mind if you refactor later.
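
For reference, a rough sketch of what the error-code approach could look like. Everything here is hypothetical: the `ZipRepositoryError` class, the code names, and the user-facing messages are illustrations, not what zip-repository.ts currently does.

```typescript
// Hypothetical error codes; the PR itself matches on message text instead.
type ZipErrorCode = "DIR_NOT_FOUND" | "NOT_A_DIRECTORY" | "PERMISSION_DENIED" | "NO_SPACE" | "TOO_LARGE";

class ZipRepositoryError extends Error {
  readonly code: ZipErrorCode;

  constructor(code: ZipErrorCode, message: string) {
    super(message);
    this.name = "ZipRepositoryError";
    this.code = code;
    // Keeps instanceof working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Map errors to guidance by code, so rewording a message cannot break the mapping.
function toUserMessage(error: unknown): string {
  if (!(error instanceof ZipRepositoryError)) {
    return "Unexpected error while creating the ZIP archive.";
  }
  switch (error.code) {
    case "DIR_NOT_FOUND":
      return "Directory does not exist. Check the path and try again.";
    case "NOT_A_DIRECTORY":
      return "Path is a file, not a directory.";
    case "PERMISSION_DENIED":
      return "Permission denied. Check read access to the directory.";
    case "NO_SPACE":
      return "Not enough disk space to create the temporary ZIP.";
    case "TOO_LARGE":
      return "Repository exceeds the 50MB limit. Consider excluding more directories.";
  }
}
```

Because the `switch` covers every member of `ZipErrorCode`, adding a new code without handling it becomes a compile-time error rather than a silent fallthrough.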


249-257: Duplicate formatBytes function.

This is identical to the one in src/utils/zip-repository.ts (line 347). You could export it from there and import it here to keep things DRY.

Not a big deal for a 5-line helper, but if you ever need to change the formatting logic, you'd have to remember to change it in both places.

🔧 Extract to shared utility

In src/utils/zip-repository.ts:

-function formatBytes(bytes: number): string {
+export function formatBytes(bytes: number): string {

In src/tools/create-supermodel-graph.ts:

-import { zipRepository } from '../utils/zip-repository';
+import { zipRepository, formatBytes } from '../utils/zip-repository';

-/**
- * Format bytes as human-readable string
- */
-function formatBytes(bytes: number): string {
-  if (bytes < 1024) return `${bytes} B`;
-  if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(2)} KB`;
-  if (bytes < 1024 * 1024 * 1024) return `${(bytes / (1024 * 1024)).toFixed(2)} MB`;
-  return `${(bytes / (1024 * 1024 * 1024)).toFixed(2)} GB`;
-}

1-1: Consider removing @ts-nocheck.

This disables all TypeScript checking for the file. If there's a specific type issue you're working around, a more targeted @ts-expect-error or @ts-ignore on just that line would be better. That way TypeScript can still catch bugs in the rest of the file.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06e916f and e1089da.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (5)
  • README.md
  • package.json
  • src/server.ts
  • src/tools/create-supermodel-graph.ts
  • src/utils/zip-repository.ts
🧰 Additional context used
🧬 Code graph analysis (1)
src/server.ts (1)
src/utils/zip-repository.ts (1)
  • cleanupOldZips (358-396)
🔇 Additional comments (12)
README.md (3)

151-157: Documentation looks solid!

The parameter table clearly explains the new directory option and properly marks file as deprecated. The footnote about mutual exclusivity is helpful. Nice and clear for users! 🎯


175-181: Great addition of the "Automatic features" section.

This gives users a nice at-a-glance view of what the auto-zipping handles for them. Super helpful for setting expectations about what gets excluded.


195-198: Helpful troubleshooting additions.

Good coverage of the new error scenarios (permissions, disk space, directory existence). These will save users a lot of head-scratching when things go wrong.

src/utils/zip-repository.ts (3)

17-83: Nice comprehensive exclusion list!

The security-sensitive files section (lines 54-70) is particularly important - good job covering .env, credentials, keys, and cloud provider secrets. This prevents accidental credential leaks when users zip their repos.

One small thing to consider: the env pattern on line 28 might be a bit aggressive. Some projects have legitimate env/ directories that aren't Python virtualenvs (like config directories). But this is a minor tradeoff - better safe than sorry for a default exclusion list.
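
To make the tradeoff concrete, here is a simplified sketch of directory-name exclusion. It is not the `ignore` library and not the PR's actual list: the sets below are small illustrative subsets taken from the PR description, and `isExcluded` is a hypothetical helper.

```typescript
// Illustrative subsets only; the real list lives in src/utils/zip-repository.ts.
const DEFAULT_EXCLUDED_DIRS = new Set<string>([
  "node_modules", "venv", "vendor", "dist", "build", "out", ".idea", ".vscode",
]);
const EXCLUDED_FILES = new Set<string>([".env"]);
const EXCLUDED_EXTENSIONS = [".pem"];

// `relPath` is a POSIX-style path relative to the repository root.
function isExcluded(relPath: string, excludedDirs: Set<string> = DEFAULT_EXCLUDED_DIRS): boolean {
  const segments = relPath.split("/").filter((s) => s.length > 0);
  const fileName = segments[segments.length - 1] ?? "";
  const dirs = segments.slice(0, -1);
  // A matching directory name anywhere in the path excludes everything beneath it.
  if (dirs.some((d) => excludedDirs.has(d))) return true;
  if (EXCLUDED_FILES.has(fileName)) return true;
  return EXCLUDED_EXTENSIONS.some((ext) => fileName.endsWith(ext));
}
```

With a bare `env` added to the directory set, a path such as `config/env/settings.json` is dropped along with any Python virtualenv, which is the over-matching risk mentioned above.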


256-273: Note: Only root .gitignore is parsed.

Just so you're aware - this implementation only reads the .gitignore file in the root directory. Real git supports nested .gitignore files in subdirectories, each applying to their own subtree.

For most repos this is probably fine since the root .gitignore usually covers everything, but some monorepos or projects with complex structures might have patterns only in nested .gitignore files that won't be respected here.

If you want to keep it simple (totally reasonable!), maybe add a note in the docs. If you want full fidelity, you'd need to check for .gitignore in each directory during addFilesRecursively.
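
If you go the full-fidelity route, the first step is working out which .gitignore files apply to a given path. A small sketch under stated assumptions: paths are POSIX-style and relative to the repo root, and `gitignoreDirs` is a hypothetical set of directories already found to contain a .gitignore during traversal. Neither helper exists in the PR.

```typescript
// Ancestor directories of `relPath`, ordered from the root ("") down to the
// file's own directory.
function ancestorDirs(relPath: string): string[] {
  const parts = relPath.split("/").slice(0, -1);
  const dirs = [""];
  for (let i = 0; i < parts.length; i++) {
    dirs.push(parts.slice(0, i + 1).join("/"));
  }
  return dirs;
}

// Which .gitignore files govern `relPath`, root first.
function applicableGitignores(relPath: string, gitignoreDirs: Set<string>): string[] {
  return ancestorDirs(relPath)
    .filter((dir) => gitignoreDirs.has(dir))
    .map((dir) => (dir === "" ? ".gitignore" : `${dir}/.gitignore`));
}
```

The root-first order matters because git lets patterns in deeper .gitignore files override those from higher levels, so rules would be applied in the order returned.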


358-396: Cleanup function looks solid!

Good job on:

  • Only touching files with the supermodel- prefix (won't accidentally delete other stuff)
  • Graceful error handling (won't crash if files are deleted by another process)
  • Logging the cleanup count for debugging

The naming convention supermodel-{random}.zip makes it easy to identify your temp files.
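
The selection rule boils down to something like the sketch below. It assumes staleness is judged by modification time and that temp files end in `.zip`; `TempFileInfo` and `selectStaleZips` are hypothetical names, and the actual cleanupOldZips may differ in detail.

```typescript
interface TempFileInfo {
  name: string;
  mtimeMs: number; // last-modified time in milliseconds since the epoch
}

const ZIP_PREFIX = "supermodel-"; // prefix mentioned in the review
const DAY_MS = 24 * 60 * 60 * 1000;

// Pick only this tool's temp ZIPs that are older than `maxAgeMs`.
function selectStaleZips(files: TempFileInfo[], nowMs: number, maxAgeMs: number = DAY_MS): string[] {
  return files
    .filter((f) => f.name.startsWith(ZIP_PREFIX) && f.name.endsWith(".zip"))
    .filter((f) => nowMs - f.mtimeMs > maxAgeMs)
    .map((f) => f.name);
}
```

Keeping the selection pure (no filesystem access) makes the 24-hour threshold and the prefix guard easy to unit test separately from the actual unlink calls.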

src/server.ts (2)

60-63: Nice server startup cleanup integration!

Running the cleanup before the transport connects is a good call - it ensures old temp files from crashed/killed processes don't pile up. The 24-hour threshold is reasonable.

One tiny thing: since cleanupOldZips handles its own errors internally and never throws, the await here is fine but you might consider logging that cleanup is starting, just for visibility during debugging.


21-22: The instructions update is thorough!

The updated instructions cover the new directory-based workflow well. The explanation of automatic zipping, exclusions, and the 50MB limit gives users everything they need to know.

src/tools/create-supermodel-graph.ts (3)

67-74: Clear schema update for the new directory parameter.

The descriptions are helpful - users will immediately understand that directory is the new recommended way and file is deprecated. Good UX!


163-170: Solid input validation!

The mutual exclusivity check (directory XOR file) is clean and the error messages are clear. Users won't be confused about what went wrong.
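
For readers who haven't opened the diff, the check amounts to roughly the following. This is a sketch, not the PR's code: `GraphToolArgs`, `Validation`, `validateInput`, and the error strings are hypothetical stand-ins.

```typescript
interface GraphToolArgs {
  directory?: string;
  file?: string;
}

type Validation =
  | { ok: true; source: "directory" | "file"; path: string }
  | { ok: false; error: string };

// Exactly one of `directory` or `file` must be provided.
function validateInput(args: GraphToolArgs): Validation {
  const hasDir = typeof args.directory === "string" && args.directory.length > 0;
  const hasFile = typeof args.file === "string" && args.file.length > 0;
  if (hasDir === hasFile) {
    return {
      ok: false,
      error: hasDir
        ? "Provide either 'directory' or 'file', not both."
        : "Provide 'directory' (recommended) or 'file' (deprecated).",
    };
  }
  return hasDir
    ? { ok: true, source: "directory", path: args.directory as string }
    : { ok: true, source: "file", path: args.file as string };
}
```

Comparing the two booleans for equality covers both failure cases (neither given, both given) in a single branch.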


218-246: Cleanup logic is correct!

Good use of try/finally to ensure the temp ZIP gets cleaned up even if the query processing fails. The shouldCleanup flag correctly tracks whether we created a temp file that needs cleanup.

The flow is:

  1. If zipRepository throws → early return with error, no cleanup needed (zip wasn't created)
  2. If zipRepository succeeds → shouldCleanup = true, cleanup is set
  3. If anything in the try block fails → finally still runs and cleans up

Nice and tidy! 🧹
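
The same flow, reduced to a sketch of the directory branch only. `analyzeDirectory` and its two callback parameters are hypothetical; `createZip` stands in for zipRepository(directory), and the `ZipResult` shape is limited to the two fields named in the sequence diagram. The real code also handles the `file` branch via the shouldCleanup flag, which this sketch leaves out.

```typescript
interface ZipResult {
  path: string;
  cleanup: () => Promise<void>;
}

async function analyzeDirectory<T>(
  createZip: () => Promise<ZipResult>,
  processZip: (zipPath: string) => Promise<T>,
): Promise<T> {
  // Step 1: if this throws, no ZIP exists yet, so there is nothing to clean up.
  const zip = await createZip();
  try {
    // Step 2: the ZIP exists from here on, so cleanup is owed whatever happens.
    return await processZip(zip.path);
  } finally {
    // Step 3: runs on success and on failure; cleanup errors are swallowed so
    // they cannot mask the original error from processZip.
    await zip.cleanup().catch(() => {});
  }
}
```

Creating the ZIP outside the `try` is what makes case 1 fall through without a cleanup attempt.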

package.json (1)

39-40: Dependencies are spot-on. All versions are current—you've got archiver 7.0.1 and ignore 7.0.5 which are the latest releases, so nothing outdated here.

To break it down simply: archiver is the standard go-to library for creating ZIP files in Node.js (basically your toolbox for zipping things), and ignore handles gitignore patterns (the file that tells git what to skip). Both are well-maintained and widely trusted in the Node ecosystem. Using the caret (^) in your version specs means you'll automatically get bug fixes and minor improvements when they drop—which is exactly what you want.

Great choices for the ZIP functionality you're adding! 👍


@greynewell greynewell merged commit 243a71e into main Jan 15, 2026
2 checks passed