Add e2e tests #525

trevor-e · 2025-12-12T20:42:23Z

Adds E2E tests by providing a mock web server to simulate the Sentry API and also a real Kafka server to send messages. While not quite the same as the E2E test for Emerge that eventually posts a status check, this should hopefully provide more coverage than before.

The aim is to catch a bug like I fixed in 127cd52 before it makes it to production.

tests/e2e/mock-sentry-api/server.py

The test-runner container doesn't have launchpad installed, so mounting the full tests directory caused pytest to load the main conftest.py which imports launchpad. Fixed by: - Copy e2e tests to /app/e2e_tests to avoid parent conftest.py discovery - Add standalone e2e conftest.py that doesn't import launchpad - Add missing build deps (gcc, g++, librdkafka-dev) for confluent-kafka - Add missing deps to mock API (curl for healthcheck, python-multipart for uploads) - Update fixture path to match new directory structure Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The e2e tests require Docker services (mock-sentry-api) that only exist in the dedicated e2e job's docker-compose environment. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add sanitize_id helper to validate artifact_id parameters contain only safe characters (alphanumeric, hyphens, underscores). This fixes CodeQL path injection warnings in the mock API server. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

tests/e2e/mock-sentry-api/server.py

The preprod-artifact-events topic must exist before Launchpad starts, otherwise the Kafka consumer fails with UNKNOWN_TOPIC_OR_PART error. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The wait_for_processing function was returning as soon as artifact metadata was present, but the size analysis file is uploaded after the metadata update. Now wait for both to be complete. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The analysis results use download_size and install_size, not total_size. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace regex-based sanitize_id with safe_path() that: 1. Constructs the target path 2. Resolves it to absolute path (removes .., symlinks) 3. Validates it stays within the base directory This pattern is recognized by CodeQL as a proper path traversal sanitizer. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

tests/e2e/mock-sentry-api/server.py

The e2e mock server is test code that runs in an isolated Docker container. Exclude it from CodeQL to avoid false positives on path handling in test utilities. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Instead of trying to sanitize user input, hash the artifact_id to create safe filenames. This ensures user-controlled data never directly becomes part of file paths. Also removes unused CodeQL config file. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add is_valid_chunk_checksum() to validate that chunk checksums are valid SHA1 hex strings before using them in file paths. This prevents potential path traversal attacks in the assemble_file endpoint. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace regex-based validation with path.resolve() pattern that CodeQL recognizes as a proper path traversal sanitizer. The safe_chunk_path() function now resolves the path and verifies it stays within CHUNKS_DIR. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

tests/e2e/mock-sentry-api/server.py

Apply the same safe_filename pattern to chunk storage - hash the checksum before using it as a filename. This ensures user-controlled data never directly becomes part of file paths, preventing path traversal attacks. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Split the E2E CI job into distinct steps so each has its own collapsible log section in GitHub Actions: - Build E2E Docker images - Start E2E services - Run E2E tests This makes it much easier to find the actual test output without wading through Docker build and service startup logs. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This is a temporary commit to verify E2E tests properly detect analysis failures. Will be reverted after verification. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

This reverts commit fdec1a0.

Validate the full API response schema in E2E tests to catch issues like the build_date commit that broke production. This includes: - iOS: verify required fields (app_name, app_id, build_version, short_version) - iOS: verify optional fields exist (minimum_os_version, sdk_version, is_simulator, codesigning_type, build_date) - iOS: validate build_date is ISO format when present - Android: verify required fields (app_name, app_id, version_code, version_name) - All: verify download_size is an integer, insights is a list This would have caught issues where new fields break serialization or where field names don't match what Sentry expects. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Fix field names to match the UpdateData model: - Common fields: app_name, app_id, build_version, artifact_type - iOS-specific fields are nested in apple_app_info - Android-specific fields are nested in android_app_info The E2E tests now validate the actual API contract that Sentry expects from Launchpad. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The size analysis insights field is organized as a dict with category keys (duplicate_files, image_optimization, etc.), not a flat list. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace weak field existence checks with exact value assertions: iOS (HackerNews.xcarchive.zip): - app_name == "HackerNews" - app_id == "com.emergetools.hackernews" - build_version == "3.8", build_number == 1 - codesigning_type == "development" - build_date == "2025-05-19T16:15:12" - main_binary_uuid == "BEB3C0D6-2518-343D-BB6F-FF5581C544E8" Android APK (hn.apk): - app_name == "Hacker News" - app_id == "com.emergetools.hackernews" - has_proguard_mapping == False Android AAB (hn.aab): - build_version == "1.0.2", build_number == 13 - has_proguard_mapping == True Also verify specific insight categories are generated for each platform. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Updated E2E tests to verify exact values instead of weak existence checks. This ensures API contract changes are caught, like the build_date issue in fabe8d8. iOS test now verifies: - Exact download_size (6502319) - Treemap root matches download_size and app name - Insight categories have proper structure with total_savings Android APK test now verifies: - Exact download_size (3670839) and treemap root size (7886041) - Exact duplicate_files savings (51709) - Exact multiple_native_library_archs savings (1891208) Android AAB test now verifies: - Exact treemap root size (5932249) - Treemap structure with expected children count - Insight structure validation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The treemap root name is the app name without the .app extension. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Treemap root size is 9728000 (install size), which is different from download_size (6502319, compressed). Fixed comment and assertion. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Security fixes: - Use hmac.compare_digest() for timing-safe signature comparison - Return error instead of silently ignoring JSON parsing failures - Return error for unknown assemble_type values Code quality: - Remove unused Pydantic models (UpdateRequest, ChunkOptionsResponse, etc.) - Remove unused assembled_files variable - Remove unused List import - Fix Content-Range header calculation (use correct end byte) - Remove obsolete docker-compose version field - Replace emoji checkmarks with [OK] text Robustness: - Add Kafka delivery callback to verify message sends - Add failure handling for CI health check loop - Add failure path test for non-existent artifacts Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

tests/e2e/mock-sentry-api/server.py

…rough an exception Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

test

c2b7f75

github-advanced-security bot found potential problems Dec 12, 2025

View reviewed changes

trevor-e and others added 4 commits December 12, 2025 15:55

fix

44f628b

fix: exclude e2e tests from regular test job

13241b0

The e2e tests require Docker services (mock-sentry-api) that only exist in the dedicated e2e job's docker-compose environment. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-advanced-security bot found potential problems Jan 8, 2026

View reviewed changes

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

trevor-e and others added 4 commits January 8, 2026 17:56

fix: create Kafka topic before running e2e tests

76b9f99

The preprod-artifact-events topic must exist before Launchpad starts, otherwise the Kafka consumer fails with UNKNOWN_TOPIC_OR_PART error. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: use correct field name download_size in e2e tests

92ce4cb

The analysis results use download_size and install_size, not total_size. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-advanced-security bot found potential problems Jan 8, 2026

View reviewed changes

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

trevor-e and others added 4 commits January 9, 2026 18:43

chore: exclude e2e tests from CodeQL scanning

0e44479

The e2e mock server is test code that runs in an isolated Docker container. Exclude it from CodeQL to avoid false positives on path handling in test utilities. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-advanced-security bot found potential problems Jan 12, 2026

View reviewed changes

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

trevor-e and others added 12 commits January 12, 2026 13:42

test: add deliberate error to verify E2E catches failures

fdec1a0

This is a temporary commit to verify E2E tests properly detect analysis failures. Will be reverted after verification. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Revert "test: add deliberate error to verify E2E catches failures"

4080337

This reverts commit fdec1a0.

fix: insights is a dict keyed by category, not a list

25de302

The size analysis insights field is organized as a dict with category keys (duplicate_files, image_optimization, etc.), not a flat list. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: correct iOS treemap root name (HackerNews, not HackerNews.app)

3a47ae6

The treemap root name is the app name without the .app extension. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: use correct iOS treemap root size (install size, not download size)

81e3cfc

Treemap root size is 9728000 (install size), which is different from download_size (6502319, compressed). Fixed comment and assertion. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-advanced-security bot found potential problems Jan 13, 2026

View reviewed changes

tests/e2e/mock-sentry-api/server.py Fixed Show fixed Hide fixed

trevor-e and others added 2 commits January 13, 2026 18:26

Potential fix for code scanning alert no. 40: Information exposure th…

61d9476

…rough an exception Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

fix: remove unused exception variable

9bcadf8

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

trevor-e changed the title ~~WIP: full e2e tests~~ Add e2e tests Jan 15, 2026

trevor-e marked this pull request as ready for review January 15, 2026 00:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Add e2e tests #525

Add e2e tests #525

Uh oh!

trevor-e commented Dec 12, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Add e2e tests #525

Are you sure you want to change the base?

Add e2e tests #525

Uh oh!

Conversation

trevor-e commented Dec 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

trevor-e commented Dec 12, 2025 •

edited

Loading