@bwalsh bwalsh commented Dec 23, 2025

Context

The DRS server is implemented in Go (Gin) with server scaffolding generated from OpenAPI. Contract correctness matters for interoperability and operational safety. The OpenAPI spec includes deep schemas and recursive references (e.g., DrsObject → ContentsObject → DrsObject), so generated scaffolding alone cannot guarantee that requests and responses conform to the contract.

We need:

  • Strong request/response contract guarantees
  • Clear operational visibility into contract drift
  • A safe rollout path (audit-first in prod)
  • CI enforcement to prevent regressions

Decision

Implement OpenAPI validation middleware using kin-openapi (openapi3filter) as the validation engine.

  • Request validation: enforce at runtime by default (reject invalid requests)

  • Response validation: buffering + late-commit middleware with two modes:

    • Audit mode (default in production): log/metrics on violations, allow response
    • Enforce mode (CI / pre-prod): block invalid responses (fail closed with 500)
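
The two response-validation modes can be sketched as a small decision function. This is a minimal illustration, not the middleware in this PR: `Mode`, `Audit`, `Enforce`, `ErrContract`, and `decide` are hypothetical names, and the real validator would obtain the violation from kin-openapi rather than a sentinel error.

```go
package main

import (
	"errors"
	"net/http"
)

// Mode selects how a response contract violation is handled (hypothetical type).
type Mode int

const (
	Audit   Mode = iota // record the violation, send the handler's response unchanged
	Enforce             // fail closed: replace the response with a 500
)

// ErrContract stands in for whatever error the validator reports (assumption).
var ErrContract = errors.New("response violates OpenAPI contract")

// decide returns the status code to commit and whether the handler's
// buffered body should be sent as-is.
func decide(mode Mode, handlerStatus int, violation error) (status int, sendOriginal bool) {
	if violation == nil || mode == Audit {
		// Audit mode never alters the response; logging and metrics happen elsewhere.
		return handlerStatus, true
	}
	return http.StatusInternalServerError, false
}
```

Because the decision is made before anything is written to the client, it only works when the response has been fully buffered, which is why the late-commit writer sits at the end of the chain.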

Black-box contract testing complements runtime validation in CI.

Summary

flowchart TB
    Client[Client] -->|HTTP Request| Gin[gin Engine]

    subgraph MiddlewareChain[Middleware Chain]
        LogReq[Request Logging redact auth] --> ReqVal[OpenAPI Request Validation enforce]
        ReqVal --> Handler[Application Handlers generated]
        Handler --> RespVal[OpenAPI Response Validation audit or enforce]
        RespVal --> Commit[Late Commit Writer buffered]
    end

    Gin --> MiddlewareChain -->|HTTP Response| Client

    Spec[OpenAPI Spec local or embedded] --> ReqVal
    Spec --> RespVal

    CI[CI Pipeline] --> Contract[Contract Tests]
    Spec --> Contract
    Contract --> Report[Fail build on violations]

See docs/adr-0001-request-response-validation.md

Changes

  • Added a GET /service-info route in cmd/server/service_info.go.
  • Returns a constant JSON payload that matches the GA4GH DRS Service Info example:
    • Top-level service metadata (id, name, description, organization, contactUrl, documentationUrl, environment, etc.).
    • Version fields (version, drs_version, service_url).
    • Deprecated top-level maxBulkRequestLength retained for backward compatibility.
    • drs section with implementation details (bulk limits, object counts, upload/delete capabilities, validation flags, etc.).
  • Adjusted the handler to respond with http.StatusOK and the static serviceInfoResponse map.
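
A static handler of this kind might look roughly like the sketch below. It uses only the standard library (`net/http`) instead of Gin so that it stays self-contained; the payload is a trimmed placeholder, not the actual `serviceInfoResponse` map, and `serviceInfoHandler` and `probe` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
)

// serviceInfo is an illustrative subset of the payload; all values are placeholders.
var serviceInfo = map[string]any{
	"id":                   "org.example.drs",
	"name":                 "Example DRS",
	"version":              "0.0.0",
	"maxBulkRequestLength": 1000, // deprecated top-level field kept for compatibility
	"drs": map[string]any{
		"maxBulkRequestLength": 1000,
	},
}

// serviceInfoHandler always answers 200 with the same JSON document.
func serviceInfoHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusOK)
	_ = json.NewEncoder(w).Encode(serviceInfo)
}

// probe issues GET /service-info in-process and returns the status code
// and the decoded JSON body.
func probe() (int, map[string]any, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/service-info", serviceInfoHandler)

	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/service-info", nil))

	var body map[string]any
	err := json.Unmarshal(rec.Body.Bytes(), &body)
	return rec.Code, body, err
}
```

Calling `probe()` exercises the route without opening a socket, which makes the same check usable as a quick unit test alongside the manual verification.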

Motivation

  • The OpenAPI spec already defines /service-info, and middleware attempts to validate requests and responses for this path.
  • Previously, FindRoute and security validation failures were blocking proper validation; this change wires up a concrete implementation of the endpoint.
  • Returning a static, well-formed document makes it easier to:
    • Verify routing and OpenAPI middleware behavior.
    • Provide a predictable response for health/diagnostic checks and external integrations.

Testing

  • Started the server and performed GET /service-info:
    • Confirmed a 200 OK response.
    • Verified the JSON body matches the expected GA4GH DRS Service Info structure.
  • Observed logs to ensure the route is discovered from the OpenAPI spec and validated without security errors.

Copilot AI review requested due to automatic review settings December 23, 2025 01:16
@bwalsh bwalsh requested review from kellrott and removed request for Copilot December 23, 2025 01:19
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines +62 to +70
// Enforce max buffer
if w.maxBody > 0 && int64(w.body.Len()+len(p)) > w.maxBody {
    w.tooLarge = true
    // We still buffer up to maxBody, and then ignore the rest (or you can stop buffering entirely)
    remaining := int(w.maxBody - int64(w.body.Len()))
    if remaining > 0 {
        _, _ = w.body.Write(p[:remaining])
    }
    return len(p), nil

P1 Badge Avoid truncating responses beyond MaxBodyBytes

The buffering writer drops any bytes beyond MaxBodyBytes while still returning len(p) and only commits the buffered prefix. Because the response validator later commits w.body instead of the full stream, any handler that writes more than the 2 MiB default will silently return a truncated, corrupt payload even in audit mode instead of simply skipping validation. This breaks large downloads and any endpoint that legitimately returns bodies above the cap.
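
One way to skip validation instead of truncating is to stop buffering once the cap is exceeded, flush what has been collected, and stream the remainder straight through. The sketch below illustrates that idea with the standard library only; `cappedWriter`, its fields, and `roundTrip` are hypothetical and do not reflect the writer in this PR.

```go
package main

import (
	"bytes"
	"net/http"
	"net/http/httptest"
)

// cappedWriter buffers the response until maxBody is exceeded, then flushes
// the buffer and passes all further writes directly to the client.
type cappedWriter struct {
	http.ResponseWriter
	body     bytes.Buffer
	status   int
	maxBody  int
	overflow bool // true once the cap was exceeded; validation must be skipped
}

// WriteHeader records the status; it is sent at commit time or on overflow.
func (w *cappedWriter) WriteHeader(code int) { w.status = code }

func (w *cappedWriter) Write(p []byte) (int, error) {
	if w.overflow {
		return w.ResponseWriter.Write(p)
	}
	if w.status == 0 {
		w.status = http.StatusOK
	}
	if w.body.Len()+len(p) > w.maxBody {
		// Too large to validate: commit the buffered prefix, then stream.
		w.overflow = true
		w.ResponseWriter.WriteHeader(w.status)
		if _, err := w.ResponseWriter.Write(w.body.Bytes()); err != nil {
			return 0, err
		}
		w.body.Reset()
		return w.ResponseWriter.Write(p)
	}
	return w.body.Write(p)
}

// commit sends a fully buffered response; it is a no-op after overflow.
func (w *cappedWriter) commit() error {
	if w.overflow {
		return nil
	}
	if w.status == 0 {
		w.status = http.StatusOK
	}
	w.ResponseWriter.WriteHeader(w.status)
	_, err := w.ResponseWriter.Write(w.body.Bytes())
	return err
}

// roundTrip writes the chunks through a cappedWriter and reports what the
// client receives and whether the body stayed small enough to validate.
func roundTrip(maxBody int, chunks ...string) (received string, validatable bool) {
	rec := httptest.NewRecorder()
	w := &cappedWriter{ResponseWriter: rec, maxBody: maxBody}
	for _, c := range chunks {
		_, _ = w.Write([]byte(c))
	}
	// Response validation would run here on w.body when !w.overflow.
	_ = w.commit()
	return rec.Body.String(), !w.overflow
}
```

The trade-off is that once the writer overflows, headers and part of the body are already committed, so enforce mode can no longer block that response; oversized responses are effectively audit-only.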

Contributor Author

@chatgpt-codex-connector
@copilot

Should we raise a 413 Content Too Large exception here?

Copilot AI review requested due to automatic review settings December 23, 2025 14:00
Copilot AI left a comment

Pull request overview

This PR implements OpenAPI validation middleware for the DRS server with both request and response validation capabilities. The implementation uses kin-openapi for validation with configurable audit and enforce modes, allowing safe production rollout. The PR also adds GA4GH DRS-compliant service info and health check endpoints.

  • Adds comprehensive OpenAPI request/response validation middleware with buffering support
  • Implements /service-info endpoint returning GA4GH DRS Service Info metadata
  • Adds /healthz health check endpoint for monitoring and orchestration systems

Reviewed changes

Copilot reviewed 10 out of 12 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| internal/apigen/go.mod | Updates Go version to 1.24.0 and upgrades golang.org/x dependencies |
| internal/apigen/go.sum | Adds checksums for updated dependencies |
| docs/adr-0001-request-response-validation.md | Comprehensive ADR documenting the OpenAPI validation architecture, configuration, and implementation approach |
| cmd/server/middleware/openapi_validator.go | Enhances request validator with route attachment, security handling, and lenient mode logging |
| cmd/server/middleware/openapi_response_validator.go | Implements response validation middleware with audit/enforce modes and buffering |
| cmd/server/middleware/openapi_response_log.go | Adds detailed logging for OpenAPI validation failures with redaction support |
| cmd/server/middleware/buffering_writer.go | Implements buffering writer for late-commit response validation |
| cmd/server/middleware/logging_middleware.go | Provides request/response logging with auth header redaction |
| cmd/server/handlers/service_info.go | Implements GA4GH DRS Service Info endpoint with static metadata |
| cmd/server/handlers/healthz.go | Adds health check endpoint |
| cmd/server/main.go | Integrates request and response validators into middleware chain |
| cmd/server/service_info.go | Removes old service info implementation (replaced by handlers/service_info.go) |

Comments suppressed due to low confidence (1)

cmd/server/middleware/openapi_validator.go:36

  • Empty comment line without any explanation. This should either be removed or have a descriptive comment explaining why the servers are being neutralized. Consider replacing with something like "// Neutralize server URLs to allow matching requests regardless of host/base path".

"drs_version": "1.3.0",
"service_url": "https://drs.example.org",
"maxBulkRequestLength": 1000, // Deprecated
"timestamp": "2024-01-01T12:00:00.000Z",
Copilot AI Dec 23, 2025

The timestamp "2024-01-01T12:00:00.000Z" appears to be a placeholder or test value from the past. For a production service, this should either be dynamically generated to reflect the actual current time or updated to a more recent date if it represents a static deployment timestamp.

Comment on lines +29 to +30
"createdAt": "2020-01-01T00:00:00.000Z",
"updatedAt": "2020-01-02T00:00:00.000Z",
Copilot AI Dec 23, 2025

The dates "2020-01-01T00:00:00.000Z" and "2020-01-02T00:00:00.000Z" for createdAt and updatedAt are placeholder values from 5 years ago. For a production service, these should be replaced with actual service deployment dates or removed if they're not meaningful for this implementation.

Comment on lines +67 to +72
// Construct a JSON response using Gin's `H` helper, which is a
// shorthand for `map[string]any`. The response includes:
// * `name`: a static identifier for this service.
// * `version`: the service version read from the SERVICE_VERSION
// environment variable (may be empty if not set).
// * `timestamp`: the current UTC time, formatted as an RFC3339Nano string.
Copilot AI Dec 23, 2025

The comment references outdated handler behavior. The comment mentions "version" being read from SERVICE_VERSION environment variable and "timestamp" being the current UTC time, but the actual implementation now returns static values from the serviceInfoResponse map defined above. This comment should be updated to reflect that the handler returns a static GA4GH DRS Service Info structure.

Suggested change
// Return the static GA4GH DRS Service Info structure defined in
// `serviceInfoResponse` above. This includes fields such as:
// * top-level service metadata (e.g. `id`, `name`, `type`, `version`)
// * timestamps (`createdAt`, `updatedAt`, `timestamp`)
// * organization and contact details
// * DRS-specific capabilities and limits under the `drs` key.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI commented Dec 23, 2025

@bwalsh I've opened a new pull request, #5, to work on those changes. Once the pull request is ready, I'll request review from you.

bwalsh and others added 6 commits December 26, 2025 09:56
* Initial plan

* Switch to mkdocs-material theme with Mermaid support

Co-authored-by: bwalsh <47808+bwalsh@users.noreply.github.com>

* Add SRI integrity hash for Mermaid.js security

Co-authored-by: bwalsh <47808+bwalsh@users.noreply.github.com>

* Add DOMContentLoaded event listener for Mermaid initialization

Co-authored-by: bwalsh <47808+bwalsh@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bwalsh <47808+bwalsh@users.noreply.github.com>