Place code validation is inconsistent with published place codes #109

@dyve

Description

Our current place-code validator rejects valid place codes that appear in the published EURING place-code table we package. We need to clarify the actual constraints for place codes (per the EURING manual and/or the published tables) and align validation accordingly.

Evidence / Repro
Place codes are validated by parse_place_code using ^[A-Z]{2}([A-Z]{2}|[0-9]{2}|--)$ (codes.py, line 227).

However, the published EURING place codes, as collected in our packaged table, include codes that do not match this pattern, for example:

  • -D00 (Bay of Biscay)
  • +A00
  • CS1-
  • ES0-
  • Many others

A quick scan shows:

  • Total unique place codes in table: 2052
  • Rejected by current regex: 92
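
A scan like this can be sketched as follows. This is a minimal illustration, not the script that produced the numbers above: the helper name split_by_current_regex is hypothetical, and the sample holds only the four codes quoted in this issue plus three made-up strings shaped like the current pattern (ABCD, AB12, AB--), so it does not reproduce the 2052/92 counts. To repeat the real scan, the sample would be replaced by the codes from the packaged table.

```python
import re

# Pattern quoted in this issue (codes.py, line 227).
CURRENT_PATTERN = re.compile(r"^[A-Z]{2}([A-Z]{2}|[0-9]{2}|--)$")


def split_by_current_regex(codes):
    """Return (accepted, rejected) sorted lists of the unique codes given."""
    accepted, rejected = [], []
    for code in sorted(set(codes)):
        if CURRENT_PATTERN.match(code):
            accepted.append(code)
        else:
            rejected.append(code)
    return accepted, rejected


# Four codes quoted in this issue, plus three synthetic strings that merely
# have the shape the regex expects (not checked against the EURING table).
sample = ["-D00", "+A00", "CS1-", "ES0-", "ABCD", "AB12", "AB--"]

accepted, rejected = split_by_current_regex(sample)
print(f"unique={len(accepted) + len(rejected)} rejected={len(rejected)}")
print("rejected:", rejected)
```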

Why this matters

This is a correctness issue: valid EURING place codes may be rejected during decode/validation, even though we ship them in our own reference data.

What we need to clarify
We should define the real constraints for place codes. Questions:

  1. Are place codes “any 4-character code from the official place-code table”?
  2. Are characters like +, -, and trailing - officially valid?
  3. Should validation be table-driven rather than regex-driven?
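
For discussion, question 3 could look roughly like the sketch below. Everything here is an assumption rather than existing code: PlaceCodeError, KNOWN_PLACE_CODES, and validate_place_code are hypothetical names, the set is a stand-in holding only the codes quoted in this issue, and the 4-character check presumes the answer to question 1 is "yes".

```python
class PlaceCodeError(ValueError):
    """Raised when a value is not a known place code (hypothetical name)."""


# Stand-in for the packaged place-code table: only the codes quoted in this
# issue. A real implementation would load the full published table instead.
KNOWN_PLACE_CODES = frozenset({"-D00", "+A00", "CS1-", "ES0-"})


def validate_place_code(value, known_codes=KNOWN_PLACE_CODES):
    """Return value unchanged if it is in the table, else raise PlaceCodeError."""
    if not isinstance(value, str):
        raise PlaceCodeError(
            f"place code must be a string, got {type(value).__name__}"
        )
    # Assumption: all place codes are exactly 4 characters (see question 1).
    if len(value) != 4:
        raise PlaceCodeError(f"place code must be 4 characters, got {value!r}")
    if value not in known_codes:
        raise PlaceCodeError(f"unknown place code {value!r}")
    return value
```

Membership in the table replaces the character-class rules entirely, so codes with a leading + or - or a trailing - pass as long as they are published, while arbitrary 4-character strings still fail with a specific message.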

Proposed direction (not yet implementation)

Unclear. Ask the EURING Committee for clarification first.

Acceptance criteria

  • Document the place-code constraints clearly (manual + table reality).
  • Ensure valid codes like -D00, +A00, and CS1- pass validation.
  • Add tests that cover these known-valid “non-classic” codes.
  • Ensure invalid garbage still fails with a clear error.
