Description
Our current place-code validator rejects valid place codes that appear in the published EURING place-code table as we package it. We need to clarify the actual constraints for place codes (per the EURING manual and/or published tables) and align validation accordingly.
Evidence / Repro
Place codes are validated by parse_place_code using the pattern ^[A-Z]{2}([A-Z]{2}|[0-9]{2}|--)$ (codes.py, line 227).
However, the published EURING place codes, as collected in our packaged table, include codes that do not match this pattern, for example:
- -D00 (Bay of Biscay)
- +A00
- CS1-
- ES0-
- Many others
A quick scan shows:
- Total unique place codes in table: 2052
- Rejected by current regex: 92
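The scan can be reproduced along these lines. This is a minimal sketch, not the script that produced the counts above: the helper name rejected_codes is hypothetical, and the sample list contains only the four codes named in this issue plus two made-up codes of the classic shape, not the real 2052-entry table.

```python
import re

# Pattern quoted above from parse_place_code.
PLACE_CODE_RE = re.compile(r"^[A-Z]{2}([A-Z]{2}|[0-9]{2}|--)$")


def rejected_codes(codes):
    """Return the unique codes the current pattern does not match, sorted."""
    return sorted({code for code in codes if not PLACE_CODE_RE.match(code)})


# Illustrative sample only (assumption): the four non-matching codes named
# in this issue, plus two invented codes that merely have the classic shape.
sample = ["-D00", "+A00", "CS1-", "ES0-", "ES00", "GBAB"]

print(rejected_codes(sample))
```

Running the same helper over the full packaged table should list the rejected codes, which makes it easier to see which shapes the pattern misses.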
Why this matters
This is a correctness issue: valid EURING place codes may be rejected during decode/validation, even though we ship them in our own reference data.
What we need to clarify
We should define the real constraints for place codes. Questions:
- Are place codes “any 4-character code from the official place-code table”?
- Are characters such as + and -, including a trailing -, officially valid?
- Should validation be table-driven rather than regex-driven?
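To make the table-driven option concrete, here is one possible shape for it. This is only a sketch under stated assumptions, not a proposed implementation: PlaceCodeError and make_place_code_validator are hypothetical names, the 4-character length check assumes the answer to the first question is yes, and the known codes would come from the packaged table rather than the short list used here.

```python
class PlaceCodeError(ValueError):
    """Hypothetical error type for invalid place codes."""


def make_place_code_validator(known_codes):
    """Build a validator that accepts only codes present in the given table."""
    known = frozenset(known_codes)

    def validate(value):
        # Assumption: every place code is exactly 4 characters long.
        if not isinstance(value, str) or len(value) != 4:
            raise PlaceCodeError(
                f"place code must be exactly 4 characters, got {value!r}"
            )
        if value not in known:
            raise PlaceCodeError(
                f"unknown place code {value!r}: not in the place-code table"
            )
        return value

    return validate


# Stand-in for the packaged table (assumption): only codes named in this issue.
validate = make_place_code_validator(["-D00", "+A00", "CS1-", "ES0-"])
print(validate("-D00"))
```

The trade-off is that a lookup accepts exactly what the table contains, so a code missing from the packaged table would be rejected even if EURING considers it valid.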
Proposed direction (not yet implementation)
Unclear. Ask the EURING Committee for clarification first.
Acceptance criteria
- Document the place-code constraints clearly (manual + table reality).
- Ensure valid codes like -D00, +A00, and CS1- pass validation.
- Add tests that cover these known-valid “non-classic” codes.
- Ensure invalid garbage still fails with a clear error.