Skip to content

Conversation

@janstenpickle
Copy link

@janstenpickle janstenpickle commented Oct 29, 2025

  • Week 5 now follows Week 2 pattern instead of Week 1
  • Extended calendar data structure to support up to 8 weeks with proper cycling
  • Removed safe fallback returns that masked parsing errors
  • Fixed green bin bi-weekly collection logic (Week 2 and Week 4)
  • Improved error handling to raise exceptions instead of returning defaults
  • All unit tests passing (17/17)

Summary by CodeRabbit

  • New Features

    • Automatic bank-holiday adjustments to collection dates (Christmas, New Year, Good Friday) with safer fallbacks for unparsable dates
    • Enhanced calendar parsing to handle irregular month/week structures including 5-week-months
    • Improved bin-type lookup for calendars with Week‑5 patterns
  • Tests

    • Added extensive tests for bank-holiday rules, date adjustment logic, calendar integration, default-shift behavior, and error handling
    • Removed legacy test configuration/scaffold used in council tests

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 29, 2025

Warning

Rate limit exceeded

@janstenpickle has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 16 minutes and 24 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 7a8f8ff and d3de4b2.

📒 Files selected for processing (4)
  • uk_bin_collection/tests/input.json (1 hunks)
  • uk_bin_collection/tests/test_south_kesteven_district_council.py (4 hunks)
  • uk_bin_collection/tests/test_south_kesteven_integration.py (2 hunks)
  • uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (7 hunks)

Walkthrough

Bank-holiday adjustment logic and rules were added to the South Kesteven council scraper; calendar parsing extended to support 5–8 week cycles; bin-type lookup updated for 5-week months; extensive unit tests for bank-holiday behavior were added; test conftest fixture/hooks were removed.

Changes

Cohort / File(s) Summary
Council logic (bank holiday + parsing)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py
Added get_bank_holiday_rules() and adjust_for_bank_holidays(); integrated bank-holiday adjustments into get_next_collection_dates(); extended parse_regular_calendar_text() to support weeks 5–8 and updated get_bin_type_from_calendar() to handle 5-week mapping and clearer errors; imported is_holiday and Region.
Unit tests (bank holiday scenarios)
uk_bin_collection/tests/test_south_kesteven_district_council.py
Added comprehensive tests for rules validation, specific-date shifts (Christmas/New Year), no-adjustment dates (Good Friday), default shift behavior, invalid-date handling, and integration with calendar parsing & bin-type lookup using mocks.
Test config removal
uk_bin_collection/uk_bin_collection/councils/tests/conftest.py
Removed file — deleted pytest hooks (pytest_configure, pytest_collection_modifyitems) and test_postcode fixture.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Council as SouthKestevenCouncil
    participant Parser as parse_regular_calendar_text
    participant BinLookup as get_bin_type_from_calendar
    participant Adjust as adjust_for_bank_holidays
    participant Rules as get_bank_holiday_rules

    Client->>Council: get_next_collection_dates(postcode)
    Council->>Parser: parse calendar (weeks 1–8)
    Parser-->>Council: calendar structure
    Council->>BinLookup: map dates → bin types
    BinLookup-->>Council: dated collections

    Note over Council,Adjust: NEW — apply bank-holiday adjustments
    Council->>Adjust: adjust_for_bank_holidays(collection_dates)
    Adjust->>Rules: get_bank_holiday_rules()
    Rules-->>Adjust: rules (specific_dates, no_adjustment, default_shift)
    Adjust-->>Council: adjusted collection_dates

    Council-->>Client: return adjusted collection dates
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review bank-holiday adjustment logic and edge cases (specific_dates mapping, no_adjustment, default_shift).
  • Validate parse_regular_calendar_text changes for 5–8 week cycles and mapping of Week 5 → Week 2 behavior.
  • Inspect get_bin_type_from_calendar 5-week handling and error paths.
  • Check extensive tests for realistic mocks and that removal of conftest.py doesn't break test discovery.

Possibly related PRs

Suggested reviewers

  • dp247

Poem

🐰 I hop through calendars, rules in my paw,
Shifting dates gently, obeying the law,
Christmas moves early, New Year takes a bow,
Weeks five through eight dance — I tidy them now,
Hooray for neat collections — hip-hop, take a bow! 🎋

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "fix: Improve OCR parsing for 5-week months and remove safe fallbacks" is partially related to the changeset. The title accurately describes real components of the changes: the raw summary confirms that calendar parsing was expanded to support weeks 5 through 8 with correct cycling behavior, and that safe fallback returns were removed to raise exceptions instead. However, the title significantly omits a major feature addition—comprehensive bank holiday handling including the new get_bank_holiday_rules and adjust_for_bank_holidays methods—which comprises a substantial portion of the code changes and test additions. Despite this incompleteness, the title does refer to legitimate aspects of the changeset and is not misleading or off-topic.
Docstring Coverage ✅ Passed Docstring coverage is 94.44% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@janstenpickle janstenpickle force-pushed the fix/south-kesteven-ocr-parsing branch from 9307294 to 8c44604 Compare October 29, 2025 20:00
…rsing

- Add comprehensive bank holiday handling with specific calendar rules
- Christmas/New Year period adjustments (2025/26) with custom shifts
- Good Friday collections remain unchanged (no adjustment)
- Default 1-day later shift for other bank holidays
- Improve OCR parsing for 5-week months (Week 5 follows Week 2 pattern)
- Remove safe fallback logic, now raises ValueError for missing data
- Move tests to top-level test directory for better organization
- Add comprehensive test coverage (26 tests total)

BREAKING CHANGE: get_bin_type_from_calendar now raises ValueError instead of fallback
@janstenpickle janstenpickle force-pushed the fix/south-kesteven-ocr-parsing branch from 8c44604 to a37f5d8 Compare October 29, 2025 20:01
@codecov
Copy link

codecov bot commented Oct 29, 2025

Codecov Report

❌ Patch coverage is 5.47945% with 69 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.03%. Comparing base (475e160) to head (d3de4b2).

Files with missing lines Patch % Lines
...ollection/councils/SouthKestevenDistrictCouncil.py 5.47% 69 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (475e160) and HEAD (d3de4b2). Click for more details.

HEAD has 7 uploads less than BASE
Flag BASE (475e160) HEAD (d3de4b2)
8 1
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1692       +/-   ##
===========================================
- Coverage   86.79%   54.03%   -32.77%     
===========================================
  Files           9       10        +1     
  Lines        1136     1908      +772     
===========================================
+ Hits          986     1031       +45     
- Misses        150      877      +727     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
uk_bin_collection/tests/test_south_kesteven_district_council.py (3)

292-292: Consider adding strict=True to zip() for safer iteration.

When iterating over multiple sequences with zip(), adding strict=True ensures they have the same length and raises an error if they don't, preventing silent bugs.

Apply this diff:

-        for i, (original, expected) in enumerate(zip(test_dates, expected_adjustments)):
+        for i, (original, expected) in enumerate(zip(test_dates, expected_adjustments, strict=True)):

303-303: Consider adding strict=True to zip() for safer iteration.

Apply this diff:

-        for i, (original, expected) in enumerate(zip(test_dates, expected_adjustments)):
+        for i, (original, expected) in enumerate(zip(test_dates, expected_adjustments, strict=True)):

324-324: Consider adding strict=True to zip() for safer iteration.

Apply this diff:

-        for i, (original, expected) in enumerate(zip(test_dates, expected_adjustments)):
+        for i, (original, expected) in enumerate(zip(test_dates, expected_adjustments, strict=True)):
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 475e160 and a37f5d8.

📒 Files selected for processing (3)
  • uk_bin_collection/tests/test_south_kesteven_district_council.py (1 hunks)
  • uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (5 hunks)
  • uk_bin_collection/uk_bin_collection/councils/tests/conftest.py (0 hunks)
💤 Files with no reviewable changes (1)
  • uk_bin_collection/uk_bin_collection/councils/tests/conftest.py
🧰 Additional context used
🧬 Code graph analysis (2)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (1)
uk_bin_collection/uk_bin_collection/common.py (2)
  • is_holiday (149-161)
  • Region (29-33)
uk_bin_collection/tests/test_south_kesteven_district_council.py (1)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (3)
  • get_bank_holiday_rules (475-494)
  • adjust_for_bank_holidays (496-533)
  • get_next_collection_dates (446-473)
🪛 Ruff (0.14.2)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py

1204-1204: Abstract raise to an inner function

(TRY301)


1204-1204: Avoid specifying long messages outside the exception class

(TRY003)

uk_bin_collection/tests/test_south_kesteven_district_council.py

292-292: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


303-303: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


324-324: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Run Unit Tests (3.12, 1.8.4)
🔇 Additional comments (12)
uk_bin_collection/tests/test_south_kesteven_district_council.py (6)

251-282: LGTM!

The bank holiday rules configuration test is comprehensive and validates all expected rule structures and values correctly.


306-314: LGTM!

Good Friday no-adjustment test is correct and validates the expected behavior.


327-338: LGTM!

The default shift test correctly uses mocking to validate the fallback behavior for unspecified bank holidays.


340-347: LGTM!

The error handling test correctly validates that invalid dates are returned unchanged, ensuring graceful degradation.


349-363: Verify test assertions and remove uncertainty comment.

The test comment suggests the assertions might need adjustment, and the test only validates the length of the returned list without checking the actual adjusted dates. Consider either:

  1. Adding assertions to verify the actual adjusted dates (e.g., that Dec 25 is adjusted to Dec 24)
  2. Removing the test if it duplicates coverage from other tests

365-387: LGTM!

The integration test correctly validates that bank holiday adjustments work in conjunction with calendar parsing, ensuring Christmas Day is properly adjusted.

uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (6)

24-24: LGTM!

Explicit imports of is_holiday and Region improve code clarity and make the dependencies explicit.


469-471: LGTM!

Bank holiday adjustments are correctly integrated into the date generation flow, ensuring all collection dates are adjusted appropriately.


475-494: LGTM with a note about annual maintenance.

The bank holiday rules are well-structured with clear reasoning. Note that the specific dates are hardcoded for 2025/2026 and will require annual updates to reflect future years' schedules.

Consider adding a comment or documentation noting that these rules need annual review and updates.


496-533: LGTM!

The adjust_for_bank_holidays method is well-structured with proper error handling and a logical cascade of rule checks. The graceful handling of unparseable dates ensures robustness.


1122-1134: LGTM!

The extended calendar structure correctly implements the Week 5 → Week 2 pattern as specified in the PR objectives. The cycling pattern for weeks 6-8 provides good future-proofing for edge cases.


1187-1208: LGTM - Error handling improved as intended.

The removal of safe fallback returns in favor of raising exceptions aligns with the PR objectives and improves error detection. The Week 5 → Week 2 mapping logic is correct.

The static analysis hints (TRY301, TRY003) are minor style suggestions and don't affect functionality. These can be optionally addressed if you want to follow stricter style guidelines.

Chris Jansen added 2 commits October 29, 2025 20:13
- Add fallback collection day based on postcode pattern when HTTP requests fail
- Add fallback green bin info when external requests are unavailable
- PE6 postcodes default to Monday collection day
- NG postcodes default to Tuesday collection day
- Generic fallback to Wednesday for other postcodes
- Green bin defaults to Tuesday Week 2 pattern
- Prevents test failures in CI environments with network restrictions
- Add strict=True to zip() calls in bank holiday tests for safer iteration
- Add documentation note about annual maintenance of bank holiday rules
- Addresses CodeRabbit review suggestions for improved code safety
- All bank holiday tests continue to pass with stricter validation
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (1)

23-24: Remove redundant explicit import.

Line 24 explicitly imports is_holiday and Region from common, but line 23 already imports everything from the same module with from uk_bin_collection.uk_bin_collection.common import *. The explicit import is redundant.

Apply this diff to remove the redundancy:

 from uk_bin_collection.uk_bin_collection.common import *
-from uk_bin_collection.uk_bin_collection.common import is_holiday, Region
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a37f5d8 and 7a8f8ff.

📒 Files selected for processing (1)
  • uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (1)
uk_bin_collection/uk_bin_collection/common.py (2)
  • is_holiday (149-161)
  • Region (29-33)
🪛 Ruff (0.14.2)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py

1204-1204: Abstract raise to an inner function

(TRY301)


1204-1204: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Run Integration Tests (3.12, 1.8.4)
  • GitHub Check: Run Unit Tests (3.12, 1.8.4)
🔇 Additional comments (6)
uk_bin_collection/uk_bin_collection/councils/SouthKestevenDistrictCouncil.py (6)

469-473: LGTM!

The bank holiday adjustment is correctly integrated before returning collection dates. This ensures all generated dates are properly adjusted for holidays.


475-494: Verify bank holiday shift rules are correct for the council.

The bank holiday rules structure is well-organized, but some specific date shifts should be verified:

  • Line 480: '22/12/2025' (Monday) shifted by -2 days with reason "Christmas (moved to Saturday)" - this would move collections to the previous Saturday, which seems unusual.
  • Line 484: '26/12/2025' (Boxing Day, Friday) shifted by +1 day to Saturday - typically collections don't shift to weekends.

Please confirm these shift patterns match the council's published holiday collection schedule.


496-533: LGTM!

The bank holiday adjustment logic is well-structured with:

  • Specific date rules for known holidays
  • No-adjustment list for collections that proceed normally
  • Fallback to the is_holiday library for automatic detection
  • Proper error handling that preserves unparsable dates

1122-1134: LGTM!

The calendar structure correctly implements the 3-week cycling pattern (Black → Silver → Purple) and properly extends to 8 weeks. Week 5 follows Week 2's pattern (Silver bin) as specified in the PR objectives, ensuring correct handling of 5-week months.


1187-1208: LGTM!

The 5-week month handling is correctly implemented. When week_of_month exceeds 4, it properly falls back to Week 5's entry if present, or Week 2's pattern otherwise, matching the PR objectives. The change to raise exceptions instead of returning safe fallbacks also aligns with the stated goals.


1267-1275: Verify green bin collection logic for 5-week months.

The bi-weekly green bin logic correctly identifies Week 2 and Week 4 collections (line 1271). However, consider 5-week months where Week 5 follows Week 2's pattern:

  • If green_bin_info["week"] = 2, the condition checks for weeks 2 and 4.
  • In a 5-week month, Week 5 follows Week 2's pattern for regular bins (Silver), but this logic won't match Week 5 for green bins.

Should green bins also be collected in Week 5 when it follows the Week 2 pattern? If so, the condition needs adjustment:

# Check if this matches the green bin collection pattern
# (bi-weekly: base week and base week + 2, or week 5 if it follows week 2)
is_green_week = (week_of_month == green_bin_info["week"] or 
                 week_of_month == green_bin_info["week"] + 2 or
                 (week_of_month == 5 and green_bin_info["week"] == 2))

if is_green_week:
    bin_data.append({
        "type": "Green bin (Garden waste)",
        "collectionDate": date
    })

Please verify whether the council collects green bins in Week 5 of 5-week months.

janstenpickle pushed a commit to janstenpickle/UKBinCollectionData that referenced this pull request Oct 29, 2025
- Remove fallback logic for collection_day and green_bin_info lookups
- Raise ValueError when postcode lookup fails instead of using defaults
- Update tests to expect exceptions instead of fallback behavior
- Aligns with PR objective of removing safe fallbacks that mask errors
- All 32 tests passing

Fixes robbrad#1692
…Calendar

- Add uses_ocr flag to SouthKestevenDistrictCouncil in input.json
- Simplify OCR image discovery to check self.ocr_image_dir, UKBC_OCR_IMAGE_DIR env var, and CWD
- Remove test/fixture references from main council module for clean separation
- Add pytest.mark.skipif to South Kesteven tests when OCR deps unavailable
- Update parse_calendar_images to prefer local images over downloads
- Fix fallback logic to only apply to specific postcode patterns
- All 32 South Kesteven tests passing

Resolves robbrad#1668
@janstenpickle janstenpickle force-pushed the fix/south-kesteven-ocr-parsing branch from f5b832a to d3de4b2 Compare October 29, 2025 20:40
@janstenpickle
Copy link
Author

@robbrad RE #1668 I've moved the specific tests for South Kesteven into the top level test directory and added an "OCR" type.

Obviously becuase this code has to do much more heavy lifting seeing as the council can't seem to do it their side, there's a lot of tests specific to this implementation, is that OK with you?

An alternate approach might be to host logic as an API that this can use, rather than add more dependency issues to HA. WDYT?

@robbrad
Copy link
Owner

robbrad commented Oct 30, 2025

@robbrad RE #1668 I've moved the specific tests for South Kesteven into the top level test directory and added an "OCR" type.

Obviously becuase this code has to do much more heavy lifting seeing as the council can't seem to do it their side, there's a lot of tests specific to this implementation, is that OK with you?

An alternate approach might be to host logic as an API that this can use, rather than add more dependency issues to HA. WDYT?

Thanks @janstenpickle

Are these unit tests or integration tests?

If unit - then standard tests should be added to the unit tests - not specific to the council (I dont see a live pull of info or a test pdf/image - so im guessing these are unit)

If integration then there should be new helpers added to the existing integration tests framework and a new type added to the feature file - eg non-selenium / selenium / ocr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants