From 50fcb7dd759b4f18c963e054bea795288545488f Mon Sep 17 00:00:00 2001 From: Tim Froehlich Date: Fri, 4 Jul 2025 15:54:57 -0500 Subject: [PATCH 1/4] feat: add issue tracking framework and initial monitoring issues MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add systematic issue tracking framework in docs/issues/ with: - Standardized naming convention: --.md - Priority levels 0-3 (critical to low) - Issue types: bug, feature, docs, chore - Comprehensive documentation and workflow guidelines Initial issues documented: - Startup duplicate notifications bug (priority 1) - Seen submission race condition (priority 1) - Timezone handling inconsistencies (priority 2) - Enhanced monitoring loop debug logging (priority 2) 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude --- .../1-bug-seen-submission-race-condition.md | 107 +++++++++++++++++ .../1-bug-startup-duplicate-notifications.md | 73 ++++++++++++ .../2-bug-timezone-handling-inconsistency.md | 100 ++++++++++++++++ ...2-feature-monitoring-loop-debug-logging.md | 107 +++++++++++++++++ docs/issues/CLAUDE.md | 111 ++++++++++++++++++ 5 files changed, 498 insertions(+) create mode 100644 docs/issues/1-bug-seen-submission-race-condition.md create mode 100644 docs/issues/1-bug-startup-duplicate-notifications.md create mode 100644 docs/issues/2-bug-timezone-handling-inconsistency.md create mode 100644 docs/issues/2-feature-monitoring-loop-debug-logging.md create mode 100644 docs/issues/CLAUDE.md diff --git a/docs/issues/1-bug-seen-submission-race-condition.md b/docs/issues/1-bug-seen-submission-race-condition.md new file mode 100644 index 0000000..eac4194 --- /dev/null +++ b/docs/issues/1-bug-seen-submission-race-condition.md @@ -0,0 +1,107 @@ +# Seen Submission Race Condition + +**Priority**: 1 +**Type**: bug +**Status**: open +**Created**: 2025-07-04 +**Updated**: 2025-07-04 + +## Description +Race condition exists between filtering submissions and 
marking them as seen, potentially allowing duplicate notifications if the monitoring loop runs multiple times rapidly or if API responses change between calls. + +## Reproduction Steps +1. Set up monitoring targets with very short poll intervals (difficult to reproduce consistently) +2. Force multiple rapid checks using `.trigger` command +3. Monitor for duplicate notifications of the same submission +4. Check database for timing issues in `seen_submissions` table + +## Expected vs Actual Behavior +- **Expected**: Each submission should only be notified once, regardless of timing +- **Actual**: Rapid polling or API inconsistencies could cause duplicate notifications + +## Technical Details + +### Code Location +- **File**: `src/cogs/runner.py` +- **Functions**: `filter_new_submissions()` (line 386-398), `post_submissions()`, `mark_submissions_seen()` + +### Race Condition Flow +```python +# Time T1: First check starts +submissions = await fetch_submissions_for_location(...) # API call +new_submissions = self.db.filter_new_submissions(channel_id, submissions) # Query DB + +# Time T2: Second check starts (before first completes) +submissions_2 = await fetch_submissions_for_location(...) 
# Same API response +new_submissions_2 = self.db.filter_new_submissions(channel_id, submissions_2) # Same query result + +# Time T3: Both post notifications +await self.post_submissions(new_submissions) # Posts duplicates +await self.post_submissions(new_submissions_2) # Posts duplicates + +# Time T4: Both mark as seen +self.db.mark_submissions_seen(channel_id, submission_ids) # Second call gets IntegrityError +``` + +### Database Protection +- `seen_submissions` table has unique constraint on `(channel_id, submission_id)` +- `mark_submissions_seen()` handles `IntegrityError` gracefully +- **Problem**: Notifications already sent before constraint violation + +### Potential Triggers +- Multiple rapid manual checks (`.trigger` command) +- Cloud Run scaling events causing multiple instances +- API response timing variations +- Database connection delays + +## Proposed Solutions + +### Option 1: Atomic Transaction +```python +async def process_submissions_atomically(self, channel_id, submissions): + async with self.db.transaction(): + new_submissions = self.db.filter_new_submissions(channel_id, submissions) + if new_submissions: + self.db.mark_submissions_seen(channel_id, [s["id"] for s in new_submissions]) + await self.post_submissions(new_submissions) +``` + +### Option 2: Channel-Level Locking +```python +class Runner: + def __init__(self): + self.channel_locks = {} + + async def run_checks_for_channel(self, channel_id, config, is_manual_check=False): + if channel_id not in self.channel_locks: + self.channel_locks[channel_id] = asyncio.Lock() + + async with self.channel_locks[channel_id]: + # Existing check logic +``` + +### Option 3: Submission ID Tracking +```python +# Track recently processed submission IDs in memory +self.recently_processed = {} # {channel_id: set(submission_ids)} + +async def filter_new_submissions(self, channel_id, submissions): + # Filter by database AND recent memory + recent = self.recently_processed.get(channel_id, set()) + truly_new = [s 
for s in db_filtered if s["id"] not in recent] + return truly_new +``` + +## Acceptance Criteria +- [ ] No duplicate notifications under rapid polling scenarios +- [ ] Multiple manual checks don't cause duplicates +- [ ] Database integrity maintained +- [ ] Performance impact minimal +- [ ] Cloud Run scaling doesn't trigger race conditions +- [ ] Test with concurrent `.trigger` commands + +## Notes +- Difficult to reproduce consistently in testing +- May require load testing to verify fix +- Related to Cloud Run scaling and multiple instances +- Consider distributed locking for multi-instance deployments \ No newline at end of file diff --git a/docs/issues/1-bug-startup-duplicate-notifications.md b/docs/issues/1-bug-startup-duplicate-notifications.md new file mode 100644 index 0000000..9614062 --- /dev/null +++ b/docs/issues/1-bug-startup-duplicate-notifications.md @@ -0,0 +1,73 @@ +# Startup Duplicate Notifications + +**Priority**: 1 +**Type**: bug +**Status**: open +**Created**: 2025-07-04 +**Updated**: 2025-07-04 + +## Description +When the bot starts up, it sends notifications for old submissions that have already been shown previously. The startup check bypasses normal polling logic and treats all submissions as "new" even though they were previously processed. + +## Reproduction Steps +1. Start the bot with existing monitoring targets +2. Observe that one or more channels receive notifications +3. Check log timestamps - notifications are for submissions that occurred before the current session +4. 
Compare with previous bot runs - same submissions are re-notified + +## Expected vs Actual Behavior +- **Expected**: Startup should only initialize monitoring, not send duplicate notifications +- **Actual**: Startup sends notifications for previously seen submissions + +## Technical Details + +### Code Location +- **File**: `src/cogs/runner.py` +- **Function**: `_run_startup_checks()` (lines 457-508) +- **Issue**: Calls `run_checks_for_channel(is_manual_check=False)` during startup + +### Root Cause +1. `_run_startup_checks()` runs immediately when bot becomes ready +2. Calls `run_checks_for_channel()` with `is_manual_check=False` +3. Since `last_poll_at` is None, API fetches all recent submissions without date filtering +4. `filter_new_submissions()` has no seen submissions in database yet (fresh start) +5. All submissions get posted as notifications +6. Submissions are then marked as seen, preventing future duplicates + +### Database State +- `seen_submissions` table is empty on startup +- `last_poll_at` is None for channels that haven't been polled +- API returns submissions from past 24 hours without filtering + +## Proposed Solution + +### Option 1: Change Startup to Manual Check +```python +# In _run_startup_checks() +result = await self.run_checks_for_channel(channel_id, config, is_manual_check=True) +``` +This would show "Last 5 submissions" format instead of posting as notifications. + +### Option 2: Mark as Seen Without Posting +```python +# In _run_startup_checks() +submissions = await self._fetch_submissions_for_channel(channel_id, config) +if submissions: + self.db.mark_submissions_seen(channel_id, [s["id"] for s in submissions]) +# Don't call post_submissions() +``` + +### Option 3: Skip Startup Checks Entirely +Remove `_run_startup_checks()` and let normal 1-minute polling handle first check. 
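Option 2's effect can be sketched with an in-memory stand-in for the `seen_submissions` table. All names here (`SeenStore`, `startup_sync`, `poll`) are hypothetical illustrations, not the bot's actual classes or methods:

```python
class SeenStore:
    """Minimal in-memory stand-in for the seen_submissions table."""

    def __init__(self):
        self.seen = set()  # {(channel_id, submission_id)}

    def filter_new(self, channel_id, submissions):
        return [s for s in submissions if (channel_id, s["id"]) not in self.seen]

    def mark_seen(self, channel_id, submission_ids):
        self.seen.update((channel_id, sid) for sid in submission_ids)


def startup_sync(db, channel_id, submissions):
    """On startup, record everything currently visible without notifying."""
    db.mark_seen(channel_id, [s["id"] for s in submissions])
    return []  # nothing is posted


def poll(db, channel_id, submissions):
    """Normal poll: notify only submissions not yet marked seen."""
    new = db.filter_new(channel_id, submissions)
    db.mark_seen(channel_id, [s["id"] for s in new])
    return new  # these would be posted as notifications


db = SeenStore()
backlog = [{"id": 1}, {"id": 2}]
assert startup_sync(db, 42, backlog) == []                  # no startup notifications
assert poll(db, 42, backlog + [{"id": 3}]) == [{"id": 3}]   # only the genuinely new one
```

Whichever option is chosen, the invariant to preserve is the same: submissions that existed before the bot's first poll must be recorded as seen before anything is posted.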
+ +## Acceptance Criteria +- [ ] Bot startup does not trigger duplicate notifications +- [ ] Normal polling behavior unchanged +- [ ] Manual check commands still show recent submissions +- [ ] No regression in monitoring functionality +- [ ] Test with multiple channels and monitoring targets + +## Notes +- This issue is particularly noticeable with active monitoring targets +- Users report confusion when old submissions appear as "new" +- May be related to timezone handling in date filtering logic \ No newline at end of file diff --git a/docs/issues/2-bug-timezone-handling-inconsistency.md b/docs/issues/2-bug-timezone-handling-inconsistency.md new file mode 100644 index 0000000..f32bae7 --- /dev/null +++ b/docs/issues/2-bug-timezone-handling-inconsistency.md @@ -0,0 +1,100 @@ +# Timezone Handling Inconsistency + +**Priority**: 2 +**Type**: bug +**Status**: open +**Created**: 2025-07-04 +**Updated**: 2025-07-04 + +## Description +Inconsistent timezone handling across the application may cause polling schedule issues, incorrect date filtering in API calls, and database timestamp problems. + +## Reproduction Steps +1. Review datetime handling in different parts of the codebase +2. Check for naive vs timezone-aware datetime objects +3. Monitor for polling schedule irregularities +4. 
Verify API date filtering works correctly across time zones + +## Expected vs Actual Behavior +- **Expected**: All datetime operations should be timezone-aware and consistent +- **Actual**: Mix of naive and timezone-aware datetime objects causing potential issues + +## Technical Details + +### Code Locations +- **File**: `src/database.py` - `get_channel_config()` (lines 113-131) +- **File**: `src/cogs/runner.py` - Various datetime operations +- **File**: `src/api.py` - Date filtering in API calls + +### Specific Issues + +#### Database Operations +```python +# In get_channel_config() - timezone conversion happens here +if config.last_poll_at and config.last_poll_at.tzinfo is None: + config.last_poll_at = config.last_poll_at.replace(tzinfo=timezone.utc) +``` + +#### Polling Schedule Logic +```python +# In _should_poll_channel() - may use naive datetime +current_time = datetime.now() # Potentially naive +time_since_last_poll = current_time - config.last_poll_at # Mixed types +``` + +#### API Date Filtering +```python +# Date calculations for min_date_of_submission parameter +yesterday = datetime.now() - timedelta(days=1) # Potentially naive +date_str = yesterday.strftime("%Y-%m-%d") # May not account for timezone +``` + +### Potential Impact +- **Polling Schedule**: Channels may poll too frequently or not frequently enough +- **API Filtering**: Submissions might be missed or duplicated due to date range issues +- **Database Consistency**: Timestamps may not align properly +- **Multi-timezone Users**: Different behavior based on server vs user timezones + +## Proposed Solution + +### Centralized Timezone Handling +```python +# Create utility functions for consistent datetime handling +from datetime import datetime, timezone +import pytz + +def now_utc() -> datetime: + """Get current UTC time, always timezone-aware""" + return datetime.now(timezone.utc) + +def ensure_utc(dt: datetime) -> datetime: + """Ensure datetime is timezone-aware and in UTC""" + if dt.tzinfo is 
None: + # Assume naive datetime is UTC + return dt.replace(tzinfo=timezone.utc) + return dt.astimezone(timezone.utc) + +def format_api_date(dt: datetime) -> str: + """Format datetime for API calls consistently""" + return ensure_utc(dt).strftime("%Y-%m-%d") +``` + +### Update All Datetime Operations +1. Replace `datetime.now()` with `now_utc()` +2. Ensure all database datetime fields are timezone-aware +3. Standardize API date formatting +4. Add timezone validation in tests + +## Acceptance Criteria +- [ ] All datetime objects are timezone-aware +- [ ] Polling schedules work correctly regardless of server timezone +- [ ] API date filtering is consistent and accurate +- [ ] Database timestamps are properly handled +- [ ] No regression in existing functionality +- [ ] Add timezone-specific tests + +## Notes +- May require database migration if existing timestamps are naive +- Consider impact on existing data +- Test with different server timezone configurations +- Document timezone assumptions for deployment \ No newline at end of file diff --git a/docs/issues/2-feature-monitoring-loop-debug-logging.md b/docs/issues/2-feature-monitoring-loop-debug-logging.md new file mode 100644 index 0000000..f3196d1 --- /dev/null +++ b/docs/issues/2-feature-monitoring-loop-debug-logging.md @@ -0,0 +1,107 @@ +# Enhanced Monitoring Loop Debug Logging + +**Priority**: 2 +**Type**: feature +**Status**: open +**Created**: 2025-07-04 +**Updated**: 2025-07-04 + +## Description +Add comprehensive debug logging to the monitoring loop to help diagnose timing issues, polling schedule problems, and unexpected behavior. Current logging is insufficient for troubleshooting complex monitoring issues. 
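As a starting point, the `MONITORING_LOG_LEVEL` environment variable proposed in this issue could be wired up roughly as follows. This is a sketch: the helper name and format string are placeholders, not existing project code:

```python
import logging
import os


def configure_monitoring_logger(name="monitoring"):
    """Create a component logger whose level is read from MONITORING_LOG_LEVEL."""
    level_name = os.environ.get("MONITORING_LOG_LEVEL", "INFO").upper()
    logger = logging.getLogger(name)
    # Fall back to INFO if the variable holds an unknown level name
    logger.setLevel(getattr(logging, level_name, logging.INFO))
    if not logger.handlers:  # avoid attaching duplicate handlers on re-import
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


os.environ["MONITORING_LOG_LEVEL"] = "DEBUG"
log = configure_monitoring_logger()
assert log.level == logging.DEBUG
log.debug("LOOP_STATE: iteration #1 starting")  # emitted only at DEBUG verbosity
```

Per-component verbosity could then be achieved with child loggers such as `monitoring.api` or `monitoring.db`, each configurable independently.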
+ +## Use Cases +- Debugging why monitoring loop appears to stop working +- Understanding polling schedule and timing behavior +- Tracking down duplicate notification sources +- Monitoring API call patterns and responses +- Analyzing performance and bottlenecks + +## Acceptance Criteria +- [ ] Log monitoring loop state transitions (start, stop, iteration) +- [ ] Log channel polling decisions and timing calculations +- [ ] Log API call details (URL, response time, submission counts) +- [ ] Log database operations (queries, results, timing) +- [ ] Log seen submission filtering details +- [ ] Configurable log levels for different components +- [ ] Performance metrics logging (iteration time, API latency) +- [ ] Error context logging with full state information + +## Technical Details + +### Proposed Logging Categories + +#### Loop State Logging +```python +logger.debug(f"LOOP_STATE: iteration #{self.iteration_count} starting") +logger.debug(f"LOOP_STATE: {len(active_channels)} active channels found") +logger.debug(f"LOOP_STATE: iteration completed in {duration:.2f}s") +``` + +#### Channel Polling Decisions +```python +logger.debug(f"POLL_DECISION: channel {channel_id} - last_poll: {last_poll}, " + f"poll_rate: {poll_rate}min, should_poll: {should_poll}") +logger.debug(f"POLL_TIMING: channel {channel_id} - time_since_last: {time_since}min, " + f"next_poll_in: {next_poll_in}min") +``` + +#### API Call Details +```python +logger.debug(f"API_CALL: {api_type} for channel {channel_id} - " + f"URL: {url}, min_date: {min_date}") +logger.debug(f"API_RESPONSE: {api_type} - {len(submissions)} submissions, " + f"response_time: {response_time:.2f}s") +``` + +#### Database Operations +```python +logger.debug(f"DB_QUERY: filtering {len(submissions)} submissions for channel {channel_id}") +logger.debug(f"DB_RESULT: {len(new_submissions)} new submissions after filtering") +logger.debug(f"DB_UPDATE: marking {len(submission_ids)} submissions as seen") +``` + +#### Performance Metrics 
+```python +logger.info(f"PERF_METRICS: iteration #{iteration} - " + f"channels: {channel_count}, api_calls: {api_call_count}, " + f"notifications: {notification_count}, duration: {total_time:.2f}s") +``` + +### Configuration +- Environment variable `MONITORING_LOG_LEVEL` to control verbosity +- Separate log levels for different components +- Option to enable/disable performance metrics logging + +### Code Locations to Enhance +- `src/cogs/runner.py` - Main monitoring loop +- `src/database.py` - Submission filtering and marking +- `src/api.py` - API call timing and results +- `src/notifier.py` - Notification posting + +## Implementation Plan + +### Phase 1: Core Loop Logging +- Add state transition logging to `monitor_task_loop()` +- Log iteration timing and channel processing +- Add polling decision logging to `_should_poll_channel()` + +### Phase 2: API and Database Logging +- Enhance API call logging with timing and results +- Add detailed database operation logging +- Log submission filtering details + +### Phase 3: Performance and Error Logging +- Add performance metrics collection +- Enhance error logging with full context +- Add configurable log levels + +### Phase 4: Integration and Testing +- Test with local development environment +- Verify log output is useful for debugging +- Document log analysis procedures + +## Notes +- Balance between useful information and log volume +- Consider log rotation and storage impact +- Ensure sensitive data (tokens, etc.) not logged +- Design for both local development and production debugging \ No newline at end of file diff --git a/docs/issues/CLAUDE.md b/docs/issues/CLAUDE.md new file mode 100644 index 0000000..7e63c53 --- /dev/null +++ b/docs/issues/CLAUDE.md @@ -0,0 +1,111 @@ +# Issue Tracking Framework + +This directory contains systematically documented issues for the DisPinMap project, organized for easy prioritization and tracking. 
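Because each filename encodes a priority digit, an issue type, and a kebab-case description, conformance can be checked mechanically. A minimal sketch (`is_valid_issue_filename` is a hypothetical helper, not part of the project):

```python
import re

# Matches a priority digit 0-3, one of the four issue types, and a
# kebab-case description, per the convention documented in this file.
ISSUE_NAME = re.compile(r"^[0-3]-(bug|feature|docs|chore)-[a-z0-9-]+\.md$")


def is_valid_issue_filename(name):
    return bool(ISSUE_NAME.match(name))


assert is_valid_issue_filename("1-bug-startup-duplicate-notifications.md")
assert not is_valid_issue_filename("CLAUDE.md")          # framework doc, not an issue
assert not is_valid_issue_filename("4-bug-too-high.md")  # priority out of range
```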
## File Naming Convention

All issues follow the format: `<priority>-<type>-<description>.md`

### Priority Levels (0-3)
- **0**: Critical - System breaking, production down, data loss
- **1**: High - Major functionality broken, security issues, performance problems
- **2**: Medium - Important features not working, notable bugs, usability issues
- **3**: Low - Minor bugs, enhancements, documentation updates

### Issue Types
- **bug**: Defects in existing functionality
- **feature**: New functionality or capabilities
- **docs**: Documentation improvements or additions
- **chore**: Maintenance tasks, refactoring, technical debt

## Example File Names
- `0-bug-monitoring-loop-not-starting.md` - Critical bug
- `1-bug-duplicate-notifications-on-startup.md` - High priority bug
- `2-feature-add-location-search.md` - Medium priority feature
- `3-docs-update-api-documentation.md` - Low priority documentation

## Issue File Structure

Each issue file should contain:

```markdown
# Issue Title

**Priority**: [0-3]
**Type**: [bug|feature|docs|chore]
**Status**: [open|in-progress|blocked|closed]
**Assignee**: [optional]
**Created**: [date]
**Updated**: [date]

## Description
Clear description of the issue, bug, or feature request.

## Reproduction Steps (for bugs)
1. Step by step instructions
2. To reproduce the issue
3. Expected vs actual behavior

## Acceptance Criteria (for features)
- [ ] Specific deliverable 1
- [ ] Specific deliverable 2
- [ ] Testing requirements

## Technical Details
- Code locations involved
- Dependencies or related issues
- Potential solutions or approaches

## Notes
Additional context, related discussions, or implementation notes.
```

## Workflow

1. **Create Issue**: Document problems/features as they're discovered
2. **Prioritize**: Assign appropriate priority and type
3. **Track Progress**: Update status as work progresses
4. **Close**: Move to closed status when resolved
5. 
**Archive**: Optionally move closed issues to subdirectories + +## Priority Guidelines + +### Priority 0 (Critical) +- Production system down +- Data corruption or loss +- Security vulnerabilities +- Core functionality completely broken + +### Priority 1 (High) +- Major features not working +- Performance issues affecting users +- Monitoring/alerting failures +- High-impact bugs + +### Priority 2 (Medium) +- Minor feature bugs +- Usability improvements +- Non-critical performance issues +- Documentation gaps + +### Priority 3 (Low) +- Cosmetic issues +- Code organization +- Nice-to-have features +- Non-urgent documentation + +## Integration with Development + +- Reference issues in commit messages: `git commit -m "fix: resolve startup duplicates (refs 1-bug-startup-duplicates)"` +- Link to issues in PR descriptions +- Update issue status when work begins/completes +- Use issue analysis to guide development priorities + +## AI Agent Guidelines + +When working on this project: +1. **Check existing issues** before starting work +2. **Create new issues** for discovered problems +3. **Update issue status** as work progresses +4. **Reference issues** in commits and PRs +5. 
**Prioritize work** based on issue priorities \ No newline at end of file From f10fb4aca91ee7ad47fc79af92216cad6e7f4c84 Mon Sep 17 00:00:00 2001 From: Tim Froehlich Date: Fri, 4 Jul 2025 17:38:58 -0500 Subject: [PATCH 2/4] fix: format markdown files with prettier MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix prettier formatting issues in issue tracking framework: - Reformat all markdown files in docs/issues/ directory - Ensure consistent markdown formatting across the project - Resolve CI lint failures for prettier check 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude --- .../1-bug-seen-submission-race-condition.md | 42 ++++++++++++------ .../1-bug-startup-duplicate-notifications.md | 43 +++++++++++++------ .../2-bug-timezone-handling-inconsistency.md | 31 +++++++++---- ...2-feature-monitoring-loop-debug-logging.md | 26 ++++++++--- docs/issues/CLAUDE.md | 31 +++++++++---- 5 files changed, 123 insertions(+), 50 deletions(-) diff --git a/docs/issues/1-bug-seen-submission-race-condition.md b/docs/issues/1-bug-seen-submission-race-condition.md index eac4194..bba3e49 100644 --- a/docs/issues/1-bug-seen-submission-race-condition.md +++ b/docs/issues/1-bug-seen-submission-race-condition.md @@ -1,31 +1,39 @@ # Seen Submission Race Condition -**Priority**: 1 -**Type**: bug -**Status**: open -**Created**: 2025-07-04 +**Priority**: 1 **Type**: bug **Status**: open **Created**: 2025-07-04 **Updated**: 2025-07-04 ## Description -Race condition exists between filtering submissions and marking them as seen, potentially allowing duplicate notifications if the monitoring loop runs multiple times rapidly or if API responses change between calls. + +Race condition exists between filtering submissions and marking them as seen, +potentially allowing duplicate notifications if the monitoring loop runs +multiple times rapidly or if API responses change between calls. ## Reproduction Steps -1. 
Set up monitoring targets with very short poll intervals (difficult to reproduce consistently) + +1. Set up monitoring targets with very short poll intervals (difficult to + reproduce consistently) 2. Force multiple rapid checks using `.trigger` command 3. Monitor for duplicate notifications of the same submission 4. Check database for timing issues in `seen_submissions` table ## Expected vs Actual Behavior -- **Expected**: Each submission should only be notified once, regardless of timing -- **Actual**: Rapid polling or API inconsistencies could cause duplicate notifications + +- **Expected**: Each submission should only be notified once, regardless of + timing +- **Actual**: Rapid polling or API inconsistencies could cause duplicate + notifications ## Technical Details ### Code Location + - **File**: `src/cogs/runner.py` -- **Functions**: `filter_new_submissions()` (line 386-398), `post_submissions()`, `mark_submissions_seen()` +- **Functions**: `filter_new_submissions()` (line 386-398), + `post_submissions()`, `mark_submissions_seen()` ### Race Condition Flow + ```python # Time T1: First check starts submissions = await fetch_submissions_for_location(...) 
# API call @@ -44,11 +52,14 @@ self.db.mark_submissions_seen(channel_id, submission_ids) # Second call gets In ``` ### Database Protection -- `seen_submissions` table has unique constraint on `(channel_id, submission_id)` + +- `seen_submissions` table has unique constraint on + `(channel_id, submission_id)` - `mark_submissions_seen()` handles `IntegrityError` gracefully - **Problem**: Notifications already sent before constraint violation ### Potential Triggers + - Multiple rapid manual checks (`.trigger` command) - Cloud Run scaling events causing multiple instances - API response timing variations @@ -57,6 +68,7 @@ self.db.mark_submissions_seen(channel_id, submission_ids) # Second call gets In ## Proposed Solutions ### Option 1: Atomic Transaction + ```python async def process_submissions_atomically(self, channel_id, submissions): async with self.db.transaction(): @@ -67,20 +79,22 @@ async def process_submissions_atomically(self, channel_id, submissions): ``` ### Option 2: Channel-Level Locking + ```python class Runner: def __init__(self): self.channel_locks = {} - + async def run_checks_for_channel(self, channel_id, config, is_manual_check=False): if channel_id not in self.channel_locks: self.channel_locks[channel_id] = asyncio.Lock() - + async with self.channel_locks[channel_id]: # Existing check logic ``` ### Option 3: Submission ID Tracking + ```python # Track recently processed submission IDs in memory self.recently_processed = {} # {channel_id: set(submission_ids)} @@ -93,6 +107,7 @@ async def filter_new_submissions(self, channel_id, submissions): ``` ## Acceptance Criteria + - [ ] No duplicate notifications under rapid polling scenarios - [ ] Multiple manual checks don't cause duplicates - [ ] Database integrity maintained @@ -101,7 +116,8 @@ async def filter_new_submissions(self, channel_id, submissions): - [ ] Test with concurrent `.trigger` commands ## Notes + - Difficult to reproduce consistently in testing - May require load testing to verify fix - 
Related to Cloud Run scaling and multiple instances -- Consider distributed locking for multi-instance deployments \ No newline at end of file +- Consider distributed locking for multi-instance deployments diff --git a/docs/issues/1-bug-startup-duplicate-notifications.md b/docs/issues/1-bug-startup-duplicate-notifications.md index 9614062..3be3b99 100644 --- a/docs/issues/1-bug-startup-duplicate-notifications.md +++ b/docs/issues/1-bug-startup-duplicate-notifications.md @@ -1,40 +1,50 @@ # Startup Duplicate Notifications -**Priority**: 1 -**Type**: bug -**Status**: open -**Created**: 2025-07-04 +**Priority**: 1 **Type**: bug **Status**: open **Created**: 2025-07-04 **Updated**: 2025-07-04 ## Description -When the bot starts up, it sends notifications for old submissions that have already been shown previously. The startup check bypasses normal polling logic and treats all submissions as "new" even though they were previously processed. + +When the bot starts up, it sends notifications for old submissions that have +already been shown previously. The startup check bypasses normal polling logic +and treats all submissions as "new" even though they were previously processed. ## Reproduction Steps + 1. Start the bot with existing monitoring targets 2. Observe that one or more channels receive notifications -3. Check log timestamps - notifications are for submissions that occurred before the current session +3. Check log timestamps - notifications are for submissions that occurred before + the current session 4. 
Compare with previous bot runs - same submissions are re-notified ## Expected vs Actual Behavior -- **Expected**: Startup should only initialize monitoring, not send duplicate notifications + +- **Expected**: Startup should only initialize monitoring, not send duplicate + notifications - **Actual**: Startup sends notifications for previously seen submissions ## Technical Details ### Code Location + - **File**: `src/cogs/runner.py` - **Function**: `_run_startup_checks()` (lines 457-508) -- **Issue**: Calls `run_checks_for_channel(is_manual_check=False)` during startup +- **Issue**: Calls `run_checks_for_channel(is_manual_check=False)` during + startup ### Root Cause + 1. `_run_startup_checks()` runs immediately when bot becomes ready -2. Calls `run_checks_for_channel()` with `is_manual_check=False` -3. Since `last_poll_at` is None, API fetches all recent submissions without date filtering -4. `filter_new_submissions()` has no seen submissions in database yet (fresh start) +2. Calls `run_checks_for_channel()` with `is_manual_check=False` +3. Since `last_poll_at` is None, API fetches all recent submissions without date + filtering +4. `filter_new_submissions()` has no seen submissions in database yet (fresh + start) 5. All submissions get posted as notifications 6. Submissions are then marked as seen, preventing future duplicates ### Database State + - `seen_submissions` table is empty on startup - `last_poll_at` is None for channels that haven't been polled - API returns submissions from past 24 hours without filtering @@ -42,13 +52,16 @@ When the bot starts up, it sends notifications for old submissions that have alr ## Proposed Solution ### Option 1: Change Startup to Manual Check + ```python # In _run_startup_checks() result = await self.run_checks_for_channel(channel_id, config, is_manual_check=True) ``` + This would show "Last 5 submissions" format instead of posting as notifications. 
 ### Option 2: Mark as Seen Without Posting
+
 ```python
 # In _run_startup_checks()
 submissions = await self._fetch_submissions_for_channel(channel_id, config)
@@ -58,9 +71,12 @@ if submissions:
 ```

 ### Option 3: Skip Startup Checks Entirely
-Remove `_run_startup_checks()` and let normal 1-minute polling handle first check.
+
+Remove `_run_startup_checks()` and let normal 1-minute polling handle first
+check.

 ## Acceptance Criteria
+
 - [ ] Bot startup does not trigger duplicate notifications
 - [ ] Normal polling behavior unchanged
 - [ ] Manual check commands still show recent submissions
@@ -68,6 +84,7 @@ Remove `_run_startup_checks()` and let normal 1-minute polling handle first chec
 - [ ] Test with multiple channels and monitoring targets

 ## Notes
+
 - This issue is particularly noticeable with active monitoring targets
 - Users report confusion when old submissions appear as "new"
-- May be related to timezone handling in date filtering logic
\ No newline at end of file
+- May be related to timezone handling in date filtering logic
diff --git a/docs/issues/2-bug-timezone-handling-inconsistency.md b/docs/issues/2-bug-timezone-handling-inconsistency.md
index f32bae7..34c2d39 100644
--- a/docs/issues/2-bug-timezone-handling-inconsistency.md
+++ b/docs/issues/2-bug-timezone-handling-inconsistency.md
@@ -1,27 +1,30 @@
 # Timezone Handling Inconsistency

-**Priority**: 2
-**Type**: bug
-**Status**: open
-**Created**: 2025-07-04
+**Priority**: 2 **Type**: bug **Status**: open **Created**: 2025-07-04
 **Updated**: 2025-07-04

 ## Description
-Inconsistent timezone handling across the application may cause polling schedule issues, incorrect date filtering in API calls, and database timestamp problems.
+
+Inconsistent timezone handling across the application may cause polling schedule
+issues, incorrect date filtering in API calls, and database timestamp problems.

 ## Reproduction Steps
+
 1. Review datetime handling in different parts of the codebase
 2. Check for naive vs timezone-aware datetime objects
 3. Monitor for polling schedule irregularities
 4. Verify API date filtering works correctly across time zones

 ## Expected vs Actual Behavior
+
 - **Expected**: All datetime operations should be timezone-aware and consistent
-- **Actual**: Mix of naive and timezone-aware datetime objects causing potential issues
+- **Actual**: Mix of naive and timezone-aware datetime objects causing potential
+  issues

 ## Technical Details

 ### Code Locations
+
 - **File**: `src/database.py` - `get_channel_config()` (lines 113-131)
 - **File**: `src/cogs/runner.py` - Various datetime operations
 - **File**: `src/api.py` - Date filtering in API calls
@@ -29,6 +32,7 @@ Inconsistent timezone handling across the application may cause polling schedule
 ### Specific Issues

 #### Database Operations
+
 ```python
 # In get_channel_config() - timezone conversion happens here
 if config.last_poll_at and config.last_poll_at.tzinfo is None:
@@ -36,6 +40,7 @@ if config.last_poll_at and config.last_poll_at.tzinfo is None:
 ```

 #### Polling Schedule Logic
+
 ```python
 # In _should_poll_channel() - may use naive datetime
 current_time = datetime.now()  # Potentially naive
@@ -43,6 +48,7 @@ time_since_last_poll = current_time - config.last_poll_at  # Mixed types
 ```

 #### API Date Filtering
+
 ```python
 # Date calculations for min_date_of_submission parameter
 yesterday = datetime.now() - timedelta(days=1)  # Potentially naive
@@ -50,14 +56,18 @@ date_str = yesterday.strftime("%Y-%m-%d")  # May not account for timezone
 ```

 ### Potential Impact
-- **Polling Schedule**: Channels may poll too frequently or not frequently enough
-- **API Filtering**: Submissions might be missed or duplicated due to date range issues
+
+- **Polling Schedule**: Channels may poll too frequently or not frequently
+  enough
+- **API Filtering**: Submissions might be missed or duplicated due to date range
+  issues
 - **Database Consistency**: Timestamps may not align properly
 - **Multi-timezone Users**: Different behavior based on server vs user timezones

 ## Proposed Solution

 ### Centralized Timezone Handling
+
 ```python
 # Create utility functions for consistent datetime handling
 from datetime import datetime, timezone
@@ -80,12 +90,14 @@ def format_api_date(dt: datetime) -> str:
 ```

 ### Update All Datetime Operations
+
 1. Replace `datetime.now()` with `now_utc()`
 2. Ensure all database datetime fields are timezone-aware
 3. Standardize API date formatting
 4. Add timezone validation in tests

 ## Acceptance Criteria
+
 - [ ] All datetime objects are timezone-aware
 - [ ] Polling schedules work correctly regardless of server timezone
 - [ ] API date filtering is consistent and accurate
@@ -94,7 +106,8 @@ def format_api_date(dt: datetime) -> str:
 - [ ] Add timezone-specific tests

 ## Notes
+
 - May require database migration if existing timestamps are naive
 - Consider impact on existing data
 - Test with different server timezone configurations
-- Document timezone assumptions for deployment
\ No newline at end of file
+- Document timezone assumptions for deployment
diff --git a/docs/issues/2-feature-monitoring-loop-debug-logging.md b/docs/issues/2-feature-monitoring-loop-debug-logging.md
index f3196d1..b864c62 100644
--- a/docs/issues/2-feature-monitoring-loop-debug-logging.md
+++ b/docs/issues/2-feature-monitoring-loop-debug-logging.md
@@ -1,15 +1,16 @@
 # Enhanced Monitoring Loop Debug Logging

-**Priority**: 2
-**Type**: feature
-**Status**: open
-**Created**: 2025-07-04
+**Priority**: 2 **Type**: feature **Status**: open **Created**: 2025-07-04
 **Updated**: 2025-07-04

 ## Description
-Add comprehensive debug logging to the monitoring loop to help diagnose timing issues, polling schedule problems, and unexpected behavior. Current logging is insufficient for troubleshooting complex monitoring issues.
+
+Add comprehensive debug logging to the monitoring loop to help diagnose timing
+issues, polling schedule problems, and unexpected behavior. Current logging is
+insufficient for troubleshooting complex monitoring issues.

 ## Use Cases
+
 - Debugging why monitoring loop appears to stop working
 - Understanding polling schedule and timing behavior
 - Tracking down duplicate notification sources
@@ -17,6 +18,7 @@ Add comprehensive debug logging to the monitoring loop to help diagnose timing i
 - Analyzing performance and bottlenecks

 ## Acceptance Criteria
+
 - [ ] Log monitoring loop state transitions (start, stop, iteration)
 - [ ] Log channel polling decisions and timing calculations
 - [ ] Log API call details (URL, response time, submission counts)
@@ -31,6 +33,7 @@ Add comprehensive debug logging to the monitoring loop to help diagnose timing i
 ### Proposed Logging Categories

 #### Loop State Logging
+
 ```python
 logger.debug(f"LOOP_STATE: iteration #{self.iteration_count} starting")
 logger.debug(f"LOOP_STATE: {len(active_channels)} active channels found")
@@ -38,6 +41,7 @@ logger.debug(f"LOOP_STATE: iteration completed in {duration:.2f}s")
 ```

 #### Channel Polling Decisions
+
 ```python
 logger.debug(f"POLL_DECISION: channel {channel_id} - last_poll: {last_poll}, "
              f"poll_rate: {poll_rate}min, should_poll: {should_poll}")
@@ -46,6 +50,7 @@ logger.debug(f"POLL_TIMING: channel {channel_id} - time_since_last: {time_since}
 ```

 #### API Call Details
+
 ```python
 logger.debug(f"API_CALL: {api_type} for channel {channel_id} - "
              f"URL: {url}, min_date: {min_date}")
@@ -54,6 +59,7 @@ logger.debug(f"API_RESPONSE: {api_type} - {len(submissions)} submissions, "
 ```

 #### Database Operations
+
 ```python
 logger.debug(f"DB_QUERY: filtering {len(submissions)} submissions for channel {channel_id}")
 logger.debug(f"DB_RESULT: {len(new_submissions)} new submissions after filtering")
@@ -61,6 +67,7 @@ logger.debug(f"DB_UPDATE: marking {len(submission_ids)} submissions as seen")
 ```

 #### Performance Metrics
+
 ```python
 logger.info(f"PERF_METRICS: iteration #{iteration} - "
             f"channels: {channel_count}, api_calls: {api_call_count}, "
@@ -68,11 +75,13 @@ logger.info(f"PERF_METRICS: iteration #{iteration} - "
 ```

 ### Configuration
+
 - Environment variable `MONITORING_LOG_LEVEL` to control verbosity
 - Separate log levels for different components
 - Option to enable/disable performance metrics logging

 ### Code Locations to Enhance
+
 - `src/cogs/runner.py` - Main monitoring loop
 - `src/database.py` - Submission filtering and marking
 - `src/api.py` - API call timing and results
@@ -81,27 +90,32 @@ logger.info(f"PERF_METRICS: iteration #{iteration} - "
 ## Implementation Plan

 ### Phase 1: Core Loop Logging
+
 - Add state transition logging to `monitor_task_loop()`
 - Log iteration timing and channel processing
 - Add polling decision logging to `_should_poll_channel()`

 ### Phase 2: API and Database Logging
+
 - Enhance API call logging with timing and results
 - Add detailed database operation logging
 - Log submission filtering details

 ### Phase 3: Performance and Error Logging
+
 - Add performance metrics collection
 - Enhance error logging with full context
 - Add configurable log levels

 ### Phase 4: Integration and Testing
+
 - Test with local development environment
 - Verify log output is useful for debugging
 - Document log analysis procedures

 ## Notes
+
 - Balance between useful information and log volume
 - Consider log rotation and storage impact
 - Ensure sensitive data (tokens, etc.) not logged
-- Design for both local development and production debugging
\ No newline at end of file
+- Design for both local development and production debugging
diff --git a/docs/issues/CLAUDE.md b/docs/issues/CLAUDE.md
index 7e63c53..32bab83 100644
--- a/docs/issues/CLAUDE.md
+++ b/docs/issues/CLAUDE.md
@@ -1,24 +1,29 @@
 # Issue Tracking Framework

-This directory contains systematically documented issues for the DisPinMap project, organized for easy prioritization and tracking.
+This directory contains systematically documented issues for the DisPinMap
+project, organized for easy prioritization and tracking.

 ## File Naming Convention

 All issues follow the format: `<priority>-<type>-<description>.md`

 ### Priority Levels (0-3)
+
 - **0**: Critical - System breaking, production down, data loss
-- **1**: High - Major functionality broken, security issues, performance problems
+- **1**: High - Major functionality broken, security issues, performance
+  problems
 - **2**: Medium - Important features not working, notable bugs, usability issues
 - **3**: Low - Minor bugs, enhancements, documentation updates

 ### Issue Types
+
 - **bug**: Defects in existing functionality
 - **feature**: New functionality or capabilities
 - **docs**: Documentation improvements or additions
 - **chore**: Maintenance tasks, refactoring, technical debt

 ## Example File Names
+
 - `0-bug-monitoring-loop-not-starting.md` - Critical bug
 - `1-bug-duplicate-notifications-on-startup.md` - High priority bug
 - `2-feature-add-location-search.md` - Medium priority feature
@@ -31,32 +36,34 @@ Each issue file should contain:

 ```markdown
 # Issue Title

-**Priority**: [0-3]
-**Type**: [bug|feature|docs|chore]
-**Status**: [open|in-progress|blocked|closed]
-**Assignee**: [optional]
-**Created**: [date]
+**Priority**: [0-3] **Type**: [bug|feature|docs|chore] **Status**:
+[open|in-progress|blocked|closed] **Assignee**: [optional] **Created**: [date]
 **Updated**: [date]

 ## Description
+
 Clear description of the issue, bug, or feature request.

 ## Reproduction Steps (for bugs)
+
 1. Step by step instructions
 2. To reproduce the issue
 3. Expected vs actual behavior

 ## Acceptance Criteria (for features)
+
 - [ ] Specific deliverable 1
 - [ ] Specific deliverable 2
 - [ ] Testing requirements

 ## Technical Details
+
 - Code locations involved
 - Dependencies or related issues
 - Potential solutions or approaches

 ## Notes
+
 Additional context, related discussions, or implementation notes.
 ```

@@ -71,24 +78,28 @@ Additional context, related discussions, or implementation notes.
 ## Priority Guidelines

 ### Priority 0 (Critical)
+
 - Production system down
 - Data corruption or loss
 - Security vulnerabilities
 - Core functionality completely broken

 ### Priority 1 (High)
+
 - Major features not working
 - Performance issues affecting users
 - Monitoring/alerting failures
 - High-impact bugs

 ### Priority 2 (Medium)
+
 - Minor feature bugs
 - Usability improvements
 - Non-critical performance issues
 - Documentation gaps

 ### Priority 3 (Low)
+
 - Cosmetic issues
 - Code organization
 - Nice-to-have features
@@ -96,7 +107,8 @@ Additional context, related discussions, or implementation notes.

 ## Integration with Development

-- Reference issues in commit messages: `git commit -m "fix: resolve startup duplicates (refs 1-bug-startup-duplicates)"`
+- Reference issues in commit messages:
+  `git commit -m "fix: resolve startup duplicates (refs 1-bug-startup-duplicates)"`
 - Link to issues in PR descriptions
 - Update issue status when work begins/completes
 - Use issue analysis to guide development priorities
@@ -104,8 +116,9 @@ Additional context, related discussions, or implementation notes.
 ## AI Agent Guidelines

 When working on this project:
+
 1. **Check existing issues** before starting work
 2. **Create new issues** for discovered problems
 3. **Update issue status** as work progresses
 4. **Reference issues** in commits and PRs
-5. **Prioritize work** based on issue priorities
\ No newline at end of file
+5. **Prioritize work** based on issue priorities

From 5fa78502723bcfe8a153a84c5130975c29cc0c2e Mon Sep 17 00:00:00 2001
From: Tim Froehlich
Date: Fri, 4 Jul 2025 17:41:42 -0500
Subject: [PATCH 3/4] Update src/cogs/runner.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 src/cogs/runner.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/cogs/runner.py b/src/cogs/runner.py
index b5a70a6..4a11dc3 100644
--- a/src/cogs/runner.py
+++ b/src/cogs/runner.py
@@ -451,7 +451,7 @@ async def before_monitor_task_loop(self):
             f"🔍 Task loop is running: {self.monitor_task_loop.is_running()}"
         )
     except Exception as e:
-        logger.warning(f"⚠️ Could not access bot debug info during startup: {e}")
+        logger.exception("⚠️ Could not access bot debug info during startup")
         logger.info("🔄 Debug info will be available once bot is fully initialized")

     # Run immediate first check to avoid waiting 60 minutes on startup

From 5e0673e6e412ec3d0ee4d866d00a5b60287c5ce3 Mon Sep 17 00:00:00 2001
From: Tim Froehlich
Date: Fri, 4 Jul 2025 21:12:24 -0500
Subject: [PATCH 4/4] Fix ruff lint error

---
 src/cogs/runner.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/cogs/runner.py b/src/cogs/runner.py
index 4a11dc3..347af45 100644
--- a/src/cogs/runner.py
+++ b/src/cogs/runner.py
@@ -450,7 +450,7 @@ async def before_monitor_task_loop(self):
         logger.info(
             f"🔍 Task loop is running: {self.monitor_task_loop.is_running()}"
         )
-    except Exception as e:
+    except Exception:
         logger.exception("⚠️ Could not access bot debug info during startup")
         logger.info("🔄 Debug info will be available once bot is fully initialized")
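The timezone issue's `### Centralized Timezone Handling` code block shows only its first and last lines. The following is a minimal sketch of what the full utility set might look like: `now_utc()` and `format_api_date()` are named in the issue, while `ensure_aware()` is a hypothetical helper added here for illustration.

```python
from datetime import datetime, timedelta, timezone


def now_utc() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def ensure_aware(dt: datetime) -> datetime:
    """Attach UTC to naive datetimes; leave aware ones untouched."""
    return dt.replace(tzinfo=timezone.utc) if dt.tzinfo is None else dt


def format_api_date(dt: datetime) -> str:
    """Format a datetime as YYYY-MM-DD in UTC for API date filters."""
    return ensure_aware(dt).astimezone(timezone.utc).strftime("%Y-%m-%d")


# Example: compute yesterday's cutoff without relying on server-local time
yesterday = now_utc() - timedelta(days=1)
min_date = format_api_date(yesterday)
```

With helpers like these, `_should_poll_channel()`-style arithmetic always compares two aware datetimes, avoiding the `TypeError` Python raises when subtracting a naive datetime from an aware one.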
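The debug-logging issue proposes a `MONITORING_LOG_LEVEL` environment variable but does not show the wiring. One possible sketch, assuming a dedicated monitoring logger (the logger name `dispinmap.monitoring` and the `INFO` default are illustrative assumptions, not part of the issue):

```python
import logging
import os


def configure_monitoring_logging() -> logging.Logger:
    """Set the monitoring logger's verbosity from MONITORING_LOG_LEVEL."""
    level_name = os.environ.get("MONITORING_LOG_LEVEL", "INFO").upper()
    # Unknown level names fall back to INFO instead of raising
    level = getattr(logging, level_name, logging.INFO)
    if not isinstance(level, int):
        level = logging.INFO
    logger = logging.getLogger("dispinmap.monitoring")
    logger.setLevel(level)
    return logger
```

Because unknown names fall back to `INFO` rather than raising, a typo in the deployment environment cannot crash startup; it only leaves the loop at the default verbosity.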