**File**: `docs/issues/1-bug-seen-submission-race-condition.md` (new file, +123 lines)

# Seen Submission Race Condition

**Priority**: 1 **Type**: bug **Status**: open **Created**: 2025-07-04
**Updated**: 2025-07-04

## Description

A race condition exists between filtering submissions and marking them as seen:
if the monitoring loop runs multiple times in quick succession, or if API
responses change between calls, the same submission can be notified more than once.

## Reproduction Steps

1. Set up monitoring targets with very short poll intervals (difficult to
reproduce consistently)
2. Force multiple rapid checks using `.trigger` command
3. Monitor for duplicate notifications of the same submission
4. Check database for timing issues in `seen_submissions` table

## Expected vs Actual Behavior

- **Expected**: Each submission should only be notified once, regardless of
timing
- **Actual**: Rapid polling or API inconsistencies could cause duplicate
notifications

## Technical Details

### Code Location

- **File**: `src/cogs/runner.py`
- **Functions**: `filter_new_submissions()` (lines 386-398),
  `post_submissions()`, `mark_submissions_seen()`

### Race Condition Flow

```python
# Time T1: First check starts
submissions = await fetch_submissions_for_location(...) # API call
new_submissions = self.db.filter_new_submissions(channel_id, submissions) # Query DB

# Time T2: Second check starts (before first completes)
submissions_2 = await fetch_submissions_for_location(...) # Same API response
new_submissions_2 = self.db.filter_new_submissions(channel_id, submissions_2) # Same query result

# Time T3: Both post notifications
await self.post_submissions(new_submissions) # Posts duplicates
await self.post_submissions(new_submissions_2) # Posts duplicates

# Time T4: Both mark as seen
self.db.mark_submissions_seen(channel_id, submission_ids)  # Second caller hits the unique constraint (IntegrityError)
```

### Database Protection

- `seen_submissions` table has unique constraint on
`(channel_id, submission_id)`
- `mark_submissions_seen()` handles `IntegrityError` gracefully
- **Problem**: Notifications already sent before constraint violation
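
The constraint behaviour can be sketched in isolation. This is a hypothetical
stand-in (in-memory schema, local `mark_submissions_seen` helper); the real
table and handler live in `src/database.py` and may differ:

```python
import sqlite3

# Hypothetical sketch of the unique constraint described above; the real
# seen_submissions schema is defined elsewhere in the project.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE seen_submissions (
        channel_id    INTEGER NOT NULL,
        submission_id TEXT    NOT NULL,
        UNIQUE (channel_id, submission_id)
    )
    """
)

def mark_submissions_seen(channel_id: int, submission_ids: list[str]) -> int:
    """Insert rows, skipping duplicates, and return how many were new.

    Mirrors the "handles IntegrityError gracefully" behaviour: the
    constraint blocks duplicate rows, but only after notifications
    have already gone out.
    """
    inserted = 0
    for sid in submission_ids:
        try:
            with conn:
                conn.execute(
                    "INSERT INTO seen_submissions (channel_id, submission_id) VALUES (?, ?)",
                    (channel_id, sid),
                )
            inserted += 1
        except sqlite3.IntegrityError:
            pass  # already seen; the existing row is left untouched
    return inserted
```

A second call with the same IDs inserts nothing, which is exactly why the
constraint alone cannot stop the duplicate notifications posted at T3.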

### Potential Triggers

- Multiple rapid manual checks (`.trigger` command)
- Cloud Run scaling events causing multiple instances
- API response timing variations
- Database connection delays

## Proposed Solutions

### Option 1: Atomic Transaction

```python
async def process_submissions_atomically(self, channel_id, submissions):
    async with self.db.transaction():
        new_submissions = self.db.filter_new_submissions(channel_id, submissions)
        if new_submissions:
            # Mark seen first so a concurrent check filters them out;
            # a rollback on failure un-marks them
            self.db.mark_submissions_seen(channel_id, [s["id"] for s in new_submissions])
            await self.post_submissions(new_submissions)
```
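
If the project's `Database` class does not yet expose a `transaction()`
helper, one minimal shape for it is an async context manager over `sqlite3`.
This is a sketch under that assumption; the class name, schema, and rollback
semantics here are illustrative, not the project's actual API:

```python
import asyncio
import sqlite3
from contextlib import asynccontextmanager

# Sketch of the transaction() helper assumed by Option 1; the real
# Database class in src/database.py may differ.
class Database:
    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE seen_submissions ("
            "channel_id INTEGER, submission_id TEXT, "
            "UNIQUE (channel_id, submission_id))"
        )

    @asynccontextmanager
    async def transaction(self):
        # Commit on success, roll back on any exception, so a failed
        # post_submissions() also un-marks the submissions as seen
        try:
            yield self.conn
            self.conn.commit()
        except Exception:
            self.conn.rollback()
            raise

async def demo() -> list[str]:
    db = Database()
    # Successful check: the row is committed
    async with db.transaction() as conn:
        conn.execute("INSERT INTO seen_submissions VALUES (1, 'a')")
    # Failed check: the rollback discards the mark-as-seen row
    try:
        async with db.transaction() as conn:
            conn.execute("INSERT INTO seen_submissions VALUES (1, 'b')")
            raise RuntimeError("posting failed")  # simulated Discord error
    except RuntimeError:
        pass
    return [row[0] for row in db.conn.execute("SELECT submission_id FROM seen_submissions")]
```

The key property is that the mark-as-seen write and the notification either
both take effect or neither does.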

### Option 2: Channel-Level Locking

```python
class Runner:
    def __init__(self):
        self.channel_locks = {}

    async def run_checks_for_channel(self, channel_id, config, is_manual_check=False):
        # Lazily create one lock per channel; this check-then-set is safe
        # on a single event loop because there is no await between them
        if channel_id not in self.channel_locks:
            self.channel_locks[channel_id] = asyncio.Lock()

        async with self.channel_locks[channel_id]:
            ...  # existing check logic, now serialised per channel
```

### Option 3: Submission ID Tracking

```python
# Track recently processed submission IDs in memory (e.g. in Runner.__init__)
self.recently_processed = {}  # {channel_id: set(submission_ids)}

async def filter_new_submissions(self, channel_id, submissions):
    # Filter by the database first, then by the in-memory record
    db_filtered = self.db.filter_new_submissions(channel_id, submissions)
    recent = self.recently_processed.get(channel_id, set())
    truly_new = [s for s in db_filtered if s["id"] not in recent]
    self.recently_processed.setdefault(channel_id, set()).update(s["id"] for s in truly_new)
    return truly_new
```

## Acceptance Criteria

- [ ] No duplicate notifications under rapid polling scenarios
- [ ] Multiple manual checks don't cause duplicates
- [ ] Database integrity maintained
- [ ] Performance impact minimal
- [ ] Cloud Run scaling doesn't trigger race conditions
- [ ] Test with concurrent `.trigger` commands

## Notes

- Difficult to reproduce consistently in testing
- May require load testing to verify fix
- Related to Cloud Run scaling and multiple instances
- Consider distributed locking for multi-instance deployments
**File**: `docs/issues/1-bug-startup-duplicate-notifications.md` (new file, +90 lines)

# Startup Duplicate Notifications

**Priority**: 1 **Type**: bug **Status**: open **Created**: 2025-07-04
**Updated**: 2025-07-04

## Description

When the bot starts up, it sends notifications for old submissions that have
already been shown previously. The startup check bypasses normal polling logic
and treats all submissions as "new" even though they were previously processed.

## Reproduction Steps

1. Start the bot with existing monitoring targets
2. Observe that one or more channels receive notifications
3. Check log timestamps - notifications are for submissions that occurred before
the current session
4. Compare with previous bot runs - same submissions are re-notified

## Expected vs Actual Behavior

- **Expected**: Startup should only initialize monitoring, not send duplicate
notifications
- **Actual**: Startup sends notifications for previously seen submissions

## Technical Details

### Code Location

- **File**: `src/cogs/runner.py`
- **Function**: `_run_startup_checks()` (lines 457-508)
- **Issue**: Calls `run_checks_for_channel(is_manual_check=False)` during
startup

### Root Cause

1. `_run_startup_checks()` runs immediately when bot becomes ready
2. Calls `run_checks_for_channel()` with `is_manual_check=False`
3. Since `last_poll_at` is None, API fetches all recent submissions without date
filtering
4. `filter_new_submissions()` has no seen submissions in database yet (fresh
start)
5. All submissions get posted as notifications
6. Submissions are then marked as seen, preventing future duplicates
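
The effect of steps 3-5 condenses to a few lines (illustrative IDs; the real
flow goes through the database and the API client):

```python
# Fresh start: the database has no seen submissions yet
seen_submissions: set[str] = set()

# last_poll_at is None, so the API call applies no date filter and
# returns everything from the past 24 hours (illustrative IDs)
api_response = ["old-123", "old-456", "old-789"]

# filter_new_submissions() finds nothing already seen in the database...
new_submissions = [s for s in api_response if s not in seen_submissions]

# ...so every old submission would be posted as a "new" notification
duplicates_posted = len(new_submissions)
```

Only after this first pass are the IDs marked as seen (step 6), which is why
the duplicates appear once per fresh start rather than on every poll.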

### Database State

- `seen_submissions` table is empty on startup
- `last_poll_at` is None for channels that haven't been polled
- API returns submissions from past 24 hours without filtering

## Proposed Solution

### Option 1: Change Startup to Manual Check

```python
# In _run_startup_checks()
result = await self.run_checks_for_channel(channel_id, config, is_manual_check=True)
```

This would show "Last 5 submissions" format instead of posting as notifications.

### Option 2: Mark as Seen Without Posting

```python
# In _run_startup_checks()
submissions = await self._fetch_submissions_for_channel(channel_id, config)
if submissions:
self.db.mark_submissions_seen(channel_id, [s["id"] for s in submissions])
# Don't call post_submissions()
```

### Option 3: Skip Startup Checks Entirely

Remove `_run_startup_checks()` and let normal 1-minute polling handle first
check.

## Acceptance Criteria

- [ ] Bot startup does not trigger duplicate notifications
- [ ] Normal polling behavior unchanged
- [ ] Manual check commands still show recent submissions
- [ ] No regression in monitoring functionality
- [ ] Test with multiple channels and monitoring targets

## Notes

- This issue is particularly noticeable with active monitoring targets
- Users report confusion when old submissions appear as "new"
- May be related to timezone handling in date filtering logic
**File**: `docs/issues/2-bug-timezone-handling-inconsistency.md` (new file, +113 lines)

# Timezone Handling Inconsistency

**Priority**: 2 **Type**: bug **Status**: open **Created**: 2025-07-04
**Updated**: 2025-07-04

## Description

Inconsistent timezone handling across the application may cause polling schedule
issues, incorrect date filtering in API calls, and database timestamp problems.

## Reproduction Steps

1. Review datetime handling in different parts of the codebase
2. Check for naive vs timezone-aware datetime objects
3. Monitor for polling schedule irregularities
4. Verify API date filtering works correctly across time zones

## Expected vs Actual Behavior

- **Expected**: All datetime operations should be timezone-aware and consistent
- **Actual**: Mix of naive and timezone-aware datetime objects causing potential
issues

## Technical Details

### Code Locations

- **File**: `src/database.py` - `get_channel_config()` (lines 113-131)
- **File**: `src/cogs/runner.py` - Various datetime operations
- **File**: `src/api.py` - Date filtering in API calls

### Specific Issues

#### Database Operations

```python
# In get_channel_config() - timezone conversion happens here
if config.last_poll_at and config.last_poll_at.tzinfo is None:
    config.last_poll_at = config.last_poll_at.replace(tzinfo=timezone.utc)
```

#### Polling Schedule Logic

```python
# In _should_poll_channel() - may use naive datetime
current_time = datetime.now() # Potentially naive
time_since_last_poll = current_time - config.last_poll_at # Mixed types
```

#### API Date Filtering

```python
# Date calculations for min_date_of_submission parameter
yesterday = datetime.now() - timedelta(days=1) # Potentially naive
date_str = yesterday.strftime("%Y-%m-%d") # May not account for timezone
```

### Potential Impact

- **Polling Schedule**: Channels may poll too frequently or not frequently
enough
- **API Filtering**: Submissions might be missed or duplicated due to date range
issues
- **Database Consistency**: Timestamps may not align properly
- **Multi-timezone Users**: Different behavior based on server vs user timezones

## Proposed Solution

### Centralized Timezone Handling

```python
# Create utility functions for consistent datetime handling
from datetime import datetime, timezone

def now_utc() -> datetime:
    """Get the current UTC time, always timezone-aware."""
    return datetime.now(timezone.utc)

def ensure_utc(dt: datetime) -> datetime:
    """Ensure a datetime is timezone-aware and in UTC."""
    if dt.tzinfo is None:
        # Assume a naive datetime is already UTC
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)

def format_api_date(dt: datetime) -> str:
    """Format a datetime for API calls consistently."""
    return ensure_utc(dt).strftime("%Y-%m-%d")
```

### Update All Datetime Operations

1. Replace `datetime.now()` with `now_utc()`
2. Ensure all database datetime fields are timezone-aware
3. Standardize API date formatting
4. Add timezone validation in tests
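
Step 1 in practice: subtracting an aware stored timestamp from a naive
`datetime.now()` raises, while the aware replacement works (the
`last_poll_at` value below is illustrative):

```python
from datetime import datetime, timezone

def now_utc() -> datetime:
    """Current UTC time, always timezone-aware."""
    return datetime.now(timezone.utc)

# An aware timestamp as it would come back from the database after
# ensure_utc() normalisation (illustrative value)
last_poll_at = datetime(2025, 7, 4, 12, 0, tzinfo=timezone.utc)

# Old pattern: naive minus aware raises TypeError
try:
    _ = datetime.now() - last_poll_at
    mixed_arithmetic_works = True
except TypeError:
    mixed_arithmetic_works = False

# New pattern: aware minus aware is a well-defined timedelta
elapsed = now_utc() - last_poll_at
```

This is also the failure mode a timezone-specific test (step 4) should pin
down before and after the migration.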

## Acceptance Criteria

- [ ] All datetime objects are timezone-aware
- [ ] Polling schedules work correctly regardless of server timezone
- [ ] API date filtering is consistent and accurate
- [ ] Database timestamps are properly handled
- [ ] No regression in existing functionality
- [ ] Add timezone-specific tests

## Notes

- May require database migration if existing timestamps are naive
- Consider impact on existing data
- Test with different server timezone configurations
- Document timezone assumptions for deployment