[Bug]: Features marked 'Verified' even when Playwright tests fail - no test result tracking #726

### Operating System

Linux (Docker)

### Run Mode

Docker

### App Version

0.13.0

### Bug Description

Features in automated testing mode are marked as "Verified" regardless of whether Playwright tests actually passed or failed. There is no visual indication in the UI when tests fail, no parsing of test results from agent output, and no backlog or history of test results to review.

### Root Cause Analysis

In `apps/server/src/services/auto-mode-service.ts`:

**Lines 1350-1354** - Status is set based solely on mode, not test results:
```typescript
// Determine final status based on testing mode:
// - skipTests=false (automated testing): go directly to 'verified' (no manual verify needed)
// - skipTests=true (manual verification): go to 'waiting_approval' for manual review
const finalStatus = feature.skipTests ? 'waiting_approval' : 'verified';
await this.updateFeatureStatus(projectPath, featureId, finalStatus);
```

**Line 1393** - Success is hardcoded:
```typescript
passes: true,  // Always true if execution completes
```

**No test result parsing** - Agent output is saved to `agent-output.md` but never analyzed for:
- Test pass/fail indicators
- Playwright exit codes
- Error messages like "browser installation was blocked"

### Steps to Reproduce

1. Run Automaker in Docker without Playwright browsers installed
2. Create a feature and move it to "In Progress" (automated mode)
3. The AI agent attempts Playwright verification, which fails
4. The agent reports "Playwright browser installation was blocked by permissions, but build verification confirms the implementation is working"
5. The feature is moved to "Verified" status anyway
6. Nothing in the UI indicates that the tests failed

### Expected Behavior

1. **Test result parsing**: System should parse agent output for test pass/fail status
2. **Failed verification handling**: Features with failed tests should NOT be marked "Verified"
3. **Visual indication**: UI should show test status (pass/fail/skipped) on feature cards
4. **Test history/backlog**: Users should be able to:
   - See a list of features with failed/pending tests
   - Re-run failed tests
   - View test output/logs
5. **Graceful degradation**: When tests can't run (missing browsers), status should reflect "verification skipped" not "verified"
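
The "test history/backlog" expectation above could be backed by a simple filter over the feature list. The following is only a sketch: `FeatureSummary` and `featuresNeedingAttention` are hypothetical names, and `verificationStatus` is the field proposed in this issue, not something that exists in Automaker today.

```typescript
// Hypothetical shape: `verificationStatus` is the field proposed in this
// issue, not an existing Automaker property.
type VerificationStatus = 'passed' | 'failed' | 'skipped' | 'pending';

interface FeatureSummary {
  id: string;
  title: string;
  verificationStatus?: VerificationStatus;
}

// Returns the features whose verification did not pass, with failures
// first, then skipped, then pending. A missing status counts as pending.
function featuresNeedingAttention(features: FeatureSummary[]): FeatureSummary[] {
  const priority: Record<VerificationStatus, number> = {
    failed: 0,
    skipped: 1,
    pending: 2,
    passed: 3,
  };
  return features
    .filter((f) => (f.verificationStatus ?? 'pending') !== 'passed')
    .sort(
      (a, b) =>
        priority[a.verificationStatus ?? 'pending'] -
        priority[b.verificationStatus ?? 'pending'],
    );
}
```

Treating a missing status as pending keeps features created before the field existed visible in the list instead of silently counting them as tested.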

### Actual Behavior

- Features always marked "Verified" if agent completes
- No parsing of test results
- No UI indication of test failures
- No test history or backlog
- Silent failures give false confidence

### Suggested Fix

**1. Parse agent output for test results**
```typescript
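// After agent execution, check output for failure indicators.
// Note: the absence of these markers is treated as a pass, and the
// 'FAILED' check is case-sensitive.
const testsPassed = !agentOutput.includes('FAILED') &&
                    !agentOutput.includes('test failed') &&
                    !agentOutput.includes('browser installation was blocked');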
```
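
A boolean cannot distinguish "tests failed" from "tests never ran", which the graceful-degradation expectation asks for. A three-way classifier might look like the sketch below. The function name and the marker patterns are assumptions for illustration: the `N passed` / `N failed` patterns assume a summary in that shape appears in the agent output, which this issue has not confirmed.

```typescript
// Hypothetical classifier; the patterns are illustrative assumptions,
// not the format Automaker's agent is known to emit.
type VerificationStatus = 'passed' | 'failed' | 'skipped' | 'pending';

// Evidence that at least one test ran and failed.
const FAILED_PATTERNS: RegExp[] = [
  /\bFAILED\b/,
  /\btests? failed\b/i,
  /\b[1-9]\d* failed\b/i,
];

// Evidence that the tests could not run at all.
const SKIPPED_PATTERNS: RegExp[] = [/browser installation was blocked/i];

// Evidence that at least one test ran and passed.
const PASSED_PATTERNS: RegExp[] = [/\b[1-9]\d* passed\b/i];

function classifyVerification(agentOutput: string): VerificationStatus {
  const matches = (patterns: RegExp[]) => patterns.some((p) => p.test(agentOutput));
  if (matches(FAILED_PATTERNS)) return 'failed';
  if (matches(SKIPPED_PATTERNS)) return 'skipped';
  if (matches(PASSED_PATTERNS)) return 'passed';
  // No recognizable evidence either way: do not assume success.
  return 'pending';
}
```

Failure is checked first so that one failing test outweighs any passing ones, and output with no recognizable markers falls through to `'pending'` rather than being counted as a pass.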

**2. Add verification status field to Feature type**
```typescript
interface Feature {
  // ...existing fields
  verificationStatus?: 'passed' | 'failed' | 'skipped' | 'pending';
  verificationOutput?: string;
}
```

**3. Conditional status based on test results**
```typescript
const finalStatus = feature.skipTests
  ? 'waiting_approval'
  : (testsPassed ? 'verified' : 'verification_failed');
```
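
If the four-value `verificationStatus` from step 2 is used instead of a boolean, the status decision could be sketched as follows. `resolveFinalStatus` is a hypothetical helper, `'verification_failed'` is the new status proposed above, and routing skipped or pending verification to `'waiting_approval'` is an assumption of this sketch (one possible way to avoid claiming "verified" without evidence).

```typescript
// Hypothetical helper; 'verification_failed' is the status proposed in
// this issue and does not exist in Automaker yet.
type VerificationStatus = 'passed' | 'failed' | 'skipped' | 'pending';
type FeatureStatus = 'verified' | 'waiting_approval' | 'verification_failed';

function resolveFinalStatus(
  skipTests: boolean,
  verification: VerificationStatus,
): FeatureStatus {
  // Manual verification mode: always hand off for human review.
  if (skipTests) return 'waiting_approval';

  switch (verification) {
    case 'passed':
      return 'verified';
    case 'failed':
      return 'verification_failed';
    case 'skipped':
    case 'pending':
      // Assumption: without test evidence, fall back to manual review.
      return 'waiting_approval';
  }
}
```

With this shape, `'verified'` is reachable only when tests demonstrably passed.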

**4. UI enhancements**
- Add test status badge to feature cards
- Add "Test Results" tab/panel showing verification history
- Add "Re-run Tests" button for failed verifications
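
For the status badge, a plain lookup table may be enough. This is a rough sketch: `Badge`, `badgeFor`, the label text, and the tone names are all placeholders, not existing Automaker UI code.

```typescript
// Hypothetical badge mapping; labels and tone names are placeholders.
type VerificationStatus = 'passed' | 'failed' | 'skipped' | 'pending';

interface Badge {
  label: string;
  tone: 'success' | 'danger' | 'warning' | 'neutral';
}

const BADGES: Record<VerificationStatus, Badge> = {
  passed: { label: 'Tests passed', tone: 'success' },
  failed: { label: 'Tests failed', tone: 'danger' },
  skipped: { label: 'Verification skipped', tone: 'warning' },
  pending: { label: 'Not yet verified', tone: 'neutral' },
};

// Features without a recorded status are shown as pending.
function badgeFor(status?: VerificationStatus): Badge {
  return BADGES[status ?? 'pending'];
}
```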

### Screenshots

_N/A_

### Relevant Logs

Agent output when tests fail but feature is marked verified:
```
Playwright browser installation was blocked by permissions, but build verification confirms the implementation is working
```
Feature status: ✓ Verified (should be: ⚠️ Verification Failed)

### Additional Context

This creates a false sense of confidence - users believe features are tested and working when they may have significant issues. Combined with #725 (Docker Playwright not installed), Docker users are systematically getting unverified features marked as verified.

### Related Issues

- #725 - Docker: Playwright verification fails - browsers not installed

### Checklist

- [x] I have searched existing issues to ensure this bug hasn't been reported already
- [x] I have provided all required information above
