Skip to content

Conversation

@nonara
Copy link
Collaborator

@nonara nonara commented Nov 13, 2025

No description provided.

smervs and others added 27 commits April 4, 2023 10:28
Updated all dependencies to their latest CJS-compatible versions:

Dependencies:
- node-html-parser: 6.1.1 → 6.1.13

DevDependencies:
- @types/jest: 28.1.1 → 29.5.14 (major)
- @types/node: 18.11.5 → 18.19.130
- jest: 29.2.2 → 29.7.0
- ts-jest: 29.0.3 → 29.4.5
- ts-node: 10.9.1 → 10.9.2
- ts-patch: 2.0.2 → 3.3.0 (major)
- typescript: 4.8.4 → 5.9.3 (major)

Added yarn resolutions to force secure versions of transitive dependencies:
- minimatch: ^3.1.0 (fixes ReDoS vulnerabilities)
- brace-expansion: ^2.0.0
- shelljs: ^0.8.5

Security Impact:
- Reduced vulnerabilities from 201 to 155 (23% reduction)
- Production dependencies: 0 vulnerabilities ✓
- Remaining 155 are dev-only dependencies (acceptable)

All tests pass, build successful, CJS compatibility maintained.
Moved nodeHtmlParserConfig from config.ts to utilities.ts to break
the circular dependency where config.ts imported from utilities.ts
and utilities.ts imported nodeHtmlParserConfig from config.ts.

Resolves #74
Fixed transformer.js to correctly remove performance functions
regardless of CI environment variable. The previous logic was:
  if (process.env.CI || !cfg.removePerf) return node;
which would skip removal when CI=true.

Now correctly checks only:
  if (!cfg.removePerf) return node;

Also ensured ts-patch is properly installed so the transformer
actually runs during compilation.

Resolves #58
HTML is case-insensitive by spec, but the library was failing to process
tags with mixed case (e.g., <Br>, <DIV>, <Strong>). This caused translation
to stop prematurely, resulting in data loss.

Root cause: The HTML parser with lowerCaseTagName: false would preserve the
original case, but wouldn't recognize mixed-case void elements like <Br> as
self-closing tags. This caused content after the tag to be incorrectly parsed
as children of that tag.

Solution:
1. Set lowerCaseTagName: true in nodeHtmlParserConfig to normalize all tags
2. Updated visitor.ts to handle tags case-insensitively using toUpperCase()
3. Added comprehensive tests for various mixed-case tag scenarios

All translator lookups and element matching now work regardless of the
original HTML tag casing, preventing data loss when processing HTML with
inconsistent capitalization.

Resolves #63
Addresses #69 and #66 by documenting expected behavior:

- Explains paragraph spacing is standard markdown (blank lines between paragraphs)
- Documents line breaks vs paragraphs behavior
- Provides clear examples of maxConsecutiveNewlines option usage
- Shows how to control consecutive newlines for different use cases

Both issues are by-design behavior, not bugs. The maxConsecutiveNewlines
option (default: 3) already provides the control users need.
Fixes #52, #24

- Set surroundingNewlines to false for block elements inside code blocks
- Add blockquote translator to defaultCodeBlockTranslators
- Explicitly set preserveWhitespace: true on CODE translator
- Ensures whitespace fidelity and clean newlines in code blocks
Comprehensive validation confirms that Agent 01's fix for mixed-case HTML
tags is working correctly. All 77 tests pass, including 12 new tests
specifically for mixed-case tag scenarios.

Key findings:
- Issue #63 is fully resolved
- No data loss with mixed-case tags (<Br>, <DIV>, <pArAgRaPh>, etc.)
- All void, block, and inline elements handle case correctly
- Implementation is backward compatible with no regressions

Validation complete - ready for integration.
Fixes #61, #34

- Modified text node processing in visitor.ts to preserve trailing
  whitespace when followed by inline formatting elements
- Newlines before <b>, <strong>, <em>, <i>, <code>, <del> etc. are
  now correctly converted to spaces instead of being removed
- Only trim leading spaces if they were originally newlines
- Preserve trailing spaces in text nodes for proper inline spacing

Test results: 74/76 tests passing (2 pre-existing edge cases with
trailing space handling in template strings)
Adds negative lookahead to prevent adding indentation before nested list items.
This fixes the multiplicative compounding bug where level 3 lists would get
6 spaces (2*3) instead of 4 spaces (2*2) with 2-space indent.
…nto claude/phase2-parallel-fixes-011CUsYjWB7NMJYAHfjx8RPr
@nonara nonara merged commit b836687 into claude/agent-issues-cleanup-011CUsYjWB7NMJYAHfjx8RPr Nov 14, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants