-
Notifications
You must be signed in to change notification settings - Fork 32
Description
The instance NFData Doc (added in #13) must be considered harmful, as the size of Doc normal form baloons even for small documents.
I shot myself badly in the foot when I refactored a DeepSeq.force on a rendered Doc (so, a String) to run on the Doc itself. (The purpose of the deep normalization is to bring exceptions to the surface.) The memory consumption exploded to the point of even making my terminal crash (on a 48GB machine!). The story can he read here and I link to the fatal loc:
- Fix #8033: color warnings by keeping Doc over rendering to String agda/agda#8040
- agda/agda@9ea5a10#r2243728414
What is going on? The problem is the Union constructor in Doc that holds two documents that render to the same text (with maybe different layout). Relying on laziness, usually the second alternative is only used when the first does not work out well. However, NFData's rnf will normalize both alternatives.
Also, the way that the pretty DSL is designed, common Doc constructors are implemented shallowly ("smart constructors"), so they distribute over Union. For instance, take beside which is ubiquitous because it is invoked whenever you concatenate documents horizontally (such as in <>, <+>, hcat, hsep...).
| beside (p1 `Union` p2) g q = beside p1 g q `union_` beside p2 g q |
g and q are duplicated when beside distributes itself over a Union. And unions are also ubiquitous as they are introduced by seps: pretty/src/Text/PrettyPrint/Annotated/HughesPJ.hs
Lines 722 to 724 in 8242baf
| -- Specification: sep ps = oneLiner (hsep ps) | |
| -- `union` | |
| -- vcat ps |
pretty/src/Text/PrettyPrint/Annotated/HughesPJ.hs
Lines 763 to 766 in 8242baf
| sepNB g Empty k ys | |
| = oneLiner (nilBeside g (reduceDoc rest)) `mkUnion` | |
| -- XXX: TODO: PRETTY: Used to use True here (but GHC used False...) | |
| nilAboveNest False k (reduceDoc (vcat ys)) |
Doc are quadratic and laziness is essential to make the pretty DSL work.Adding a normalizer like
NFData is contrary to the whole design of this library.
The addition of NFData was innocuous:
Contributor: #12 (comment)
Would it be feasible to add an NFData instance for the Doc type, etc.? This would probably bee needed for #2.
Maintainer: #12 (comment)
Yes, this would be fine.
Don't get me wrong; both the Contributor and the Maintainer could have been me. One usually is just smarter in hindsight, when the calamity has struck.
This is the usual plight of software engineering: the original authors of the software leave, taking their deep understanding and unspoken dos and don'ts with them, and the second generation has take over without having the expertise.
What is the way forward here?
I think the NFData Doc instance should be deprecated, i.e., equipped with a warning pointing out its hazards (e.g. pointing to this issue).
The original motivation for the introduction of NFData Doc also seems questionable in the light of this issue:
This would probably bee needed for #2.
Since NFData degrades performance, it is probably not a tool to measure performance.
Related: