From 69d7e83080b744488412b59c738b723d9d6c1979 Mon Sep 17 00:00:00 2001
From: Alisdair Meredith
Date: Tue, 4 Nov 2025 13:33:48 -1000
Subject: [PATCH] [lex] Complete the use of unicode code points

Completes the task of applying Unicode code point markup to refer
to single characters in normative text.

Replace all remaining uses of backslash as glyph (\) or text
with the corresponding `\unicode{005c}{reverse solidus}` markup.

Replace a couple of text references to single and double quotes
with their corresponding Unicode markup, and similarly for the
glyphs ' and ".

This should complete the last Unicode markup in [lex].
---
 source/lex.tex | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/source/lex.tex b/source/lex.tex
index 2cb35f496f..8b8e8d5d49 100644
--- a/source/lex.tex
+++ b/source/lex.tex
@@ -99,12 +99,12 @@
 \indextext{line splicing}%
 If the first translation character is \unicode{feff}{byte order mark},
 it is deleted.
-Each sequence comprising a backslash character (\textbackslash)
+Each sequence comprising a \unicode{005c}{reverse solidus} character (\textbackslash)
 immediately followed by
 zero or more whitespace characters other than new-line followed by
 a new-line character is deleted, splicing
 physical source lines to form \defnx{logical source lines}{source line!logical}. Only the last
-backslash on any physical source line is eligible for being part
+\unicode{005c}{reverse solidus} on any physical source line is eligible for being part
 of such a splice.
 \begin{note}
 Line splicing can form
@@ -578,7 +578,8 @@
 circumstances during translation phase 4, whitespace (or the absence
 thereof) serves as more than preprocessing token separation. Whitespace
 can appear within a preprocessing token only as part of a header name or
-between the quotation characters in a character literal or
+between the \unicode{0027}{apostrophe} characters in a character literal
+or between the \unicode{0022}{quotation mark} characters in a
 string literal.
 \end{note}
 
@@ -730,13 +731,14 @@
 \end{note}
 
 \pnum
-The appearance of either of the characters \tcode{'} or \tcode{\textbackslash} or of
+The appearance of either of the characters \unicode{0027}{apostrophe} or
+\unicode{005c}{reverse solidus} or of
 either of the character sequences \tcode{/*} or \tcode{//} in a
 \grammarterm{q-char-sequence} or an \grammarterm{h-char-sequence}
 is conditionally-supported with \impldef{meaning of \tcode{'}, \tcode{\textbackslash},
 \tcode{/*}, or \tcode{//} in a \grammarterm{q-char-sequence} or an
 \grammarterm{h-char-sequence}} semantics, as is the appearance of the character
-\tcode{"} in an \grammarterm{h-char-sequence}.
+\unicode{0022}{quotation mark} in an \grammarterm{h-char-sequence}.
 \begin{note}
 Thus, a sequence of characters that resembles an escape sequence can
 result in an error, be interpreted as the