From ba2d9d73e2b2f439bed5dc96b6de50f2a7f7efd7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Esteban=20K=C3=BCber?= Date: Tue, 31 Jul 2018 20:15:46 -0700 Subject: [PATCH 1/6] RFC: Teach `concat!()` to join `[u8]` and byte `str` --- text/0000-byte-concat.md | 92 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 92 insertions(+) create mode 100644 text/0000-byte-concat.md diff --git a/text/0000-byte-concat.md b/text/0000-byte-concat.md new file mode 100644 index 00000000000..7c7dfb42308 --- /dev/null +++ b/text/0000-byte-concat.md @@ -0,0 +1,92 @@ +- Feature Name: byte_concat +- Start Date: 2018-07-31 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Allow the use of `concat!()` to join byte sequences onto an `u8` array, +beyond the current support for `str` literals. + +# Motivation +[motivation]: #motivation + +`concat!()` is convenient and useful to create compile time `str` literals +from `str`, `bool`, numeric and `char` literals in the code. This RFC would +expand this capability to produce `[u8]` instead of `str` when any of its +arguments is a byte `str` or a byte `char`. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +Whenever any of the arguments to `concat!()` is a byte literal, its output +will be a byte literal, and the other arguments will be evaluated on their +byte contents. 
+ +- `str`s and `char`s are evaluated in the same way as `String::as_bytes`, +- `bool`s are not accepted, use a numeric literal instead, +- numeric literals passed to `concat!()` must fit in `u8`, any number + larger than `std::u8::MAX` causes a compile time error, like the + following: +``` +error: cannot concatenate a non-`u8` literal in a byte string literal + --> $FILE:XX:YY + | +XX | concat!(256, b"val"); + | ^^^ this value is larger than `255` +``` +- numeric array literals that can be coerced to `[u8]` are accepted, if the +literals are outside of `u8` range, it will cause a compile time error: +``` +error: cannot concatenate a non-`u8` literal in a byte string literal + --> $FILE:XX:YY + | +XX | concat!([300, 1, 2, 256], b"val"); + | ^^^ ^^^ this value is larger than `255` + | | + | this value is larger than `255` +``` + +For example, `concat!(42, b"va", b'l', [1, 2])` evaluates to +`[42, 118, 97, 108, 1, 2]`. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +[PR #52838](https://github.com/rust-lang/rust/pull/52838) lays the +foundation for the implementation of the full RFC. + +This new feature could be surprising when editting existing code, if +`concat!("foo", `b`, `a`, `r`, 3)` were changed to +`concat!("foo", `b`, b`a`, `r`, 3)`, as the macro call would change from +being evaluated as a `str` literal "foobar3" to `[u8]` +`[102, 111, 111, 98, 97, 114, 3]`. + +# Drawbacks +[drawbacks]: #drawbacks + +As mentioned in the previous section, this causes `concat!()`'s output to be +dependant on its input. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +A new macro `bconcat!()` could be introduced instead. People in the wild +have already intended to use `concat!()` for byte literals. A new macro +could be explained to users through diagnostics, but using the existing +macro adds support for something that a user could expect to work. 
+ +# Prior art +[prior-art]: #prior-art + +[PR #52838](https://github.com/rust-lang/rust/pull/52838) lays the +foundation for the implementation of the full RFC, trying to enable a real +use seen in the wild. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +- What parts of the design do you expect to resolve through the RFC process before this gets merged? +- What parts of the design do you expect to resolve through the implementation of this feature before stabilization? +- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? From 02afa73238594053fb4b48d317272c4dfcac89ec Mon Sep 17 00:00:00 2001 From: Jonas Platte Date: Thu, 7 Jan 2021 12:35:27 +0100 Subject: [PATCH 2/6] Update byte_concat text to propose a separate concat_bytes! macro --- text/0000-byte-concat.md | 104 +++++++++++++++++++-------------------- 1 file changed, 50 insertions(+), 54 deletions(-) diff --git a/text/0000-byte-concat.md b/text/0000-byte-concat.md index 7c7dfb42308..c0d3525e1c2 100644 --- a/text/0000-byte-concat.md +++ b/text/0000-byte-concat.md @@ -6,87 +6,83 @@ # Summary [summary]: #summary -Allow the use of `concat!()` to join byte sequences onto an `u8` array, -beyond the current support for `str` literals. +Add a macro `concat_bytes!()` to join byte sequences onto an `u8` array, +the same way `concat!()` currently supports for `str` literals. # Motivation [motivation]: #motivation `concat!()` is convenient and useful to create compile time `str` literals -from `str`, `bool`, numeric and `char` literals in the code. This RFC would -expand this capability to produce `[u8]` instead of `str` when any of its -arguments is a byte `str` or a byte `char`. +from `str`, `bool`, numeric and `char` literals in the code. This RFC adds an +equivalent capability for `[u8]` instead of `str`. 
# Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Whenever any of the arguments to `concat!()` is a byte literal, its output -will be a byte literal, and the other arguments will be evaluated on their -byte contents. - -- `str`s and `char`s are evaluated in the same way as `String::as_bytes`, -- `bool`s are not accepted, use a numeric literal instead, -- numeric literals passed to `concat!()` must fit in `u8`, any number - larger than `std::u8::MAX` causes a compile time error, like the - following: -``` -error: cannot concatenate a non-`u8` literal in a byte string literal - --> $FILE:XX:YY - | -XX | concat!(256, b"val"); - | ^^^ this value is larger than `255` -``` -- numeric array literals that can be coerced to `[u8]` are accepted, if the -literals are outside of `u8` range, it will cause a compile time error: -``` -error: cannot concatenate a non-`u8` literal in a byte string literal - --> $FILE:XX:YY - | -XX | concat!([300, 1, 2, 256], b"val"); - | ^^^ ^^^ this value is larger than `255` - | | - | this value is larger than `255` -``` - -For example, `concat!(42, b"va", b'l', [1, 2])` evaluates to +The `concat_bytes!()` macro concatenates literals into a static byte slice. 
+The following literal types are supported: + +- byte string literals (`b"..."`) +- byte literals (`b'b'`) +- numeric literals – must fit in `u8`, any number larger than `u8::MAX` causes + a compile time error like the following: + + ``` + error: cannot concatenate a non-`u8` literal in a byte string literal + --> $FILE:XX:YY + | + XX | concat_bytes!(256, b"val"); + | ^^^ this value is larger than `255` + ``` +- numeric array literals – if any literal is outside of `u8` range, it will + cause a compile time error: + + ``` + error: cannot concatenate a non-`u8` literal in a byte string literal + --> $FILE:XX:YY + | + XX | concat_bytes!([300, 1, 2, 256], b"val"); + | ^^^ ^^^ this value is larger than `255` + | | + | this value is larger than `255` + ``` + +For example, `concat_bytes!(42, b"va", b'l', [1, 2])` evaluates to `[42, 118, 97, 108, 1, 2]`. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -[PR #52838](https://github.com/rust-lang/rust/pull/52838) lays the -foundation for the implementation of the full RFC. - -This new feature could be surprising when editting existing code, if -`concat!("foo", `b`, `a`, `r`, 3)` were changed to -`concat!("foo", `b`, b`a`, `r`, 3)`, as the macro call would change from -being evaluated as a `str` literal "foobar3" to `[u8]` -`[102, 111, 111, 98, 97, 114, 3]`. + # Drawbacks [drawbacks]: #drawbacks -As mentioned in the previous section, this causes `concat!()`'s output to be -dependant on its input. +None known. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives -A new macro `bconcat!()` could be introduced instead. People in the wild -have already intended to use `concat!()` for byte literals. A new macro -could be explained to users through diagnostics, but using the existing -macro adds support for something that a user could expect to work. 
+`concat!` could instead be changed to sometimes produce byte literals instead of +string literals, like a previous revision of this RFC proposed. This would make +it hard to ensure the right output type is produced – users would have to use +hacks like adding a dummy `b""` argument to force a byte literal output. # Prior art [prior-art]: #prior-art -[PR #52838](https://github.com/rust-lang/rust/pull/52838) lays the -foundation for the implementation of the full RFC, trying to enable a real -use seen in the wild. + # Unresolved questions [unresolved-questions]: #unresolved-questions -- What parts of the design do you expect to resolve through the RFC process before this gets merged? -- What parts of the design do you expect to resolve through the implementation of this feature before stabilization? -- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? +- Should additional literal types be supported? Byte string literals are + basically the same thing as byte slice references, so it might make sense to + support those as well (support `&[0, 1, 2]` in addition to `[0, 1, 2]`). +- What to do with string and character literals? They could either be supported + with their underlying UTF-8 representation being concatenated, or rejected. + - If supported, it would probably make sense to also support boolean literals + so `concat_bytes!()` supports all inputs `concat!()` does. + - If rejected, it would probably makes sense to also reject boolean literals + to avoid any possible confusion about their representation (`b"true"` and + `b"false"` vs. `1` and `0`). From a326a18a4263dc29fb267ea2be092002dbfdb138 Mon Sep 17 00:00:00 2001 From: Jonas Platte Date: Thu, 7 Jan 2021 13:24:51 +0100 Subject: [PATCH 3/6] byte-concat: Improve wording about the output of concat_bytes! 
--- text/0000-byte-concat.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/text/0000-byte-concat.md b/text/0000-byte-concat.md index c0d3525e1c2..4fd40aacf54 100644 --- a/text/0000-byte-concat.md +++ b/text/0000-byte-concat.md @@ -19,8 +19,9 @@ equivalent capability for `[u8]` instead of `str`. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -The `concat_bytes!()` macro concatenates literals into a static byte slice. -The following literal types are supported: +The `concat_bytes!()` macro concatenates literals into a byte string literal +(an expression of the type `&[u8; N]`). The following literal types are +supported as inputs: - byte string literals (`b"..."`) - byte literals (`b'b'`) From 5d1cea76c202d1584fc62b6b978b606dabda391d Mon Sep 17 00:00:00 2001 From: Jonas Platte Date: Tue, 19 Jan 2021 19:07:43 +0100 Subject: [PATCH 4/6] byte-concat: Remove confusing byte literal syntax Co-authored-by: Esteban Kuber --- text/0000-byte-concat.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-byte-concat.md b/text/0000-byte-concat.md index 4fd40aacf54..56bdc5c80e3 100644 --- a/text/0000-byte-concat.md +++ b/text/0000-byte-concat.md @@ -85,5 +85,5 @@ hacks like adding a dummy `b""` argument to force a byte literal output. - If supported, it would probably make sense to also support boolean literals so `concat_bytes!()` supports all inputs `concat!()` does. - If rejected, it would probably makes sense to also reject boolean literals - to avoid any possible confusion about their representation (`b"true"` and - `b"false"` vs. `1` and `0`). + to avoid any possible confusion about their representation (`true` and + `false` vs. `1` and `0`). 
From 32ed992fabee580fb96750f4e72a40cfa1ea3dd3 Mon Sep 17 00:00:00 2001 From: Jonas Platte Date: Tue, 23 Mar 2021 12:07:34 +0100 Subject: [PATCH 5/6] byte-concat: Delete unneeded sections --- text/0000-byte-concat.md | 10 ---------- 1 file changed, 10 deletions(-) diff --git a/text/0000-byte-concat.md b/text/0000-byte-concat.md index 56bdc5c80e3..df57082ea0c 100644 --- a/text/0000-byte-concat.md +++ b/text/0000-byte-concat.md @@ -51,11 +51,6 @@ supported as inputs: For example, `concat_bytes!(42, b"va", b'l', [1, 2])` evaluates to `[42, 118, 97, 108, 1, 2]`. -# Reference-level explanation -[reference-level-explanation]: #reference-level-explanation - - - # Drawbacks [drawbacks]: #drawbacks @@ -69,11 +64,6 @@ string literals, like a previous revision of this RFC proposed. This would make it hard to ensure the right output type is produced – users would have to use hacks like adding a dummy `b""` argument to force a byte literal output. -# Prior art -[prior-art]: #prior-art - - - # Unresolved questions [unresolved-questions]: #unresolved-questions From 431dadf599bac7c3524d53eb14d88fcc77c5b9ba Mon Sep 17 00:00:00 2001 From: Jonas Platte Date: Wed, 26 May 2021 16:56:46 +0200 Subject: [PATCH 6/6] byte-concat: Address argument type concern --- text/0000-byte-concat.md | 20 +++++--------------- 1 file changed, 5 insertions(+), 15 deletions(-) diff --git a/text/0000-byte-concat.md b/text/0000-byte-concat.md index df57082ea0c..b58adda0fa8 100644 --- a/text/0000-byte-concat.md +++ b/text/0000-byte-concat.md @@ -25,16 +25,6 @@ supported as inputs: - byte string literals (`b"..."`) - byte literals (`b'b'`) -- numeric literals – must fit in `u8`, any number larger than `u8::MAX` causes - a compile time error like the following: - - ``` - error: cannot concatenate a non-`u8` literal in a byte string literal - --> $FILE:XX:YY - | - XX | concat_bytes!(256, b"val"); - | ^^^ this value is larger than `255` - ``` - numeric array literals – if any literal is outside of `u8` range, it 
will cause a compile time error: @@ -64,6 +54,11 @@ string literals, like a previous revision of this RFC proposed. This would make it hard to ensure the right output type is produced – users would have to use hacks like adding a dummy `b""` argument to force a byte literal output. +An earlier version of this RFC proposed to support integer literals outside of +arrays, but that was rejected since it would make the output of +`concat_bytes!(123, b"\n")` inconsistent with the equivalent `concat!` +invocation. + # Unresolved questions [unresolved-questions]: #unresolved-questions @@ -72,8 +67,3 @@ hacks like adding a dummy `b""` argument to force a byte literal output. support those as well (support `&[0, 1, 2]` in addition to `[0, 1, 2]`). - What to do with string and character literals? They could either be supported with their underlying UTF-8 representation being concatenated, or rejected. - - If supported, it would probably make sense to also support boolean literals - so `concat_bytes!()` supports all inputs `concat!()` does. - - If rejected, it would probably makes sense to also reject boolean literals - to avoid any possible confusion about their representation (`true` and - `false` vs. `1` and `0`).
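
The input kinds that remain after patch 6/6 (byte string literals, byte literals, and numeric array literals) can be illustrated on stable Rust. This is only a minimal sketch of the value the RFC describes for `concat_bytes!(b"va", b'l', [1, 2])`. The names `concat_parts` and `JOINED` are hypothetical and appear nowhere in the RFC, and the proposed macro would presumably be a compiler built-in rather than a `const fn`.

```rust
/// Hypothetical stand-in for the proposed macro (an assumption for
/// illustration only): copies every slice in `parts`, in order, into a
/// single array of length `N` during constant evaluation.
const fn concat_parts<const N: usize>(parts: &[&[u8]]) -> [u8; N] {
    let mut out = [0u8; N];
    let mut pos = 0; // next free index in `out`
    let mut i = 0;
    while i < parts.len() {
        let part = parts[i];
        let mut j = 0;
        while j < part.len() {
            out[pos] = part[j];
            pos += 1;
            j += 1;
        }
        i += 1;
    }
    // Fail constant evaluation if the inputs do not fill the array exactly.
    assert!(pos == N, "total input length must equal N");
    out
}

// Mirrors `concat_bytes!(b"va", b'l', [1, 2])`: a byte string literal,
// a byte literal, and a numeric array literal.
// JOINED == [118, 97, 108, 1, 2]
const JOINED: [u8; 5] = concat_parts(&[b"va", &[b'l'], &[1, 2]]);
```

Unlike the proposed macro, this helper needs the output length spelled out in the type annotation, and an out-of-range value such as `300` is rejected by ordinary `u8` literal checking rather than by the dedicated diagnostic shown in the RFC text.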