Skip to content

Conversation

@bonktree
Copy link
Contributor

Add some tests that cover all the possible octet values in the mask, as well as invalid prefix lengths.

While running the tests, it was discovered the function did not work on mksh at all:

  • that shell stores integers as int32_t in arith context, so numbers like 0x80000000 are < 0, and bit shift to the right has been observed to sign-extend;
  • that shell cannot left-shift an integer by 32 bits (and evaluate to a multiple of 4294967296), since the shift amount is considered modulo 32.

We have to replace the algorithm to not rely on left shift by 32, or any right shift. We move it to a separate function to be used from more places in future patches.

Add some tests that cover all the possible octet values in the mask,
as well as invalid prefix lengths.

While running the tests, it was discovered the function did not work
on mksh at all:
- that shell stores integers as int32_t in arith context, so numbers
  like 0x80000000 are < 0, and bit shift to the right has been observed
  to sign-extend;
- that shell cannot left-shift an integer by 32 bits (and evaluate to a
  multiple of 4294967296), since the shift amount is considered modulo 32.
We have to replace the algorithm to not rely on left shift by 32, or any
right shift. We move it to a separate function to be used from more
places in subsequent changes.

Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
@bonktree
Copy link
Contributor Author

bonktree commented Aug 22, 2025

There is a different micro-issue: the following has been failing for me since the last additions to master:

LANG=ru_RU.UTF-8 make check CHECK_SHELL='/bin/bash'

with this error output:

./check-string-performance: line 57: (0 * 60) + 1,159s: value too great for base (error token is "159s")
[FAIL] (string_performance_test1) 

The LANG took place implicitly from my environment, so that had me startled for a while :)

I'm not sure it'd be appropriate or not-noisy to register this as a GH issue, so describing it here.

% bash --version | head -n1
GNU bash, version 5.2.37(1)-release (x86_64-pc-linux-gnu)
% grep sid /etc/os-release
PRETTY_NAME="Debian GNU/Linux forky/sid"

@legionus
Copy link
Owner

There is a different micro-issue: the following has been failing for me since the last additions to master:

LANG=ru_RU.UTF-8 make check CHECK_SHELL='/bin/bash'

Ah. Good catch! Builtin times uses locale. I miss it. Thanks!
Fixed in master.

@legionus legionus merged commit ab2e6d1 into legionus:master Aug 22, 2025
6 checks passed
@legionus
Copy link
Owner

@bonktree merged.

@legionus
Copy link
Owner

I am slowly coming to the conclusion that it is worth adding shell-bits in which to add functions for creating masks, setting and checking certain bits. Something like that.

@bonktree bonktree deleted the prefix2mask-tests branch August 22, 2025 19:56
@legionus
Copy link
Owner

While running the tests, it was discovered the function did not work on mksh at all:

I didn't know that mksh doesn't support 64-bit numbers. I even started to think that this should be a blocker for mksh support. But it turns out that ash/hash from busybox is also 32-bit by default, and 64-bit support can only be enabled with a separate option.

@legionus
Copy link
Owner

But it turns out that ash/hash from busybox is also 32-bit by default, and 64-bit support can only be enabled with a separate option.

Sorry. Wrong again. 64-bit support enabled by default if POSIX math support enabled.

@bonktree
Copy link
Contributor Author

I am slowly coming to the conclusion that it is worth adding shell-bits in which to add functions for creating masks, setting and checking certain bits. Something like that.

That would definitely make sense if there were multiple use cases for such tools, either from inside the libshell project or described by users. So operations on IP addresses is one, and there would better be others.

A small bit of bikeshedding on the file name:
"bits" can be understood as a synonym for "things", for a generic noun to describe a vague group of objects there isn't a collective term for, or the speaker is too lazy to find a term for. shell-bits might come across like a collection of generic shell tools. For that reason, I feel like shell-bitwise would be more descriptive.
I do not dare to insist, of course. :)

@legionus
Copy link
Owner

That would definitely make sense if there were multiple use cases for such tools, either from inside the libshell project or described by users. So operations on IP addresses is one, and there would better be others.

What lies on the surface, inside libshell is shell-cmdline, in which I used bitwise operations for parser states. In several other places, bitwise operations could also be useful when multiple flags are required.

The kernel has convenient wrappers such as BIT, test_bit, and so on. This is much more declarative than in-place operations. I thought that this could also be useful for libshell.

A small bit of bikeshedding on the file name: "bits" can be understood as a synonym for "things", for a generic noun to describe a vague group of objects there isn't a collective term for, or the speaker is too lazy to find a term for. shell-bits might come across like a collection of generic shell tools. For that reason, I feel like shell-bitwise would be more descriptive. I do not dare to insist, of course. :)

I'm just lazy. In the kernel, such wrappers are located in linux/bits.h and linux/bitops.h. So I decided to name it the same way. But I have nothing against shell-bitwise.

@legionus
Copy link
Owner

@bonktree I wrote a short article researching the behavior of different shell implementations when parsing and calculating arithmetic values close to the maximum. The result surprised me.

https://github.com/legionus/libshell/blob/master/Documentation/int64_max.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants