The thread numbering on a POWER9 big-core interleaves the threads of its constituent small cores, i.e., thread-ids {0, 2, 4, 6} belong to one small core, while thread-ids {1, 3, 5, 7} belong to the other small core in the same big-core.
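
The interleaving described above can be sketched as a small mapping from thread-id to core. This is only an illustration, not kernel code: the helper names are hypothetical, and it assumes 8 threads per big-core, 2 small cores per big-core, and thread-ids numbered contiguously from one big-core to the next.

```python
# Illustrative sketch only; names and constants are assumptions.
THREADS_PER_BIG_CORE = 8        # assumed: thread-ids 0..7 share big-core 0
SMALL_CORES_PER_BIG_CORE = 2    # two small cores fused into one big-core


def big_core(tid):
    """Index of the big-core that a thread-id belongs to."""
    return tid // THREADS_PER_BIG_CORE


def small_core(tid):
    """(big-core index, small-core index within that big-core)."""
    # Threads are interleaved: even ids on one small core, odd ids on the other.
    return (big_core(tid), tid % SMALL_CORES_PER_BIG_CORE)


def relation(a, b):
    """Describe how two thread-ids are related topologically."""
    if big_core(a) != big_core(b):
        return "different big-cores"
    if small_core(a) != small_core(b):
        return "same big-core, different small cores"
    return "same small core"


for pair in [(0, 1), (0, 2), (0, 8)]:
    print(pair, relation(*pair))
```

Under these assumptions, threads 0 and 2 map to the same small core, 0 and 1 to different small cores of the same big-core, and 0 and 8 to different big-cores.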
The context-switch counts observed with the context_switch2 benchmark (https://ozlabs.org/~anton/junkcode/context_switch2.c) for the CPU pairs
(0, 1) (two different small cores of the same big-core)
(0, 2) (same small core)
(0, 8) (two different big-cores)
are as follows:
# ./context_switch2 0 1 --timeout=5
354950
354112
353290
349704
350604
# ./context_switch2 0 2 --timeout=5
269772
269692
269420
269878
269702
# ./context_switch2 0 8 --timeout=5
289174
287990
288556
288812
288678
Thus the number of context-switches with (0, 2) is lower than with (0, 1) and with (0, 8). The reason for this behaviour needs to be understood.
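
To put a number on the gap, the five samples per CPU pair listed above can be averaged and compared against the (0, 2) case. The snippet below is a rough sketch that uses only the figures from the runs above; the variable names are illustrative.

```python
from statistics import mean

# Samples copied from the context_switch2 runs above, keyed by CPU pair.
samples = {
    (0, 1): [354950, 354112, 353290, 349704, 350604],
    (0, 2): [269772, 269692, 269420, 269878, 269702],
    (0, 8): [289174, 287990, 288556, 288812, 288678],
}

means = {pair: mean(values) for pair, values in samples.items()}
baseline = means[(0, 2)]  # same-small-core case used as the reference

for pair, m in means.items():
    print(f"{pair}: mean={m:.1f}, relative to (0, 2): {m / baseline:.3f}x")
```

On these samples the (0, 1) pair averages roughly 31% more context-switches than (0, 2), and the (0, 8) pair roughly 7% more.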