Add StateTracker::wait_state() and converted linux semaphore timed wait into monotonic clock domain #814

Flw5469 · 2025-05-21T12:30:40Z

Implement #749 based on #769.
Fix according to #795 so the above implementation works

Make the Waiting Threads waiting using condition variables according to Add StateTracker::wait_state() with lock-free wakeup #749 (comment)
According to Update Semaphore::timed_wait with sem_clockwait and converting domain #797 (comment), added compile time check in SConstruct for whether sem_clockedwait exist, set the clock of timedwait in POSIX platforms into CLOCK_MONOTONIC domain. Document of checkFunc can be found at https://scons.org/doc/2.3.1/HTML/scons-api/SCons.Conftest-pysrc.html#CheckFunc
Added simple test cases of state_tracker's timeout and blocking.
Added simple test case of POSIX semaphore's timeout.

…st cases

…e and add checking of sem_clockwait in compile time.

rocstreaming-bot · 2025-05-21T12:35:49Z

🤖 Welcome! Thanks for your interest in contributing to the project!

Here is a short check-list to help you get started:

Creating pull request

Target PR to develop branch.
Include link to related issue in PR description.
Ensure all CI checks pass.

Code review

Mark PR as draft until it's ready. When ready, undraft and request review.
Don't resolve discussions by yourself, instead leave a comment or thumbs up.
Re-request review after addressing all discussions.

Refer to contribution guidelines for futher details.

rocstreaming-bot · 2025-06-05T16:05:35Z

🤖 Pull request is currently unmergeable due to conflicts.
Please rebase on up-to-date upstream branch, resolve merge conflicts, and force-push to pull request's branch. Remember to use rebase with force-push instead of a regular merge.

gavv

Thanks for PR! Here is my review.

Also sorry for merge conflicts, Atomic<T> was recently split into AtomicInt<T> + AtomicBool.

gavv · 2025-06-06T05:38:35Z

SConstruct

+#Check for existence of function sem_clockwait 
+    temp_conf = Configure(env)
+    header = """
+#ifdef __cplusplus
+extern "C"
+#endif
+char sem_timedwait(void);
+"""
+
+    if temp_conf.CheckFunc('sem_clockwait', header):
+        env.Append(CPPDEFINES=['ROC_HAVE_SEM_CLOCKWAIT'])


Is there a reason why you defined the function manually instead of including semaphore.h?

If no, I suggest to use include. Also let's move it to the section where we do other checks and already have conf:

diff --git a/SConstruct b/SConstruct index f01da593..a5da8609 100644 --- a/SConstruct +++ b/SConstruct @@ -687,6 +687,9 @@ if meta.platform in ['linux', 'unix']: elif meta.platform in ['android']: meta.gnu_toolchain = True +if conf.CheckFunc('sem_clockwait', header="#include <semaphore.h>\n"): + conf.env.Append(CPPDEFINES=['ROC_HAVE_SEM_CLOCKWAIT']) + conf.env['ROC_SYSTEM_BINDIR'] = GetOption('bindir') conf.env['ROC_SYSTEM_INCDIR'] = GetOption('incdir') @@ -955,20 +958,6 @@ if meta.compiler in ['gcc', 'clang']: if meta.platform in ['linux', 'darwin']: env.AddManualDependency(libs=['pthread']) -#Check for existence of function sem_clockwait - temp_conf = Configure(env) - header = """ -#ifdef __cplusplus -extern "C" -#endif -char sem_timedwait(void); -""" - - if temp_conf.CheckFunc('sem_clockwait', header): - env.Append(CPPDEFINES=['ROC_HAVE_SEM_CLOCKWAIT']) - - - if meta.platform in ['linux', 'android'] or meta.gnu_toolchain: if not GetOption('disable_soversion'): subenvs.public_libs['SHLIBSUFFIX'] = '{}.{}'.format(

gavv · 2025-06-06T07:04:34Z

src/internal_modules/roc_core/target_posix_ext/roc_core/semaphore.cpp

+#else
        if (sem_timedwait(&sem_, &ts) == 0) {
            return true;
        }


After giving this second thought, it seems that using sem_timedwait() is a really bad thing to use when we don't have sem_clockwait():

If system time changes after we compute converted_deadline, but before we call sem_timedwait, there is a risk to sleep practically forever.

If system time changes during sem_timedwait, depending on platform, there is the same risk, depending om platform. Some platforms will convert deadline to relative time, but some will wait until absolute time point taking account clock changes - which is the worst for us. BTW is seems it's how Linux works.

Then I checked what is the actual availability of sem_clockwait() on other POSIX OSes:

Linux has it

FreeBSD and QNX have close alternatives (sem_clockwait_np and sem_timedwait_monotonic)

seems that OpenBSD, NetBSD, Solaris, Redox don't have it

macOS doesn't have it but we don't need it there

This means that we're going to use sem_timedwait on quite a few platforms and our code will potentially hang on system time change.

An alternative is to implement semaphore using a cond variable on platforms without sem_clockwait(). POSIX cond vars also have timedwait variant (pthread_cond_timedwait). By default it uses CLOCK_REALTIME and so has same problem as sem_timedwait. But if platform support pthread_condattr_setclock(), you can use CLOCK_MONOTONIC.

The good news are that pthread_condattr_setclock(CLOCK_MONOTONIC) is supported on much more platforms compared to sem_clockwait: actually everything listed above except macOS have it (and macOS has non-portable alternative).

Long story short, I think when sem_clockwait() is not available, we should implement semaphore using a mutex + condition variable, instead of sem_timedwait() - this will give us correct code on much more platforms.

(Sorry that I haven't realized it earlier).

So I suggest to create two separate semaphore implementations in two target directories:

target_posix_sem - current implementation, but it will unconditionally use sem_clockwait()

target_nosem - fallback implementation for platforms without proper semaphores, using mutex and condvar

(and we also have target_darwin which implements semaphore for macOS)

We can do it like this:

diff --git a/SConstruct b/SConstruct index f01da593..9904c913 100644 --- a/SConstruct +++ b/SConstruct @@ -794,9 +794,15 @@ else: ]) if meta.platform in ['linux', 'android', 'unix']: - env.Append(ROC_TARGETS=[ - 'target_posix_ext', - ]) + if 'ROC_HAVE_SEM_CLOCKWAIT' in env['CPPDEFINES']: + env.Append(ROC_TARGETS=[ + 'target_posix_sem', + ]) + else: + env.Append(ROC_TARGETS=[ + 'target_nosem', + ]) if meta.platform in ['linux', 'android', 'darwin'] or meta.gnu_toolchain: env.Append(ROC_TARGETS=[

We can rename src/internal_modules/roc_core/target_posix_ext to target_posix_sem (it doesn't have anything except semaphore) and enable it only if sem_clockwait() is present.

The fallback semaphore implementation can be placed into src/internal_modules/roc_core/target_nosem.

We already have core::Mutex and core::Cond with properly implemented timed_wait().

What do you think, are you willing to implement this?

gavv · 2025-06-06T07:05:26Z

src/internal_modules/roc_core/target_posix_ext/roc_core/semaphore.cpp

+    roc_log(roc::LogDebug, "origin time is %" PRId64 "\n", deadline);
+    roc_log(roc::LogDebug, "time is %" PRId64 "\n", converted_deadline);
+    roc_log(roc::LogDebug, "now time is %" PRId64 "\n",
+            core::timestamp(core::ClockMonotonic));


This method can be in hot path and called very frequently, so we shouldn't use logging here.

gavv · 2025-06-06T07:08:56Z

src/tests/roc_pipeline/test_state_tracker.cpp

+    delete[] threads_ptr;
+}
+
+TEST(state_tracker, semaphore_test) {


nit: Please move this to its own file (src/tests/roc_core/test_semaphore.cpp)

gavv · 2025-06-06T07:09:25Z

src/tests/roc_pipeline/test_state_tracker.cpp

+    if (sem.timed_wait(1 * core::Second + core::timestamp(core::ClockMonotonic)))
+        roc_log(LogDebug, "true, unlocked by other threads");


nit: Please always use {} with if/while/etc

gavv · 2025-06-07T07:37:24Z