It would be enough to use relaxed ordering here if it weren't for
the mutex, which we also need to store and retrieve. To ensure the
pthread_cond_broadcast() call sees the store, use release and acquire
as appropriate. Thankfully, both of these are on the slow paths.
This implementation features a fast path for pthread_cond_signal() and
pthread_cond_broadcast() for the case there's no thread waiting, and
does not exhibit the "thundering herd" issue in
pthread_cond_broadcast().
Fixes https://github.com/SerenityOS/serenity/issues/8432