This change was a long time in the making ever since we obtained sample
rate awareness in the system. Now, each client has its own sample rate,
accessible via new IPC APIs, and the device sample rate is only
accessible via the management interface. AudioServer takes care of
resampling client streams into the device sample rate. Therefore, the
main improvement introduced with this commit is full responsiveness to
sample rate changes; all open audio programs will continue to play at
correct speed with the audio resampled to the new device rate.
The immediate benefits are manifold:
- Gets rid of the legacy hardware sample rate IPC message in the
non-managing client
- Removes duplicate resampling and sample index rescaling code
everywhere
- Avoids potential sample index scaling bugs in SoundPlayer (which have
happened many times before) and fixes a sample index scaling bug in
aplay
- Removes several FIXMEs
- Reduces amount of sample copying in all applications (especially
Piano, where this is critical), improving performance
- Reduces number of resampling users, making future API changes (which
will need to happen for correct resampling to be implemented) easier
I also threw in a simple race condition fix for Piano's audio player
loop.
This is a sensible separation of concerns that mirrors the WindowServer
IPC split. On the one hand, there is the "normal" audio interface, used
for clients that play audio, which is the primary service of
AudioServer. On the other hand, there is the management interface,
which, like the WindowManager endpoint, provides higher-level control
over clients and the server itself.
The reasoning for this split are manifold, as mentioned we are mirroring
the WindowServer split. Another indication to the sensibility of the
split is that no single audio client used the APIs of both interfaces.
Also, useless audio queues are no longer created for managing clients
(since those don't even exist, just like there's no window backing
bitmap for window managing clients), eliminating any bugs that may occur
there as they have in the past.
Implementation-wise, we just move all the APIs and implementations from
the old AudioServer into the AudioManagerServer (and respective clients,
of course). There is one point of duplication, namely the hardware
sample rate. This will be fixed in combination with per-client sample
rate, eliminating client-side resampling and the related update bugs.
For now, we keep one legacy API to simplify the transition.
The new AudioManagerServer also gains a hardware sample rate change
callback to have exact symmetry on the main server parameters (getter,
setter, and callback).
Previously we would exit the dequeuing loop after just one buffer
had been dequeued due to some bogus logic. This would manifest
when stopping and starting a track in SoundPlayer, where a few
miliseconds of 'old' audio would play when restarting the playback.
This commit makes sure we clear the entire queue.
Because IPC is used very little in audio server communication, a
ping-pong method like WindowServer is neither a good nor a reliable way
of detecting detached audio clients. AudioServer was previously doing
nothing to detect the kinds of clients that never closed their
connection properly, which happens e.g. when a program is force-closed.
Due to reference-counting cycles, the associated client connection
queues were being kept alive. However, the is_open method of local
sockets reliably detects all kinds of disconnected sockets and can
easily be adapted for this use case. With this fix, we no longer get
"Audio client can't keep up" spam on improperly disconnected clients,
and the client queues don't fill up indefinitely, reducing processing
and memory usage in AudioServer.
The buffer provided to `OutputMemoryStream` was made a private class
member. This is because there is no reason to re-create it in every
iteration. Also, the logic becomes more symmetric with
`m_zero_filled_buffer` which is already a class member.
Initialize the `AudioServer::Mixer::m_zero_filled_buffer` to zero. The
garbage memory inside that buffer was causing a glitch sound when the
user was toggling the mute checkbox or was moving the volume slider on
and off zero. Glitching was more obvious if the toggling was happening
without any sound being played in parallel.
In addition to that, the `m_zero_filled_buffer` turned to `const` since
there is no intention to modify its content.
This removes some old cruft to refactor the hardware buffer-related
datastructures into depending on a single constant, which determines the
number of samples per hardware buffer that the audio server mixes. This
is set to 1024 as before, so there are no functional changes.
The file is now renamed to Queue.h, and the Resampler APIs with
LegacyBuffer are also removed. These changes look large because nobody
actually needs Buffer.h (or Queue.h). It was mostly transitive
dependencies on the massive list of includes in that header, which are
now almost all gone. Instead, we include common things like Sample.h
directly, which should give faster compile times as very few files
actually need Queue.h.
Previously, we were sending Buffers to the server whenever we had new
audio data for it. This meant that for every audio enqueue action, we
needed to create a new shared memory anonymous buffer, send that
buffer's file descriptor over IPC (+recfd on the other side) and then
map the buffer into the audio server's memory to be able to play it.
This was fine for sending large chunks of audio data, like when playing
existing audio files. However, in the future we want to move to
real-time audio in some applications like Piano. This means that the
size of buffers that are sent need to be very small, as just the size of
a buffer itself is part of the audio latency. If we were to try
real-time audio with the existing system, we would run into problems
really quickly. Dealing with a continuous stream of new anonymous files
like the current audio system is rather expensive, as we need Kernel
help in multiple places. Additionally, every enqueue incurs an IPC call,
which are not optimized for >1000 calls/second (which would be needed
for real-time audio with buffer sizes of ~40 samples). So a fundamental
change in how we handle audio sending in userspace is necessary.
This commit moves the audio sending system onto a shared single producer
circular queue (SSPCQ) (introduced with one of the previous commits).
This queue is intended to live in shared memory and be accessed by
multiple processes at the same time. It was specifically written to
support the audio sending case, so e.g. it only supports a single
producer (the audio client). Now, audio sending follows these general
steps:
- The audio client connects to the audio server.
- The audio client creates a SSPCQ in shared memory.
- The audio client sends the SSPCQ's file descriptor to the audio server
with the set_buffer() IPC call.
- The audio server receives the SSPCQ and maps it.
- The audio client signals start of playback with start_playback().
- At the same time:
- The audio client writes its audio data into the shared-memory queue.
- The audio server reads audio data from the shared-memory queue(s).
Both sides have additional before-queue/after-queue buffers, depending
on the exact application.
- Pausing playback is just an IPC call, nothing happens to the buffer
except that the server stops reading from it until playback is
resumed.
- Muting has nothing to do with whether audio data is read or not.
- When the connection closes, the queues are unmapped on both sides.
This should already improve audio playback performance in a bunch of
places.
Implementation & commit notes:
- Audio loaders don't create LegacyBuffers anymore. LegacyBuffer is kept
for WavLoader, see previous commit message.
- Most intra-process audio data passing is done with FixedArray<Sample>
or Vector<Sample>.
- Improvements to most audio-enqueuing applications. (If necessary I can
try to extract some of the aplay improvements.)
- New APIs on LibAudio/ClientConnection which allows non-realtime
applications to enqueue audio in big chunks like before.
- Removal of status APIs from the audio server connection for
information that can be directly obtained from the shared queue.
- Split the pause playback API into two APIs with more intuitive names.
I know this is a large commit, and you can kinda tell from the commit
message. It's basically impossible to break this up without hacks, so
please forgive me. These are some of the best changes to the audio
subsystem and I hope that that makes up for this :yaktangle: commit.
:yakring:
With the following change in how we send audio, the old Buffer type is
not really needed anymore. However, moving WavLoader to the new system
is a bit more involved and out of the scope of this PR. Therefore, we
need to keep Buffer around, but to make it clear that it's the old
buffer type which will be removed soon, we rename it to LegacyBuffer.
Most of the users will be gone after the next commit anyways.
When computing the 'output mix', the Mixer iterates over all client
audio streams and computes a 'mixed sample' taking into account mainly
the client's volume.
This new member and methods will allow us to ignore a muted client
when computing that mix.
The `m_remaining_samples` attribute was underflowing at the end of an
audio stream. This fix guards against the underflow by only decrementing
the attribute when it is greater than zero.
I found this bug because the SoundPlayer userland application was not
correctly detecting when an audio stream was completed. This was
happening because the remaining samples being returned from the client
audio connection was an underflowed 16 bit integer instead of zero.
Executing `asctl set r 96000` no longer results in weird sample rates
being set on the audio devices. SB16 checks for a sample rate between 1
and 44100 Hz, while AC97 implements double-rate support which allows
sample rates between 8kHz and 96kHZ.
"Frame" is an MPEG term, which is not only unintuitive but also
overloaded with different meaning by other codecs (e.g. FLAC).
Therefore, use the standard term Sample for the central audio structure.
The class is also extracted to its own file, because it's becoming quite
large. Bundling these two changes means not distributing similar
modifications (changing names and paths) across commits.
Co-authored-by: kleines Filmröllchen <malu.bertsch@gmail.com>
Derivatives of Core::Object should be constructed through
ClassName::construct(), to avoid handling ref-counted objects with
refcount zero. Fixing the visibility means that misuses like this are
more difficult.
This commit is separate from the other Servives changes because it
required additional adaption of the code. Note that the old code did
precisely what these changes try to prevent: Create and handle a
ref-counted object with a refcount of zero.
Previously, AudioServer would deadlock when trying to play another audio
stream, i.e. creating a queue. The m_pending_cond condition was used
improperly, and the condition wait now happens independently of querying
the pending queue for new clients if the mixer is running.
To make the mixer's concurrency-safety code more readable, the use of
raw POSIX mutex and condition syscalls is replaced with Threading::Mutex
and Threading::ConditionVariable.
Across the entire audio system, audio now works in 0-1 terms instead of
0-100 as before. Therefore, volume is now a double instead of an int.
The master volume of the AudioServer changes smoothly through a
FadingProperty, preventing clicks. Finally, volume computations are done
with logarithmic scaling, which is more natural for the human ear.
Note that this could be 4-5 different commits, but as they change each
other's code all the time, it makes no sense to split them up.
AudioServer loads its settings, currently volume and mute state, from a
user config file "Audio.ini". Additionally, the current settings are
stored every ten seconds, if necessary. This allows for persistent audio
settings in between boots.
This way we don't have to allocate this at runtime. I'm intentionally
not using static constexpr here because that would put the variable
into the .rodata segment and would therefore increase the binary by
4kB.
The old code also failed to free() the buffer in the destructor, however
that wasn't much of an issue because the Mixer object exists throughout
the program's entire lifetime.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Because it's what it really is. A frame is composed of 1 or more samples, in
the case of SerenityOS 2 (stereo). This will make it less confusing for
future mantainability.