Previously, we were sending Buffers to the server whenever we had new
audio data for it. This meant that for every audio enqueue action, we
needed to create a new shared memory anonymous buffer, send that
buffer's file descriptor over IPC (+recfd on the other side) and then
map the buffer into the audio server's memory to be able to play it.
This was fine for sending large chunks of audio data, like when playing
existing audio files. However, in the future we want to move to
real-time audio in some applications like Piano. This means that the
size of buffers that are sent need to be very small, as just the size of
a buffer itself is part of the audio latency. If we were to try
real-time audio with the existing system, we would run into problems
really quickly. Dealing with a continuous stream of new anonymous files
like the current audio system is rather expensive, as we need Kernel
help in multiple places. Additionally, every enqueue incurs an IPC call,
which are not optimized for >1000 calls/second (which would be needed
for real-time audio with buffer sizes of ~40 samples). So a fundamental
change in how we handle audio sending in userspace is necessary.
This commit moves the audio sending system onto a shared single producer
circular queue (SSPCQ) (introduced with one of the previous commits).
This queue is intended to live in shared memory and be accessed by
multiple processes at the same time. It was specifically written to
support the audio sending case, so e.g. it only supports a single
producer (the audio client). Now, audio sending follows these general
steps:
- The audio client connects to the audio server.
- The audio client creates a SSPCQ in shared memory.
- The audio client sends the SSPCQ's file descriptor to the audio server
with the set_buffer() IPC call.
- The audio server receives the SSPCQ and maps it.
- The audio client signals start of playback with start_playback().
- At the same time:
- The audio client writes its audio data into the shared-memory queue.
- The audio server reads audio data from the shared-memory queue(s).
Both sides have additional before-queue/after-queue buffers, depending
on the exact application.
- Pausing playback is just an IPC call, nothing happens to the buffer
except that the server stops reading from it until playback is
resumed.
- Muting has nothing to do with whether audio data is read or not.
- When the connection closes, the queues are unmapped on both sides.
This should already improve audio playback performance in a bunch of
places.
Implementation & commit notes:
- Audio loaders don't create LegacyBuffers anymore. LegacyBuffer is kept
for WavLoader, see previous commit message.
- Most intra-process audio data passing is done with FixedArray<Sample>
or Vector<Sample>.
- Improvements to most audio-enqueuing applications. (If necessary I can
try to extract some of the aplay improvements.)
- New APIs on LibAudio/ClientConnection which allows non-realtime
applications to enqueue audio in big chunks like before.
- Removal of status APIs from the audio server connection for
information that can be directly obtained from the shared queue.
- Split the pause playback API into two APIs with more intuitive names.
I know this is a large commit, and you can kinda tell from the commit
message. It's basically impossible to break this up without hacks, so
please forgive me. These are some of the best changes to the audio
subsystem and I hope that that makes up for this :yaktangle: commit.
:yakring:
With the following change in how we send audio, the old Buffer type is
not really needed anymore. However, moving WavLoader to the new system
is a bit more involved and out of the scope of this PR. Therefore, we
need to keep Buffer around, but to make it clear that it's the old
buffer type which will be removed soon, we rename it to LegacyBuffer.
Most of the users will be gone after the next commit anyways.
When computing the 'output mix', the Mixer iterates over all client
audio streams and computes a 'mixed sample' taking into account mainly
the client's volume.
This new member and methods will allow us to ignore a muted client
when computing that mix.
The `m_remaining_samples` attribute was underflowing at the end of an
audio stream. This fix guards against the underflow by only decrementing
the attribute when it is greater than zero.
I found this bug because the SoundPlayer userland application was not
correctly detecting when an audio stream was completed. This was
happening because the remaining samples being returned from the client
audio connection was an underflowed 16 bit integer instead of zero.
Executing `asctl set r 96000` no longer results in weird sample rates
being set on the audio devices. SB16 checks for a sample rate between 1
and 44100 Hz, while AC97 implements double-rate support which allows
sample rates between 8kHz and 96kHZ.
"Frame" is an MPEG term, which is not only unintuitive but also
overloaded with different meaning by other codecs (e.g. FLAC).
Therefore, use the standard term Sample for the central audio structure.
The class is also extracted to its own file, because it's becoming quite
large. Bundling these two changes means not distributing similar
modifications (changing names and paths) across commits.
Co-authored-by: kleines Filmröllchen <malu.bertsch@gmail.com>
Derivatives of Core::Object should be constructed through
ClassName::construct(), to avoid handling ref-counted objects with
refcount zero. Fixing the visibility means that misuses like this are
more difficult.
This commit is separate from the other Servives changes because it
required additional adaption of the code. Note that the old code did
precisely what these changes try to prevent: Create and handle a
ref-counted object with a refcount of zero.
Previously, AudioServer would deadlock when trying to play another audio
stream, i.e. creating a queue. The m_pending_cond condition was used
improperly, and the condition wait now happens independently of querying
the pending queue for new clients if the mixer is running.
To make the mixer's concurrency-safety code more readable, the use of
raw POSIX mutex and condition syscalls is replaced with Threading::Mutex
and Threading::ConditionVariable.
Across the entire audio system, audio now works in 0-1 terms instead of
0-100 as before. Therefore, volume is now a double instead of an int.
The master volume of the AudioServer changes smoothly through a
FadingProperty, preventing clicks. Finally, volume computations are done
with logarithmic scaling, which is more natural for the human ear.
Note that this could be 4-5 different commits, but as they change each
other's code all the time, it makes no sense to split them up.
AudioServer loads its settings, currently volume and mute state, from a
user config file "Audio.ini". Additionally, the current settings are
stored every ten seconds, if necessary. This allows for persistent audio
settings in between boots.
This way we don't have to allocate this at runtime. I'm intentionally
not using static constexpr here because that would put the variable
into the .rodata segment and would therefore increase the binary by
4kB.
The old code also failed to free() the buffer in the destructor, however
that wasn't much of an issue because the Mixer object exists throughout
the program's entire lifetime.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.
See: https://spdx.dev/resources/use/#identifiers
This was done with the `ambr` search and replace tool.
ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
Because it's what it really is. A frame is composed of 1 or more samples, in
the case of SerenityOS 2 (stereo). This will make it less confusing for
future mantainability.