This splits the RIFFTypes header/TU into the WAV specific parts, which
move to WavTypes.h, as well as the general RIFF parts which move to the
new LibRIFF.
Sidenote for the spec comments: even though they are linked from a site
that explains the WAV format, the document is the (an) overall RIFF spec
from Microsoft. A better source may be used later; the changes to the
header are as minimal as possible.
https://developer.apple.com/documentation/audiounit
Apple has a number of audio frameworks we could use. This uses the Audio
Unit framework, as it gives us most control over the rendering of the
audio frames (such as being able to quickly pause / discard buffers).
From some reading, we could implement niceties such as fading playback
in and out while seeking over a short (10ms) period. This patch does not
implement such fancy features though.
Co-Authored-By: Andrew Kaster <akaster@serenityos.org>
This implementation is very naive compared to the PulseAudio one.
Instead of using a callback implemented by the audio server connection
to push audio to the buffer, we have to poll on a timer to check when
we need to push the audio buffers. Implementing cross-process condition
variables into the audio queue class could allow us to avoid polling,
which may prove beneficial to CPU usage.
Audio timestamps will be accurate to the number of samples available,
but will count in increments of about 100ms and run ahead of the actual
audio being pushed to the device by the server.
Buffer underruns are completely ignored for now as well, since the
`AudioServer` has no way to know how many samples are actually written
in a single audio buffer.
This encoder can handle all integer formats and sample rates, though
only two channels well. It uses fixed LPC and performs a
close-to-optimal parameter search on the LPC order and residual Rice
parameter, leading to decent compression already.
This adds an abstract `Audio::PlaybackStream` class to allow cross-
platform audio playback to be done in an opaque manner by applications
in both Serenity and Lagom.
Currently, the only supported audio API is PulseAudio, but a Serenity
implementation should be added shortly as well.
This is a sensible separation of concerns that mirrors the WindowServer
IPC split. On the one hand, there is the "normal" audio interface, used
for clients that play audio, which is the primary service of
AudioServer. On the other hand, there is the management interface,
which, like the WindowManager endpoint, provides higher-level control
over clients and the server itself.
The reasoning for this split are manifold, as mentioned we are mirroring
the WindowServer split. Another indication to the sensibility of the
split is that no single audio client used the APIs of both interfaces.
Also, useless audio queues are no longer created for managing clients
(since those don't even exist, just like there's no window backing
bitmap for window managing clients), eliminating any bugs that may occur
there as they have in the past.
Implementation-wise, we just move all the APIs and implementations from
the old AudioServer into the AudioManagerServer (and respective clients,
of course). There is one point of duplication, namely the hardware
sample rate. This will be fixed in combination with per-client sample
rate, eliminating client-side resampling and the related update bugs.
For now, we keep one legacy API to simplify the transition.
The new AudioManagerServer also gains a hardware sample rate change
callback to have exact symmetry on the main server parameters (getter,
setter, and callback).
With this, the WAV loader is a completely modern LibAudio loader:
- Own type header for RIFF data structures
- custom stream read functions for the types
- Final removal of legacy I/O error checking
- clearer error messages
- clean handling of header chunks
The latter will allow proper handling of other chunks (before "data") in
the future, such as metadata :^)
"Improve" is an understatement, since this commit makes all FLAC files
seek without errors, mostly with high accuracy, and partially even fast:
- A new generic seek table type is introduced, which keeps an
always-sorted list of seek points, which allows it to use binary
search and fast insertion.
- Automatic seek points are inserted according to two heuristics
(distance between seek points and minimum seek precision), which not
only builds a seek table for already-played sections of the file, but
improves seek precision even for files with an existing seek table.
- Manual seeking by skipping frames works properly now and is still used
as a last resort.
This container has several design goals:
- Represent all common and relevant metadata fields of audio files in a
unified way.
- Allow perfect recreation of any metadata format from the in-memory
structure. This requires that we allow non-detected fields to reside
in an "untyped" miscellaneous collection.
Like with pictures, plugins are free to store their metadata into the
m_metadata field whenever they read it. It is recommended that this
happens on loader creation; however failing to read metadata should not
cause an error in the plugin.
Brought to you by the inventor of QOI, QOA is a lossy audio codec that
is, as the name says, quite okay in compressing audio data reasonably
well without frequency transformation, mostly introducing some white
noise in the background. This implementation of QOA is fully compatible
with the qoa.h reference implementation as of 2023-02-25. Note that
there may be changes to the QOA format before a specification is
finalized, and there is currently no information on when that will
happen and which changes will be made.
This implementation of QOA can handle varying sample rate and varying
channel count files. The reference implementation does not produce these
files and cannot handle them, so their implementation is untested.
The QOA loader is capable of seeking in constant-bitrate streams.
QOA links:
https://phoboslab.org/log/2023/02/qoa-time-domain-audio-compressionhttps://github.com/phoboslab/qoa
Otherwise, we end up propagating those dependencies into targets that
link against that library, which creates unnecessary link-time
dependencies.
Also included are changes to readd now missing dependencies to tools
that actually need them.
Also do this for Shell.
This greatly simplifies the CMakeLists in Lagom, replacing many glob
patterns with a big list of libraries. There are still a few special
libraries that need some help to conform to the pattern, like LibELF and
LibWebView.
It also lets us remove essentially all of the Serenity or Lagom binary
directory detection logic from code generators, as now both projects
directories enter the generator logic from the same place.
The file is now renamed to Queue.h, and the Resampler APIs with
LegacyBuffer are also removed. These changes look large because nobody
actually needs Buffer.h (or Queue.h). It was mostly transitive
dependencies on the massive list of includes in that header, which are
now almost all gone. Instead, we include common things like Sample.h
directly, which should give faster compile times as very few files
actually need Queue.h.
Previously, we were sending Buffers to the server whenever we had new
audio data for it. This meant that for every audio enqueue action, we
needed to create a new shared memory anonymous buffer, send that
buffer's file descriptor over IPC (+recfd on the other side) and then
map the buffer into the audio server's memory to be able to play it.
This was fine for sending large chunks of audio data, like when playing
existing audio files. However, in the future we want to move to
real-time audio in some applications like Piano. This means that the
size of buffers that are sent need to be very small, as just the size of
a buffer itself is part of the audio latency. If we were to try
real-time audio with the existing system, we would run into problems
really quickly. Dealing with a continuous stream of new anonymous files
like the current audio system is rather expensive, as we need Kernel
help in multiple places. Additionally, every enqueue incurs an IPC call,
which are not optimized for >1000 calls/second (which would be needed
for real-time audio with buffer sizes of ~40 samples). So a fundamental
change in how we handle audio sending in userspace is necessary.
This commit moves the audio sending system onto a shared single producer
circular queue (SSPCQ) (introduced with one of the previous commits).
This queue is intended to live in shared memory and be accessed by
multiple processes at the same time. It was specifically written to
support the audio sending case, so e.g. it only supports a single
producer (the audio client). Now, audio sending follows these general
steps:
- The audio client connects to the audio server.
- The audio client creates a SSPCQ in shared memory.
- The audio client sends the SSPCQ's file descriptor to the audio server
with the set_buffer() IPC call.
- The audio server receives the SSPCQ and maps it.
- The audio client signals start of playback with start_playback().
- At the same time:
- The audio client writes its audio data into the shared-memory queue.
- The audio server reads audio data from the shared-memory queue(s).
Both sides have additional before-queue/after-queue buffers, depending
on the exact application.
- Pausing playback is just an IPC call, nothing happens to the buffer
except that the server stops reading from it until playback is
resumed.
- Muting has nothing to do with whether audio data is read or not.
- When the connection closes, the queues are unmapped on both sides.
This should already improve audio playback performance in a bunch of
places.
Implementation & commit notes:
- Audio loaders don't create LegacyBuffers anymore. LegacyBuffer is kept
for WavLoader, see previous commit message.
- Most intra-process audio data passing is done with FixedArray<Sample>
or Vector<Sample>.
- Improvements to most audio-enqueuing applications. (If necessary I can
try to extract some of the aplay improvements.)
- New APIs on LibAudio/ClientConnection which allows non-realtime
applications to enqueue audio in big chunks like before.
- Removal of status APIs from the audio server connection for
information that can be directly obtained from the shared queue.
- Split the pause playback API into two APIs with more intuitive names.
I know this is a large commit, and you can kinda tell from the commit
message. It's basically impossible to break this up without hacks, so
please forgive me. These are some of the best changes to the audio
subsystem and I hope that that makes up for this :yaktangle: commit.
:yakring:
The Buffer files had contained both the ResampleHelper and the
sample format utilities. Because the Buffer class (and its file) is
going to be deleted soon, this commit separates those two things into
their own files.
This commit adds a loader for the FLAC audio codec, the Free Lossless
Audio codec by the Xiph.Org foundation. LibAudio will automatically
read and parse FLAC files, so users do not need to adjust.
This implementation is bare-bones and needs to be improved upon.
There are many bugs, verbatim subframes and any kind of seeking is
not supported. However, stereo files exported by libavcodec on
highest compression setting seem to work well.