We can just create a `BigEndianInputBitStream` where it is needed
instead of storing one in the class. After making `read_header()`
static, the only other user of the field was `read_size_information()`,
so let's do that there and then remove the field.
This makes the checks for a frame header more consistent, so if the
conditions for allowed frame headers change, there are fewer scattered
lines that will need to be changed.
`synchronize()` will now also properly scan the second byte of the hex
sequence `FF FF F0` as a sync code, where previously it would see
`FF F` and skip on to `F0`, ignoring its preceding `FF` that would
indicate that it is a sync code.
`synchronize()` can be simplified greatly by checking whole bytes with
bitwise operations, and doing so also avoids the overhead of reading
individual bits from a bitstream.
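As a rough illustration of the whole-byte check described above, here is
a minimal sketch using only the C++ standard library. The function name
`find_sync_code` and the use of `std::vector` are assumptions made for
the example and are not the actual LibAudio code. A caller that rejects
a candidate at offset `n` would resume the scan at `n + 1`, which is
what lets the second byte of `FF FF F0` be considered.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: returns the offset of the first byte pair at or after
// `start` that looks like `FF Fx`, or nothing if no such pair exists.
std::optional<size_t> find_sync_code(std::vector<uint8_t> const& data, size_t start = 0)
{
    for (size_t i = start; i + 1 < data.size(); ++i) {
        // First byte must be all ones, second byte must have its top four bits set.
        if (data[i] == 0xFF && (data[i + 1] & 0xF0) == 0xF0)
            return i;
    }
    return std::nullopt;
}
```

Because every offset is tested on its own, no byte is skipped after a
partial match.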
Making `read_frame()` also take a `SeekableStream` will allow it to be
used inside `synchronize()` in the next commit.
Previously, the header size was used to calculate the `slot_count` field
of `MP3::Header`, but `build_seek_table()` just used the maximum size
of the header instead, causing it not to seek far enough, and in cases
where a possible sync code occurred two bytes before the next frame, it
would read that possible sync code as if it was a real frame. It would
then either reject it due to bad field values, or could possibly skip
over the next real frame due to a larger calculated frame size in the
bogus frame.
By fixing this issue, we now properly calculate the duration of MP3
files where these fake sync codes occur. In the case of the raw file
for this podcast:
https://changelog.com/podcast/554
the duration goes from 1:21:57 to 1:22:47, which is the real duration
according to the player user interface.
The seek table must locate the first MP3 frame in the file, so it makes
sense to locate the samples for the sample table first, then use that
information to seek to the first frame.
Previously, we would just start from byte 0 and check individual bytes
of the file until we find two bytes starting with `FF F`, and then
assume that that was the MP3 frame sync code. However, ID3v2 tags are
not required to be what is referred to as "unsynchronized", meaning
that they can contain that `FF F` sequence and cause our decoder to
think it has found a frame.
To avoid this happening, we can read a minimal amount of the ID3 header
to determine how many bytes to skip before attempting to find the MP3
frames.
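A minimal sketch of reading just enough of an ID3v2 header to know how
many bytes to skip. The layout used here (a 10-byte header starting with
"ID3", a flags byte at offset 5, a 4-byte "syncsafe" size with 7 bits
per byte, and an optional 10-byte footer flagged by bit 0x10) follows
the ID3v2 specification. The function name `id3v2_tag_size` is an
assumption for the example and not the name used in LibAudio.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the total number of bytes taken up by an ID3v2
// tag at the start of `data`, or 0 if no valid tag header is found there.
size_t id3v2_tag_size(std::vector<uint8_t> const& data)
{
    constexpr size_t header_size = 10;
    if (data.size() < header_size)
        return 0;
    if (data[0] != 'I' || data[1] != 'D' || data[2] != '3')
        return 0;

    // Bytes 6..9 hold the tag size as a syncsafe integer: 7 bits per byte,
    // with the top bit of every byte required to be zero.
    size_t size = 0;
    for (size_t i = 6; i < header_size; ++i) {
        if (data[i] & 0x80)
            return 0;
        size = (size << 7) | data[i];
    }

    // ID3v2.4 may append a footer that repeats the 10-byte header.
    bool has_footer = (data[5] & 0x10) != 0;
    return header_size + size + (has_footer ? header_size : 0);
}
```

The sync code search would then start at the returned offset instead of
byte 0.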
This allows the recent podcast with Andreas to play here:
https://changelog.com/podcast/554
Seek points were being created after adding to the sample count in
`build_seek_table()`, meaning that they would be offset forward by
`MP3::frame_size` samples.
This also allows us to remove the hardcoded sample 0 seek point that
was previously added, since a seek point at sample 0 will now be added
by the loop.
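The ordering issue can be sketched as follows. This is an illustrative
loop, not the real `build_seek_table()`: the `SeekPoint` struct, the
`build_seek_points` name, and the parameters are assumptions made for
the example. The point is only that the seek point is recorded before
the running sample count is advanced, so the first frame naturally gets
a seek point at sample 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical seek point: the sample a frame starts at and where it is stored.
struct SeekPoint {
    uint64_t sample_index;
    uint64_t byte_offset;
};

// Hypothetical helper: `frame_offsets` holds the byte offset of every frame.
std::vector<SeekPoint> build_seek_points(std::vector<uint64_t> const& frame_offsets,
    uint64_t samples_per_frame, size_t frames_per_seek_point)
{
    assert(frames_per_seek_point > 0);
    std::vector<SeekPoint> points;
    uint64_t sample_count = 0;
    for (size_t i = 0; i < frame_offsets.size(); ++i) {
        // Record the seek point first, so it refers to the first sample of
        // this frame rather than the first sample of the next one.
        if (i % frames_per_seek_point == 0)
            points.push_back({ sample_count, frame_offsets[i] });
        sample_count += samples_per_frame;
    }
    return points;
}
```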
The estimation for this is fast but not very accurate, meaning we save
around 5-10% storage space. (We also don’t try other channel coupling
methods, but I am sceptical of how much benefit that actually provides.)
This encoder can handle all integer formats and sample rates, though
only two channels well. It uses fixed LPC and performs a
close-to-optimal parameter search on the LPC order and residual Rice
parameter, leading to decent compression already.
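For readers unfamiliar with fixed prediction, here is a small sketch of
choosing a predictor order. It assumes the standard FLAC fixed
predictors of order 0 to 4, whose residual is the k-th finite difference
of the signal, and it uses the sum of absolute residuals as a cost. That
cost is a common heuristic chosen for the example. It is not necessarily
the search this encoder performs, and the Rice parameter is left out
entirely. The name `best_fixed_order` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: returns the fixed predictor order (0..4) with the
// smallest sum of absolute residuals. Ties go to the lower order.
size_t best_fixed_order(std::vector<int64_t> const& samples)
{
    constexpr size_t max_order = 4;
    std::vector<int64_t> residual = samples;
    size_t best_order = 0;
    uint64_t best_cost = UINT64_MAX;

    for (size_t order = 0; order <= max_order; ++order) {
        if (order > 0) {
            // Turn the order-(k-1) residual into the order-k residual by
            // differencing in place, walking backwards so that
            // residual[i - 1] still holds its previous value.
            for (size_t i = residual.size(); i-- > order;)
                residual[i] -= residual[i - 1];
        }
        // Skip the first max_order samples for every order so that all
        // orders are compared over the same range.
        uint64_t cost = 0;
        for (size_t i = max_order; i < residual.size(); ++i)
            cost += static_cast<uint64_t>(std::abs(residual[i]));
        if (cost < best_cost) {
            best_cost = cost;
            best_order = order;
        }
    }
    return best_order;
}
```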
This interface is very simple for the time being and can be used to
provide encoding functionality in a generalized way. Initialization and
parameter setting are intentionally not abstracted for now, since this
is usually very format-specific. We just need a general interface for
writing samples and errorable finalization.
WavWriter and the shot utility open files with this mode and never
truncate the files, which might leave some contents of a previous file
during overwriting.
This will ensure that we don't leak any memory while playing back
audio.
There is an expectation value in the test that is only set to true when
PulseAudio is present for the moment. When any new implementation is
added for other libraries/platforms, we should hopefully get a CI
failure due to unexpected success in creating the `PlaybackStream`.
To ensure that we clean up our PulseAudio connection whenever audio
output is not needed, add `PulseAudioContext::weak_instance()` to allow
us to check whether an instance exists without creating one.
If we don't clear the callbacks, they may be called after our functions
are deleted.
Disconnecting the stream also doesn't appear to be done automatically
when calling `pa_stream_unref()` for the last time, so let's do that.
We don't want to pull the stream out from under our PulseAudio main
loop, so call these with the lock to ensure that nothing is touching
them.
The `pa_threaded_mainloop_stop()` call does not require the lock, as it
sets a flag to tell the main loop to exit.
The mutex used to protect from multiple threads creating PulseAudio
contexts simultaneously could remain locked when an application exited.
The static variables' destructors could be called on the main thread
while another thread is running `PulseAudioContext::instance()` and
synchronously connecting to a PulseAudio daemon. This would cause an
assertion in Mutex that it is unlocked upon its destructor being
called.
By creating a static `ScopeGuard` that locks and immediately unlocks
the mutex, we can ensure that the main thread waits for the connection
to succeed or fail. In most cases, this will not take long, but if the
connection is timing out, it could take a matter of seconds.
Now that `Thread` keeps itself alive when it is running detached, we do
not need to hold onto it in the PulseAudio playback stream's internal
state object. This was a hack that did not work correctly because the
`Thread` object and its action `Function` would be deleted before the
action had exited and cause a crash.
This adds an abstract `Audio::PlaybackStream` class to allow cross-
platform audio playback to be done in an opaque manner by applications
in both Serenity and Lagom.
Currently, the only supported audio API is PulseAudio, but a Serenity
implementation should be added shortly as well.
Removes the Sample struct inside Piano and replaces it with the struct
from LibDSP.
It automatically scales the height of the wave depending on the maximum
amplitude, as the Samples now contain floats and not integers.
A previous commit made it so that SeekTable doesn't provide a seek
point from `seek_point_before()` if there is not a seek point before
the requested sample index. However, MP3Loader was only setting a seek
point after the first 10 frames, meaning that it would do nothing when
seeking back to 0.
To fix this, add a seek point at byte 0 for the first sample, so that
`seek_point_before()` will never fail.
Bytes will implicitly cast to StringView, but not to ReadonlyBytes. That
means that if you call
`Audio::Loader::create_plugin(mapped_file->bytes())`
it will silently use the `create_plugin(StringView path)` overload.
Reading audio data does not require that data to be writable, so let's
use ReadonlyBytes for it and avoid the footgun.
- Pre-allocate and reuse sample decompression buffers. In many FLAC
files, the number of samples per frame is either constant or the
largest frame will be hit within the first couple of frames. Also,
during audio output, we need to move and combine the samples from the
decompression buffers into the final output buffers anyways. Avoiding
the reallocation of these large buffers provides an improvement from
16x to 18x decode speed on strongly compressed but otherwise usual
input.
- Leave a FIXME for a similar improvement that can be made in the
residual decoder.
- Pre-allocate audio chunks if frame size is known.
- Use reasonable inline capacities in several places where we know the
maximum or usual capacity needed.
Instead of using a seek tolerance value to get close enough to the
target, we can skip frames forward until we pass the target, then seek
back to the previous frame. That puts us in a position to immediately
decode the frame containing the target sample.
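The idea can be sketched over an in-memory list of frame headers. In the
real loader the frames are read from a stream and "stepping back" is a
seek. The `FrameInfo` struct and `frame_containing` function here are
assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical summary of a frame header.
struct FrameInfo {
    uint64_t first_sample;
    uint64_t byte_offset;
};

// Walks forward from `start_index` until a frame starts past `target`, then
// returns the frame before it. Returns nothing if even the first frame visited
// starts after the target. If the target lies beyond the last frame, the last
// frame is returned and the caller has to check its sample range.
std::optional<FrameInfo> frame_containing(std::vector<FrameInfo> const& frames,
    size_t start_index, uint64_t target)
{
    assert(start_index <= frames.size());
    std::optional<FrameInfo> previous;
    for (size_t i = start_index; i < frames.size(); ++i) {
        if (frames[i].first_sample > target)
            break; // We passed the target, so the previous frame is the one we want.
        previous = frames[i];
    }
    return previous;
}
```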
Previously, the FLAC loader would not skip samples to reach its seek
target if it saw that the current sample in the loader is closer to the
target than the seek point it finds. This prevents seeking forward when
there are no seek points past the current position.
Previously, the calculation of the distance to the previous seekpoint
would always behave as if a seek point existed at sample 0, meaning
that it would never place a seek point there. If we instead treat the
distance as the maximum when no previous seek point is present, a seek
point will be placed.
If the seek table was incomplete, without any seek points available
before the target point, `SeekTable::seek_point_before()` would instead
return the first seek point after the target. Check whether the seek
point is before the target before returning it.
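A sketch of the intended behavior, using `std::upper_bound` over a
sorted vector. The `SeekPoint` struct and the free function form of
`seek_point_before` are assumptions made for the example and do not
mirror the real `SeekTable` class.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical seek point, sorted by sample_index in the table.
struct SeekPoint {
    uint64_t sample_index;
    uint64_t byte_offset;
};

// Returns the last seek point at or before `sample_index`, or nothing if
// every seek point in the table lies after it.
std::optional<SeekPoint> seek_point_before(std::vector<SeekPoint> const& points,
    uint64_t sample_index)
{
    assert(std::is_sorted(points.begin(), points.end(),
        [](SeekPoint const& a, SeekPoint const& b) { return a.sample_index < b.sample_index; }));

    // Find the first seek point strictly after the target...
    auto it = std::upper_bound(points.begin(), points.end(), sample_index,
        [](uint64_t target, SeekPoint const& point) { return target < point.sample_index; });
    // ...and if that is the very first entry, nothing precedes the target.
    if (it == points.begin())
        return std::nullopt;
    return *std::prev(it);
}
```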
We downsample multi-channel files into stereo for now, which at least
makes the other channels listenable. The new multi-channel downmix
helper is intended to be used for other formats with the same or similar
channel arrangement, such as QOA.
Especially FLAC had an issue here before, but the loader infrastructure
itself wouldn't handle end of stream properly if the "available samples"
information didn't match up.
It's no longer needed now that this code uses ErrorOr instead of Result.
Ran:
rg -lw LOADER_TRY Userland/Libraries/LibAudio \
| xargs sed -i '' 's/LOADER_TRY/TRY/g'
...and then manually fixed up Userland/Libraries/LibAudio/LoaderError.h
to not redefine TRY but instead remove the now-unused LOADER_TRY,
and ran clang-format.
For very large seekpoint indices, the casts necessary for the "simple"
subtraction comparison will yield wrong and overflowing results.
Therefore, we perform the seekpoint comparison manually instead.
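To illustrate the failure mode, the sketch below contrasts a comparison
by truncating subtraction with an explicit three-way comparison. Both
function names are made up for the example, and the 32-bit truncation is
only one way such a cast can go wrong. The actual types involved in the
loader may differ.

```cpp
#include <cstdint>

// Naive comparison: subtract and squeeze the result into 32 bits. Any
// difference that is a multiple of 2^32 collapses to 0, i.e. "equal".
int32_t compare_by_truncated_subtraction(uint64_t a, uint64_t b)
{
    return static_cast<int32_t>(static_cast<uint32_t>(a - b));
}

// Manual comparison: never subtracts, so it cannot overflow or truncate.
int compare_sample_indices(uint64_t a, uint64_t b)
{
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}
```

With `a = 2^32` and `b = 0`, the first function reports the two indices
as equal, while the second correctly reports `a` as larger.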
This specialized UTF-8 decoder is more powerful than a normal UTF-8
decoder anyways, but it couldn't account for the never spec-compliant
0xff start byte. This commit makes that byte behave as expected if
taking UTF-8 to its extreme, even if it is a little silly and likely not
relevant for real applications.
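A sketch of what "taking UTF-8 to its extreme" could look like, under
the assumption that the number of leading one bits in the start byte
gives the total sequence length. Under that reading, a start byte of
0xff carries no payload bits and is followed by seven continuation
bytes. The function name and the vector-based interface are
hypothetical. The real decoder reads from a bit stream.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical decoder for a single extended UTF-8 style number stored at the
// start of `bytes`. Returns nothing on malformed or truncated input.
std::optional<uint64_t> decode_extended_utf8(std::vector<uint8_t> const& bytes)
{
    if (bytes.empty())
        return std::nullopt;
    uint8_t start = bytes[0];
    if ((start & 0x80) == 0)
        return start; // Single-byte value.
    if ((start & 0xC0) == 0x80)
        return std::nullopt; // A continuation byte cannot start a sequence.

    // Count the leading one bits (2 to 8); this is the total sequence length.
    unsigned length = 0;
    while (length < 8 && (start & (0x80u >> length)))
        ++length;
    if (bytes.size() < length)
        return std::nullopt;

    // Payload bits in the start byte: 8 - (length + 1), which is zero for
    // both 0xfe and 0xff because the mask shifts down to nothing.
    uint64_t value = start & (0xFFu >> (length + 1));
    for (size_t i = 1; i < length; ++i) {
        if ((bytes[i] & 0xC0) != 0x80)
            return std::nullopt;
        value = (value << 6) | (bytes[i] & 0x3Fu);
    }
    return value;
}
```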