There's now two namespaces, RIFF (little-endian) and IFF (big-endian)
which (for the most part) contain the same kinds of structures for
handling similar data in both formats. (They also share almost all of
their implementation) The main types are ChunkHeader and (Owned)Chunk.
While Chunk has no ownership over the data it accesses (and can only be
constructed from a byte view), OwnedChunk has ownership over this data
and is aimed at reading from streams.
OwnedList, implementing the standard RIFF LIST type, is currently only
implemented for RIFF due to its only user being WAV, but it may be
generalized in the future for use by IFF.
Co-authored-by: Timothy Flynn <trflynn89@pm.me>
This splits the RIFFTypes header/TU into the WAV specific parts, which
move to WavTypes.h, as well as the general RIFF parts which move to the
new LibRIFF.
Sidenote for the spec comments: even though they are linked from a site
that explains the WAV format, the document is the (an) overall RIFF spec
from Microsoft. A better source may be used later; the changes to the
header are as minimal as possible.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
We were calling write syscall twice for every sample, which effectively
hurt the writer's performance.
With this change exporting a melody in the Piano app now takes less than
a second, which previously took about 20 seconds on my machine.
Additionally, I've removed an unused `WavWriter::file()` getter.
Let's replace this bool with an `enum class` in order to enhance
readability. This is done by repurposing `MappedFile`'s `OpenMode` into
a shared `enum` simply called `Mode`.
This is 10-20% of a speed increase on platforms with fast I/O (Linux)
and not a slowdown on Serenity. Again, the file system layer is the
limit for us :^)
These were intentionally set up to be at the end of the granule size,
but since the stereo intensity loop is intentionally using a <= end
comparison (that’s how the scale factor bands work), we must skip these
dummy bands which would otherwise cause an out-of-bounds index.
This can happen with some weird inputs, so instead, return an error; we
need at least one “effective” bit per sample so the bits per sample
cannot be less than or equal to the wasted bits per sample.
https://developer.apple.com/documentation/audiounit
Apple has a number of audio frameworks we could use. This uses the Audio
Unit framework, as it gives us most control over the rendering of the
audio frames (such as being able to quickly pause / discard buffers).
From some reading, we could implement niceties such as fading playback
in and out while seeking over a short (10ms) period. This patch does not
implement such fancy features though.
Co-Authored-By: Andrew Kaster <akaster@serenityos.org>
The `Loader` uses a `Vector<Sample>` to store any samples that the
actual audio decoder returned that the loader client did not request
yet. However, it did not take that buffer into account when those
clients asked for the number of samples loaded. The buffer is an
internal implementation detail, and should not be reflected on the
external API.
This allows LibWeb to keep better track of audio position without any
desync caused by the buffer. When using the Serenity `PlaybackStream`
implementation as the back end for an `HTMLAudioElement`, this allows
us to perfectly stop on the exact last sample of the audio file, so it
will not stop before the media element can see that it has finished
playback.
Note that `Loader::get_more_samples()` also calls `loaded_samples()` to
determine how many samples are remaining to load into the output
buffer. However, this change appears to be correct even there, given
that the samples copied from the internal sample buffer are included in
that count.
This implementation is very naive compared to the PulseAudio one.
Instead of using a callback implemented by the audio server connection
to push audio to the buffer, we have to poll on a timer to check when
we need to push the audio buffers. Implementing cross-process condition
variables into the audio queue class could allow us to avoid polling,
which may prove beneficial to CPU usage.
Audio timestamps will be accurate to the number of samples available,
but will count in increments of about 100ms and run ahead of the actual
audio being pushed to the device by the server.
Buffer underruns are completely ignored for now as well, since the
`AudioServer` has no way to know how many samples are actually written
in a single audio buffer.
We can just create a `BigEndianInputBitStream` where it is needed
instead of storing one in the class. After making `read_header()`
static, the only other user of the field was `read_size_information()`,
so let's do that there and then remove the field.
This makes the checks for a frame header more consistent, so if the
conditions for allowed frame headers change, there are less scattered
lines that will need to be changed.
`synchronize()` will now also properly scan the second byte of the hex
sequence `FF FF F0` as a sync code, where previously it would see
`FF F` and skip on to `F0`, ignoring its preceding `FF` that would
indicate that it is a sync code.
`synchronize()` can be simplified greatly by checking whole bytes with
bitwise operations, and doing so also avoids the overhead of reading
individual bits from a bitstream.
Making `read_frame()` also take a `SeekableStream` will allow it to be
used inside `synchronize()` in the next commit.
Prevously, the header size was used to calculate the `slot_count` field
of `MP3::Header`, but `build_seek_table()` just used the maximum size
of the header instead, causing it not to seek far enough, and in cases
where a possible sync code occurred two bytes before the next frame, it
would read that possible sync code as if it was a real frame. It would
then either reject it due to bad field values, or could possibly skip
over the next real frame due to a larger calculated frame size in the
bogus frame.
By fixing this issue, we now properly calculate the duration of MP3
files where these fake sync codes occur. In the case of the raw file
for this podcast:
https://changelog.com/podcast/554
the duration goes from 1:21:57 to 1:22:47, which is the real duration
according to the player user interface.
The seek table must locate the first MP3 frame in the file, so it makes
sense to locate the samples for the sample table first, then that
information to seek to the first frame.
Previously, we would just start from byte 0 and check individual bytes
of the file until we find two bytes starting with `FF F`, and then
assume that that was the MP3 frame sync code. However, some ID3v2 tags
do not have to be what is referred to as "unsynchronized", meaning that
they can contain that `FF F` sequence and cause our decoder to think it
has found a frame.
To avoid this happening, we can read a minimal amount of the ID3 header
to determine how many bytes to skip before attempting to find the MP3
frames.
This allows the recent podcast with Andreas to play here:
https://changelog.com/podcast/554
Seek points were being created after adding to the sample count in
`build_seek_table()`, meaning that they would be offset forward by
`MP3::frame_size` samples.
This also allows us to remove the hardcoded sample 0 seek point that
was previously added, since a seek point at sample 0 will now be added
by the loop.
The estimation for this is fast but not very accurate, meaning we save
around 5-10% storage space. (We also don’t try other channel coupling
methods, but I am sceptical of how much benefit that actually provides.)
This encoder can handle all integer formats and sample rates, though
only two channels well. It uses fixed LPC and performs a
close-to-optimal parameter search on the LPC order and residual Rice
parameter, leading to decent compression already.