Previously, the `Parser::decode_tiles()` function wouldn't wait for the
tile-decoding workers to finish before exiting the function, which
could mean that the data the threads are working with could become
invalid if the decoder is deleted after an error is encountered.
Using malloc does not invoke T's constructor, nor were were invoking T's
constructor ourselves. Accessing T without invoking its constructor is
undefined behavior.
This adds a new WorkerThread class to run one task asynchronously,
and allow waiting for that thread to finish its work.
TileContexts are placed into multiple tile column vectors with their
streams to read from pre-created. Once those are ready, the threads can
start their work on each vector separately. The main thread waits for
those tasks to finish, then sums up the syntax element counts for each
tile that was decoded.
Previously, we were incorrectly wrapping an error from `BooleanDecoder`
initialization in a `DecoderErrorCategory::Memory` error. This caused
an incorrect error message in VideoPlayer. Now it will instead return
`DecoderErrorCategory::Corrupted`.
Extending the borders on reference frames so that motion vectors that
point outside the reference frame allows `predict_inter_block()` to
avoid some branches to clamp the sample coordinates in its loops.
This results in about a 25% improvement in decode time of a motion-
heavy YouTube video (~20.8s -> ~15.6s).
Moving the clamping of the coordinates of the reference frame samples
as well as some bounds checks outside of the loop reduces the branches
needed in the `predict_inter_block()` significantly.
This results in a whopping ~41% improvement in decode performance
of an inter-prediction-heavy YouTube video (~35.4s -> ~20.8s).
Changing the calculation of reference frame scale factors to be done on
a per-frame basis reduces the amount of work done in
`predict_inter_block()`, which is a big hotspot in most videos.
This reduces decode times in a test video from YouTube by about 5%
(~37.2s -> ~35.4s).
This changes the order of the loop copying data to a reference frame
store so that it copies each row in a contiguous line rather than
copying a column at a time, which caused unnecessary branches.
This reduces the decode time on a fairly long 720p YouTube video by
about 14.5% (~43.5s to ~37.2s).
Previously, there was some leftover logic in `SeekingStateHandler`
which would avoid presenting a frame if it didn't find a frame both
before and after the target frame. However, this logic was unnecessary
as the `on_enter()` function would check if it needed to present a
frame and exit seeking if not.
This allows seeking to succeed if the Seeking handler cannot find a
frame before the one to be seeked to, which could occur if the first
frame of a video is not at timestamp 0.
This adds a timestamp to the debug output when presenting a frame, so
it can be clear the frame spacing between a presented frame and a state
change.
The seeking state will also now print when it early-exits if the seek
point is within the current sample's duration.
StartingStateHandler and SeekingStateHandler were declaring their own
`bool m_playing` fields (from previous code where there was no base
class).
In the case of SeekingStateHandler, this only made the logging wrong.
For StartingStateHandler, however, this meant that it was not using
the boolean passed as a parameter to the constructor to define the
state that would be transitioned to after the Starting state finished.
This meant that when the Stopping state replaced itself with the
Starting state, playback would not resume when Starting state exits.
To pass events from LibVideo's PlaybackManager to interested parties, we
currently dispatch Core::Event objects that outside callers listen for.
Dispatching events in this manner rely on a Core::EventLoop. In order to
use PlaybackManager from LibWeb, change this mechanism to instead use a
set of callbacks to inform callers of events.
Using uninitialized primitives is undefined behavior. In this case, all
videos would autoplay on Serenity, but not play at all on Lagom, due to
some m_playing booleans being uninitialized.
This copies the video data from the Matroska document into the Track
structure that outside users have access to. Because Track can actually
represent other media types, this is set up such that the Track can hold
metadata for those other types when they are needed.
This is needed for LibWeb's HTMLMediaElement implementation.
This doesn't appear to have had a measurable impact on performance,
and behavior is the same.
With the tiles using independent BooleanDecoders with their own
backing BitStreams, we're even one step closer to threaded tiles!
This allows the logic for keeping track of whether to resume to the
paused or the playing state when exiting these states. The new
StartingStateHandler also uses the class, since it can also be paused
and unpaused while waiting for samples.
The pause/play actions on the handlers inheriting from the resuming
handler will also now notify the owner that the state has changed so
that it can change icons, etc.
The PlaybackStateChangeEvent wasn't connected up anymore, so the player
wouldn't change icons when stopping playback due to reaching the end of
the stream or encountering an error.
This new state handler will retrieve and display the first frame, while
ensuring that playback can start as soon as possible by buffering two
frames on top of the first frame for the PlayingStateHandler to set its
next frame timer by.
Previously, we assumed the timer was hitting at the correct time. This
meant that if we changed states and the previous state handler had
prepared a next frame, we would immediately display it without checking
the timestamp.
If we encounter an error in the video decoder, we can set a timestamp
for the error item in the queue so that it will display the error only
when the frame that caused the error would have been displayed.
Previously we had dispatch_decoder_error and on_decoder_error serving
the same function, with one not handling the end of stream properly.
There is also a new function to dispatch either an error or a frame to
the owner of this playback manager, so that PlaybackStateHandlers don't
have to duplicate this logic.
Checking the bounds of the intermediate values was only implemented to
help debug the decoder. However, it is non-fatal to have the values
exceed the spec-defined bounds, and causes a measurable performance
reduction.
Additionally, the checks were implemented as an assertion, which is
easily broken by bad input files.
I see about a 4-5% decrease in decoding times in the `webm_in_vp9` test
in TestVP9Decode.
For example, consider cases where we want to propagate errors only in
specific instances:
auto result = read_data(); // something like ErrorOr<ByteBuffer>
if (result.is_error() && result.error().code() != EINTR)
continue;
auto bytes = TRY(result);
The TRY invocation will currently copy the byte buffer when the
expression (in this case, just a local variable) is stored into
_temporary_result.
This patch binds the expression to a reference to prevent such copies.
In less trival invocations (such as TRY(some_function()), this will
incur only temporary lifetime extensions, i.e. no functional change.
That matches the terminology used in ITU-T Rec. H.273,
PNG's cICP chunk, and the ICC cicpTag.
Also change the enum values to match the values in the spec --
0 means "not full range" and 1 means "full range".
(For now, keep the "Unspecified" entry around, and give it value 2.
This value is not in the spec.)
No intended behavior change.
Fast seeking does not work correctly when seeking in small increments,
so it is necessary to use accurate seeking when using certain actions.
The PlaybackManager has been changed to accept the seek mode as a
parameter to `seek_to_timestamp` to facilitate this. This now means
that it no longer has to track a seek mode preference.
This would cause seeks where the approximate starting cue index was one
cue before the optimal cue to not change the sample iterator position.
Fixing this issue improves accurate seek speeds significantly.
In cases where the PlaybackManager's earliest buffered or displayed
sample is closer to the seek target than the demuxer's chosen keyframe,
we don't want to seek at all. To enable this, demuxers now receive an
optional parameter with the earliest timestamp that the caller can
still access.
The demuxer in turn returns an optional to indicate when a seek was not
needed, which allows PlaybackManager to avoid clearing its queue and
re-decoding frames.
Storing playback states in virtual classes allows the behavior to be
much more clearly written. Each `PlaybackStateHandler` subclass can
implement some event-handling functions to model their behavior, and
has functions to change its parent PlaybackManager's state to any other
state.
This will allow expanding the functionality of playback in the future,
for example to allow skipping a single frame forward/backward.
A bit of a bikeshed, but status sounds more like a result of an action,
and state sounds more accurate to what the `PlaybackManager` does.
The previous and current state fields of the `PlaybackStateChangeEvent`
are now removed, since they were unused (for now).
This is a lead-up to the refactoring of VideoPlaybackManager to make
that diff more legible.