The framebuffer size was reduced in f2c0cee, but this caused some niche
block layouts to write outside of the frame.
This could be fixed by adding checks to see if a block being predicted/
reconstructed is within the frame, but the branches introduced by that
reduce performance slightly. Therefore, it's better to keep the
framebuffer sized according to the decoded frame size in 8x8 blocks so
that any block can be decoded without bounds checking.
A test was added to ensure that this continues to work.
Quantizers are a constant for the whole frame, except when segment
features override them, in which case they are a constant per segment
ID. We take advantage of this by pre-calculating those after reading
the quantization parameters and segmentation features for a frame.
This results in a small 1.5% improvement (~12.9s -> ~12.7s).
This throws out some ugly `#define`s we had that were taking the role
of an enum anyway. We now have some nice getters in the contexts that
take the place of the combo of `seg_feature_active()` and then doing a
lookup in `FrameContext::m_segmentation_features` directly.
This moves all the frame size calculation to `FrameContext`, where the
subsampling is easily accessible to determine the size for each plane.
The internal framebuffer size has also been reduced to the exact frame
size that is output.
Previously, we were incorrectly wrapping an error from `BooleanDecoder`
initialization in a `DecoderErrorCategory::Memory` error. This caused
an incorrect error message in VideoPlayer. Now it will instead return
`DecoderErrorCategory::Corrupted`.
This doesn't appear to have had a measurable impact on performance,
and behavior is the same.
With the tiles using independent BooleanDecoders with their own
backing BitStreams, we're even one step closer to threaded tiles!
Like the non-zero tokens and segmentation IDs, these can be moved into
the tile decoding loop for above context and allocated by TileContext
for left context.
We can store this context in the stack of Parser::decode_tiles and use
spans to give access to the sections of the context for each tile and
subsequently each block.
The array containing the vertical line of bools indicating whether non-
zero tokens were decoded in each sub-block is moved to TileContext, and
a span of the valid range for a block to read and write to is created
when we construct a BlockContext.
Only the residual tokens array needs to be kept for the transforms to
use after all the tokens have been parsed. The token cache is able to
be kept in the stack only for the duration of the token parsing loop.
Since the enum is used as an index to arrays, it unfortunately can't
be converted to an enum class, but at least we can make sure to use it
with the qualified enum name to make things a bit clearer.
Previously, the variables were named similarly to the names in spec
which aren't very human-readable. This adds some utility functions for
dimensional unit conversions and names the variables in residual()
based on their units.
References to 4x4 blocks were also renamed to call them sub-blocks
instead, since unit conversion functions would not be able to begin
with "4x4_blocks".
Moving these to another header allows Parser.h to include less context
structs/classes that were previously in Context.h.
This change will also allow consolidating some common calculations into
Context.h, since we won't be polluting the VP9 namespace as much. There
are quite a few duplicate calculations for block size, transform size,
number of horizontal and vertical sub-blocks per block, all of which
could be moved to Context.h to allow for code deduplication and more
semantic code where those calculations are needed.
Note that some of the previous segmentation feature settings must be
preserved when a frame is decoded that doesn't use segmentation.
This change also allowed a few functions in Decoder to be made static.
Previously, we were using size_t, often coerced from bool or u8, to
index reference pairs. Now, they must either be taken directly from
named fields or indexed using the `ReferenceIndex` enum with options
`primary` and `secondary`. With a more explicit method of indexing
these, the compiler can aid in using reference pairs correctly, and
fuzzers may be able to detect undefined behavior more easily.
Candidate vector selections are only used to calculate the new vectors
for the current block, so we only need to keep those for the duration
of the inter_block_mode_info() call.
Candidate vectors are now stored in BlockMotionVectorCandidates, which
contains the fields necessary to choose the vector to use to sample
from the selected reference frame.
Most functions related to motion vectors were renamed to more verbose
but meaningful names.
The log2 of tile counts in the horizontal and vertical dimensions are
now stored in the FrameContext struct to be kept only as long as they
are needed.
This also renames (most?) of the related quantizer functions and
variables to make more sense. I haven't determined what AC/DC stands
for here, but it may be just an arbitrary naming scheme for the first
and subsequent coefficients used to quantize the residuals for a block.
The color config is reused for most inter predicted frames, so we use a
struct ColorConfig to store the config from intra frames, and put it in
a field in Parser to copy from when an inter frame without color config
is encountered.
There are three mutually exclusive frame-showing states:
- Show no new frame, only store the frame as a reference.
- Show a newly decoded frame.
- Show frame from the reference frame store.
Since they are mutually exclusive, using an enum rather than two bools
makes more sense.
These are used to pass context needed for decoding, with mutability
scoped only to the sections that the function receiving the contexts
needs to modify. This allows lifetimes of data to be more explicit
rather than being stored in fields, as well as preventing tile threads
from modifying outside their allowed bounds.
There are three fields that we need to store from FrameBlockContext to
keep between frames, which are used to parse for those same fields for
the next frame.
All state that needed to persist between calls to decode_block was
previously stored in plain Vector fields. This moves them into a struct
which sets a more explicit lifetime on that data. It may be possible to
store this data on the stack of a function with the appropriate
lifetime now that it is split into its own struct.
With the addition of this struct, both the bool to determine if coefs
should be parsed and the token parse itself can take specific
parameters.
This is the last step in parameterizing all the tree parsing, so the
old functions in TreeParser are now unused. This patch is very
satisfying :^)
There's still more work to be done to clean up how the parameters are
passed from Parser, but that's work for another day.
Since these two types are often passed around as a pair, it's easier to
handle them with a simple pair struct, at least for now. Once things
are fully being passed around as parameters wherever possible, it may
be good to change this type for something more generalized.