1
Fork 0
mirror of https://github.com/RGBCube/serenity synced 2025-05-14 19:54:57 +00:00
Commit graph

486 commits

Author SHA1 Message Date
Lucas CHOLLET
83f1775f15 LibGfx/CCITT: Reimplement PassMode in a less naive way
The old implementation of PassMode has only been tested with a single
image, and let's say that it didn't survive long in the wild. A few
cases were not considered:
 - We only supported VerticalMode right after PassMode.
 - It can happen that token need to be used but not consumed from the
 reference line.

 With that fix, we are able to decode every single PDF file from the
 1000-file zip "0000" (except 0000871.pdf, which uses byte alignment).
 This is massive progress compared to the hundred of errors that we were
 previously receiving.
2024-02-22 16:45:03 +01:00
Nico Weber
607880cbd3 LibGfx/JPEGLoader: Add dbgln_if() when hitting unsupported marker 2024-02-21 17:54:53 +01:00
Nico Weber
95391fafcb LibGfx/JPEGLoader: Print offset in an error dbgln() in hex 2024-02-21 17:54:53 +01:00
Nico Weber
24a469f521 Everywhere: Prefer {:#x} over 0x{:x} in format strings
The former automatically adapts the prefix to binary and octal
output, and is what we already use in the majority of cases.

Patch generated by:

    rg -l '0x\{' | xargs sed -i '' -e 's/0x{:/{:#/'

I ran it 4 times (until it stopped changing things) since each
invocation only converted one instance per line.

No behavior change.
2024-02-21 17:54:38 +01:00
Lucas CHOLLET
9ec3480207 LibGfx/TIFF: Add support for Group4Fax encoded images
Note that we don't parse the T6 option group yet.

The test case was generated with GIMP.
2024-02-21 13:49:43 +01:00
Lucas CHOLLET
d57d676425 LibGfx/CCITT: Add support for Group4
The API is currently pretty raw. Group4 has a bunch of options that we
don't support yet.
2024-02-21 13:49:43 +01:00
Lucas CHOLLET
e9dd1cda3e LibGfx/CCITT: Abstract the code to read a single CCITT 2D line 2024-02-21 13:49:43 +01:00
Lucas CHOLLET
be9ec591e7 LibGfx/CCITT: Add support for Group3 2D
The two test images were created with:
tiffcp ccit3.tiff -c g3:2d ccit3_2d.tiff
tiffcp ccit3.tiff -c g3:2d:fill ccit3_2d_fill.tiff
2024-02-19 01:40:04 +01:00
Lucas CHOLLET
9116cc3f45 LibGfx/CCITT: Put the code to read the run length in its own function
This is already nice to do for the sole purpose of the readability but
that will also become handy for the 2D decoder.
2024-02-19 01:40:04 +01:00
Lucas CHOLLET
3d63dd5c53 LibGfx/CCITT: Make get_code_from_table take a generic Array 2024-02-19 01:40:04 +01:00
Lucas CHOLLET
d54dbdae5e LibGfx/CCITT: Introduce the invert helper
This function turns Black into White and White into Black.
2024-02-19 01:40:04 +01:00
Lucas CHOLLET
ce0ac70416 LibGfx/CCITT: Declare reference colors as static variables 2024-02-19 01:40:04 +01:00
Lucas CHOLLET
45b37010b5 Revert "LibGfx/CCITT: Don't overrun the image width"
This reverts commit a4b2e5b27b.

This was just plain wrong, I remember it making sense and fixing
something but that was probably due to local changes. It should never
have landed on master, my bad.
2024-02-19 01:40:04 +01:00
Lucas CHOLLET
d375b5c2a5 LibGfx/TIFF: Also cache the result of alpha_channel_index()
This function was called over and over in `manage_extra_channels()`,
even if the result depends only on the metadata. Instead, we now call it
once and store the result.
2024-02-18 21:53:27 +01:00
Lucas CHOLLET
a637a02de8 LibGfx/TIFF: Cache metadata values that are used in the hot path
The ExifMetadata class is handy as it handles any Exif tag, but the
performance price is non-negligible. So, let's cache important values.
2024-02-18 21:53:27 +01:00
Lucas CHOLLET
15d151ee66 LibGfx/ICO: Remove unused parameter 2024-02-14 06:56:03 +01:00
Lucas CHOLLET
8e21bbf7bf LibGfx/TIFF: Add support for tiled images
A tile is basically a strip with a user-defined width. With that in
mind, adding support for them is quite straightforward. As a lot the
common code was named after 'strips', to avoid future confusion I
renamed everything that interact with either strips or tiles to a
global term: 'segment'.

Note that tiled images are supposed to always have a 'TileOffsets' tag
instead of 'StripOffset'. However, this doesn't seem to be enforced by
encoders, so we support having either of them indifferently.

The test case was generated with the following Python script:

import pyvips

img = pyvips.Image.new_from_file('deflate.tiff')
img.write_to_file('tiled.tiff',
                  compression=pyvips.ForeignTiffCompression.DEFLATE,
                  tile=True, tile_width=64, tile_height=64)
2024-02-13 10:13:11 +01:00
Lucas CHOLLET
a30515011a LibGfx/TIFF: Add support for TileOffset and TileByteCounts 2024-02-13 10:13:11 +01:00
Lucas CHOLLET
18871e23d7 LibGfx/TIFF: Make decoders take an IntSize
They will also need the width of the sub-image when we will add support
for tiles.
2024-02-13 10:13:11 +01:00
Lucas CHOLLET
7b510c3876 LibGfx/TIFF: Rename scanline => image_row
This variable stores the number of rows from the beginning of the image,
contrary to `row` that stores the number of rows relative to the start
of the current segment.
2024-02-13 10:13:11 +01:00
Lucas CHOLLET
c4e8e5c4a6 LibGfx/TIFF: Rename ImageHeight => ImageLength
This is the name used in the TIFF specification. No behavior change.
2024-02-13 10:13:11 +01:00
Lucas CHOLLET
f5e7ee8d4a LibGfx/CCITT: Don't be fooled by black-starting lines
The first marker is always white in CCITT streams, so lines starting
with a black pixel encodes a symbol meaning 0 white pixels. Then, the
decoding would proceed with a black symbol. We used to set the symbol's
color based on `column == 0`, which is wrong in this situation.
2024-02-13 00:37:06 +01:00
Lucas CHOLLET
a4b2e5b27b LibGfx/CCITT: Don't overrun the image width 2024-02-13 00:37:06 +01:00
Lucas CHOLLET
720187623b LibGfx/TIFF: Read and honor the FillOrder tag 2024-02-13 00:37:06 +01:00
Lucas CHOLLET
b9afac0a06 LibGfx/CCITT: Consider the UseFillBits option 2024-02-12 14:08:56 +01:00
Lucas CHOLLET
9ae17e3a7a LibGfx/CCITT: Align the output stream on byte-boundary after each line
This makes the CCITT decoder in line with what the TIFF decoder is
expecting.
2024-02-12 14:08:56 +01:00
Lucas CHOLLET
42f29b9670 LibGfx/TIFF: Also seek after reading the last tag
The `read_tag()` function is not mandated to keep the reading head at a
meaningful position, so we also need to align the pointer after the last
tag. This solves a bug where reading the last field of an IFD, which is
placed after the tags, was incorrect.
2024-02-08 09:03:46 -07:00
Lucas CHOLLET
a43793ee0d LibGfx/TIFF: Explore underlying Image File Directories
Every TIFF containers is composed of a main IFD. Some entries of this
one can be a pointer to a sub-IFD. We are now capable of exploring these
underlying structures. Note that we don't do anything with them yet.
2024-02-08 09:03:46 -07:00
Nico Weber
d7f04c9aa1 LibGfx/JPEGLoader: Make byte_offset() return offset from start of stream
JPEGStream::byte_offset() now returns an offset relative to the start
of the stream, instead of relative to the buffered part.

No behavior change except if JPEG_DEBUG is set.
2024-02-08 07:45:34 -07:00
Nico Weber
e269526020 LibGfx/PNM: Remove two fixmes
bab2113ec1 made read_whitespace() return ErrorOr, which makes this
easy to do.

(7cafd7d177, which added the fixmes, landed slightly after bab2113ec1,
so not quite sure why it wasn't like this immediately. Maybe commit
order got changed during review; both commits were in #17831.)

No behavior change.
2024-02-02 08:26:40 +00:00
Nico Weber
1dfd68c798 LibGfx/JPEGWriter: Make it possible to write CMYKBitmaps
We always store CMYK data as YCCK, for two reasons:

1. If we ever want to do subsampling, then doing 2111 or
   2112 makes sense with YCCK, while it doesn't make sense
   if we store CMYK directly.
2. It forces us to write a color transform header. With a color
   transform header, everyone agrees that the CMYK channels should
   be stored inverted, while without it behavior between decoders
   is inconsistent. (We could write an explicit  color transform header
   for CMYK too though, but with YCCK it's harder to forget since the
   output will look wrong everywhere without it.)

initialize_mcu() grows a full CMYKBitmap override. Some of the
macroblock traversal could probably shared with some kind of
for_all_macroblocks() type function in the future, but the color
conversion math is different enough that this should be a separate
function.

Other than that, we pass around a mode parameter and make a few fuctions
write 4 instead of 3 channels, and that's it.

We use the luminance quantization and huffman tables for the K
channel.
2024-02-02 07:19:18 +01:00
Nico Weber
e8788d4023 LibGfx/JPEGWriter: Move image data writing into new add_image() function
No behavior change.
2024-02-02 07:19:18 +01:00
Nico Weber
e449dba85b LibGfx/JPEGWriter: Move header writing into new add_headers() function
No behavior change.
2024-02-02 07:19:18 +01:00
Nico Weber
4e637fa1d2 LibGfx/JPEGWriter: Pass IntSize instead of Bitmap to add_frame_header()
No behavior change.
2024-02-02 07:19:18 +01:00
Nico Weber
38526414b0 LibGfx/JPEGWriter: Add a named constant in add_scan_header()
No behavior change.
2024-02-02 07:19:18 +01:00
Nico Weber
4a8e7f44dc LibGfx/JPEGWriter: Add a named constant in add_frame_header()
No behavior change.
2024-02-02 07:19:18 +01:00
Nico Weber
ad7d25f089 LibGfx/JPEGWriter: Make vertical_macroblocks a local
It's only used in one function.

No behavior change.
2024-02-02 07:19:18 +01:00
Nico Weber
8964a52fe0 LibGfx/JPEGWriter: Use ceil_div()
No behavior change.
2024-02-02 07:19:18 +01:00
Nico Weber
69964e10f4 LibGfx+Tests: Improve calculation of restart interval
JPEGs can store a `restart_interval`, which controls how many
minimum coded units (MCUs) apart the stream state resets.
This can be used for error correction, decoding parts of a jpeg
in parallel, etc.

We tried to use

    u32 i = vcursor * context.mblock_meta.hpadded_count + hcursor;
    i % (context.dc_restart_interval *
         context.sampling_factors.vertical *
         context.sampling_factors.horizontal) == 0

to check if we hit a multiple of an MCU.

`hcursor` is the horizontal offset into 8x8 blocks, vcursor the
vertical offset, and hpadded_count stores how many 8x8 blocks
we have per row, padded to a multiple of the sampling factor.

This isn't quite right if hcursor isn't divisible by both
the vertical and horizontal sampling factor. Tweak things so
that they work.

Also rename `i` to `number_of_mcus_decoded_so_far` since that
what it is, at least now.

For the test case, I converted an existing image to a ppm:

    Build/lagom/bin/image -o out.ppm \
        Tests/LibGfx/test-inputs/jpg/12-bit.jpg

Then I resized it to 102x77px in Photoshop and saved it again.
Then I turned it into a jpeg like so:

    path/to/cjpeg \
        -outfile Tests/LibGfx/test-inputs/jpg/odd-restart.jpg \
        -sample 2x2,1x1,1x1 -quality 5 -restart 3B out.ppm

The trick here is to:

a) Pick a size that's not divisible by the data size width (8),
   and that when rounded to a block size (13) still isn't divisible
   by the subsample factor -- done by picking a width of 102.
b) Pick a huffman table that doesn't happen to contain the bit
   pattern for a restart marker, so that reading a restart marker
   from the bitstream as data causes a failure (-quality 5 happens
   to do this)
c) Pick a restart interval where we fail to skip it if our calculation
   is off (-restart 3B)

Together with #22987, fixes #22780.
2024-01-30 14:50:43 +01:00
Nico Weber
1ed9e597a9 LibGfx/JPEGLoader: Use 1 / sqrt(8) instead of rsqrt(8)
At least on arm64, rsqrt(8) has noticeably worse precision:
https://github.com/SerenityOS/serenity/issues/22739#issuecomment-1912909835
2024-01-30 10:02:33 +01:00
Nico Weber
6971ba35d5 LibGfx+Tests: Support grayscale jpegs with 2x2 sampling and MCU reset
Non-interleaved files always have an MCU of one data unit.

(A "data unit" is an 8x8 tile of pixels, and an "MCU" is a
"minium coded unit", e.g. 2x2 data units for luminance and
1 data unit each for Cr and Cb for a YCrCb image with
4:2:0 subsampling.)

For the test case, I converted an existing image to a ppm:

    Build/lagom/bin/image -o out.ppm \
        Tests/LibGfx/test-inputs/jpg/12-bit.jpg

Then I converted it to grayscale and saved it as a pgm in Photoshop.
Then I turned it into a weird jpeg like so:

    path/to/cjpeg \
        -outfile Tests/LibGfx/test-inputs/jpg/grayscale_mcu.jpg \
        -sample 2x2 -restart 3 out.pgm

Makes 3 of the 5 jpegs failing to decode at #22780 go.
2024-01-30 05:35:22 +01:00
Nico Weber
002eb0ad03 LibGfx/JPEGLoader: Use ceil_div
No behavior change.
2024-01-29 13:33:10 -05:00
Nico Weber
393fb1f7b9 LibGfx/JPEGLoader: Pass SamplingFactors instead of Component to function
That's all this function reads from Component.

Also rename from validate_luma_and_modify_context() to
validate_sampling_factors_and_modify_context().

No behavior change.
2024-01-29 13:29:44 -05:00
Nico Weber
b37f3c86fd LibGfx/WebPLoaderLossless: Fix grammar-o in comment 2024-01-29 09:12:06 -05:00
Nico Weber
d89e42902e LibGfx/JPEGLoader: Extract inverse_dct_8x8() function
No behavior change.
2024-01-27 10:21:33 +00:00
Nico Weber
494fc1234e LibGfx/JPEGWriter: Rename y<=>x, u<=>v
Usually x and u go horizontally and y and v go vertically.

No behavior change.
2024-01-27 10:20:56 +00:00
Nico Weber
c3c7707de4 LibGfx/JPEGWriter: Don't throw away highest-frequency component in FDCT 2024-01-26 20:19:55 -05:00
Nico Weber
db4982bd29 LibGfx+image: Implement writing ICC profiles to jpeg files 2024-01-26 14:03:20 -05:00
Nico Weber
83ab9f7c2d LibGfx/PNM: Remove unnecessary line
This is already done at the caller decode() in
PortableImageLoaderCommon.h, as pointed out by @LucasChollet at
https://github.com/SerenityOS/serenity/pull/22935#discussion_r1467158789

No behavior change.
2024-01-26 14:53:33 +01:00
Nico Weber
556addc5cb LibGFX/PAM: Allow reading CMYK .pam files
These are written by `mutool extract` for CMYK images.

They don't contain color profiles so they're not super convenient.
But `image` can convert them (*) to sRGB as long as you use it with
`--assign-color-profile` pointing to some CMYK icc profile of your
choice.

*: Once #22922 is merged.
2024-01-26 07:36:53 +01:00