mirror of https://github.com/RGBCube/serenity synced 2025-05-31 11:38:11 +00:00

LibPDF: Add missing character quirk for WinAnsiEncoding fonts

Fonts with the encoding name "WinAnsiEncoding" should render missing
characters above character code 040 (octal) as a "bullet" character.

This patch adds Encoding::should_map_to_bullet(char_code) which is then
called by char_code_to_code_point() to check if the given char code
should be displayed as a bullet instead.

I didn't have a good way to test this, so I've only verified that it
works by manually overriding inputs to the function during the rendering
stage.

This takes care of a FIXME relating to Annex D of the PDF specification.
Andreas Kling 2022-11-19 20:23:18 +01:00
parent f4f5b045ca
commit d6a3be1615
4 changed files with 20 additions and 1 deletions


@ -115,6 +115,7 @@ NonnullRefPtr<Encoding> Encoding::windows_encoding()
encoding->m_name_mapping.set(#name, name##_code_point);
ENUMERATE_LATIN_CHARACTER_SET(ENUMERATE)
#undef ENUMERATE
encoding->m_windows = true;
}
return encoding;
@ -170,4 +171,12 @@ CharDescriptor const& Encoding::get_char_code_descriptor(u16 char_code) const
return const_cast<Encoding*>(this)->m_descriptors.ensure(char_code);
}
bool Encoding::should_map_to_bullet(u16 char_code) const
{
    // PDF Annex D table D.2, note 3:
    // In WinAnsiEncoding, all unused codes greater than 40 (octal) map to the bullet character. However, only
    // code 225 (octal) shall be specifically assigned to the bullet character; other codes are subject to future re-assignment.
    return m_windows && char_code > 040 && !m_descriptors.contains(char_code);
}
}
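
For readers without the rest of LibPDF at hand, the following is a minimal standalone sketch of the rule described in the commit message. It is not the LibPDF implementation: `SketchEncoding`, its `is_windows` and `code_points` members, and the fallback value of 0 for unmapped codes are all hypothetical names and choices made for illustration. Only the condition itself (a Windows encoding, a code greater than 040 octal, and no existing mapping) is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for an encoding table; not the LibPDF Encoding class.
struct SketchEncoding {
    bool is_windows { false };                          // assumed flag, mirrors m_windows in the patch
    std::unordered_map<uint16_t, uint32_t> code_points; // char code -> Unicode code point

    // Same condition as the patch: only WinAnsiEncoding, only codes above
    // 040 (octal), and only when the code has no mapping of its own.
    bool should_map_to_bullet(uint16_t char_code) const
    {
        return is_windows && char_code > 040
            && code_points.find(char_code) == code_points.end();
    }

    // Assumption of this sketch: unmapped codes that do not qualify for the
    // bullet quirk resolve to 0.
    uint32_t char_code_to_code_point(uint16_t char_code) const
    {
        if (should_map_to_bullet(char_code))
            return 0x2022; // U+2022 BULLET
        auto it = code_points.find(char_code);
        return it == code_points.end() ? 0 : it->second;
    }
};
```

With `is_windows` set and only code 0101 (octal) mapped to `'A'`, looking up 0101 yields `'A'`, an unmapped code such as 0200 yields U+2022, and code 040 itself is left alone because the comparison is strictly greater than.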