This was blocked because it can be used for cross-protocol attacks on
some network printers. However, it's also used by the web platform
tests. One can argue that getting WPT working is more important than
theoretical attacks on poorly configured printers.
I noticed while debugging a fully downloaded page that it was trying
to preconnect to a file:// host. That doesn't make any sense, so let's
add a tiny bit of logic to ignore preconnect requests for file: and
data: URLs.
The resource:// scheme is used for Core::Resource files. Currently, any
users of resource:// URLs in Ladybird must manually create the Resource
and extract its data. This will allow for passing the resource:// URL
along for LibWeb to handle.
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).
This commit is auto-generated:
$ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
Meta Ports Ladybird Tests Kernel)
$ perl -pie 's/\bDeprecatedString\b/ByteString/g;
s/deprecated_string/byte_string/g' $xs
$ clang-format --style=file -i \
$(git diff --name-only | grep \.cpp\|\.h)
$ gn format $(git ls-files '*.gn' '*.gni')
Pages like the new tab page, error page, etc. all belong solely to
Ladybird, but are scattered across a couple of subfolders in Base. This
moves them all to Base/res/ladybird.
This is a hack on top of a hack because Workers don't *really* need to
have a Web::Page at all, but the ResourceLoader infra that should be
going away soon ™️ is not quite ready to axe that requirement for
cookies.
Some websites, such as m.youtube.com, sniff for a version after the
Android OS version. Chrome has recently taken to always reporting
Android 10, so let's follow suit.
Before page_did_create_main_document() only initialized ConsoleClient
for top-level browsing context which means that nested browsing context
could not print into the console.
With this change, ConsoleClient is initialized for documents created
for nested browsing context too. One ConsoleClient is shared between
all browsing contexts within the same page.
This is defined when building on GNU/Hurd, the GNU operating system with
the Hurd as its kernel (as it was designed originally, before Linux and
GNU/Linux came to be).
Also, define the corresponding part of User-Agent.
As it turns out, there are popular User-Agent blacklists out there with
the string "LibWeb" in them. Such entries have been added long before
our LibWeb existed, so "LibWeb" has presumably been used by some bot
that people got tired of.
Trying to chase down everyone who has installed these blacklists is
obviously a losing battle, so this patch simply removes the engine part
of our default UA string.
The error.html page now uses the resource_directory_url this
variable contains the relative path to /Base/res/ on the host
system as a file:// url. This is needed for future pages to load
resource files like icons. For the error.html page this was not
really needed because it lies over this own URL in FrameLoader.cpp.
Before, navigator.platform would always report the platform as "Serenity
OS", regardless of whether or not that was true. It also did not include
the architecture, which Firefox and Chrome both do. Now, it can report
either "Linux x86_64" or "SerenityOS AArch64".
Parsing 'data:' URLs took it's own route. It never set standard URL
fields like path, query or fragment (except for scheme) and instead
gave us separate methods called `data_payload()`, `data_mime_type()`,
and `data_payload_is_base64()`.
Because parsing 'data:' didn't use standard fields, running the
following JS code:
new URL('#a', 'data:text/plain,hello').toString()
not only cleared the path as URLParser doesn't check for data from
data_payload() function (making the result be 'data:#a'), but it also
crashes the program because we forbid having an empty MIME type when we
serialize to string.
With this change, 'data:' URLs will be parsed like every other URLs.
To decode the 'data:' URL contents, one needs to call process_data_url()
on a URL, which will return a struct containing MIME type with already
decoded data! :^)
This makes the loader more agnostic.
Additionally, this allows us to load tab in Ladybird with a 'data:' URL
containing parameters, as a Resource will now call
`mime_type_from_content_type` to extract the content type from MIME. :^)
This commit changes the variables used to represent the size and
progress of downloads from u32 to u64. This allows `pro` and
`Browser` to report the total size and progress of a download
correctly for downloads larger than 4GiB.