We were previously assuming that the input offsets and lengths were all
raw byte offsets into a UTF-8 string. While our internal String
representation may be UTF-8, from the external world it is seen as
UTF-16: offsets are passed in as UTF-16 code units, and the returned
length is a count of UTF-16 code units as well.
Before this change, the test included in this commit would crash
Ladybird (and otherwise return wrong values).
The implementation here is very inefficient; there is surely a much
smarter way to write it so that we would not need a conversion from
UTF-8 to a UTF-16 string (and then back again).
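For reference, the mapping itself can be done in one pass over the
UTF-8 bytes. A minimal standalone sketch (plain C++, hypothetical helper
name, not the code in this commit) of what such a conversion could look
like:

    #include <cstddef>
    #include <string_view>

    // Hypothetical helper, not the code in this commit: translate a UTF-16
    // code unit offset into a UTF-8 byte offset in a single pass, without
    // materializing a UTF-16 string. Assumes well-formed UTF-8 and an offset
    // that falls on a code point boundary. Code points below U+10000 are one
    // UTF-16 code unit; supplementary code points (4 bytes in UTF-8) are two
    // (a surrogate pair).
    static std::size_t utf16_offset_to_byte_offset(std::string_view utf8, std::size_t utf16_offset)
    {
        std::size_t byte_index = 0;
        std::size_t code_units_seen = 0;
        while (byte_index < utf8.size() && code_units_seen < utf16_offset) {
            auto lead = static_cast<unsigned char>(utf8[byte_index]);
            std::size_t sequence_length = lead < 0x80 ? 1 : lead < 0xE0 ? 2 : lead < 0xF0 ? 3 : 4;
            code_units_seen += sequence_length == 4 ? 2 : 1;
            byte_index += sequence_length;
        }
        return byte_index;
    }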
Fixes: #20971
The existing implementation has some pre-existing issues where it
incorrectly assumes that byte offsets are given through the IDL instead
of UTF-16 code units. While making these changes, leave some FIXMEs
noting that.
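The FIXMEs are along these lines (an illustrative sketch only; the
function and names here are hypothetical, not lifted from the tree):

    #include <cstddef>
    #include <string_view>

    // FIXME: 'offset' arrives from the IDL layer as a UTF-16 code unit
    //        offset, but is used below as a raw byte offset into the
    //        UTF-8 data. Convert it before indexing.
    static std::string_view rest_of_data(std::string_view utf8_data, std::size_t offset)
    {
        return utf8_data.substr(offset);
    }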
NewAKString is effectively the default for any new IDL interface, so
let's mark this as the default behavior. It also makes it much easier to
figure out which interfaces are still left to port over to the new AK
String.
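For illustration, the attribute boils down to which string type the
generated bindings expect from the C++ implementation; a schematic
sketch (hypothetical interface names, not actual generated output):

    #include <AK/DeprecatedString.h>
    #include <AK/String.h>

    // Schematic only: what the IDL attribute controls in practice.
    struct PortedInterface {
        String title() const;           // new AK String, now the default
    };

    struct UnportedInterface {
        DeprecatedString title() const; // still to be ported; opts out in its IDL file
    };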
The intent is to use these to autogenerate prototype declarations for
the Window and WorkerGlobalScope classes.
And the spec links are just nice to have :^)
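Something like the following is the kind of output meant here
(schematic; the member names and exact format are illustrative):

    // Schematic of the intended generated declarations, with the spec
    // link for each member attached as a comment:
    class Window {
    public:
        // https://html.spec.whatwg.org/#dom-window-focus
        void focus();

        // https://html.spec.whatwg.org/#dom-window-blur
        void blur();
    };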