mirror of https://github.com/RGBCube/serenity synced 2025-06-01 10:08:10 +00:00

LibWeb: Use UTF-16 code unit offsets in Range::to_string

As with an earlier problem in CharacterData, we were assuming that the
Range offsets were raw UTF-8 byte offsets into the data, rather than
UTF-16 code unit offsets. Fix this by using the substring helpers in
CharacterData to get the text data from the Range.

There are more instances of this issue around the place that we will
need to track down and add tests for, but this fixes one of them :^)

For the test included in this commit, we were previously returning:

llo💨😮

Instead of the expected:

llo💨😮 Wo
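
The difference between the two outputs can be reproduced outside the
browser. The following is a minimal sketch, not the LibWeb code: the
helper names sliceByUtf16CodeUnits and sliceByUtf8Bytes are hypothetical,
and the byte-based slice only imitates the old behaviour under the
assumption that offsets were applied directly to UTF-8 data.

```javascript
// Hypothetical helpers for illustration only; these are not LibWeb APIs.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Correct interpretation: DOM Range offsets count UTF-16 code units,
// which is also how JavaScript strings are indexed.
function sliceByUtf16CodeUnits(data, start, end) {
    return data.substring(start, end);
}

// Assumed model of the bug: the same offsets applied to UTF-8 bytes.
function sliceByUtf8Bytes(data, start, end) {
    const bytes = encoder.encode(data);
    return decoder.decode(bytes.subarray(start, end));
}

// The two text nodes from the test: range.setStart(hello, 2) and
// range.setEnd(world, 5).
const hello = "Hello💨";
const world = "😮 World";

const expected =
    sliceByUtf16CodeUnits(hello, 2, hello.length) +
    sliceByUtf16CodeUnits(world, 0, 5);

const buggy =
    sliceByUtf8Bytes(hello, 2, encoder.encode(hello).length) +
    sliceByUtf8Bytes(world, 0, 5);

console.log(JSON.stringify(expected));
console.log(JSON.stringify(buggy));
```

Each emoji here is two UTF-16 code units but four UTF-8 bytes, so an end
offset of 5 reaches "Wo" when counted in code units but stops just after
the space when counted in bytes.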
Shannon Booth 2024-01-04 10:27:25 +13:00 committed by Andreas Kling
parent ee431e6911
commit e9dfa61588
3 changed files with 28 additions and 6 deletions


@@ -0,0 +1,14 @@
<body><p id="p1"><b>Hello💨</b>😮 World</p>
<script src="../include.js"></script>
<script>
    test(() => {
        const p1 = document.getElementById("p1");
        const hello = p1.firstChild.firstChild;
        const world = p1.lastChild;
        const range = document.createRange();
        range.setStart(hello, 2);
        range.setEnd(world, 5);
        println('');
        println(range.toString());
    });
</script>