wc: Do a chunked read with proper UTF-8 handling

This brings the results mostly in line with GNU wc and solves nasty behavior with long lines.
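
A minimal sketch of the idea in the subject line, not the code from this commit: read fixed-size chunks and carry an incomplete UTF-8 sequence over to the next chunk so that a character split across a chunk boundary is still counted once. The function name `count_lines_and_chars` and the `chunk_size` parameter are hypothetical, and skipping invalid bytes without counting them is an assumption of this sketch.

```rust
use std::io::{self, Read};
use std::str;

/// Add the newlines and characters of a valid UTF-8 slice to the running totals.
fn tally(s: &str, lines: &mut usize, chars: &mut usize) {
    *lines += s.bytes().filter(|&b| b == b'\n').count();
    *chars += s.chars().count();
}

/// Hypothetical helper: returns (lines, chars) for `reader`, reading
/// `chunk_size` bytes at a time instead of buffering whole lines.
fn count_lines_and_chars<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<(usize, usize)> {
    // An incomplete UTF-8 sequence is at most 3 bytes long, so a buffer of
    // 4 bytes or more always leaves room to read past the carried-over bytes.
    assert!(chunk_size >= 4);
    let mut buf = vec![0u8; chunk_size];
    let mut pending = 0; // bytes of an incomplete sequence kept at the front of `buf`
    let (mut lines, mut chars) = (0, 0);

    loop {
        let n = reader.read(&mut buf[pending..])?;
        if n == 0 {
            // End of input. A truncated trailing sequence is not counted (assumption).
            return Ok((lines, chars));
        }
        let filled = pending + n;
        pending = 0;
        let mut pos = 0;

        while pos < filled {
            match str::from_utf8(&buf[pos..filled]) {
                Ok(s) => {
                    tally(s, &mut lines, &mut chars);
                    pos = filled;
                }
                Err(e) => {
                    // Count the valid prefix first.
                    let valid = e.valid_up_to();
                    let s = str::from_utf8(&buf[pos..pos + valid]).expect("prefix is valid");
                    tally(s, &mut lines, &mut chars);
                    pos += valid;
                    match e.error_len() {
                        // Invalid bytes: skip them without counting (assumption).
                        Some(bad) => pos += bad,
                        // Sequence cut off by the chunk boundary: move it to the
                        // front of the buffer and finish it with the next read.
                        None => {
                            buf.copy_within(pos..filled, 0);
                            pending = filled - pos;
                            break;
                        }
                    }
                }
            }
        }
    }
}
```

Because memory use is bounded by `chunk_size` rather than by line length, a very long line costs no more than a short one. Any `Read` implementor works as input, for example `count_lines_and_chars(std::io::stdin().lock(), 64 * 1024)`.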
2025-07-28 03:27:44 +00:00 · 2021-08-25 13:26:44 +02:00
commit 6f7d740592
parent 48437fc49d
8 changed files with 105 additions and 138 deletions
--- a/tests/fixtures/wc/UTF_8_weirdchars.txt
+++ b/tests/fixtures/wc/UTF_8_weirdchars.txt
@@ -0,0 +1,25 @@
+zero-width space inbetween these: xx
+and inbetween two spaces: [  ]
+and at the end of the line: 
+
+non-breaking space: x x [   ]  
+
+simple unicode: xµx [ µ ] µ
+
+wide: xｗx [ ｗ ] ｗ
+
+simple emoji: x👩x [ 👩 ] 👩
+
+complex emoji: x👩‍🔬x [ 👩‍🔬 ] 👩‍🔬
+
+Ｈｅｌｌｏ, ｗｏｒｌｄ!
+
+line feed: xx [  ] 
+
+vertical tab: xx [  ] 
+
+horizontal tab: x	x [ 	 ]
+this should be the longest line:
+1234567	12345678	123456781234567812345678
+
+Control character: xx [  ]