String or maybe a special Utf8View struct that is aware of multi-byte chars and graphemes #193
Locked
mzaks started this conversation in Show and tell
Replies: 0 comments
I wrote a blog post where I discuss a possible way of counting chars in a String.
Counting chars with SIMD in Mojo
From my experience implementing big data and ML pipelines, users very often need to truncate strings by byte size, and sadly they often do it wrong: they produce corrupt strings by cutting through a multi-byte character, or semantically wrong strings by cutting through a grapheme cluster. In my opinion, it would be nice if Mojo's String provided a method for proper truncation, plus the number of chars and the number of graphemes in a string.
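To make the problem concrete, here is a minimal sketch in Python (not Mojo) of byte-limited truncation that never splits a UTF-8 code point, plus a char count that relies on counting non-continuation bytes. The function names `count_chars` and `truncate_utf8` are hypothetical, chosen for illustration only, and are not part of any Mojo or Python API. The sketch assumes valid UTF-8 input and works at code point level only; grapheme-aware truncation would additionally need the Unicode segmentation rules from UAX #29, which the Python standard library does not provide.

```python
def count_chars(data: bytes) -> int:
    """Count code points in valid UTF-8 by counting non-continuation bytes.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new code point.
    """
    return sum((b & 0xC0) != 0x80 for b in data)


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Return a prefix of `text` that fits in `max_bytes` UTF-8 bytes.

    Hypothetical helper: it never cuts inside a code point, but it can
    still cut inside a grapheme cluster.
    """
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    end = max(max_bytes, 0)
    # If the first dropped byte is a continuation byte, the cut would land
    # inside a code point, so back up to the start of that code point.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end].decode("utf-8")


# "é" takes two bytes, so a 2-byte budget only has room for "h".
assert count_chars("héllo".encode("utf-8")) == 5
assert truncate_utf8("héllo", 2) == "h"
assert truncate_utf8("héllo", 3) == "hé"

# Thumbs-up plus skin-tone modifier: 8 bytes, 2 code points, 1 grapheme.
# The result is valid UTF-8, but the modifier is lost.
assert truncate_utf8("\U0001F44D\U0001F3FD", 5) == "\U0001F44D"
```

The last assertion shows why code point boundaries are not enough: the truncated string is well-formed, yet the grapheme cluster has been cut in half, which is the semantically wrong result described above.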