-
Notifications
You must be signed in to change notification settings - Fork 325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use HarfBuzz's advances instead of FreeType's #639
base: master
Are you sure you want to change the base?
Use HarfBuzz's advances instead of FreeType's #639
Conversation
@@ -98,7 +96,7 @@ const FallbackFontGlyphMap& FontFaceHandleHarfBuzz::GetFallbackGlyphs() const | |||
} | |||
|
|||
int FontFaceHandleHarfBuzz::GetStringWidth(StringView string, const TextShapingContext& text_shaping_context, | |||
const LanguageDataMap& registered_languages, Character prior_character) | |||
const LanguageDataMap& registered_languages, Character /*prior_character*/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to note is that this change will make the prior_character
parameter unused.
Originally, this character was used to calculate any extra kerning that may have existed prior to this string being rendered (for example, in cases where a much larger, continuous string was split into several renders).
One possible way to reïmplement this would be to create a new string that is a combination of the prior character and the first character of the current string. We could shape this new string and then query the offsets of the second shaped glyph (if it exists) and use those offsets when positioning the first character. I'll leave it up to you if this is something you want to implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might cause some issues in the text inputs especially. When we select text there, we break a sentence up into pre-selection, selection, and post-selection parts. If the kerning between these parts are not the same when changing the selection, then the text will move around a bit.
This may also affect where we decide to place the text cursor from a mouse position. And also for breaking up text in normal text flow, there might be some differences.
When glancing over the harfbuzz API, I think I saw some of them taking whole strings for prior text. This is probably something that would be a lot more accurate if we used this with harfbuzz, since harfbuzz can combine several characters (code points) into single glyphs.
I think this change as it stands is a bit problematic for the above reasons. Maybe it is better to use Freetype kerning until we have solved this issue properly? Maybe you could test it with a text area and a font with kerning, to see how problematic it is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent some time trying to resolve this but to no avail. I couldn't find any HarfBuzz functions that allow you to provide prior text or anything similar.
Firstly, I added the prior character to the shaping buffer before the rest of the text (and then ignored its advance when measuring). However, it became difficult to determine which shaped glyphs were part of the prior character and which were not. There was also the issue of ligatures. The font supports the ti
ligature (where they become a single glyph). If I were to have a prior character of t
and a first character of i
, HarfBuzz would shape them into the ti
ligature. Then, if I were to ignore measuring the first shaped glyph (which would normally correspond to the prior character), it would skip both the t
and i
entirely (since they are now a single glyph).
I also tried creating a second, smaller shaping buffer that only consisted of the prior character and first character of the string but ran into similar problems listed above.
I also reïnstated the FreeType kerning function and used that. It worked fine for the “LatoLatin” font (since it supports the kern
table), but this method won't work for fonts that do not include a kern
table (EB Garamond is one such font). It also has the same aforementioned ligature issues.
I did come across hb-font-get-glyph-kerning-for-direction
, but I looked at its source and it only returns legacy kerning values (such as those from the kern
table), making it functionally similar to FreeType's kerning functions.
I found this old issue that was posted by someone solving a similar problem. However, it seems that this is a much more complicated issue, and one that HarfBuzz doesn't seem to solve—some shaping results can require multiple characters of context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the one I believe I was thinking of: hb_buffer_add_codepoints()
(looks like this also applies to _add_utf8
):
When shaping part of a larger text (e.g. a run of text from a paragraph), instead of passing just the substring corresponding to the run, it is preferable to pass the whole paragraph and specify the run start and length as item_offset and item_length , respectively, to give HarfBuzz the full context to be able, for example, to do cross-run Arabic shaping or properly handle combining marks at stat of run.
I'm not sure if this is actually useful to help solve this problem though. It does indeed sound like a complicated problem. I came to remember this blog post about text shaping and ligatures, which seems like a similar problem: https://faultlore.com/blah/text-hates-you/#style-can-change-mid-ligature
Once we get to text editing, there is also the problem of placing the text cursor, and moving them properly between glyphs and such: https://lord.io/text-editing-hates-you-too/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah. That function is for adding text to a buffer. When it comes to shaping, you process the entire buffer in one go. One method to try could be to do this and then get the shaped glyphs for the segment we care about, but the main problem is actually identifying the start and end of said segment. Regardless, it seems as if we'll need more than just the prior character to assist herewith.
In the meantime, what would you like me to do with this pull request? I could reïmplement the original FreeType kerning functions and use the value it returns. It won't be perfect, but it'll provide a decent enough estimate for now. I have tried selecting text in input elements, and whilst the text does jump around a bit if you displace kerning or cut a ligature, it is only a few pixels, and I wouldn't say it's that noticeable unless you're specifically looking for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, on English the movement is relatively subtle, although not something I would use in production. I also tested on Hindi and Arabic, and they are very janky. I also tested this on master, and it's pretty much the same there. We don't really do anything to handle right-to-left, so selection on Arabic does weird things. And while Hindi at least works, it's moving about a lot. I guess the main reason is that we don't consider more than one character back. And also, we're allowed to move between code points that become merged glyphs, which allows you to corrupt these glyphs. So yeah, a lot of things to work on before this becomes very stable.
This PR definitely is the right direction. And considering it's not considerably worse, and it does look a lot better in normal rendering, I'm okay with merging this if you agree with that. I think people in the mean time will just have to be careful with input fields if they want to use this font engine. Basically, one would need to use a font without any kerning here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. I should have tested languages other than English in the input fields. I tried selecting Arabic and Hindi, and I can see what you mean by how janky they are. I also tried selecting these languages on the master branch, and these problems still occur—so this issue definitely transcends this pull request.
I'm fine with it being merged if you are, since most of these issues were happening before this PR. I'm now just thinking of ways to resolve this issue. The first option that comes to mind is to give the entire line to the font engine and then determine whole characters for selection using HarfBuzz cluster indices. However, determining position within a string using cluster indices is difficult, and this still has the issue of splitting ligatures (since even though they are two characters, they are a single cluster) and would require a lot of changes to RmlUi's internal text processing. We might end up requiring both the Unicode Text Segmentation algorithm and the Unicode Bidirectional Algorithm.
Thanks a lot for the pull request! It certainly makes a lot of sense to use harfbuzz for advances. So the contribution is very much welcome. I took some screenshots to compare the old kerning with Freetype (master) and Harfbuzz (this PR). Both with the harfbuzz font engine. The new kerning certainly looks a lot more compact. Aside from that, I think the overall spacing looks a bit better generally. But I wonder if it is supposed to be that compact? Maybe it is, it certainly looks better the more I look at it :) |
I think the more-compacted text from HarfBuzz is correct. I have compared it to a few browsers, and it matches their renders. In the FreeType render, you can see a few issues. For example, the spacing between the letters in the word “consectetur” on the first line is irregular. Likewise, the word “magna” at the end of the second line has its ‘g’ and ‘n‘ almost touching each other (which definitely shouldn't be happening). I looked at the internal tables of the “Poppins” font, and it turns out its |
Oh I see. Yeah, those two examples you bring up definitely look better on harfbuzz, and overall too. Not having any kerning on freetype at all explains the big difference, thanks for looking in to it. |
This pull request alters the HarfBuzz sample to use the shaped advances instead of FreeTypes. The main reason to do this is because it provides more accurate kerning.
OpenType has two main ways of providing kerning information: either with a simple
kern
table or with the more modernGPOS
feature.GPOS
is much more sophisticated and can contain context-dependent information (such as position in a string or locale data). Some fonts might support both, and some will only supportGPOS
.FreeType can only retrieve kerning from the
kern
table, whereäs HarfBuzz can retrieve kerning from either option (which it then adds to the advances), so it is more correct to use HarfBuzz's advances where possible.This PR also removes all the old FreeType kerning functions from the sample (which are no longer needed).