Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle composed characters in evil-range #1664

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

justbur
Copy link
Member

@justbur justbur commented Aug 22, 2022

This is a low-level fix for the composed characters problem (see #1656). The idea is to prevent evil-range from returning a range that lands inside a sequence of composed characters. I think this makes sense, because evil-range already takes care to return a valid range through calling evil-normalize-position, and having the range (beg end) start or end inside a sequence of hidden characters, seems like it should be considered invalid.

If this seems too aggressive, I could see hiding this feature behind a flag.

Also added some tests.

A composed character is a way of visually replacing a string of characters with
a single glyph. If beg in the call to evil-range lies in this hidden sequence,
adjust the position to the beginning of the sequence. Similarly, make sure end
comes at the end of any such hidden sequence. The purpose is to prevent evil
from unintentionally modifying part of a hidden sequence of characters.
Account for beg > end case and case where beg or end are nil
@tomdl89
Copy link
Member

tomdl89 commented Aug 22, 2022

Looks fine to me, although I don't know much about composition. The tests look good, but maybe something using evil-test-buffer which fails without this and succeeds with it would illustrate why ending up inside a sequence of hidden characters is bad. My assumption is that selecting and e.g. deleting halfway through a composition would leave one of the components behind? And although I don't much about composition, I can see that wouldn't be desired behaviour.

@justbur
Copy link
Member Author

justbur commented Aug 22, 2022

Sure, I can add more tests.

FWIW, here's the basic idea. If you have a buffer with text like

\alpha

you can use prettify-symbols-mode which uses composition behind the scenes to visually make the string \alpha appear as

α

anywhere in the buffer. However, the change is only cosmetic and prior to this evil only "saw" and acted on the string \alpha (up to internal adjustments made by emacs). If you, for example, put the cursor in the first position and hit x in normal mode, evil would delete just the \ character, leaving

alpha

because after hitting x evil calculates a range of characters to delete and since it does not know about composition it just expands the range one character to the right and deletes the single \.

After this PR, evil-range will not stop inside the \alpha string because it is composed, and it will expand the range all the way to the end of \alpha deleting the entire composed character.

By the way, it is a little presumptuous to assume that the user wanted x to delete all of \alpha here, but I think it makes the most sense.

@tomdl89
Copy link
Member

tomdl89 commented Aug 22, 2022

Thanks for that great explanation. When I'm in emacs-state (or for that matter, insert-state, which doesn't map delete differently) and press delete before \alpha I get alpha so presumably emacs devs have decided the current behaviour is desirable? What I'm getting at here is: is this an evil feature, or is it a package which allows easier manipulation of composed text? Does vim have an equivalent of composition and handle it more like this PR? BTW I could still imagine this being behind a feature flag if the answer to those questions is no. I can see that it could be helpful :)

@justbur
Copy link
Member Author

justbur commented Aug 22, 2022

Well... it's complicated. I brought up x because I think it's the harder one to swallow. If you try to use a region to delete \ in emacs state, it won't let you select just the \ character because emacs always moves the point outside of these composed sequences in the command loop. In other words, emacs never lets you move point inside one of these sequences. So in that sense, emacs treats them as single units. However, delete-char does not treat the whole character sequence as a single unit, like you saw. On the evil side, everything is kind of a region behind the scenes. Even when you ask to delete a character with x it creates something analogous to a region behind the scenes, causing this change to evil-range to affect even x.

We could try to stay more faithful to the emacs behavior and only make the required changes for regions, but I'm not sure if that is the right answer here. I found this plugin for vim, but it opens up all of the composed characters on the line, so I can't check the behavior there.

I personally like the behavior in this PR, because it lets me pretend that all of these latex macros are just single unicode characters, but hiding the behavior behind a flag would be fine with me. One issue with the flag is that it's going to be hard to make it obvious what it means to an average user.

@tomdl89
Copy link
Member

tomdl89 commented Aug 22, 2022

Cool - all understood. I think flags which are hard to understand to the average user are fine as long as they are off by default and have a good doc-string.
I just pulled your branch and turned on prettify-symbols-mode and noticed that when I press backspace in insert mode after a λ (which is pretified lambda in elisp files) it sometimes ends up looking like lamb|a (where | is the cursor). To repro: If I do vy on a lambda, and paste elsewhere, the cursor ends up after it (whereas usually the cursor would end up on it). If I press i to enter insert mode and then backspace, I end up in that situation. If I press h instead (in normal mode) the cursor doesn't appear to move until I've pressed it the number of letters in the actual word. None of these things happen in master currently. I hate to be a downer, but I think these things need fixing before we merge (and probably a fair bit more manual playing around, plus tests). lmk what you think.

@justbur
Copy link
Member Author

justbur commented Aug 23, 2022

Thanks for checking. I'm happy to tweak or improve as necessary. I also think hiding it behind a flag makes sense for now. I did try to reproduce what you were talking about, but had some difficulty.

To repro: If I do vy on a lambda, and paste elsewhere, the cursor ends up after it (whereas usually the cursor would end up on it). If I press i to enter insert mode and then backspace, I end up in that situation.

What seems to be happening is that after the paste on a new line the cursor is in the last position of the line. evil then adjusts the cursor back one position, which is inside the composed lambda, and emacs never readjusts it outside of the lambda.

I hate to be a downer, but I think these things need fixing before we merge (and probably a fair bit more manual playing around, plus tests). lmk what you think.

No problem. I'm also going to put it behind a flag that makes it and experimental feature.

Allows user to disable special handling of composed characters. See the
docstring for more information.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants