Call for possible collaboration #2
Goal overview of inputmangler
Working in current version
Planned
Nice to have if feasible
Frame
|
Have you considered contributing to the existing ones? I'll probably invest my time into maintaining the 2.0 (beta right now) release coming in February |
I am still considering that, and it's part of what this thread is about. Project descriptions tell something about what the current state is, but little about what is planned. Having everyone summarize their goals and priorities would help a lot to clear this up. |
Ok. For input-remapper, the current goal is to finish 2.0 up. The current work on that is happening on the beta branch: https://github.com/sezanzeb/input-remapper/tree/beta. After that, pretty much any input can be mapped to anything, for example mouse movements to joysticks. It will feature an overhaul of the GUI to support all that without editing configs. After the release, people might discover bugs, since a lot of new stuff will be released. See sezanzeb/input-remapper#177 for information about contributing, and https://github.com/sezanzeb/input-remapper/blob/beta/readme/development.md for some technical details. Works:
If input-remapper doesn't work, then it is usually because something is fundamentally broken or impossible as of now. But it seems to be quite stable during operation. There are tons of automated tests.
I like to think it is
input-remapper will probably never support anything other than Linux. Somewhat works:
Via third party software: https://github.com/DreadPirateLynx/input-remapper-xautopresets. This needs to be individual for each Desktop Environment. There is no solution that works for all Wayland DEs. It's easy to do in X11 apparently.
This works for X11, GNOME on Wayland, and Plasma on Wayland, but other DEs that run on Wayland may not support it properly. Input-remapper has to rely on using
Not causing any issues, but CPU usage can go up to 5% on my computer during usage (on a single core). input-remapper-service has never been profiled properly, there might be potential for optimization.
Key logging is possible for a few minutes while the GUI is open. There is no way around that, because information has to go from a privileged service to the unprivileged GUI via a pipe in order to record input. Other than that, I don't think input-remapper leaks input anywhere during normal operation. Doesn't work:
Because the daemon runs as root, mappings that trigger commands are a security problem, and sandboxing the daemon properly without causing problems is challenging. I'd like to avoid those things. Running external commands is often possible via the DE's settings, which is probably sufficient for most users.
Updating the config is done via the GUI, which just writes to a JSON file |
Thanks for the info :)
I tried your beta and think the UI has a solid concept and is well done (although some polish is needed - which is to be expected in beta). I find defining output combos difficult though. Autocompletion is a great idea, but recording the output sequence should be better in most cases.
For comparison, I tried it on my computer and it goes up to 9 % for mouse movement and 12 % with a SpaceMouse, compared to 1.3 % / 2 % with inputmangler, so there is clearly room for improvement. |
Yes, there is not much difference with PyPy once the JIT compilation has started optimizing it
Have you seen the information on the bottom of the output editor? It was added for that purpose. If there is no device available to record the output the user wants, it might get difficult to set certain mappings. |
You are very welcome to tweak it in Glade and to make a PR
and also to create a new issue to discuss this. Showing how the GUI would have to change, and explaining how the workflow of recording input would change would be helpful there :) |
My two cents: I think that the hardest part of a keymapper project is actually not the implementation, but the design. If the user wanted to have full control over how input maps to output, then there is already python-evdev for that. The disadvantage of python-evdev is that it requires some boilerplate and it is difficult to write scripts that do not suffer from many different edge cases. (Particularly, before I started on evsieve, I had about two dozen Python scripts for different things, and regularly observed that writing a script that did thing A was relatively simple, writing a script that did thing B was relatively simple, but writing one that did both A and B was really difficult due to edge cases introduced by their interaction. Relatedly, the big time sink for adding new features to evsieve is not figuring out some way to implement it, but deciding on how that feature should interact with the other features that are already there, and figuring out what would be the most sensible behaviour for every edge case that could come up.)

Several projects have started to search for a higher-level way to describe ways to transform events. These higher-level configurations tend to make simple things easier but difficult things harder or impossible. The big question is how flexible you want your configuration language to be: if your configuration is too simple, many things users might want become impossible. If it is too complex, it ceases to offer much advantage over just writing a python-evdev script. Many different projects have struck the balance between simplicity of configuration and versatility at different points.

Before you start working on implementation details, I think it is important to first of all figure out exactly which kinds of transformations you intend to support and how you intend to present that in a user-friendly way to the user. Since we can't beat python-evdev on versatility, we need to beat it on ease-of-use and user-friendliness. Having a user-friendly user interface for your targeted level of versatility is where the value of keymapping programs lies. In particular,
I think that whether, how, and which "complex input" you intend to support—along with how you intend to present that configurability to the user—is a fundamental question that needs to be considered before anything else, rather than treated as an afterthought. The answer to this question will impact just about every other part of the development process. |
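To make the boilerplate point above concrete, here is a minimal python-evdev sketch of even a one-rule remap (capslock to escape); the device path is a placeholder:

```python
import evdev
from evdev import ecodes

# Grab the real keyboard so the rest of the system stops seeing its events,
# and clone its capabilities into a new virtual output device.
kbd = evdev.InputDevice("/dev/input/event3")  # placeholder path
kbd.grab()
out = evdev.UInput.from_device(kbd, name="remapped-keyboard")

for ev in kbd.read_loop():
    # Rewrite capslock press/release/repeat into escape; pass the rest through.
    if ev.type == ecodes.EV_KEY and ev.code == ecodes.KEY_CAPSLOCK:
        ev.code = ecodes.KEY_ESC
    out.write_event(ev)
    out.syn()
```

Even this tiny script already has edge cases (keys held down before it started, crashing while a key is logically pressed, and so on), which is exactly the interaction problem described above.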
In our case it would be the "mapping handler" architecture, which is like a pipeline composed of multiple handlers that can do different things. As far as we know, it is finished on the beta branch. We'll have to wait and see if someone raises issues about certain things not being possible. It allows, for example, combining mouse movements with button clicks to produce some other output. |
Learning GTK / Glade is out of scope for me, but I'll create issues so you know what I mean.
My plans for the UI are incomplete, but it will involve a TreeView to represent the hierarchy (group -> [subgroups ..] -> window -> title), while mappings for all devices for that preset should be visible in the same view. Speaking of UI - I'm currently learning QML/Kirigami and have plenty of experience with the rest of Qt (mostly in C++ though), so I might help a bit with your Qt port. More by answering questions though, as it's not a high priority for me right now.
Yeah, I totally agree. One thing I'd like to explore here is which projects have (partially) compatible designs.
Yep. But don't forget about performance. I don't like to have tools running in the background that use up more resources than they need. Do any of you have detailed documentation describing your project's design? |
If I were to create Hawck from scratch, this is the architecture I'd probably go with: a single input-capture-redirect service with a small custom sandboxed VM, which runs as a user with access to input, and accepts

The system should have access to not just keyboard/mouse/controller input but also many xdg-desktop-portal extensions, preferably the portable ones, and should include some wm-specific functionality that doesn't exist portably for Wayland compositors right now (like the currently focused window). Also a random number generator, tty-detection, open-in-browser, etc.

As for launching things, I think we could provide functionality for launching

Then any GUI-based thing can just talk to this input service, and it should be flexible enough to do whatever one of those GUIs might want to do, and any text-based system can be compiled to the VM's bytecode.

I've been thinking about building this service just for fun, but it has ended up on the back-burner for a while because low-level Linux input stuff can be kinda frustrating due to a lack of documentation in a few areas. If anyone else thinks this is a good idea, I'd write a spec for this architecture for reuse in other projects.

Of course, 99.9% of users are looking for one of a few select specific things like replacing caps-lock with ctrl/escape, but I still think a highly generic but safe and fast keyboard remapping system is a nice-to-have for the platform. |
I started work on the InputRemapper beta branch a year ago in order to solve my personal needs (using a 3DConnexion SpaceMouse as a joystick), which somehow escalated into reinventing the whole architecture. That pretty much confirms the concerns raised by @KarsMulder:
That said, I think the current approach can accomplish almost any reasonable remapping (mouse/joystick -> keyboard and mouse <-> joystick), with support for combinations in each case, plus macro support to generate complex input sequences (I think it is possible to make keyboard -> joystick/mouse mappings with macros). There are some limitations:
In general I think it is quite possible to design a common service which is simple to use for simple tasks, e.g. remapping n inputs to one output, but which also provides an API for user scripts and more complex behavior. Implementing a good
will make it possible to develop different GUIs or simple scripts which may or may not maintain their own configurations and translate them for the service. |
I sometimes wonder if this limitation can be avoided. Soon mappings will hold the information of their source device, so we could as well just record from all devices at once I guess. Idk.
For performance, if there really are no good optimizations possible, I'd not be very opposed to translating everything to a different language. It probably doesn't matter which one, because Python is pretty much one of the slowest widespread languages. Translating the tests could be a bit tricky sometimes, but they cover a lot of edge cases and past bug reports, so it would be really nice to be able to keep them. But anyway, if someone could do some profiling, that would be great.
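For anyone picking this up, one low-effort way to get a first profile of input-remapper-service is the standard-library profiler; `run_service()` below is a hypothetical stand-in for the real entry point:

```python
import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
run_service()  # hypothetical: run the injection loop under load for a while
profiler.disable()

# Print the 20 most expensive call sites by cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)
```

py-spy can also attach to the already-running service without any code changes, which may be even easier.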
Also see sezanzeb/input-remapper#500. I thought Lua doesn't require a VM to sandbox it, or does it? |
I do not have such documentation written other than the comments interspersed through the source code, but I can give a quick rundown of the major parts:

The input system
I have benchmarked epoll vs poll and was not able to find any measurable difference in performance. I have not benchmarked how the performance would compare against using LIBEVDEV_READ_FLAG_BLOCKING. I wasn't even aware that was possible when I started writing, and at this point it would be too much hassle to implement it.

Argument parsing

Event propagation
At first glance, you may think that this use of out-pointers looks like a bad practice that originates from the time of C, and modern programs should just return

That said, in hindsight I think that processing multiple events at the same time was a bad design decision that is making some new arguments (most importantly,

The

The output system
(Also, if an input device marked with

Threading structure
If needed, some additional background threads may be spawned to do tasks that I do not want to delay event handling (i.e. garbage-collecting subprocesses that were spawned using

The code is written synchronously (i.e. without using the async feature), for two reasons: (1) in a previous development version that was based on python-evdev before I rewrote it in Rust, I found that using epoll to wait for events had half the latency of using Python's async, and (2) at the time, I heard that the Rust async ecosystem still had several rough edges. I have not benchmarked whether the Rust async feature has the same performance overhead as the Python one, but in the end I think it was the right decision to write synchronous code, because I cannot imagine the codebase becoming cleaner if async was involved. |
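For readers unfamiliar with the epoll pattern mentioned above: this is what an equivalent multi-device wait looks like in Python (the `selectors` module uses epoll on Linux; evsieve itself does this in Rust):

```python
import selectors
import evdev

# Open every available event device and wait for events on all of them at once.
devices = [evdev.InputDevice(path) for path in evdev.list_devices()]
selector = selectors.DefaultSelector()
for dev in devices:
    selector.register(dev, selectors.EVENT_READ)

while True:
    for key, _mask in selector.select():
        dev = key.fileobj
        for event in dev.read():  # drain the events that are ready
            print(dev.path, evdev.categorize(event))
```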
Hm.. I sense a wide agreement on Rust - no real surprise here :) Maybe I should write a bit about inputmangler's current architecture (which isn't exactly how I would do it now):
Things I would like to change:
Things I would like to keep:
@snyball
I assume you mean that the user process passes code to the service to execute on a given event, which is then done there. Not that events are passed from system space to user space, which then sends something back to system space. Right?
I sometimes wonder how many people use these things for gaming, compared to those who use them for their intended purpose of 3D-Modelling..
Inputmangler has the same problem. Doing this per event has worked perfectly fine for me for a long time. But recently mouse wheels are sending normal wheel events alongside hi-resolution events, causing double scrolling events. @KarsMulder
I wonder if libevdev causes any measurable overhead compared to direct ioctl calls / device read. This would be an interesting thing to profile. |
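One crude way to approach that comparison from the Python side: time raw reads of `struct input_event` straight from the device node, then run the same loop through whatever wrapper is under test. A sketch (the device path is a placeholder; Python overhead will dominate, so treat the numbers as relative, not absolute):

```python
import os
import struct
import time

EVENT_FMT = "llHHi"                      # struct input_event on 64-bit Linux
EVENT_SIZE = struct.calcsize(EVENT_FMT)  # 24 bytes

fd = os.open("/dev/input/event3", os.O_RDONLY)  # placeholder path

start, count = time.monotonic(), 0
while count < 10_000:  # generate steady input meanwhile, e.g. wiggle the mouse
    data = os.read(fd, EVENT_SIZE)
    _sec, _usec, etype, code, value = struct.unpack(EVENT_FMT, data)
    count += 1

print(f"{count / (time.monotonic() - start):.0f} events/s via raw read()")
```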
Since a lot of relevant people are here, this might be a good place to discuss this. I'd agree with the above quoted assertion that 99.9% of users just want to do that one thing and be done with it, but that's because some 25% of all users, who need more (there's a lot of us crippled dudes around), just can't use Linux, so they don't. Back in Windows-land, it's not even a bat of an eyelid to be running 5 or 6 input handling tools like this simultaneously. Nobody talks about it because it's normal. In Linux-land, nobody talks about it because it's impossible. I mistakenly thought that problem had been solved, moved back to Linux, and found out I was wrong. It's a physically painful mistake, but I'm too far in to go back to Windows now, so I want to do what I can to get this sorted.

Since X doesn't support a lot of video features I need, I had been waiting for Wayland tools to mature, so that I could do all of the input mangling I need, which I could do in Windows. I kept my ear to the ground, and over time I heard about many new projects which were Wayland-compatible replacements for existing X tools which I used to use in Linux. xdotool gave birth to ydotool, some KWin shortcuts features offered keybinding ability to run scripts (that's AutoHotKey taken care of) and finally, the most important one for me, mouse-actions came along to replace easystroke(X)/StrokesPlus(Windows) for mouse gestures. So, I figured it was a safe time to jump ship back to Linux (I can't stand Windows, so this was exciting for me!)

I need to rebind and disable keys and key-combos, bind key combos to external commands, adjust analog input (joystick) sensitivity curves, re-map mouse buttons, map foot switches to scripts, and mouse gestures are an absolute MUST. Why? Because I'm physically disabled. So all these accessibility tools aren't just 'nice-to-haves', they're 'must-haves'. And each of the presently available tools on Linux/Wayland works fantastically. But once I tried to use more than one, I hit a wall, and it's a hard one. While everyone was talking about how the lack of Wayland replacements for classic tools like xdotool had been solved, nobody was talking about the fact that you can't use them all. Only one.

Pretty much (actually I think it's literally) every Wayland input device handler takes the same approach - go a layer lower in the stack than X11 did, and exclusively grab the evdev devices. It's a simple solution to the problem, but short-sighted in that it means you get to choose one and only one accessibility tool, because one effectively locks out all the others. It doesn't seem practical or realistic that any single tool should be the all-singing all-dancing solution to every input device accessibility requirement, so the thing that is really needed from all of you tagged in this thread is to find a way to get your tools to play nicely together.

I'm not really sure of the right way to go about resolving this issue, but I am sure that it means that, at least in its present form, Wayland is an accessibility failure from the get-go. And I see a lot of people who should be involved in a conversation about this in this one thread, so I'd be interested to hear your thoughts. Because if you're going to discuss collaboration, this is the first thing that needs to be addressed.
None of you could be expected to write a single tool that does everything, nor should the user be limited to that one tool, so finding a way to make them all work simultaneously is step 1 in collaborating (I mean, the word 'collaborating' literally means 'working together' and most of your apps won't work together 😄 ) Since it's been almost a year, I'll do that ping again, apologies if this causes you any consternation: @sezanzeb @samvel1024 @shiro @rvaiya @snyball @KarsMulder @jersou

Speaking in terms of the solution to this problem... it strikes me that what's required here is a new layer between evdev and these applications, which would exclusively grab the evdev device as these apps do, and then allow these 'client' applications, rather than exclusively locking the devices, to subscribe to callbacks from the intermediary layer to handle input events; thus allowing a single input event to be handled by multiple applications. Perhaps there's a better way to deal with it, which is why I'm asking you for your thoughts. |
I like this idea. This avoids grabbing, while still allowing applications to hide events from the desktop environment. This way, multiple mapping tools can map the same event to whatever they like. Those new pipes that applications read from could be compatible with tools like python-evdev by behaving exactly like uinputs/InputDevices; they are just at a different path, and they ignore requests for grabbing, allowing existing mapping tools to continue to work as long as they discover those new devnodes. The new layer has to wait for each mapping tool to report that it is done handling the event, and only then decide if the event should be forwarded or not. It won't forward it if one of the tools reported that it is suppressed. If a service/layer like this is written, then please
|
Would like to see someone make a proof of concept for this to test performance, lots of piping/polling going on, not sure how much latency this adds. Maybe a wayland protocol would be a good place to put this, not sure if gnome/kde would pick it up though. |
Given that no one has ever complained about input-remapper having too much latency, even though it's written in Python and has never seen any sort of optimization, I doubt it will be significant. But that is just my gut feeling. |
I had to add to the proposal above that the new layer has to keep track of the tools that are reading, in order to wait for each one of them to finish processing and thus know whether the event is suppressed. I don't know if this is possible. Do owners of pipes have a way of knowing which processes are reading from them? |
I think it is not possible to accomplish the above with just pipes, because anything you write to a pipe can only be read by a single process anyway. Those "new readable pipes" would have to become Unix domain sockets instead. With sockets, it also becomes possible to track which processes are listening as a nice side-effect.
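A toy sketch of the socket variant, to make the tracking concrete: the broker fans each event out to every connected mapper and waits for a pass/drop verdict from each before forwarding. The path, the protocol, and the message format here are all invented for illustration:

```python
import os
import socket

SOCKET_PATH = "/run/input-broker.sock"  # invented path

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
if os.path.exists(SOCKET_PATH):
    os.unlink(SOCKET_PATH)
server.bind(SOCKET_PATH)
server.listen()

clients = []  # one connection per mapping tool

def dispatch(event_bytes):
    """Forward one event to every mapper; return True if it may pass through."""
    verdicts = []
    for conn in list(clients):
        try:
            conn.sendall(event_bytes)
            verdicts.append(conn.recv(4))  # each tool answers b"pass" or b"drop"
        except OSError:
            clients.remove(conn)  # a broken connection tells us a reader left
    return all(v == b"pass" for v in verdicts)
```

Unlike a pipe, the per-client connection also answers the question above: the broker knows exactly which processes are listening, and a disconnect is immediately visible.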
I haven't tried implementing the proposed scheme, but based on how quickly I've managed to get event round-trips to work in evsieve, I'd expect a latency of ~0.15ms when zero input remappers are in use (one round-trip; I assume that most of that latency comes from waiting for the scheduler to give the program some CPU time), and at least ~0.45ms when one or more input remappers are in use (which involves three round trips.) An inefficient Python implementation takes ~0.5ms for a single round-trip. Assuming you're gaming on a cutting-edge 240 Hz monitor, a latency of 0.45ms would mean that there is about an 11% chance that an input event gets delayed a single frame. Which is an acceptable delay in case you're actually using remappers.

For users not using remappers, I can however imagine that any scheme that proposes adding 0.15ms of latency to Wayland as a background service would receive more flak than dbus. Some people still don't accept that dbus adds enough value to be worth the couple of megabytes of memory it uses. If we want to go with the above scheme, I think it would greatly help adoption if it was a dynamic service that could be started on-demand when the first program needs it, rather than something the operating system is expected to keep alive whether the user wants it or not.

The protocol

Of course, libevdev (and python-evdev?) would need patches to be able to work on those sockets. I personally think that the evdev protocol is a bit painful to work with. However, we must remember that the evdev protocol has been crafted by kernel developers who have seen every single crazy input device hardware manufacturers have devised, and the evdev protocol has stood the test of time for quite a while now. I would be skeptical about proposals to replace the input stack that the kernel has built up with a new protocol in a userspace daemon just to make keymapping possible.

Alternative solution: can't we solve this in the kernel instead?

It is easy to jump to the idea of writing another userspace daemon because you can "just do" it and it does not require anyone's approval, but I wonder if our effort is better spent submitting patches to the kernel instead? So far, a lot of event mappers for Wayland have decided that grabbing event devices and creating new event devices is a good idea. However, we're discussing creating an abstraction layer over them. This makes sense because there are several drawbacks to the approach of creating new event devices. From the top of my head, the big pain points are:
The kernel folks have already been kind to the keymapping community by giving us tools like uinput and grabbing event devices. And looking at the above list, I think all except the last pain point could be fixed if we had an additional ioctl (say, EVIOCSHADOW) which did the following:
In other words, a kernel ioctl that makes it possible for a program to change the events on an event device without the rest of the system having to notice that event devices are getting created, grabbed, or destroyed. It would solve issues 1, 2, and 3. Issue 4 would remain; solving it would require some kind of extension to the evdev protocol to allow devices to change their capabilities, but that might run into backwards compatibility issues. Issue #5 would remain as well, but is more of a theoretical issue since most computers are single-user nowadays. This would also make it easily possible to run 5 or 6 input handling tools simultaneously, since each tool can shadow the input device that was already shadowed by the previous tool, without those tools even needing to be aware that other tools are running as well. Thinking about it, most of the pain points related to grabbing event devices for keymapping stem from the newly created uinput device being a distinct entity from the original device. If we could get a new ioctl that would allow us to sidestep that issue, about 60% of our problems would be solved without requiring a new userspace daemon. |
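To make the proposal concrete, usage could look something like the sketch below. To be clear: EVIOCSHADOW does not exist, its request number here is made up, and the assumption that the ioctl returns a file descriptor for the shadow device is speculation based on the description above:

```python
import fcntl
import os

fd = os.open("/dev/input/event3", os.O_RDWR)  # keyboard-1 (placeholder path)

EVIOCSHADOW = 0x45900001  # invented request number for the proposed ioctl
shadow_fd = fcntl.ioctl(fd, EVIOCSHADOW)  # hypothetically returns fd for shadow-1

# From now on, hardware events would arrive only on shadow_fd, while anything
# we write() to fd appears on keyboard-1 for Wayland and everything else.
EVENT_SIZE = 24  # sizeof(struct input_event) on 64-bit Linux
while True:
    event = os.read(shadow_fd, EVENT_SIZE)
    os.write(fd, event)  # identity mapper: pass every event through unchanged
```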
I'm also worried that the whole extra layer might add too much latency and complexity. If we do that, it should ideally be optional in the sense that it's only used when there are actually multiple tools trying to grab the same device. My gut feeling says that the kernel approach is probably the better idea, but I have to think about that some more... As for issue number 4:
I don't think it's such a big problem. In the first version of inputMangler, I didn't know that uinput could do everything I needed, so I wrote my own kernel module which simply announced all the events that could make sense for that type of device - no matter whether those events were ever generated or not. SDL (and by extension Steam) seems to differentiate between joystick and controller by looking up the vendor/product id in a database first, then defaulting to controller if the device has 6 axes. So it might be good to convince the SDL devs to reserve certain ranges of product ids for vendor 0x0000 for certain types of virtual devices to prevent issues (I had enough of these with the SpaceMouse). Of course, if we were to actually make one backend service to handle all possible input transformations, which has great performance, and so on, all of that might not even be necessary... well... if it just were that easy...

Until any of this is implemented, maybe there is a workaround... the question is: @pallaswept: do you need to have the same events processed by multiple tools that grab a device? If e.g. you just need tool A to process mouse movements and tool B to process its buttons, this might be solvable by splitting the events into 2 virtual devices. I think evsieve is currently the only tool that supports this, so that would be the lowest layer. Then tool A could grab the move device and tool B the button device. There might be some problems with tools reading the virtual devices if all of them have the same vendor/product id, but uinput allows that to be changed. @KarsMulder does evsieve support setting those ids? It might also be necessary to unite those devices later, but I believe most of the tools here do that anyway. |
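For reference, the splitting idea itself is simple enough to sketch with python-evdev (paths and device names are invented; evsieve would do the same job from the command line):

```python
import evdev
from evdev import ecodes

# Grab the physical mouse and republish its movements and buttons as two
# separate virtual devices, so two different tools can each grab one.
mouse = evdev.InputDevice("/dev/input/event5")  # placeholder path
mouse.grab()

moves = evdev.UInput(
    {ecodes.EV_REL: [ecodes.REL_X, ecodes.REL_Y, ecodes.REL_WHEEL]},
    name="virtual-mouse-moves")
buttons = evdev.UInput(
    {ecodes.EV_KEY: [ecodes.BTN_LEFT, ecodes.BTN_RIGHT, ecodes.BTN_MIDDLE]},
    name="virtual-mouse-buttons")

for ev in mouse.read_loop():
    if ev.type == ecodes.EV_REL:
        moves.write_event(ev)
        moves.syn()
    elif ev.type == ecodes.EV_KEY:
        buttons.write_event(ev)
        buttons.syn()
```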
Thanks for the clarification
Couldn't get it to work for a stylus, but yeah, it can usually be figured out somehow. I wish it were determined by some sort of enum value reported by the device instead.
If events contain that enum, the system could decide to treat them as joystick movement, while ignoring any device capabilities, couldn't it?
@KarsMulder something like this? When the hardware reports "a", each shadowed device receives the event for "a", and each tool reads "a". What if each tool decides to not map this key and just forward it, will "aa" be written to "Keyboard"? |
Something like this. Imagine that the default setup is like this: (I suppose this is slightly oversimplified since the

A physical keyboard emits events to the kernel. The kernel sends those events to an event device keyboard-1. Wayland and other processes on the system can read those events.

Now suppose a program "Mapper #1" comes along which issues the hypothetical EVIOCSHADOW on keyboard-1. The kernel will then adjust the topology to become like the following:

The kernel stops writing the events from the physical keyboard to keyboard-1. Instead, it writes them to shadow-1, an event device that is only accessible to Mapper #1 and no other part of the system. Mapper #1 gets a file descriptor for shadow-1, but shadow-1 does not show up in

The program Mapper #1 can now read events from shadow-1 like it can read them from any other event device. If Mapper #1 does nothing, then no events get written to keyboard-1 and the whole system loses access to the keyboard just as if it had been grabbed. Mapper #1 can write events to keyboard-1 like it can write events to any other uinput device. The events it writes to keyboard-1 can be read by Wayland and so on.

The mapper scripts do not explicitly announce that they want to drop any particular event; events can simply be dropped as a consequence of a mapper script reading an event from a shadow device and then not writing that event to its output device.

This is basically the trick of "grab an input device and create another uinput device", except this whole process is invisible to Wayland. Wayland can just keep reading events from keyboard-1 as if nothing happened, whereas with the old method Wayland would have to notice that another input device was created and open it, without even knowing that this new input device was related to another device.

When another script, say Mapper #2, also issues EVIOCSHADOW on keyboard-1, the event chain becomes:

Just like the events from the keyboard got redirected to shadow-1 when Mapper #1 issued EVIOCSHADOW, a second invocation of EVIOCSHADOW causes the events that Mapper #1 writes to be redirected to shadow-2. This means that all events from the physical keyboard first pass through Mapper #1, then through Mapper #2, and finally back to Wayland and the rest of the system. |
It currently doesn't, because I actually wasn't even aware that event devices had vendor and product ids. I thought that was something that only existed at the USB-device level, but I guess I was wrong. It doesn't seem like a difficult feature to add. I'll get around to it when I figure out what the CLI arguments should be. (Should
An enum for the device type would be nice :) Hm.. that makes me wonder if we could speed up input events on linux by truncating the timestamp to the final 16 bits. I'm not sure the rest is really needed anyway.
This would also reduce the system's number of virtual devices. We would still need them for events that don't fit into existing ones. Some things to decide on:
I'd use |
Those sent by Mapper #1. The effect is the same as if Mapper #1 created a virtual device shadow-2 which was subsequently grabbed by Mapper #2.
The shadow-* devices all must have the same capabilities as the original keyboard-1 device. Assuming that keyboard-1 didn't just happen to have an integrated mouse, it is not possible for Mapper #1 to write mouse events to keyboard-1. When Mapper #2 starts and silently replaces keyboard-1 with shadow-2, this transition is supposed to be invisible to both Mapper #1 and to Wayland. As such, Mapper #1 can still not write mouse events to keyboard-1/shadow-2. It would be possible for Mapper #1 to shadow another mouse device (or create a new virtual one) and write mouse events to that device. Either way, Mapper #2 will not be able to observe any mouse events getting emitted by the keyboard device. If Mapper #2 does want to observe mouse events, it should listen to or shadow a mouse device as well.
I imagine that writing events to keyboard-1/shadow-2/whatever would follow the same rules as writing events to any other virtual device: events that do not match the capabilities of the virtual device get silently dropped. It is the job of the mapper script to ensure that it writes its events to devices that are capable of them. |
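That rule matches how uinput behaves today, which is easy to check with python-evdev: a virtual device only delivers what it advertised at creation time.

```python
import evdev
from evdev import ecodes

# Create a virtual device that only advertises a single key.
ui = evdev.UInput({ecodes.EV_KEY: [ecodes.KEY_A]}, name="capability-demo")

caps = ui.device.capabilities()  # what readers of the virtual device will see
assert ecodes.KEY_A in caps[ecodes.EV_KEY]

ui.write(ecodes.EV_KEY, ecodes.KEY_A, 1)  # delivered: matches capabilities
ui.write(ecodes.EV_REL, ecodes.REL_X, 5)  # dropped by the input core: EV_REL
ui.syn()                                  # was never advertised
```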
This thread is pure gold so far and I want to thank you all sincerely for your input. I hoped but never imagined I'd have such a positive response, thanks so much.
Yes, it's pretty frequent. Just to make matters worse, it's also fairly common to need to process the same event (say, pressing the ctrl key) by multiple tools, from multiple devices. Like say, maybe one day I can't use my left hand, so I'll rebind a mouse button to a ctrl key, and I'll need the footswitch to read the ctrl keypress regardless of where it came from - the keyboard, some other device (bluetooth keyboard), the re-bound mouse button, on-screen keyboard, etc. - to modify the footswitch's behaviour, or I might use that same ctrl key to modify the behaviour of some other keybind, in another tool. Just to give a curly example.

I'd echo everything that's been said ITT so far. A middle layer is the least reliant on outside support, but it does have shortcomings. I also feel like the kernel is the best place to be doing this, from a functional point of view. A Wayland protocol might also be as functional, but then there's a reliance on its implementation from every compositor, and it might take a very long time to become a reality, or just never happen. I do have similar fears about doing this in the kernel, though. I wonder how hard it would be to get the kernel maintainers interested in such a thing, enough that it could become a reality. Requiring a custom patched kernel would make it somewhat prohibitive. That being said, if it could be a kernel module, then that makes things a lot simpler for the end-user to implement.

And yeh, thanks again for this amazing conversation. Your input is priceless. Please forgive my lack of input, I'm mostly just trying to stay out of the way right now :) |
I did that at some point, more or less: https://github.com/sezanzeb/input-remapper/blob/xkb/keymapper/injection/xkb.py However xkbcommon/libxkbcommon#223 (comment)
Which might be impossible with mapper scripts that allow sophisticated user-defined custom scripts |
According to The Wayland Book [CC-BY-SA 4.0]:
So it seems like the protocol already expects clients to be able to deal with changes in the keymap. All we need is a new protocol to tell the compositor that a new keymap should be used. |
That would be nice. Simple Wayland input remapping utilities might already exist at this point if this were possible. Regardless of how more sophisticated mapping tools can be made to work with Wayland, the above might already be an improvement. So, should we ask Wayland developers to consider this? Mailing list, idk?
Other events like joysticks are probably not passing through libxkbcommon? Because maybe one could hook a mapping script into libxkbcommon somehow. Maybe libxkbcommon can be modified to provide something like a devnode for reading scancodes and writing characters. But it's probably synchronous and you can't really do anything funky with it (like writing multiple characters with delay), isn't it? |
According to the original draft image, the programs that are likely to change the keymap are also the programs that are likely to change the events themselves:

The programs that map the events need to be put in some order in a consistent way, lest there be only some chance that your whole setup works during any individual reboot. To that end, we need to figure out a way to order the event mapping programs, and that same ordering system will probably be reused for ordering the keymap changes when multiple programs want to modify the keymap. And, of course, we still need to figure out how to authenticate the programs that are or are not allowed to map events or change the keyboard layout. Wayland devs don't want programs running in a sandbox to become able to keylog the whole system just because they're allowed to display stuff on screen. The point is that the event mapping protocol and the keymap-changing protocol will probably end up both relying on some common basis. I think it is best to tackle the event-mapping problem and the keymap-changing problem at the same time, rather than to rush one part of the solution, only to later discover that it doesn't interact nicely with the other half of the solution. In case we end up giving up on finding an event-mapper protocol, then that may be a good time to propose an independent keymap-changing protocol. |
libxkbcommon's API involves the programmer passing individual scancodes to functions like |
I think the big question is where to inject Unicode. Kernel-level makes no sense to me; XKB-level/libinput might maybe work. Some things to think about:
I think what we maybe should do first is to find out how IMEs work and whether we can use that to write arbitrary characters.
I agree on kernel vs. Wayland, but after reading all the new stuff by @KarsMulder, I wonder if we might need to contact the libinput developers as well.
As far as I understand, the basic input stuff on Wayland is handled by libinput. Compositors are free to implement some more complicated stuff - but that would lead to fragmentation and because of that is probably better left to another library or the end-user applications. Otherwise some stuff will only work on certain DEs. There is one big exception though: libinput does not handle joysticks. Games (and possibly other applications) usually just grab joysticks, bypassing libinput/Wayland. For our purposes, I think libinput is a much better layer than Wayland to put a mapping API in. It also reads directly from evdev, so it might be the one place where pretty much anything an input mapper would do comes together.
I tried to find out how this works, and so far understand that there is a protocol called AT-SPI2 (https://www.freedesktop.org/wiki/Accessibility/) which seems to be implemented by Qt, GTK, and possibly other toolkits. When an application starts while some accessibility program is running (not sure how this is checked), that application exports a DBUS interface which exposes meta-information about the GUI as well as some ways to interact with it. See here (https://doc.qt.io/qt-6/qml-qtquick-accessible.html#details) for some information on how it's done in Qt. I don't think the compositor has anything to do with it. |
Just wanted to say that if I seem quiet it's not because I'm ignoring all this, it's because you guys are like, light years ahead of me on this, and I'm kinda following along behind you. I really appreciate all the effort you're putting in and sharing your experience and know-how on this. If I'm quiet, it's not because I'm ignoring all that you're giving, it's because I'm standing in awe and appreciation <3 |
The current Input Method Editor protocols

It appears that fcitx5 uses the

If all we cared about was mapping text input (which is not the case), then the zwp_text_input protocol does offer some nice features:
As far as keyboard mapping goes, this sounds good so far. Now we get to the bad parts:
And last but not least: as you can see in the protocol of

It turns out

It seems that |
I wonder how games read those joysticks. You generally need to be root or member of the
I think that it is kinda unfortunate that Wayland simultaneously handles display and input. Why should it be the display server's job to decide what input reaches the applications? Why can't the user be free to choose their display and input server separately? If only that

(Without LD_PRELOAD please.) But that's pretty much the "Input Daemon" suggestion I made, and that has its drawbacks too. (But I still sometimes feel like brazenly suggesting a protocol that basically says "The compositor is no longer in charge of the

Anyway, even if we did decide that libinput got extended to work nicely with keymapping, we would still need to figure out the protocol that multiple applications could use to simultaneously keymap. And if we had such a protocol, we could think about why it shouldn't just be a Wayland extension protocol. (Also, is libinput even aware of which window its input goes to? I got sidetracked by the IMEs and still haven't gotten to the bottom of that.) |
I only thought about using IME for injecting unicode text that is not in the keymap anyway, so that's not really a problem (I really can't think of a use case for that other than writing text).
This one is :(. Maybe there is a way to put an injection layer in between the IME and the compositor? Otherwise we would need to extend the protocol to use IME for unicode injection.
No, it's simpler: device nodes handling a joystick get different permissions.
When switching to a different user (without logging out the first one), the username gets changed to that one ( Also, if you're wondering: grabbing
Yeah, maybe we should focus on defining the protocol first..
I'm pretty sure it is not. That's the thing we really need the compositor to provide, and it would be great if we could make it part of the Wayland core protocol. Until then, KWin might be the only choice for people who need this. |
I remember that sometimes applications have keyboard shortcuts that use special characters, which aren't accessible on my german layout without using modifiers. |
After looking at libinput some more, it does seem to have some seriously useful features such as button debouncing and palm detection, which filter out events sent by the hardware that were never intended by the user. You generally want your keymapping scripts to skip over those as well. If we were to map after libinput, then we run into the problem that libinput merges all input devices into seats, where all similar devices get merged together into one device. This would make it impossible to apply different mappings to different keyboards, which is a use case that is sufficiently real that I'm doing it right now. However, taking a closer look at the libinput source code, the situation may not be that bad: libinput does report for each event from which device it originates ( However, if we add our mappers after libinput, then we do have to (?) map libinput events. The problem with mapping libinput events is that they're kind of unwieldy. For example, this is the libinput event for pointer events:
That's quite a lot more than what Wayland reports to applications. Some of it is redundant, like the same coords in different coordinate formats; a mapper script would have to take care of modifying all of them at once. It also contains a painful

The ideal solution would involve rewriting libinput with a more modular architecture where the various features it provides are implemented as different layers, and where third-party modules can be inserted in the middle of the processing chain (e.g. after filtering out palms, before gesture detection and before the coordinates are formatted in a bazillion different ways), but I have my doubts that we can get the original libinput developers to go along with that plan.

The Wayland protocol does not send the entirety of the libinput events to applications either. Maybe we can get away with simplifying the event format after it leaves libinput? [Edit: this sentence is false, Wayland does send the approximate entirety of the libinput events to applications.] |
FWIW, this is a thing disabled users need, too. Definitely a real use-case. |
Now that you mention it, it seems soooo obvious... As someone who uses the programmer dvorak layout, I have often run into programs (mostly games) which expect me to press things like '1', '2' or '3' without a modifier and it's a real pain in the a.. :( Although I have a rough plan on how I can avoid most of these issues in the future, it would be great to have a proper solution that does not involve editing xkb layouts. But considering how many different approaches to handling keys there seem to be in different programs / frameworks, I am very doubtful about how much events with unicode codepoints will be able to help with this mess.
I think multiple keyboards might not be uncommon among users who use event mapping.
I would not have thought the output struct is that big O.O
Yes, I believe that would be best, too. And I share your doubts as well :/. But what are the alternatives (at least if we want a full-parts-mapping-protocol)?
Maybe we should start by compiling a list of features we need from libinput.. I need to think about this some more.. |
When I first read this I wrote and then deleted a few angry responses. Nobody can be forced to help, but nobody should be allowed to stand in the way of fixing this. If somebody prevents fixing this, they are as much a cause of the problem, and their removal from the system is as much a part of the solution, as any code, protocol, or design concept.

I can't stand it when people fork or build alternatives rather than improving existing solutions; it usually just creates a mess and makes it harder for end users to have a coherent system, and usually they end up having to choose between two incomplete solutions... I really dislike forks in general... but if one is not allowed to improve existing solutions, one has little choice but to build an alternative, be it from a fork or from scratch.

I like to hope that the devs of any project which would be involved will recognise any shortcomings in their implementation and not only be willing to take contributions, but also to assist in contributing themselves. I mean, if you built a thing, you'd surely want it to be the best thing it could be, and not have giant problems that make the entire operating system unusable for a significant percentage of human beings. I would like to remain optimistic that the libinput devs would take all of this on board with a positive response.

If it's just some random crippled greybeard retired dev and a small handful of FOSS-enthusiast disabled folk having a cry about it, while all their friends, fellow cripples and demanding high-end gamers alike, joke about what a nerd they are and just use Windows or iOS, then I can see it going nowhere - because that's what's happened so far! However, with knowledgeable and experienced input (pardon the pun) from experts, which moves from just having problems towards building solutions, like you all are contributing, I think this thread amounts to the beginnings of a very convincing proposal to improve existing solutions, and I like to think (hope... pray...) that the devs of whatever project might need enhancements would take it seriously and view it as constructive, and not be defensive about it. |
From the libinput docs:
I think these are the descriptions that make us sceptical about acceptance of a big change in libinput. But you are right: what we are preparing here is constructive and needed for various reasons, and we shouldn't make the mistake of letting (possibly unwarranted) worries of rejection slow us down. I think the next steps should be:
I probably won't have enough time for this before Wednesday (or Friday) though.
Out of curiosity, I looked at the Windows input API docs. From maybe an hour of reading, this is what I understand: There seems to be only one input stream, which only distinguishes devices between Mouse, Keyboard and other. Also: keyboard events can carry unicode characters (16-bit, I'm sure it means UCS-2 encoding). If that happens, it generates a virtual event (I think). But I really don't understand how multiple applications are supposed to work together - from what I read so far, I'd expect the situation to be a lot worse than on Linux. I'm most likely missing some knowledge about how the input stream works. |
Yeh, I kinda honestly feel like there's a strong likelihood they'll vehemently "nope" this on the spot. Then again, I've heard a bit recently about them adding support for IIO (as in, Industrial IO; accelerometers, light sensors, weird input devices) and there's the very closely related libei they've recently added to their stack, so... mehhh, I dunno. I have strong doubts for the same reasons you mentioned, but kinda also feel like maybe they'll be really feeling all this and might just get involved. I really wish some of those devs were in this thread right now. I feel like even if they "nope" it for their own project, they'd tell us how to make it happen in some other way. Even in the worst case scenario, they say "hell no, and the only way it would happen is if we say yes, so you'll have to fork libinput, now stop wasting our time and don't talk about it any more" - at least then we know what's in front of us. It feels like there's a pool of knowledge among those devs that we're missing out on... so
Yeh! 🙂 I feel like getting their input is definitely on the cards. Thanks so much for getting the ball rolling on that one. I'm glad you started a nice new clean issue for it too. Also I might just tag @MerlijnWajer here, who hasn't updated uinput-mapper in a decade but was very early in this game and might have some interesting thoughts here. Sorry if @'ing you was an annoyance, Merlijn! I just thought you could be a valuable player here :) |
I've been thinking about the new Wayland protocol and posted my current (incomplete) draft in a new issue: #4 I've got a good feeling about this one, but there's still quite some work that I need to do. It is neither fully implemented nor fully documented yet, some parts of the current spec are broken, et cetera. Anyway, I just wanted to post my current progress to show that something is getting done. |
This is pretty big. It is basically an API for creating virtual input devices. Combined with our ability to just grab all input devices, we basically have the necessary APIs for creating an "input daemon" as I mentioned earlier. While an input daemon is not the perfect solution, it does provide a big possibility: suppose we create some Wayland protocol and write a library that implements it, but compositors are reluctant to implement it. Then we could write a daemon which grabs all event devices, processes those events through libinput and our library, and then makes the resulting events available through libei. Mapper scripts could then check if the compositor natively supports the protocol, and if the compositor doesn't but does support libeis, start the daemon as a fallback.

The daemon approach still has disadvantages, such as requiring another process to run and another program to install; it would make all devices show up as "virtual devices", prevents other applications like Qemu from grabbing the evdev devices, may not be able to change the keymap, may not be able to perfectly switch the active mapper based on the active window, et cetera. But it could provide a somewhat suitable fallback for users who are stuck on a compositor that does not support the new protocol but does contain libeis. |
Yes, it could be useful. |
I hope nobody minds, but I came across a related thread, where the above issues were discussed (well, brought up but not discussed much), so I linked this thread in the hopes that some of the (very important, respected, and capable) individuals there might perhaps weigh in on the conversation. The thread is over here https://discuss.kde.org/t/new-ideas-using-wayland-input-methods/3444/19. I just thought I should let this end of the conversation know that I'd linked it. Again, I hope this is OK, apologies if I've done the wrong thing.... Just.... a lot of the people in that thread are a pretty big deal and they're all working in this field at a fairly high level. |
I'm back! Well... at least I should have some capacity for input stuff again :) One of the time-consuming things I did over the last weeks was to switch my keyboard to a Dygma Defy. This made me re-evaluate how I use my keyboard, and I ended up modifying my layout (xkb-wise) as well. This gave me an idea for a (partial) workaround to the type-arbitrary-unicode-symbols problem. Let's start with a few often overlooked facts about xkb:
Now one sometimes forgotten fact about input remapping via uinput:
That means: as long as we know which unicode characters can possibly occur (defined by user configuration) and the mapper is aware of the current layout (I already wrote some working proof-of-concept code for this last year), we can:
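Roughly, the idea could be sketched like this with python-evdev. The keycode choices, the lookup table, and the assumption that a generated xkb (sub-)layout binds those spare keycodes to the wanted characters are all mine for illustration, not inputmangler's actual code:

```python
import evdev
from evdev import ecodes

# Characters the user's mappings can produce, pre-bound to spare keycodes
# in a generated xkb layout. Table contents are invented examples.
UNICODE_TO_KEYCODE = {
    "€": ecodes.KEY_F20,
    "λ": ecodes.KEY_F21,
}

ui = evdev.UInput({ecodes.EV_KEY: list(UNICODE_TO_KEYCODE.values())},
                  name="unicode-injector")

def type_char(ch):
    """Emit the spare keycode that the current layout maps to `ch`."""
    code = UNICODE_TO_KEYCODE[ch]
    ui.write(ecodes.EV_KEY, code, 1)  # press
    ui.write(ecodes.EV_KEY, code, 0)  # release
    ui.syn()
```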
|
Good to see you back, Kermit :) And hi, all! This tab stays open in my browser for the time being... I think we're on a long road here... I came across a reddit post about this article, entitled "Input method on Wayland is broken and it's my fault", which rang some bells here. The reddit thread mentioned ibus-typing-booster, which I've tried out lately; it's promising on a few fronts discussed above, but rather bug-prone at the moment. I leave it installed on my machine, in hope, but presently it remains disabled. Thought I might share the article in case it might be food for thought, or perhaps bring a 'new recruit' to this issue 😆 At least maybe if the author were to see this thread, they would not feel quite so much the lone bearer of fault in this situation... I don't think it's anyone's fault really. We are in need of a hero, or ten 😉 Do you think we should maybe send them a message? |
It seems we've found a real expert here. According to one of their other articles, we're talking about the person who designed all the Wayland extension protocols around input methods. Feel free to message them. To make visiting this thread maybe worth their time, here are some of my thoughts about "Mistake 2: bad synchronization" mentioned in the linked article.

(I am not actually sure if I understood the problem correctly. Does "commit" mean that the preedit string is to be turned into definitive content of the text box? If yes, then why does the second preedit string "Mo" still contain the "M", which should've become permanent content already? If no, why did the "M" character get reported as content of the text box due to lag? Anyway, here are my thoughts for as far as I think I understand the article.)

I get the impression that the fundamental problem is that the IME does not know which of its proposed changes were accepted and which were rejected. If it does not resend its changes when they do not show up, it is possible that input gets lost when a web document is edited by somebody else. If the IME instead aggressively resends any change that it does not observe showing up in the text box, then there is probably a whole other can of bugs about to spring open.

If changing the protocol is still on the table (the protocol is still unstable after all), then I think this could be solved by making the "commit" message include from which state to which state is transformed, which makes it possible for the input method to figure out which of its actions were discarded.

Both the application and IME start at state 0. When either of them wants to change the content of the textbox, they must include both the old state number and the new state number as part of the commit message. The IME always uses even numbers for states that it creates, whereas the application always uses odd numbers for states that it creates, to avoid clashing state numbers.

So, typing "Mo" would result in the following exchange of messages: All of these states were created upon initiative of the IME, so they all use even numbers. The application acknowledges each state transition explicitly, so the IME knows that all of its keys were accepted.

Now let's consider the laggy situation where the user is trying to type "Mo", but a collaborator on a web document types an "a" while the IME is still busy composing: While the IME was trying to compose "Mo", the application received some TCP packet telling it to insert an "a" key after it read the "M" key from the IME but before it read the "o" key. From the application's perspective, two state transitions have happened:
At this point, the application is in state 3. It then receives a request from the IME to transition from state 2 to 4, but the application rejects it because it is not in state 2. The application informs the IME that it has observed two state transitions: 0 → 2 → 3.

The IME sees that the transition 0 → 2 was acknowledged by the application and thus the "M" key was accepted, but it has also sent a request to transition 2 → 4. Because the application moved 2 → 3 instead, the IME knows that its second request has been or will be rejected, and thus that the application has not received the "o" key. Knowing that the "o" key has been rejected, it then tries to play back all rejected requests, but this time based on the last state reported by the application. The user tried to type "Mo", and the text now shows "Ma". If the "o" key went through, the text would be showing "Moa", so it sends a new request to transition 3 → 6 and change the text to "Moa".

A minor thought: maybe the state should not be assumed to start at 0, but instead at a value declared by the compositor. Furthermore, the split between the state numbers allocable by the IME/the application should maybe not be even/odd, but "within a range that is allocated by the compositor". This could maybe make it possible to use multiple IMEs at once if they are all allocated distinct ranges, but that's another can of worms I haven't fully thought through. |
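A toy model of the numbering scheme described above, replaying both scenarios. The class, message shapes, and method names are all invented for illustration:

```python
# IME allocates even state numbers, the application odd ones, so their
# transitions can never collide. Text handling is heavily simplified.
class Ime:
    def __init__(self):
        self.next_even = 2
        self.latest = 0    # newest state we have requested or seen
        self.unacked = {}  # to_state -> (from_state, text)

    def commit(self, text):
        """Request a transition from our latest known state to a new one."""
        frm, to = self.latest, self.next_even
        self.next_even += 2
        self.latest = to
        self.unacked[to] = (frm, text)
        return (frm, to, text)  # message sent to the application

    def on_report(self, transitions):
        """App reports the transitions it actually performed, in order."""
        self.latest = transitions[-1][1]
        replay = []
        for to, (frm, text) in sorted(self.unacked.items()):
            del self.unacked[to]
            if (frm, to) not in transitions:  # rejected: replay on latest state
                replay.append(text)
        return self.latest, replay

ime = Ime()
print(ime.commit("M"))   # (0, 2, 'M')
print(ime.commit("Mo"))  # (2, 4, 'Mo')
# App accepted "M" (0 -> 2), but a collaborator then made it 2 -> 3 ("Ma"):
last, replay = ime.on_report([(0, 2), (2, 3)])
print(last, replay)      # 3 ['Mo'] -> the IME now commits 3 -> 6 "Moa"
```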
I read through the messages we wrote here and got inspired with a new idea. For now I call it UInput Orchestrator (UIO). It is not meant as the final solution, but rather an extensible starting point that we could implement without changing anything in evdev, libinput or wayland. As this will likely be a longer topic, I created a new issue here: #5 |
Saw some news today about the KDE Plasma 6.1 release and they mentioned the "Input Capture Portal" which immediately captured (heh) my attention. Apparently its intended use-case is allowing software which shares keyboard/mouse between PC's, but perhaps we might find some way to use it to get our keyboards working locally? Left this here in case UIO is not the final direction (although, it looks like it might be!) |
I discovered another promising remapper: https://github.com/xremap/xremap and would like to invite its main developer to the discussion. @k0kubun: if you are wondering why you are mentioned in an unknown project: this is an invitation to take part in our discussion. Over time we became more focused on making it possible for multiple remappers to work together. The relevant post that started this is: #2 (comment) |
Hi,
I'm the developer of a tool called inputMangler, which transforms input events on linux.
After a few years of other priorities I want to continue development (well.. rewrite it from scratch actually..).
As I like to avoid duplicate work, I had a look around the net to see if someone else had started another project like mine. I found a few which at least do something similar and, if you're mentioned at the end of this post, one of them is yours.
While all those projects seem to have more or less different goals and approaches, there still might be enough common ground for collaboration.
So this thread is about exploring possibilities to work together.
In the next post, I will write an overview of my goals. I invite everyone interested to do the same.
Afterwards we can compare those and discuss if it would make sense to
• put some base code in a common library
• merge projects (may be unlikely, but .. maybe)
• just share experience on strange input-related problems ;D
Links to the projects:
https://github.com/kermitfrog/inputmangler
https://github.com/sezanzeb/input-remapper
https://github.com/samvel1024/kbct
https://github.com/shiro/map2
https://github.com/rvaiya/keyd
https://github.com/snyball/Hawck
https://github.com/KarsMulder/evsieve
And the people that I hope will have a look at this after receiving a notification for being mentioned:
@sezanzeb, @samvel1024, @shiro, @rvaiya, @snyball, @KarsMulder