Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aarch64: LKRG: FATAL: Can't hook 'arch_jump_label_transform' #351

Closed
vt-alt opened this issue Aug 11, 2024 · 21 comments
Closed

aarch64: LKRG: FATAL: Can't hook 'arch_jump_label_transform' #351

vt-alt opened this issue Aug 11, 2024 · 21 comments

Comments

@vt-alt
Copy link
Contributor

vt-alt commented Aug 11, 2024

lkrg suddenly become broken on 6.6.45, 6.10.4, 6.1.104 (all released today stable kernels).

[00:00:19] ++ insmod lkrg.ko log_level=3
[00:00:19] [    1.032612] lkrg: loading out-of-tree module taints kernel.
[00:00:19] [    1.033327] lkrg: module verification failed: signature and/or required key missing - tainting kernel
[00:00:19] [    1.044763] LKRG: ALIVE: Loading LKRG
[00:00:19] [    1.045508] Freezing user space processes
[00:00:19] [    1.047050] Freezing user space processes completed (elapsed 0.001 seconds)
[00:00:19] [    1.047809] OOM killer disabled.
[00:00:19] [    1.051342] LKRG: ISSUE: [kretprobe] register_kretprobe() for <ovl_dentry_is_whiteout> failed! [err=-2]
[00:00:19] [    1.052404] LKRG: ISSUE: Can't hook 'ovl_dentry_is_whiteout'. This is expected when OverlayFS is not used.
[00:00:19] [    1.054388] LKRG: FATAL: [kretprobe] register_kretprobe() for <arch_jump_label_transform> failed! [err=-2]
[00:00:19] [    1.055477] LKRG: FATAL: Can't hook 'arch_jump_label_transform'
[00:00:19] [    1.056140] LKRG: FATAL: Can't register CPU architecture specific metadata
[00:00:19] [    1.056873] LKRG: FATAL: Can't create database
[00:00:19] [    1.057340] LKRG: DYING: Not loading LKRG (initialization failed)
[00:00:19] [    1.295735] WARNING: Flushing system-wide workqueues will be prohibited in near future.
[00:00:19] [    1.296641] CPU: 7 PID: 141 Comm: insmod Tainted: G           OE      6.6.45-un-def-alt1 #1
[00:00:19] [    1.297570] Hardware name: linux,dummy-virt (DT)
[00:00:19] [    1.298095] Call trace:
[00:00:19] [    1.298378]  dump_backtrace+0xa0/0x148
[00:00:19] [    1.298812]  show_stack+0x20/0x48
[00:00:19] [    1.299188]  dump_stack_lvl+0x48/0x78
[00:00:19] [    1.299603]  dump_stack+0x18/0x30
[00:00:19] [    1.299986]  __warn_flushing_systemwide_wq+0x24/0x48
[00:00:19] [    1.300542]  p_exploit_detection_exit+0x48/0xe0 [lkrg]
[00:00:19] [    1.301148]  p_lkrg_register+0x1a8/0xff8 [lkrg]
[00:00:19] [    1.301670]  do_one_initcall+0x4c/0x320
[00:00:19] [    1.302111]  do_init_module+0x60/0x228
[00:00:19] [    1.302534]  load_module+0x23a0/0x24b0
[00:00:19] [    1.302964]  init_module_from_file+0x90/0xf8
[00:00:19] [    1.303442]  __arm64_sys_finit_module+0x168/0x328
[00:00:19] [    1.303973]  invoke_syscall+0x78/0x128
[00:00:19] [    1.304397]  el0_svc_common.constprop.0+0x48/0x130
[00:00:19] [    1.304938]  do_el0_svc+0x24/0x50
[00:00:19] [    1.305315]  el0_svc+0x48/0x190
[00:00:19] [    1.305672]  el0t_64_sync_handler+0x140/0x150
[00:00:19] [    1.306167]  el0t_64_sync+0x194/0x198
[00:00:19] [    1.306693] WARNING: Flushing system-wide workqueues will be prohibited in near future.
[00:00:19] [    1.307610] CPU: 7 PID: 141 Comm: insmod Tainted: G           OE      6.6.45-un-def-alt1 #1
[00:00:19] [    1.308536] Hardware name: linux,dummy-virt (DT)
[00:00:19] [    1.309057] Call trace:
[00:00:19] [    1.309338]  dump_backtrace+0xa0/0x148
[00:00:19] [    1.309766]  show_stack+0x20/0x48
[00:00:19] [    1.310141]  dump_stack_lvl+0x48/0x78
[00:00:19] [    1.310555]  dump_stack+0x18/0x30
[00:00:19] [    1.310941]  __warn_flushing_systemwide_wq+0x24/0x48
[00:00:19] [    1.311495]  p_offload_cache_delete+0x24/0x68 [lkrg]
[00:00:19] [    1.312073]  p_lkrg_register+0x1b0/0xff8 [lkrg]
[00:00:19] [    1.312594]  do_one_initcall+0x4c/0x320
[00:00:19] [    1.313031]  do_init_module+0x60/0x228
[00:00:19] [    1.313453]  load_module+0x23a0/0x24b0
[00:00:19] [    1.313882]  init_module_from_file+0x90/0xf8
[00:00:19] [    1.314360]  __arm64_sys_finit_module+0x168/0x328
[00:00:19] [    1.314893]  invoke_syscall+0x78/0x128
[00:00:19] [    1.315315]  el0_svc_common.constprop.0+0x48/0x130
[00:00:19] [    1.315858]  do_el0_svc+0x24/0x50
[00:00:19] [    1.316233]  el0_svc+0x48/0x190
[00:00:19] [    1.316589]  el0t_64_sync_handler+0x140/0x150
[00:00:19] [    1.317081]  el0t_64_sync+0x194/0x198
[00:00:19] [    1.322821] OOM killer enabled.
[00:00:19] [    1.323177] Restarting tasks ... done.
[00:00:19] [    1.323696] printk: console [lkrg0] disabled
[00:00:19] insmod: ERROR: could not insert module lkrg.ko: Network dropped connection on reset
[00:00:19] [    1.364901] reboot: Power down
@solardiz solardiz changed the title aarch64: LKRG: DYING: Not loading LKRG (initialization failed) aarch64: LKRG: FATAL: Can't hook 'arch_jump_label_transform' Aug 11, 2024
@solardiz
Copy link
Contributor

Thank you for reporting this @vt-alt!

https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.10.4 has this:

commit bea081b0d453f04cff2ffde7265a230065959901
Author: Will Deacon <[email protected]>
Date:   Wed Jul 31 14:36:01 2024 +0100

    arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
    
    [ Upstream commit cfb00a35786414e7c0e6226b277d9f09657eae74 ]
    
    Although the Arm architecture permits concurrent modification and
    execution of NOP and branch instructions, it still requires some
    synchronisation to ensure that other CPUs consistently execute the newly
    written instruction:
    
     >  When the modified instructions are observable, each PE that is
     >  executing the modified instructions must execute an ISB or perform a
     >  context synchronizing event to ensure execution of the modified
     >  instructions
    
    Prior to commit f6cc0c501649 ("arm64: Avoid calling stop_machine() when
    patching jump labels"), the arm64 jump_label patching machinery
    performed synchronisation using stop_machine() after each modification,
    however this was problematic when flipping static keys from atomic
    contexts (namely, the arm_arch_timer CPU hotplug startup notifier) and
    so we switched to the _nosync() patching routines to avoid "scheduling
    while atomic" BUG()s during boot.
    
    In hindsight, the analysis of the issue in f6cc0c501649 isn't quite
    right: it cites the use of IPIs in the default patching routines as the
    cause of the lockup, whereas stop_machine() does not rely on IPIs and
    the I-cache invalidation is performed using __flush_icache_range(),
    which elides the call to kick_all_cpus_sync(). In fact, the blocking
    wait for other CPUs is what triggers the BUG() and the problem remains
    even after f6cc0c501649, for example because we could block on the
    jump_label_mutex. Eventually, the arm_arch_timer driver was fixed to
    avoid the static key entirely in commit a862fc2254bd
    ("clocksource/arm_arch_timer: Remove use of workaround static key").
    
    This all leaves the jump_label patching code in a funny situation on
    arm64 as we do not synchronise with other CPUs to reduce the likelihood
    of a bug which no longer exists. Consequently, toggling a static key on
    one CPU cannot be assumed to take effect on other CPUs, leading to
    potential issues, for example with missing preempt notifiers.
    
    Rather than revert f6cc0c501649 and go back to stop_machine() for each
    patch site, implement arch_jump_label_transform_apply() and kick all
    the other CPUs with an IPI at the end of patching.
    
    Cc: Alexander Potapenko <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Marc Zyngier <[email protected]>
    Fixes: f6cc0c501649 ("arm64: Avoid calling stop_machine() when patching jump labels")
    Signed-off-by: Will Deacon <[email protected]>
    Reviewed-by: Catalin Marinas <[email protected]>
    Reviewed-by: Marc Zyngier <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

The referenced mainline commit is:

commit cfb00a35786414e7c0e6226b277d9f09657eae74
Author: Will Deacon <[email protected]>
Date:   Wed Jul 31 14:36:01 2024 +0100
                                 
    arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
                       
    Although the Arm architecture permits concurrent modification and
    execution of NOP and branch instructions, it still requires some
    synchronisation to ensure that other CPUs consistently execute the newly
    written instruction:                                          
                                                                  
     >  When the modified instructions are observable, each PE that is
     >  executing the modified instructions must execute an ISB or perform a
     >  context synchronizing event to ensure execution of the modified
     >  instructions                                              
                 
    Prior to commit f6cc0c501649 ("arm64: Avoid calling stop_machine() when
    patching jump labels"), the arm64 jump_label patching machinery
    performed synchronisation using stop_machine() after each modification,
    however this was problematic when flipping static keys from atomic
    contexts (namely, the arm_arch_timer CPU hotplug startup notifier) and
    so we switched to the _nosync() patching routines to avoid "scheduling
    while atomic" BUG()s during boot.
                                                                  
    In hindsight, the analysis of the issue in f6cc0c501649 isn't quite
    right: it cites the use of IPIs in the default patching routines as the
    cause of the lockup, whereas stop_machine() does not rely on IPIs and
    the I-cache invalidation is performed using __flush_icache_range(),
    which elides the call to kick_all_cpus_sync(). In fact, the blocking
    wait for other CPUs is what triggers the BUG() and the problem remains
    even after f6cc0c501649, for example because we could block on the
    jump_label_mutex. Eventually, the arm_arch_timer driver was fixed to
    avoid the static key entirely in commit a862fc2254bd
    ("clocksource/arm_arch_timer: Remove use of workaround static key").
    
    This all leaves the jump_label patching code in a funny situation on
    arm64 as we do not synchronise with other CPUs to reduce the likelihood
    of a bug which no longer exists. Consequently, toggling a static key on
    one CPU cannot be assumed to take effect on other CPUs, leading to
    potential issues, for example with missing preempt notifiers.
    
    Rather than revert f6cc0c501649 and go back to stop_machine() for each
    patch site, implement arch_jump_label_transform_apply() and kick all
    the other CPUs with an IPI at the end of patching.
    
    Cc: Alexander Potapenko <[email protected]>
    Cc: Mark Rutland <[email protected]>
    Cc: Marc Zyngier <[email protected]>
    Fixes: f6cc0c501649 ("arm64: Avoid calling stop_machine() when patching jump labels")
    Signed-off-by: Will Deacon <[email protected]>
    Reviewed-by: Catalin Marinas <[email protected]>
    Reviewed-by: Marc Zyngier <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Catalin Marinas <[email protected]>

diff --git a/arch/arm64/include/asm/jump_label.h b/arch/arm64/include/asm/jump_label.h
index 4e753908b801..a0a5bbae7229 100644
--- a/arch/arm64/include/asm/jump_label.h
+++ b/arch/arm64/include/asm/jump_label.h
@@ -13,6 +13,7 @@
 #include <linux/types.h>
 #include <asm/insn.h>
 
+#define HAVE_JUMP_LABEL_BATCH
 #define JUMP_LABEL_NOP_SIZE            AARCH64_INSN_SIZE
 
 #define JUMP_TABLE_ENTRY(key, label)                   \
diff --git a/arch/arm64/kernel/jump_label.c b/arch/arm64/kernel/jump_label.c
index faf88ec9c48e..f63ea915d6ad 100644
--- a/arch/arm64/kernel/jump_label.c
+++ b/arch/arm64/kernel/jump_label.c
@@ -7,11 +7,12 @@
  */
 #include <linux/kernel.h>
 #include <linux/jump_label.h>
+#include <linux/smp.h>
 #include <asm/insn.h>
 #include <asm/patching.h>
 
-void arch_jump_label_transform(struct jump_entry *entry,
-                              enum jump_label_type type)
+bool arch_jump_label_transform_queue(struct jump_entry *entry,
+                                    enum jump_label_type type)
 {
        void *addr = (void *)jump_entry_code(entry);
        u32 insn;
@@ -25,4 +26,10 @@ void arch_jump_label_transform(struct jump_entry *entry,
        }
 
        aarch64_insn_patch_text_nosync(addr, insn);
+       return true;
+}
+
+void arch_jump_label_transform_apply(void)
+{
+       kick_all_cpus_sync();
 }

So it looks like the arch_jump_label_transform symbol is gone, also in mainline (but we didn't notice because our CI mainline test is for x86_64 only).

#define HAVE_JUMP_LABEL_BATCH disables usage of this symbol by kernel/jump_label.c.

@solardiz
Copy link
Contributor

Interestingly, #define HAVE_JUMP_LABEL_BATCH is also defined on x86, and has been since 2019. But both kinds of symbols are defined on x86. So our hooking succeeds, but are we hooking the right function, one that's actually used where we care? Probably we are: also since 2019, starting with 29867f6 we also hook arch_jump_label_transform_apply.

So should we simply stop hooking arch_jump_label_transform on newer kernels? Only on aarch64 or also on x86?

@Adam-pi3 Please take a look.

@vt-alt
Copy link
Contributor Author

vt-alt commented Aug 11, 2024

also in mainline (but we didn't notice because our CI mainline test is for x86_64 only).

BTW, GA got arm64 runners in public beta "These runners are available to our customers on our GitHub Team and Enterprise Cloud plans. We expect to begin offering Arm runners for open source projects by the end of the year. "

@vt-alt
Copy link
Contributor Author

vt-alt commented Aug 15, 2024

Is there work on this? [Because in ALT, building a kernel depends on the buildability of all modules, I will be forced to delete the LKRG kernel module temporarily to allow kernel updates, which users expect every week. This is normal and non-permanent (and not threatening), but in the meantime, LKRG will not be tested there. Perhaps the LKRG users will choose not to update the kernel until the module reappears.]

@solardiz
Copy link
Contributor

Yes, I'll experiment with this myself some more and likely propose a fix unless @Adam-pi3 beats me to it.

@solardiz
Copy link
Contributor

Looks like I won't fix this without @Adam-pi3's involvement, at least not as a quick fix that I was hoping I could have. The main reason why not is our hook for arch_jump_label_transform_apply checks for specific x86 opcodes, so it is insufficient to enable this hook - we also need to add aarch64-specific checks into the function.

A smaller issue is that while it looks like on current kernels we don't need to hook arch_jump_label_transform at all, on some older kernels with batch support (so between 2019 and now) it could also be used via arch_jump_label_transform_static. Looks like this was the case until this commit:

commit 7e6b9db27de9f69a705c1a046d45882c768e16c3
Author: Ard Biesheuvel <[email protected]>
Date:   Wed Jun 15 17:41:42 2022 +0200

    jump_label: make initial NOP patching the special case

which includes:

+++ b/kernel/jump_label.c
@@ -332,17 +332,13 @@ static int __jump_label_text_reserved(struct jump_entry *iter_start,
        return 0;
 }
 
-/*
- * Update code which is definitely not currently executing.
- * Architectures which need heavyweight synchronization to modify
- * running code can override this to make the non-live update case
- * cheaper.
- */
-void __weak __init_or_module arch_jump_label_transform_static(struct jump_entry *entry,
-                                           enum jump_label_type type)
+#ifndef arch_jump_label_transform_static
+static void arch_jump_label_transform_static(struct jump_entry *entry,
+                                            enum jump_label_type type)
 {
-       arch_jump_label_transform(entry, type);
+       /* nothing to do on most architectures */
 }
+#endif

So maybe our logic should be a bit trickier than disabling the hooking of arch_jump_label_transform if batch mode is supported.

OTOH, I think we can simplify this check in p_arch_jump_label_transform_apply.h:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(5,3,0) || \
   (defined(RHEL_RELEASE_CODE) && RHEL_RELEASE_CODE >= RHEL_RELEASE_VERSION(8, 2))

to:

#ifdef HAVE_JUMP_LABEL_BATCH

instead of complicating it with a different kernel version check on aarch64 now.

@Adam-pi3
Copy link
Collaborator

So should we simply stop hooking arch_jump_label_transform on newer kernels? Only on aarch64 or also on x86?

Some distros with older kernels didn't switch to the BATCH hooking but other old kernels did. It's a mixup scenario so we still need to keep both logic. I'm not sure if distros ported HAVE_JUMP_LABEL_BATCH - maybe we can simplify the check with this macro - interesting.

The main reason why not is our hook for arch_jump_label_transform_apply checks for specific x86 opcodes

It looks like that ARM version of batch hooking is waaaaay more simplified comparing to x86. I guess we can try hooking arch_jump_label_transform_queue for ARM batch mode and run the same logic as for arch_jump_label_transform hook. I will try to set-up my raspberry PI and play with it when I have time. The problem is I'm going to Poland next weekend which may affect my schedule. However, from the analysis it looks like what I described should work

@solardiz
Copy link
Contributor

try hooking arch_jump_label_transform_queue for ARM batch mode and run the same logic as for arch_jump_label_transform hook.

I haven't yet looked into the detail of this, but my understanding was that queue would have been too early, which is why you chose to hook apply instead. And I doubt that our hooking for batch mode support should differ between archs, because Linux's own usage of these functions does not differ between archs except by HAVE_JUMP_LABEL_BATCH.

But maybe on aarch64 we can in fact run the same logic as for transform on transform_apply.

@Adam-pi3
Copy link
Collaborator

Adam-pi3 commented Aug 22, 2024

I was surprised but on x86 that's the case but on ARM the logic is a bit different. Queue does the memory modification:

bool arch_jump_label_transform_queue(struct jump_entry *entry,
				     enum jump_label_type type)
{
...
	aarch64_insn_patch_text_nosync(addr, insn);
...
}

where:

int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
{
...
	ret = aarch64_insn_write(tp, insn);
...
}

where:

int __kprobes aarch64_insn_write(void *addr, u32 insn)
{
	return __aarch64_insn_write(addr, cpu_to_le32(insn));
}

and real modification is here:

static int __kprobes __aarch64_insn_write(void *addr, __le32 insn)
{
	void *waddr = addr;
	unsigned long flags = 0;
	int ret;

	raw_spin_lock_irqsave(&patch_lock, flags);
	waddr = patch_map(addr, FIX_TEXT_POKE0);

	ret = copy_to_kernel_nofault(waddr, &insn, AARCH64_INSN_SIZE);

	patch_unmap(FIX_TEXT_POKE0);
	raw_spin_unlock_irqrestore(&patch_lock, flags);

	return ret;
}

So copy_to_kernel_nofault will make the changes. While apply on ARM synchronizes the cores via memory barrier:

void arch_jump_label_transform_apply(void)
{
	kick_all_cpus_sync();
}

where:

/**
 * kick_all_cpus_sync - Force all cpus out of idle
 *
 * Used to synchronize the update of pm_idle function pointer. It's
 * called after the pointer is updated and returns after the dummy
 * callback function has been executed on all cpus. The execution of
 * the function can only happen on the remote cpus after they have
 * left the idle function which had been called via pm_idle function
 * pointer. So it's guaranteed that nothing uses the previous pointer
 * anymore.
 */
void kick_all_cpus_sync(void)
{
	/* Make sure the change is visible before we kick the cpus */
	smp_mb();
	smp_call_function(do_nothing, NULL, 1);
}
EXPORT_SYMBOL_GPL(kick_all_cpus_sync);

One problem may be when on SMP machine one core is running verification routine while other does the modification and the cores are not in sync yet. However, do not updating database after the queue will certainly be wrong so we should do it there. We need to play and see if this MB is really needed (in theory we can call synchronization by ourself)

@Adam-pi3
Copy link
Collaborator

@vt-alt does ALT linux support Raspberry Pi 4? I'm trying to set-up testing env and Ubuntu works fine but their PPA kernels are broken - they release linux-headers-generic which is depended on 'linux-headers' which they didn't publish:
https://kernel.ubuntu.com/mainline/v6.10.4/

dpkg: dependency problems prevent configuration of linux-headers-6.10.4-061004-generic:
 linux-headers-6.10.4-061004-generic depends on linux-headers-6.10.4-061004; however:
  Package linux-headers-6.10.4-061004 is not installed.

If ALT linux has pre-baked Raspberry Pi image which I can use and then pull the desired kernel version with dev/headers package to build LKRG, it would make my life a lot easier.

@vt-alt
Copy link
Contributor Author

vt-alt commented Aug 23, 2024

@vt-alt does ALT linux support Raspberry Pi 4?

I am told we have full support. For example any .img.xz from from this list https://nightly.altlinux.org/sisyphus-aarch64/tested/ they could be written on SD-card and etc.
There is also "Simply Linux" but I would suggest Regular-s from link above as it's Sisyphus based which is our "unstable" repository with everything latest. I also try to get mainline kernels into Sisyphus but it's not finished yet. Linux kernel packages have difference that we don't install them with apt-get, but with update kernel tool. So as of today you would install 6.11-rc4 with update-kernel -t 6.11 --headers.

About Ubuntu mainline packages, we already do it successfully in CI scripts. Perhaps you should just install linux-headers .deb in the same command line as linux-image package itself.

Adam-pi3 added a commit to Adam-pi3/lkrg-work that referenced this issue Aug 24, 2024
ARM64 added support for JUMP_LABEL batch-mode feature. This commit
adds support for it and addresses lkrg-org#351
@Adam-pi3
Copy link
Collaborator

Adam-pi3 commented Aug 24, 2024

I created a test patch (#352) implementing the fix/logic described here. It looks like it works. However, I would NOT merge this PR yet. @vt-alt can you run some tests on your end too? @solardiz when @vt-alt also confirms it works I guess we can merge it and see if no new problems pops up.
I tested that fix by running it on Raspberry Pi 4 and on 2 parallel screens I run:

screen 1: run the command while true; do echo 1 > /sys/kernel/debug/tracing/events/enable && echo 0 > /sys/kernel/debug/tracing/events/enable; done which enforces JUMP_LABEL batch mode
screen 2: run the command while true; do sysctl lkrg.trigger=1; sleep 1; done to enforce integrity validation.

I let it run by a few hours and didn't see any problems so far... Probably we need a bit more comprehensive tests anyway.

@vt-alt

This comment was marked as outdated.

@vt-alt
Copy link
Contributor Author

vt-alt commented Aug 24, 2024

I tested the fix from refs/pull/352/head with the test loops as described (from 13:41:31 to 15:55:12 UTC when my xcpu limit is reached), and there was no problems appeared.

Thanks much!

@solardiz
Copy link
Contributor

I was surprised but on x86 that's the case but on ARM the logic is a bit different. Queue does the memory modification:

OK, but do we have to do things differently? Wouldn't it be more correct and safer to hook apply rather than queue? Just like the rest of the kernel uses these arch-specific functions without such knowledge of their internals, maybe we shouldn't either. This would probably also avoid the potential issue you described with us needing the memory barrier - if we hook return from apply, the barrier would have already occurred.

@solardiz
Copy link
Contributor

However, do not updating database after the queue will certainly be wrong so we should do it there.

Why would postponing this to apply be wrong? Our verification is supposed to run after we've acquired a lock anyway, does the kernel acquire the same lock before jump label modifications? When does it release the lock - perhaps after apply?

solardiz pushed a commit that referenced this issue Aug 26, 2024
ARM64 added support for JUMP_LABEL batch-mode feature. This commit
adds support for it and addresses #351
@solardiz
Copy link
Contributor

Issue assumed fixed and no discussion for a month - I'll close for now.

@Adam-pi3
Copy link
Collaborator

@solardiz sorry for long delay - I was traveling with limited access to the internet.

OK, but do we have to do things differently?
Why would postponing this to apply be wrong?

For some reasons kernel dev decided to modify .text on queue not on apply. If we hook apply to track changed, it will be too late - we must update database on queue.

@solardiz
Copy link
Contributor

No problem @Adam-pi3.

Your explanation omits addressing my questions about locking. So once again, if there were no locking then even updating after queue (which makes the modifications) would be too late (because the modifications complete before queue returns and are also not atomic). If there is locking and the lock is released only after apply, then it's not too late for us to update just before return from apply - and we could thus do things more consistently across archs and more similar to Linux's own way.

Nevermind, I don't insist on discussing this further.

@vt-alt
Copy link
Contributor Author

vt-alt commented Oct 19, 2024

The delay in releasing the next version of LKRG is concerning (you keeping the version with known issues for so long). I recommend adopting a revised versioning scheme: designate the next release as v0.10.0 (instead of v0.9.9), incrementing the middle number for regular releases, while using the third number for minor build and stability fixes, allowing for more frequent updates. This will also make it easier for users to understand.

@solardiz
Copy link
Contributor

The delay in releasing the next version of LKRG is concerning (you keeping the version with known issues for so long).

I agree.

I recommend adopting a revised versioning scheme: designate the next release as v0.10.0 (instead of v0.9.9), incrementing the middle number for regular releases, while using the third number for minor build and stability fixes, allowing for more frequent updates. This will also make it easier for users to understand.

Thank you for making this suggestion. I agree about "using the third number for minor build and stability fixes". Yet I think we can call the very next release 0.9.9 and postpone the decision on whether we go with 0.9.10, 0.10.0, or 1.0.0 afterwards. If we feel we need to make a post-0.9.9 release that has only "minor build and stability fixes", this may very well be a reason to call it 1.0.0 - a stable release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants