After upgrading the passive/backup primary node of the FW cluster from 23.7 to 24.1 (the secondary had been upgraded beforehand), the node panics while starting the interfaces, apparently right on the cluster interface. The stack trace follows; it is cropped due to serial terminal limits, but a full crash report was submitted through the WebUI after working around the issue:
lo0: link state changed to UP
[fib_algo] inet.0 (bsearch4#32) rebuild_fd_flm: switching algo to radix4_lockless
Sleeping thread (tid 100538, pid 95063) owns a non-sleepable lock
KDB: stack backtrace of thread 100538:
sched_switch() at sched_switch+0x818/frame 0xfffffe0247de3a10
mi_switch() at mi_switch+0xc2/frame 0xfffffe0247de3a30
_sx_xlock_hard() at _sx_xlock_hard+0x3e4/frame 0xfffffe0247de3ae0
in_leavegroup() at in_leavegroup+0x80/frame 0xfffffe0247de3b10
pfsync_multicast_cleanup() at pfsync_multicast_cleanup+0x2b/frame 0xfffffe0247de3b40
pfsyncioctl() at pfsyncioctl+0x6fd/frame 0xfffffe0247de3bc0
ifioctl() at ifioctl+0x7bc/frame 0xfffffe0247de3cc0
kern_ioctl() at kern_ioctl+0x26d/frame 0xfffffe0247de3d30
sys_ioctl() at sys_ioctl+0x100/frame 0xfffffe0247de3e00
amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe0247de3f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0247de3f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x17204d3321ca, rsp = 0x17204a309e78, rbp = 0x17204a309ec0 ---
panic: sleeping thread
cpuid = 6
time = 1714383055
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe02137c7980
vpanic() at vpanic+0x151/frame 0xfffffe02137c79d0
panic() at panic+0x43/frame 0xfffffe02137c7a30
propagate_priority() at propagate_priority+0x296/frame 0xfffffe02137c7a70
turnstile_wait() at turnstile_wait+0x323/frame 0xfffffe02137c7ab0
__mtx_lock_sleep() at __mtx_lock_sleep+0x180/frame 0xfffffe02137c7b40
pfsyncioctl() at pfsyncioctl+0x91b/frame 0xfffffe02137c7bc0
ifioctl() at ifioctl+0x803/frame 0xfffffe02137c7cc0
kern_ioctl() at kern_ioctl+0x26d/frame 0xfffffe02137c7d30
sys_ioctl() at sys_ioctl+0x100/frame 0xfffffe02137c7e00
amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe02137c7f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe02137c7f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x2e515b92d1ca, rsp = 0x2e5157537fc8, rbp = 0x2e5157538000 ---
KDB: enter: panic
[ thread pid 98568 tid 100537 ]
Stopped at kdb_enter+0x37: movq $0,0x1217e0e(%rip)
db:0:kdb.enter.default> textdump set
textdump set
db:0:kdb.enter.default> capture on
db:0:kdb.enter.default> run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo> show alllocks
No such command; use "help" to list available commands
db:1:lockinfo> show lockedvnods
Locked vnodes
db:0:kdb.enter.default> show pcpu
cpuid = 6
dynamic pcpu = 0xfffffe0154d6e300
curthread = 0xfffffe0214869740: pid 98568 tid 100537 critnest 1 "ifconfig"
curpcb = 0xfffffe0214869c50
fpcurthread = 0xfffffe0214869740: pid 98568 "ifconfig"
idlethread = 0xfffffe017e889c80: tid 100009 "idle: cpu6"
self = 0xffffffff82e16000
curpmap = 0xfffffe026506ab20
tssp = 0xffffffff82e16384
rsp0 = 0xfffffe02137c8000
kcr3 = 0x241bd8000
ucr3 = 0x241a2b000
scr3 = 0x241a2b000
gs32p = 0xffffffff82e16404
ldt = 0xffffffff82e16444
tss = 0xffffffff82e16434
curvnet = 0xfffff80101648c40
db:0:kdb.enter.default> bt
Tracing pid 98568 tid 100537 td 0xfffffe0214869740
kdb_enter() at kdb_enter+0x37/frame 0xfffffe02137c7980
vpanic() at vpanic+0x182/frame 0xfffffe02137c79d0
panic() at panic+0x43/frame 0xfffffe02137c7a30
propagate_priority() at propagate_priority+0x296/frame 0xfffffe02137c7a70
turnstile_wait() at turnstile_wait+0x323/frame 0xfffffe02137c7ab0
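Reading the two backtraces together, it appears that thread 100538 went to sleep on an sx lock in in_leavegroup(), reached from pfsyncioctl() via pfsync_multicast_cleanup(), while still holding a non-sleepable lock. The ifconfig thread (tid 100537) then blocked on a mutex in pfsyncioctl() and panicked in propagate_priority(). This is only an interpretation of the trace above, not a confirmed root cause. As a reading aid, here is a minimal Python sketch that extracts the call chain from KDB frame lines; the helper name `call_chain` is hypothetical and the regular expression only assumes the frame format shown in this report.

```python
import re

# Matches KDB frame lines such as
# "in_leavegroup() at in_leavegroup+0x80/frame 0xfffffe0247de3b10".
FRAME_RE = re.compile(r"^(\w+)\(\) at \1\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")


def call_chain(trace):
    """Return function names, innermost frame first (hypothetical helper)."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


# Frames of the sleeping thread (tid 100538), copied from the trace above.
SLEEPING_THREAD = """\
sched_switch() at sched_switch+0x818/frame 0xfffffe0247de3a10
mi_switch() at mi_switch+0xc2/frame 0xfffffe0247de3a30
_sx_xlock_hard() at _sx_xlock_hard+0x3e4/frame 0xfffffe0247de3ae0
in_leavegroup() at in_leavegroup+0x80/frame 0xfffffe0247de3b10
pfsync_multicast_cleanup() at pfsync_multicast_cleanup+0x2b/frame 0xfffffe0247de3b40
pfsyncioctl() at pfsyncioctl+0x6fd/frame 0xfffffe0247de3bc0
ifioctl() at ifioctl+0x7bc/frame 0xfffffe0247de3cc0
"""

chain = call_chain(SLEEPING_THREAD)
# Print the outermost caller first so the path reads top-down.
print(" -> ".join(reversed(chain)))
```

Printed outermost-first, the chain makes it easier to see that the sleepable sx lock is taken below pfsyncioctl(), the same function in which the panicking thread is waiting for a mutex.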
To Reproduce
Steps to reproduce the behavior:
1. Upgrade secondary node from 23.7 to 24.1
2. Switch over active/master to secondary node
3. Upgrade primary node from 23.7 to 24.1 and let it reboot
Expected behavior
Primary node should update and reboot without issues with HA state synchronization enabled
Describe alternatives you considered
After disabling HA state synchronization on the secondary, the primary node boots without problems.
Failover is not smooth because states are lost, but it works for now.
Relevant log files
See stack trace above. Full crash report was submitted after boot succeeded using Firmware/Reporter.
Environment
Software version used and hardware type if relevant, e.g.:
This also happens when upgrading to kernel 24.1.8, but I managed to work around it by setting the other firewall node as the unicast sync target IP in the UI, after which the boot loop stopped.
OPNsense 24.1.6-amd64
FreeBSD 13.2-RELEASE-p11
OpenSSL 3.0.13
Running directly on a Dell PowerEdge R6515 with 4x Broadcom Adv. Dual 25Gb Ethernet (everything on the latest available firmware)