
[Bug]: RAK4631 not responding #5491

Open
cracky22 opened this issue Dec 2, 2024 · 99 comments
Labels
bug Something isn't working

Comments

@cracky22
Copy link

cracky22 commented Dec 2, 2024

Category

Other

Hardware

Rak4631

Firmware Version

2.5.15.79da236

Description

I've already tried restarting, resetting and even re-flashing, but my RAK4631 with the nRF52 chip can't cope with 80 nodes. I'm fairly sure the problem is related to the number of nodes, because on a channel I created with only 8 nodes the device stays online.

With over 80 nodes the RAK crashes from time to time (after between 6 and 9 hours) and the green LED stays lit continuously.

What can I do?

Relevant log output

No response

@cracky22 cracky22 added the bug Something isn't working label Dec 2, 2024
@SimbimChimbetov
Copy link

This problem only on 2.5.15?

@cracky22
Copy link
Author

cracky22 commented Dec 4, 2024

This problem only on 2.5.15?

No, unfortunately not. I have already tried older firmware versions and always ran into this problem. Do you know a working firmware version? It actually works fine if I just switch to the "Our place" channel, but then I don't get the nodes from the city.

@SimbimChimbetov
Copy link

I'm not sure, but maybe I have the same problem with v2.5.11.
I recently updated a node on a mountain with over 100 nodes in reach. A few days ago I lost the signal; the last packet showed 88% battery and the weather was fine.
Because it is really hard to access and I didn't have the time to go there, I can't confirm it yet; maybe it was just stolen.
I hope I have the time this weekend.

@cracky22
Copy link
Author

cracky22 commented Dec 4, 2024

It would be interesting to know whether the same happens on other nRF52 devices. I have a T1000-E and it works fine there. Could it be due to the different storage options, such as RAM and EEPROM?

@markbirss
Copy link
Contributor

markbirss commented Dec 4, 2024

It would be interesting to know whether the same happens on other nRF52 devices. I have a T1000-E and it works fine there. Could it be due to the different storage options, such as RAM and EEPROM?

All nRF52 devices have only 28 KB of littlefs storage available, where the preferences, BLE pairings and nodedb are stored (the littlefs block size also wastes some space).

As a measure to prevent this issue, the nodedb (db.proto) size was recently reduced from 100 to 80 entries.

#5346

image

You can confirm the size of the nodedb file by running this test firmware:

image

(https://discord.com/channels/867578229534359593/919642584480112750/1305904252626927729)

Use "list-files-s140_nrf52_611_softdevice-1.0.0.4265ae9.uf2" for the RAK4631;
the other one is for Seeed boards with a newer SoftDevice.

If you are able to share the output, it could help us understand the issue further.
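
For anyone reading along who would rather build the file listing themselves instead of flashing the test uf2, here is a minimal sketch in the same spirit. It assumes the Adafruit nRF52 Arduino core's InternalFS / Adafruit_LittleFS File API (which the firmware builds on); the output format is illustrative and is not taken from the test firmware's source.

// Illustrative sketch: print the entries under /prefs with their sizes,
// roughly what the "list-files" test uf2 reports over serial.
#include <Adafruit_LittleFS.h>
#include <InternalFileSystem.h>

using namespace Adafruit_LittleFS_Namespace;

void setup()
{
    Serial.begin(115200);
    while (!Serial) delay(10); // wait for the USB serial console

    InternalFS.begin();

    File dir = InternalFS.open("/prefs", FILE_O_READ);
    if (!dir || !dir.isDirectory()) {
        Serial.println("/prefs not found");
        return;
    }

    File entry = dir.openNextFile();
    while (entry) {
        Serial.print(entry.name());
        if (entry.isDirectory()) {
            Serial.println(" (directory)");
        } else {
            Serial.print(" (");
            Serial.print(entry.size());
            Serial.println(" bytes)");
        }
        entry.close();
        entry = dir.openNextFile();
    }
    dir.close();
}

void loop() {}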

@cracky22
Copy link
Author

cracky22 commented Dec 4, 2024

(quoting markbirss's reply above in full)

I can't even open that uf2; can you attach it here?


@cracky22
Copy link
Author

cracky22 commented Dec 4, 2024

So that means I connect the RAK to my computer and send you the logs?


@thebentern
Copy link
Contributor

thebentern commented Dec 4, 2024

Please note that this file will not work for the T1000-E, as it uses a different SoftDevice version
list-files-s140_nrf52_611_softdevice-1.0.0.4265ae9(1).uf2.zip

@cracky22
Copy link
Author

cracky22 commented Dec 4, 2024

So that means I connect the RAK to my computer and send you the logs?

.

@thebentern
Copy link
Contributor

I have had rotten luck reproducing this issue so far, but today I am trying an all day run of a RAK board connected to msh/US against MQTT (client proxy) to see if it triggers at all for me. >140 nodes witnessed so far.
image

@cracky22
Copy link
Author

cracky22 commented Dec 5, 2024

I have had rotten luck reproducing this issue so far, but today I am trying an all day run of a RAK board connected to msh/US against MQTT (client proxy) to see if it triggers at all for me. >140 nodes witnessed so far.
image

Hi, how do you get this output? Is there a tool that can log and save all the important information?

@cracky22
Copy link
Author

cracky22 commented Dec 5, 2024

Or is it just serial?

@thebentern
Copy link
Contributor

Or is it just serial?

It is just serial logs. I like to use tio (https://github.com/tio/tio) because it will re-attach to the device in the case of a failure or reboot.

I ran the RAK node for about 7 hours yesterday on the msh/US topic and picked up over 600 nodes with no crashes or failures. To rule out any file corruption issues, have you tried a factory reset (or even just a nodedb reset)?

@cracky22
Copy link
Author

cracky22 commented Dec 6, 2024

Or is it just serial?

It is just serial logs. I like to use tio (https://github.com/tio/tio) because it will re-attach to the device in the case of a failure or reboot.

I ran the RAK node for about 7 hours yesterday on the msh/US topic and picked up over 600 nodes with no crashes or failures. To rule out any file corruption issues, have you tried a factory reset (or even just a nodedb reset)?

How did you get 600 nodes? We're not talking about MQTT, are we?

@garthvh
Copy link
Member

garthvh commented Dec 6, 2024

Or is it just serial?

It is just serial logs. I like to use tio (https://github.com/tio/tio) because it will re-attach to the device in the case of a failure or reboot.
I ran the RAK node for about 7 hours yesterday on the msh/US topic and picked up over 600 nodes with no crashes or failures. To rule out any file corruption issues, have you tried a factory reset (or even just a nodedb reset)?

How did you get 600 nodes? We're not talking about MQTT, are we?

Yes, that is how you get to 600 nodes quickly; the topics are for MQTT.

@cracky22
Copy link
Author

cracky22 commented Dec 7, 2024

image

@cracky22
Copy link
Author

I know this is probably unnecessary, but it would be nice if there were an option in the firmware to query how much "memory" is available on the board, and if that could also be displayed in the Android app as a small graphic or text.

@markbirss
Copy link
Contributor

I know this is probably unnecessary, but it would be nice if there were an option in the firmware to query how much "memory" is available on the board, and if that could also be displayed in the Android app as a small graphic or text.

The littlefs support for nrf52 doesn't currently have a function to report free space.
You could open a feature request on the Android app for showing free memory:
https://github.com/meshtastic/Meshtastic-Android/issues
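
Since there is no free-space call, a rough used-space figure can still be approximated by walking the filesystem and rounding every file up to the littlefs block size, in the spirit of the listing sketch earlier in the thread. This is a sketch only: it assumes the same Adafruit nRF52 core API, assumes a 128-byte block size (not verified here), does not recurse into subdirectories, and ignores the blocks used by directories and littlefs metadata, so treat the result as a lower bound.

#include <Adafruit_LittleFS.h>
#include <InternalFileSystem.h>

using namespace Adafruit_LittleFS_Namespace;

static const uint32_t kAssumedBlockSize = 128; // assumption, not read from the core headers

// Round a file size up to a whole number of (assumed) littlefs blocks.
static uint32_t roundUpToBlock(uint32_t bytes)
{
    return ((bytes + kAssumedBlockSize - 1) / kAssumedBlockSize) * kAssumedBlockSize;
}

// Sum the rounded sizes of the plain files directly under `path`.
static uint32_t estimateUsedBytes(const char *path)
{
    uint32_t total = 0;
    File dir = InternalFS.open(path, FILE_O_READ);
    if (!dir) return 0;

    File entry = dir.openNextFile();
    while (entry) {
        if (!entry.isDirectory()) total += roundUpToBlock(entry.size());
        entry.close();
        entry = dir.openNextFile();
    }
    dir.close();
    return total;
}

// Example use, after InternalFS.begin():
//   Serial.println(estimateUsedBytes("/prefs"));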

@cracky22
Copy link
Author

OK, I can do that. But what about my problem with the RAK? What can I do, or how can I debug this further?

@markbirss
Copy link
Contributor

OK, I can do that. But what about my problem with the RAK? What can I do, or how can I debug this further?

OK, are you able to capture a log when the reboot/crash occurs?
Does the file listing still show the db.proto size after it has crashed? (Or is that the listing you already provided?)

@cracky22
Copy link
Author

That is already the listing.
No, I can't capture a log of the crash, because it takes between 6 and 9 hours to happen.

@cracky22
Copy link
Author

@tavdog hi, I just discovered your commit #5670 and updated straight away; I hope this solves my problems with the RAK. Is there any chance that the watchdog, for example, has a fixed memory area in which it logs data? That would be very useful if my RAK crashes again. Unfortunately, I can't keep it connected to the PC 24/7 to save the logs.

@tavdog
Copy link
Contributor

tavdog commented Dec 29, 2024

@tavdog hi, I just discovered your commit #5670 and updated straight away; I hope this solves my problems with the RAK. Is there any chance that the watchdog, for example, has a fixed memory area in which it logs data? That would be very useful if my RAK crashes again. Unfortunately, I can't keep it connected to the PC 24/7 to save the logs.

My fix only formats the filesystem when the lfs assert is triggered. I don't think it will help your issue, and if it does, it will probably result in a wiped state.

@analogman2
Copy link

I am experiencing the same or a similar issue on a RAK WisBlock 4631. The node crashed after a week; the last time I checked it had >150 nodes. I got here from issue #5648.

DEBUG | ??:??:?? 1 Filesystem files:
DEBUG | ??:??:?? 1 prefs (directory)
DEBUG | ??:??:?? 1 channels.proto (147 Bytes)
DEBUG | ??:??:?? 2 module.proto (118 Bytes)
DEBUG | ??:??:?? 2 config.proto (169 Bytes)
DEBUG | ??:??:?? 2 db.proto.tmp (27377 Bytes)
DEBUG | ??:??:?? 2 adafruit (directory)
DEBUG | ??:??:?? 2 bond_prph (directory)
DEBUG | ??:??:?? 2 bond_cntr (directory)
DEBUG | ??:??:?? 2 Power::lipoInit lipo sensor is not ready yet
DEBUG | ??:??:?? 2 Use analog input 5 for battery level
INFO | ??:??:?? 2 Scan for i2c devices
DEBUG | ??:??:?? 2 Scan for I2C devices on port 1
INFO | ??:??:?? 2 No I2C devices found
DEBUG | ??:??:?? 2 acc_info = 0
INFO | ??:??:?? 2 S:B:9,2.5.16.f81d3b0
INFO | ??:??:?? 2 Build timestamp: 1733662250
DEBUG | ??:??:?? 2 Reset reason: 0x0
DEBUG | ??:??:?? 2 Set random seed 1102827228
INFO | ??:??:?? 2 Init NodeDB
ERROR | ??:??:?? 2 Could not open / read /prefs/db.proto
WARN | ??:??:?? 2 Devicestate 0 is old, discard
INFO | ??:??:?? 2 Install default DeviceState
DEBUG | ??:??:?? 2 Initial packet id 913518860
DEBUG | ??:??:?? 2 Partially randomized packet id 544149773
DEBUG | ??:??:?? 2 Use nodenum 0x9e76f91f
INFO | ??:??:?? 2 Load /prefs/config.proto
INFO | ??:??:?? 2 Loaded /prefs/config.proto successfully
INFO | ??:??:?? 2 Loaded saved config version 23
INFO | ??:??:?? 2 Load /prefs/module.proto
INFO | ??:??:?? 2 Loaded /prefs/module.proto successfully
INFO | ??:??:?? 2 Loaded saved moduleConfig version 23
INFO | ??:??:?? 2 Load /prefs/channels.proto
INFO | ??:??:?? 2 Loaded /prefs/channels.proto successfully
INFO | ??:??:?? 2 Loaded saved channelFile version 23
DEBUG | ??:??:?? 2 cleanupMeshDB purged 0 entries
DEBUG | ??:??:?? 2 Use nodenum 0x9e76f91f
INFO | ??:??:?? 2 Adding node to database with 1 nodes and 153976 bytes free!
DEBUG | ??:??:?? 2 Expand short PSK #1
INFO | ??:??:?? 2 Wanted region 1, using US
INFO | ??:??:?? 2 Save /prefs/db.proto
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
lfs warn:314: No more free space 224
ERROR | ??:??:?? 2 Error: can't encode protobuf io error
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: block < lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count
ERROR | ??:??:?? 2 LFS assert: head >= 2 && head <= lfs->cfg->block_count #75ish more times
ERROR | ??:??:?? 2 LFS asser
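
For context, some back-of-the-envelope arithmetic using the figures in this thread plus one assumption (a 128-byte littlefs block size, which is not stated anywhere above): if the 28 KB quoted earlier means 28 KiB, that is 28672 bytes, or exactly 224 blocks of 128 bytes, and 224 is also the number in the "lfs warn:314: No more free space 224" line, so that reading of the geometry looks plausible. Against such a budget, the 27377-byte db.proto.tmp alone takes about 27.4 KB, leaving well under 1.5 KB for channels.proto (147 B), config.proto (169 B), module.proto (118 B), the BLE bond directories and littlefs metadata, so a failed save followed by an assert storm is consistent with the filesystem simply being full.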

@esev
Copy link
Contributor

esev commented Jan 12, 2025

Just an observation: for the connection just prior to the LFS failures, there is no "Client wants config" log.

INFO  | 15:14:57 3360 [EInkDynamicDisplay] BLE Disconnected, reason = 0x8
DEBUG | 15:14:57 3360 [EInkDynamicDisplay] PhoneAPI::close()
DEBUG | 15:14:57 3361 [EInkDynamicDisplay] Async full-refresh complete
DEBUG | 15:14:57 3361 [RadioIf] Started Tx (id=0x211f2819 fr=0x3c to=0xff, WantAck=0, HopLim=2 Ch=0x8 encrypted rxtime=1736694894 rxSNR=6 rxRSSI=-86 priority=64)
DEBUG | 15:14:57 3361 [RadioIf] Packet TX: 509ms
DEBUG | 15:14:58 3361 [RadioIf] Completed sending (id=0x211f2819 fr=0x3c to=0xff, WantAck=0, HopLim=2 Ch=0x8 encrypted rxtime=1736694894 rxSNR=6 rxRSSI=-86 priority=64)
INFO  | 15:15:03 3367 BLE Connected to "android phone"
INFO  | 15:15:04 3367 BLE connection secured

@cracky22
Copy link
Author

@garthvh So, the odd thing about the problem is that the RAK keeps blinking the whole time as long as it is connected to the serial console and that tab is active. As soon as several log messages accumulate in the background and the RAK freezes (i.e. the LED stays lit continuously), it looks as if it is boot-looping, which is the problem I currently have; at that point I have no connection on the phone. However, if I then go to the tab with the serial console on the PC and scroll down to the end of the logs with the mouse, the RAK comes out of this state. As soon as all the logs have loaded, the Android device reconnects and the RAK starts blinking again.

here are my logs
meshtastic-log-2025-01-13T18-17-54.737Z.log

@c0dexter
Copy link

A friend of mine had the same issue on a custom-built nRF node. He flashed a modified 2.5.19 firmware with the node list limited to 80 entries. Now his node is stable: no issues after connecting and disconnecting Bluetooth, loading the collected messages after re-connecting to the node, and so on. A few days now without any freezes, uncontrolled resets, or lost configs.

@esev
Copy link
Contributor

esev commented Jan 14, 2025

@c0dexter Do you mean 2.5.18 FW? Installing 2.5.19 on NRF52 nodes can/will cause a different issue.

@c0dexter
Copy link

@esev No, I mean 2.5.19; he compiled this version with his own changes and limited the nodedb to 80 items. Maybe this version has more issues, but for him it has been working for more than 72 h without unexpected behavior.

@esev
Copy link
Contributor

esev commented Jan 14, 2025

Oh, I see. 2.5.19 was only released 20 hours ago. But I bet the build string in his version contains 2.5.19. The version released 20 hours ago really shouldn't be used.

What were his changes? The limit of 80 was introduced a couple of releases ago. In 2.5.13 IIRC.

@c0dexter
Copy link

@esev I will ask him tomorrow about his changes, I will let you know

@c0dexter
Copy link

c0dexter commented Jan 15, 2025 via email

@cracky22
Copy link
Author

cracky22 commented Jan 15, 2025

@c0dexter But what did he change the 80 nodes to?

@cracky22
Copy link
Author

cracky22 commented Jan 15, 2025

@garthvh @fifieldt @GUVWAF Interesting info: the RAK stays online for more than 3 days if it is not charging but just running from the battery.

Edit: in case it's relevant, I have the RAK connected to a solar panel (a Soshine 6 W).

@c0dexter
Copy link

c0dexter commented Jan 15, 2025 via email

@cracky22
Copy link
Author

@c0dexter ok thanks. Can I limit the nodes myself, for example to 60?
I still have the freezing problem

@c0dexter
Copy link

c0dexter commented Jan 15, 2025 via email

@c0dexter
Copy link

c0dexter commented Jan 15, 2025 via email

@cracky22
Copy link
Author

While testing and looking for the reason for the freezing issue on the NRF-based node, he tried to limit the nodedb, because he thought it could be a potential stability problem if the device has too little free space in storage. That's why he limited the node list to 80.

Maybe it's not a solution to the main problem, but limiting the nodes to 80 on the 2.5.19 FW shows, in his case, that the node stays alive.


I can't see the 80 node limit in https://github.com/meshtastic/firmware/blob/master/variants%2Frak4631%2Fvariant.h

@thebentern
Copy link
Contributor

There is already an 80 node limit on NRF52 devices. It's defined in nrf52.ini, not in the variant.

Image

@cracky22
Copy link
Author

Hi Ben, can you please build me something that will automatically restart my node every 12 hours? Something that I can put into the firmware myself. I would really like to temporarily stop my crashes.

@garthvh
Copy link
Member

garthvh commented Jan 15, 2025

It does not happen on iOS. We do not do workarounds where things reboot on timers; we need to find the source of the bug.

@cracky22
Copy link
Author

Yes, I know that, and I appreciate that you guys fix most of the bugs, but I currently need a workaround to temporarily mitigate the problem. It's quite annoying to have to power-cycle the RAK, which in my case is meant to be an off-site node, every 6-9 hours!
To my knowledge there is no firmware version where the problem does not occur; before 2.5 everything was fine.

@esev
Copy link
Contributor

esev commented Jan 15, 2025

For folks with the lfs debug error messages, how often do you move your phone outside of Bluetooth range with your nodes without disconnecting first? I think the busyTx errors might be a separate issue.
See #5839 for the lfs debug issue. We may be real close to a fix.

@JimTheCowboy
Copy link

For folks with the lfs debug error messages, how often do you move your phone outside of Bluetooth range with your nodes without disconnecting first? I think the busyTx errors might be a separate issue. See #5839 for the lfs debug issue. We may be real close to a fix.

I am not sure which issue we have, "lfs debug" or "busyTx".

But I can answer your BT range question:
Most users won't care about disconnecting before walking off.

Some nodes are in a fixed position, the user comes and goes as he pleases.
Other nodes are in cars, the user moves about his house, in and out of range.
I have a portable node which I carry on my belt but have to leave in my car or backpack randomly during the day.

Each of those behaviors will trigger the error over enough elapsed time.

It's just not practical to disconnect it. imho

@esev
Copy link
Contributor

esev commented Jan 15, 2025

It's just not practical to disconnect it. imho

Definitely agree. That's just what we've diagnosed to be the trigger. There is a fix in the works that addresses the lfs debug error cases.

@JimTheCowboy
Copy link

Yes, I know that, and I appreciate that you guys fix most of the bugs, but I currently need a workaround to temporarily mitigate the problem. It's quite annoying to have to power-cycle the RAK, which in my case is meant to be an off-site node, every 6-9 hours! To my knowledge there is no firmware version where the problem does not occur; before 2.5 everything was fine.

before 2.5 everything was fine

With absolute certainty that is not true. The issue started way before 2.5, probably it just got worse afterwards.

One of our users has a T-Echo that had no updates for at least 6 months: no issues at all.
His other T-Echo, which I updated with beta versions, started having issues.

Another user runs his RAK19007/4631 home node and reboots it every day because of the boot-loop/memory-loss issue.
This started around late spring/early summer 2024, but we haven't been able to pinpoint the firmware version.
(And we stopped trying, sorry.)

@esev
Copy link
Contributor

esev commented Jan 15, 2025

What is your current solution for getting out of the boot-loop?

@JimTheCowboy
Copy link

JimTheCowboy commented Jan 15, 2025

nrf factory reset...

but most of "our users" are just this: users.
They can't be bothered with python CLI and such, which I understand.

So we who do the updates and settings get the "nice feedback" )=

@esev
Copy link
Contributor

esev commented Jan 15, 2025

So here's my theory. Each of the lfs (LittleFS) errors causes the small file system on the devices to shrink by one block. As those errors keep occurring the file system gets too small to store the preferences/nodedb/etc files. If we can stop the flash write errors (which we've identified to be related to Bluetooth timeouts/disconnects), we can stop LFS from losing a block. #5839
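
If that theory holds, the arithmetic earlier in the thread makes the failure mode easy to believe: under the assumed 224 × 128-byte geometry, each lost block permanently removes 128 bytes from a region in which a db.proto of roughly 27 KB has already been observed, so only a handful of such events would be needed before ordinary saves stop completing.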

@thebentern
Copy link
Contributor

With absolute certainty that is not true. The issue started way before 2.5, probably it just got worse afterwards.

Absolutely correct. Ironically, the intended set of fixes in late 2.4 / early 2.5 to add more atomic save operations actually seems to have exacerbated the issues by introducing more FS contention and less free space. I am optimistic based on some of the recent discoveries that we will improve the reliability dramatically.

@cracky22
Copy link
Author

cracky22 commented Jan 15, 2025

So my problem on the RAK is not only the lfs problem but also busyTx, which is why I get a critical error.

@esev
Copy link
Contributor

esev commented Jan 15, 2025

So my problem on the RAK is not only the lfs problem but also busyTx, which is why I get a critical error.

Is it possible the busyTX issue was fixed with #5820?

Edit: Oh! @c0dexter Did your custom firmware include that change?

@cracky22
Copy link
Author

cracky22 commented Jan 15, 2025

@esev So my custom firmware included the fix for this, and it still occurs on the RAK...

Edit: I used nrf erase before flashing the custom firmware.
