[Bug]: RAK4631 not responding #5491
Comments
Does this problem occur only on 2.5.15? |
No, unfortunately not. I have already tried older fw versions and always encountered this problem. Do you know of a working fw? It actually works fine if I just switch to the channel "Our place" and do not use the ones from the city |
I'm not sure but maybe I have the same problem with v 2.5.11 |
It would be interesting to know if it is the same for other nrf52 devices. I have a t1000-e but it works there. Could it be due to the different storage options such as ram and eeprom? |
All nRF52 devices have only 28 KB of littlefs storage available, where preferences, BLE pairings and the nodedb are stored (the littlefs block size also wastes some space). As a measure to prevent this issue, the nodedb (db.proto) size was recently reduced from 100 to 80 nodes. You can confirm the size of the nodedb file by running this test fw file (https://discord.com/channels/867578229534359593/919642584480112750/1305904252626927729); use "list-files-s140_nrf52_611_softdevice-1.0.0.4265ae9.uf2" for the RAK4631. If you are able to share the output, this could help us understand the issue further. |
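To make the "block size wastes some space" remark concrete, here is a minimal host-side sketch of the arithmetic. It is a simplified model, not the firmware's or littlefs's actual accounting: the 4096-byte block size, the file names and the file sizes are all assumptions chosen for illustration, and directory or metadata blocks are ignored.

```python
import math
from typing import Mapping

# Assumptions for illustration only (not read from the firmware source):
TOTAL_BYTES = 28 * 1024   # the 28 KB quoted in the comment above
BLOCK_SIZE = 4096         # assumed littlefs block size in bytes


def blocks_for(size_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Whole blocks one file occupies in this model (at least one)."""
    return max(1, math.ceil(size_bytes / block_size))


def free_blocks(files: Mapping[str, int],
                total_bytes: int = TOTAL_BYTES,
                block_size: int = BLOCK_SIZE) -> int:
    """Blocks left after storing the given files (name -> size in bytes)."""
    used = sum(blocks_for(size, block_size) for size in files.values())
    return total_bytes // block_size - used


if __name__ == "__main__":
    # Hypothetical file sizes, chosen only to show the rounding effect.
    example = {"config.proto": 200, "db.proto": 9000, "ble_bond": 100}
    print(free_blocks(example))
```

Under these assumptions a 200-byte file costs as much as a 4096-byte one, so a handful of small files can use up most of the blocks, even though their combined size is well below 28 KB.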
I can't even open the uf2, can you put it here? |
So that means I connect the RAK to my computer and send you the logs? |
Please note that this file will not work for the T1000-E, as it uses a different SoftDevice version |
Or is it just serial? |
It is just serial logs. I like to use tio (https://github.com/tio/tio) because it will re-attach to the device in the case of a failure or reboot. I ran the RAK node for about 7 hours yesterday on the msh/US topic and picked up over 600 nodes with no crashes or failures. To rule out any file corruption problems, have you tried a factory reset (or even just a nodedb reset)? |
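For anyone who wants to leave a capture running unattended, the re-attach idea can be sketched in a few lines. This is not how tio works internally; it is a minimal illustration in which `open_port` is a hypothetical callable that returns an iterable of text lines (in practice it would wrap a real serial port, which is left out here), and the retry limits are arbitrary.

```python
import time
from datetime import datetime, timezone
from typing import Callable, Iterable, TextIO


def capture_with_reattach(open_port: Callable[[], Iterable[str]],
                          sink: TextIO,
                          max_attempts: int = 5,
                          retry_delay_s: float = 1.0) -> int:
    """Copy lines from a serial-like source to `sink`, re-opening on failure.

    Returns the number of lines written.
    """
    written = 0
    for _ in range(max_attempts):
        try:
            for line in open_port():
                stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
                sink.write(f"{stamp} {line.rstrip()}\n")
                sink.flush()          # keep the log intact if the host dies too
                written += 1
            break                     # the source ended cleanly
        except OSError:
            # The device vanished (reboot or crash): wait, then re-attach.
            time.sleep(retry_delay_s)
    return written
```

Timestamping each line on the host makes it easier to see how long the node ran before the last message, and flushing after every line means the final lines before a freeze are already on disk.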
How did you get 600 nodes? We're not talking about MQTT, are we? |
Yes, that is how you get to 600 nodes quickly; the topics are for MQTT |
I know this isn't strictly necessary, but would it be possible to add an option in the firmware to query how much "memory" is available on the board? That could also be displayed in the Android app as a small graphic/text |
The nRF52-specific littlefs support doesn't currently have a function to get the free space |
Ok, I can do that. But what about my problem with the RAK? What can I do, or how can I debug further? |
Ok, are you able to capture a log at all when the reboot/crash occurs? |
This is already the list. |
@tavdog Hi, I just discovered your commit #5670 and updated straight away. I hope this solves my problems with the RAK. Is there a possibility that the watchdog, for example, has a fixed memory area in which it logs data? That would be very useful if my RAK crashes again. Unfortunately, I can't have it connected to the PC 24/7 to save the logs |
My fix only formats the filesystem when an lfs assert is triggered. I don't think it will help your issue, and if it does it will probably result in a wiped state. |
I am experiencing the same or a similar issue on a RAK Wisblock 4631. The node crashed after a week, and the last time I checked it had >150 nodes. I got here from issue #5648
|
Just an observation: For the connection just prior to the LFS failures, there is no
|
@garthvh So, the funny thing about the problem is that the RAK keeps flashing as long as I am connected to the serial console and have the tab active. As soon as several log messages accumulate (in the background), the RAK freezes (i.e. the LED lights up continuously) and it looks as if it is in a boot loop, which is the problem I currently have. This means I have no connection on the cell phone. However, if I then go to the tab with the serial console on the PC and scroll down to the end of the logs with the mouse, the RAK leaves this state. As soon as all logs are loaded, the Android device reconnects and the RAK starts flashing again. Here are my logs |
A friend of mine had the same issue on a custom-built nRF node. He flashed a modified 2.5.19 FW with the node list limited to 80 entries. Now his node is stable: no issues after connecting and disconnecting BT, loading collected messages after reconnecting to the node, and so on. It has run for a few days without any freezes, uncontrolled resets or lost configs. |
@c0dexter Do you mean 2.5.18 FW? Installing 2.5.19 on NRF52 nodes can/will cause a different issue. |
@esev No, I mean 2.5.19. He compiled this version with his changes: he limited nodes.db to 80 items. Maybe this version has more issues, but for him it has been working for more than 72 h without unexpected behavior |
Oh, I see. 2.5.19 was only released 20 hours ago. But I bet the build string in his version contains 2.5.19. The version released 20 hours ago really shouldn't be used. What were his changes? The limit of 80 was introduced a couple of releases ago. In 2.5.13 IIRC. |
@esev I will ask him tomorrow about his changes, I will let you know |
He told me that he changed only this value:
#define MAX_NUM_NODES 80
|
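The effect of a compile-time cap such as the `#define MAX_NUM_NODES 80` quoted above can be illustrated with a small host-side model. This is not the firmware's NodeDB: the class name is made up for the sketch, and the eviction policy (drop the least recently heard node once the table is full) is an assumption.

```python
from collections import OrderedDict
from typing import List

MAX_NUM_NODES = 80  # the value quoted in the thread


class BoundedNodeTable:
    """Keeps at most `capacity` nodes; drops the least recently heard one."""

    def __init__(self, capacity: int = MAX_NUM_NODES) -> None:
        self.capacity = capacity
        self._nodes: "OrderedDict[int, str]" = OrderedDict()

    def heard(self, node_id: int, name: str = "") -> None:
        """Record a packet from node_id, evicting the oldest entry if full."""
        self._nodes[node_id] = name
        self._nodes.move_to_end(node_id)        # mark as most recently heard
        while len(self._nodes) > self.capacity:
            self._nodes.popitem(last=False)     # assumed eviction policy

    def ids(self) -> List[int]:
        """Node ids ordered from least to most recently heard."""
        return list(self._nodes)

    def __len__(self) -> int:
        return len(self._nodes)
```

With a cap like this, the table (and therefore the file that persists it) stays bounded no matter how many nodes are heard, for example the 600 picked up over MQTT earlier in the thread. A smaller cap trades stored history for storage headroom.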
@c0dexter But what did he change the 80 nodes to? |
While testing and looking for the reason the nRF-based node was freezing,
he tried limiting node.DB because he thought device stability could
suffer if too little space was left in storage. That's why he limited
the node list to 80.
Maybe it's not a solution to the main problem, but in his case limiting
the nodes to 80 on the 2.5.19 FW keeps the node alive.
|
@c0dexter ok thanks. Can I limit the nodes myself, for example to 60? I still have the freezing problem |
You can do this based on 2.5.19 (there are fixes and enhancements for nRF),
I think, but you would have to compile the FW on your own. Take a look at this old
info from the Meshtastic team. FW 2.5+ is larger and needs more space, so the number of
nodes could be a problem because of the small space in storage:
https://ibb.co/4N9nwTV
---
One more thing: do you have the same freezing issue with your nRF-based node
devices if you're using an iPhone? The root issue is definitely located
around BT connection/reconnection. Another friend told me that
this issue doesn't come up if you're using an iPhone, but does every time you
are using Android. Maybe it's worth investigating this hint to confirm it or
not.
|
Hmm... It seems that the people in ticket #5858 are trying to solve
the issue with node.db and the Bluetooth issue on Android
|
I can't see the 80 node limit in https://github.com/meshtastic/firmware/blob/master/variants%2Frak4631%2Fvariant.h |
Hi Ben, can you please build me something that will automatically restart my node every 12 hours? Something that I can put into the firmware myself.. I would really like to temporarily stop my crashes.. |
It does not happen on iOS. We do not do workarounds where things reboot on timers; we need to find the source of the bug. |
Yes, I know that, and I appreciate that you guys fix most of the bugs, but I currently need a workaround to temporarily fix the problem. It's quite annoying to have to unplug the RAK, which is meant to be an off-site node in my case, every 6-9 hours! |
For folks with the |
I am not sure which issue we have, "lfs debug" or "busyTx". But I can answer your BT range question: Some nodes are in a fixed position, the user comes and goes as he pleases. Each of those behaviors will trigger the error over enough elapsed time. It's just not practical to disconnect it. imho |
Definitely agree. That's just what we've diagnosed to be the trigger. There is a fix in the works that addresses the |
With absolute certainty that is not true. The issue started way before 2.5; it probably just got worse afterwards. One of our users has a T-Echo that had no updates for at least 6 months, with no issues at all. Another user runs his RAK19007/4631 home node and reboots it every day because of the boot-loop/memory loss issue. |
What is your current solution for getting out of the boot-loop? |
nrf factory reset... but most of "our users" are just this: users. So we who do the updates and settings get the "nice feedback" )= |
So here's my theory. Each of the |
Absolutely correct. Ironically, the intended set of fixes in late 2.4 / early 2.5 to add more atomic save operations actually seems to have exacerbated the issues by introducing more FS contention and less free space. I am optimistic based on some of the recent discoveries that we will improve the reliability dramatically. |
So my problem on the RAK is not only the lfs problem but also busyTX, which is why I get a critical error |
@esev So my custom firmware included the fix for this and it still occurs on the RAK... Edit: I used nrf erase before flashing the custom fw |
Category: Other
Hardware: Rak4631
Firmware Version: 2.5.15.79da236
Description:
I've already tried restarting, resetting and even re-flashing, but my RAK4631 with nrf52 chip doesn't hold 80 nodes. I can definitely identify the problem in connection with the nodes because I have created a channel where there are only 8 nodes and the device stays online there.
With over 80 nodes the rak crashes from time to time (between 6 and 9 hours) and the green LED lights up continuously.
What can I do??
Relevant log output: No response