fix: off-by-one in syslog.rs #810

CertainLach · 2023-12-04T21:14:44Z

Fixes: #809

If there is no whitespace in the string, then LIMIT + ascii_utf8_len + DOTDOTDOT_END.len() + NULL_BYTE exceeds BUFSZ = LIMIT + DOTDOTDOT_END.len() + NULL_BYTE;

I'm not sure what's the original purpose of ascii_utf8_len was, to move whitespace to the left part of the buffer? Shouldn't it be dropped then, as DOTDOTDOT_START / DOTDOTDOT_END already include space at end / at start?

I may restore this behavior, if this is preferred.

                mid = message[..mid]
                    .rfind(|c: char| c.is_ascii_whitespace())
                    .map(|p| p + ascii_utf8_len)
                    .unwrap_or(mid)
                    .max(mid)

Fixes: trifectatechfoundation#809

codecov · 2023-12-04T21:18:07Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5b25cb0) 57.30% compared to head (6c6b70c) 57.32%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #810      +/-   ##
==========================================
+ Coverage   57.30%   57.32%   +0.02%     
==========================================
  Files          74       74              
  Lines       10497    10505       +8     
==========================================
+ Hits         6015     6022       +7     
- Misses       4482     4483       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

squell · 2023-12-05T09:38:56Z

This fix looks reasonable to me, but the introduction of ascii_utf8_len does seem very purposeful (but also suspect, especially given that with this alteration the code passes through al the test cases with flying colours, which means it's not doing anything that we explicitly test for).

squell · 2023-12-05T09:53:57Z

@oneElectron

squell

Generally I'd say, since DOTDOT_{START,END} contains whitespace: if we found the whitespace we could even omit it anyway (i.e. the left half doesn't need to include it, nor the right one), although that would involve yet more complex logic which this part of the code base doesn't need.

This fix seems to move the whitespace (if it is found) from the left part to the right part (which seems a neutral change to me, wrt the above observation); and it fixes clearly incorrect behaviour in case no whitespace was there.

oneElectron · 2023-12-05T12:44:49Z

I originally wrote this code.
The reason for the ascii_utf_len variable was because a precious iteration of the code required it, something which i forgot to remove later.

However, adding one to the end index of the syslog message was intended to move the whitespace from the start of the next syslog message to the end of the current message, a quick hack to hide the whitespace. I guess I didn't think about this edge case, so thank you for pointing out my mistake.

The proper method of doing this would be to leave out the whitespace entirely, and a simpler method would be to keep the +1 on the index of the last character in the buffer, but to subtract one from the limit.

Edit: The last solution would have the drawback of sometimes wasting space by pushing a word onto the next buffer, when it could habe fit on the current buffer.

squell · 2023-12-08T09:40:17Z

Thanks for clarifying. I think we can merge this as is (as a quick PR that just fixes a bug & introduces a regression test for it); the remaining issue is then what the proper of presenting long lines in the syslog should be.

I understand that the above suggestion is basically to keep the + 1 but to say

 let left = &message[..mid-1];
 let right = &message[mid..];

But I'm getting a "too hacky" sense (trying to fix a problem by introducing a subtle tweak to a formule here and there); probably it's best to just differentiate clearly between the two cases: either there is some whitespace where we can make a nice break, trim whitespace, etc., or there is none and we just have to cut the message at the limit.

Either of you feel free to open a new issue for that and shoot in a PR!

squell · 2023-12-08T09:42:13Z

Also, note that this is the kind of memory safety bug that would have gone unnoticed in C (until years down the line someone would have found a way to abuse it); so this is a very nice issue to have uncovered. Thanks both of you!

fix: off-by-one in syslog.rs

6c6b70c

Fixes: trifectatechfoundation#809

squell approved these changes Dec 5, 2023

View reviewed changes

squell added this pull request to the merge queue Dec 8, 2023

Merged via the queue into trifectatechfoundation:main with commit 7f82d3a Dec 8, 2023
14 checks passed

CertainLach deleted the fix/syslog-off-by-one branch December 8, 2023 11:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: off-by-one in syslog.rs #810

fix: off-by-one in syslog.rs #810

CertainLach commented Dec 4, 2023 •

edited

Loading

codecov bot commented Dec 4, 2023

squell commented Dec 5, 2023

squell commented Dec 5, 2023

squell left a comment

oneElectron commented Dec 5, 2023 •

edited

Loading

squell commented Dec 8, 2023

squell commented Dec 8, 2023

fix: off-by-one in syslog.rs #810

fix: off-by-one in syslog.rs #810

Conversation

CertainLach commented Dec 4, 2023 • edited Loading

codecov bot commented Dec 4, 2023

Codecov Report

squell commented Dec 5, 2023

squell commented Dec 5, 2023

squell left a comment

Choose a reason for hiding this comment

oneElectron commented Dec 5, 2023 • edited Loading

squell commented Dec 8, 2023

squell commented Dec 8, 2023

CertainLach commented Dec 4, 2023 •

edited

Loading

oneElectron commented Dec 5, 2023 •

edited

Loading