Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: off-by-one in syslog.rs #810

Merged

Conversation

CertainLach
Copy link
Contributor

@CertainLach CertainLach commented Dec 4, 2023

Fixes: #809

If there is no whitespace in the string, then LIMIT + ascii_utf8_len + DOTDOTDOT_END.len() + NULL_BYTE exceeds BUFSZ = LIMIT + DOTDOTDOT_END.len() + NULL_BYTE;

I'm not sure what's the original purpose of ascii_utf8_len was, to move whitespace to the left part of the buffer? Shouldn't it be dropped then, as DOTDOTDOT_START / DOTDOTDOT_END already include space at end / at start?

I may restore this behavior, if this is preferred.

                mid = message[..mid]
                    .rfind(|c: char| c.is_ascii_whitespace())
                    .map(|p| p + ascii_utf8_len)
                    .unwrap_or(mid)
                    .max(mid)

Copy link

codecov bot commented Dec 4, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (5b25cb0) 57.30% compared to head (6c6b70c) 57.32%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #810      +/-   ##
==========================================
+ Coverage   57.30%   57.32%   +0.02%     
==========================================
  Files          74       74              
  Lines       10497    10505       +8     
==========================================
+ Hits         6015     6022       +7     
- Misses       4482     4483       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@squell
Copy link
Member

squell commented Dec 5, 2023

This fix looks reasonable to me, but the introduction of ascii_utf8_len does seem very purposeful (but also suspect, especially given that with this alteration the code passes through al the test cases with flying colours, which means it's not doing anything that we explicitly test for).

@squell
Copy link
Member

squell commented Dec 5, 2023

@oneElectron

Copy link
Member

@squell squell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I'd say, since DOTDOT_{START,END} contains whitespace: if we found the whitespace we could even omit it anyway (i.e. the left half doesn't need to include it, nor the right one), although that would involve yet more complex logic which this part of the code base doesn't need.

This fix seems to move the whitespace (if it is found) from the left part to the right part (which seems a neutral change to me, wrt the above observation); and it fixes clearly incorrect behaviour in case no whitespace was there.

@oneElectron
Copy link
Contributor

oneElectron commented Dec 5, 2023

I originally wrote this code.
The reason for the ascii_utf_len variable was because a precious iteration of the code required it, something which i forgot to remove later.

However, adding one to the end index of the syslog message was intended to move the whitespace from the start of the next syslog message to the end of the current message, a quick hack to hide the whitespace. I guess I didn't think about this edge case, so thank you for pointing out my mistake.

The proper method of doing this would be to leave out the whitespace entirely, and a simpler method would be to keep the +1 on the index of the last character in the buffer, but to subtract one from the limit.

Edit: The last solution would have the drawback of sometimes wasting space by pushing a word onto the next buffer, when it could habe fit on the current buffer.

@squell
Copy link
Member

squell commented Dec 8, 2023

Thanks for clarifying. I think we can merge this as is (as a quick PR that just fixes a bug & introduces a regression test for it); the remaining issue is then what the proper of presenting long lines in the syslog should be.

I understand that the above suggestion is basically to keep the + 1 but to say

 let left = &message[..mid-1];
 let right = &message[mid..];

But I'm getting a "too hacky" sense (trying to fix a problem by introducing a subtle tweak to a formule here and there); probably it's best to just differentiate clearly between the two cases: either there is some whitespace where we can make a nice break, trim whitespace, etc., or there is none and we just have to cut the message at the limit.

Either of you feel free to open a new issue for that and shoot in a PR!

@squell
Copy link
Member

squell commented Dec 8, 2023

Also, note that this is the kind of memory safety bug that would have gone unnoticed in C (until years down the line someone would have found a way to abuse it); so this is a very nice issue to have uncovered. Thanks both of you!

@squell squell added this pull request to the merge queue Dec 8, 2023
Merged via the queue into trifectatechfoundation:main with commit 7f82d3a Dec 8, 2023
14 checks passed
@CertainLach CertainLach deleted the fix/syslog-off-by-one branch December 8, 2023 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panics in syslog.rs
3 participants