A way to get the row/column of a Reader #109
Comments
This could be implemented somewhere along these lines:
impl<B: Seek + BufRead> Reader<B> {
    fn row_col(&self) -> (usize, usize) {
        // seek to 0, count lines until `buffer_position` is reached
    }
}
(Add a proper return type and a better name to taste.) Of course, this is pretty slow, as it re-reads the entire content. |
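A minimal sketch of the "seek to 0 and count lines" idea, written as a standalone function rather than a `Reader` method. It assumes the byte offset comes from something like `Reader::buffer_position()`; the name `row_col`, the `io::Result` return type, and the 1-based, byte-counted column are illustrative choices, not part of quick-xml.

```rust
use std::io::{self, BufRead, BufReader, Read, Seek, SeekFrom};

/// Returns the 1-based (row, column) of byte `offset` in `source`.
/// The column is counted in bytes, not characters.
/// Hypothetical helper: not part of quick-xml.
fn row_col<R: Read + Seek>(source: &mut R, offset: u64) -> io::Result<(usize, usize)> {
    // Remember the caller's position so it can be restored afterwards.
    let saved = source.stream_position()?;
    source.seek(SeekFrom::Start(0))?;

    let mut row = 1;
    let mut col = 1;
    let mut remaining = offset;
    {
        let mut reader = BufReader::new(&mut *source);
        while remaining > 0 {
            let buf = reader.fill_buf()?;
            if buf.is_empty() {
                break; // `offset` lies past the end of the input
            }
            // Only look at the bytes that precede `offset`.
            let take = remaining.min(buf.len() as u64) as usize;
            for &byte in &buf[..take] {
                if byte == b'\n' {
                    row += 1;
                    col = 1;
                } else {
                    col += 1;
                }
            }
            reader.consume(take);
            remaining -= take as u64;
        }
    }

    // Put the stream back where the caller left it.
    source.seek(SeekFrom::Start(saved))?;
    Ok((row, col))
}
```

For the input `"<a>\n  <b/>\n</a>"` wrapped in a `std::io::Cursor`, offset 6 (the `<` of `<b/>`) maps to row 2, column 3. The cost is linear in `offset` on every call, which is the slowness mentioned above.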
This looks like a good solution. |
Out of curiosity, @jonas-schievink do you plan on doing a PR? |
Not right now, no. If anyone wants to implement this, feel free! |
I implemented a That works fine; the problem is that As such, I've had to abandon I've uploaded my work-in-progress if anyone has a use for it. |
Thanks for the report. |
The At a bare minimum, I think |
Great. Thank you |
@DanielKeep I have created #123 which fixes an offset by one for I understand that you may not have time to spend on it, but just in case, I'll leave it open for the moment and merge it later. Thanks again for your feedback! EDIT: All your comments about a proper way to manage the buffer position are still valid. This is just a fix for the current situation. I'll try to find a way to explicitly manage positions (probably before managing rows/columns). |
Any progress on this? |
I've begun work on something similar, which can possibly be used as a building block for this - adding |
That sounds like a decent improvement. Separately on the question of calculating a line + column, it may not be terribly expensive to do SIMD-accelerated passes over the buffer continuously to keep track of how many lines have been seen previously. That would enable calculating the line + column without needing to "Seek" or have the entire document buffered, as you could start counting from the beginning of the current buffer and add it to what has been seen previously. One issue to look out for is encodings. Byte positions may not perfectly match up depending on how they are calculated in relation to decoding of the contents, |
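The incremental idea above can be sketched as a small tracker that is fed each buffer as it is consumed, so no `Seek` and no full-document buffering is needed. This is only an illustration: `LineTracker` and its methods are hypothetical names, the scan is a plain byte loop rather than SIMD (a crate such as `memchr` or `bytecount` could presumably replace the counting), and columns are counted in bytes.

```rust
/// Running line/column state, updated one consumed buffer at a time.
/// Hypothetical type: not part of quick-xml.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct LineTracker {
    /// Number of `\n` bytes seen so far.
    newlines: usize,
    /// Number of bytes seen since the last `\n` (or since the start).
    bytes_in_line: usize,
}

impl LineTracker {
    fn new() -> Self {
        LineTracker { newlines: 0, bytes_in_line: 0 }
    }

    /// Accounts for one chunk of input that has just been consumed.
    fn advance(&mut self, chunk: &[u8]) {
        match chunk.iter().rposition(|&b| b == b'\n') {
            Some(last) => {
                // Count every newline in the chunk; the current line now
                // starts right after the last one.
                self.newlines += chunk.iter().filter(|&&b| b == b'\n').count();
                self.bytes_in_line = chunk.len() - last - 1;
            }
            // No newline: the current line simply grows.
            None => self.bytes_in_line += chunk.len(),
        }
    }

    /// 1-based (row, column) of the next unread byte; the column is in bytes.
    fn position(&self) -> (usize, usize) {
        (self.newlines + 1, self.bytes_in_line + 1)
    }
}
```

Feeding the chunks `b"<a>\n  "` and then `b"<b/>\n</a"` gives positions (2, 3) and (3, 4) respectively. The state is two integers, so the overhead when positions are never queried is one pass over each buffer.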
Actually, I have had a prototype implementation in my private branch for several months, which I've just pushed. I've used an approach with a feature that enables storing
I do not think this will be a problem: although we work with bytes, we ensure that our boundaries are correct (i.e. not within a character). I think that users of spans will in any case expect these numbers to be indexes into the original byte stream. |
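To illustrate the difference between byte indexes and character positions, here is a small sketch that converts a byte column into a character column for UTF-8 input. The function name `char_column` is hypothetical, and it assumes the line has already been decoded to a `&str`; it is not how quick-xml represents spans.

```rust
/// Converts a 1-based byte column within `line` to a 1-based character column.
/// Returns `None` if the byte column is 0, lies past the end of the line,
/// or falls inside a multi-byte character.
/// Hypothetical helper: not part of quick-xml.
fn char_column(line: &str, byte_col: usize) -> Option<usize> {
    let byte_idx = byte_col.checked_sub(1)?;
    // `is_char_boundary` is false for indexes past the end and for
    // indexes in the middle of a multi-byte character.
    if !line.is_char_boundary(byte_idx) {
        return None;
    }
    Some(line[..byte_idx].chars().count() + 1)
}
```

For the line `"é<b/>"`, where `é` occupies two bytes, byte column 3 (the `<`) is character column 2, and byte column 2 yields `None` because it falls inside `é`. As long as span boundaries sit on character boundaries, this conversion always succeeds.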
To clarify, does this mean that my work in #552 is unnecessary? Or would you like me to continue with it? |
Yes, as far as I can see, I have already implemented the parts that you've done in #552, and I have a feeling that they work correctly (but I haven't written tests for them yet, which is one reason why I didn't make a PR). I think it would be better to use my approach, namely these parts:
You are free to finish my work if you like (you can rebase your branch over mine and update your PR if you wish). I'm not ready to finish it myself for now. I do not insist on using a feature flag; I introduced it because spans can occupy some space, and if you do not use them, that is just wasted memory, but maybe the savings are not so great. So the things that should be implemented are:
|
How would I go about rebasing a branch on my fork over a branch on a different fork? Sorry, I'm not entirely well-versed in git usage; I've sort of been learning it as I go along.
The primary reason for ignoring |
If you are on Windows, you're probably using TortoiseGit; in that case, just right-click my commit in the commit log and select Rebase "parser_position_tracking" onto this...(G). Other GUI clients should have similar ways of doing that. If you prefer the command line, then I think you already know how to find the required information.
I cannot imagine a reason why someone would need to compare two events, whether by content or at all. I think we should not worry about that. |
xml-rs provides TextPosition for this, while quick-xml only has Reader::buffer_position() to get the byte position. It would be useful for error reporting if there was a way to compute the row and column from this (preferably without overhead if unused).