-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OutOfMemoryException when parsing Excel document / endless while-loop #30
Comments
I think i have a fix for this but i am in the middle of the release crunch time. @henning-krause can it wait or you need it urgently? |
@henning-krause , fix provided doesn't prevent exception on file loading by OpenMcdf in the described use-case. Is this the expected behaviour? |
@henning-krause : pr31 merged to master. |
@ironfede Of course, this doesn't fix the underlying problem, but at least the load operation fails fast and does not cause an OutOfMemoryException. So it's a quick fix. That being said, I don't even think that this PR fixed all problems - only the simple ones. In multiple places, the following check is performed: if (next != nextSecID)
nextSecID = next;
else
throw new CFCorruptedFileException("Cyclic sector chain found. File is corrupted"); This certainly helps when you have a corrupt file where one sector points to itself. But what if you have something like a cyclic chain? 1 -> 2 -> 3 -> 1 This wouldn't be catched and we would be in the same mess as before. I would propose a change in that check that we keep a list of all processed ids and if we hit one we had before, we'll throw the CFCorruptedFileException. As for the source file: It was created by one of our customers. I'll try to strip out all the content and to reproduce the error. If it still happens, I might be able to share it with you. |
@ironfede shouldn't we implement the safeguards I wrote about in my last message? We're happy to provide a PR for that. |
Ok, thank you @henning-krause ! |
Hi, Not sure what the situation is with the original repro for this, but on a related note: I've been been trying to have a play with SharpFuzz, and when Itried it against OpenMcdf, it produced the file out_of_memory.zip which causes out of memory errors if you try to load it (i only had a quick look at debugging it, but it did look like it included repeated sector ids as suggested above). |
would something like Numpsy@e9badd8 work as approach to treat repeated sector ids as an error? |
I think that would work - not ideal from a performance point of view. However, this check should be implemented in all the places where this error can occur: GetMiniSectorChain, GetFatSectorChain and GetNormalSectorChain. @ironfede Would you agree? If yes, I would implement the necessary changes. |
Yes, there are potentially issues in multiple places (there is a file attached to #40 that makes it blow up in GetDifatSectorChain). |
I'm trying to parse an Excel document (xls) using the OpenMcdf.Extensions package.
We call AsOLEProperties extension method and eventually, the execution gets stuck in this while loop (https://github.com/ironfede/openmcdf/blob/master/sources/OpenMcdf/CompoundFile.cs#L1493-L1514) :
When the loop is entered, the nextSecID is 27. At the end of the loop the nextSecID is set to 0. And the nextSecID keeps being null as the same data is read on each loop.
Is 0 even a valid value for nextSecID?
Any idea, what to do about this?
The document in question can be opened fine in Excel. Unfortunately, we got it from a customer of ours, so we can't share it.
The text was updated successfully, but these errors were encountered: