MS Word document resulting from RTFEmbeddedObject.getData() byte array cannot be opened #118
Comments
Just to confirm, is the |
Also... if possible could you include the MPP file that the RTF came from? |
Hello, the |
Thanks for the update. Do you have a way to get the original OLE object out of the database without going through the RTF export exercise you describe? I'd like to start with a "known good" file which MS Word can open, then compare that to what we're able to extract from the RTF. |
I'll upload an original OLE file, but it isn't openable by MS Word. In order to be able to open it, I have to add a header and convert it to RTF. |
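The comparison suggested above, a known-good file against the bytes extracted from the RTF, can be sketched as a search for the first differing offset. This is only an illustrative helper: the class name `ByteDiff` and the method `firstMismatch` are hypothetical and are not part of MPXJ.

```java
// Hypothetical helper, not part of MPXJ: finds where two byte arrays diverge.
final class ByteDiff {
    /**
     * Returns the index of the first byte that differs between a and b,
     * the length of the shorter array if one is a prefix of the other,
     * or -1 if both arrays are identical.
     */
    static int firstMismatch(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        // No difference in the shared range: identical only if lengths match.
        return a.length == b.length ? -1 : common;
    }
}
```

Both arrays could be loaded with `java.nio.file.Files.readAllBytes`. A mismatch at offset 0 would point at a differing header, while a mismatch far into the data would point at the payload itself.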
Hello, I'm trying to extract an MS Word file embedded in an RTF file by using RTFEmbeddedObject.getEmbeddedObjects(String file). The method returns a list with four instances, which is expected. When I check the resulting data array with Apache Tika, it reports the application/x-tika-msoffice MIME type, which seems correct.
However, when I try to open the resulting file in MS Word, it doesn't show the expected content. I will attach both files to this issue.
Here's the code that I'm using:
`
List<List<RTFEmbeddedObject>> rtfl = RTFEmbeddedObject.getEmbeddedObjects(readLineByLine(file));
`
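One quick diagnostic for a byte array like this is to check where the OLE2 compound file signature (the eight bytes `D0 CF 11 E0 A1 B1 1A E1`) first appears, since a `.doc` file that MS Word can open starts with it at offset 0. The following is a minimal sketch; the class name `OleSignatureCheck` and the method `indexOfOle2Signature` are hypothetical and are not part of MPXJ.

```java
// Hypothetical helper, not part of MPXJ: locates the OLE2 signature in a byte array.
final class OleSignatureCheck {
    // Magic bytes found at the start of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns the offset of the first OLE2 signature in data, or -1 if absent. */
    static int indexOfOle2Signature(byte[] data) {
        for (int i = 0; i + OLE2_MAGIC.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < OLE2_MAGIC.length; j++) {
                if (data[i + j] != OLE2_MAGIC[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }
}
```

Passing the result of `getData()` to this method shows whether the signature is present and where. If the offset is greater than 0, the bytes before it may be a wrapper that MS Word does not expect, and writing only the bytes from that offset onward to a `.doc` file is one thing to try.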
Attachments at:
rtfword.zip
Thanks in advance!