How to create pdf from xml #2461

igivucko · 2023-06-08T12:42:05Z

igivucko
Jun 8, 2023

What would be the best way to insert data from the xml to the new pdf file?

Up until now I parsed the xml file with xmltodict library. Then flattened the nested dictionary, iterated through it and used insert_text function. This works but I am wondering is there some more appropriate method.

I also tried
doc = fitz.open(xml)
xml = doc.convert_to_pdf()
pdf = fitz.open("pdf", xml)
for page in pdf:
text = page.get_text()

this would work but xml file is in shift-JIS encoding and I don't get the correct output.
Does anyone have any suggestions

JorjMcKie · 2023-06-09T10:10:51Z

JorjMcKie
Jun 9, 2023
Maintainer

Text extraction works for all document types. That happens if you extract the text of the XML page?

1 reply

igivucko Jun 9, 2023
Author

Yes, the xml is in the shift-js encoding.
When I convert xml to dict I use shift js encoding. I add japanese font in pymupdf and insert_text works fine.
But when I use the other method, I am able to open xml, convert it but when I use get_text method the text is not in the proper encoding.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to create pdf from xml #2461

{{title}}

Replies: 1 comment 1 reply

{{title}}

{{title}}

Select a reply

How to create pdf from xml #2461

igivucko Jun 8, 2023

Replies: 1 comment · 1 reply

JorjMcKie Jun 9, 2023 Maintainer

igivucko Jun 9, 2023 Author

igivucko
Jun 8, 2023

Replies: 1 comment 1 reply

JorjMcKie
Jun 9, 2023
Maintainer

igivucko Jun 9, 2023
Author