Issue with whitespace #6
Comments
Since the HTML data in your example looks like valid XML, it gets parsed as well. So when you query
The proper way of putting data like this in XML is to wrap it inside CDATA, like this
The XML I'm working with is as proper as it's going to get. This example uses JATS, which is a highly structured and quite strict DTD used in scholarly publishing. It's possible my example wasn't entirely clear. I do want to strip all tags. In this case, I'm only interested in the text.
I've marked where removed tags resulted in text being concatenated. Would you consider placing a single space character between removed tags instead of concatenating the text, maybe as an option?
I see. You only want a space character placed where those tags were removed. For now, that's not possible, because I don't check whether the path is a leaf node or contains child nodes.
OK, thank you for considering!
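For anyone who lands here with the same need, the behaviour requested above (a single space wherever a tag was removed) can be approximated outside camaro by post-processing the raw XML fragment as a string. This is only a rough sketch: `stripTagsWithSpaces` is a hypothetical helper, not part of camaro, and the sample fragment is made up. It relies on a regex, so it does not decode entities, does not unwrap CDATA, and will also put a space around inline tags (for example before punctuation that follows an `<italic>` element).

```javascript
// Hypothetical helper, not part of camaro: remove every tag from an XML
// fragment, leaving a single space where each tag used to be.
function stripTagsWithSpaces(xml) {
  return xml
    .replace(/<[^>]+>/g, ' ') // each tag becomes one space
    .replace(/\s+/g, ' ')     // collapse runs of whitespace
    .trim();                  // drop leading/trailing space
}

// Made-up JATS-like fragment, for illustration only.
const sample = '<body><sec><title>Intro</title><p>Some text.</p></sec></body>';
console.log(stripTagsWithSpaces(sample));
// prints "Intro Some text."
```

Plain concatenation of the same fragment would give "IntroSome text.", which is the effect described in the original report.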
I'm having an issue with whitespace and I'm wondering if Camaro is handling it as-designed, or if I should look to another package to help with this.
Given this chunk of XML (truncated, but you get the idea)
Using this to construct my template…
body: "article/body",
I get this result…
I do want to take the entire text of the body as just text, without any tags preserved. Should I expect to see a space character between where tags were stripped, or should it be concatenated like this?