When calling the html serializer pass an encoding #239

Merged 1 commit on May 2, 2024
1 change: 1 addition & 0 deletions news/238.bugfix
@@ -0,0 +1 @@
Fix an issue with unicode characters happening with lxml 5 [ale-rt]
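
The news entry does not spell out the failure mode. One common way non-ASCII text breaks when bytes are handled without an explicit encoding is mis-decoding, and the sketch below shows that effect with the standard library only. It is an illustration under that assumption and does not reproduce the lxml 5 code path.

```python
# Stdlib-only sketch: the same UTF-8 bytes decoded with and without
# the intended charset. This is an illustration, not lxml's behaviour.
text = "<div>à</div>"
raw = text.encode("utf-8")      # "à" becomes the two bytes 0xC3 0xA0

right = raw.decode("utf-8")     # round-trips to the original text
wrong = raw.decode("latin-1")   # each byte is read as its own character

print(right == text)
print(wrong == text)
print(repr(wrong))
```

Stating the encoding explicitly, as this PR does when calling the serializer, removes the guesswork about which charset the bytes are in.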
27 changes: 26 additions & 1 deletion src/plone/app/theming/tests/test_transform.py
@@ -1,11 +1,13 @@
from App.config import getConfiguration
from diazo.compiler import compile_theme
from html import unescape
from lxml import etree
from os import environ
from plone.app.testing import setRoles
from plone.app.testing import TEST_USER_ID
from plone.app.theming.interfaces import IThemeSettings
from plone.app.theming.testing import THEMING_FUNCTIONAL_TESTING
from plone.app.theming.testing import THEMING_INTEGRATION_TESTING
from plone.app.theming.transform import ThemeTransform
from plone.app.theming.utils import applyTheme
from plone.app.theming.utils import getTheme
@@ -25,7 +27,30 @@
import unittest


class TestCase(unittest.TestCase):
class IntegrationTestCase(unittest.TestCase):

    layer = THEMING_INTEGRATION_TESTING

    def test_transform_parseTree_with_unicode(self):
        request = self.layer["request"]
        request.response.setHeader("Content-Type", "text/html; charset=utf-8")
        transform = ThemeTransform(None, request)
        snippet = "\n".join(
            (
                "<!DOCTYPE html>",
                "<html>",
                "<body>",
                "<div>à</div>",
                "</body>",
                "</html>",
            )
        )
        parsed = transform.parseTree([snippet.encode()])
        serialized = unescape(parsed.serialize().decode())
        self.assertEqual(snippet, serialized)


class FunctionalTestCase(unittest.TestCase):
    layer = THEMING_FUNCTIONAL_TESTING

    def setUp(self):
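
The new test runs the serialized output through `html.unescape` before comparing it with the input snippet. A plausible reason (an assumption, the PR does not state it) is that a serializer may emit a non-ASCII character either literally or as a character reference, and `unescape` maps both forms back to the same text. A minimal stdlib sketch of that normalization:

```python
from html import unescape

expected = "<div>à</div>"

# Different spellings a serializer could produce for the same character.
variants = [
    "<div>à</div>",         # literal character
    "<div>&#224;</div>",    # decimal character reference
    "<div>&#xE0;</div>",    # hexadecimal character reference
    "<div>&agrave;</div>",  # named character reference
]

normalized = [unescape(v) for v in variants]
print(all(n == expected for n in normalized))
```

Note that `unescape` also turns references such as `&lt;` back into markup characters, so this kind of normalization suits test comparisons rather than output that is sent to a browser.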
5 changes: 4 additions & 1 deletion src/plone/app/theming/transform.py
@@ -13,6 +13,7 @@
from zope.component import adapter
from zope.interface import implementer
from zope.interface import Interface
from ZPublisher.HTTPRequest import default_encoding

import logging

@@ -120,7 +121,9 @@ def parseTree(self, result):
            return None

        try:
            return getHTMLSerializer(result, pretty_print=False)
            return getHTMLSerializer(
Review comment (Member):

It's beyond the scope of this fix, but I was working on a transform yesterday and replaced getHTMLSerializer from repoze.xmliter with lxml.etree.HTMLParser. Should we actually move to that one?

repoze.xmliter does not look well maintained 😓

And actually it is just a tiny wrapper around lxml...

Reply (Member Author):

Looks like a nice idea to get rid of a dependency.
I created #240 and set the 6.1 milestone on it; I would avoid making that change in Plone 6.0 right now.

                result, pretty_print=False, encoding=default_encoding
            )
        except (AttributeError, TypeError, etree.ParseError):
            return None
