-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport unicode fix from piotr1212@17e23ef #2643
Conversation
@@ -93,7 +105,8 @@ def setRaw(s, loc, toks): | |||
).setParseAction(setRaw)('call') | |||
|
|||
# Metric pattern (aka. pathExpression) | |||
validMetricChars = ''.join((set(printables) - set(symbols))) | |||
unicodePrintables = u''.join(unichr(c) for c in xrange(sys.maxunicode) if not unichr(c).isspace()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 concerns here:
- This is a huge range
- Should we be excluding more characters than just spaces, there are many unicode categories https://en.wikipedia.org/wiki/Unicode_character_property#General_Category
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Yes, I missed that point, thanks! I'll probably put it behind switch then - if user want UTF-8 in metrics, it can enable that. Would it be better?
- I think only spaces is good for start, not sure should we add anything later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with Dan, this is basically the reason why I didn't create the PR back then, I didn't think this was an elegant solution and didn't know how to do it properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, TIL pyparsing 2.3.0 and newer giving you pyparsing_unicode.printables
!
But it's better to hide it behind switch anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah pyparsing_unicode.printables
is much better! Still not sure if this is the right way to do this but as long it is behind the switch it is good enough for me.
Still need to test is it really fix #2641 |
Yes, it's helping with #2641, if UTF8_METRICS is enabled ofc. Merging. |
Backport unicode fix from https://github.com/piotr1212/graphite-web/c…ommit/17e23efecb6cb40059264fabd3ea36718065176e
Fixing #2641