Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport unicode fix from piotr1212@17e23ef #2643

Merged
merged 7 commits into from
Oct 22, 2020
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion webapp/graphite/metrics/views.py
Original file line number Diff line number Diff line change
Expand Up @@ -132,9 +132,14 @@ def find_view(request):
automatic_variants = queryParamAsInt(queryParams, 'automatic_variants', 0)

try:
query = str(queryParams['query'])
if type(queryParams['query']) is unicode:
query = queryParams['query'].encode('utf-8')
else:
query = str(queryParams['query'])
except KeyError:
raise InputParameterError('Missing required parameter \'query\'')
except NameError:
query = str(queryParams['query'])

if query == '':
raise InputParameterError('Required parameter \'query\' is empty')
Expand Down
17 changes: 15 additions & 2 deletions webapp/graphite/render/grammar_unsafe.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,22 @@
from pyparsing import (
Forward, Combine, Optional, Word, Literal, CaselessKeyword,
CaselessLiteral, Group, FollowedBy, LineEnd, OneOrMore, ZeroOrMore,
alphas, alphanums, printables, delimitedList, quotedString, Regex,
alphas, alphanums, delimitedList, quotedString, Regex,
__version__, Suppress, Empty
)

import sys

try:
unichr
except NameError:
unichr = chr

try:
xrange
except NameError:
xrange = range

grammar = Forward()

expression = Forward()
Expand Down Expand Up @@ -93,7 +105,8 @@ def setRaw(s, loc, toks):
).setParseAction(setRaw)('call')

# Metric pattern (aka. pathExpression)
validMetricChars = ''.join((set(printables) - set(symbols)))
unicodePrintables = u''.join(unichr(c) for c in xrange(sys.maxunicode) if not unichr(c).isspace())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 concerns here:

  1. This is a huge range
  2. Should we be excluding more characters than just spaces, there are many unicode categories https://en.wikipedia.org/wiki/Unicode_character_property#General_Category

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes, I missed that point, thanks! I'll probably put it behind switch then - if user want UTF-8 in metrics, it can enable that. Would it be better?
  2. I think only spaces is good for start, not sure should we add anything later

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with Dan, this is basically the reason why I didn't create the PR back then, I didn't think this was an elegant solution and didn't know how to do it properly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, TIL pyparsing 2.3.0 and newer giving you pyparsing_unicode.printables !
But it's better to hide it behind switch anyway

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah pyparsing_unicode.printables is much better! Still not sure if this is the right way to do this but as long it is behind the switch it is good enough for me.

validMetricChars = ''.join((set(unicodePrintables) - set(symbols)))
escapedChar = backslash + Word(symbols + '=', exact=1)
partialPathElem = Combine(
OneOrMore(
Expand Down