query validation endpoint #2585
Comments
Hi! Julie here. Can you please create the endpoint first and return a generic message like "Query Validator coming soon." and merge that in before working on the actual validator? This will allow us to get the partners working on integration so when the capability is ready we can quickly turn it on. |
@jstewart-shield certainly, I will try to get a quick turnaround on that for you. |
Add a skeleton implementation of a query validation endpoint to allow partners to begin implementation. Part of work for #2585
@jstewart-shield A skeleton implementation of the endpoint is up in PR #2595. A POST to the endpoint returns:
{
"HasResults": false,
"Messages": [
"Query validator coming soon."
],
"OperationTimeMS": 0
} |
@ivakegg are there any cases for the rules above where we need to support query model mapping? |
Yes! At least #3, #4, #5 (maybe #8?) in the list will need to check the query model. Another thing we want to do in the future is verify that a field is active (has had data ingested) during the date range of the query. That would need to check the query model and verify that at least one (not all) of the fields is active. |
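To make the "at least one, not all" check concrete, here is a minimal sketch. The query model mapping, the ingest dates, and the function name are all made up for illustration; a real implementation would read them from the query model and ingest metadata.

```python
from datetime import date

# Hypothetical query model: model field name -> underlying index fields.
QUERY_MODEL = {"NAME": ["FIRST_NAME", "LAST_NAME"]}

# Hypothetical ingest metadata: index field -> dates on which data was ingested.
INGEST_DATES = {
    "FIRST_NAME": [date(2024, 1, 5), date(2024, 1, 9)],
    "LAST_NAME": [],
}


def is_active(model_field, begin, end):
    """Return True if at least one mapped field has data ingested in [begin, end]."""
    # A field that is not in the model is treated as mapping to itself.
    mapped = QUERY_MODEL.get(model_field, [model_field])
    return any(
        begin <= ingested <= end
        for field in mapped
        for ingested in INGEST_DATES.get(field, [])
    )
```

With this data, `NAME` counts as active for January 2024 because `FIRST_NAME` has data in that range, even though `LAST_NAME` has none.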
@jstewart-shield understood, thank you for the clarification. |
Follow-up question for @ivakegg and @jstewart-shield, should any of these validation criteria also apply to fields given via query parameters, or just the query itself? |
Oh, great question. I would say long term that we would definitely want to consider that since it's part of the query and will impact the results but it's ok if the first version doesn't do it. |
@jstewart-shield understood, thanks. |
A couple of additional questions for @ivakegg and @jstewart-shield:
|
@ivakegg @jstewart-shield for #10, do we still want to flag cases for escaped wildcards inside of phrases, such as |
Julie came back and said that any wildcard in the field value regardless of escaping should be flagged. What does the lucene parser do with such an expression anyway? |
She is going to test that and see what happens. |
See below for examples on how the parser handles the LUCENE to JEXL conversion with wildcards:
It looks like backslashes are dropped when the LUCENE query is parsed by AccumuloSyntaxParser. |
You got to it before she did... thanks. I think the answer still stands. If there is a '*' in the expression within quotes, then flag it as potentially invalid |
Understood, will do. |
Sorry, to add. If the * within the quotes is escaped then you don't need to flag it because I would assume the user was purposely trying to search the * and not intending a wildcard if they took the time to escape it. |
@jstewart-shield understood, I'll factor that into the rule logic. |
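A rough sketch of the rule as agreed above: flag a `*` inside a quoted phrase unless it is backslash-escaped. Since the backslashes are dropped once the query is parsed, this sketch assumes the check runs on the raw query string. The regular expression and function names are illustrative only, not the actual rule implementation.

```python
import re

# FIELD:"phrase", allowing backslash-escaped characters inside the quotes.
PHRASE = re.compile(r'(\w+):"((?:\\.|[^"\\])*)"')


def has_unescaped_star(text):
    """Return True if text contains a '*' that is not preceded by an escaping backslash."""
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if text[i] == "*":
            return True
        i += 1
    return False


def fields_with_wildcard_in_phrase(raw_query):
    """Return the field names whose quoted phrase contains an unescaped '*'."""
    return [
        match.group(1)
        for match in PHRASE.finditer(raw_query)
        if has_unescaped_star(match.group(2))
    ]
```

Here `FIELD:"abc*def"` is flagged, while `FIELD:"abc\*def"` is not, on the assumption that a user who escaped the `*` meant to search for it literally.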
* Add a skeleton implementation of a query validation endpoint to allow external clients to begin implementation. Part of work for #2585
Currently we have a plan endpoint that syntactically validates the query. However, there are many ways a user can write a syntactically valid query that is clearly not what they intended. We need an endpoint where we can configure rules that find potential query issues and return text describing each one.
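One possible shape for such configurable rules, sketched under assumptions: each rule pairs a regular expression with the message to return when it matches the raw query. The `PatternRule` class, the `validate` function, and the `OLD_FIELD` name are hypothetical and do not describe an existing API.

```python
import re
from dataclasses import dataclass


@dataclass
class PatternRule:
    pattern: str  # regular expression searched for in the raw query
    message: str  # text returned to the user when the pattern matches


def validate(query, rules):
    """Return the messages of all rules whose pattern occurs in the query."""
    return [rule.message for rule in rules if re.search(rule.pattern, query)]


# Example configuration; both rules are illustrative.
RULES = [
    # #INCLUDE(OR,value) has an operator and a value but no field name.
    PatternRule(r"#INCLUDE\(\s*OR\s*,[^,()]*\)",
                "#INCLUDE appears to be missing a field name."),
    # OLD_FIELD is a made-up field used to show a field-based message.
    PatternRule(r"\bOLD_FIELD:",
                "OLD_FIELD is no longer populated."),
]
```

Keeping the rules as plain data means new checks can be added through configuration rather than code, which fits the "configure rules" requirement.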
Here is a list of issues, with examples, that we should be able to detect. Most of these are based on LUCENE queries:
FIELD:1234 OR 5678 -> should be FIELD:(1234 OR 5678)
(FIELD:1234 OR 5678) -> should be FIELD:(1234 OR 5678)
anywhere that the boolean operator is missing
FIELD:term1 term2 -> should be FIELD:"term1 term2"
If you see field X, then output message Y
If you see pattern X (regex), then output message Y
#INCLUDE(OR,value) -> should be #INCLUDE(OR, FIELD, value)
FIELD1:abc OR FIELD2:def NOT FIELD3:123 -> should be (FIELD1:abc OR FIELD2:def) NOT FIELD3:123
FIELD:"abc*def" -> either the * is incorrect or they meant FIELD:/abc*def/
FIELD:"term1 term2 term3"~1 -> the 1 should be 3 or greater
FIELD:abc:def -> should be FIELD:abc\:def
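Two of the checks in this list lend themselves to a short sketch: the proximity value being smaller than the number of terms in the phrase, and an unescaped colon inside a field value. The regular expressions, function names, and message wording below are assumptions for illustration and operate on the raw LUCENE string.

```python
import re

# FIELD:"term1 term2 term3"~N
PHRASE_SLOP = re.compile(r'(\w+):"([^"]*)"~(\d+)')
# FIELD:abc:def, i.e. a second unescaped colon inside an unquoted value.
UNESCAPED_COLON = re.compile(r'\b(\w+):(?!["(/])((?:\\.|[^\s\\:()"])+):')


def check_slop(query):
    """Flag phrases whose proximity value is smaller than their term count."""
    messages = []
    for field, phrase, slop in PHRASE_SLOP.findall(query):
        terms = len(phrase.split())
        if int(slop) < terms:
            messages.append(
                f"{field}: proximity ~{slop} should be {terms} or greater"
            )
    return messages


def check_unescaped_colon(query):
    """Flag unquoted values that contain an unescaped ':'."""
    return [
        f"{match.group(1)}: the ':' in the value should be escaped as '\\:'"
        for match in UNESCAPED_COLON.finditer(query)
    ]
```

For example, `FIELD:"term1 term2 term3"~1` is flagged by `check_slop`, and `FIELD:abc:def` is flagged by `check_unescaped_colon`, while `FIELD:abc\:def` passes.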