LIVY-322. Catch JsonParseExceptions in PythonInterpreter on rawtext response from fake_shell. #304
Conversation
subprocess.call() commands in a PySpark snippet can potentially insert raw text into the sys_stdout in the fake_shell main(). This will then fail to be correctly parsed by PythonInterpreter in sendRequest, as it triggers a JsonParseException that is not caught. Added code to catch the JsonParseException and then retry reads of stdout until a valid line of JSON is reached, or 100 retries have been attempted.
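For illustration only, a minimal Python sketch of the failure mode and of the retry approach described above. The message shape and the read_reply() helper are hypothetical; in Livy the actual parsing happens on the Scala side with Jackson, where this failure surfaces as a JsonParseException:

```python
import json

# fake_shell speaks a line-oriented JSON protocol over stdout. A child
# process that inherits the same stdout can interleave raw text lines
# (e.g. `hadoop fs -rm` output on Hadoop 2.7) into that stream.
stream = iter([
    'Deleted foo.tmp',                               # raw subprocess output, not JSON
    '{"msg_type": "execute_reply", "content": {}}',  # the real reply
])

MAX_RETRIES = 100  # matches the retry bound described in the patch

def read_reply(lines):
    for _ in range(MAX_RETRIES):
        line = next(lines)
        try:
            return json.loads(line)  # Jackson raises JsonParseException here
        except ValueError:           # Python's analogue of that failure
            continue                 # skip the raw-text line and retry
    raise RuntimeError("no valid JSON line within %d reads" % MAX_RETRIES)

print(read_reply(stream))  # -> {'msg_type': 'execute_reply', 'content': {}}
```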
Codecov Report

@@            Coverage Diff             @@
##           master     #304      +/-   ##
============================================
- Coverage     70.43%   70.22%   -0.21%
+ Complexity      685      683       -2
============================================
  Files            93       93
  Lines          4843     4850       +7
  Branches        727      728       +1
============================================
- Hits           3411     3406       -5
- Misses          943      955      +12
  Partials        489      489

Continue to review full report at Codecov.
@rickbernotas Can you please fix the title?

I'm not sure if this is a thorough fix or a band-aid for this specific issue, because your scenario is a little different. @zjffdu can you please take a look?

Title updated.

I can't help but think our current implementation in fake_shell.py should be revamped. I don't think fake_shell.py should use stdout/stderr to communicate with livy-server. @jerryshao @zjffdu Do you two happen to know how Zeppelin handles this?
@rickbernotas I'm so sorry for the late reply. If the answer is yes, I think we can fix this problem in another way:
@alex-the-man I suspect that is what is happening, although I don't know for sure the exact behavior of subprocess.call(). subprocess.call() returns a result (a return code like 0 or 1, based on the success or failure of the command) but also dumps its output into stdout. The problem is that output from subprocess.call() ends up in this sys_stdout without ever getting caught and parsed to JSON: https://github.com/cloudera/livy/blob/master/repl/src/main/resources/fake_shell.py#L631 ...so you end up with non-JSON raw text in the flushed response to the PythonInterpreter.

The problem we were having was specific to "hadoop fs -rm" subprocess calls on Hadoop 2.7. In Hadoop 2.7 the response from those commands includes control characters in the output (tabs, newlines, etc.) that I suspect might have been a contributing factor. If you do the same thing on Hadoop 2.8 (which does not have control characters in the response to hadoop fs commands), subprocess.call() works fine with Livy. It is as though the combination of how subprocess.call() handles stdout, and the inclusion of control characters in that stdout, creates the problem with the response.

I didn't have time to dig deeper into the issue with the stdout in the fake_shell, which is why I ended up just catching the JsonParseException in the PythonInterpreter.
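A small sketch of why Python-level redirection does not help here, assuming a POSIX system with an echo binary on the PATH: swapping sys.stdout for another object (the way fake_shell installs its own sys_stdout) only catches Python-level writes, while a child started by subprocess.call() inherits the underlying file descriptor 1 and writes straight past the swap:

```python
import io
import subprocess
import sys

# Redirect Python-level writes into a buffer.
buf = io.StringIO()
real_stdout, sys.stdout = sys.stdout, buf

print("captured")                          # goes into buf
subprocess.call(["echo", "not captured"])  # child inherits fd 1, bypasses buf

sys.stdout = real_stdout
print("buffer contents: %r" % buf.getvalue())  # only 'captured\n' made it
```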
Would you mind sharing how you are calling subprocess.call()?
Statement 1 (works fine): import subprocess
Statement 2 (works fine): print(1)
Statement 3 (works fine): subprocess.call(["hadoop", "fs", "-touchz", "foo.tmp"])
Statement 4 (JsonParseException): subprocess.call(["hadoop", "fs", "-rm", "foo.tmp"])
Statement 5 (fails to return 1; instead returns nothing, so the responses are out of sync from statement 4 onward): print(1)

This is only reproducible with Hadoop 2.7 due to the formatting of the response from hadoop fs -rm in Hadoop 2.7 (which includes control characters like tab and newline). In Hadoop 2.8 the formatting of the response from hadoop fs commands changed and you will not get the JsonParseException. Note that hadoop fs -touchz also works fine on Hadoop 2.7; that response does not break the fake_shell response the way -rm does.

subprocess.check_output() also works as expected, i.e. if you give the hadoop fs -rm to subprocess.check_output() on Hadoop 2.7, it's okay... but the whole response from subprocess.check_output() is returned as a string result, which is different from the behavior of subprocess.call() (which returns a return code as the result, and then dumps the text response straight into stdout). Theoretically, however, any command run by subprocess.call() that dumps text output straight to stdout, and also includes control characters in that output, will break the parser in the same fashion.
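Collected into a single snippet, the repro above (assuming a Hadoop 2.7 client on the PATH and a writable home directory on HDFS):

```python
import subprocess

print(1)                                                 # statement 2: works fine
subprocess.call(["hadoop", "fs", "-touchz", "foo.tmp"])  # statement 3: works fine
subprocess.call(["hadoop", "fs", "-rm", "foo.tmp"])      # statement 4: JsonParseException
print(1)                                                 # statement 5: returns nothing
```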
Can you change your code to the following just to root cause the problem?

subprocess.call(["hadoop", "fs", "-rm", "foo.tmp"], stdout=open("out", "w"), stderr=open("err", "w"))
Yes, that works. The problematic line was the hadoop fs -rm, but it does work with that change to the call() and returns 0 as the result. No JsonParseException seen.

subprocess.call(["hadoop", "fs", "-rm", "foo.tmp"], stdout=open("out", "w"), stderr=open("err", "w"))
@rickbernotas Cool. Processes started by subprocess.call() inherit the original stdin/stdout/stderr, then. Is this workaround good enough for now?
Yes, the workarounds are sufficient for us; we were also using subprocess.check_output() as an alternative (it is also a workaround). I'm leaving the contents of the PR in place on my end just because it minimally catches users who go ahead and use subprocess.call() anyway, and prevents the JsonParseException. But if there is a change in how stdin/stdout/stderr is handled by the fake_shell in the future, I'll definitely be interested in that.
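The two workarounds mentioned in this thread, side by side. Error handling is omitted and the output file names are arbitrary:

```python
import subprocess

# Workaround 1: point the child's stdout/stderr somewhere other than fd 1,
# so its output never reaches the fake_shell protocol stream.
with open("out", "w") as out, open("err", "w") as err:
    rc = subprocess.call(["hadoop", "fs", "-rm", "foo.tmp"],
                         stdout=out, stderr=err)

# Workaround 2: capture the output as the return value instead of
# letting the child write it to the inherited stdout.
output = subprocess.check_output(["hadoop", "fs", "-rm", "foo.tmp"])
```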
I think the proper fix is:
If you are interested, I will do my best to help you on this :).
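One possible shape for such a fix, sketched purely as an assumption (this is not what fake_shell.py does today), following the earlier suggestion that fake_shell.py stop sharing its communication channel with user output: reserve a private duplicate of the original stdout for the protocol and repoint fd 1 at a scratch file, so that even child processes cannot write into the protocol stream. Assumes a POSIX system:

```python
import json
import os
import subprocess
import sys
import tempfile

# Keep a private copy of the real stdout; only protocol JSON goes here.
protocol = os.fdopen(os.dup(1), "w")

# Repoint fd 1 at a scratch file; child processes inherit this instead.
capture = tempfile.TemporaryFile(mode="w+")
os.dup2(capture.fileno(), 1)
sys.stdout = capture

# A child writing to stdout now lands in the scratch file, not the protocol.
subprocess.call(["echo", "raw text that can no longer corrupt the protocol"])

# Protocol messages stay clean, one JSON object per line.
protocol.write(json.dumps({"msg_type": "execute_reply", "content": {}}) + "\n")
protocol.flush()

# Captured child output can still be surfaced, e.g. in a statement result.
capture.seek(0)
sys.stderr.write(capture.read())
```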
Any update on this issue?