Fix the profile API returns prematurely. #340
MultiResponsesDelegateActionListener helps send multiple requests asynchronously and return one final response once all of them have completed. While waiting for all in-flight requests, the methods respondImmediately and failImmediately can stop waiting and return immediately. While these two methods are convenient, they are easy to misuse and can cause bugs (see opendistro-for-elasticsearch#339 for example). This PR removes respondImmediately and failImmediately and refactors the profile runner to avoid using them.
This PR also stops printing out the unknown entity state since it is not useful.
Testing done:
1. Added unit tests to verify the bug fix.
2. Ran manual tests of profile calls for single-stream and multi-entity detectors in different phases of the detector lifecycle (disabled, init, running). Verified that the profile results make sense.
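To illustrate the pattern described above, here is a minimal sketch of a listener that waits for a fixed number of responses and completes exactly once, with no early-exit methods. `MultiResponseCollector` and its callbacks are hypothetical names used only for illustration; this is not the plugin's actual `MultiResponsesDelegateActionListener`, and the rule "fail only if every request failed" is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical, simplified stand-in for a multi-response delegate listener.
final class MultiResponseCollector<T> {
    private final int expected;
    private final Consumer<List<T>> onAllResponses;
    private final Consumer<List<Exception>> onAllFailed;
    private final List<T> responses = Collections.synchronizedList(new ArrayList<>());
    private final List<Exception> failures = Collections.synchronizedList(new ArrayList<>());
    private final AtomicInteger collected = new AtomicInteger();

    MultiResponseCollector(int expected,
                           Consumer<List<T>> onAllResponses,
                           Consumer<List<Exception>> onAllFailed) {
        this.expected = expected;
        this.onAllResponses = onAllResponses;
        this.onAllFailed = onAllFailed;
    }

    // Record one successful response; never completes early.
    void onResponse(T response) {
        responses.add(response);
        finishIfDone();
    }

    // Record one failure; the remaining requests are still awaited.
    void onFailure(Exception e) {
        failures.add(e);
        finishIfDone();
    }

    // Only the call that brings the count to `expected` fires the final callback.
    private void finishIfDone() {
        if (collected.incrementAndGet() == expected) {
            if (responses.isEmpty()) {
                // Assumption: report failure only when no request succeeded.
                onAllFailed.accept(new ArrayList<>(failures));
            } else {
                onAllResponses.accept(new ArrayList<>(responses));
            }
        }
    }
}
```

Because the only way to complete is to reach the expected count, a caller cannot accidentally return a partial profile while other requests are still in flight, which is the kind of premature return this PR addresses.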
Codecov Report
@@ Coverage Diff @@
## master #340 +/- ##
============================================
+ Coverage 75.50% 76.30% +0.80%
- Complexity 2160 2224 +64
============================================
Files 207 209 +2
Lines 10030 10139 +109
Branches 898 902 +4
============================================
+ Hits 7573 7737 +164
+ Misses 2035 1991 -44
+ Partials 422 411 -11
src/main/java/com/amazon/opendistroforelasticsearch/ad/AnomalyDetectorProfileRunner.java
new MultiResponsesDelegateActionListener<EntityProfile>(
    listener,
    totalResponsesToWait,
    "Fail to fetch profile for " + entityValue + " of detector " + detectorId,
may replace with FAIL_FETCH_ERR_MSG
replaced
@@ -214,7 +214,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
if (modelProfile != null) {
    builder.field(CommonName.MODEL, modelProfile);
}
if (state != null) {
if (state != EntityState.UNKNOWN) {
Is it possible for `state` to be null? Do we still need to keep checking whether `state` is null?
It is possible. Added back.
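A minimal sketch of what keeping both checks might look like: the state is emitted only when it is neither null nor unknown. The enum constants other than `UNKNOWN`, the `EntityStateRendering` class, and the field names are assumptions for illustration, not the code merged in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical enum; only UNKNOWN appears in the diff above.
enum EntityState { UNKNOWN, INIT, RUNNING }

final class EntityStateRendering {
    // Builds the output fields, skipping the state when it carries no information.
    static Map<String, Object> toFields(String entityValue, EntityState state) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("entity", entityValue); // hypothetical field name
        // Null check first, so a missing state is skipped rather than printed.
        if (state != null && state != EntityState.UNKNOWN) {
            fields.put("state", state.name()); // hypothetical field name
        }
        return fields;
    }
}
```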
        listener.onFailure(new RuntimeException(CommonErrorMessages.FAIL_TO_FIND_DETECTOR_MSG + detectorId, e));
    }
} else {
    listener.onFailure(new RuntimeException(CommonErrorMessages.FAIL_TO_FIND_DETECTOR_MSG + detectorId));
Why not use `AnomalyDetectionException` here? Same question for other places.
I can do it. Most of the transport APIs use AnomalyDetectionException (except the recent ones added by Sarat). When I reviewed Sarat's PRs, I thought about pointing it out, but didn't, because switching to AnomalyDetectionException does not add much benefit beyond standardizing the exceptions we throw. Our public APIs do not standardize the exceptions they throw back to the user. Take the profile API as an example: sometimes it throws IOException, sometimes NullPointerException, and sometimes RuntimeException. Do you think we should change all of the public APIs and transport APIs to use AnomalyDetectionException? The benefit is that we would have a standard wrapper exception to throw. The drawback is that this might need to be another PR because the changes are large.
Currently we only catch exceptions in the AD realtime job and count them in the failure stats. We may need to count exceptions from all other places as well, and for that we would need to wrap the exceptions.
Not so urgent. We can fix it later.
Makes sense.
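The idea discussed in this thread, wrapping arbitrary exceptions in one standard type so that failures can be counted in a single place, could be sketched as follows. `DetectorException` and `FailureStats` are hypothetical names introduced here; they are not the plugin's `AnomalyDetectionException` or its stats classes, whose actual signatures are not shown in this conversation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical standard wrapper exception carrying the detector id.
class DetectorException extends RuntimeException {
    private final String detectorId;

    DetectorException(String detectorId, String message, Throwable cause) {
        super(message, cause);
        this.detectorId = detectorId;
    }

    String getDetectorId() {
        return detectorId;
    }
}

// Hypothetical failure counter that wraps anything thrown by an action.
final class FailureStats {
    private final AtomicLong failures = new AtomicLong();

    <T> T call(String detectorId, Callable<T> action) {
        try {
            return action.call();
        } catch (DetectorException e) {
            // Already the standard type: count it and rethrow unchanged.
            failures.incrementAndGet();
            throw e;
        } catch (Exception e) {
            // IOException, NullPointerException, etc. are wrapped uniformly.
            failures.incrementAndGet();
            throw new DetectorException(detectorId, "Failed to run action for detector " + detectorId, e);
        }
    }

    long failureCount() {
        return failures.get();
    }
}
```

With a single wrapper type, callers only need one catch clause, and the original exception remains available through `getCause()`.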
src/main/java/com/amazon/opendistroforelasticsearch/ad/model/EntityProfile.java
…asticsearch#340) Besides backporting, this PR also: First, it fixes another premature return in AnomalyDetectorProfileRunner.onInittedEver. Second, it replaces set-env with GITHUB_ENV so that the backport PR can invoke CI. The `set-env` command is disabled. Testing done: 1. Verified manually that the new early-return bug is fixed. 2. Ran manual tests of profile calls for single-stream and multi-entity detectors in different phases of the detector lifecycle (disabled, init, running). Verified that the profile results make sense.
* Backport: Fix the profile API returns prematurely. (#340) Besides backporting, this PR also: First, it fixes another premature return in AnomalyDetectorProfileRunner.onInittedEver. Second, it replaces set-env with GITHUB_ENV so that the backport PR can invoke CI. The `set-env` command is disabled. It also fixes the golang lint version so that CI can run. Testing done: 1. Verified manually that the new early-return bug is fixed. 2. Ran manual tests of profile calls for single-stream and multi-entity detectors in different phases of the detector lifecycle (disabled, init, running). Verified that the profile results make sense.
* Fix the profile API returns prematurely. MultiResponsesDelegateActionListener helps send multiple requests asynchronously and return one final response once all of them have completed. While waiting for all in-flight requests, the methods respondImmediately and failImmediately can stop waiting and return immediately. While these two methods are convenient, they are easy to misuse and can cause bugs (see #339 for example). This PR removes respondImmediately and failImmediately and refactors the profile runner to avoid using them. This PR also stops printing out the unknown entity state since it is not useful. Testing done: 1. Added unit tests to verify the bug fix. 2. Ran manual tests of profile calls for single-stream and multi-entity detectors in different phases of the detector lifecycle (disabled, init, running). Verified that the profile results make sense.
Issue #, if available:
#339
Description of changes:
MultiResponsesDelegateActionListener helps send multiple requests asynchronously and return one final response once all of them have completed. While waiting for all in-flight requests, the methods respondImmediately and failImmediately can stop waiting and return immediately. While these two methods are convenient, they are easy to misuse and can cause bugs (see #339 for example). This PR removes respondImmediately and failImmediately and refactors the profile runner to avoid using them.
This PR also stops printing out the unknown entity state since it is not useful.
Testing done:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.