Added reader rewind to remote payload processor #112
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: RobGeada

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Hi @RobGeada -- could this change be tested in a unit test?
Signed-off-by: Rob Geada <[email protected]>
@ckadner a mock test has been added, checking that the payload processor still functions if the payload has been read too early. It's hard to verify the actual payloads sent by the processor without creating a mock receiver, which seems perhaps too heavy for a single unit test.
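A minimal sketch of that kind of check, with hypothetical test class and method names (the real test is in this PR's diff); it simulates the too-early read by exhausting the buffer's reader index before encoding:

```java
import static org.junit.Assert.assertFalse;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import org.junit.Test;

public class RemotePayloadProcessorTest {

    @Test
    public void testEncodeAfterPrematureRead() {
        ByteBuf byteBuf = Unpooled.wrappedBuffer("{[0, 0.1, 2.3, 4, 5.6]}".getBytes());
        // Simulate the race: some other reader drains the buffer before the
        // processor dequeues it, leaving readerIndex == capacity.
        byteBuf.readerIndex(byteBuf.capacity());
        // With the rewind in RemotePayloadProcessor.encodeBinaryToString(),
        // the encoded payload should still contain the full buffer contents.
        String encodedString = RemotePayloadProcessor.encodeBinaryToString(byteBuf);
        assertFalse(encodedString.isEmpty());
    }
}
```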
@ckadner any updates?
Thanks for adding the unit tests.

Going back to your investigation:

> It looks like the response bytebufs are read too early somewhere, causing their `readerIndex` to equal their capacity at processing time. This read occurs somewhere after their addition to the queue in the `AsyncPayloadProcessor` but before `payloads.take()` is called. A hacky patch is to call `byteBuf = byteBuf.readerIndex(0);` as the first line in `RemotePayloadProcessor.encodeBinaryToString()` to reset their reader index, and indeed this prevents the issue from arising.

Could you add a comment above the line that resets the `readerIndex` to explain why it is necessary?

Should we spend the time to find out where the response bytebufs are read too early, and why? If that read happens erroneously, it might cause problems in other places as well.
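For illustration, a sketch of the rewind with the kind of explanatory comment requested; the Base64 method body is an assumption for the sake of a self-contained example, not the actual implementation:

```java
import java.util.Base64;

import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufUtil;

static String encodeBinaryToString(ByteBuf byteBuf) {
    // Workaround for #111: the queued ByteBuf can be read too early, somewhere
    // between its addition to the AsyncPayloadProcessor queue and payloads.take(),
    // leaving readerIndex == capacity and the payload effectively empty.
    // Rewinding the reader index ensures the full contents are encoded.
    byteBuf = byteBuf.readerIndex(0);
    return Base64.getEncoder().encodeToString(ByteBufUtil.getBytes(byteBuf));
}
```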
        ByteBuf byteBuf = Unpooled.wrappedBuffer("{[0, 0.1, 2.3, 4, 5.6]}".getBytes());
        String encodedString = RemotePayloadProcessor.encodeBinaryToString(byteBuf);
        assertFalse(encodedString.isEmpty());
    }
}
Needs a newline at the end of the file.
Thanks @RobGeada. I've opened another PR #120 which will hopefully address this properly; would you mind reviewing that? It would also be useful to have a unit test for this, but the tests included here don't exercise the actual bug. Ideally we'd have a test that actually runs a local modelmesh with an async processor configured and can trigger the problem (other unit tests already run modelmesh to test other things and could hopefully be used as a starting point and adapted).
Should this PR be closed in favor of PR #120?
Motivation
Addresses #111.
Modifications
Adds a byteBuf readerIndex rewind before payloads are encoded in the RemotePayloadProcessor, which deals with any concurrency issues arising from a too-early read of a queued payload. This is a workaround that mitigates the effects of an as-yet-unidentified race condition in bytebuf reading; the root cause should still be tracked down, but this works in the interim.
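A small standalone demonstration of the failure mode the rewind guards against, using Netty directly (the scenario is illustrative):

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;

public class ReaderIndexDemo {
    public static void main(String[] args) {
        ByteBuf buf = Unpooled.wrappedBuffer("payload".getBytes());
        buf.readBytes(new byte[buf.readableBytes()]); // a too-early read drains the buffer
        System.out.println(buf.readableBytes());      // 0 -- nothing left to encode
        buf.readerIndex(0);                           // the rewind restores the contents
        System.out.println(buf.readableBytes());      // 7 -- full payload visible again
    }
}
```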
Result
Null response payloads no longer occur in high-process-time scenarios