Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

netty: netty tests resource leakage issue #11593

Open
wants to merge 37 commits into
base: master
Choose a base branch
from

Conversation

vinodhabib
Copy link
Contributor

@vinodhabib vinodhabib commented Oct 3, 2024

I can see there are multiple tests in netty server and client side are randomly failing with resource leakage error and this PR will be fixing some of them at Server and Client side as below.

Note : Made some common method changes (as per Eric Suggestion/Reference) in all 3 tests files and currently tests which are shown below in Pending Server Base are one which are majorly impacting in both Server and Client test randomly which has common root cause while invoking the dataFrame() which we discussed earlier and tried suggestion as shown in the comment
#3353 (comment)

Server Side UTs Fixes:
sendFrameShouldSucceed()
inboundDataWithEndStreamShouldForwardToStreamListener()
inboundDataShouldForwardToStreamListener()
keepAliveEnforcer_sendingDataResetsCounters()
maxRstCount_withinLimit_succeeds()
maxRstCount_exceedsLimit_fails()

NettyHandlerTestBase.dataSizeSincePingAccumulates()

Client Side UTs Fixes
sendFrameShouldSucceed()

Pending UTs at Server Base
NettyHandlerTestBase.bdpPingWindowResizing
NettyHandlerTestBase.bdpPingLimitOutstanding
NettyHandlerTestBase.windowUpdateMatchesTarget
NettyHandlerTestBase.testPingBackoff

Fixes #3353

# Conflicts:
#	netty/src/test/java/io/grpc/netty/NettyHandlerTestBase.java
@vinodhabib vinodhabib changed the title Defect 3353 - netty tests resource leak fix netty - netty tests resource leakage issue Oct 4, 2024
@vinodhabib vinodhabib changed the title netty - netty tests resource leakage issue netty: netty tests resource leakage issue Oct 4, 2024
…stenerMessageQueue in NettyServerTests as it's not working as expected
@vinodhabib vinodhabib changed the title netty: netty tests resource leakage issue grpc-netty: netty tests resource leakage issue Oct 14, 2024
@@ -1346,7 +1368,7 @@ private ByteBuf emptyGrpcFrame(int streamId, boolean endStream) throws Exception
try {
return dataFrame(streamId, endStream, buf);
} finally {
buf.release();
buf.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this change.

Copy link
Contributor Author

@vinodhabib vinodhabib Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we revert this change then below tests are failing with mentioned reason in snippet. so I used clear() also you can refer previous comment on the same.
#3353 (comment)
image

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per documentation clear() does not release memory: https://github.com/netty/netty/blob/64c52255c073d4a308d3dbb56a12d3cca7692c47/buffer/src/main/java/io/netty/buffer/ByteBuf.java#L467

I tried with your PR and not calling either byteBuf.clear() or byteBuf.release() here and I did not observe the leak report for any of the 3 tests calling emptyDataFrame() viz headersWithErrAndEndStreamReturnErrorButNotThrowNpe, streamErrorShouldNotCloseChannel and clientHalfCloseShouldForwardToStreamListener.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the above comment and snippet we are discussing about 3 tests failures with refCnt decrement error if we use byteBuf.release() instead of byteBuf.clear(), not about memory leakage issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, my question is why byteBuf.release() is even required when it doesn't show any report leak.

Copy link
Contributor Author

@vinodhabib vinodhabib Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean to say we need to remove this catch block which is doing the cleanup part, as this was an existing code? please confirm.

/**
* Must be called by subclasses to initialize the handler and channel.
*/
protected final void initChannel(Http2HeadersDecoder headersDecoder) throws Exception {
content = Unpooled.copiedBuffer("hello world", UTF_8);
//ByteBuf content = Unpooled.copiedBuffer("hello world", UTF_8);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -869,7 +873,8 @@ public void keepAliveEnforcer_sendingDataResetsCounters() throws Exception {
future.get();
for (int i = 0; i < 10; i++) {
future = enqueue(
new SendGrpcFrameCommand(stream.transportState(), content().retainedSlice(), false));
//new SendGrpcFrameCommand(stream.transportState(), content().retainedSlice(), false));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -262,25 +268,27 @@ public void deliverFrame(
if (frame != null) {
ByteBuf bytebuf = ((NettyWritableBuffer) frame).bytebuf();
compressionFrame.writeBytes(bytebuf);
bytebuf.release();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per comment #3353 (comment) we should not be doing this in the test.

Copy link
Contributor Author

@vinodhabib vinodhabib Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kannanjgithub As I remember with this change I was able to fix the below issue with grpcFrame() method.
#3353 (comment)

As per my understanding the comment you mentioned above does not apply here as I am not modifying frame after passing it to channelRead(), also I am not doing anything here around the channelRead() method as you mentioned in above comment link. please confirm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What leak does this fix? I tried commenting this out and ran some of the tests that end up calling this method but I didn't observe any leak report.

Copy link
Contributor Author

@vinodhabib vinodhabib Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per my knowledge the above change fixed the leakage issue in the grpcFrame() method as mentioned in the comment #3353 (comment) also I remember this error will occur if you run the complete server test by 5+ times (sometimes I have seen these issue in 10 time also). if you run individual tests you will not see leakage error.

@@ -357,6 +365,7 @@ protected final ByteBuf captureWrite(ChannelHandlerContext ctx) {
for (ByteBuf buf : captor.getAllValues()) {
composite.addComponent(buf);
composite.writerIndex(composite.writerIndex() + buf.readableBytes());
buf.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, it was part of trial and error.

@@ -1302,6 +1320,7 @@ public void maxRstCount_exceedsLimit_fails() throws Exception {
manualSetUp();
assertThrows(ClosedChannelException.class, () -> rapidReset(maxRstCount + 1));
assertFalse(channel().isOpen());
channel().releaseOutbound();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any outbound messages in the channel when this is called. Check here and in other places whether it is really necessary.
Also since you also have it in tearDown does it not suffice already?

Copy link
Contributor Author

@vinodhabib vinodhabib Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kannanjgithub I can see even with tearDown() method the channel().releaseOutbound() entry is required as I have validated the multiple UT's, without this outbound messages are not getting released. please find the below snippet for one of UT which says messages are not released without this entry(commented this entry in line 379) and its printing the count of the same.

image

image

Copy link
Contributor Author

@vinodhabib vinodhabib Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kannanjgithub Fixed. As per our recent discussion, Removed the channel().releaseOutbound() entries in the UT's which are calling the rapidReset() method as its already handled here.

Also Reverted back the channel().releaseOutbound() in other UTs which is not required as its calling the tearDown() after execution of every test and releasing the outbound messages if any

@vinodhabib vinodhabib changed the title grpc-netty: netty tests resource leakage issue netty: netty tests resource leakage issue Oct 18, 2024
accumulator += wireSize;
}
long pingData = handler.flowControlPing().payload();
channelRead(pingFrame(true, pingData));
channel().releaseOutbound();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed as this tearDown will do it for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Resource leak in netty tests
2 participants