-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
netty: netty tests resource leakage issue #11593
base: master
Are you sure you want to change the base?
netty: netty tests resource leakage issue #11593
Conversation
…ts and commented 3 Failing UTs.
…ts and commented 3 Failing UTs.
…ts and commented 3 Failing UTs.
…ts and commented 3 Failing UTs.
… framer and release buffers
…tenerMessageQueue in NettyServerHandlerTest
# Conflicts: # netty/src/test/java/io/grpc/netty/NettyHandlerTestBase.java
…stenerMessageQueue in NettyServerTests as it's not working as expected
@@ -1346,7 +1368,7 @@ private ByteBuf emptyGrpcFrame(int streamId, boolean endStream) throws Exception | |||
try { | |||
return dataFrame(streamId, endStream, buf); | |||
} finally { | |||
buf.release(); | |||
buf.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we revert this change then below tests are failing with mentioned reason in snippet. so I used clear() also you can refer previous comment on the same.
#3353 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per documentation clear() does not release memory: https://github.com/netty/netty/blob/64c52255c073d4a308d3dbb56a12d3cca7692c47/buffer/src/main/java/io/netty/buffer/ByteBuf.java#L467
I tried with your PR and not calling either byteBuf.clear() or byteBuf.release() here and I did not observe the leak report for any of the 3 tests calling emptyDataFrame() viz headersWithErrAndEndStreamReturnErrorButNotThrowNpe, streamErrorShouldNotCloseChannel and clientHalfCloseShouldForwardToStreamListener.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the above comment and snippet we are discussing about 3 tests failures with refCnt decrement error if we use byteBuf.release() instead of byteBuf.clear(), not about memory leakage issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, my question is why byteBuf.release() is even required when it doesn't show any report leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean to say we need to remove this catch block which is doing the cleanup part, as this was an existing code? please confirm.
/** | ||
* Must be called by subclasses to initialize the handler and channel. | ||
*/ | ||
protected final void initChannel(Http2HeadersDecoder headersDecoder) throws Exception { | ||
content = Unpooled.copiedBuffer("hello world", UTF_8); | ||
//ByteBuf content = Unpooled.copiedBuffer("hello world", UTF_8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -869,7 +873,8 @@ public void keepAliveEnforcer_sendingDataResetsCounters() throws Exception { | |||
future.get(); | |||
for (int i = 0; i < 10; i++) { | |||
future = enqueue( | |||
new SendGrpcFrameCommand(stream.transportState(), content().retainedSlice(), false)); | |||
//new SendGrpcFrameCommand(stream.transportState(), content().retainedSlice(), false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@@ -262,25 +268,27 @@ public void deliverFrame( | |||
if (frame != null) { | |||
ByteBuf bytebuf = ((NettyWritableBuffer) frame).bytebuf(); | |||
compressionFrame.writeBytes(bytebuf); | |||
bytebuf.release(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per comment #3353 (comment) we should not be doing this in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kannanjgithub As I remember with this change I was able to fix the below issue with grpcFrame() method.
#3353 (comment)
As per my understanding the comment you mentioned above does not apply here as I am not modifying frame after passing it to channelRead(), also I am not doing anything here around the channelRead() method as you mentioned in above comment link. please confirm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What leak does this fix? I tried commenting this out and ran some of the tests that end up calling this method but I didn't observe any leak report.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per my knowledge the above change fixed the leakage issue in the grpcFrame() method as mentioned in the comment #3353 (comment) also I remember this error will occur if you run the complete server test by 5+ times (sometimes I have seen these issue in 10 time also). if you run individual tests you will not see leakage error.
@@ -357,6 +365,7 @@ protected final ByteBuf captureWrite(ChannelHandlerContext ctx) { | |||
for (ByteBuf buf : captor.getAllValues()) { | |||
composite.addComponent(buf); | |||
composite.writerIndex(composite.writerIndex() + buf.readableBytes()); | |||
buf.clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, it was part of trial and error.
@@ -1302,6 +1320,7 @@ public void maxRstCount_exceedsLimit_fails() throws Exception { | |||
manualSetUp(); | |||
assertThrows(ClosedChannelException.class, () -> rapidReset(maxRstCount + 1)); | |||
assertFalse(channel().isOpen()); | |||
channel().releaseOutbound(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any outbound messages in the channel when this is called. Check here and in other places whether it is really necessary.
Also since you also have it in tearDown does it not suffice already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kannanjgithub I can see even with tearDown() method the channel().releaseOutbound() entry is required as I have validated the multiple UT's, without this outbound messages are not getting released. please find the below snippet for one of UT which says messages are not released without this entry(commented this entry in line 379) and its printing the count of the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kannanjgithub Fixed. As per our recent discussion, Removed the channel().releaseOutbound() entries in the UT's which are calling the rapidReset() method as its already handled here.
Also Reverted back the channel().releaseOutbound() in other UTs which is not required as its calling the tearDown() after execution of every test and releasing the outbound messages if any
… from client test
… from base and server test
accumulator += wireSize; | ||
} | ||
long pingData = handler.flowControlPing().payload(); | ||
channelRead(pingFrame(true, pingData)); | ||
channel().releaseOutbound(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed as this tearDown will do it for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
…ulates() server base test
I can see there are multiple tests in netty server and client side are randomly failing with resource leakage error and this PR will be fixing some of them at Server and Client side as below.
Note : Made some common method changes (as per Eric Suggestion/Reference) in all 3 tests files and currently tests which are shown below in Pending Server Base are one which are majorly impacting in both Server and Client test randomly which has common root cause while invoking the dataFrame() which we discussed earlier and tried suggestion as shown in the comment
#3353 (comment)
Server Side UTs Fixes:
sendFrameShouldSucceed()
inboundDataWithEndStreamShouldForwardToStreamListener()
inboundDataShouldForwardToStreamListener()
keepAliveEnforcer_sendingDataResetsCounters()
maxRstCount_withinLimit_succeeds()
maxRstCount_exceedsLimit_fails()
NettyHandlerTestBase.dataSizeSincePingAccumulates()
Client Side UTs Fixes
sendFrameShouldSucceed()
Pending UTs at Server Base
NettyHandlerTestBase.bdpPingWindowResizing
NettyHandlerTestBase.bdpPingLimitOutstanding
NettyHandlerTestBase.windowUpdateMatchesTarget
NettyHandlerTestBase.testPingBackoff
Fixes #3353