
Possible memory leak in .net agent #1988

Closed · kaluznyt opened this issue Oct 18, 2023 · 11 comments

Labels: bug (Something isn't working), community (To tag external issues and PRs)

Comments

@kaluznyt

We're seeing an increasing number of objects related to New Relic in our .NET Core app.

Description
Took a couple of memory dumps of a .NET Core app (running on Kubernetes) and am seeing an increasing number of objects from the newrelic.agent namespace, such as:

  • MethodCallData
  • Segment
  • ParsedSqlStatement
  • ConnectionInfo
  • DataStoreSegmentData

This is causing the application to hit an OOM error after running for a couple of days.

Expected Behavior
No constant increase in memory usage caused by the New Relic objects.

Steps to Reproduce
No particular steps required.

Your Environment
.NET Core API app running on Kubernetes, .NET Core 6, NewRelic agent version 10.17.0

The screenshots below are from dotMemory; the dumps were taken from the same container one day apart:
[dotMemory screenshots]

@kaluznyt kaluznyt added the bug Something isn't working label Oct 18, 2023
@github-actions github-actions bot added the community To tag external issues and PRs label Oct 18, 2023
@nrcventura
Member

Thank you for reporting this to us. The data types that you are reporting in this issue and in the screenshots are ones that we expect to be created for every single call to a database/datastore.

  • MethodCallData is an object that is created for every single call to an instrumented method
  • Segment is an object that typically represents a method call within a transaction
  • ParsedSqlStatement is an object that is created for each database/datastore call
  • ConnectionInfo is an object that is created for each database/datastore call
  • DatastoreSegmentData is an object that is created for each database/datastore call

In general, this data should only be considered alive until a transaction ends and is transformed into the wire models that are stored in reservoirs until they are ready to be transmitted to New Relic. The agent does have some caching in place for ParsedSqlStatement which can keep those instances alive longer.

We will need more information to better understand what is happening. Information like the following can help us understand what is going on.

  1. How many database calls do you expect to see per transaction?
  2. How many transactions do you expect to see executing concurrently?
  3. Do you have a shareable reproduction of this memory problem?

@kaluznyt
Author

Thanks for your reply.

OK, basically the app mostly calls Redis via StackExchange.Redis multiplexers (cached/singletons). It calls different Redis instances (one hosted in the cloud, one in Kubernetes itself).
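
To illustrate, the multiplexer setup looks roughly like the sketch below (a .NET 6 minimal API with a singleton StackExchange.Redis multiplexer; the endpoint, connection string, and route are placeholders, not our real configuration):

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using StackExchange.Redis;

var builder = WebApplication.CreateBuilder(args);

// Multiplexers are expensive to create, so each one is connected once and reused as a singleton.
// The endpoint below is an example, not a real connection string.
builder.Services.AddSingleton<IConnectionMultiplexer>(_ =>
    ConnectionMultiplexer.Connect("cloud-redis.example.com:6380,ssl=true"));

var app = builder.Build();

app.MapGet("/cache/{key}", async (string key, IConnectionMultiplexer redis) =>
{
    // Each request issues one or more Redis commands through the shared multiplexer;
    // the agent records each command as a datastore segment.
    var value = await redis.GetDatabase().StringGetAsync(key);
    return value.HasValue ? Results.Ok(value.ToString()) : Results.NotFound();
});

app.Run();
```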

It also calls SQL Server, but that's a small percentage of the overall database traffic.

In terms of database calls, it's difficult to tell since it really depends on the request, but usually multiple Redis calls go out per transaction/request. There are also many concurrent requests; I would say the traffic is quite high all the time.

As for a reproduction, I don't have anything at the moment; we're seeing this in the live environment. Perhaps I'll try to put one together. Or maybe I could run some New Relic instrumentation, if that would help in any way?

@kaluznyt
Author

This is the breakdown of a sample transaction from New Relic:
[New Relic transaction breakdown screenshot]

@kaluznyt
Author

And this is a dump from the same pod, taken today; we see a large increase in object counts compared to the previous dumps:
[dotMemory screenshot]

@kaluznyt
Author

Is ParsedSqlStatement used for Redis, or only for SQL queries? From the name I suppose it's only for SQL queries; perhaps that's something we should look into?

@kaluznyt
Author

I just took a look at one of the ParsedSqlStatement objects, and it seems it's Redis related:
[dotMemory screenshot of a ParsedSqlStatement instance]

@kaluznyt
Author

I also found that a transaction that's still held in memory is also showing in New Relic, assuming that the _transactionGuid on the TransactionMetadata is the tripId (and this one is from 2 days ago).

[screenshots: dotMemory object and the matching New Relic transaction]

@nrcventura
Member

That definitely seems like a memory leak. Since you can find that transaction in the New Relic UI, the transaction probably ended and got transformed, which would allow the garbage collector to reclaim that memory. However, it's possible that it is being referenced on another thread and being kept in memory. This may be a side-effect of how the agent maintains state by leveraging AsyncLocal storage to allow the transaction to flow with all of the async thread hops that a transaction/request may go through. Things like starting a timer, and certain ways of kicking off async background work can cause AsyncLocal state to be captured even though that async work is not part of the request.
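
To make that concrete, here is a minimal stand-alone sketch (not agent code; the TransactionState type and sizes are made up) of how a value stored in AsyncLocal can be captured by fire-and-forget background work and kept alive after the request that created it has finished:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a per-transaction object graph (segments, SQL statements, etc.).
class TransactionState
{
    public byte[] Payload = new byte[1_000_000];
}

class Program
{
    // The current "transaction" flows across awaits via AsyncLocal, similar in spirit to how
    // the agent tracks transaction state.
    static readonly AsyncLocal<TransactionState?> CurrentTransaction = new();

    static async Task HandleRequestAsync()
    {
        CurrentTransaction.Value = new TransactionState();

        // Fire-and-forget work started inside the request captures the current ExecutionContext,
        // including the AsyncLocal slot, so this TransactionState stays reachable for as long as
        // the background loop runs -- long after the request below has returned.
        _ = Task.Run(async () =>
        {
            while (true)
            {
                await Task.Delay(TimeSpan.FromMinutes(1));
                // CurrentTransaction.Value still references the request's TransactionState here.
            }
        });

        // Wrapping the Task.Run call in ExecutionContext.SuppressFlow() would prevent the
        // AsyncLocal value from being captured by the background task.

        await Task.Delay(10);             // simulated request work
        CurrentTransaction.Value = null;  // the request ends, but the captured copy lives on
    }

    static async Task Main()
    {
        for (var i = 0; i < 100; i++)
        {
            await HandleRequestAsync();
        }

        // At this point 100 TransactionState instances (and their payloads) are still rooted by
        // the execution contexts captured by the background tasks.
        Console.WriteLine("Requests finished, but the AsyncLocal captures keep old state alive.");
    }
}
```

Timers behave the same way: a System.Threading.Timer created while a transaction is current captures the execution context (and therefore the AsyncLocal value) unless flow is suppressed.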

To further help you, we will either need an application that we can use to reproduce the problem, or you may need to work with our support team in order to share the memory dump, and possibly some sample code, with us. By working with our support team you can avoid sharing potentially sensitive information on GitHub. For more information on our support team you can refer to Support Options. If you open a request with the support team, please reference this GitHub issue so that we can ensure that the issues are linked together.

@kaluznyt
Author

Thank you for the guidance. We'll try to investigate the cause based on your input; if we cannot figure it out, we'll open a support request to get help.

@nrcventura
Member

I'm closing this ticket for now. If a support request is opened for this, we will address the problem through the support process. If more information becomes available, and we can identify or reproduce the problem, we can reopen the ticket later.

@nrcventura closed this as not planned on Oct 23, 2023