Memory problem on ConcurrentHClientPool using HThriftClient with TFramedTransport #665
It keeps the data, but it is overwritten during the next usage. This is a "feature" of that version of Thrift: it keeps the underlying byte[] to avoid having to re-allocate/re-grow it. The problem, as you have discovered, is that the buffer will grow to accept a larger payload but will never shrink, and it can grow all the way up to the max message length (15 MB by default).
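To make that behavior concrete, here is a minimal, self-contained sketch of a grow-only frame read buffer. This is illustrative code only, not the actual Thrift TFramedTransport/TMemoryInputTransport implementation; the class and method names are made up:

```java
/**
 * Illustrative sketch of a grow-only frame read buffer, mimicking the
 * behavior described above: the backing byte[] is enlarged to fit the
 * largest frame seen so far and then reused (overwritten), but it is
 * never shrunk back down.
 */
public class GrowOnlyFrameBuffer {
    private byte[] buffer = new byte[0];

    /** Ensure the buffer can hold a frame of the given size and return it. */
    public byte[] readFrame(int frameLength) {
        if (frameLength > buffer.length) {
            // Grow to the new payload size; the old array becomes garbage.
            buffer = new byte[frameLength];
        }
        // Smaller frames simply overwrite the front of the (possibly much
        // larger) existing array; retained heap stays at the high-water mark.
        return buffer;
    }

    public int retainedBytes() {
        return buffer.length;
    }

    public static void main(String[] args) {
        GrowOnlyFrameBuffer b = new GrowOnlyFrameBuffer();
        b.readFrame(1024);              // 1 KB frame
        b.readFrame(10 * 1024 * 1024);  // 10 MB frame grows the buffer
        b.readFrame(512);               // small frame afterwards...
        System.out.println(b.retainedBytes()); // ...but ~10 MB is still retained
    }
}
```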
We did not need to have all MAX_CONNECTIONS threads allocated, and a third of that seemed a good number from empirical observation of adding a service into a running architecture. That turned out to be a good guess, as no one has yet had a big enough issue with it to want to add a MIN_CONNECTIONS or similar :)
Thank you for the answer. So, the maximum (approximate) retained heap of a ConcurrentHClientPool using HThriftClient with TFramedTransport follows this rule:
Yes - exactly that. This by itself may be a reason to incorporate the DataStax Java Driver for simple operations in your code as well, maintaining a much smaller pool of Hector connections for large batch mutates or for getting at dynamic columns more easily. Further, the binary protocol for CQL uses evented IO via Netty on both the client and the server, so it is significantly more efficient resource-wise. That said, despite what you may read elsewhere, using raw Thrift is more performant and flexible if (a really big "if" there) you understand the underlying storage model and its limits. There's really no reason you can't use both.
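As a rough back-of-the-envelope sketch of the bound discussed above: the method and parameter names (maxActiveConnections, hostCount, largestFrameSeenBytes) are illustrative assumptions, not Hector API; the divide-by-3 per-host sizing and the 15 MB default cap come from this thread:

```java
/**
 * Rough worst-case retained-heap estimate for the pooled framed-transport
 * read buffers, based on the discussion in this thread.
 */
public class RetainedHeapEstimate {
    public static long worstCaseRetainedBytes(int maxActiveConnections,
                                              int hostCount,
                                              long largestFrameSeenBytes) {
        // Hector keeps roughly maxActive/3 HClients per host (see the
        // question and answer elsewhere in this thread), and each pooled
        // client's read buffer can have grown to the largest frame it has
        // ever received, capped by the max message length (15 MB by default).
        long clientsPerHost = maxActiveConnections / 3;
        long maxFrame = Math.min(largestFrameSeenBytes, 15L * 1024 * 1024);
        return hostCount * clientsPerHost * maxFrame;
    }

    public static void main(String[] args) {
        // e.g. 30 max active connections, 6 hosts, 10 MB worst-case payload
        long bytes = worstCaseRetainedBytes(30, 6, 10L * 1024 * 1024);
        System.out.println(bytes / (1024 * 1024) + " MB"); // prints "600 MB"
    }
}
```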
In version 1.1-4, when ConcurrentHClientPool releases an HClient, if it is open, it is pooled in availableClientQueue.
In the case of an HThriftClient whose TTransport is wrapped with TFramedTransport, the TMemoryInputTransport readBuffer_ keeps the data of the operations performed by the HClient.
This data, multiplied by the number of connections, can quickly increase memory usage.
Why isn't readBuffer_ cleared when the HClient is released?
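For what it's worth, here is a hedged sketch of what a buffer-aware release policy could look like in a connection pool. The PooledClient type and its methods are hypothetical stand-ins defined in the sketch itself, not Hector's actual HClient/ConcurrentHClientPool API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Hypothetical pool release policy: instead of unconditionally re-queueing
 * an open client (and its grown read buffer), close clients whose buffer
 * has exceeded a threshold so the memory can be reclaimed.
 */
public class BufferAwarePool {

    /** Hypothetical stand-in for a pooled Thrift client. */
    public interface PooledClient {
        boolean isOpen();
        void close();
        long retainedReadBufferBytes();
    }

    private static final long MAX_RETAINED_BUFFER_BYTES = 1024 * 1024; // 1 MB

    private final BlockingQueue<PooledClient> availableClients =
            new ArrayBlockingQueue<>(64);

    public void releaseClient(PooledClient client) {
        if (client.isOpen()
                && client.retainedReadBufferBytes() <= MAX_RETAINED_BUFFER_BYTES) {
            // Small buffer: keep the connection around for reuse.
            if (!availableClients.offer(client)) {
                client.close(); // pool is full
            }
        } else {
            // Buffer grew too large (or connection is closed): drop the client
            // so its grown byte[] becomes garbage-collectable.
            client.close();
        }
    }
}
```

The idea is simply to stop re-queueing clients whose read buffer has grown past a threshold; whether the extra reconnects are worth the reclaimed memory depends on the workload.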
I have one additional question:
Why is the max active connections count divided by 3 to obtain the number of HClients per host?
Thanks