InitAPI memory leak #3168

Closed
1 task
saad-alsaad1 opened this issue Oct 31, 2024 · 16 comments
Labels
bug This issue is a bug. closing-soon This issue will automatically close in 4 days unless further comments are made. p2 This is a standard priority issue

Comments

@saad-alsaad1

saad-alsaad1 commented Oct 31, 2024

Describe the bug

I've compiled the AWS SDK on three RHEL machines: RHEL7, RHEL8, and RHEL9.
I noticed that when I call Aws::InitAPI(options), memory usage rises by 1847 MB on RHEL7 and RHEL9,
while on RHEL8 Aws::InitAPI(options) consumes around 370 MB.
Also, when I call Aws::ShutdownAPI(options), it only frees 50 MB.
I have the following library versions on each machine:

RHEL7:
Cmake: 3.14.2
GCC: 8.3.1
libz: 1.2.7
libcurl: 7.29.0
OpenSSL & libcrypto: 1.0.2k
Python: 3.6.8

RHEL8:
Cmake: 3.14.2
GCC: 8.5.0
libz: 1.2.11
libcurl: 7.61.1
OpenSSL & libcrypto: 1.1.1k
Python: 3.6.8

RHEL9:
Cmake: 3.26.5
GCC: 11.4.1
libz: 1.2.11
libcurl: 7.76.1
OpenSSL & libcrypto: 3.0.7
Python: 3.9.18

I used the following command to compile AWS SDK on the machines:

./aws-sdk-cpp -DCMAKE_BUILD_TYPE=Release -DCURL_INCLUDE_DIR=/usr/include/ -DBUILD_ONLY="s3;sts;s3-crt;ec2;core;ec2-instance-connect;ecr;ecs;ecr-public;s3control;transfer;iam;identity-management;access-management;monitoring;s3-encryption" -DCMAKE_PREFIX_PATH=./aws-sdk-libs -DCMAKE_INSTALL_PREFIX=./aws-sdk-libs -DBUILD_SHARED_LIBS=ON

Code:

    Aws::SDKOptions options;
    switch(log_level){
        case 1:
            options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Fatal;
        break;
        case 2:
            options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Error;
        break;
        case 3:
            options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Warn;
        break;
        case 4:
            options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Info;
        break;
        case 5:
            options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Debug;
        break;
        case 6:
            options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Trace;
        break;
    }

    Aws::InitAPI(options); 

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

I believe Aws::InitAPI shouldn't need around 1.8 GB of memory on Centos 7 and 9, since on Centos 8 it only consumes 370 MB.
I would expect Aws::ShutdownAPI to free all the memory allocated by Aws::InitAPI.

Current Behavior

Aws::InitAPI(options) takes 1847 MB of memory on RHEL7 and RHEL9, but on RHEL8 it only takes 370 MB.
Aws::ShutdownAPI(options) doesn't free all the memory allocated by Aws::InitAPI(options).

Reproduction Steps

Compile the AWS SDK on RHEL 7.6 or RHEL 9.4 with the following libraries:
RHEL7:
Cmake: 3.14.2
GCC: 8.3.1
libz: 1.2.7
libcurl: 7.29.0
OpenSSL & libcrypto: 1.0.2k
Python: 3.6.8

RHEL9:
Cmake: 3.26.5
GCC: 11.4.1
libz: 1.2.11
libcurl: 7.76.1
OpenSSL & libcrypto: 3.0.7
Python: 3.9.18

Use the following command for compilation:
./aws-sdk-cpp -DCMAKE_BUILD_TYPE=Release -DCURL_INCLUDE_DIR=/usr/include/ -DBUILD_ONLY="s3;sts;s3-crt;ec2;core;ec2-instance-connect;ecr;ecs;ecr-public;s3control;transfer;iam;identity-management;access-management;monitoring;s3-encryption" -DCMAKE_PREFIX_PATH=./aws-sdk-libs -DCMAKE_INSTALL_PREFIX=./aws-sdk-libs -DBUILD_SHARED_LIBS=ON

Possible Solution

No response

Additional Information/Context

No response

AWS CPP SDK version used

1.11.365

Compiler and Version used

gcc 8.3.1 & gcc 8.5.0 & gcc 11.4.1

Operating System and version

Centos 7.6 & Centos 8.5.2 & Centos 9.4

@saad-alsaad1 saad-alsaad1 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Oct 31, 2024
@jmklix
Member

jmklix commented Nov 7, 2024

I'm not seeing any memory leaks when reproducing this on RHEL9.4.

#include <iostream>
#include <aws/core/Aws.h>

using namespace std;
using namespace Aws;

int main() {
    Aws::SDKOptions options;
    options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Trace;

    Aws::InitAPI(options);
    {
            std::cout << "Test" <<std::endl;
    }
    Aws::ShutdownAPI(options);
    return 0;
}

Can you try the following:

  • make sure you are following the basic guidelines of making all sdk calls within {}
  • make sure you are only calling InitAPI and ShutdownAPI once each
  • run valgrind memory check with your application.

Valgrind also put my simple application usage at only 15 MB. So it would be helpful if you could give us a minimal repro code sample that shows a memory leak.

@jmklix jmklix self-assigned this Nov 7, 2024
@jmklix jmklix added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Nov 7, 2024

Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Nov 17, 2024
@saad-alsaad1
Author

I used your example code and ran Valgrind on Centos 9.4; it didn't help.
I noticed in the trace log that "Initializing edge-triggered epoll" is repeated 24 times after "Successfully reloaded configuration". I think this is not normal.

Here is a snapshot:
image

@SergeyRyabinin
Contributor

Hi @saad-alsaad1 , thank you for your reply!
I'm not fully sure yet, but it seems to be the CRT allocating quite a lot of worker threads for the event loop.
Could you please share the machine specs (CPU, RAM) you are using for RHEL 7, 8, and 9?
Have you applied any special non-default OS/kernel config?

Best regards,
Sergey

@jmklix jmklix added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. and removed closing-soon This issue will automatically close in 4 days unless further comments are made. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. labels Nov 19, 2024
@saad-alsaad1
Author

saad-alsaad1 commented Nov 20, 2024

Hi @SergeyRyabinin,
I didn't modify the OS/kernel config.
Here are the specs for each machine:
Centos 9.4:
CPU info:
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 48
Model name: Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
CPU family: 6
Model: 63
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 24
Stepping: 2

RAM: 263 GB

Centos 8.5:
CPU info:
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 8
NUMA node(s): 1
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz

RAM: 32 GB

Centos 7:
CPU info:
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz

RAM: 264 GB

@jmklix
Member

jmklix commented Nov 21, 2024

What you're seeing doesn't appear to be a memory leak, but rather the SDK setting up event loops in proportion to how many CPUs you have. To elaborate, the SDK detects that you have 48 CPUs and starts 24 event loop threads to prepare for processing large amounts of data. You can see the code that decides that number here:

        /* cut them in half to avoid using hyper threads for the IO work. */
        el_count = processor_count > 1 ? processor_count / 2 : processor_count;

This can take up a large chunk of memory compared to an instance with only 8 CPUs. It would also explain why nothing shows up in Valgrind, as there is no memory leak. It would be a problem if you were still seeing large memory usage after the program terminated, but Valgrind would catch that.
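
As a rough illustration of the heuristic quoted above, the sketch below reproduces the halving rule as a standalone function. The name `default_event_loop_count` is hypothetical and not part of the SDK or the CRT; only the expression itself comes from the quoted snippet.

```cpp
#include <cstddef>

// Hypothetical helper (not an SDK or CRT function): applies the quoted rule
// "cut them in half to avoid using hyper threads for the IO work".
// A single-CPU machine keeps its one event loop.
std::size_t default_event_loop_count(std::size_t processor_count) {
    return processor_count > 1 ? processor_count / 2 : processor_count;
}
```

Under this rule, 48 CPUs give 24 event loops, which lines up with the 24 epoll initializations reported in the trace log, while an 8-CPU machine gets 4.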

I don't think there are currently any actionable items on this. But please do let us know what expectations or preferences you have for this SDK.

@jmklix jmklix added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. and removed response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. labels Nov 21, 2024
@saad-alsaad1
Author

Hi @jmklix,
Thanks for the explanation.
Is there a way to control the number of threads for the event loops, using an SDK option or environment variable?
If not, is it possible to add this feature in future releases?

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Nov 22, 2024
@DmitriyMusatkin
Contributor

I really doubt it's related to the number of threads that the CRT launched. The CRT barely stores any extra metadata for each thread. From memory, typical ELG memory usage on a C5n (which has 72 cores and will run 36 threads by default) is less than 40 MB.

Do you have a complete sample where you are observing the issue? How are you actually measuring memory usage?

@jmklix jmklix added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Nov 27, 2024
@saad-alsaad1
Author

saad-alsaad1 commented Dec 1, 2024

I get the memory usage from /proc/self/status by parsing the VmSize value.
Here is my sample:

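#include <cstdio>   // fopen, fgets, fclose
#include <cstdlib>  // atoi
#include <cstring>  // strlen, strncmp
#include <iostream>
#include <aws/core/Aws.h>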

using namespace std;
using namespace Aws;

static int parseLine(char* line){
    // This assumes that a digit will be found and the line ends in " Kb".
    int i = (int)strlen(line);
    const char* p = line;
    while (*p <'0' || *p > '9') p++;
    line[i-3] = '\0';
    i = atoi(p);
    return i;
}

int getVmMemUsage(void)
{ // Returns VmSize in MB (the kernel reports it in kB), or -1 on failure.
    FILE* file = fopen("/proc/self/status", "r");
    if (file == NULL) return -1; // e.g. /proc is not available
    int result = -1;
    char line[128];

    while (fgets(line, 128, file) != NULL){
        if (strncmp(line, "VmSize:", 7) == 0){
            result = parseLine(line);
            break;
        }
    }
    fclose(file);

    return result < 0 ? -1 : result/1000;
}

int main() {
    Aws::SDKOptions options;
    options.loggingOptions.logLevel = Aws::Utils::Logging::LogLevel::Trace;

    Aws::InitAPI(options);
    {
            std::cout << "Memory usage in MB: " << getVmMemUsage() << std::endl;
    }
    Aws::ShutdownAPI(options);
    return 0;
}

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Dec 2, 2024
@DmitriyMusatkin
Contributor

Are you sure VmSize is what you are looking for? Looking at the proc status docs, VmSize seems to be the total addressable space of the process, and how the OS maps that might vary based on hardware and other things.
Does VmRSS show the same difference?
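
To compare the two fields side by side, a minimal sketch like the one below could be used. It assumes Linux and the usual `Name:   value kB` layout of /proc/self/status; `read_status_kb` is a hypothetical helper name, not something provided by the SDK.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the value in kB of a /proc/self/status field
// such as "VmSize" or "VmRSS", or -1 if the file or the field is unavailable.
long read_status_kb(const std::string& field) {
    std::ifstream status("/proc/self/status");
    const std::string prefix = field + ":";
    std::string line;
    while (std::getline(status, line)) {
        // Match only lines that start with "<field>:".
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = -1;
            rest >> kb;  // skips the padding, reads the number, stops before " kB"
            return rest.fail() ? -1 : kb;
        }
    }
    return -1;
}
```

Calling `read_status_kb("VmSize")` and `read_status_kb("VmRSS")` right after Aws::InitAPI would show both the reserved address space and the memory actually resident in RAM.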

@jmklix jmklix added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Dec 2, 2024

Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Dec 13, 2024
@saad-alsaad1
Author

I see totally different numbers for VmRSS and VmSize for the same process, and the VmRSS number seems to make more sense.
On Centos 7, I get the following results when calling InitAPI:
VmSize: 2045.32 MB
VmRSS: 17.04 MB

@jmklix
Member

jmklix commented Dec 16, 2024

As noted here:

VmRSS in /proc/[pid]/statm is a useful data. It shows how much memory in RAM is occupied by the process. The rest extra memory has either been not used or has been swapped out.

VmSize is how much virtual memory the process has in total. This includes all types of memory, both in RAM and swapped out. These numbers can get skewed because they also include shared libraries.

So your 17.04 MB VmRSS is a normal memory footprint for this SDK. Please let us know if you have any other questions about this SDK.
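
The gap between the two counters can be reproduced without the SDK. The sketch below is Linux only, and both helper names are hypothetical. It maps a large anonymous region without touching it: VmSize should grow by roughly the mapped size, while VmRSS should stay about the same because no pages have been faulted in. It only illustrates the general mechanism and is not a claim about where the SDK's virtual memory actually goes.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <sys/mman.h>

// Hypothetical helper: value in kB of a /proc/self/status field, or -1.
long read_status_kb(const std::string& field) {
    std::ifstream status("/proc/self/status");
    const std::string prefix = field + ":";
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long kb = -1;
            rest >> kb;
            return rest.fail() ? -1 : kb;
        }
    }
    return -1;
}

// Hypothetical helper: reserves `bytes` of anonymous virtual memory without
// writing to it, so no physical pages are committed yet.
// Returns nullptr on failure; release the region with munmap(ptr, bytes).
void* reserve_untouched(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Intended use: read VmSize and VmRSS, call reserve_untouched(256 MiB),
// then read both fields again and compare the two deltas.
```

Reading both fields before and after `reserve_untouched` shows why a 2 GB VmSize can coexist with a 17 MB VmRSS.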

@jmklix jmklix added closing-soon This issue will automatically close in 4 days unless further comments are made. and removed closing-soon This issue will automatically close in 4 days unless further comments are made. response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. labels Dec 16, 2024
@bjosv

bjosv commented Dec 20, 2024

We had a similar problem when upgrading aws-sdk-cpp from v1.7 to v1.11 and using an S3Client in an application.
The memory usage in the pod increased by 50% (150 MB) just from upgrading, which was not acceptable, and I guess VmSize is used in K8s (?). We could see that the problem started with 1.9, when the CRT library was introduced.

Anyway, we could lower the numbers from the additional 150MB to 31MB by tuning the number of initiated threads via clientBootstrap_create_fn in Aws::SDKOptions.

Not sure if this can help you @saad-alsaad1 , but see this simple example.

@jmklix jmklix closed this as completed Dec 20, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.

@DmitriyMusatkin
Contributor

@bjosv can you share any other details about what your app was doing? The CRT uses green threads, and on their own those threads should not use a whole lot of memory.
How are you measuring memory usage in K8s? From my quick research, it seems K8s uses VmRSS to show active memory usage, which should not jump that much regardless of the number of threads.
