[Feature]: Add tracing records for memory alloc/free #24

Open
wrwilliams opened this issue Oct 10, 2024 · 0 comments

Suggestion Description

In Score-P, we usually provide memory allocated on the CPU and GPU as process-level metrics (across a variety of accelerator paradigms). We can also create finer-grained metrics as desired, e.g. for pinned vs. managed vs. stream-local async allocations.

The problem: to know what memory has been allocated, and of what type, we potentially need to intercept every HIP allocation function and handle each one's semantics individually (what type of buffer or handle is returned, whether "size" is an array size or an actual size in bytes, what kind of memory has been allocated). Contrast this with e.g. kernel launch and memory copy, where rocprofiler digests the semantics of the various API calls and hands them back in the associated event records.
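To illustrate the per-function handling described above, here is a minimal sketch in plain C. It is not HIP or rocprofiler code: `call_kind_t`, `intercepted_call_t`, and `allocated_bytes` are hypothetical names, and the three call shapes are only loosely modeled on linear, pitched, and array allocations. The byte computations are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for three differently shaped allocation calls. */
typedef enum {
    CALL_LINEAR,  /* size is given directly in bytes */
    CALL_PITCHED, /* width in bytes, height in rows, runtime returns a pitch */
    CALL_ARRAY    /* width and height in elements of elem_bytes each */
} call_kind_t;

/* Hypothetical, flattened view of the arguments a tool would intercept. */
typedef struct {
    call_kind_t kind;
    size_t size;       /* CALL_LINEAR only */
    size_t width;      /* CALL_PITCHED (bytes) or CALL_ARRAY (elements) */
    size_t height;     /* rows */
    size_t pitch;      /* CALL_PITCHED only: row stride returned by the runtime */
    size_t elem_bytes; /* CALL_ARRAY only */
} intercepted_call_t;

/* Each call shape needs its own rule to recover the allocated byte count. */
size_t allocated_bytes(const intercepted_call_t* c)
{
    assert(c != NULL);
    switch (c->kind) {
    case CALL_LINEAR:  return c->size;
    case CALL_PITCHED: return c->pitch * c->height;
    case CALL_ARRAY:   return c->width * c->height * c->elem_bytes;
    }
    return 0; /* unknown call shape */
}
```

Every additional allocation entry point would add another case like these on the tool side, which is the duplication a normalized record from rocprofiler would avoid.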

My suggested solution: add a MEMORY_ALLOCATION record kind with operations ALLOC, FREE, and potentially REALLOC, and a record payload something like:

#include <stddef.h> /* size_t */

typedef enum ROCPROFILER_ALLOCATION_KIND {
    ROCPROFILER_ALLOC_NONE,
    ROCPROFILER_ALLOC_HOST_PINNED,
    ROCPROFILER_ALLOC_UNIFIED,
    ROCPROFILER_ALLOC_DEVICE,
    /* ... any other kinds that are worth separating out */
    ROCPROFILER_ALLOC_GENERIC, /* covers the case where `hipFree` is used as a wildcard */
    ROCPROFILER_ALLOC_LAST
} ROCPROFILER_ALLOCATION_KIND;

typedef struct alloc_free_record {
    void* address;
    size_t bytes; /* undefined for operation FREE */
    ROCPROFILER_ALLOCATION_KIND kind;
} alloc_free_record_t;

Records of this kind would be delivered by both the callback and buffer tracing systems. This would let us register for allocation and free notifications that carry meaningful, correct memory kinds, and feed the data directly into our internal SCOREP_AllocMetric structures (designed for malloc/free/new/delete).
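As a rough illustration of how a tool might consume such records, the sketch below keeps per-kind live byte counts. It repeats the proposed enum and struct so that it compiles on its own. Everything else is an assumption: `alloc_op_t`, `on_alloc_record`, `live`, `live_bytes`, and `MAX_LIVE` are hypothetical, and neither the handler signature nor the fixed-size table reflects an existing rocprofiler or Score-P interface.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the enum and record proposed above; not an existing rocprofiler API. */
typedef enum {
    ROCPROFILER_ALLOC_NONE,
    ROCPROFILER_ALLOC_HOST_PINNED,
    ROCPROFILER_ALLOC_UNIFIED,
    ROCPROFILER_ALLOC_DEVICE,
    ROCPROFILER_ALLOC_GENERIC,
    ROCPROFILER_ALLOC_LAST
} ROCPROFILER_ALLOCATION_KIND;

typedef struct alloc_free_record {
    void*  address;
    size_t bytes; /* undefined for operation FREE */
    ROCPROFILER_ALLOCATION_KIND kind;
} alloc_free_record_t;

/* Hypothetical tags for the proposed ALLOC and FREE operations. */
typedef enum { OP_ALLOC, OP_FREE } alloc_op_t;

#define MAX_LIVE 1024 /* arbitrary capacity for this sketch */

static alloc_free_record_t live[MAX_LIVE];              /* currently live allocations */
static size_t              live_count;                  /* number of entries in live[] */
static size_t              live_bytes[ROCPROFILER_ALLOC_LAST]; /* live bytes per kind */

/* Hypothetical handler a tool would call for each delivered record. */
void on_alloc_record(alloc_op_t op, const alloc_free_record_t* rec)
{
    assert(rec != NULL && rec->kind < ROCPROFILER_ALLOC_LAST);
    if (op == OP_ALLOC) {
        if (live_count < MAX_LIVE) { /* silently drop when the table is full */
            live[live_count++] = *rec;
            live_bytes[rec->kind] += rec->bytes;
        }
        return;
    }
    /* FREE: bytes is undefined and kind may be GENERIC, so use the stored entry. */
    for (size_t i = 0; i < live_count; ++i) {
        if (live[i].address == rec->address) {
            live_bytes[live[i].kind] -= live[i].bytes;
            live[i] = live[--live_count]; /* swap-remove the freed entry */
            return;
        }
    }
}
```

Because `bytes` is undefined on FREE, the consumer has to remember the size and kind from the matching ALLOC. A real tool would use a hash map rather than a linear scan.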

Operating System

No response

GPU

No response

ROCm Component

No response
