Calling convention for custom gradients #1903

erick-xanadu · 2024-05-22T13:20:06Z

Hi,

I will be working on integrating Enzyme's custom gradients with some code, but I am having a bit of difficulty understanding the calling convention for custom gradients.

I already have the three functions (the function itself, the augmented forward pass, and the reverse pass), but they don't match the calling convention yet. I'm thinking of just writing wrappers around the forward and reverse passes. The function itself can be described as follows in MLIR's LLVM dialect (it does not return a value):

  llvm.func @f(%input0: !llvm.ptr, ..., %inputN: !llvm.ptr, %ret0: !llvm.ptr, ..., %retM: !llvm.ptr)

I've looked into the tests, and I found a test for the custom gradient that matches this signature closely.

Would it then be correct to say that the forward pass will have the following signature where the shadow arguments are never used and the tape can be a pointer?

  llvm.func @fwd(%input0: !llvm.ptr, %input0_shadow: !llvm.ptr, ...,
                 %inputN: !llvm.ptr, %inputN_shadow: !llvm.ptr,
                 %ret0: !llvm.ptr, %ret0_shadow: !llvm.ptr, ...,
                 %retM: !llvm.ptr, %retM_shadow: !llvm.ptr) -> !llvm.ptr

and the reverse pass will have the following signature:

  llvm.func @rev(%input0: !llvm.ptr, %input0_shadow: !llvm.ptr, ...,
                 %inputN: !llvm.ptr, %inputN_shadow: !llvm.ptr,
                 %ret0: !llvm.ptr, %ret0_shadow: !llvm.ptr, ...,
                 %retM: !llvm.ptr, %retM_shadow: !llvm.ptr,
                 %tape: !llvm.ptr)

where %ret*_shadow are initialized with the cotangents and it is this function's task to fill in %input*_shadow with the correct values.
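
To make the question concrete, here is a minimal C sketch of the shape I have in mind, reduced to one input and one output. Everything in it is an assumption for illustration: the names f, fwd, rev, and Tape, the fixed length N, the choice to accumulate into the input shadow with +=, and the idea that rev frees the tape. It does not call Enzyme at all. It only mirrors the argument order from the signatures above, so whether Enzyme actually expects this layout is exactly what I am asking.

```c
#include <stdlib.h>
#include <string.h>

#define N 3 /* fixed length, only for this sketch */

/* Hypothetical tape: a heap-allocated struct holding what rev needs. */
typedef struct {
    double *saved_in; /* copy of the primal input */
} Tape;

/* Primal function: out[i] = in[i]^2. Returns nothing, like @f. */
void f(const double *in, double *out) {
    for (int i = 0; i < N; ++i)
        out[i] = in[i] * in[i];
}

/* Augmented forward pass: the shadow arguments are accepted but unused.
 * Runs the primal and returns the tape as an opaque pointer. */
void *fwd(const double *in, double *in_shadow,
          double *out, double *out_shadow) {
    (void)in_shadow;
    (void)out_shadow;
    f(in, out);
    Tape *tape = malloc(sizeof *tape);
    tape->saved_in = malloc(N * sizeof(double));
    memcpy(tape->saved_in, in, N * sizeof(double));
    return tape;
}

/* Reverse pass: out_shadow holds the cotangents on entry.
 * Assumption: results are accumulated into in_shadow, and the
 * reverse pass owns the tape and releases it. */
void rev(const double *in, double *in_shadow,
         double *out, double *out_shadow, void *tape_ptr) {
    (void)in;
    (void)out;
    Tape *tape = tape_ptr;
    for (int i = 0; i < N; ++i)
        in_shadow[i] += 2.0 * tape->saved_in[i] * out_shadow[i];
    free(tape->saved_in);
    free(tape);
}
```

Calling fwd and then rev by hand, with the output shadow set to all ones and the input shadow zeroed, should leave 2 * in[i] in the input shadow. The tape here is a pointer to a struct of pointers, which is the layout I ask about in the P.S. below.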

Thanks!

P.S. The pointers all point to MLIR tensors/memrefs, and the tape will likely be a pointer to a struct of pointers to MLIR tensors/memrefs. Is it generally advised to avoid pointers in the tape and favour structs? I can easily return a struct of pointers instead of a pointer to a struct of pointers. Also, I noticed a test that uses sret instead of returning a struct. Is there a preferred/canonical way to return the tape?

erick-xanadu · 2024-07-05T14:36:57Z

Just a note, everything worked out. If I have time I'll try to create a PR with more examples of the calling convention and the return tape. Thanks!
