Calling convention for custom gradients #1903
Replies: 1 comment
Just a note, everything worked out. If I have time I'll try to create a PR with more examples of the calling convention and the return tape. Thanks!
Hi,
I will be working on integrating Enzyme's custom gradients with some code, but I am having a bit of difficulty understanding the calling convention for custom gradients.
I already have the three functions (the function itself, the augmented forward pass, and the reverse pass), but they don't match the calling convention yet. I'm thinking of just writing wrappers around the forward and reverse passes. The function itself can be described as follows in MLIR's LLVM Dialect (it does not return a value):
I've looked into the tests, and I found a test for the custom gradient that matches this signature closely.
Would it then be correct to say that the forward pass will have the following signature where the shadow arguments are never used and the tape can be a pointer?
and the reverse pass will have the following signature:
where `%ret*_shadow` are initialized with the cotangents, and it is this function's task to fill in `%input*_shadow` with the correct values.
Thanks!
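To make the shape described above concrete, here is a minimal plain-C sketch of the three functions for a toy primal that returns nothing. It does not use Enzyme at all, and every name in it (`square_into`, `augmented_square_into`, `reverse_square_into`, `demo_gradient`) is hypothetical. It also assumes that the reverse pass accumulates into the input shadow and clears the output shadow afterwards, and that the tape is an opaque heap pointer. None of this is confirmed as Enzyme's actual convention; it only illustrates the data flow the question describes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical primal: writes in[0]^2 into out[0] and returns nothing. */
void square_into(const double *in, double *out) {
    out[0] = in[0] * in[0];
}

/* Hypothetical augmented forward pass.
 * The shadow arguments are accepted but never used, as in the question.
 * The returned tape is an opaque pointer to whatever the reverse pass needs. */
void *augmented_square_into(const double *in, double *in_shadow,
                            double *out, double *out_shadow) {
    (void)in_shadow;
    (void)out_shadow;
    double *tape = malloc(sizeof *tape);
    assert(tape != NULL);
    tape[0] = in[0];          /* save the input value for the reverse pass */
    square_into(in, out);     /* run the primal computation */
    return tape;
}

/* Hypothetical reverse pass.
 * out_shadow holds the incoming cotangent; this function fills in in_shadow.
 * Assumption: accumulate with +=, then zero the consumed cotangent. */
void reverse_square_into(const double *in, double *in_shadow,
                         double *out, double *out_shadow, void *tape) {
    (void)in;
    (void)out;
    double saved_in = ((double *)tape)[0];
    in_shadow[0] += 2.0 * saved_in * out_shadow[0];
    out_shadow[0] = 0.0;
    free(tape);               /* the reverse pass owns and releases the tape */
}

/* Drives both passes by hand, the way an AD tool would call them. */
double demo_gradient(double x, double seed) {
    double in = x, in_shadow = 0.0;
    double out = 0.0, out_shadow = 0.0;
    void *tape = augmented_square_into(&in, &in_shadow, &out, &out_shadow);
    out_shadow = seed;        /* seed the cotangent of the output */
    reverse_square_into(&in, &in_shadow, &out, &out_shadow, tape);
    return in_shadow;
}
```

Calling `demo_gradient(x, 1.0)` runs the augmented forward pass, seeds the output cotangent with 1, and returns the derivative of the toy primal at `x`.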
P.S. The pointers all point to MLIR tensors/memrefs, and the tape will likely contain a pointer to a struct of pointers to MLIR tensors/memrefs. Is it generally advised to avoid pointers in the tape and favour structs? I can easily return a struct of pointers instead of a pointer to a struct of pointers. Also, I noticed a test that uses `sret` instead of returning a struct. Is there a preferred/canonical way to return the tape?
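For reference, the three tape-returning options mentioned in the P.S. can be sketched in plain C as below. The `Tape` type and all function names are hypothetical, and the sketch takes no position on which form Enzyme prefers. The out-parameter variant is only analogous to `sret`, which in LLVM marks a pointer argument as the destination for a struct return value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tape holding pointers to two saved buffers. */
typedef struct {
    double *saved_a;
    double *saved_b;
} Tape;

/* Option 1: return the struct of pointers by value. */
Tape tape_by_value(double *a, double *b) {
    Tape t = { a, b };
    return t;
}

/* Option 2: return a pointer to a heap-allocated struct of pointers.
 * The caller (or the reverse pass) must free it. */
Tape *tape_by_pointer(double *a, double *b) {
    Tape *t = malloc(sizeof *t);
    assert(t != NULL);
    t->saved_a = a;
    t->saved_b = b;
    return t;
}

/* Option 3: write the struct through a caller-provided pointer,
 * similar in spirit to an sret argument. */
void tape_by_out_param(Tape *result, double *a, double *b) {
    result->saved_a = a;
    result->saved_b = b;
}

/* Returns 1 when all three forms carry the same two pointers. */
int tape_layouts_agree(void) {
    double a = 1.0, b = 2.0;
    Tape v = tape_by_value(&a, &b);
    Tape *p = tape_by_pointer(&a, &b);
    Tape o;
    tape_by_out_param(&o, &a, &b);
    int same = v.saved_a == p->saved_a && v.saved_b == p->saved_b &&
               v.saved_a == o.saved_a && v.saved_b == o.saved_b;
    free(p);
    return same;
}
```

All three carry the same information. They differ in who allocates the storage and in how many indirections the reverse pass has to follow.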