-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add montgomery to vec_ops and example of that #566
base: V2
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, left 1 comment
icicle/include/vec_ops/vec_ops.cuh
Outdated
@@ -42,6 +44,7 @@ namespace vec_ops { | |||
false, // is_b_on_device | |||
false, // is_result_on_device | |||
false, // is_async | |||
false, // is_in_montgomery_form |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it would be better to separate the Montgomery form flag for inputs and the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We treat the same way for montgomery inputs in other configs as well, if changing it, best to change the apis all together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In MSM we let the user specify it per input. For example
icicle/icicle/include/msm/msm.cuh
Line 67 in 621676b
bool are_scalars_montgomery_form; /**< True if scalars are in Montgomery form and false otherwise. Default value: |
I am not aware of other APIs with this flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust need formatting, other than that lgtm
CHK_IF_RETURN(mont::from_montgomery(d_alloc_vec_a, n * sizeof(E), config.ctx.stream, d_alloc_vec_a)); | ||
is_d_alloc_vec_a_allocated = 1; | ||
d_vec_a = d_alloc_vec_a; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that in both cases you allocated and copy. You could do it like
Line 524 in 621676b
if (are_points_montgomery_form) { |
(1) allocate
(2) copy
(3) if mont --> inplace convert
nvmlDeviceGetHandleByIndex(0, &device); // for GPU 0 | ||
std::cout << "Icicle-Examples: vector mul / add / sub operations." << std::endl; | ||
char name[NVML_DEVICE_NAME_BUFFER_SIZE]; | ||
if (nvmlDeviceGetName(device, name, NVML_DEVICE_NAME_BUFFER_SIZE) == NVML_SUCCESS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what all this device name and power has to do with the example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two notes:
(1) there seems to be no tests for this feature (vec ops for montgomery data). I don't consider the example to be a test.
(2) In my opinion, the example is very long and it makes it hard to see that basically we can compute on montgomery data by setting a flag in the config struct.
icicle/src/vec_ops/vec_ops.cu
Outdated
} else { | ||
d_result = d_vec_a; | ||
} | ||
} else { | ||
if (!is_in_place) { | ||
d_result = result; | ||
} else { | ||
d_result = result = d_vec_a; | ||
// d_result = result = d_vec_a; // DEBUG - looks like a bug for in-place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking - not sure it's a bug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems not a bug (since there are, I believe, all the tests for all the cases) but a redundant assignment (since it's in-place so result == vec_a
and on-device so d_vec_a = vec_a
earlier). should be d_result= d_vec_a;
for clarity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test fails with
d_result= d_vec_a;
and passes with
d_result= result;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh, unfortunately
if (!is_in_place) {
d_result = result;
} else {
d_result = result;
}
I'm afraid - is logically incorrect 😏 so there must be a bug somewhere else too - still figuring it out, hope to find out asap
icicle/src/vec_ops/vec_ops.cu
Outdated
} | ||
|
||
int is_d_result_allocated = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use bool everywhere for all true/false unless it's multiple choice.
icicle/src/vec_ops/vec_ops.cu
Outdated
} | ||
|
||
int is_d_alloc_vec_b_allocated = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use bool everywhere for all true/false unless it's multiple choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great 👍🏻 and cool functionality, one note just Like @yshekel mentioned - there are few places where logic is copied and should be better not 😏 to avoid code duplication
@danny-shterman I agree with Yuval here. |
Describe the changes
This PR...
Linked Issues
Resolves #