Hi, I appreciate this crate; it's really well thought out. I am using it in a scientific project where performance is important, so I have a question. When I looked at a flamegraph of my calculations, almost 70% of the time is spent on heap allocation and freeing after adding those operations. I tried many changes but didn't succeed. The first transformation is like a generalized matrix multiplication (I am also wondering whether it is possible to multiply not a 1D lane but a 2D "multi-lane"),
and the second transformation transforms each lane using an FFT.
Thanks in advance!
Replies: 2 comments 5 replies
I haven't looked into the details, but I'd recommend first trying:
It looks like you need a function like `general_mat_vec_mul`, which ndarray has right there, but maybe I read the code wrong.
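To illustrate why a `general_mat_vec_mul`-style call can help with allocation overhead, here is a minimal dependency-free sketch of the same idea: compute `y ← α·A·x + β·y` into a buffer the caller already owns, so nothing is allocated per call. The helper `mat_vec_into` and its row-major slice layout are hypothetical and only for illustration; it is not ndarray's implementation, and the real function takes array views rather than slices.

```rust
/// Hypothetical helper (not part of ndarray): computes
/// y <- alpha * A * x + beta * y for a row-major matrix `a` with `cols`
/// columns, writing into the caller-provided `y` so the call allocates nothing.
pub fn mat_vec_into(
    alpha: f64,
    a: &[f64],
    cols: usize,
    x: &[f64],
    beta: f64,
    y: &mut [f64],
) {
    assert!(cols > 0, "matrix must have at least one column");
    assert_eq!(x.len(), cols, "x must have one entry per column");
    assert_eq!(a.len(), y.len() * cols, "a must hold y.len() rows");

    // Each row of `a` is paired with the output element it updates in place.
    for (row, out) in a.chunks_exact(cols).zip(y.iter_mut()) {
        let dot: f64 = row.iter().zip(x).map(|(m, v)| m * v).sum();
        *out = alpha * dot + beta * *out;
    }
}
```

The point of the design is that the output buffer is created once outside the hot loop and reused for every lane, instead of each multiplication returning a freshly allocated array.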
What do you mean by that? It's not an either/or. You can use what I suggested in a parallel Zip. Btw, if you consider this conversation resolved, please use "Mark as answer".
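As a rough sketch of how the two ideas combine, the following applies an in-place matrix-vector product to every lane while splitting the lanes across threads. It uses `std::thread::scope` purely as a stand-in for the parallel Zip mentioned above, so that the example has no dependencies; the function name `transform_lanes_parallel`, the flat row-major layout, and the `n_threads` parameter are all assumptions made for illustration.

```rust
use std::thread;

/// Hypothetical sketch: applies the row-major `rows x cols` matrix `a` to
/// every lane of `input` (lanes of length `cols`) and writes each result into
/// the matching lane of the preallocated `output` (lanes of length `rows`).
/// Lanes are split into contiguous groups, one group per scoped thread.
pub fn transform_lanes_parallel(
    a: &[f64],
    rows: usize,
    cols: usize,
    input: &[f64],
    output: &mut [f64],
    n_threads: usize,
) {
    assert!(rows > 0 && cols > 0 && n_threads > 0);
    assert_eq!(a.len(), rows * cols);
    assert_eq!(input.len() % cols, 0);
    let n_lanes = input.len() / cols;
    assert_eq!(output.len(), n_lanes * rows);
    if n_lanes == 0 {
        return;
    }

    // Ceiling division so that every lane is assigned to some thread.
    let lanes_per_thread = (n_lanes + n_threads - 1) / n_threads;

    thread::scope(|s| {
        let in_chunks = input.chunks(lanes_per_thread * cols);
        let out_chunks = output.chunks_mut(lanes_per_thread * rows);
        for (in_chunk, out_chunk) in in_chunks.zip(out_chunks) {
            s.spawn(move || {
                let lanes = in_chunk
                    .chunks_exact(cols)
                    .zip(out_chunk.chunks_exact_mut(rows));
                for (x, y) in lanes {
                    // In-place mat-vec for one lane; no allocation here.
                    for (row, out) in a.chunks_exact(cols).zip(y.iter_mut()) {
                        *out = row.iter().zip(x).map(|(m, v)| m * v).sum();
                    }
                }
            });
        }
    });
}
```

Each thread owns a disjoint mutable slice of the output, so no locking is needed and the only allocations are the ones made by spawning the threads themselves.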