Introduce Avx2 engine #1
Some relevant documentation: https://doc.rust-lang.org/core/arch/index.html
Right now, this PR completely breaks this library for x86 CPUs without AVX2. Here's my suggestion for how choosing this engine should behave:
And since the
An alternative would be to have
Assuming we go with

    #[cfg(all(not(feature = "no_unsafe"), any(target_arch = "x86", target_arch = "x86_64")))]
    pub type DefaultEngine = if is_x86_feature_detected!("avx2") { Avx2 } else { NoSimd };
    #[cfg(not(all(not(feature = "no_unsafe"), any(target_arch = "x86", target_arch = "x86_64"))))]
    pub type DefaultEngine = NoSimd;

But that doesn't work for a type alias. I tried with something like

    pub enum DefaultEngine {}
    impl DefaultEngine {
        pub fn new() -> impl Engine {
            #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
            {
                if is_x86_feature_detected!("avx2") {
                    return Avx2::new();
                }
            }
            NoSimd::new()
        }
    }

But that seems to require a modification of
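For readers following along: `-> impl Engine` needs a single concrete return type, so the `new()` attempt above cannot return `Avx2` from one branch and `NoSimd` from the other. One possible workaround is to give `DefaultEngine` one variant per engine and forward trait calls through a `match`. The sketch below is illustrative only. The `Engine` trait here is a hypothetical one-method stand-in (not the crate's real trait), and `name()` is invented for the demo.

```rust
// Hypothetical stand-in for the crate's `Engine` trait; the real trait
// has a different and larger interface.
pub trait Engine {
    fn name(&self) -> &'static str;
}

pub struct NoSimd;
pub struct Avx2;

impl Engine for NoSimd {
    fn name(&self) -> &'static str {
        "NoSimd"
    }
}

impl Engine for Avx2 {
    fn name(&self) -> &'static str {
        "Avx2"
    }
}

/// One concrete type that wraps whichever engine was picked at runtime.
pub enum DefaultEngine {
    NoSimd(NoSimd),
    Avx2(Avx2),
}

impl DefaultEngine {
    pub fn new() -> Self {
        // The detection macro only exists on x86 targets, hence the cfg.
        #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
        {
            if std::is_x86_feature_detected!("avx2") {
                return DefaultEngine::Avx2(Avx2);
            }
        }
        DefaultEngine::NoSimd(NoSimd)
    }
}

// Forward every trait method to the wrapped engine.
impl Engine for DefaultEngine {
    fn name(&self) -> &'static str {
        match self {
            DefaultEngine::NoSimd(e) => e.name(),
            DefaultEngine::Avx2(e) => e.name(),
        }
    }
}
```

The cost of this design is one `match` per trait call, which should be negligible next to the work done inside each call. A `Box<dyn Engine>` would be another option if the trait is object-safe.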
We could totally avoid Note: here's a footgun. By default, That is why most of the methods are annotated with We could drop the privileges again by using another layer of abstraction. I.e:
It adds quite a few "dummy" functions, but would move a lot of the code out of unsafe. Would you like to see an example of what that would look like?
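The example from this comment did not survive in this copy of the thread. As a rough illustration of the general pattern only, assuming the annotation in question is `#[target_feature(enable = "avx2")]`: such a function is typically declared `unsafe fn`, and a thin safe wrapper can check for the feature at runtime before calling it. The function names `xor`, `xor_avx2` and `xor_scalar` are hypothetical and not taken from the PR.

```rust
/// Portable fallback: XORs `src` into `dst` over their common length.
fn xor_scalar(dst: &mut [u8], src: &[u8]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d ^= *s;
    }
}

/// AVX2 version. Callers must ensure the CPU supports AVX2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn xor_avx2(dst: &mut [u8], src: &[u8]) {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let len = dst.len().min(src.len());
    let chunks = len / 32;
    for i in 0..chunks {
        // SAFETY: `i * 32 + 32 <= len`, so both 32-byte accesses are in
        // bounds; the unaligned load/store intrinsics need no alignment.
        unsafe {
            let d = dst.as_mut_ptr().add(i * 32) as *mut __m256i;
            let s = src.as_ptr().add(i * 32) as *const __m256i;
            let x = _mm256_xor_si256(
                _mm256_loadu_si256(d as *const __m256i),
                _mm256_loadu_si256(s),
            );
            _mm256_storeu_si256(d, x);
        }
    }
    // Handle the tail that does not fill a whole 32-byte vector.
    xor_scalar(&mut dst[chunks * 32..len], &src[chunks * 32..len]);
}

/// Safe entry point: the only place that needs an `unsafe` call.
pub fn xor(dst: &mut [u8], src: &[u8]) {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on the line above.
            unsafe { xor_avx2(dst, src) };
            return;
        }
    }
    xor_scalar(dst, src);
}
```

With this layering, code that calls `xor` stays entirely safe, and the safety argument lives in one place next to the feature check.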
Nice work! Unfortunately I've read that SIMD code of Leopard may have patent restrictions. For example this comment at FastECC:
I haven't been able to verify this (nor do I have enough knowledge to verify patent claims), but I've seen some evidence suggesting that this may be true. I can't take any risks concerning patents, so I don't think I can include any SIMD code for now. P.S. I don't have any experience with Rust SIMD, so currently I also can't review code like this. I've been meaning to learn that, but because of the patent problem I haven't even started on this.
Thanks!
Think I found the patent you are referring to: A breakdown of the patent can be found here: https://erasurecodepatents.wordpress.com/ It looks like StreamScale is having at least some success enforcing it: https://www.datanami.com/2023/10/16/cloudera-hit-with-240-million-judgement-over-erasure-coding/ But it sounds like Cloudera is going to appeal:
(Btw, the US District Court for the Western District of Texas is known for being very patent friendly: https://www.eff.org/deeplinks/2014/07/why-do-patent-trolls-go-texas-its-not-bbq ) Anyways, as that patent is only a risk for companies doing business in the US, how about making this code opt-in, and including a warning in the readme? There are many other open source projects that include variations of this code:
So they have successfully enforced it against a company and got an open-source project (GF-Complete & Jerasure) to quit. I can't take any risks with a patent like that. Also, my interest in this algorithm is long-term, and I have no problem waiting until 2032, when that patent expires, before adding SIMD. P.S. It's always possible to implement an AVX2 engine in a separate crate, if you or someone else thinks this patent isn't a problem.
That's totally fair. I'll fork and submit a separate crate.
For anyone interested, I've submitted a fork under the name
On AArch64 it uses the Neon SIMD instructions, and on x86(-64) either AVX2 or SSSE3 is used. The best implementation is chosen at runtime with fallback to plain Rust.
Hi!
First of all, thank you very much for writing this well structured library!
With inspiration from https://github.com/catid/leopard/ I wrote a new engine that leverages AVX2 instructions. On an AMD Ryzen 5 3600, encoding is sped up by up to 1700%, and decoding by up to about 1100%. Ex:
Full run: https://pastebin.com/m8AcP8T5