Improve performance for GetNames in .NET 8 #71

Open
silkfire opened this issue Oct 20, 2023 · 5 comments
@silkfire

In .NET 8, the performance of the Enum-related methods has improved considerably, although they are still not as fast as your excellent extension.

Recently, Nick Chapsas promoted your extension and benchmarked it on .NET 8. Interestingly, the source-generated GetNames method came out slower than the native method.

See https://youtu.be/UBY4Y6AykdM?si=VzbUusG5YOu1Ke2X&t=418

Could we work out a solution to improve the performance so it's on par or faster than the native equivalent?

@silkfire changed the title from "Improve performance for GetNames for .NET 8" to "Improve performance for GetNames in .NET 8" on Oct 20, 2023
@andrewlock
Owner

Thanks @silkfire! Huh, that's interesting. I had seen things about the enum perf improvements in .NET 8, so I had already run the tests in the repository myself and found that the extension version was still faster...

This was the test I used:

using System;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class GetNamesBenchmark
{
#if NETFRAMEWORK
    [Benchmark(Baseline = true)]
    [MethodImpl(MethodImplOptions.NoInlining)]
    public string[] EnumGetNames()
    {
        return Enum.GetNames(typeof(TestEnum));
    }
#else
    [Benchmark(Baseline = true)]
    [MethodImpl(MethodImplOptions.NoInlining)]
    public string[] EnumGetNames()
    {
        return Enum.GetNames<TestEnum>();
    }
#endif

    [Benchmark]
    [MethodImpl(MethodImplOptions.NoInlining)]
    public string[] ExtensionsGetNames()
    {
        return TestEnumExtensions.GetNames();
    }
}

Which gave these results:

BenchmarkDotNet v0.13.9+228a464e8be6c580ad9408e98f18813f6407fb5a, Windows 10 (10.0.19045.3448/22H2/2022Update)
Intel Core i7-7500U CPU 2.70GHz (Kaby Lake), 1 CPU, 4 logical and 2 physical cores
.NET SDK 8.0.100-rc.1.23463.5
  [Host]     : .NET 8.0.0 (8.0.23.41904), X64 RyuJIT AVX2
  Job-VRBYAA : .NET 8.0.0 (8.0.23.41904), X64 RyuJIT AVX2

Runtime=.NET 8.0  Toolchain=net8.0
| Type | Method | Mean | Error | StdDev | Median | Ratio | RatioSD | Gen0 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|
| GetNamesBenchmark | EnumGetNames | 14.3923 ns | 0.2631 ns | 0.2054 ns | 14.3414 ns | 0.729 | 0.02 | 0.0229 | 48 B | 1.00 |
| GetNamesBenchmark | ExtensionsGetNames | 7.0327 ns | 0.1008 ns | 0.0842 ns | 7.0475 ns | 0.357 | 0.01 | 0.0229 | 48 B | 1.00 |

i.e. the native version took about 14 ns and the extension about 7 ns, which directly contradicts Nick's findings 😅

Perhaps it's related to some SIMD work they're doing now? I ran these on a pretty old laptop, which only has 2 cores; if Nick's using something much beefier, the results might be different.

Alternatively, it could be related to the enum itself. My test enum only had three values:

[EnumExtensions]
public enum TestEnum
{
    First = 0,

    [Display(Name = "2nd")]
    Second = 1,
    Third = 2,
}

I can try testing with the same one as Nick (Day has 7 values) to see if there's any difference. Any other ideas @Elfocrash? 🤔

@silkfire
Author

silkfire commented Oct 20, 2023

Very interesting indeed! I think it's likely that Mr Chapsas' machine is a bit more modern than the one you tested with, as I'm getting results similar to his (using the TestEnum).

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22621
AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=8.0.100-rc.2.23502.2
  [Host]     : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT
  DefaultJob : .NET 8.0.0 (8.0.23.47906), X64 RyuJIT
| Method | Mean | Error | StdDev | Ratio | RatioSD | Gen 0 | Allocated |
|---|---|---|---|---|---|---|---|
| EnumGetNames | 8.459 ns | 0.0715 ns | 0.0634 ns | 1.00 | 0.00 | 0.0029 | 48 B |
| ExtensionsGetNames | 12.933 ns | 0.2027 ns | 0.1896 ns | 1.53 | 0.02 | 0.0029 | 48 B |

I checked the source code of the native method, and it seems to be written using heavily optimized code that leverages internal P/Invoke calls (QCall). Something similar is used in .NET 6 too, at least. Perhaps that's the one situation where the native method is hard to compete with, but I'm no expert at these micro-optimizations, so it's hard for me to tell.

@andrewlock
Owner

That is interesting. I just ran a test with 7 enum values to make sure that wasn't part of it, and I'm still getting the same overall results:

| Method | Mean | Error | StdDev | Ratio | Gen0 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|
| EnumGetNames | 19.249 ns | 0.3034 ns | 0.2534 ns | 1.00 | 0.0383 | 80 B | 1.00 |
| ExtensionsGetNames | 9.445 ns | 0.1123 ns | 0.0995 ns | 0.49 | 0.0383 | 80 B | 1.00 |

Will have to dig in further 🤔

@silkfire
Author

Could it be, as you mentioned in your previous reply, that more modern processors leverage hardware intrinsics, resulting in faster execution of the native method?

@andrewlock
Owner

Yeah, I'm assuming that's it (I'll check whether I can also reproduce it on my work machine rather than my old personal laptop). I'm guessing the magic happened in this PR 👀 dotnet/runtime#78580
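
To compare the two machines on equal terms, something like the following could be run on each. The first BenchmarkDotNet header above reports `X64 RyuJIT AVX2`, while the second only reports `X64 RyuJIT`, so the outputs don't show the same information. This is only a minimal sketch, assuming .NET 7 or later; `IntrinsicsReport` is a hypothetical name, not part of the extension or the runtime. It prints what the runtime reports as hardware accelerated and says nothing about whether `Enum.GetNames` actually uses any of these instruction sets.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper (not from the repository): lists the
// hardware-acceleration flags the current runtime reports.
public static class IntrinsicsReport
{
    // Returns each flag name together with the value reported on this machine.
    public static IReadOnlyDictionary<string, bool> Collect() =>
        new Dictionary<string, bool>
        {
            ["Vector.IsHardwareAccelerated"] = Vector.IsHardwareAccelerated,
            ["Vector128.IsHardwareAccelerated"] = Vector128.IsHardwareAccelerated,
            ["Vector256.IsHardwareAccelerated"] = Vector256.IsHardwareAccelerated,
            ["Sse42.IsSupported"] = Sse42.IsSupported,
            ["Avx2.IsSupported"] = Avx2.IsSupported,
        };

    public static void Main()
    {
        // Print one "name: value" line per flag so the two machines can be compared.
        foreach (var (name, value) in Collect())
        {
            Console.WriteLine($"{name}: {value}");
        }
    }
}
```

If both machines print the same flags, the difference is more likely to come from something else, such as the SDK build (rc.1 vs rc.2) or the BenchmarkDotNet version.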
