dist1ll/memset
Clears 4GiB of pre-faulted memory from 1, 2, 4, 8 and 16 threads.

Requirements:

- Linux
- x86-64 CPU with clzero

Example output for AMD Milan system:

    dist1ll@sys:~/memset$ ./memset
    [*] pre-faulted 4GiB of memory.
    [*] running 5 iterations per method...

    THREADS=1          min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop      295ms  296ms  297ms   14.50 GB/s
    memset_fsrm      393ms  394ms  395ms   10.89 GB/s
    memset_avx2      300ms  301ms  301ms   14.25 GB/s
    memset_avx2_nt   123ms  123ms  124ms   34.76 GB/s
    memset_clzero    122ms  124ms  125ms   34.59 GB/s

    THREADS=2          min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop      152ms  155ms  165ms   27.60 GB/s
    memset_fsrm      200ms  201ms  201ms   21.36 GB/s
    memset_avx2      159ms  159ms  160ms   26.88 GB/s
    memset_avx2_nt    62ms   62ms   63ms   68.89 GB/s
    memset_clzero     61ms   61ms   62ms   69.32 GB/s

    THREADS=4          min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop      111ms  113ms  114ms   37.99 GB/s
    memset_fsrm      105ms  111ms  119ms   38.36 GB/s
    memset_avx2       87ms   93ms  113ms   46.07 GB/s
    memset_avx2_nt    31ms   48ms   59ms   88.51 GB/s
    memset_clzero     31ms   31ms   32ms  134.87 GB/s

    THREADS=8          min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop       54ms   62ms   66ms   69.15 GB/s
    memset_fsrm       59ms   60ms   61ms   71.09 GB/s
    memset_avx2       54ms   55ms   56ms   77.27 GB/s
    memset_avx2_nt    18ms   20ms   25ms  213.78 GB/s
    memset_clzero     18ms   19ms   19ms  225.18 GB/s

    THREADS=16         min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop       54ms   55ms   57ms   76.92 GB/s
    memset_fsrm       53ms   53ms   54ms   79.63 GB/s
    memset_avx2       49ms   52ms   53ms   82.35 GB/s
    memset_avx2_nt    20ms   21ms   26ms  196.73 GB/s
    memset_clzero     20ms   20ms   20ms  210.86 GB/s

Output for Intel i5-9600k (Coffee Lake) (modified, removed clzero)

    dist1ll@sys:~/memset$ ./memset
    [*] pre-faulted 4GiB of memory.
    [*] running 5 iterations per method...

    THREADS=1          min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop      276ms  276ms  276ms   15.51 GB/s
    memset_fsrm      181ms  182ms  183ms   23.57 GB/s
    memset_avx2      270ms  270ms  271ms   15.85 GB/s
    memset_avx2_nt   115ms  115ms  115ms   37.28 GB/s

    THREADS=2          min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop      236ms  236ms  236ms   18.15 GB/s
    memset_fsrm      109ms  110ms  110ms   39.04 GB/s
    memset_avx2      238ms  238ms  238ms   18.00 GB/s
    memset_avx2_nt    92ms   93ms   93ms   46.17 GB/s

    THREADS=4          min    avg    max     avg GB/s
    -----------      -----  -----  -----   ----------
    memset_loop      254ms  255ms  255ms   16.82 GB/s
    memset_fsrm      152ms  154ms  155ms   27.86 GB/s
    memset_avx2      254ms  254ms  255ms   16.87 GB/s
    memset_avx2_nt    93ms   93ms   94ms   45.70 GB/s
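To relate the `avg` and `avg GB/s` columns, here is a minimal sketch of the conversion. It assumes that GB means 10^9 bytes and that the buffer is exactly 4 GiB; both are inferred from the numbers above, not taken from the benchmark's source. `BUF_BYTES` and `gb_per_s` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each iteration clears exactly 4 GiB (2^32 bytes). */
#define BUF_BYTES (4ULL << 30)

/* Convert a wall-clock time in milliseconds into decimal GB/s
 * (1 GB = 10^9 bytes, an assumption inferred from the tables). */
static double gb_per_s(uint64_t bytes, double ms) {
    assert(ms > 0.0);
    return ((double)bytes / 1e9) / (ms / 1e3);
}
```

For the single-threaded `memset_loop` row, `gb_per_s(BUF_BYTES, 296.0)` gives roughly 14.5 GB/s. The printed figures can differ slightly from this because the times in the table are rounded to whole milliseconds.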
About
Multithreaded store bandwidth benchmark with FSRM, AVX2 and non-temporal stores
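As an illustration of two of the store variants named here, the following is a rough sketch and not the repository's code. It assumes that `memset_fsrm` refers to a `rep stosb` based fill and that `memset_avx2_nt` uses 32-byte streaming stores. The helper names `zero_rep_stosb` and `zero_avx2_nt` are hypothetical, the inline assembly and `target` attribute assume GCC or Clang on x86-64, and the non-temporal variant assumes a 32-byte aligned buffer whose length is a multiple of 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_X86_STORES 1
#else
#define HAVE_X86_STORES 0
#endif

/* Hypothetical helper: zero `len` bytes with a single "rep stosb". */
static void zero_rep_stosb(void *dst, size_t len) {
#if HAVE_X86_STORES
    /* RDI = destination, RCX = byte count, AL = fill value. */
    __asm__ volatile("rep stosb"
                     : "+D"(dst), "+c"(len)
                     : "a"(0)
                     : "memory");
#else
    memset(dst, 0, len); /* fallback on other targets */
#endif
}

#if HAVE_X86_STORES
/* Compiled for AVX2 regardless of the global compiler flags. */
__attribute__((target("avx2")))
static void zero_avx2_nt_impl(void *dst, size_t len) {
    const __m256i zero = _mm256_setzero_si256();
    char *p = dst;
    for (size_t i = 0; i < len; i += 32) {
        /* Streaming store: written with a non-temporal hint. */
        _mm256_stream_si256((__m256i *)(p + i), zero);
    }
    _mm_sfence(); /* order the streaming stores before later accesses */
}
#endif

/* Hypothetical helper: zero with non-temporal stores when the CPU
 * supports AVX2, otherwise fall back to memset.
 * Assumes dst is 32-byte aligned and len is a multiple of 32. */
static void zero_avx2_nt(void *dst, size_t len) {
    assert((uintptr_t)dst % 32 == 0 && len % 32 == 0);
#if HAVE_X86_STORES
    if (__builtin_cpu_supports("avx2")) {
        zero_avx2_nt_impl(dst, len);
        return;
    }
#endif
    memset(dst, 0, len);
}
```

Non-temporal stores write with a hint to bypass the cache hierarchy, so the destination lines do not have to be read before they are overwritten. A benchmark would call one of these helpers per thread on a disjoint slice of the buffer and time the whole pass.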