AVX-512 Optimization For Linux RAID Showing Up To 41% Improvement On AMD Ryzen 9 9950X

Written by Michael Larabel in Linux Storage on 12 June 2026 at 06:42 AM EDT. 11 Comments

Linux cryptography subsystem expert Eric Biggers of Google has worked on some pretty nice Intel/AMD x86_64 optimizations over the years. His AVX-512 optimizations within the Linux kernel's crypto code in particular have been among his many nice improvements to the kernel in recent times. Today he's out with another enticing AVX-512 optimization, and this time it's for the software RAID code.

Biggers has written an AVX-512 optimized xor_gen() function for the RAID code. The Linux kernel's xor_gen() function is used for generating and validating parity blocks such as for RAID5/RAID6. In today's patch description he details the implementation and the hardware it targets: AMD Zen 4 and newer, Intel Sapphire Rapids and newer on the server side, and on the Intel client side either Rocket Lake or the upcoming Nova Lake.
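For readers less familiar with how XOR parity works, here is a minimal Python sketch of the idea. It is not the kernel code: the xor_parity() helper and the stripe contents are made up for illustration, and the kernel works on much larger blocks with vector instructions.

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length byte blocks together into one parity block (hypothetical helper)."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # Treat each block as one big integer so each XOR covers the whole block.
    acc = reduce(lambda x, y: x ^ y, (int.from_bytes(b, "little") for b in blocks))
    return acc.to_bytes(length, "little")

data = [b"disk0 data", b"disk1 data", b"disk2 data"]  # made-up stripe contents
parity = xor_parity(data)

# Simulate losing disk 1: XORing the survivors with the parity rebuilds it.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == data[1]
```

The same XOR pass serves for both generating parity and checking it: XORing all data blocks together with their parity block should give all zeros.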

"Add an implementation of xor_gen() using AVX-512.

It uses 512-bit vectors, i.e. ZMM registers. It also uses the vpternlogq instruction to do three-input XORs when applicable.

It's enabled on x86_64 CPUs that have AVX512F && !PREFER_YMM. In practice that means:

- AMD Zen 4 and later (client and server)
- Intel Sapphire Rapids and later (server)
- Intel Rocket Lake (client)
- Intel Nova Lake and later (client)

The !PREFER_YMM condition excludes the older AVX-512 implementations in Intel Skylake Server and Intel Ice Lake. They could run this code, but they're known to have overly-eager downclocking when ZMM registers are used. This is the same policy that the crypto and CRC code uses."
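To illustrate the three-input XOR mentioned in the patch description, below is a rough Python emulation of a single 64-bit lane of a ternary-logic operation of the vpternlogq kind. The ternlog64() function and the test values are hypothetical, and the sketch only derives the truth-table constant that corresponds to a three-way XOR; it makes no claim about the exact constants or code in the actual patch.

```python
MASK64 = (1 << 64) - 1  # one 64-bit lane; a 512-bit ZMM register holds eight of these

def ternlog64(a, b, c, imm8):
    """Emulate one 64-bit lane of a three-input bitwise ternary-logic op.

    For each bit position, the three input bits form a 3-bit index
    (a is the high bit, c the low bit) that selects one bit of the
    8-bit truth table imm8.
    """
    result = 0
    for bit in range(64):
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        result |= ((imm8 >> idx) & 1) << bit
    return result

# Truth table for a ^ b ^ c: the output bit is 1 when an odd number of inputs are 1.
XOR3_IMM = sum(1 << idx for idx in range(8) if bin(idx).count("1") % 2 == 1)

a, b, c = 0x0123456789ABCDEF, 0xFEDCBA9876543210, 0x0F0F0F0F0F0F0F0F  # arbitrary values
assert XOR3_IMM == 0x96
assert ternlog64(a, b, c, XOR3_IMM) == (a ^ b ^ c) & MASK64
```

The appeal is that one instruction combines three inputs, so XORing several source blocks together takes fewer XOR operations than chaining two-input XORs.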

Where it gets really exciting is the improvement from this AVX-512 implementation. In testing on an AMD Ryzen 9 9950X (Zen 5) desktop processor, the improvement is between 19% and 41%:

[Image: AVX-512 RAID optimization benchmark]

Pretty damn nice improvement on top of all the other AVX-512 optimizations made by Eric Biggers in recent times. Hopefully this patch will work its way to the mainline kernel in the near future.
