Description
This issue will track work being done to support the AVX-512 ISA extensions in .NET.
A GitHub project also exists to do this tracking: AVX-512
GitHub query for all issues and pull requests tagged arch-avx512: https://github.com/dotnet/runtime/labels/arch-avx512
We will alter how we track this work based on experience with both mechanisms.
The following work is all planned for .NET 8. If we determine work will not make .NET 8, it will be noted.
RyuJIT feature work
- Implement AVX-512 state support in VM. Adding zmmStateSupport and AVX512F, AVX512CD, AVX512BW, AVX512DQ and AVX512VL ISAs. #74113
- Implement EVEX encoding. Adding EVEX encoding support for emitOutputRRR(). #75934 and related.
- Add support for new AVX-512 vector registers. Enable AVX512 Additional 16 SIMD Registers #79544
- Enable 512-bit SIMD types in .NET Runtime and JIT.
- Optimize lsraRegOrder and buildPhysRegRecords to skip non-AVX512 registers if AVX512 is not available. #81847
- Implement light-weight K register ExtractMostSignificantBits for Vector512. #80820
- Enable 512-bit SIMD types in .NET Runtime and JIT #80810. Initial support for zmm in .NET #80960
- Implement minimal Vector512 hardware accelerated lowering in the JIT. #80811
- Implement Vector512<T> API surface for Vector512 hardware acceleration. #80814
- Add opmask (k) registers to the register allocator. #80823
- Enable EVEX embedded broadcasting support in xarch emitter. #80825
- Add optimization for vector conversion of uint/ulong to float/double. #89277
- Implement System.Runtime.Intrinsics.X86.Avx512F lowerings in JIT. #80865
- Implement System.Runtime.Intrinsics.X86.Avx512F.VL lowerings in JIT. #80867
- Implement System.Runtime.Intrinsics.X86.Avx512BW lowerings in JIT. #80868
- Implement System.Runtime.Intrinsics.X86.Avx512CD lowerings in JIT. #80869
- Implement System.Runtime.Intrinsics.X86.Avx512DQ lowerings in JIT. #80870
- Implement permanent solution to integrate simd64 type to VecCon node. #82312
- Upgrading Vector256/512 Shuffle() with VBMI support #87083
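As background for the ExtractMostSignificantBits item above: the operation gathers the most significant bit of every vector element into an integer mask, which is the kind of value an opmask (k) register holds. The following is a scalar C++ sketch of those semantics for 64 byte-sized elements. The helper names are hypothetical and this is a reference model under that assumption, not the JIT's implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar reference model (hypothetical helper, not RyuJIT code):
// bit i of the result is the most significant bit of element i.
inline std::uint64_t extract_msb_mask(const std::array<std::uint8_t, 64>& lanes) {
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < lanes.size(); ++i) {
        mask |= static_cast<std::uint64_t>(lanes[i] >> 7) << i;
    }
    return mask;
}

// Usage example: only elements 0 and 3 have their top bit set.
inline void msb_mask_example() {
    std::array<std::uint8_t, 64> lanes{};  // all elements start at zero
    lanes[0] = 0x80;
    lanes[3] = 0xFF;
    lanes[63] = 0x7F;  // top bit clear, so bit 63 of the mask stays 0
    assert(extract_msb_mask(lanes) == 0x9u);
}
```

A 512-bit vector of bytes has 64 elements, so the mask fits exactly in one 64-bit integer; wider element types produce proportionally narrower masks.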
RyuJIT optimization work
- Optimize block unrolling operations using AVX-512 #83798
- AVX-512 throughput improvement opportunities #83946
- Enable AVX-512 in Memmove unrolling #84348
- Use AVX-512 in LowerCallMemcmp #84854
- Enable AVX-512 for string/span Equals/StartsWith #84885
- Enable EVEX feature: embedded broadcast for Vector128/256/512.Add() in limited cases #84821
- Optimize scalar conversions with AVX512 #84384
- Use AVX512 to zero locals #91166
CI/testing work
- Enable CI testing on AVX-512 machines on Windows. Add AVX-512 testing pipeline #77930
- Enable CI testing on AVX-512 machines on Linux. Enable runtime-coreclr jitstress-isas-avx512 pipeline on Linux #79417
- Create AVX-512 AzDO pipeline for testing with AVX-512 stress modes. Add AVX-512 testing pipeline #77930
- Enable performance lab AVX-512 testing
- Test performance in ASP.NET performance lab
VM work
- Update Linux/Mac Context State for AVX512 #81846
- Ensure XMM16-XMM31 and K0-K7 are handled where appropriate #84087
Debugging / diagnostics work
- AVX-512 debugger support: breakpoints #87843
- Ensure debugging works with AVX-512 types and intrinsics.
Libraries work
- Light up Span with Vector512 code paths. #80824, Upgrading SpanHelpers with Vector512 #86655
- Light up Ascii.Utility methods with Vector512 code paths. #89280
API design work
- Expose Vector512<T> to support the x86/x64 AVX-512 instruction set #73262
- Expose System.Runtime.Intrinsics.X86.Avx512F #73604
- Expose System.Runtime.Intrinsics.X86.Avx512F.VL #74813
- (Will not implement) Expose VectorMask<T> to support generic masking for Vector<T> #74613
- Expose AVX512BW, AVX512CD, and AVX512DQ #76579
Note: all API implementation work that has been planned for .NET 8 has been completed. There are a few remaining "esoteric" APIs that still need to be completed, and that work has been moved to .NET 9. The linked issues will not be closed until the entire API surface area is complete, due to API design issue tracking rules.
Related work
- Vector{<T>|64|128|256|512}.Narrow with saturation #75724
- Initial Draft for Vector SIMD Codegen Enhancements designs#268
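To illustrate what "Narrow with saturation" refers to in the list above, here is a scalar C++ sketch contrasting a truncating narrow with a saturating one for a single 32-bit to 16-bit conversion. The function names are hypothetical and are not the API proposed in #75724; a vector version would apply the same rule to every element.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Truncating narrow: keeps the low 16 bits, so out-of-range values wrap
// (two's-complement wrap-around, as on all mainstream compilers).
inline std::int16_t narrow_truncate(std::int32_t value) {
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(value));
}

// Saturating narrow: clamps to the int16 range before converting.
inline std::int16_t narrow_saturate(std::int32_t value) {
    constexpr std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    constexpr std::int32_t hi = std::numeric_limits<std::int16_t>::max();
    if (value < lo) return static_cast<std::int16_t>(lo);
    if (value > hi) return static_cast<std::int16_t>(hi);
    return static_cast<std::int16_t>(value);
}

// Usage example: 40000 does not fit in 16 signed bits.
inline void narrow_example() {
    assert(narrow_truncate(40000) == -25536);  // wrapped
    assert(narrow_saturate(40000) == 32767);   // clamped to the maximum
    assert(narrow_saturate(1234) == narrow_truncate(1234));  // in range: identical
}
```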
Future work
The following work items were not implemented for .NET 8 and will be considered for .NET 9.
RyuJIT feature work
- Add EVEX encoding opmask (k) register masking for per-instruction opmask to xarch emitter. #80821
- Enable EVEX embedded rounding support in xarch emitter.
- Add optimization for scalar/vector conversion of uint32/uint64 to/from packed float/double. #80829
- Finish Avx512 specific lightup for Vector128/256/512<T> #85207
RyuJIT optimization work
- AVX512: Fold some bitwise operations to vpternlogq #84534
- Add optimization for scalar conversion of float/double to ulong. #89279
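For context on the vpternlogq item above: the instruction evaluates an arbitrary three-input bitwise function selected by an 8-bit truth table, which is why a chain such as (a & b) | c can be folded into one instruction. The C++ sketch below is a scalar model of that behavior and shows the common trick for deriving the table byte by applying the desired expression to the constants 0xF0, 0xCC and 0xAA. All names here are hypothetical; this illustrates the idea and is not the JIT's folding logic.

```cpp
#include <cassert>
#include <cstdint>

// Truth-table inputs: bit i of each constant is the value of that operand
// for table index i = (a << 2) | (b << 1) | c.
constexpr std::uint8_t kA = 0xF0;
constexpr std::uint8_t kB = 0xCC;
constexpr std::uint8_t kC = 0xAA;

// Scalar model of a three-input bitwise op driven by an 8-bit truth table.
inline std::uint64_t ternary_logic(std::uint64_t a, std::uint64_t b,
                                   std::uint64_t c, std::uint8_t table) {
    std::uint64_t result = 0;
    for (int bit = 0; bit < 64; ++bit) {
        // Build the 3-bit table index from the three input bits.
        const unsigned index = static_cast<unsigned>(
            (((a >> bit) & 1u) << 2) | (((b >> bit) & 1u) << 1) | ((c >> bit) & 1u));
        result |= static_cast<std::uint64_t>((table >> index) & 1u) << bit;
    }
    return result;
}

// Usage example: derive the table for (a & b) | c and check it.
inline void ternary_logic_example() {
    constexpr std::uint8_t and_or = static_cast<std::uint8_t>((kA & kB) | kC);
    const std::uint64_t a = 0x0123456789ABCDEFull;
    const std::uint64_t b = 0xFEDCBA9876543210ull;
    const std::uint64_t c = 0x00FF00FF00FF00FFull;
    assert(and_or == 0xEA);
    assert(ternary_logic(a, b, c, and_or) == ((a & b) | c));
}
```

The same derivation works for any expression over three inputs, for example a ^ b ^ c yields the table byte 0x96.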
CI/testing work
VM work
Debugging / diagnostics work
- AVX-512 debugger support: view registers #87854
- Ensure ELT (enter/leave/tailcall hooks, for profiling) works.
Libraries work
- Light up Utf8/Utf16 code with Vector512. Light up Utf8Utility.*.cs and Utf16Utility.*.cs with Vector512 code paths. #86119
API design work