For the complete Mojo documentation index, see llms.txt. Markdown versions of all pages are available by appending .md to any URL (e.g. /docs/manual/basics.md).
mma_amd
AMD CDNA Matrix Cores implementation for matrix multiply-accumulate operations.
This module provides MMA implementations for AMD CDNA2, CDNA3, and CDNA4 data center GPUs using the MFMA (Matrix Fused Multiply-Add) instructions.
Reference: https://gpuopen.com/learn/amd-lab-notes/amd-lab-notes-matrix-cores-readme/