Version: Nightly

load_matrix_b

load_matrix_b[m: Int, n: Int, k: Int](b_ptr: UnsafePointer[Float32], tile_row: Int, tile_col: Int, ldm: Int) -> SIMD[DType.float32, 2]

Loads a tile of matrix B from memory to registers for TF32 tensor core operations.

Constraints:

The tile dimensions must be m=16, n=8, k=8.

Parameters:

  • m (Int): Number of rows in the output matrix tile.
  • n (Int): Number of columns in the output matrix tile.
  • k (Int): Inner dimension for matrix multiplication.

Args:

  • b_ptr (UnsafePointer[Float32]): Pointer to matrix B data in memory.
  • tile_row (Int): Starting row index of the tile.
  • tile_col (Int): Starting column index of the tile.
  • ldm (Int): Leading dimension of matrix B (stride between consecutive rows, in elements).

Returns:

SIMD[DType.float32, 2]: SIMD vector containing 2 TF32 values loaded from matrix B in the required order.
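
This page does not spell out the "required order". The sketch below shows one plausible reading. It assumes a row-major B addressed through `ldm`, and it assumes the function follows the NVIDIA PTX `mma.m16n8k8` TF32 B-fragment layout, where each of the 32 lanes in a warp holds two elements of the 8x8 tile. Neither assumption is confirmed by this page. The helper names `b_offset`, `tf32_b_row`, and `tf32_b_col` are hypothetical and are not part of the Mojo standard library.

```mojo
# Hypothetical helpers for illustration only; not part of the Mojo stdlib.
# Assumes row-major B and the PTX m16n8k8 TF32 B-fragment layout.


fn b_offset(tile_row: Int, tile_col: Int, row: Int, col: Int, ldm: Int) -> Int:
    # ldm is taken to be the stride between consecutive rows, in elements.
    return (tile_row + row) * ldm + (tile_col + col)


fn tf32_b_row(lane: Int, i: Int) -> Int:
    # Element i (0 or 1) of a lane's fragment: rows lane % 4 and lane % 4 + 4.
    return lane % 4 + 4 * i


fn tf32_b_col(lane: Int) -> Int:
    # Each group of four consecutive lanes shares one column of the tile.
    return lane // 4


fn main():
    var ldm = 64  # example row stride, chosen arbitrarily
    for lane in range(32):
        var col = tf32_b_col(lane)
        # Print the lane and the two element offsets it would read.
        print(
            lane,
            b_offset(0, 0, tf32_b_row(lane, 0), col, ldm),
            b_offset(0, 0, tf32_b_row(lane, 1), col, ldm),
        )
```

Under these assumptions, the two values in a lane's vector sit four rows apart in B, so their memory addresses differ by `4 * ldm` elements.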

load_matrix_b[m: Int, n: Int, k: Int](b_ptr: UnsafePointer[Float16], tile_row: Int, tile_col: Int, ldm: Int) -> SIMD[DType.float16, 2]

Loads a tile of matrix B from memory to registers for FP16 tensor core operations.

Constraints:

The tile dimensions must be m=16, n=8, k=8.

Parameters:

  • m (Int): Number of rows in the output matrix tile.
  • n (Int): Number of columns in the output matrix tile.
  • k (Int): Inner dimension for matrix multiplication.

Args:

  • b_ptr (UnsafePointer[Float16]): Pointer to matrix B data in memory.
  • tile_row (Int): Starting row index of the tile.
  • tile_col (Int): Starting column index of the tile.
  • ldm (Int): Leading dimension of matrix B (stride between consecutive rows, in elements).

Returns:

SIMD[DType.float16, 2]: SIMD vector containing 2 FP16 values loaded from matrix B in the required order.
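
As a rough illustration of how the two FP16 values per lane could map onto the 8x8 B tile, the sketch below assumes the NVIDIA PTX `mma.m16n8k8` FP16 B-fragment layout, in which a lane holds two vertically adjacent elements of one column. This layout is an assumption, not something stated on this page. The helpers `fp16_b_row`, `fp16_b_col`, and `fp16_tile_checksum` are hypothetical names.

```mojo
# Hypothetical helpers for illustration only; not part of the Mojo stdlib.
# Assumes the PTX m16n8k8 FP16 B-fragment layout.


fn fp16_b_row(lane: Int, i: Int) -> Int:
    # Element i (0 or 1) of a lane's fragment: rows 2 * (lane % 4) + i.
    return 2 * (lane % 4) + i


fn fp16_b_col(lane: Int) -> Int:
    # Each group of four consecutive lanes shares one column of the tile.
    return lane // 4


fn fp16_tile_checksum() -> Int:
    # Sum the flattened 8x8 tile index (row * 8 + col) over all 32 lanes
    # and both elements per lane.
    var total = 0
    for lane in range(32):
        for i in range(2):
            total += fp16_b_row(lane, i) * 8 + fp16_b_col(lane)
    return total


fn main():
    # Rows and column touched by lane 5 under the assumed layout.
    print(fp16_b_row(5, 0), fp16_b_row(5, 1), fp16_b_col(5))
    # Sanity check: compare against the sum of 0..63, which is 2016.
    print(fp16_tile_checksum())
```

The checksum is only a sanity check: 32 lanes with 2 values each give 64 reads, which matches the 64 elements of a k=8 by n=8 tile.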

load_matrix_b[m: Int, n: Int, k: Int](b_ptr: UnsafePointer[BFloat16], tile_row: Int, tile_col: Int, ldm: Int) -> SIMD[DType.bfloat16, (k // 4)]

Loads a tile of matrix B from memory to registers for BF16 tensor core operations.

Constraints:

The tile dimensions must be m=16, n=8, and either k=8 or k=16.

Parameters:

  • m (Int): Number of rows in the output matrix tile.
  • n (Int): Number of columns in the output matrix tile.
  • k (Int): Inner dimension for matrix multiplication.

Args:

  • b_ptr (UnsafePointer[BFloat16]): Pointer to matrix B data in memory.
  • tile_row (Int): Starting row index of the tile.
  • tile_col (Int): Starting column index of the tile.
  • ldm (Int): Leading dimension of matrix B (stride between consecutive rows, in elements).

Returns:

SIMD[DType.bfloat16, (k // 4)]: SIMD vector containing k // 4 BF16 values (2 when k=8, 4 when k=16) loaded from matrix B in the required order.
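
The `k // 4` width is consistent with a k x 8 tile of B shared evenly among the 32 lanes of a warp: `(k * 8) // 32 == k // 4`. The sketch below works through that arithmetic and shows which rows a lane might read. The row mapping assumes the NVIDIA PTX `mma.m16n8k8` and `mma.m16n8k16` BF16 B-fragment layouts, which this page does not confirm. The names `bf16_b_fragment_width` and `bf16_b_row` are hypothetical.

```mojo
# Hypothetical helpers for illustration only; not part of the Mojo stdlib.
# Assumes a 32-lane warp and the PTX BF16 B-fragment layouts.


fn bf16_b_fragment_width(k: Int) -> Int:
    # A k x 8 tile spread evenly over 32 lanes: (k * 8) // 32 == k // 4.
    return k // 4


fn bf16_b_row(lane: Int, i: Int) -> Int:
    # Elements 0 and 1 sit in adjacent rows starting at 2 * (lane % 4).
    # Elements 2 and 3 (present only when k=16) sit 8 rows further down.
    return 2 * (lane % 4) + (i % 2) + 8 * (i // 2)


fn main():
    # Values per lane for the two supported inner dimensions.
    print(bf16_b_fragment_width(8), bf16_b_fragment_width(16))
    # Rows read by lane 5 for k=16 under the assumed layout.
    for i in range(bf16_b_fragment_width(16)):
        print(bf16_b_row(5, i))
```

Because the width depends on the compile-time parameter `k`, the returned vector type itself changes between the k=8 and k=16 instantiations.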