IMPORTANT: To view this page as Markdown, append `.md` to the URL (e.g. /docs/manual/basics.md). For the complete Mojo documentation index, see llms.txt.
Version: 1.0

create_tma_tile

create_tma_tile[*tile_sizes: Int, *, swizzle_mode: TensorMapSwizzle = TensorMapSwizzle.SWIZZLE_NONE](ctx: DeviceContext, tensor: LayoutTensor[address_space=tensor.address_space, element_layout=tensor.element_layout, layout_int_type=tensor.layout_int_type, linear_idx_type=tensor.linear_idx_type, masked=tensor.masked, alignment=tensor.alignment]) -> TMATensorTile[tensor.dtype, 2, IndexList(tile_sizes.values[0], tile_sizes.values[1], __list_literal__=NoneType(None))]

Creates a TMATensorTile with specified tile dimensions and swizzle mode.

This function creates a descriptor for the hardware Tensor Memory Accelerator (TMA), which performs asynchronous data transfers between global memory and shared memory. The tile dimensions and swizzle mode passed as parameters determine the shape of each transfer and its memory access pattern.

Constraints:

  • The last dimension's size in bytes must not exceed the swizzle mode's byte limit (32B for SWIZZLE_32B, 64B for SWIZZLE_64B, 128B for SWIZZLE_128B).
  • This overload supports only 2D tensors.
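
The byte-limit constraint above comes down to simple arithmetic: the tile width in elements, multiplied by the element size in bytes, must not exceed the limit for the chosen swizzle mode. The following is a minimal plain-Python sketch of that check, not part of the Mojo API. The helper name `last_dim_fits` and the string mode names are hypothetical, and treating `SWIZZLE_NONE` as unconstrained is an assumption, since the constraints list no limit for it.

```python
# Byte limits per swizzle mode, as listed in the constraints above.
SWIZZLE_BYTE_LIMIT = {
    "SWIZZLE_32B": 32,
    "SWIZZLE_64B": 64,
    "SWIZZLE_128B": 128,
}


def last_dim_fits(tile_width: int, dtype_bytes: int, swizzle_mode: str) -> bool:
    """Return True if the tile's last dimension fits the swizzle byte limit.

    Hypothetical helper for illustration only.
    """
    if swizzle_mode == "SWIZZLE_NONE":
        # Assumption: no limit is documented for SWIZZLE_NONE.
        return True
    # Size of the last dimension in bytes = elements * bytes per element.
    return tile_width * dtype_bytes <= SWIZZLE_BYTE_LIMIT[swizzle_mode]
```

For example, a tile that is 64 elements wide with 2-byte elements (such as bfloat16) occupies 128 bytes in its last dimension, so it satisfies the limit for SWIZZLE_128B but not for SWIZZLE_64B.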

Parameters:

  • *tile_sizes (Int): The dimensions of the tile to be transferred. For 2D tensors, this should be [height, width]. The dimensions determine the shape of data transferred in each TMA operation.
  • swizzle_mode (TensorMapSwizzle): The swizzling mode to use for memory access optimization. Swizzling can improve memory access patterns for specific hardware configurations.
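
Because each TMA operation moves one tile of shape [height, width], it can help to count how many tile-sized transfers cover a given 2D tensor. The following plain-Python sketch does only that bookkeeping. The helpers `tile_grid` and `tile_origins` are hypothetical and not part of the Mojo API, and the sketch makes no claim about how the hardware treats partial tiles at the tensor edges.

```python
def tile_grid(tensor_shape, tile_sizes):
    """Return (tiles_along_rows, tiles_along_cols) needed to cover the tensor."""
    rows, cols = tensor_shape
    tile_h, tile_w = tile_sizes
    # Ceiling division: a partial tile at the edge still counts as one tile.
    return -(-rows // tile_h), -(-cols // tile_w)


def tile_origins(tensor_shape, tile_sizes):
    """List the (row, col) element coordinates where each tile starts."""
    grid_h, grid_w = tile_grid(tensor_shape, tile_sizes)
    tile_h, tile_w = tile_sizes
    return [(i * tile_h, j * tile_w) for i in range(grid_h) for j in range(grid_w)]
```

With a 128 x 256 tensor and tile sizes [64, 32], this gives a 2 x 8 grid, which is 16 tile-sized transfers in total.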

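Args:

  • ctx (DeviceContext): The device context used to create the TMA descriptor.
  • tensor (LayoutTensor): The source tensor from which tiles are transferred.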

Returns:

TMATensorTile[tensor.dtype, 2, IndexList(tile_sizes.values[0], tile_sizes.values[1], __list_literal__=NoneType(None))]: A TMATensorTile configured with the specified tile dimensions and swizzle mode, ready for use in asynchronous data transfer operations.

Raises:

If TMA descriptor creation fails.