Classes
struct	mlp_activation_impl
struct	mlp_activation_impl< ActivationType::Gelu, TDeviceType, TPrecision >
struct	mlp_activation_impl< ActivationType::Swiglu, TDeviceType, TPrecision >

Functions
std::string	formatBytes (size_t bytes)
template<TensorDataType TDataType, typename MR>
constexpr size_t	get_alignment ()
	Determines optimal memory alignment based on memory resource and data type.
template<TensorDataType TDataType>
constexpr size_t	getStorageSize (size_t logical_size)
	Calculates storage size in bytes for given logical element count.

Variables
constexpr size_t	CPU_SIMD_ALIGN = 64
	AVX-512 alignment boundary for optimal CPU SIMD operations.
constexpr size_t	CUDA_WARP_SIZE = 32
	CUDA warp size alignment for optimal GPU memory access patterns.

Function Documentation

◆ formatBytes()

std::string Mila::Dnn::Detail::formatBytes ( size_t bytes )

inline

Here is the caller graph for this function:

◆ get_alignment()

template<TensorDataType TDataType, typename MR>

size_t Mila::Dnn::Detail::get_alignment ( )

constexpr

Determines optimal memory alignment based on memory resource and data type.

Calculates the appropriate memory alignment boundary considering both the memory resource characteristics (CPU vs GPU) and data type requirements. This ensures optimal performance across different hardware architectures.

Template Parameters

TDataType	Abstract tensor data type from TensorDataType enumeration
MR	Memory resource type determining allocation strategy

Returns: Optimal alignment boundary in bytes for the given configuration

◆ getStorageSize()

template<TensorDataType TDataType>

size_t Mila::Dnn::Detail::getStorageSize ( size_t logical_size )

constexpr

Calculates storage size in bytes for given logical element count.

Computes the required storage bytes for a given number of logical elements of the specified tensor data type, handling potential overflow conditions.

Template Parameters

TDataType Abstract tensor data type from TensorDataType enumeration

Parameters

logical_size Number of logical elements

Returns: Required storage size in bytes

Exceptions

std::overflow_error If calculation would overflow

Here is the caller graph for this function:

Variable Documentation

◆ CPU_SIMD_ALIGN

size_t Mila::Dnn::Detail::CPU_SIMD_ALIGN = 64

constexpr

AVX-512 alignment boundary for optimal CPU SIMD operations.

Defines the alignment requirement for maximum efficiency with AVX-512 vector instructions, enabling optimal vectorized operations on modern CPU architectures.

◆ CUDA_WARP_SIZE

size_t Mila::Dnn::Detail::CUDA_WARP_SIZE = 32

constexpr

CUDA warp size alignment for optimal GPU memory access patterns.

Defines the alignment boundary required for optimal memory coalescing in CUDA warp-level operations, ensuring maximum memory throughput in GPU kernels and device functions.

Classes

Functions

Variables

Function Documentation

◆ formatBytes()

◆ get_alignment()

◆ getStorageSize()

Variable Documentation

◆ CPU_SIMD_ALIGN

◆ CUDA_WARP_SIZE