Mila 0.13.48
Deep Neural Network Library
Loading...
Searching...
No Matches
Mila::Dnn::Detail Namespace Reference

Classes

struct  mlp_activation_impl
struct  mlp_activation_impl< ActivationType::Gelu, TDeviceType, TPrecision >
struct  mlp_activation_impl< ActivationType::Swiglu, TDeviceType, TPrecision >

Functions

std::string formatBytes (size_t bytes)
template<TensorDataType TDataType, typename MR>
constexpr size_t get_alignment ()
 Determines optimal memory alignment based on memory resource and data type.
template<TensorDataType TDataType>
constexpr size_t getStorageSize (size_t logical_size)
 Calculates storage size in bytes for given logical element count.

Variables

constexpr size_t CPU_SIMD_ALIGN = 64
 AVX-512 alignment boundary for optimal CPU SIMD operations.
constexpr size_t CUDA_WARP_SIZE = 32
 CUDA warp size alignment for optimal GPU memory access patterns.

Function Documentation

◆ formatBytes()

std::string Mila::Dnn::Detail::formatBytes ( size_t bytes)
inline
Here is the caller graph for this function:

◆ get_alignment()

template<TensorDataType TDataType, typename MR>
size_t Mila::Dnn::Detail::get_alignment ( )
constexpr

Determines optimal memory alignment based on memory resource and data type.

Calculates the appropriate memory alignment boundary considering both the memory resource characteristics (CPU vs GPU) and data type requirements. This ensures optimal performance across different hardware architectures.

Template Parameters
TDataTypeAbstract tensor data type from TensorDataType enumeration
MRMemory resource type determining allocation strategy
Returns
Optimal alignment boundary in bytes for the given configuration

◆ getStorageSize()

template<TensorDataType TDataType>
size_t Mila::Dnn::Detail::getStorageSize ( size_t logical_size)
constexpr

Calculates storage size in bytes for given logical element count.

Computes the required storage bytes for a given number of logical elements of the specified tensor data type, handling potential overflow conditions.

Template Parameters
TDataTypeAbstract tensor data type from TensorDataType enumeration
Parameters
logical_sizeNumber of logical elements
Returns
Required storage size in bytes
Exceptions
std::overflow_errorIf calculation would overflow
Here is the caller graph for this function:

Variable Documentation

◆ CPU_SIMD_ALIGN

size_t Mila::Dnn::Detail::CPU_SIMD_ALIGN = 64
constexpr

AVX-512 alignment boundary for optimal CPU SIMD operations.

Defines the alignment requirement for maximum efficiency with AVX-512 vector instructions, enabling optimal vectorized operations on modern CPU architectures.

◆ CUDA_WARP_SIZE

size_t Mila::Dnn::Detail::CUDA_WARP_SIZE = 32
constexpr

CUDA warp size alignment for optimal GPU memory access patterns.

Defines the alignment boundary required for optimal memory coalescing in CUDA warp-level operations, ensuring maximum memory throughput in GPU kernels and device functions.