Namespaces
namespace	Compute

namespace	Data

namespace	detail

namespace	Gpt2

namespace	Serialization

namespace	Utils

Classes
class	Component
	Abstract base class for all components in the Mila framework.

class	ComponentConfig
	Base configuration class for all neural network components.

class	CompositeModule
	A module class that can contain and manage child modules.

class	CrossEntropy
	CrossEntropy loss module for neural networks.

class	CrossEntropyConfig
	Configuration class for CrossEntropy module.

class	Dropout
	Dropout regularization module for neural networks.

class	DropoutConfig
	Configuration class for Dropout module.

class	Encoder
	An encoder module that provides token and positional embeddings.

class	EncoderConfig
	Configuration class for Encoder module.

class	FusedModule

class	Gelu
	Gaussian Error Linear Unit (GELU) activation function module.

class	GeluConfig
	Configuration class for GELU module.

class	LayerNorm
	Layer Normalization module.

class	LayerNormConfig
	Configuration class for Layer Normalization module.

class	Linear
	A class representing a linear transformation module.

class	LinearConfig
	Configuration class for Linear module.

class	MLP
	Multi-Layer Perceptron (MLP) block for neural networks.

class	MLPConfig
	Configuration class for MLP block.

class	Model
	A class representing a neural network model.

class	ModelCallback
	Interface for callbacks during training.

class	Module
	Abstract base class for all modules in the Mila DNN framework.

class	MultiHeadAttention
	Multi-head attention module for transformer architectures.

class	MultiHeadAttentionConfig
	Configuration class for MultiHeadAttention module.

class	Residual
	A class implementing a residual connection module.

class	ResidualConfig
	Configuration class for Residual connection module.

class	Softmax
	Softmax module for neural networks.

class	SoftmaxConfig
	Configuration class for Softmax module.

class	Tensor

class	TensorBuffer
	A buffer for storing tensor data with configurable memory management.

class	TensorPtr
	Base tensor pointer class that wraps a raw pointer with memory-type safety.

struct	TensorTrait
	Primary template for tensor type traits.

struct	TensorTrait< __nv_fp8_e4m3 >
	Specialization of TensorTrait for 8-bit floating point type (e4m3).

struct	TensorTrait< __nv_fp8_e5m2 >
	Specialization of TensorTrait for alternative 8-bit floating point type (e5m2).

struct	TensorTrait< float >
	Specialization of TensorTrait for float type.

struct	TensorTrait< half >
	Specialization of TensorTrait for half-precision float type.

struct	TensorTrait< int >
	Specialization of TensorTrait for 32-bit signed integer type.

struct	TensorTrait< int16_t >
	Specialization of TensorTrait for 16-bit signed integer type.

struct	TensorTrait< nv_bfloat16 >
	Specialization of TensorTrait for NVIDIA bfloat16 type.

struct	TensorTrait< uint16_t >
	Specialization of TensorTrait for 16-bit unsigned integer type.

struct	TensorTrait< uint32_t >
	Specialization of TensorTrait for 32-bit unsigned integer type.

struct	TrainingConfig
	Configuration for training a model.

class	TransformerBlock
	TransformerBlock implements a standard transformer encoder block.

class	TransformerBlockConfig
	Configuration class for TransformerBlock.

class	UniqueIdGenerator

Concepts
concept	ValidTensorType
	Concept that constrains types to those with valid tensor trait specializations.

concept	ValidFloatTensorType
	Concept that constrains types to valid floating-point tensor types.

concept	ValidFloatTensorTypes
	Concept that verifies both types are valid floating-point tensor types.

concept	ValidTensorTypes
	Concept that verifies both input and compute types have valid tensor trait mappings.

Typedefs
template<typename TDataType = float>
using	Mila::Dnn::CpuCompositeModule = CompositeModule< DeviceType::Cpu, TDataType >

template<typename TLogits = float, typename TTargets = int>
using	Mila::Dnn::CpuCrossEntropy = CrossEntropy< DeviceType::Cpu, TLogits, TTargets >
	Type alias for CPU-based cross entropy module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuDropout = Dropout< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based dropout module with customizable tensor types.

template<typename TInput = int, typename TOutput = float>
using	Mila::Dnn::CpuEncoder = Encoder< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based encoder module with customizable tensor types.

template<typename TDataType = float>
using	Mila::Dnn::CpuGelu = Gelu< DeviceType::Cpu, TDataType >
	Type alias for CPU-specific GELU module.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuLayerNorm = LayerNorm< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based layer normalization module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuLinear = Linear< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based linear module with customizable tensor types.

template<typename TDataType = float>
using	Mila::Dnn::CpuMLP = MLP< DeviceType::Cpu, TDataType >
	Type alias for CPU-based MLP module with customizable tensor type.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuModel = Model< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based models with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuModule = Module< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based modules with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuMultiHeadAttention = MultiHeadAttention< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based multi-head attention module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuResidual = Residual< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based residual module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CpuSoftmax = Softmax< DeviceType::Cpu, TInput, TOutput >
	Type alias for CPU-based softmax module with customizable tensor types.

template<typename TDataType = float>
using	Mila::Dnn::CpuTransformerBlock = TransformerBlock< DeviceType::Cpu, TDataType >
	Type alias for CPU-based transformer block with customizable tensor type.

template<typename TDataType = float>
using	Mila::Dnn::CudaCompositeModule = CompositeModule< DeviceType::Cuda, TDataType >

template<typename TLogits = float, typename TTargets = int>
using	Mila::Dnn::CudaCrossEntropy = CrossEntropy< DeviceType::Cuda, TLogits, TTargets >
	Type alias for CUDA-based cross entropy module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaDropout = Dropout< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based dropout module with customizable tensor types.

template<typename TInput = int, typename TOutput = float>
using	Mila::Dnn::CudaEncoder = Encoder< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based encoder module with customizable tensor types.

template<typename TDataType = float>
using	Mila::Dnn::CudaGelu = Gelu< DeviceType::Cuda, TDataType >
	Type alias for CUDA-specific GELU module.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaLayerNorm = LayerNorm< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based layer normalization module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaLinear = Linear< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based linear module with customizable tensor types.

template<typename TDataType = float>
using	Mila::Dnn::CudaMLP = MLP< DeviceType::Cuda, TDataType >
	Type alias for CUDA-based MLP module with customizable tensor type.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaModel = Model< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based models with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaModule = Module< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based modules with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaMultiHeadAttention = MultiHeadAttention< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based multi-head attention module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaResidual = Residual< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based residual module with customizable tensor types.

template<typename TInput = float, typename TOutput = TInput>
using	Mila::Dnn::CudaSoftmax = Softmax< DeviceType::Cuda, TInput, TOutput >
	Type alias for CUDA-based softmax module with customizable tensor types.

template<typename TDataType = float>
using	Mila::Dnn::CudaTransformerBlock = TransformerBlock< DeviceType::Cuda, TDataType >
	Type alias for CUDA-based transformer block with customizable tensor type.

template<typename T >
using	Mila::Dnn::DevicePtr = TensorPtr< T, false >
	Type alias for device memory pointers.

template<class T >
using	Mila::Dnn::DeviceTensor = Tensor< T, Compute::CudaMemoryResource >
	Tensor type that uses device (GPU) memory.

template<typename T >
using	Mila::Dnn::HostPtr = TensorPtr< T, true >
	Type alias for host memory pointers.

template<typename T >
using	Mila::Dnn::HostTensor = Tensor< T, Compute::HostMemoryResource >
	Tensor type that uses host (CPU) memory.

template<class T >
using	Mila::Dnn::PinnedTensor = Tensor< T, Compute::CudaPinnedMemoryResource >
	Tensor type that uses pinned (page-locked) host memory.

template<class T >
using	Mila::Dnn::UniversalTensor = Tensor< T, Compute::CudaManagedMemoryResource >
	Tensor type that uses CUDA managed memory accessible from both CPU and GPU.

Enumerations
enum class	Mila::Dnn::ActivationType { None , Relu , Gelu , Silu , Tanh , Sigmoid , LeakyRelu , Mish }
	Enumeration of supported activation function types.

Functions
std::string	Mila::Dnn::activationTypeToString (ActivationType type)
	Converts an ActivationType enum value to its string representation.

template<typename TElementType , typename MR = Compute::CpuMemoryResource> requires ValidTensorType<TElementType> && std::is_base_of_v<Compute::MemoryResource, MR>
void	Mila::Dnn::random (Tensor< TElementType, MR > &tensor, TElementType min, TElementType max)
	Initializes a tensor with random values within a specified range.

template<typename T , bool IsHostAccessible>
T *	Mila::Dnn::raw_pointer_cast (const TensorPtr< T, IsHostAccessible > &ptr) noexcept
	Gets a raw pointer from a TensorPtr (similar to thrust::raw_pointer_cast).

ActivationType	Mila::Dnn::stringToActivationType (const std::string &name)
	Converts a string to its corresponding ActivationType enum value.

template<typename T >
constexpr std::string_view	Mila::Dnn::tensor_type_name ()
	Get the string representation of a tensor element type.

template<typename T >
constexpr size_t	Mila::Dnn::tensor_type_size ()
	Get the size in bytes of a tensor element type.

template<typename TElementType , typename MR > requires ValidTensorType<TElementType> && std::is_base_of_v<Compute::MemoryResource, MR>
void	Mila::Dnn::xavier (Tensor< TElementType, MR > &tensor, size_t input_size, size_t output_size)
	Initializes a tensor with Xavier/Glorot uniform initialization.

Typedef Documentation

◆ CpuCompositeModule

template<typename TDataType = float>

using Mila::Dnn::CpuCompositeModule = CompositeModule<DeviceType::Cpu, TDataType>

export

◆ CpuCrossEntropy

template<typename TLogits = float, typename TTargets = int>

using Mila::Dnn::CpuCrossEntropy = CrossEntropy<DeviceType::Cpu, TLogits, TTargets>

export

Type alias for CPU-based cross entropy module with customizable tensor types.

Template Parameters

TLogits	Data type of the input logits tensor elements.
TTargets	Data type of the target indices, typically int.

◆ CpuDropout

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuDropout = Dropout<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based dropout module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuEncoder

template<typename TInput = int, typename TOutput = float>

using Mila::Dnn::CpuEncoder = Encoder<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based encoder module with customizable tensor types.

Template Parameters

TInput	Data type of the input token IDs (typically int).
TOutput	Data type of the output embeddings (typically float).

◆ CpuGelu

template<typename TDataType = float>

using Mila::Dnn::CpuGelu = Gelu<DeviceType::Cpu, TDataType>

export

Type alias for CPU-specific GELU module.

Convenience type that pre-configures the Gelu template for CPU execution.

Template Parameters

TDataType Floating-point data type (default: float)
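
The Cpu*/Cuda* aliases on this page all follow one pattern: a class template takes the device as its first argument, and an alias template pins that argument while leaving the element types open. A minimal sketch of the pattern follows; SketchDeviceType, SketchGelu and device_name() are hypothetical stand-ins, not the library's declarations.

```cpp
#include <string_view>
#include <type_traits>

// Hypothetical stand-ins; only the alias pattern itself mirrors the listing above.
enum class SketchDeviceType { Cpu, Cuda };

template <SketchDeviceType TDevice, typename TDataType = float>
class SketchGelu {
public:
    static constexpr SketchDeviceType device = TDevice;
    using value_type = TDataType;

    // Reports which backend this instantiation targets.
    static constexpr std::string_view device_name() {
        return TDevice == SketchDeviceType::Cpu ? "cpu" : "cuda";
    }
};

// Each alias fixes the device argument and leaves the element type open.
template <typename TDataType = float>
using SketchCpuGelu = SketchGelu<SketchDeviceType::Cpu, TDataType>;

template <typename TDataType = float>
using SketchCudaGelu = SketchGelu<SketchDeviceType::Cuda, TDataType>;

static_assert(SketchCpuGelu<>::device == SketchDeviceType::Cpu);
static_assert(std::is_same_v<SketchCpuGelu<>::value_type, float>);
static_assert(SketchCudaGelu<double>::device_name() == "cuda");
```

Because the alias is only a name for the underlying instantiation, SketchCpuGelu<float> and SketchGelu<SketchDeviceType::Cpu, float> are the same type.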

◆ CpuLayerNorm

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuLayerNorm = LayerNorm<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based layer normalization module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuLinear

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuLinear = Linear<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based linear module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuMLP

template<typename TDataType = float>

using Mila::Dnn::CpuMLP = MLP<DeviceType::Cpu, TDataType>

export

Type alias for CPU-based MLP module with customizable tensor type.

Template Parameters

TDataType Data type of the tensor elements.

◆ CpuModel

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuModel = Model<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based models with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuModule

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuModule = Module<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based modules with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuMultiHeadAttention

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuMultiHeadAttention = MultiHeadAttention<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based multi-head attention module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuResidual

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuResidual = Residual<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based residual module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuSoftmax

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CpuSoftmax = Softmax<DeviceType::Cpu, TInput, TOutput>

export

Type alias for CPU-based softmax module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CpuTransformerBlock

template<typename TDataType = float>

using Mila::Dnn::CpuTransformerBlock = TransformerBlock<DeviceType::Cpu, TDataType>

export

Type alias for CPU-based transformer block with customizable tensor type.

Template Parameters

TDataType Data type used for tensor elements throughout the network.

◆ CudaCompositeModule

template<typename TDataType = float>

using Mila::Dnn::CudaCompositeModule = CompositeModule<DeviceType::Cuda, TDataType>

export

◆ CudaCrossEntropy

template<typename TLogits = float, typename TTargets = int>

using Mila::Dnn::CudaCrossEntropy = CrossEntropy<DeviceType::Cuda, TLogits, TTargets>

export

Type alias for CUDA-based cross entropy module with customizable tensor types.

Template Parameters

TLogits	Data type of the input logits tensor elements.
TTargets	Data type of the target indices, typically int.

◆ CudaDropout

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaDropout = Dropout<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based dropout module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaEncoder

template<typename TInput = int, typename TOutput = float>

using Mila::Dnn::CudaEncoder = Encoder<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based encoder module with customizable tensor types.

Template Parameters

TInput	Data type of the input token IDs (typically int).
TOutput	Data type of the output embeddings (typically float).

◆ CudaGelu

template<typename TDataType = float>

using Mila::Dnn::CudaGelu = Gelu<DeviceType::Cuda, TDataType>

export

Type alias for CUDA-specific GELU module.

Convenience type that pre-configures the Gelu template for CUDA GPU execution.

Template Parameters

TDataType Floating-point data type (default: float)

◆ CudaLayerNorm

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaLayerNorm = LayerNorm<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based layer normalization module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaLinear

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaLinear = Linear<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based linear module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaMLP

template<typename TDataType = float>

using Mila::Dnn::CudaMLP = MLP<DeviceType::Cuda, TDataType>

export

Type alias for CUDA-based MLP module with customizable tensor type.

Template Parameters

TDataType Data type of the tensor elements.

◆ CudaModel

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaModel = Model<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based models with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaModule

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaModule = Module<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based modules with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaMultiHeadAttention

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaMultiHeadAttention = MultiHeadAttention<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based multi-head attention module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaResidual

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaResidual = Residual<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based residual module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaSoftmax

template<typename TInput = float, typename TOutput = TInput>

using Mila::Dnn::CudaSoftmax = Softmax<DeviceType::Cuda, TInput, TOutput>

export

Type alias for CUDA-based softmax module with customizable tensor types.

Template Parameters

TInput	Data type of the input tensor elements.
TOutput	Data type of the output tensor elements, defaults to TInput.

◆ CudaTransformerBlock

template<typename TDataType = float>

using Mila::Dnn::CudaTransformerBlock = TransformerBlock<DeviceType::Cuda, TDataType>

export

Type alias for CUDA-based transformer block with customizable tensor type.

Template Parameters

TDataType Data type used for tensor elements throughout the network.

◆ DevicePtr

template<typename T >

using Mila::Dnn::DevicePtr = TensorPtr<T, false>

export

Type alias for device memory pointers.

◆ DeviceTensor

template<class T >

using Mila::Dnn::DeviceTensor = Tensor<T, Compute::CudaMemoryResource>

export

Tensor type that uses device (GPU) memory.

DeviceTensor stores data in GPU memory for optimal performance with CUDA operations. This type is suitable for:

Neural network weights, activations, and gradients
Data used in compute-intensive GPU operations
Performance-critical processing paths

Memory safety:

Cannot be directly accessed from host code
Attempting direct element access will trigger runtime errors
Must use to<HostMemoryResource>() to create a host-accessible copy
Safe for use with CUDA kernels through raw_data() method

Performance considerations:

Fastest for GPU operations
Requires explicit memory transfers for host access
Most efficient when kept on device throughout processing

Template Parameters

T	The data type of the tensor elements.
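
The access rules described above (direct element access fails at run time, and a host copy is obtained through an explicit conversion) can be modelled without a GPU. The sketch below is a host-only simulation under stated assumptions: SketchTensor, the two resource tags and their is_host_accessible flag are hypothetical, and the "device" storage is an ordinary std::vector so that the example stays runnable. It says nothing about how Mila's Tensor is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical memory-resource tags; the real Compute:: resources are not modelled here.
struct SketchHostResource   { static constexpr bool is_host_accessible = true;  };
struct SketchDeviceResource { static constexpr bool is_host_accessible = false; };

template <typename T, typename MR>
class SketchTensor {
public:
    explicit SketchTensor(std::size_t size, T fill = T{}) : storage_(size, fill) {}

    std::size_t size() const { return storage_.size(); }

    // Element access is refused at run time for resources that host code cannot read.
    T& at(std::size_t i) {
        if (!MR::is_host_accessible) {
            throw std::runtime_error("direct element access on a device tensor");
        }
        return storage_.at(i);
    }

    // Returns a copy backed by another resource (stands in for an explicit transfer).
    template <typename TargetMR>
    SketchTensor<T, TargetMR> to() const {
        SketchTensor<T, TargetMR> copy(storage_.size());
        copy.assign(storage_);
        return copy;
    }

    // Overwrites the contents; the element count must already match.
    void assign(const std::vector<T>& values) {
        assert(values.size() == storage_.size());
        storage_ = values;
    }

private:
    std::vector<T> storage_; // plain host memory here; a real device tensor would own GPU memory
};
```

With these stand-ins, calling at() on a SketchTensor<float, SketchDeviceResource> throws, while the result of to<SketchHostResource>() can be indexed normally.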

◆ HostPtr

template<typename T >

using Mila::Dnn::HostPtr = TensorPtr<T, true>

export

Type alias for host memory pointers.

◆ HostTensor

template<typename T >

using Mila::Dnn::HostTensor = Tensor<T, Compute::HostMemoryResource>

export

Tensor type that uses host (CPU) memory.

HostTensor stores data in regular CPU memory that is directly accessible from host code. This type is suitable for:

Data that needs frequent host-side access
Input/output data processing
Debugging and inspection of tensor contents
Operations that primarily run on CPU

Memory safety:

Safe to access directly through data() method, operator[], at() method
Direct dereference operations will work correctly
No memory transfers required for host access

Performance considerations:

Fast host access, but slower for GPU operations
Requires memory transfers when used with GPU operations

Template Parameters

T	The data type of the tensor elements.

◆ PinnedTensor

template<class T >

using Mila::Dnn::PinnedTensor = Tensor<T, Compute::CudaPinnedMemoryResource>

export

Tensor type that uses pinned (page-locked) host memory.

PinnedTensor stores data in page-locked host memory that cannot be swapped to disk. This type is suitable for:

Data that needs to be frequently transferred between CPU and GPU
Input tensors that will be copied to GPU
Output tensors that need to be read back from GPU

Memory safety:

Safe to access directly from host code (data(), operator[], at())
Provides direct dereference operations like HostTensor
No runtime safety issues for host access

Performance considerations:

Faster host-device transfers than regular host memory
Consumes a limited system resource (pinned memory)
Should be used judiciously as excessive use can degrade system performance
Host access is typically slower than regular host memory

Template Parameters

T	The data type of the tensor elements.

◆ UniversalTensor

template<class T >

using Mila::Dnn::UniversalTensor = Tensor<T, Compute::CudaManagedMemoryResource>

export

Tensor type that uses CUDA managed memory accessible from both CPU and GPU.

UniversalTensor uses CUDA's Unified Memory, which is automatically migrated between host and device as needed by the CUDA runtime. This type is suitable for:

Data that needs to be accessed from both host and device code
Prototyping and development where memory management simplicity is preferred
Cases where optimal data placement isn't known in advance

Memory safety:

Safe to access from both host and device code
No explicit memory transfers needed
Provides the simplest programming model with automatic data migration

Performance considerations:

More convenient but typically lower performance than explicit memory management
Access patterns that frequently alternate between CPU and GPU may cause thrashing
Best used with CUDA devices that support hardware page faulting (Pascal or newer)
May incur overhead from the runtime system managing page migrations

Template Parameters

T	The data type of the tensor elements.

Enumeration Type Documentation

◆ ActivationType

enum class Mila::Dnn::ActivationType

export strong

Enumeration of supported activation function types.

This enum class defines the different activation functions that can be used throughout the Mila library, particularly in neural network layers.

Enumerator
None	No activation (identity function)
Relu	Rectified Linear Unit: max(0, x)
Gelu	Gaussian Error Linear Unit: x * phi(x) where phi() is the standard Gaussian CDF.
Silu	Sigmoid Linear Unit (Swish): x * sigmoid(x)
Tanh	Hyperbolic Tangent: tanh(x)
Sigmoid	Sigmoid function: 1 / (1 + exp(-x))
LeakyRelu	Leaky ReLU: max(alpha * x, x) where alpha is typically 0.01.
Mish	Mish: x * tanh(softplus(x))
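
The formulas in the enumerator table can be checked with a small scalar evaluator. This is a sketch only: SketchActivation and sketch_activate are hypothetical names that mirror the enumerators above, the exact (erf-based) form of GELU is used, and whether the library applies an approximation instead is not stated here.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical mirror of the enumerators listed above.
enum class SketchActivation { None, Relu, Gelu, Silu, Tanh, Sigmoid, LeakyRelu, Mish };

// Evaluates one activation on a scalar; alpha is only used by LeakyRelu.
inline double sketch_activate(SketchActivation type, double x, double alpha = 0.01) {
    switch (type) {
        case SketchActivation::None:
            return x;
        case SketchActivation::Relu:
            return std::max(0.0, x);
        case SketchActivation::Gelu:
            // x * Phi(x), with Phi the standard Gaussian CDF expressed through erf
            return x * 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
        case SketchActivation::Silu:
            // x * sigmoid(x)
            return x / (1.0 + std::exp(-x));
        case SketchActivation::Tanh:
            return std::tanh(x);
        case SketchActivation::Sigmoid:
            return 1.0 / (1.0 + std::exp(-x));
        case SketchActivation::LeakyRelu:
            return std::max(alpha * x, x);
        case SketchActivation::Mish:
            // softplus(x) = log(1 + exp(x))
            return x * std::tanh(std::log1p(std::exp(x)));
    }
    return x; // not reached for valid enumerators
}
```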

Function Documentation

◆ activationTypeToString()

std::string Mila::Dnn::activationTypeToString ( ActivationType type )

inline export

Converts an ActivationType enum value to its string representation.

Parameters

type	The ActivationType to convert

Returns: std::string The string representation of the activation type

◆ random()

template<typename TElementType , typename MR = Compute::CpuMemoryResource>
requires ValidTensorType<TElementType> && std::is_base_of_v<Compute::MemoryResource, MR>

void Mila::Dnn::random	(	Tensor< TElementType, MR > &	tensor,
		TElementType	min,
		TElementType	max
	)

export

Initializes a tensor with random values within a specified range.

This function populates a tensor with random floating-point values uniformly distributed between the specified minimum and maximum values. It handles both host and device memory resources appropriately, copying data to the host for initialization if needed.

Template Parameters

TElementType	The element data type of the tensor (float, half, etc.)
MR	The memory resource type used by the tensor

Parameters

tensor	The tensor to initialize with random values
min	The minimum value for the random distribution
max	The maximum value for the random distribution

Note: Uses a fixed seed (42) for reproducible results rather than truly random values
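
A host-side sketch of seeded uniform initialization is given below. It assumes a std::vector<float> in place of the tensor's host storage and omits the device copy step; sketch_random is a hypothetical name, and the choice of std::mt19937 as the engine is an assumption, since only the seed value is documented above.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Fills a host-side buffer with uniformly distributed values between min and max.
// The vector stands in for the tensor's host storage.
inline void sketch_random(std::vector<float>& values, float min, float max) {
    assert(min < max);
    std::mt19937 engine(42); // fixed seed: every call produces the same sequence
    std::uniform_real_distribution<float> dist(min, max);
    for (float& v : values) {
        v = dist(engine);
    }
}
```

Because the engine is re-seeded on each call, two buffers of equal length initialized this way hold identical values, which is convenient for reproducible tests but unsuitable where independent samples are needed.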

◆ raw_pointer_cast()

template<typename T , bool IsHostAccessible>

T * Mila::Dnn::raw_pointer_cast ( const TensorPtr< T, IsHostAccessible > & ptr )

export noexcept

Gets a raw pointer from a TensorPtr (similar to thrust::raw_pointer_cast).

Template Parameters

T	Element type
IsHostAccessible	Whether the pointer is host-accessible

Parameters

ptr	TensorPtr to convert

Returns: T* Raw pointer

◆ stringToActivationType()

ActivationType Mila::Dnn::stringToActivationType ( const std::string & name )

inline export

Converts a string to its corresponding ActivationType enum value.

Parameters

name	The string representation of an activation function

Returns: ActivationType The corresponding enum value

Exceptions

std::invalid_argument if the string doesn't match any known activation function
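
A round trip between enum values and names, including the std::invalid_argument failure path, might look like the sketch below. The names sketch_to_string and sketch_from_string and the lookup table are hypothetical, and the string spellings are assumptions: the accepted strings are not listed on this page.

```cpp
#include <array>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical mirror of the ActivationType enumerators.
enum class SketchActivation { None, Relu, Gelu, Silu, Tanh, Sigmoid, LeakyRelu, Mish };

// One table drives both directions, so the two functions cannot drift apart.
// The spellings are assumptions; the real library may use different strings.
inline constexpr std::array<std::pair<SketchActivation, std::string_view>, 8> kSketchNames{{
    {SketchActivation::None, "None"},
    {SketchActivation::Relu, "Relu"},
    {SketchActivation::Gelu, "Gelu"},
    {SketchActivation::Silu, "Silu"},
    {SketchActivation::Tanh, "Tanh"},
    {SketchActivation::Sigmoid, "Sigmoid"},
    {SketchActivation::LeakyRelu, "LeakyRelu"},
    {SketchActivation::Mish, "Mish"},
}};

inline std::string sketch_to_string(SketchActivation type) {
    for (const auto& [value, name] : kSketchNames) {
        if (value == type) return std::string(name);
    }
    throw std::invalid_argument("unknown activation value");
}

inline SketchActivation sketch_from_string(const std::string& name) {
    for (const auto& [value, text] : kSketchNames) {
        if (text == name) return value;
    }
    throw std::invalid_argument("unknown activation name: " + name);
}
```

Callers that parse names from configuration files would typically wrap sketch_from_string in a try block and report the offending string.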

◆ tensor_type_name()

template<typename T >

constexpr std::string_view Mila::Dnn::tensor_type_name ( )

constexpr export

Get the string representation of a tensor element type.

Template Parameters

T	The tensor element type

Returns: constexpr std::string_view The type name as a string

◆ tensor_type_size()

template<typename T >

constexpr size_t Mila::Dnn::tensor_type_size ( )

constexpr export

Get the size in bytes of a tensor element type.

Template Parameters

T	The tensor element type

Returns: constexpr size_t Size in bytes
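
Compile-time name and size queries of this kind are usually thin wrappers over a trait. The sketch below assumes a hypothetical SketchTrait with type_name and size_in_bytes members; the member names and the name strings are illustrative and may not match TensorTrait.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical trait; member names and name strings are assumptions.
template <typename T>
struct SketchTrait;

template <>
struct SketchTrait<float> {
    static constexpr std::string_view type_name = "float";
    static constexpr std::size_t size_in_bytes = sizeof(float);
};

template <>
struct SketchTrait<std::int16_t> {
    static constexpr std::string_view type_name = "int16";
    static constexpr std::size_t size_in_bytes = sizeof(std::int16_t);
};

// Both queries simply forward to the trait, so they are usable in constant expressions.
template <typename T>
constexpr std::string_view sketch_type_name() {
    return SketchTrait<T>::type_name;
}

template <typename T>
constexpr std::size_t sketch_type_size() {
    return SketchTrait<T>::size_in_bytes;
}

static_assert(sketch_type_name<float>() == "float");
static_assert(sketch_type_size<std::int16_t>() == 2);
```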

◆ xavier()

template<typename TElementType , typename MR >
requires ValidTensorType<TElementType> && std::is_base_of_v<Compute::MemoryResource, MR>

void Mila::Dnn::xavier	(	Tensor< TElementType, MR > &	tensor,
		size_t	input_size,
		size_t	output_size
	)

export

Initializes a tensor with Xavier/Glorot uniform initialization.

Xavier initialization is a method designed to keep the scale of gradients roughly the same in all layers of a neural network. It initializes weights with values sampled from a uniform distribution with limits calculated as a function of the input and output sizes of the layer.

The distribution range is [-limit, limit] where: limit = sqrt(6 / (input_size + output_size))

Template Parameters

TElementType	The element data type of the tensor (float, half, etc.)
MR	The memory resource type used by the tensor

Parameters

tensor	The tensor to initialize with Xavier initialization
input_size	The size of the input dimension
output_size	The size of the output dimension

Note: Uses a fixed seed (42) for reproducible results rather than truly random values

See also: http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
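
The limit formula above can be sketched on a plain host buffer as follows. The names sketch_xavier_limit and sketch_xavier are hypothetical, a std::vector<float> stands in for the tensor, and std::mt19937 is an assumed engine; only the seed value and the formula come from the description.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Bound of the Glorot uniform range: sqrt(6 / (input_size + output_size)).
inline float sketch_xavier_limit(std::size_t input_size, std::size_t output_size) {
    assert(input_size + output_size > 0);
    return std::sqrt(6.0f / static_cast<float>(input_size + output_size));
}

// Fills the buffer with values drawn uniformly between -limit and limit.
inline void sketch_xavier(std::vector<float>& weights,
                          std::size_t input_size,
                          std::size_t output_size) {
    const float limit = sketch_xavier_limit(input_size, output_size);
    std::mt19937 engine(42); // fixed seed, matching the note above
    std::uniform_real_distribution<float> dist(-limit, limit);
    for (float& w : weights) {
        w = dist(engine);
    }
}
```

For a layer with input_size = 3 and output_size = 3 the limit is sqrt(6 / 6) = 1, so all weights fall between -1 and 1; wider layers get a proportionally narrower range.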
