Mila 0.13.48
Deep Neural Network Library
Loading...
Searching...
No Matches
Mila::Dnn::Compute::Cpu::FillOps Struct Referenceexport

CPU specialization of TensorOps for initialization operations. More...

Inheritance diagram for Mila::Dnn::Compute::Cpu::FillOps:

Public Types

template<TensorDataType TDataType>
using host_value_t = std::conditional_t<TensorDataTypeTraits<TDataType>::is_integer_type, int32_t, float>

Static Public Member Functions

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static void fill (Tensor< TDataType, TMemoryResource > &tensor, host_value_t< TDataType > host_value, IExecutionContext *exec_context=nullptr)
 Fill tensor with scalar host value.
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static void fill (Tensor< TDataType, TMemoryResource > &tensor, std::span< const host_value_t< TDataType > > host_values, IExecutionContext *exec_context=nullptr)
 Fill tensor with array of host values.

Detailed Description

CPU specialization of TensorOps for initialization operations.

Provides CPU-specific implementations of tensor fill operations using optimized standard library algorithms for host memory. All operations execute synchronously with no device synchronization overhead.

Key features:

  • Direct memory access to CPU tensors
  • STL algorithm optimizations (vectorization, cache efficiency)
  • Automatic type conversion when needed
  • Compile-time dispatch for optimal code generation
  • Accepts ExecutionContext for API consistency (unused on CPU)

Member Typedef Documentation

◆ host_value_t

template<TensorDataType TDataType>
using Mila::Dnn::Compute::Cpu::FillOps::host_value_t = std::conditional_t<TensorDataTypeTraits<TDataType>::is_integer_type, int32_t, float>

Member Function Documentation

◆ fill() [1/2]

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cpu::FillOps::fill ( Tensor< TDataType, TMemoryResource > & tensor,
host_value_t< TDataType > host_value,
IExecutionContext * exec_context = nullptr )
inlinestatic

Fill tensor with scalar host value.

Broadcasts a single scalar value to all tensor elements using optimized STL fill algorithm.

Implementation:

  • Single type conversion (scalar -> native type)
  • Optimized std::fill_n (vectorized by compiler)
  • Compile-time type selection
Template Parameters
TDataTypeAbstract tensor data type
Parameters
tensorDestination CPU tensor to fill
host_valueScalar value in canonical host representation
exec_contextOptional execution context (unused for CPU, accepted for API consistency)
Note
Conversion happens once before fill operation
No synchronization needed - operations are synchronous on CPU
ExecutionContext parameter ignored but present for uniform API across devices
Here is the call graph for this function:

◆ fill() [2/2]

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cpu::FillOps::fill ( Tensor< TDataType, TMemoryResource > & tensor,
std::span< const host_value_t< TDataType > > host_values,
IExecutionContext * exec_context = nullptr )
inlinestatic

Fill tensor with array of host values.

Copies host values into CPU tensor with automatic type conversion. Uses optimized STL algorithms for performance.

Implementation:

  • Direct copy when types match (zero conversion overhead)
  • Element-wise transform when conversion needed
  • Compile-time dispatch based on type compatibility
Template Parameters
TDataTypeAbstract tensor data type
Parameters
tensorDestination CPU tensor to fill
host_valuesSpan of host values in canonical representation
exec_contextOptional execution context (unused for CPU, accepted for API consistency)
Note
Handles size mismatches gracefully (uses minimum size)
Type conversion handled automatically via static_cast
No synchronization needed - operations are synchronous on CPU
ExecutionContext parameter ignored but present for uniform API across devices
Here is the call graph for this function:

The documentation for this struct was generated from the following file: