Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::Compute::Cpu::MathOps Struct Reference [export]

CPU specialization of TensorOps for mathematical operations.

Static Public Member Functions

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static void add (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise addition of two tensors (CPU implementation).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static void divide (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise division of two tensors (CPU implementation).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static void multiply (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise multiplication of two tensors (CPU implementation).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static void subtract (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise subtraction of two tensors (CPU implementation).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static float sum (const Tensor< TDataType, TMemoryResource > &tensor, IExecutionContext *exec_context=nullptr)
 Computes sum of all tensor elements (CPU implementation).

Static Private Member Functions

template<TensorDataType TDataType, typename TMemoryResource, typename TBinaryOp>
requires isValidTensor<TDataType, TMemoryResource>
static void performElementwiseOperation (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, TBinaryOp op)
 Performs element-wise operation using the provided binary function.
template<TensorDataType TDataType, typename TMemoryResource>
static void validateShapeCompatibility (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, const Tensor< TDataType, TMemoryResource > &result, const std::string &operation_name)
 Validates that three tensors have compatible shapes for element-wise operations.

Detailed Description

CPU specialization of TensorOps for mathematical operations.

Implements element-wise operations for CPU tensors using standard library algorithms with optional parallel execution for large tensors.

Key features:

  • Synchronous execution (no stream management needed)
  • Parallel execution for large tensors (>10000 elements)
  • Accepts an IExecutionContext pointer for API consistency (unused on CPU)
  • Automatic type conversion via TensorHostTypeMap
  • Zero-copy direct memory access

Member Function Documentation

◆ add()

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cpu::MathOps::add ( const Tensor< TDataType, TMemoryResource > & a,
const Tensor< TDataType, TMemoryResource > & b,
Tensor< TDataType, TMemoryResource > & result,
IExecutionContext * exec_context = nullptr )
[inline] [static]

Element-wise addition of two tensors (CPU implementation).

Performs element-wise addition a[i] + b[i] for all elements and stores the result in a pre-allocated result tensor. Both input tensors must have identical shapes matching the result tensor shape.

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource (must be CPU-accessible)
Parameters
  a             First operand tensor
  b             Second operand tensor
  result        Pre-allocated result tensor (must have matching shape)
  exec_context  Optional execution context (unused for CPU, accepted for API consistency)
Exceptions
  std::invalid_argument  If tensor shapes don't match or tensors are empty
Note
ExecutionContext parameter ignored but present for uniform API across devices
Uses parallel execution for tensors with >10000 elements
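
The calling convention described here (pre-allocated result, optional context pointer that the CPU path ignores) can be sketched as follows. This is a minimal illustration, not Mila code: `ExecutionContextStub` and `addSketch` are hypothetical names, and flat `std::vector<float>` buffers stand in for `Tensor`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an execution context interface; the real
// IExecutionContext is not reproduced here.
struct ExecutionContextStub {
    virtual ~ExecutionContextStub() = default;
};

// Element-wise add over flat buffers (an assumption; not the Mila Tensor type).
// The context pointer is accepted so CPU and device call sites look alike,
// but this CPU-style path never dereferences it.
inline void addSketch(const std::vector<float>& a,
                      const std::vector<float>& b,
                      std::vector<float>& result,
                      ExecutionContextStub* exec_context = nullptr) {
    (void)exec_context;  // intentionally unused on CPU
    if (a.empty() || a.size() != b.size() || a.size() != result.size()) {
        throw std::invalid_argument(
            "add: operands and result must be non-empty and equally sized");
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        result[i] = a[i] + b[i];  // result is written in place, never resized
    }
}
```

Calling `addSketch(a, b, r)` and `addSketch(a, b, r, &ctx)` produces the same values in `r`, which is the point of keeping the parameter in the signature.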

◆ divide()

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cpu::MathOps::divide ( const Tensor< TDataType, TMemoryResource > & a,
const Tensor< TDataType, TMemoryResource > & b,
Tensor< TDataType, TMemoryResource > & result,
IExecutionContext * exec_context = nullptr )
[inline] [static]

Element-wise division of two tensors (CPU implementation).

Performs element-wise division a[i] / b[i] for all elements and stores the result in a pre-allocated result tensor.

For floating-point types, follows IEEE 754 standards:

  • Division by zero produces infinity or NaN

For integer types:

  • Division by zero throws std::runtime_error

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource (must be CPU-accessible)
Parameters
  a             First operand tensor (dividend)
  b             Second operand tensor (divisor)
  result        Pre-allocated result tensor (must have matching shape)
  exec_context  Optional execution context (unused for CPU, accepted for API consistency)
Exceptions
  std::invalid_argument  If tensor shapes don't match or tensors are empty
  std::runtime_error     If a divisor element is zero in integer division
Note
ExecutionContext parameter ignored but present for uniform API across devices
Uses parallel execution for tensors with >10000 elements
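
The split between floating-point and integer behaviour can be sketched with `if constexpr`. This is an illustration under stated assumptions, not the library's implementation: `divideSketch` is a hypothetical name, flat `std::vector<T>` buffers stand in for tensors, and the floating-point results assume an IEEE 754 platform.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Illustrative only: divides flat buffers, not Mila tensors.
template <typename T>
void divideSketch(const std::vector<T>& a, const std::vector<T>& b,
                  std::vector<T>& result) {
    if (a.empty() || a.size() != b.size() || a.size() != result.size()) {
        throw std::invalid_argument("divide: shape mismatch or empty tensor");
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if constexpr (std::is_integral_v<T>) {
            // Integer division by zero is undefined behaviour in C++,
            // so the divisor is checked before dividing.
            if (b[i] == T{0}) {
                throw std::runtime_error("divide: integer division by zero");
            }
        }
        // For floating-point T under IEEE 754, x/0 gives +/-infinity
        // and 0/0 gives NaN; no exception is raised.
        result[i] = a[i] / b[i];
    }
}
```

Note that this sketch throws at the first zero divisor it meets, so earlier elements of `result` may already have been written; whether the real implementation behaves the same way is not stated in this reference.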

◆ multiply()

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cpu::MathOps::multiply ( const Tensor< TDataType, TMemoryResource > & a,
const Tensor< TDataType, TMemoryResource > & b,
Tensor< TDataType, TMemoryResource > & result,
IExecutionContext * exec_context = nullptr )
[inline] [static]

Element-wise multiplication of two tensors (CPU implementation).

Performs element-wise multiplication a[i] * b[i] for all elements and stores the result in a pre-allocated result tensor.

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource (must be CPU-accessible)
Parameters
  a             First operand tensor
  b             Second operand tensor
  result        Pre-allocated result tensor (must have matching shape)
  exec_context  Optional execution context (unused for CPU, accepted for API consistency)
Exceptions
  std::invalid_argument  If tensor shapes don't match or tensors are empty
Note
ExecutionContext parameter ignored but present for uniform API across devices
Uses parallel execution for tensors with >10000 elements

◆ performElementwiseOperation()

template<TensorDataType TDataType, typename TMemoryResource, typename TBinaryOp>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cpu::MathOps::performElementwiseOperation ( const Tensor< TDataType, TMemoryResource > & a,
const Tensor< TDataType, TMemoryResource > & b,
Tensor< TDataType, TMemoryResource > & result,
TBinaryOp op )
[inline] [static] [private]

Performs element-wise operation using the provided binary function.

Applies a binary operation to corresponding elements of two input tensors and stores the result in the output tensor. Uses parallel execution for improved performance on large tensors (>10000 elements).

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource type
  TBinaryOp        Binary operation function type
Parameters
  a       First input tensor
  b       Second input tensor
  result  Output tensor (must be pre-allocated with correct shape)
  op      Binary operation to apply (e.g., std::plus, std::minus)
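
A generic helper of this kind maps naturally onto the two-range form of `std::transform`. The sketch below is an assumption about the general technique, not the library's code: `elementwiseSketch` is a hypothetical name, it works on flat `std::vector<T>` buffers, and it shows only the serial path (the parallel path for large inputs is omitted).

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Sketch over flat buffers. Sizes are assumed to have been validated by the
// caller, so they are only checked with an assert here.
template <typename T, typename BinaryOp>
void elementwiseSketch(const std::vector<T>& a, const std::vector<T>& b,
                       std::vector<T>& result, BinaryOp op) {
    assert(a.size() == b.size() && a.size() == result.size());
    // Computes result[i] = op(a[i], b[i]) for every i.
    // A parallel variant could switch to a parallel algorithm once the
    // element count exceeds a threshold such as 10000.
    std::transform(a.begin(), a.end(), b.begin(), result.begin(), op);
}
```

With this shape, each public operation reduces to one call with a different functor, for example `std::plus<T>{}` for addition or `std::minus<T>{}` for subtraction; a lambda works equally well.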

◆ subtract()

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cpu::MathOps::subtract ( const Tensor< TDataType, TMemoryResource > & a,
const Tensor< TDataType, TMemoryResource > & b,
Tensor< TDataType, TMemoryResource > & result,
IExecutionContext * exec_context = nullptr )
[inline] [static]

Element-wise subtraction of two tensors (CPU implementation).

Performs element-wise subtraction a[i] - b[i] for all elements and stores the result in a pre-allocated result tensor.

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource (must be CPU-accessible)
Parameters
  a             First operand tensor (minuend)
  b             Second operand tensor (subtrahend)
  result        Pre-allocated result tensor (must have matching shape)
  exec_context  Optional execution context (unused for CPU, accepted for API consistency)
Exceptions
  std::invalid_argument  If tensor shapes don't match or tensors are empty
Note
ExecutionContext parameter ignored but present for uniform API across devices
Uses parallel execution for tensors with >10000 elements

◆ sum()

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
float Mila::Dnn::Compute::Cpu::MathOps::sum ( const Tensor< TDataType, TMemoryResource > & tensor,
IExecutionContext * exec_context = nullptr )
[inline] [static]

Computes sum of all tensor elements (CPU implementation).

Reduces tensor to a single scalar value representing the sum of all elements. Uses parallel reduction for large tensors.

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource (must be CPU-accessible)
Parameters
  tensor        Input tensor
  exec_context  Optional execution context (unused for CPU, accepted for API consistency)
Returns
  Sum of all elements as float
Note
ExecutionContext parameter ignored but present for uniform API across devices
Uses parallel reduction for tensors with >10000 elements
Always returns after computation (synchronous operation)
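
A reduction that always returns `float`, whatever the element type, can be sketched with `std::transform_reduce`. This is an illustration only: `sumSketch` is a hypothetical name, the input is a flat `std::vector<T>` rather than a tensor, only the serial overload is used, and returning 0 for an empty input is a choice made for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Sketch: reduce a flat buffer of any arithmetic type to a single float.
template <typename T>
float sumSketch(const std::vector<T>& values) {
    // Each element is converted to float first, then the floats are added.
    // std::transform_reduce is allowed to regroup the additions, which is
    // the property a parallel reduction relies on.
    return std::transform_reduce(values.begin(), values.end(), 0.0f,
                                 std::plus<>{},
                                 [](T v) { return static_cast<float>(v); });
}
```

Because the accumulator is a `float` with a 24-bit significand, integer totals above 16777216 are not always exactly representable, and regrouped floating-point additions can differ in the last bits from a strict left-to-right sum.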

◆ validateShapeCompatibility()

template<TensorDataType TDataType, typename TMemoryResource>
void Mila::Dnn::Compute::Cpu::MathOps::validateShapeCompatibility ( const Tensor< TDataType, TMemoryResource > & a,
const Tensor< TDataType, TMemoryResource > & b,
const Tensor< TDataType, TMemoryResource > & result,
const std::string & operation_name )
[inline] [static] [private]

Validates that three tensors have compatible shapes for element-wise operations.

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource type
Parameters
  a               First input tensor
  b               Second input tensor
  result          Result tensor
  operation_name  Name of the operation for error reporting
Exceptions
  std::invalid_argument  If shapes don't match or tensors are empty
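
A three-way shape check that uses the operation name in its error messages might look like the sketch below. All names are hypothetical (`Shape`, `elementCount`, `validateShapesSketch`), the shape is assumed to be a vector of dimension sizes, and treating a zero-sized dimension as "empty" is an assumption of this sketch rather than documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using Shape = std::vector<std::size_t>;  // hypothetical shape representation

// Number of elements implied by a shape; a shape with no dimensions is
// treated as holding zero elements in this sketch.
inline std::size_t elementCount(const Shape& shape) {
    if (shape.empty()) return 0;
    std::size_t n = 1;
    for (std::size_t d : shape) n *= d;
    return n;
}

// Throws std::invalid_argument, prefixed with the operation name, when the
// three shapes differ or describe an empty tensor.
inline void validateShapesSketch(const Shape& a, const Shape& b,
                                 const Shape& result,
                                 const std::string& operation_name) {
    if (a != b || a != result) {
        throw std::invalid_argument(operation_name + ": tensor shapes must match");
    }
    if (elementCount(a) == 0) {
        throw std::invalid_argument(operation_name + ": tensors must not be empty");
    }
}
```

Passing the operation name in lets one validator serve every binary operation while still producing messages such as `add: tensor shapes must match`.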

The documentation for this struct was generated from the following file: