Mila 0.13.48
Deep Neural Network Library
TensorOps.Math.ixx File Reference

Device-dispatching math helpers for tensor arithmetic operations.

#include <concepts>
#include <memory>
import Compute.DeviceType;
import Compute.ExecutionContext;
import Dnn.TensorOps.Base;
import Dnn.TensorDataTypeMap;
import Dnn.TensorDataTypeTraits;
import Dnn.TensorDataType;
import Dnn.Tensor;

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn

Functions

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::add (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise addition with optional ExecutionContext (device-dispatched).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::divide (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise division with optional ExecutionContext (device-dispatched).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::multiply (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise multiplication with optional ExecutionContext (device-dispatched).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
Tensor< TDataType, TMemoryResource > Mila::Dnn::operator* (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b)
 Element-wise multiplication operator (always synchronous).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
Tensor< TDataType, TMemoryResource > Mila::Dnn::operator+ (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b)
 Element-wise addition operator (always synchronous).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
Tensor< TDataType, TMemoryResource > Mila::Dnn::operator- (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b)
 Element-wise subtraction operator (always synchronous).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
Tensor< TDataType, TMemoryResource > Mila::Dnn::operator/ (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b)
 Element-wise division operator (always synchronous).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::subtract (const Tensor< TDataType, TMemoryResource > &a, const Tensor< TDataType, TMemoryResource > &b, Tensor< TDataType, TMemoryResource > &result, IExecutionContext *exec_context=nullptr)
 Element-wise subtraction with optional ExecutionContext (device-dispatched).
template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
float Mila::Dnn::sum (const Tensor< TDataType, TMemoryResource > &tensor, IExecutionContext *exec_context=nullptr)
 Sum reduction with optional ExecutionContext (device-dispatched).

Detailed Description

Device-dispatching math helpers for tensor arithmetic operations.

This partition provides the high-level, device-agnostic entry points for tensor math operations (e.g., element-wise addition). Each helper forwards to the device-specific TensorOps<ComputeDeviceTag>::... implementation (see CPU and CUDA specializations).
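The forwarding described above can be pictured as tag dispatch. The following is a minimal sketch, not Mila's actual implementation: `CpuTag`, `HostMemory`, `MiniTensor`, and the nested `ComputeDeviceTag` alias are hypothetical stand-ins, and only the `TensorOps<...>` name comes from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical device tag; the real tag types in Mila may differ.
struct CpuTag {};

// Hypothetical memory resource that names the device it belongs to.
struct HostMemory {
    using ComputeDeviceTag = CpuTag;
};

// Minimal stand-in for a tensor: flat float storage tied to a memory resource.
template <typename TMemoryResource>
struct MiniTensor {
    using MemoryResource = TMemoryResource;
    std::vector<float> data;
};

// Primary template is only declared; each device supplies a specialization.
template <typename TDeviceTag>
struct TensorOps;

// CPU specialization with a plain element-wise loop.
template <>
struct TensorOps<CpuTag> {
    template <typename TMR>
    static void add(const MiniTensor<TMR>& a, const MiniTensor<TMR>& b,
                    MiniTensor<TMR>& result) {
        for (std::size_t i = 0; i < a.data.size(); ++i) {
            result.data[i] = a.data[i] + b.data[i];
        }
    }
};

// Device-agnostic entry point: the memory resource selects the specialization.
template <typename TMR>
void add(const MiniTensor<TMR>& a, const MiniTensor<TMR>& b,
         MiniTensor<TMR>& result) {
    TensorOps<typename TMR::ComputeDeviceTag>::add(a, b, result);
}

// Usage: adds two host tensors and returns the sum of the result's elements.
inline float dispatch_demo() {
    MiniTensor<HostMemory> a{{1.0f, 2.0f}};
    MiniTensor<HostMemory> b{{10.0f, 20.0f}};
    MiniTensor<HostMemory> out{{0.0f, 0.0f}};
    add(a, b, out);
    return out.data[0] + out.data[1];
}
```

In this sketch the device is chosen entirely at compile time, so a memory resource without a matching `TensorOps` specialization fails to compile rather than failing at run time.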

The templates are constrained with isValidTensor<TDataType, TMemoryResource> to ensure the tensor configuration is valid (memory resource compatibility, type traits available, and device accessibility).

ExecutionContext handling:

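  • The optional exec_context parameter gives the caller control over the stream. The pointer is borrowed, not owned.
  • When a context is provided, the operation uses that context's stream and the caller controls synchronization.
  • When exec_context is null, the operation uses the default stream and synchronizes before returning.
  • The context is passed as a raw pointer, so the call adds no ownership or reference-counting overhead.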
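The two modes can be modeled without a GPU. The sketch below is an illustration under stated assumptions: `IExecutionContext` here is a hypothetical stand-in with invented `enqueue` and `synchronize` members, a work queue plays the role of a stream, and `default_context` is an invented helper. The real interface is not documented on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for IExecutionContext; a queue models an asynchronous stream.
class IExecutionContext {
public:
    virtual ~IExecutionContext() = default;
    virtual void enqueue(std::function<void()> work) = 0;
    virtual void synchronize() = 0;
};

class QueueContext final : public IExecutionContext {
public:
    void enqueue(std::function<void()> work) override {
        pending_.push_back(std::move(work));
    }
    void synchronize() override {
        for (auto& work : pending_) work();  // run everything queued so far
        pending_.clear();
    }
    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<std::function<void()>> pending_;
};

// Hypothetical default context used when the caller passes nullptr.
inline IExecutionContext& default_context() {
    static QueueContext ctx;
    return ctx;
}

// Borrowed raw pointer: never stored, never deleted.
inline void add(const std::vector<float>& a, const std::vector<float>& b,
                std::vector<float>& result, IExecutionContext* exec_context = nullptr) {
    IExecutionContext& ctx = exec_context ? *exec_context : default_context();
    ctx.enqueue([&a, &b, &result] {
        for (std::size_t i = 0; i < a.size(); ++i) result[i] = a[i] + b[i];
    });
    if (exec_context == nullptr) ctx.synchronize();  // null context: finish before returning
}

inline bool context_demo() {
    std::vector<float> a{1.0f, 2.0f}, b{3.0f, 4.0f}, out(2, 0.0f);
    add(a, b, out);  // no context: result is ready on return
    bool sync_ok = (out[0] == 4.0f && out[1] == 6.0f);

    QueueContext ctx;
    std::vector<float> deferred(2, 0.0f);
    add(a, b, deferred, &ctx);  // work is queued, not yet executed
    bool still_pending = (ctx.pending() == 1 && deferred[0] == 0.0f);
    ctx.synchronize();          // caller decides when to wait
    bool done = (deferred[0] == 4.0f && deferred[1] == 6.0f);
    return sync_ok && still_pending && done;
}
```

Note that in the deferred case the operands and the result must stay alive until the caller synchronizes, because the queued work refers to them.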

Usage:

  • Call add(a, b, result) for element-wise addition of two tensors with the same abstract data type and memory resource. The call is automatically dispatched to the appropriate device implementation.
  • Optionally provide ExecutionContext for explicit stream control: add(a, b, result, ctx.get())
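The usage above writes into a pre-allocated result, whereas the operator overloads in the function list return a new tensor. One plausible relationship between the two forms is sketched below. `MiniTensor` is a hypothetical stand-in, and building `operator+` on top of `add` is an assumption, not something this page states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in; Mila's Tensor also carries data-type and memory-resource parameters.
struct MiniTensor {
    std::vector<float> data;
};

// Named form: the caller supplies a pre-allocated result.
inline void add(const MiniTensor& a, const MiniTensor& b, MiniTensor& result) {
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        result.data[i] = a.data[i] + b.data[i];
    }
}

// Operator form: allocates the result itself and returns it by value.
// Assumption: it reuses the named form with no execution context.
inline MiniTensor operator+(const MiniTensor& a, const MiniTensor& b) {
    MiniTensor result{std::vector<float>(a.data.size(), 0.0f)};
    add(a, b, result);
    return result;
}

// Both forms should produce the same values.
inline bool forms_agree() {
    MiniTensor a{{1.0f, 2.0f, 3.0f}};
    MiniTensor b{{0.5f, 0.5f, 0.5f}};
    MiniTensor out{std::vector<float>(3, 0.0f)};
    add(a, b, out);
    MiniTensor via_operator = a + b;
    return out.data == via_operator.data && out.data[2] == 3.5f;
}
```

The named form suits loops that reuse one result buffer or need stream control; the operator form trades an allocation per call for more readable expressions.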

Preconditions:

  • All operands must satisfy isValidTensor and have matching shapes.
  • The result tensor must be pre-allocated with a shape matching the operands.
  • Shape validation is performed by the device-specific implementations.
  • ExecutionContext (if provided) must outlive the function call.
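The shape preconditions can be expressed as a small validation step. This is a sketch only: `MiniTensor` and `check_elementwise_preconditions` are hypothetical names, and throwing `std::invalid_argument` is an assumption, since this page does not say how Mila reports a violation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor with an explicit shape.
struct MiniTensor {
    std::vector<std::size_t> shape;
    std::vector<float> data;
};

// Assumed error policy: throw when a precondition is violated.
inline void check_elementwise_preconditions(const MiniTensor& a, const MiniTensor& b,
                                            const MiniTensor& result) {
    if (a.shape != b.shape) {
        throw std::invalid_argument("operand shapes differ");
    }
    if (result.shape != a.shape) {
        throw std::invalid_argument("result shape does not match operands");
    }
}

inline void subtract(const MiniTensor& a, const MiniTensor& b, MiniTensor& result) {
    check_elementwise_preconditions(a, b, result);
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        result.data[i] = a.data[i] - b.data[i];
    }
}

// Matching shapes: the operation runs and fills the result.
inline bool accepts_matching_shapes() {
    MiniTensor a{{2}, {5.0f, 7.0f}};
    MiniTensor b{{2}, {1.0f, 2.0f}};
    MiniTensor out{{2}, {0.0f, 0.0f}};
    subtract(a, b, out);
    return out.data[0] == 4.0f && out.data[1] == 5.0f;
}

// Mismatched operand shapes: the check rejects the call before any work is done.
inline bool rejects_mismatched_shapes() {
    MiniTensor a{{2}, {1.0f, 2.0f}};
    MiniTensor b{{3}, {1.0f, 2.0f, 3.0f}};
    MiniTensor out{{2}, {0.0f, 0.0f}};
    try {
        subtract(a, b, out);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```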