Device-dispatched TensorOps interface template.
Specialize TensorOps<TDevice> for each supported Compute::DeviceType to provide backend implementations of tensor operations (elementwise, reductions, copy, fill, etc.).
Requirements for specializations:
- Provide the operations used by the framework (static or instance methods), matching the signatures expected by TensorOps callers.
- Use the device's memory resource and execution context types to access device-specific APIs and streams.
- Respect host/device accessibility guarantees: CPU specializations must operate on host-accessible memory, CUDA specializations on device memory.
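Usage example:
template<>
struct TensorOps<DeviceType::Cpu>
{
    // Backend implementations of the tensor operations for this device.
};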
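The specialization requirements above can be illustrated with a small self-contained sketch. Everything in it is an assumption made for illustration, not the framework's actual API: the `DeviceType` enum is a stand-in for `Compute::DeviceType`, the `fill` and `sum` operations and their raw-pointer signatures are hypothetical, and `fill_and_sum` is an invented caller. Only a CPU backend is shown, working on host-accessible memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Compute::DeviceType; the real enum may differ.
enum class DeviceType { Cpu, Cuda };

// Primary template is declared but never defined, so requesting a device
// without a specialization is rejected at compile time.
template <DeviceType TDevice>
struct TensorOps;

// CPU specialization: every operation works on host-accessible memory.
template <>
struct TensorOps<DeviceType::Cpu> {
    // Hypothetical operation: set every element to `value`.
    static void fill(float* data, std::size_t count, float value) {
        assert(data != nullptr || count == 0);
        for (std::size_t i = 0; i < count; ++i) {
            data[i] = value;
        }
    }

    // Hypothetical reduction: sum of all elements.
    static float sum(const float* data, std::size_t count) {
        assert(data != nullptr || count == 0);
        float total = 0.0f;
        for (std::size_t i = 0; i < count; ++i) {
            total += data[i];
        }
        return total;
    }
};

// Hypothetical generic caller: the backend is selected at compile time
// through the TDevice template argument.
template <DeviceType TDevice>
float fill_and_sum(std::vector<float>& buffer, float value) {
    TensorOps<TDevice>::fill(buffer.data(), buffer.size(), value);
    return TensorOps<TDevice>::sum(buffer.data(), buffer.size());
}
```

Calling `fill_and_sum<DeviceType::Cpu>(buffer, 2.5f)` dispatches to the CPU specialization, while `fill_and_sum<DeviceType::Cuda>` fails to compile until a CUDA specialization is provided. Leaving the primary template undefined is one way to turn a missing backend into a compile-time error rather than a runtime one.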
Referenced symbols:
- Abstract interface providing essential tensor information and data access. Definition: ITensor.ixx:40
- void copy(const Tensor<TSrcDataType, TSrcMemoryResource>& src, Tensor<TDstDataType, TDstMemoryResource>& dst, IExecutionContext* exec_context = nullptr): Copies tensor data from the source tensor to the destination tensor, with an optional ExecutionContext. Definition: TensorOps.Transfer.ixx:88
- Device-dispatched TensorOps interface template. Definition: TensorOps-Base.ixx:44
Template parameters:
- TDevice: Compute device type to specialize for (DeviceType::Cpu, DeviceType::Cuda, ...).