Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::Compute::Cuda::ZeroOps Struct Reference [export]

Static Public Member Functions

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
static void zero (Dnn::Tensor< TDataType, TMemoryResource > &tensor, IExecutionContext *exec_context=nullptr)
 Zero the contents of a CUDA tensor buffer.

Member Function Documentation

◆ zero()

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>
void Mila::Dnn::Compute::Cuda::ZeroOps::zero ( Dnn::Tensor< TDataType, TMemoryResource > & tensor,
IExecutionContext * exec_context = nullptr )
[inline, static]

Zero the contents of a CUDA tensor buffer.

  • No-op for empty tensors.
  • Uses cudaMemsetAsync for contiguous device buffers.
  • If exec_context is provided and is a CUDA context, the zeroing is enqueued on that context's stream and the call is non-blocking. If exec_context is null, the tensor's own device is used and the operation runs synchronously on the default stream.
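
The three behaviors listed above can be sketched as a small host-only model. This is not the Mila implementation: `zero_sketch`, `ExecContextSketch`, `CallLog`, `fake_memset_async` and `fake_stream_synchronize` are hypothetical names introduced here so the control flow can run without a GPU. In real CUDA code the two `fake_*` helpers would correspond to `cudaMemsetAsync` and `cudaStreamSynchronize`, and the sketch assumes that any context passed in is a CUDA context. Device selection for the null-context path is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for cudaStream_t; 0 plays the role of the default stream.
using StreamId = int;
constexpr StreamId kDefaultStream = 0;

// Records which runtime calls the sketch would have issued, in order.
struct CallLog {
    std::vector<std::string> calls;
};

// Stand-in for cudaMemsetAsync: fills host memory and logs the target stream.
inline void fake_memset_async(void* dst, int value, std::size_t bytes,
                              StreamId stream, CallLog& log) {
    std::memset(dst, value, bytes);
    log.calls.push_back("memset_async(stream=" + std::to_string(stream) + ")");
}

// Stand-in for cudaStreamSynchronize: only logs the call.
inline void fake_stream_synchronize(StreamId stream, CallLog& log) {
    log.calls.push_back("synchronize(stream=" + std::to_string(stream) + ")");
}

// Hypothetical minimal execution context that carries only a stream.
struct ExecContextSketch {
    StreamId stream;
};

// Models the documented behavior of zero() on a contiguous buffer.
inline void zero_sketch(void* data, std::size_t bytes,
                        const ExecContextSketch* ctx, CallLog& log) {
    if (bytes == 0) {
        return;  // no-op for empty tensors
    }
    assert(data != nullptr);  // a non-empty tensor must have a buffer

    if (ctx != nullptr) {
        // Context supplied: enqueue on its stream and return without waiting.
        fake_memset_async(data, 0, bytes, ctx->stream, log);
    } else {
        // No context: use the default stream and wait for completion.
        fake_memset_async(data, 0, bytes, kDefaultStream, log);
        fake_stream_synchronize(kDefaultStream, log);
    }
}
```

With a context, the log holds a single `memset_async` entry for that context's stream. Without one, it holds a `memset_async` entry for stream 0 followed by a `synchronize` entry, which is what makes the null-context path blocking.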

Preconditions:

  • The tensor buffer must be a single contiguous allocation (TensorBuffer guarantees this by design).

Template Parameters
  TDataType        Abstract tensor data type
  TMemoryResource  Memory resource type backing the tensor
Parameters
  tensor        Destination CUDA tensor to zero
  exec_context  Optional execution context (borrowed). When provided and recognized as a CUDA context, the zeroing is scheduled on that context's stream.

The documentation for this struct was generated from the following file: