Static Public Member Functions
template<TensorDataType TDataType, typename TMemoryResource> requires isValidTensor<TDataType, TMemoryResource>
static void	zero (Dnn::Tensor< TDataType, TMemoryResource > &tensor, IExecutionContext *exec_context=nullptr)
	Zero the contents of a CUDA tensor buffer.

Member Function Documentation

zero()

template<TensorDataType TDataType, typename TMemoryResource>
requires isValidTensor<TDataType, TMemoryResource>

void Mila::Dnn::Compute::Cuda::ZeroOps::zero	(	Dnn::Tensor< TDataType, TMemoryResource > &	tensor,
		IExecutionContext *	exec_context = nullptr )

inline static

Zero the contents of a CUDA tensor buffer.

No-op for empty tensors.
Uses cudaMemsetAsync for contiguous device buffers.
If exec_context is provided and is a CUDA context, the zeroing is enqueued on that context's stream and the call does not block. If exec_context is null, the device associated with the tensor is used and the zeroing runs synchronously on the default stream.

Preconditions:

The tensor buffer must be a single contiguous allocation (true of TensorBuffer by design).

Template Parameters

TDataType	Abstract tensor data type
TMemoryResource	Memory resource type backing the tensor

Parameters

tensor	Destination CUDA tensor to zero
exec_context	Optional execution context (borrowed). When provided and recognized as a CUDA context, the zeroing is scheduled on that context's stream.


The documentation for this struct was generated from the following file:

/__w/Mila/Mila/Mila/Src/Dnn/Compute/Devices/Cuda/Tensors/Operations/CudaTensorOps.Zero.ixx