Mila 0.13.48
Deep Neural Network Library
CudaTensorOps.Zero.ixx File Reference

CUDA fast zeroing partition for tensor buffers.

#include <cuda_runtime.h>
#include <cstring>
#include <type_traits>
#include <stdexcept>
import Cuda.Helpers;
import Compute.DeviceId;
import Compute.DeviceType;
import Compute.ExecutionContext;
import Dnn.TensorDataTypeMap;
import Compute.IExecutionContext;
import Dnn.TensorDataType;
import Cuda.Error;
import Dnn.TensorDataTypeTraits;
import Dnn.Tensor;

Classes

struct  Mila::Dnn::Compute::Cuda::ZeroOps

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn
namespace  Mila::Dnn::Compute
namespace  Mila::Dnn::Compute::Cuda

Detailed Description

CUDA fast zeroing partition for tensor buffers.

Provides a fast, device-dispatched zero() operation that clears contiguous CUDA buffers with cudaMemsetAsync. The operation performs no allocations. It accepts an optional execution context, which lets the zeroing be enqueued on the caller's stream without blocking the host.
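The underlying technique can be illustrated with the CUDA runtime API alone. The following is a minimal sketch, not Mila's implementation: the helper names zero_device_buffer and zero_roundtrip_ok are hypothetical, and a raw cudaStream_t stands in for the execution context, whose interface is not shown on this page. It assumes the buffer is contiguous device memory and that an all-zero bit pattern is the desired zero value, as it is for IEEE floats and integers.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration, not Mila's API):
// enqueue a zero fill of `bytes` bytes of contiguous device memory on `stream`.
// No memory is allocated here. A null stream selects the default stream.
inline cudaError_t zero_device_buffer(void* device_ptr, std::size_t bytes,
                                      cudaStream_t stream = nullptr) {
    if (device_ptr == nullptr || bytes == 0) {
        return cudaSuccess;  // nothing to clear
    }
    // cudaMemsetAsync is asynchronous with respect to the host, so this call
    // may return before the fill has completed on the device.
    return cudaMemsetAsync(device_ptr, 0, bytes, stream);
}

// Hypothetical round-trip check: allocate, zero on a caller-owned stream,
// copy back on the same stream, and verify every element is 0.0f.
inline bool zero_roundtrip_ok(std::size_t count) {
    const std::size_t bytes = count * sizeof(float);
    float* d_data = nullptr;
    cudaStream_t stream = nullptr;

    if (cudaMalloc(reinterpret_cast<void**>(&d_data), bytes) != cudaSuccess) {
        return false;
    }
    if (cudaStreamCreate(&stream) != cudaSuccess) {
        cudaFree(d_data);
        return false;
    }

    std::vector<float> host(count, 1.0f);  // non-zero sentinel values
    bool ok = zero_device_buffer(d_data, bytes, stream) == cudaSuccess
           && cudaMemcpyAsync(host.data(), d_data, bytes,
                              cudaMemcpyDeviceToHost, stream) == cudaSuccess
           // Work on one stream runs in order, so a single synchronize
           // covers both the fill and the copy.
           && cudaStreamSynchronize(stream) == cudaSuccess;

    for (std::size_t i = 0; ok && i < count; ++i) {
        ok = (host[i] == 0.0f);
    }

    cudaStreamDestroy(stream);
    cudaFree(d_data);
    return ok;
}
```

Because the fill is only enqueued, the caller must synchronize the stream (or order later work on the same stream) before reading the buffer from the host. Returning cudaError_t rather than throwing keeps the sketch independent of any particular error-handling policy.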