Mila 0.13.48
Deep Neural Network Library
CUDA fast zeroing partition for tensor buffers.
#include <cuda_runtime.h>
#include <cstring>
#include <type_traits>
#include <stdexcept>
import Cuda.Helpers;
import Compute.DeviceId;
import Compute.DeviceType;
import Compute.ExecutionContext;
import Dnn.TensorDataTypeMap;
import Compute.IExecutionContext;
import Dnn.TensorDataType;
import Cuda.Error;
import Dnn.TensorDataTypeTraits;
import Dnn.Tensor;
Classes | |
| struct | Mila::Dnn::Compute::Cuda::ZeroOps |
Namespaces | |
| namespace | Mila |
| Mila main API namespace. | |
| namespace | Mila::Dnn |
| namespace | Mila::Dnn::Compute |
| namespace | Mila::Dnn::Compute::Cuda |
CUDA fast zeroing partition for tensor buffers.
Provides a device-dispatched fast zero() operation that uses cudaMemsetAsync for contiguous CUDA buffers. The operation is allocation-free and accepts an optional execution context to perform non-blocking zeroing on the caller's stream.
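The description above can be illustrated with a minimal sketch. It is not Mila's ZeroOps implementation: SketchDevice, SketchContext, zero_bytes and the SKETCH_WITH_CUDA macro are hypothetical names introduced here, and the choice to synchronize when no context is supplied is an assumption. Only cudaMemsetAsync, cudaStreamSynchronize and cudaGetErrorString are real CUDA runtime calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical macro: define it when building against the CUDA toolkit.
#ifdef SKETCH_WITH_CUDA
#include <cuda_runtime.h>
#endif

// Hypothetical stand-ins; Mila's DeviceType and ExecutionContext are not shown on this page.
enum class SketchDevice { Cpu, Cuda };

struct SketchContext {
#ifdef SKETCH_WITH_CUDA
    cudaStream_t stream = nullptr;  // caller-owned stream to enqueue work on
#endif
};

// Zero `bytes` bytes starting at `data`, dispatching on the device kind.
// No memory is allocated. On the CUDA path:
//   - with a context, the memset is enqueued on ctx->stream and the call
//     returns without synchronizing (non-blocking for the host);
//   - without a context (assumption of this sketch), the default stream is
//     used and synchronized, so the buffer is zero when the call returns.
inline void zero_bytes(void* data, std::size_t bytes, SketchDevice device,
                       const SketchContext* ctx = nullptr) {
    if (bytes == 0) return;
    assert(data != nullptr);

    switch (device) {
    case SketchDevice::Cpu:
        std::memset(data, 0, bytes);
        return;
    case SketchDevice::Cuda:
#ifdef SKETCH_WITH_CUDA
    {
        cudaStream_t stream = ctx ? ctx->stream : nullptr;
        cudaError_t err = cudaMemsetAsync(data, 0, bytes, stream);
        if (err == cudaSuccess && ctx == nullptr) {
            err = cudaStreamSynchronize(stream);
        }
        if (err != cudaSuccess) {
            throw std::runtime_error(cudaGetErrorString(err));
        }
        return;
    }
#else
        (void)ctx;
        throw std::runtime_error("built without CUDA support");
#endif
    }
}
```

A caller that already owns a stream would pass a context so the zeroing is ordered with its other work on that stream. Anything that reads the buffer afterwards must be enqueued on the same stream or wait for it to complete. Byte-wise zeroing is only valid for a contiguous buffer whose element type represents zero as all-zero bytes, which holds for the usual integer and IEEE floating-point formats.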