Mila 0.13.48
Deep Neural Network Library
CudaTensorOps.Transfer.ixx File Reference

CUDA tensor transfer operations partition.

#include <cuda_runtime.h>
#include <memory>
#include <format>
#include <stdexcept>
#include <source_location>
#include <cstring>
#include "Kernels/Transfer.Copy.h"
import Cuda.Error;
import Dnn.Tensor;
import Dnn.TensorDataTypeMap;
import Dnn.TensorDataType;
import Compute.IExecutionContext;
import Compute.CudaManagedMemoryResource;
import Dnn.TensorDataTypeTraits;
import Compute.CudaTensorDataType;
import Dnn.ITensor;
import Compute.CudaPinnedMemoryResource;
import Compute.CudaDeviceMemoryResource;
import Compute.ExecutionContext;
import Compute.DeviceType;
import Serialization.Tensor;

Classes

struct  Mila::Dnn::Compute::Cuda::TransferOps
 CUDA specialization of TensorOps for tensor transfer operations.

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn
namespace  Mila::Dnn::Compute
namespace  Mila::Dnn::Compute::Cuda

Detailed Description

CUDA tensor transfer operations partition.

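Provides tensor transfer operations that use an ExecutionContext for stream management. TensorOps operate directly on tensor data (via data()) and accept an optional ExecutionContext for explicit stream control. The context is borrowed rather than owned (zero-overhead borrowing semantics), so the caller remains responsible for keeping it alive for the duration of the call.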
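The borrowing pattern described above can be sketched in plain C++. This is a minimal, host-only illustration and not Mila's implementation: `StreamHandle`, `FakeExecutionContext`, `resolveStream`, and `copyFloats` are hypothetical names introduced here, and `std::memcpy` stands in for an asynchronous copy issued on a CUDA stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-ins. A real implementation would use cudaStream_t and
// the library's own ExecutionContext, whose interface is not shown here.
using StreamHandle = int;
constexpr StreamHandle kDefaultStream = 0;  // plays the role of the default CUDA stream

struct FakeExecutionContext {
    StreamHandle stream;
    StreamHandle getStream() const { return stream; }
};

// Pick the stream to use: borrow it from the context when one is supplied,
// otherwise fall back to the default stream. No ownership is taken.
inline StreamHandle resolveStream(const FakeExecutionContext* ctx) {
    return ctx != nullptr ? ctx->getStream() : kDefaultStream;
}

// Copy `count` floats and return the stream the copy was issued on.
// std::memcpy is a placeholder for an asynchronous device copy.
inline StreamHandle copyFloats(const float* src, float* dst, std::size_t count,
                               const FakeExecutionContext* ctx = nullptr) {
    const StreamHandle stream = resolveStream(ctx);
    std::memcpy(dst, src, count * sizeof(float));
    return stream;
}
```

Calling `copyFloats(src, dst, n)` without a context uses the default stream, while `copyFloats(src, dst, n, &ctx)` uses the stream held by `ctx`. Because the pointer is only read during the call, the caller must keep the context alive until the call returns.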

Implementation strategy:

  • Raw pointer semantics for ExecutionContext (non-owning borrow)
  • Automatic fallback to the default CUDA stream when no context is provided
  • Stream-based asynchronous execution for pipeline optimization
  • Automatic type conversion using CUDA kernels
  • Memory-efficient staging for host-device transfers with conversion
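One possible reading of the last two points is a chunked transfer that converts elements through a bounded staging buffer instead of allocating a full-size temporary. The sketch below is host-only and makes several assumptions: `floatToBf16Bits` and `stagedConvertCopy` are hypothetical names, the conversion runs on the CPU rather than in a CUDA kernel, and `std::memcpy` stands in for the host-to-device copy. How Mila actually stages and converts data is not shown on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Truncate a float to bfloat16 by keeping the upper 16 bits of its
// IEEE-754 representation (no rounding; illustration only).
inline std::uint16_t floatToBf16Bits(float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    return static_cast<std::uint16_t>(bits >> 16);
}

// Convert `count` floats to bfloat16 bits and copy them to `dst` in chunks
// of at most `stagingElems` elements. Returns the number of chunks copied.
// The staging buffer is bounded, so extra memory does not grow with `count`.
inline std::size_t stagedConvertCopy(const float* src, std::uint16_t* dst,
                                     std::size_t count,
                                     std::size_t stagingElems) {
    if (stagingElems == 0) {
        throw std::invalid_argument("stagingElems must be greater than zero");
    }
    std::vector<std::uint16_t> staging(std::min(count, stagingElems));
    std::size_t chunks = 0;
    for (std::size_t offset = 0; offset < count; offset += stagingElems) {
        const std::size_t n = std::min(stagingElems, count - offset);
        for (std::size_t i = 0; i < n; ++i) {
            staging[i] = floatToBf16Bits(src[offset + i]);
        }
        // Placeholder for a host-to-device copy of one staged chunk.
        std::memcpy(dst + offset, staging.data(), n * sizeof(std::uint16_t));
        ++chunks;
    }
    return chunks;
}
```

With five source elements and a staging capacity of two, the loop issues three chunk copies while never holding more than two converted elements in the staging buffer.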