Mila 0.13.48
Deep Neural Network Library
CUDA implementation of the Rope (rotary positional embedding) operation.
#include <cuda_fp16.h>
#include <string>
#include <stdexcept>
#include <cstdint>
#include <format>
#include <sstream>
#include <iostream>
import Cuda.Debug;
import Compute.CudaTensorDataType;
import Compute.CudaDeviceMemoryResource;
import Compute.OperationType;
import Compute.ExecutionContext;
import Dnn.TensorTypes;
import Logging.Logger;
import Dnn.TensorDataType;
import Compute.OperationRegistrarHelpers;
import Compute.CudaRopeOp:Dispatch;
import Compute.IExecutionContext;
import Compute.IPositionalPairedOp;
import Dnn.Component;
import Dnn.Tensor;
import Compute.PairedOperation;
import Dnn.Components.RopeConfig;
import Dnn.ITensor;
import Dnn.TensorDataTypeTraits;
import Compute.DeviceType;
Classes | |
| class | Mila::Dnn::Compute::Cuda::Rope::CudaRopeOp< TComputePrecision > |
| CUDA implementation of the Rope (rotary positional embedding) operation. | |
| class | Mila::Dnn::Compute::Cuda::Rope::CudaRopeOpRegistrar |
Namespaces | |
| namespace | Mila |
| Mila main API namespace. | |
| namespace | Mila::Dnn |
| namespace | Mila::Dnn::Compute |
| namespace | Mila::Dnn::Compute::Cuda |
| namespace | Mila::Dnn::Compute::Cuda::Rope |
CUDA implementation of the Rope (rotary positional embedding) operation.
Applies RoPE to the projected Q and K tensors in preparation for grouped-query attention (GQA). Supports full-sequence forward and backward passes, chunked prefill with a position offset, and single-token decode via IPositionalPairedOp.
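The rotation and the role of the position offset can be illustrated with a small CPU reference. This is a minimal sketch under stated assumptions, not Mila's implementation or API: the names `rope_rotate_head` and `rope_apply` are hypothetical, the buffer layout is assumed to be row-major `[seq_len, num_heads, head_dim]`, channel pairing is assumed to be interleaved `(x[2i], x[2i+1])`, and the frequency base is assumed to be 10000. The actual CUDA kernels may use a different layout or pairing convention.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of Mila).
// Rotates one head vector of length head_dim in place for absolute position pos.
// Assumes interleaved pairing (x[2i], x[2i+1]) and a frequency base of 10000.
inline void rope_rotate_head(float* x, std::size_t head_dim, std::size_t pos,
                             float base = 10000.0f)
{
    assert(head_dim % 2 == 0);
    for (std::size_t i = 0; i < head_dim / 2; ++i) {
        // Angle for pair i: pos * base^(-2i / head_dim).
        const float exponent =
            -2.0f * static_cast<float>(i) / static_cast<float>(head_dim);
        const float theta = static_cast<float>(pos) * std::pow(base, exponent);
        const float c = std::cos(theta);
        const float s = std::sin(theta);
        const float x0 = x[2 * i];
        const float x1 = x[2 * i + 1];
        x[2 * i]     = x0 * c - x1 * s;
        x[2 * i + 1] = x0 * s + x1 * c;
    }
}

// Hypothetical reference helper (not part of Mila).
// Applies the rotation to a row-major [seq_len, num_heads, head_dim] buffer.
// position_offset is the absolute position of the first token in the buffer:
//   0 for a full sequence, the number of tokens already processed for a
//   prefill chunk, and the current position for single-token decode
//   (seq_len == 1).
inline void rope_apply(std::vector<float>& t, std::size_t seq_len,
                       std::size_t num_heads, std::size_t head_dim,
                       std::size_t position_offset = 0)
{
    assert(t.size() == seq_len * num_heads * head_dim);
    for (std::size_t tok = 0; tok < seq_len; ++tok) {
        for (std::size_t h = 0; h < num_heads; ++h) {
            float* head = &t[(tok * num_heads + h) * head_dim];
            rope_rotate_head(head, head_dim, position_offset + tok);
        }
    }
}
```

Under these assumptions, Q and K would each be passed through `rope_apply` with their own head counts, since with GQA the number of K heads is typically smaller than the number of Q heads. Processing a sequence in chunks with the matching offsets gives the same result as one full-sequence call, and because each pair undergoes a plain 2D rotation, the corresponding backward step is the same rotation with the angle negated.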