Mila 0.13.48
Deep Neural Network Library
CUDA implementation of the TokenEmbedding operation.
#include <cuda_fp16.h>
#include <string>
#include <stdexcept>
#include <cstdint>
#include <format>
#include <sstream>
import Cuda.Debug;
import Compute.OperationRegistrarHelpers;
import Compute.CudaTensorDataType;
import Compute.CudaDeviceMemoryResource;
import Compute.CudaTokenEmbeddingOp:Dispatch;
import Dnn.TensorDataType;
import Compute.ExecutionContext;
import Dnn.Tensor;
import Dnn.Components.TokenEmbeddingConfig;
import Compute.DeviceType;
import Dnn.ITensor;
import Compute.OperationType;
import Dnn.TensorTypes;
import Dnn.Component;
import Dnn.TensorDataTypeTraits;
import Compute.UnaryOperation;
import Compute.IExecutionContext;
Classes | |
| class | Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOp< TInput, TPrecision > |
| class | Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOpRegistrar |
Namespaces | |
| namespace | Mila |
| Mila main API namespace. | |
| namespace | Mila::Dnn |
| namespace | Mila::Dnn::Compute |
| namespace | Mila::Dnn::Compute::Cuda |
| namespace | Mila::Dnn::Compute::Cuda::TokenEmbedding |
CUDA implementation of the TokenEmbedding operation.
Pure vocabulary lookup: output[b,t,:] = wte[X[b,t],:], where X holds the token indices and wte is the embedding table. The operation adds no positional information; positional encoding is handled downstream by a dedicated encoding component (RoPE, ALiBi, or Learned).
Template Parameters
| TInput | Data type of token index input (INT32). |
| TPrecision | Precision of embedding output (FP32 or FP16). |
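The lookup rule output[b,t,:] = wte[X[b,t],:] can be illustrated with a small host-side reference. This is only a sketch and not Mila's CUDA kernel. The function name `token_embedding_reference`, the flat row-major layouts, the dimension names B, T, V, C, and the bounds check on token ids are all assumptions made for illustration. Only the FP32 case is shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical CPU reference for the lookup (not part of the Mila API).
// Assumed layouts, all row-major:
//   X      : token ids,       shape [B, T]
//   wte    : embedding table, shape [V, C]
//   return : embeddings,      shape [B, T, C]
std::vector<float> token_embedding_reference(
    const std::vector<std::int32_t>& X,
    const std::vector<float>& wte,
    std::size_t B, std::size_t T, std::size_t V, std::size_t C)
{
    if (X.size() != B * T) {
        throw std::invalid_argument("X must hold B*T token ids");
    }
    if (wte.size() != V * C) {
        throw std::invalid_argument("wte must hold V*C values");
    }

    std::vector<float> output(B * T * C);

    // Each (b, t) position copies one full row of the table.
    for (std::size_t bt = 0; bt < B * T; ++bt) {
        const std::int32_t token = X[bt];

        // Assumption: out-of-vocabulary ids are rejected. The real
        // operation may handle them differently.
        if (token < 0 || static_cast<std::size_t>(token) >= V) {
            throw std::out_of_range("token id outside vocabulary");
        }

        const float* src = wte.data() + static_cast<std::size_t>(token) * C;
        float* dst = output.data() + bt * C;
        for (std::size_t c = 0; c < C; ++c) {
            dst[c] = src[c];  // output[b,t,c] = wte[X[b,t],c]
        }
    }
    return output;
}
```

Because each output row depends only on its own token id, the work is independent across (b, t) positions. That independence is what makes the operation a natural fit for a GPU kernel.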