|
Mila 0.13.48
Deep Neural Network Library
|


Public Types | |
| using | ConfigType = TokenEmbeddingConfig |
| using | CudaExecutionContext = ExecutionContext<DeviceType::Cuda> |
| using | MR = CudaDeviceMemoryResource |
| using | NativeType = typename Mila::Dnn::Compute::Cuda::TensorDataTypeMap<TPrecision>::device_type |
| using | TensorType = Tensor<TPrecision, MR> |
| using | UnaryOperationBase = UnaryOperation<DeviceType::Cuda, TInput, TPrecision> |
| Public Types inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cuda, TInput, TInput > | |
| using | MR |
| using | TensorInputType |
| using | TensorOutputType |
| Public Types inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| using | DataTypeTraits |
Public Member Functions | |
| CudaTokenEmbeddingOp (IExecutionContext *context, const TokenEmbeddingConfig &config) | |
| void | backward (const ITensor &input, const ITensor &output_grad, ITensor &input_grad) const override |
| Backward pass accumulating gradients into wte (hot path). | |
| void | build (const BuildContext &config) override |
| Prepare the operation for a concrete input shape (cold path). | |
| void | decode (const ITensor &input, ITensor &output) const |
| Single-token decode pass (hot path). | |
| void | forward (const ITensor &input, ITensor &output) const override |
| Full-sequence forward pass (hot path). | |
| std::string | getName () const override |
| Human-readable operation name. | |
| OperationType | getOperationType () const override |
| Operation type identifier. | |
| void | setGradients (ITensor *wte_grad, ITensor *) override |
| Bind the wte gradient tensor for training (module retains ownership). | |
| void | setParameters (ITensor *wte, ITensor *) override |
| Bind the wte parameter tensor (module retains ownership). | |
| Public Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cuda, TInput, TInput > | |
| virtual | ~UnaryOperation ()=default |
| Public Member Functions inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| virtual | ~Operation ()=default |
| virtual void | clearGradients () noexcept |
| Clear any cached gradient pointers held by the operation. | |
| virtual TensorDataType | getDataType () const |
| Tensor data type for this operation. | |
| virtual DeviceType | getDeviceType () const |
| Device type for this operation. | |
| virtual std::size_t | getStateMemorySize () const |
| Returns the number of bytes of state memory allocated by this operation. | |
| virtual bool | isBuilt () const |
| Whether build() completed successfully for a concrete input shape. | |
| virtual bool | isEvalMode () const |
| Query whether operation is configured for training. | |
| virtual void | setTrainingMode (TrainingMode training_mode) |
| Configure operation training-mode behavior. | |
Private Member Functions | |
| void | validateInputShape (const shape_t &shape) const |
| void | validateRuntimeShape (int B, int T) const |
Private Attributes | |
| int | batch_size_ { 0 } |
| TokenEmbeddingConfig | config_ |
| CudaExecutionContext * | context_ |
| int | embedding_dim_ { 0 } |
| int | seq_length_ { 0 } |
| int | vocab_size_ { 0 } |
| NativeType * | wte_ { nullptr } |
| NativeType * | wte_grad_ { nullptr } |
Additional Inherited Members | |
| Static Public Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| static constexpr TensorDataType | data_type |
| static constexpr DeviceType | device_type |
| Static Protected Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cuda, TInput, TInput > | |
| static const TensorInputType & | asInputTensor (const ITensor &t) |
| static TensorOutputType & | asOutputTensor (ITensor &t) |
| Protected Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| bool | is_built_ |
| TrainingMode | training_mode_ |
| using Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOp< TInput, TPrecision >::ConfigType = TokenEmbeddingConfig |
| using Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOp< TInput, TPrecision >::CudaExecutionContext = ExecutionContext<DeviceType::Cuda> |
| using Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOp< TInput, TPrecision >::MR = CudaDeviceMemoryResource |
| using Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOp< TInput, TPrecision >::NativeType = typename Mila::Dnn::Compute::Cuda::TensorDataTypeMap<TPrecision>::device_type |
| using Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOp< TInput, TPrecision >::TensorType = Tensor<TPrecision, MR> |
| using Mila::Dnn::Compute::Cuda::TokenEmbedding::CudaTokenEmbeddingOp< TInput, TPrecision >::UnaryOperationBase = UnaryOperation<DeviceType::Cuda, TInput, TPrecision> |
|
inline |
|
inlineoverridevirtual |
Backward pass accumulating gradients into wte (hot path).
Token indices are non-differentiable; input_grad is unused.
| input | Token indices used in forward [B, T] (INT32). |
| output_grad | Upstream embedding gradient [B, T, C]. |
| input_grad | Unused (non-differentiable input). |
Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cuda, TInput, TInput >.
|
inlineoverridevirtual |
Prepare the operation for a concrete input shape (cold path).
| input_shape | Token index input shape [B, T]. |
| std::runtime_error | if wte is not bound. |
| std::invalid_argument | if input shape is invalid. |
Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision >.
|
inline |
Single-token decode pass (hot path).
Computes output[b,:] = wte[X[b,0],:] for each batch element. No position argument — positional encoding is handled downstream.
| input | Single-token indices [B, 1] (INT32). |
| output | Pre-allocated output buffer [B, C]. |
|
inlineoverridevirtual |
Full-sequence forward pass (hot path).
For each (b, t): output[b,t,:] = wte[X[b,t],:].
| input | Token indices [B, T] (INT32). |
| output | Pre-allocated embeddings [B, T, C]. |
Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cuda, TInput, TInput >.
|
inlineoverridevirtual |
Human-readable operation name.
Implements Mila::Dnn::Compute::Operation< TDeviceType, TPrecision >.
|
inlineoverridevirtual |
Operation type identifier.
Implements Mila::Dnn::Compute::Operation< TDeviceType, TPrecision >.
|
inlineoverridevirtual |
Bind the wte gradient tensor for training (module retains ownership).
| wte_grad | Gradient buffer for wte — CUDA tensor of shape [vocab_size, C]. |
| std::invalid_argument | on null or non-CUDA tensor. |
Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision >.
|
inlineoverridevirtual |
Bind the wte parameter tensor (module retains ownership).
| wte | Token embedding table — CUDA tensor of shape [vocab_size, C]. |
| std::invalid_argument | on null, non-CUDA, or shape-mismatched tensor. |
Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision >.
|
inlineprivate |

|
inlineprivate |

|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |