|
Mila 0.13.48
Deep Neural Network Library
|
CPU implementation of Layer Normalization using abstract TensorDataType API. More...


Public Types | |
| using | CpuExecutionContext = ExecutionContext<DeviceType::Cpu> |
| using | MR = CpuMemoryResource |
| using | TensorType = Tensor<TensorDataType::FP32, MR> |
| using | UnaryOperationBase = UnaryOperation<DeviceType::Cpu, TensorDataType::FP32> |
| Public Types inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 > | |
| using | MR |
| using | TensorInputType |
| using | TensorOutputType |
| Public Types inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| using | DataTypeTraits |
Public Member Functions | |
| CpuLayerNormOp (IExecutionContext *context, const LayerNormConfig &config) | |
| void | backward (const ITensor &input, const ITensor &output_grad, ITensor &input_grad) const override |
| Backward pass - compute gradients for input and parameters. | |
| void | build (const BuildContext &config) override |
| Build the operation for a concrete input shape. | |
| void | forward (const ITensor &input, ITensor &output) const override |
| Forward pass - normalize input and apply learned affine transform. | |
| std::string | getName () const override |
| Human-readable operation name. | |
| OperationType | getOperationType () const override |
| Operation type identifier. | |
| void | setGradients (ITensor *weight_grad, ITensor *bias_grad) override |
| Set parameter gradient tensor references for training. | |
| void | setParameters (ITensor *weight, ITensor *bias) override |
| Set parameter tensor references (module remains owner). | |
| Public Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 > | |
| virtual | ~UnaryOperation ()=default |
| Public Member Functions inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| virtual | ~Operation ()=default |
| virtual void | clearGradients () noexcept |
| Clear any cached gradient pointers held by the operation. | |
| virtual TensorDataType | getDataType () const |
| Tensor data type for this operation. | |
| virtual DeviceType | getDeviceType () const |
| Device type for this operation. | |
| virtual std::size_t | getStateMemorySize () const |
| Returns the number of bytes of state memory allocated by this operation. | |
| virtual bool | isBuilt () const |
| Whether build() completed successfully for a concrete input shape. | |
| virtual bool | isEvalMode () const |
| Query whether operation is configured for training. | |
| virtual void | setTrainingMode (TrainingMode training_mode) |
| Configure operation training-mode behavior. | |
Private Member Functions | |
| void | allocateStatisticsTensors () |
| Allocate backend-owned statistics tensors (mean and reciprocal std dev). | |
| void | computeAxisPartitioning (const shape_t &input_shape) |
| Compute axis partitioning for statistics computation. | |
| void | validateInputShape (const shape_t &input_shape) const |
| Validate input shape matches configuration. | |
| void | validateShapeConsistency (const shape_t &shape) const |
| Validate input shape consistency with cached build-time dimensions. | |
Private Attributes | |
| int64_t | axis_ { -1 } |
| float * | bias_ { nullptr } |
| float * | bias_grad_ { nullptr } |
| LayerNormConfig | config_ |
| IExecutionContext * | context_ { nullptr } |
| int64_t | dim_size_ { 1 } |
| int64_t | expected_slices_ { 0 } |
| int64_t | inner_size_ { 1 } |
| std::shared_ptr< TensorType > | mean_ { nullptr } |
| int64_t | outer_size_ { 1 } |
| std::shared_ptr< TensorType > | rstd_ { nullptr } |
| float * | weight_ { nullptr } |
| float * | weight_grad_ { nullptr } |
Additional Inherited Members | |
| Static Public Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| static constexpr TensorDataType | data_type |
| static constexpr DeviceType | device_type |
| Static Protected Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 > | |
| static const TensorInputType & | asInputTensor (const ITensor &t) |
| static TensorOutputType & | asOutputTensor (ITensor &t) |
| Protected Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| bool | is_built_ |
| TrainingMode | training_mode_ |
CPU implementation of Layer Normalization using abstract TensorDataType API.
Uses proper Tensor instances for all internal state including statistics (mean/rstd), ensuring architectural consistency with the rest of the framework.
| using Mila::Dnn::Compute::CpuLayerNormOp::UnaryOperationBase = UnaryOperation<DeviceType::Cpu, TensorDataType::FP32> |
|
inline |
|
inlineprivate |
Allocate backend-owned statistics tensors (mean and reciprocal std dev).

|
inlineoverridevirtual |
Backward pass - compute gradients for input and parameters.
Parameter gradients are written directly to the pointers provided via setGradients() (weight_grad_, bias_grad_).
Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >.

|
inlineoverridevirtual |
Build the operation for a concrete input shape.
Requires:
Allocates backend-owned tensor storage for mean/rstd statistics sized to the outer grouping implied by the input shape and normalized axes.
Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TInput >.

|
inlineprivate |
Compute axis partitioning for statistics computation.


|
inlineoverridevirtual |
Forward pass - normalize input and apply learned affine transform.
Uses cached parameter raw pointers (weight_, bias_) and backend-owned mean/rstd tensor storage allocated during build().
Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >.

|
inlineoverridevirtual |
Human-readable operation name.
Implements Mila::Dnn::Compute::Operation< TDeviceType, TInput >.
|
inlineoverridevirtual |
Operation type identifier.
Implements Mila::Dnn::Compute::Operation< TDeviceType, TInput >.
|
inlineoverridevirtual |
Set parameter gradient tensor references for training.
The operation caches native gradient pointers for hot-path write access during backward(). Weight gradient is required; bias gradient is bound only when the LayerNorm config indicates a bias is present.
| weight_grad | Gradient tensor for weight parameter |
| bias_grad | Gradient tensor for bias parameter (optional based on config) |
| std::invalid_argument | If weight_grad is null |
| std::invalid_argument | If bias_grad is null when config requires bias |
Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TInput >.

|
inlineoverridevirtual |
Set parameter tensor references (module remains owner).
The operation caches native data pointers for hot-path access. The weight tensor is required; bias is bound only when the LayerNorm config indicates a bias is present.
Note: build() requires parameters to be bound before it is called.
Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TInput >.

|
inlineprivate |
Validate input shape matches configuration.


|
inlineprivate |
Validate input shape consistency with cached build-time dimensions.


|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |
|
private |