|
Mila 0.13.48
Deep Neural Network Library
|
Device-templated Layer Normalization component.


Public Types | |
| using | ComponentBase = Component<TDeviceType, TPrecision> |
| using | MR = typename DeviceTypeTraits<TDeviceType>::memory_resource |
| using | TensorType = Tensor<TPrecision, MR> |
Public Member Functions | |
| LayerNorm (const std::string &name, const LayerNormConfig &config, std::optional< DeviceId > device_id=std::nullopt) | |
| Construct LayerNorm with optional ExecutionContext ownership. | |
| ~LayerNorm () override=default | |
| TensorType & | backward (const TensorType &input, const TensorType &output_grad) |
| Run backward pass and return a reference to the component-owned input-gradient tensor. | |
| TensorType & | forward (const TensorType &input) |
| Run forward pass and return a reference to the component-owned output tensor. | |
| DeviceId | getDeviceId () const override |
| Get the compute device id associated with this component. | |
| std::vector< ITensor * > | getGradients () const override |
| Return non-owning pointers to parameter gradient tensors. | |
| MemoryStats | getMemoryStats () const override |
| Return the current memory allocation breakdown for this component. | |
| std::vector< ITensor * > | getParameters () const override |
| Return non-owning pointers to parameter tensors. | |
| const ComponentType | getType () const override |
| Get the component type identifier. | |
| void | loadParameter (const std::string &name, const ITensorBlob &blob) override |
| Load a parameter from serialized tensor data. | |
| size_t | parameterCount () const override |
| Return number of trainable parameters. | |
| void | save_ (ModelArchive &archive, SerializationMode mode) const override |
| void | synchronize () override |
| Wait for outstanding device work submitted by this component. | |
| std::string | toString () const override |
| Produce a short, human-readable description of the component. | |
| void | zeroGradients () override |
| Clear all model-owned gradients for this component. | |
| Public Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| Component (const std::string &name) | |
| Construct component with required name identifier. | |
| virtual | ~Component ()=default |
| virtual void | build (const BuildContext &context) final |
| Build the component with the provided BuildContext (canonical overload). | |
| const std::string | getName () const |
| Get the component's name identifier. | |
| virtual std::vector< std::string > | getParameterNames () const |
| List all available parameter names for this component. | |
| RuntimeMode | getRuntimeMode () const noexcept |
| Get the current RuntimeMode of this Component. | |
| TrainingMode | getTrainingMode () const noexcept |
| The current runtime behavioral mode of this Component. | |
| virtual bool | isBuilt () const final |
| Returns true if build() has completed successfully. | |
| bool | isInferenceMode () const noexcept |
| bool | isTrainingMode () const noexcept |
| void | setTrainingMode (TrainingMode mode) |
| Set the runtime behavioral mode for this Component. | |
Protected Member Functions | |
| void | onBuilding (const BuildContext &context) override |
| Hook invoked during build() to initialize component with input shape. | |
| void | onExecutionContextSet () override |
| Hook invoked after ExecutionContext is set. | |
| void | onTrainingModeChanging (TrainingMode training_mode) override |
| Hook invoked when training mode is about to change. | |
| Protected Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| IExecutionContext * | getExecutionContext () const |
| Get the shared execution context. | |
| bool | hasExecutionContext () const noexcept |
| Check if execution context has been set. | |
| template<TensorDataType TParameterPrecision, typename TMemoryResource> | |
| void | loadParameterFromBlob (const std::string &param_name, const Serialization::ITensorBlob &blob, Tensor< TParameterPrecision, TMemoryResource > &target, const shape_t &expected_shape) |
| Load a tensor blob into a parameter tensor with validation. | |
| void | setExecutionContext (IExecutionContext *context) |
| Set the execution context for this component. | |
Private Member Functions | |
| dim_t | computeNormalizedFeatureCount (const shape_t &input_shape) const |
| void | createOperation () |
| void | initializeGradients () |
| void | initializeParameters (const shape_t &input_shape) |
| Single parameter allocation routine. | |
| void | validateBuildContext (const BuildContext &context) const |
| void | validateInputShape (const shape_t &input_shape) const |
Private Attributes | |
| std::shared_ptr< TensorType > | bias_ { nullptr } |
| std::shared_ptr< TensorType > | bias_grad_ { nullptr } |
| LayerNormConfig | config_ |
| std::unique_ptr< TensorType > | input_grad_ { nullptr } |
| std::shared_ptr< UnaryOperation< TDeviceType, TPrecision > > | operation_ { nullptr } |
| std::unique_ptr< TensorType > | output_ { nullptr } |
| std::optional< TensorType > | output_view_ |
| std::unique_ptr< IExecutionContext > | owned_exec_context_ { nullptr } |
| std::shared_ptr< TensorType > | weight_ { nullptr } |
| std::shared_ptr< TensorType > | weight_grad_ { nullptr } |
Additional Inherited Members | |
| Static Public Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| static constexpr DeviceType | getDeviceType () |
| Compile-time device type for this component instance. | |
| static constexpr TensorDataType | getPrecision () noexcept |
| Compile-time tensor precision for this component instance. | |
| Protected Attributes inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| BuildContext | build_context_ { shape_t{ 1 }, RuntimeMode::Training } |
| The BuildContext stored at build time. | |
Device-templated Layer Normalization component.
Provides forward and backward APIs that operate on concrete Tensor types. Delegates heavy compute to a UnaryOperation backend. Parameters (weight/bias) and parameter gradients are owned by the component.
|
inline explicit export |
Construct LayerNorm with optional ExecutionContext ownership.
| name | Component name (used for tensor names). |
| config | LayerNorm configuration (normalized_shape, axis, epsilon, bias). |
| device_id | If provided, component creates and owns an ExecutionContext bound to this device; otherwise a parent must supply one before building. |
| std::invalid_argument | if the device type of the provided device_id does not match the TDeviceType template argument. |

|
override export default |
|
inline export |
Run backward pass and return a reference to the component-owned input-gradient tensor.
The returned reference refers to a Tensor owned by this component. The backend operation_->backward will write/accumulate into the provided input-gradient tensor.
Preconditions:
| input | Original forward input tensor (device-bound). |
| output_grad | Gradient with respect to the component output (device-bound). |
| std::runtime_error | on precondition violations. |
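As an orientation aid, the gradients a layer-normalization backward pass produces can be sketched on plain `std::vector<float>` buffers. This is a CPU reference under stated assumptions, not Mila's backend: `LayerNormGradsRef` and `layerNormBackwardRef` are hypothetical names, normalization is taken over the last dimension, the variance is the biased (divide-by-N) estimate, and the forward statistics are recomputed rather than cached.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference types and helper; not part of Mila.
struct LayerNormGradsRef {
    std::vector<float> input_grad;   // same layout as the input
    std::vector<float> weight_grad;  // [features], summed over rows
    std::vector<float> bias_grad;    // [features], summed over rows
};

LayerNormGradsRef layerNormBackwardRef(const std::vector<float>& input,
                                       const std::vector<float>& output_grad,
                                       const std::vector<float>& weight,
                                       std::size_t features,
                                       float epsilon = 1e-5f)
{
    assert(features > 0 && input.size() % features == 0);
    assert(output_grad.size() == input.size() && weight.size() == features);

    const std::size_t rows = input.size() / features;
    const float n = static_cast<float>(features);
    LayerNormGradsRef g{ std::vector<float>(input.size()),
                         std::vector<float>(features, 0.0f),
                         std::vector<float>(features, 0.0f) };

    for (std::size_t r = 0; r < rows; ++r) {
        const float* x = input.data() + r * features;
        const float* dy = output_grad.data() + r * features;
        float* dx = g.input_grad.data() + r * features;

        // Recompute the forward statistics for this row.
        float mean = 0.0f;
        for (std::size_t i = 0; i < features; ++i) mean += x[i];
        mean /= n;
        float var = 0.0f;
        for (std::size_t i = 0; i < features; ++i) var += (x[i] - mean) * (x[i] - mean);
        var /= n;
        const float rstd = 1.0f / std::sqrt(var + epsilon);

        // Accumulate parameter gradients and the two row means used by dx.
        float mean_dxhat = 0.0f;
        float mean_dxhat_xhat = 0.0f;
        for (std::size_t i = 0; i < features; ++i) {
            const float xhat = (x[i] - mean) * rstd;
            const float dxhat = dy[i] * weight[i];
            mean_dxhat += dxhat;
            mean_dxhat_xhat += dxhat * xhat;
            g.weight_grad[i] += dy[i] * xhat;
            g.bias_grad[i] += dy[i];
        }
        mean_dxhat /= n;
        mean_dxhat_xhat /= n;

        // dx = rstd * (dxhat - mean(dxhat) - xhat * mean(dxhat * xhat))
        for (std::size_t i = 0; i < features; ++i) {
            const float xhat = (x[i] - mean) * rstd;
            const float dxhat = dy[i] * weight[i];
            dx[i] = rstd * (dxhat - mean_dxhat - xhat * mean_dxhat_xhat);
        }
    }
    return g;
}
```

A reference like this is mainly useful for checking a device backend numerically; the input gradient of each row sums to approximately zero, which makes a cheap sanity test.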

|
inline export private |


|
inline export private |


|
inline export |
Run forward pass and return a reference to the component-owned output tensor.
The returned reference refers to a Tensor owned by this component. The backend operation_->forward will write into the provided output tensor.
Preconditions:
| input | Input Tensor bound to the component device. |
| std::runtime_error | on precondition violations. |
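For orientation, the computation a layer-normalization forward pass performs can be sketched on plain `std::vector<float>` data. This is a CPU reference under stated assumptions, not Mila's backend: `layerNormForwardRef` and its parameters are hypothetical names, normalization is taken over the last dimension of a row-major `[rows, features]` buffer, and the variance is the biased (divide-by-N) estimate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper; not part of Mila.
// For each row: y = (x - mean) / sqrt(var + epsilon) * weight + bias
std::vector<float> layerNormForwardRef(const std::vector<float>& input,
                                       std::size_t features,
                                       const std::vector<float>& weight,
                                       const std::vector<float>& bias,
                                       float epsilon = 1e-5f)
{
    assert(features > 0 && input.size() % features == 0);
    assert(weight.size() == features && bias.size() == features);

    std::vector<float> output(input.size());
    const std::size_t rows = input.size() / features;
    const float n = static_cast<float>(features);

    for (std::size_t r = 0; r < rows; ++r) {
        const float* x = input.data() + r * features;
        float* y = output.data() + r * features;

        // Mean over the normalized (last) dimension.
        float mean = 0.0f;
        for (std::size_t i = 0; i < features; ++i) mean += x[i];
        mean /= n;

        // Biased variance over the same dimension.
        float var = 0.0f;
        for (std::size_t i = 0; i < features; ++i) var += (x[i] - mean) * (x[i] - mean);
        var /= n;

        // Normalize, then apply the learned scale and shift.
        const float rstd = 1.0f / std::sqrt(var + epsilon);
        for (std::size_t i = 0; i < features; ++i)
            y[i] = (x[i] - mean) * rstd * weight[i] + bias[i];
    }
    return output;
}
```

With unit weight and zero bias, each output row has approximately zero mean and unit variance, which is a convenient property to assert when comparing against a device implementation.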

|
inline override export virtual |
Get the compute device id associated with this component.
Must return the device on which parameters and operations execute.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export virtual |
Return non-owning pointers to parameter gradient tensors.
Only valid when isTrainingMode() is true.
| std::runtime_error | if called when not in training mode or before the component has been built. |
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Return the current memory allocation breakdown for this component.
Reflects allocations at the moment of the call. The returned stats naturally track the component lifecycle:
After construction: parameters only
After build( Inference ): parameters + T=1 state buffers
After build( Training ): parameters + T=full state buffers
After setEvaluation( false ): parameters + state + gradients
For CompositeComponent and Network, the returned stats are the recursive aggregate of all child components.
May be called at any time — no lifecycle preconditions.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Return non-owning pointers to parameter tensors.
The returned tensor pointers remain valid for the lifetime of the component. Order should be canonical (weights before biases).
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Get the component type identifier.
Used for serialization and runtime type identification.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline export private |


|
inline export private |
Single parameter allocation routine.
If input_shape is provided the allocator will compute channel count and outer_shape for axis-mode or normalized-shape-mode. If only normalized_shape is available, channels are computed from that shape and outer_shape is left empty.
| input_shape | Optional pointer to the build-time input shape. |
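The two modes described above can be sketched as follows. Everything here is an assumption made for illustration: `shape_ref_t` stands in for Mila's `shape_t`, `NormLayoutRef` and both functions are hypothetical, and "outer_shape" is taken to mean the input dimensions that are not normalized. The actual allocator may define these differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using shape_ref_t = std::vector<std::int64_t>;  // stand-in for Mila's shape_t

// Hypothetical result type: feature (channel) count plus non-normalized dims.
struct NormLayoutRef {
    std::int64_t channels;
    shape_ref_t outer_shape;
};

// Normalized-shape mode: channels is the product of normalized_shape.
// outer_shape is filled only when an input shape is available.
NormLayoutRef layoutFromNormalizedShape(const shape_ref_t& normalized_shape,
                                        const std::optional<shape_ref_t>& input_shape)
{
    assert(!normalized_shape.empty());
    NormLayoutRef layout{ 1, {} };
    for (std::int64_t d : normalized_shape) layout.channels *= d;

    if (input_shape) {
        assert(input_shape->size() >= normalized_shape.size());
        const std::size_t outer_rank = input_shape->size() - normalized_shape.size();
        // The trailing input dims are assumed to equal normalized_shape.
        for (std::size_t k = 0; k < normalized_shape.size(); ++k)
            assert((*input_shape)[outer_rank + k] == normalized_shape[k]);
        layout.outer_shape.assign(
            input_shape->begin(),
            input_shape->begin() + static_cast<std::ptrdiff_t>(outer_rank));
    }
    return layout;
}

// Axis mode: channels is the extent of the chosen axis (negative axes count
// from the end); outer_shape is every other dimension, in order.
NormLayoutRef layoutFromAxis(const shape_ref_t& input_shape, std::int64_t axis)
{
    const std::int64_t rank = static_cast<std::int64_t>(input_shape.size());
    if (axis < 0) axis += rank;
    assert(axis >= 0 && axis < rank);

    NormLayoutRef layout{ input_shape[static_cast<std::size_t>(axis)], {} };
    for (std::int64_t i = 0; i < rank; ++i)
        if (i != axis) layout.outer_shape.push_back(input_shape[static_cast<std::size_t>(i)]);
    return layout;
}
```

For an input of shape `[2, 3, 8]`, both `normalized_shape = {8}` and `axis = -1` yield 8 channels under these assumptions; passing no input shape leaves `outer_shape` empty, matching the eager-allocation case described above.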


|
inline override export virtual |
Load a parameter from serialized tensor data.
Loads raw tensor bytes directly into an existing parameter tensor, handling precision conversion and device upload as needed.
The component validates that the blob's shape matches the parameter's expected shape, then delegates the load to the backend.
| name | Parameter name used to locate the target tensor. |
| blob | Serialized tensor metadata and raw bytes. |
| std::runtime_error | if component has no parameters to load. |
| std::runtime_error | if blob shape doesn't match parameter shape. |
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.
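The validate-then-copy flow can be illustrated with a host-only sketch. `BlobRef` and `loadFloatParameterRef` are hypothetical stand-ins; Mila's `ITensorBlob` and `Tensor` are not modeled, and precision conversion and device upload are deliberately left out. The payload is assumed to be host-order float32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a serialized tensor: shape metadata plus raw bytes.
struct BlobRef {
    std::vector<std::int64_t> shape;
    std::vector<std::uint8_t> bytes;  // assumed host-order float32 payload
};

// Validate the blob against the expected shape, then copy its bytes into a
// freshly allocated float buffer. Throws std::runtime_error on any mismatch.
std::vector<float> loadFloatParameterRef(const std::string& name,
                                         const BlobRef& blob,
                                         const std::vector<std::int64_t>& expected_shape)
{
    if (blob.shape != expected_shape)
        throw std::runtime_error("shape mismatch for parameter '" + name + "'");

    std::size_t count = 1;
    for (std::int64_t d : expected_shape) {
        assert(d >= 0);  // dimensions are expected to be non-negative
        count *= static_cast<std::size_t>(d);
    }

    if (blob.bytes.size() != count * sizeof(float))
        throw std::runtime_error("byte size mismatch for parameter '" + name + "'");

    std::vector<float> target(count);
    if (count > 0)
        std::memcpy(target.data(), blob.bytes.data(), blob.bytes.size());
    return target;
}
```

Checking the shape before touching the payload keeps a malformed archive from partially overwriting a parameter, which is the same ordering the description above gives.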


|
inline override export protected virtual |
Hook invoked during build() to initialize component with input shape.
Validates input shape, allocates parameters if needed, binds parameters to the backend operation, triggers backend build, and allocates the component-owned forward output and input-gradient tensors.
Output buffer — allocated at the full input shape.
LayerNorm is a general component with no knowledge of sequence dimensions or inference decode paths. The parent Network or Transformer is responsible for passing the correct input shape via BuildContext:
Training: full sequence shape, e.g. [B, T, features]
Inference: decode shape, e.g. [1, 1, features], for the decode path, or prefill shape, e.g. [1, T_chunk, features], for prefill
In all cases LayerNorm simply allocates at inputShape() — no special casing for inference or sequence dimensions.
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export protected virtual |
Hook invoked after ExecutionContext is set.
Creates the backend operation and performs any eager parameter allocation if normalized_shape was supplied at construction time.
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export protected virtual |
Hook invoked when training mode is about to change.
Propagates training state to the backend operation and allocates or clears parameter gradient buffers as appropriate.
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Return number of trainable parameters.
For leaf components this is the element count of owned parameter tensors. CompositeComponent and Network implementations should return the recursive aggregate across all children.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
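For a layer-normalization leaf, the count follows directly from the normalized feature count. The sketch below assumes one weight element per feature and, when bias is enabled in the configuration, one bias element per feature; `layerNormParameterCountRef` is a hypothetical helper, not a Mila function.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: trainable element count for a LayerNorm-style leaf,
// assuming weight and bias are both shaped [features].
std::size_t layerNormParameterCountRef(std::size_t features, bool has_bias)
{
    assert(features > 0);
    return has_bias ? 2 * features : features;
}
```

Under these assumptions a 768-feature layer with bias holds 1536 trainable elements, and 768 without bias.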

|
inline override export virtual |
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Wait for outstanding device work submitted by this component.
On CPU this may be a no-op. Use to ensure results are visible to the host or to measure synchronous timings.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export virtual |
Produce a short, human-readable description of the component.
Implementations should keep output concise and avoid throwing.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline export private |


|
inline export private |

|
inline override export virtual |
Clear all model-owned gradients for this component.
Default implementation is a no-op. Composite components should override to recurse to children. Leaf components should override to zero their parameter and activation gradients using device-aware helpers.
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.
