Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::Compute::CpuLayerNormOp Class Reference

CPU implementation of Layer Normalization using abstract TensorDataType API.


Public Types

using CpuExecutionContext = ExecutionContext<DeviceType::Cpu>
using MR = CpuMemoryResource
using TensorType = Tensor<TensorDataType::FP32, MR>
using UnaryOperationBase = UnaryOperation<DeviceType::Cpu, TensorDataType::FP32>
Public Types inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >
using MR
using TensorInputType
using TensorOutputType
Public Types inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput >
using DataTypeTraits

Public Member Functions

 CpuLayerNormOp (IExecutionContext *context, const LayerNormConfig &config)
void backward (const ITensor &input, const ITensor &output_grad, ITensor &input_grad) const override
 Backward pass - compute gradients for input and parameters.
void build (const BuildContext &config) override
 Build the operation for a concrete input shape.
void forward (const ITensor &input, ITensor &output) const override
 Forward pass - normalize input and apply learned affine transform.
std::string getName () const override
 Human-readable operation name.
OperationType getOperationType () const override
 Operation type identifier.
void setGradients (ITensor *weight_grad, ITensor *bias_grad) override
 Set parameter gradient tensor references for training.
void setParameters (ITensor *weight, ITensor *bias) override
 Set parameter tensor references (module remains owner).
Public Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >
virtual ~UnaryOperation ()=default
Public Member Functions inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput >
virtual ~Operation ()=default
virtual void clearGradients () noexcept
 Clear any cached gradient pointers held by the operation.
virtual TensorDataType getDataType () const
 Tensor data type for this operation.
virtual DeviceType getDeviceType () const
 Device type for this operation.
virtual std::size_t getStateMemorySize () const
 Returns the number of bytes of state memory allocated by this operation.
virtual bool isBuilt () const
 Whether build() completed successfully for a concrete input shape.
virtual bool isEvalMode () const
 Query whether the operation is in evaluation (non-training) mode.
virtual void setTrainingMode (TrainingMode training_mode)
 Configure operation training-mode behavior.

Private Member Functions

void allocateStatisticsTensors ()
 Allocate backend-owned statistics tensors (mean and reciprocal std dev).
void computeAxisPartitioning (const shape_t &input_shape)
 Compute axis partitioning for statistics computation.
void validateInputShape (const shape_t &input_shape) const
 Validate input shape matches configuration.
void validateShapeConsistency (const shape_t &shape) const
 Validate input shape consistency with cached build-time dimensions.

Private Attributes

int64_t axis_ { -1 }
float * bias_ { nullptr }
float * bias_grad_ { nullptr }
LayerNormConfig config_
IExecutionContext * context_ { nullptr }
int64_t dim_size_ { 1 }
int64_t expected_slices_ { 0 }
int64_t inner_size_ { 1 }
std::shared_ptr< TensorType > mean_ { nullptr }
int64_t outer_size_ { 1 }
std::shared_ptr< TensorType > rstd_ { nullptr }
float * weight_ { nullptr }
float * weight_grad_ { nullptr }

Additional Inherited Members

Static Public Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput >
static constexpr TensorDataType data_type
static constexpr DeviceType device_type
Static Protected Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >
static const TensorInputType & asInputTensor (const ITensor &t)
static TensorOutputType & asOutputTensor (ITensor &t)
Protected Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput >
bool is_built_
TrainingMode training_mode_

Detailed Description

CPU implementation of Layer Normalization using abstract TensorDataType API.

All internal state, including the normalization statistics (mean and reciprocal standard deviation, rstd), is stored in Tensor instances, which keeps the operation architecturally consistent with the rest of the framework.

Member Typedef Documentation

◆ CpuExecutionContext

◆ MR

◆ TensorType

◆ UnaryOperationBase

Constructor & Destructor Documentation

◆ CpuLayerNormOp()

Mila::Dnn::Compute::CpuLayerNormOp::CpuLayerNormOp ( IExecutionContext * context,
const LayerNormConfig & config )
inline

Member Function Documentation

◆ allocateStatisticsTensors()

void Mila::Dnn::Compute::CpuLayerNormOp::allocateStatisticsTensors ( )
inline private

Allocate backend-owned statistics tensors (mean and reciprocal std dev).

◆ backward()

void Mila::Dnn::Compute::CpuLayerNormOp::backward ( const ITensor & input,
const ITensor & output_grad,
ITensor & input_grad ) const
inline override virtual

Backward pass - compute gradients for input and parameters.

Parameter gradients are written directly to the pointers provided via setGradients() (weight_grad_, bias_grad_).
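
For orientation, the sketch below shows one common way such a backward pass can be written for normalization over the last axis. It is not taken from the Mila source: the function name `layerNormBackward`, its flat-pointer signature, and the choice to accumulate into the parameter gradients while overwriting the input gradient are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free function; x, output_grad and input_grad are row-major
// [outer_size, dim_size] buffers, mean and rstd are [outer_size] buffers
// saved by a forward pass, weight/weight_grad/bias_grad are [dim_size].
// bias_grad may be null when the layer has no bias (assumption).
inline void layerNormBackward( const float* x, const float* output_grad,
                               const float* mean, const float* rstd,
                               const float* weight,
                               float* input_grad, float* weight_grad, float* bias_grad,
                               int64_t outer_size, int64_t dim_size ) {
    assert( x && output_grad && mean && rstd && weight && input_grad && weight_grad );

    for ( int64_t o = 0; o < outer_size; ++o ) {
        const float* xr = x + o * dim_size;
        const float* dy = output_grad + o * dim_size;
        float* dx = input_grad + o * dim_size;
        const float m = mean[ o ];
        const float rs = rstd[ o ];

        // First pass: two row-wise reductions needed by the input gradient.
        float sum_dnorm = 0.0f;
        float sum_dnorm_norm = 0.0f;
        for ( int64_t i = 0; i < dim_size; ++i ) {
            const float norm = ( xr[ i ] - m ) * rs;
            const float dnorm = weight[ i ] * dy[ i ];
            sum_dnorm += dnorm;
            sum_dnorm_norm += dnorm * norm;
        }
        const float mean_dnorm = sum_dnorm / static_cast<float>( dim_size );
        const float mean_dnorm_norm = sum_dnorm_norm / static_cast<float>( dim_size );

        // Second pass: accumulate parameter gradients, write the input gradient.
        for ( int64_t i = 0; i < dim_size; ++i ) {
            const float norm = ( xr[ i ] - m ) * rs;
            const float dnorm = weight[ i ] * dy[ i ];

            weight_grad[ i ] += norm * dy[ i ];
            if ( bias_grad ) bias_grad[ i ] += dy[ i ];

            dx[ i ] = ( dnorm - mean_dnorm - norm * mean_dnorm_norm ) * rs;
        }
    }
}
```

Because the parameter gradients are accumulated in this sketch, a caller would zero `weight_grad` and `bias_grad` before the first call of a training step.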

Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >.

◆ build()

void Mila::Dnn::Compute::CpuLayerNormOp::build ( const BuildContext & config)
inline override virtual

Build the operation for a concrete input shape.

Requires:

  • setParameters() has already been called so weight/bias pointers are available
  • configuration (axis or normalized_shape) is final

Allocates backend-owned tensor storage for mean/rstd statistics sized to the outer grouping implied by the input shape and normalized axes.

Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TInput >.

◆ computeAxisPartitioning()

void Mila::Dnn::Compute::CpuLayerNormOp::computeAxisPartitioning ( const shape_t & input_shape)
inline private

Compute axis partitioning for statistics computation.
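
As a rough illustration of what an axis partitioning of this kind usually computes, the sketch below splits a shape around a single normalized axis into outer, dim, and inner extents. The names `AxisPartition` and `partitionAxis`, the `shape_t` alias, the negative-axis wrapping, and the `slices()` helper are hypothetical; the result fields only mirror the `outer_size_`, `dim_size_`, and `inner_size_` members listed on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for Mila's shape_t (assumed to be a list of extents).
using shape_t = std::vector<int64_t>;

// Hypothetical result type mirroring outer_size_ / dim_size_ / inner_size_.
struct AxisPartition {
    int64_t outer_size{ 1 };  // product of extents before the axis
    int64_t dim_size{ 1 };    // extent of the normalized axis
    int64_t inner_size{ 1 };  // product of extents after the axis

    // Number of (mean, rstd) pairs needed: one per normalized slice (assumption).
    int64_t slices() const noexcept { return outer_size * inner_size; }
};

// Split `shape` around `axis`; a negative axis counts from the end (assumption).
inline AxisPartition partitionAxis( const shape_t& shape, int64_t axis ) {
    const int64_t rank = static_cast<int64_t>( shape.size() );
    if ( axis < 0 ) axis += rank;
    if ( axis < 0 || axis >= rank ) {
        throw std::invalid_argument( "partitionAxis: axis out of range" );
    }
    const std::size_t ax = static_cast<std::size_t>( axis );

    AxisPartition p;
    for ( std::size_t i = 0; i < shape.size(); ++i ) {
        assert( shape[ i ] > 0 && "extents are expected to be positive" );
        if ( i < ax ) p.outer_size *= shape[ i ];
        else if ( i == ax ) p.dim_size = shape[ i ];
        else p.inner_size *= shape[ i ];
    }
    return p;
}
```

With a shape of {2, 3, 4} and axis -1, this yields outer 6, dim 4, inner 1, so six statistics slices would be needed.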

◆ forward()

void Mila::Dnn::Compute::CpuLayerNormOp::forward ( const ITensor & input,
ITensor & output ) const
inline override virtual

Forward pass - normalize input and apply learned affine transform.

Uses cached parameter raw pointers (weight_, bias_) and backend-owned mean/rstd tensor storage allocated during build().
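
The following is a minimal sketch of the computation described here, assuming normalization over the last axis and a caller-supplied epsilon. The function `layerNormForward` and its flat-pointer signature are hypothetical and are not the class's actual interface; the sketch only shows how raw weight/bias pointers and per-slice mean/rstd buffers fit together.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical free function; x and y are row-major [outer_size, dim_size],
// mean and rstd are [outer_size] output buffers, weight and bias are [dim_size].
// bias may be null when the layer has no bias (assumption).
inline void layerNormForward( const float* x, float* y, float* mean, float* rstd,
                              const float* weight, const float* bias,
                              int64_t outer_size, int64_t dim_size, float epsilon ) {
    assert( x && y && mean && rstd && weight );

    for ( int64_t o = 0; o < outer_size; ++o ) {
        const float* xr = x + o * dim_size;
        float* yr = y + o * dim_size;

        // Mean of the slice.
        float m = 0.0f;
        for ( int64_t i = 0; i < dim_size; ++i ) m += xr[ i ];
        m /= static_cast<float>( dim_size );

        // Biased variance of the slice.
        float v = 0.0f;
        for ( int64_t i = 0; i < dim_size; ++i ) {
            const float d = xr[ i ] - m;
            v += d * d;
        }
        v /= static_cast<float>( dim_size );

        // Reciprocal standard deviation, saved together with the mean so a
        // backward pass can reuse both without recomputing them.
        const float rs = 1.0f / std::sqrt( v + epsilon );
        mean[ o ] = m;
        rstd[ o ] = rs;

        // Normalize, then apply the learned affine transform.
        for ( int64_t i = 0; i < dim_size; ++i ) {
            const float norm = ( xr[ i ] - m ) * rs;
            yr[ i ] = norm * weight[ i ] + ( bias ? bias[ i ] : 0.0f );
        }
    }
}
```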

Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >.

◆ getName()

std::string Mila::Dnn::Compute::CpuLayerNormOp::getName ( ) const
inline override virtual

Human-readable operation name.

Implements Mila::Dnn::Compute::Operation< TDeviceType, TInput >.

◆ getOperationType()

OperationType Mila::Dnn::Compute::CpuLayerNormOp::getOperationType ( ) const
inline override virtual

◆ setGradients()

void Mila::Dnn::Compute::CpuLayerNormOp::setGradients ( ITensor * weight_grad,
ITensor * bias_grad )
inline override virtual

Set parameter gradient tensor references for training.

The operation caches native gradient pointers for hot-path write access during backward(). Weight gradient is required; bias gradient is bound only when the LayerNorm config indicates a bias is present.

Parameters
  weight_grad  Gradient tensor for the weight parameter
  bias_grad    Gradient tensor for the bias parameter (optional, depending on config)
Exceptions
  std::invalid_argument  If weight_grad is null
  std::invalid_argument  If bias_grad is null when the config requires a bias
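
The binding rules above can be sketched with stand-in types as follows. `MockTensor`, `GradientBinding`, and `accumulateWeightGrad` are hypothetical names introduced only for this illustration; the real class works on ITensor and a LayerNormConfig, whose interfaces are not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tensor exposing raw FP32 storage.
struct MockTensor {
    std::vector<float> values;
    float* data() noexcept { return values.data(); }
};

// Hypothetical holder mirroring the weight_grad_ / bias_grad_ members.
class GradientBinding {
public:
    explicit GradientBinding( bool has_bias ) : has_bias_( has_bias ) {}

    // Validate, then cache raw pointers; the caller keeps ownership of the tensors.
    void setGradients( MockTensor* weight_grad, MockTensor* bias_grad ) {
        if ( !weight_grad ) {
            throw std::invalid_argument( "weight_grad must not be null" );
        }
        if ( has_bias_ && !bias_grad ) {
            throw std::invalid_argument( "bias_grad must not be null when bias is enabled" );
        }
        weight_grad_ = weight_grad->data();
        bias_grad_ = has_bias_ ? bias_grad->data() : nullptr;
    }

    // Hot-path style write through the cached pointer, as backward() would do.
    void accumulateWeightGrad( std::size_t i, float value ) const {
        assert( weight_grad_ != nullptr );
        weight_grad_[ i ] += value;
    }

    float* biasGrad() const noexcept { return bias_grad_; }

private:
    bool has_bias_;
    float* weight_grad_{ nullptr };
    float* bias_grad_{ nullptr };
};
```

Caching the raw pointer once keeps per-element writes free of virtual calls, at the cost that the bound tensors must outlive the binding.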

Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TInput >.

◆ setParameters()

void Mila::Dnn::Compute::CpuLayerNormOp::setParameters ( ITensor * weight,
ITensor * bias )
inline override virtual

Set parameter tensor references (module remains owner).

The operation caches native data pointers for hot-path access. The weight tensor is required; bias is bound only when the LayerNorm config indicates a bias is present.

Note: build() requires parameters to be bound before it is called.
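
To make the ordering requirement concrete, here is a small sketch of a guard that rejects build() until parameters are bound. `LifecycleSketch` and its `affine` helper are hypothetical, and throwing std::runtime_error from build() is an assumption; this page does not state how the real class reports the error.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical class showing non-owning parameter binding followed by build().
class LifecycleSketch {
public:
    // Cache raw pointers; the owning module must keep the storage alive.
    void setParameters( float* weight, float* bias ) {
        if ( !weight ) throw std::invalid_argument( "weight must not be null" );
        weight_ = weight;
        bias_ = bias;  // may stay null when no bias is configured
    }

    // Refuse to build until a weight pointer has been bound (assumed behavior).
    void build() {
        if ( !weight_ ) {
            throw std::runtime_error( "build() called before setParameters()" );
        }
        is_built_ = true;
    }

    bool isBuilt() const noexcept { return is_built_; }

    // Apply the affine step to one already-normalized value.
    float affine( std::size_t i, float normalized ) const {
        assert( is_built_ );
        return normalized * weight_[ i ] + ( bias_ ? bias_[ i ] : 0.0f );
    }

private:
    float* weight_{ nullptr };
    float* bias_{ nullptr };
    bool is_built_{ false };
};
```

A caller would therefore bind parameters first, then build, and only afterwards run the forward pass.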

Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TInput >.

◆ validateInputShape()

void Mila::Dnn::Compute::CpuLayerNormOp::validateInputShape ( const shape_t & input_shape) const
inline private

Validate input shape matches configuration.

◆ validateShapeConsistency()

void Mila::Dnn::Compute::CpuLayerNormOp::validateShapeConsistency ( const shape_t & shape) const
inline private

Validate input shape consistency with cached build-time dimensions.

Member Data Documentation

◆ axis_

int64_t Mila::Dnn::Compute::CpuLayerNormOp::axis_ { -1 }
private

◆ bias_

float* Mila::Dnn::Compute::CpuLayerNormOp::bias_ { nullptr }
private

◆ bias_grad_

float* Mila::Dnn::Compute::CpuLayerNormOp::bias_grad_ { nullptr }
private

◆ config_

LayerNormConfig Mila::Dnn::Compute::CpuLayerNormOp::config_
private

◆ context_

IExecutionContext* Mila::Dnn::Compute::CpuLayerNormOp::context_ { nullptr }
private

◆ dim_size_

int64_t Mila::Dnn::Compute::CpuLayerNormOp::dim_size_ { 1 }
private

◆ expected_slices_

int64_t Mila::Dnn::Compute::CpuLayerNormOp::expected_slices_ { 0 }
private

◆ inner_size_

int64_t Mila::Dnn::Compute::CpuLayerNormOp::inner_size_ { 1 }
private

◆ mean_

std::shared_ptr<TensorType> Mila::Dnn::Compute::CpuLayerNormOp::mean_ { nullptr }
private

◆ outer_size_

int64_t Mila::Dnn::Compute::CpuLayerNormOp::outer_size_ { 1 }
private

◆ rstd_

std::shared_ptr<TensorType> Mila::Dnn::Compute::CpuLayerNormOp::rstd_ { nullptr }
private

◆ weight_

float* Mila::Dnn::Compute::CpuLayerNormOp::weight_ { nullptr }
private

◆ weight_grad_

float* Mila::Dnn::Compute::CpuLayerNormOp::weight_grad_ { nullptr }
private

The documentation for this class was generated from the following file: