Mila 0.13.48
Deep Neural Network Library
Loading...
Searching...
No Matches
Mila::Dnn::Residual< TDeviceType, TPrecision > Class Template Referenceexport

Device-templated Residual connection component. More...

Inheritance diagram for Mila::Dnn::Residual< TDeviceType, TPrecision >:
Collaboration diagram for Mila::Dnn::Residual< TDeviceType, TPrecision >:

Public Types

using ComponentBase = Component<TDeviceType, TPrecision>
using MR = typename DeviceTypeTraits<TDeviceType>::memory_resource
using TensorType = Tensor<TPrecision, MR>

Public Member Functions

 Residual (const std::string &name, const ResidualConfig &config, std::optional< DeviceId > device_id=std::nullopt)
 Construct Residual component with optional ExecutionContext ownership.
 ~Residual () override=default
std::pair< TensorType &, TensorType & > backward (const TensorType &input_a, const TensorType &input_b, const TensorType &output_grad)
 Execute the backward pass and return component-owned gradient for input_a.
TensorTypeforward (const TensorType &input_a, const TensorType &input_b)
 Execute the forward pass and return component-owned output tensor.
DeviceId getDeviceId () const override
 Get the compute device id associated with this component.
std::vector< ITensor * > getGradients () const override
 Return non-owning pointers to parameter gradient tensors.
MemoryStats getMemoryStats () const override
 Return the current memory allocation breakdown for this component.
std::vector< ITensor * > getParameters () const override
 Return non-owning pointers to parameter tensors.
const ComponentType getType () const override
 Get the component type identifier.
size_t parameterCount () const override
 Number of trainable parameters.
void save_ (ModelArchive &archive, SerializationMode mode) const override
 Serialize component parameters into the provided archive.
void synchronize () override
 Block until all device operations submitted by this component complete.
std::string toString () const override
 Return a human-readable description of the component.
Public Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision >
 Component (const std::string &name)
 Construct component with required name identifier.
virtual ~Component ()=default
virtual void build (const BuildContext &context) final
 Build the component with the provided BuildContext (canonical overload).
const std::string getName () const
 Get the component's name identifier.
virtual std::vector< std::string > getParameterNames () const
 List all available parameter names for this component.
RuntimeMode getRuntimeMode () const noexcept
 Convenience accessor — true if currently in Eval mode.
TrainingMode getTrainingMode () const noexcept
 The current runtime behavioral mode of this Component.
virtual bool isBuilt () const final
 Returns true if build() has completed successfully.
bool isInferenceMode () const noexcept
bool isTrainingMode () const noexcept
virtual void loadParameter (const std::string &name, const Serialization::ITensorBlob &blob)
 Load a parameter from serialized tensor data.
void setTrainingMode (TrainingMode mode)
 Set the runtime behavioral mode for this Component.
virtual void zeroGradients ()
 Clear all model-owned gradients for this component.

Protected Member Functions

void onBuilding (const BuildContext &build_config) override
 Build the Residual component from the provided BuildContext.
void onExecutionContextSet () override
 Hook invoked after ExecutionContext is set on the base Component.
void onTrainingModeChanging (TrainingMode training_mode) override
 Hook invoked when training mode is about to change.
Protected Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision >
IExecutionContextgetExecutionContext () const
 Get the shared execution context.
bool hasExecutionContext () const noexcept
 Check if execution context has been set.
template<TensorDataType TParameterPrecision, typename TMemoryResource>
void loadParameterFromBlob (const std::string &param_name, const Serialization::ITensorBlob &blob, Tensor< TParameterPrecision, TMemoryResource > &target, const shape_t &expected_shape)
 Load a tensor blob into a parameter tensor with validation.
void setExecutionContext (IExecutionContext *context)
 Set the execution context for this component.

Private Types

using OpType = typename OperationTraits<OperationType::ResidualOp, TDeviceType, TPrecision>::type

Private Member Functions

void createOperation ()
 Create backend BinaryOperation from OperationRegistry.

Private Attributes

ResidualConfig config_
std::unique_ptr< TensorTypeinput_a_grad_ { nullptr }
std::unique_ptr< TensorTypeinput_b_grad_ { nullptr }
shape_t leading_shape_
std::shared_ptr< OpTypeoperation_ { nullptr }
std::unique_ptr< TensorTypeoutput_ { nullptr }
std::unique_ptr< TensorTypeoutput_view_ { nullptr }
std::unique_ptr< IExecutionContextowned_exec_context_ { nullptr }

Additional Inherited Members

Static Public Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision >
static constexpr DeviceType getDeviceType ()
 Compile-time device type for this component instance.
static constexpr TensorDataType getPrecision () noexcept
 Compile-time tensor precision for this component instance.
Protected Attributes inherited from Mila::Dnn::Component< TDeviceType, TPrecision >
BuildContext build_context_ { shape_t{ 1 }, RuntimeMode::Training }
 The BuildContext stored at build time.

Detailed Description

template<DeviceType TDeviceType, TensorDataType TPrecision>
requires PrecisionSupportedOnDevice<TPrecision, TDeviceType>
class Mila::Dnn::Residual< TDeviceType, TPrecision >

Device-templated Residual connection component.

Delegates binary residual computation to a device-specific backend operation. Parameters (if any) and any projection tensors are stored as Tensor instances bound to the associated execution context.

New API:

  • forward(...) returns pointer to a component-owned output ITensor
  • backward(...) returns pointer to a component-owned input-gradient for the first input. The component also owns the gradient for the second input which can be accessed via getInputBGrad().
Template Parameters
TDeviceTypeDevice type (DeviceType::Cpu or DeviceType::Cuda).
TPrecisionAbstract tensor precision (TensorDataType).

Constructor & Destructor Documentation

◆ Residual()

template<DeviceType TDeviceType, TensorDataType TPrecision>
Mila::Dnn::Residual< TDeviceType, TPrecision >::Residual ( const std::string & name,
const ResidualConfig & config,
std::optional< DeviceId > device_id = std::nullopt )
inlineexplicitexport

Construct Residual component with optional ExecutionContext ownership.

Supports two construction modes:

  • Standalone mode (device_id provided): creates and owns an ExecutionContext.
  • Shared mode (no device_id): parent must call setExecutionContext() prior to build().
Parameters
nameComponent name identifier (mandatory).
build_configResidual configuration.
device_idOptional device identifier to create owned ExecutionContext.
Exceptions
std::invalid_argumentif build_config is invalid or device type mismatches.
std::runtime_errorif ExecutionContext creation fails (standalone mode).
Here is the call graph for this function:

◆ ~Residual()

template<DeviceType TDeviceType, TensorDataType TPrecision>
Mila::Dnn::Residual< TDeviceType, TPrecision >::~Residual ( )
overrideexportdefault

Member Function Documentation

◆ backward()

template<DeviceType TDeviceType, TensorDataType TPrecision>
std::pair< TensorType &, TensorType & > Mila::Dnn::Residual< TDeviceType, TPrecision >::backward ( const TensorType & input_a,
const TensorType & input_b,
const TensorType & output_grad )
inlineexport

Execute the backward pass and return component-owned gradient for input_a.

Component owns both input gradients (for input_a and input_b). The method returns the gradient tensor for input_a. The gradient for input_b can be accessed via getInputBGrad() after calling backward().

Parameters
input_aLeft forward input.
input_bRight forward input.
output_gradGradient with respect to the component output.
Returns
Pointer to component-owned ITensor containing gradient w.r.t. input_a.
Exceptions
std::runtime_errorif backend not initialized or if component not built/training.
Here is the call graph for this function:

◆ createOperation()

template<DeviceType TDeviceType, TensorDataType TPrecision>
void Mila::Dnn::Residual< TDeviceType, TPrecision >::createOperation ( )
inlineexportprivate

Create backend BinaryOperation from OperationRegistry.

Called by onExecutionContextSet(). Looks up "ResidualOp" in the OperationRegistry and creates a device-specific implementation.

Exceptions
std::runtime_errorif operation creation fails.
Here is the call graph for this function:
Here is the caller graph for this function:

◆ forward()

template<DeviceType TDeviceType, TensorDataType TPrecision>
TensorType & Mila::Dnn::Residual< TDeviceType, TPrecision >::forward ( const TensorType & input_a,
const TensorType & input_b )
inlineexport

Execute the forward pass and return component-owned output tensor.

The returned pointer is owned by the component and is valid until the component is destroyed or rebuilt. The backend BinaryOperation signature is unchanged; the component provides the owned output tensor when invoking the backend.

Parameters
input_aLeft input tensor.
input_bRight input tensor.
Returns
Pointer to component-owned ITensor containing the forward result.
Exceptions
std::runtime_errorif component has not been built or backend missing.
Here is the call graph for this function:

◆ getDeviceId()

template<DeviceType TDeviceType, TensorDataType TPrecision>
DeviceId Mila::Dnn::Residual< TDeviceType, TPrecision >::getDeviceId ( ) const
inlineoverrideexportvirtual

Get the compute device id associated with this component.

Must return the device on which parameters and operations execute.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

Here is the call graph for this function:

◆ getGradients()

template<DeviceType TDeviceType, TensorDataType TPrecision>
std::vector< ITensor * > Mila::Dnn::Residual< TDeviceType, TPrecision >::getGradients ( ) const
inlineoverrideexportvirtual

Return non-owning pointers to parameter gradient tensors.

Only valid in training mode. Residual has no trainable parameters by default.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

◆ getMemoryStats()

template<DeviceType TDeviceType, TensorDataType TPrecision>
MemoryStats Mila::Dnn::Residual< TDeviceType, TPrecision >::getMemoryStats ( ) const
inlineoverrideexportvirtual

Return the current memory allocation breakdown for this component.

Reflects allocations at the moment of the call. The returned stats naturally track the component lifecycle:

After construction — parameters only After build( Inference ) — parameters + T=1 state buffers After build( Training ) — parameters + T=full state buffers After setEvaluation( false ) — parameters + state + gradients

For CompositeComponent and Network, the returned stats are the recursive aggregate of all child components.

May be called at any time — no lifecycle preconditions.

Returns
MemoryStats reflecting current allocations.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

◆ getParameters()

template<DeviceType TDeviceType, TensorDataType TPrecision>
std::vector< ITensor * > Mila::Dnn::Residual< TDeviceType, TPrecision >::getParameters ( ) const
inlineoverrideexportvirtual

Return non-owning pointers to parameter tensors.

Residual has no trainable parameter tensors by default; return empty list.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

◆ getType()

template<DeviceType TDeviceType, TensorDataType TPrecision>
const ComponentType Mila::Dnn::Residual< TDeviceType, TPrecision >::getType ( ) const
inlineoverrideexportvirtual

Get the component type identifier.

Used for serialization and runtime type identification.

Returns
Component type enum value.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

◆ onBuilding()

template<DeviceType TDeviceType, TensorDataType TPrecision>
void Mila::Dnn::Residual< TDeviceType, TPrecision >::onBuilding ( const BuildContext & build_config)
inlineoverrideexportprotectedvirtual

Build the Residual component from the provided BuildContext.

Allocates the component-owned output buffer and, for Training-mode builds, the gradient buffers for both inputs.

Output buffer

The output buffer matches the full input shape. Residual is a pure elementwise addition with no sequence dimension concern — RuntimeMode does not influence output buffer allocation.

Gradient buffers

input_a_grad_ and input_b_grad_ are allocated only for RuntimeMode::Training builds. Both are the same shape as the input — Residual is elementwise addition and both inputs are always symmetric in shape.

Parameters
build_configFull input shape and RuntimeMode for this build.

Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.

Here is the call graph for this function:

◆ onExecutionContextSet()

template<DeviceType TDeviceType, TensorDataType TPrecision>
void Mila::Dnn::Residual< TDeviceType, TPrecision >::onExecutionContextSet ( )
inlineoverrideexportprotectedvirtual

Hook invoked after ExecutionContext is set on the base Component.

Create the device-specific BinaryOperation backend via the OperationRegistry.

Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.

Here is the call graph for this function:

◆ onTrainingModeChanging()

template<DeviceType TDeviceType, TensorDataType TPrecision>
void Mila::Dnn::Residual< TDeviceType, TPrecision >::onTrainingModeChanging ( TrainingMode training_mode)
inlineoverrideexportprotectedvirtual

Hook invoked when training mode is about to change.

Inform backend operation of the new training mode. When leaving training, explicitly unbind any parameter-gradient pointers on the backend to avoid accidental use or pinned memory.

Called with Component's training mutex held; do not call setTraining() here.

Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.

◆ parameterCount()

template<DeviceType TDeviceType, TensorDataType TPrecision>
size_t Mila::Dnn::Residual< TDeviceType, TPrecision >::parameterCount ( ) const
inlineoverrideexportvirtual

Number of trainable parameters.

Residual has no trainable parameters.

Returns
0

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

◆ save_()

template<DeviceType TDeviceType, TensorDataType TPrecision>
void Mila::Dnn::Residual< TDeviceType, TPrecision >::save_ ( ModelArchive & archive,
SerializationMode mode ) const
inlineoverrideexportvirtual

Serialize component parameters into the provided archive.

Placeholder; concrete implementations should write named parameter tensors into the archive.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

◆ synchronize()

template<DeviceType TDeviceType, TensorDataType TPrecision>
void Mila::Dnn::Residual< TDeviceType, TPrecision >::synchronize ( )
inlineoverrideexportvirtual

Block until all device operations submitted by this component complete.

Exceptions
std::runtime_errorif ExecutionContext has not been set.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

Here is the call graph for this function:

◆ toString()

template<DeviceType TDeviceType, TensorDataType TPrecision>
std::string Mila::Dnn::Residual< TDeviceType, TPrecision >::toString ( ) const
inlineoverrideexportvirtual

Return a human-readable description of the component.

Includes configured name, training/built state, backend presence, device information and parameter count to aid debugging and logging.

Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

Here is the call graph for this function:

The documentation for this class was generated from the following file:
  • /__w/Mila/Mila/Mila/Src/Dnn/Components/Connections/Residual.ixx