Mila
Deep Neural Network Library
Mila::Dnn::Linear< TDeviceType, TInput, TOutput > Class Template Reference export

A class representing a linear transformation module. More...

Inheritance diagram for Mila::Dnn::Linear< TDeviceType, TInput, TOutput >:
Collaboration diagram for Mila::Dnn::Linear< TDeviceType, TInput, TOutput >:

Public Types

using ModuleBase = Module< TDeviceType, TInput, TOutput >
 Alias for base module type.
 
using MR = std::conditional_t< TDeviceType==DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource >
 Memory resource type used for tensors, selected based on device type.
 
- Public Types inherited from Mila::Dnn::Module< TDeviceType, TInput, TOutput >
using MR = std::conditional_t< TDeviceType==DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource >
 

Public Member Functions

 Linear (const std::string &device_name, const LinearConfig &config)
 Constructs a new Linear module with a device name.
 
 Linear (std::shared_ptr< DeviceContext > device_context, const LinearConfig &config)
 Constructs a new Linear module with a provided device context.
 
void backward (const Tensor< TInput, MR > &input, const Tensor< TOutput, MR > &output_grad, Tensor< TInput, MR > &input_grad)
 Performs the backward pass of the Linear operation.
 
void forward (const Tensor< TInput, MR > &input, Tensor< TOutput, MR > &output)
 Performs the forward pass of the Linear operation.
 
std::optional< std::shared_ptr< Tensor< TOutput, MR > > > getBias ()
 Retrieves the bias tensor if present.
 
std::shared_ptr< Tensor< TOutput, MR > > getWeight ()
 Retrieves the weight tensor for this linear layer.
 
bool hasBias () const
 Checks whether the module has a bias tensor.
 
void load (ModelArchive &archive) override
 Deserializes the module state from a ZIP archive.
 
size_t parameterCount () const override
 Gets the number of trainable parameters in this module.
 
void save (ModelArchive &zip) const override
 Serializes the module state to a ZIP archive.
 
std::string toString () const override
 Converts the module information to a human-readable string.
 
- Public Member Functions inherited from Mila::Dnn::Module< TDeviceType, TInput, TOutput >
 Module (const std::string &device_name, const ComponentConfig &config)
 Constructor with device name.
 
 Module (std::shared_ptr< DeviceContext > context, const ComponentConfig &config)
 Constructor with a specific device context.
 
virtual ~Module ()=default
 Virtual destructor for proper cleanup in derived classes.
 
std::shared_ptr< Compute::DeviceContext > getDeviceContext () const
 Get the device context for this module.
 
Compute::DeviceType getDeviceType () const
 Get the device type of the current device context.
 
std::string getName () const
 Get the name of the module.
 
const auto & getParameterTensors () const
 Get the parameter tensors of this module.
 
const ComputePrecision::Policy & getPrecision () const
 
const auto & getStateTensors () const
 Get the state tensors of this module.
 
bool isTraining () const
 Check if the module is in training mode.
 
virtual void setTraining (bool is_training)
 Set the training mode of this module.
 

Private Member Functions

void createOperation ()
 Creates the appropriate Linear operation based on the current device context.
 
void initializeParameterGradients ()
 Initializes gradient tensors for parameters.
 
void initializeParameters ()
 Initializes the tensors needed for the Linear operation.
 

Private Attributes

std::shared_ptr< Tensor< TOutput, MR > > bias_ { nullptr }
 The bias tensor added after the matrix multiplication.
 
LinearConfig config_
 Configuration for the Linear module.
 
std::shared_ptr< UnaryOperation< TDeviceType, TInput, TOutput > > operation_ { nullptr }
 The underlying operation that implements the Linear transformation.
 
std::vector< std::shared_ptr< Tensor< TOutput, MR > > > output_state_
 Cache of intermediate tensors needed for backward pass.
 
std::vector< std::shared_ptr< Tensor< TOutput, MR > > > parameter_grads_
 Gradients for the parameters of this module.
 
std::vector< std::shared_ptr< Tensor< TOutput, MR > > > parameters_
 Collection of trainable parameters for this module.
 
OperationAttributes properties_
 Additional configuration options for the linear operation.
 
std::shared_ptr< Tensor< TOutput, MR > > weight_ { nullptr }
 The weight tensor for the linear transformation.
 

Additional Inherited Members

- Protected Member Functions inherited from Mila::Dnn::Module< TDeviceType, TInput, TOutput >
const std::string parametersToString () const
 Helper method to convert parameters to string representation.
 
const std::string stateToString () const
 Helper method to convert state tensors to string representation.
 
- Protected Attributes inherited from Mila::Dnn::Module< TDeviceType, TInput, TOutput >
std::unordered_map< std::string, std::shared_ptr< Tensor< TOutput, MR > > > parameter_map_ = {}
 Map of parameter names to parameter tensors.
 
std::unordered_map< std::string, std::shared_ptr< Tensor< TOutput, MR > > > state_map_ = {}
 Map of state names to state tensors.
 

Detailed Description

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
requires ValidFloatTensorTypes<TInput, TOutput>
class Mila::Dnn::Linear< TDeviceType, TInput, TOutput >

A class representing a linear transformation module.

The linear module (also known as a fully-connected or dense layer) performs a linear transformation of the input data. The operation is defined as: output = input * weight^T + bias, where the weight is stored with shape [output_features, input_features].

This is a fundamental building block in neural networks that connects every input neuron to every output neuron with learnable weights.

Template Parameters
    TDeviceType  The device type (CPU or CUDA) on which to perform computations.
    TInput       The data type of the input tensor elements.
    TOutput      The data type of the output tensor elements; defaults to TInput.
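A minimal usage sketch follows (illustrative only: the module import name, the LinearConfig constructor signature, and the Tensor shape constructor are assumptions, not confirmed API):

    import Mila;  // assumed module import name

    using namespace Mila::Dnn;

    int main()
    {
        // Assumed: LinearConfig is constructible from (input_features, output_features).
        LinearConfig config( /*input_features=*/768, /*output_features=*/3072 );

        // "CUDA:0" selects the first GPU; the template default is DeviceType::Cuda.
        Linear<DeviceType::Cuda, float> linear( "CUDA:0", config );

        // With weight stored as [output_features, input_features]:
        // input [batch, 768] -> output [batch, 3072]
        Tensor<float, CudaMemoryResource> input( { 64, 768 } );
        Tensor<float, CudaMemoryResource> output( { 64, 3072 } );

        linear.forward( input, output );
    }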

Member Typedef Documentation

◆ ModuleBase

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
using Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::ModuleBase = Module<TDeviceType, TInput, TOutput>
export

Alias for base module type.

◆ MR

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
using Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::MR = std::conditional_t<TDeviceType == DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource>
export

Memory resource type used for tensors, selected based on device type.

Uses CudaMemoryResource for CUDA devices and CpuMemoryResource for CPU.
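The selection can be checked at compile time; an illustrative snippet (namespace qualification abbreviated with using-directives, and DeviceType::Cpu assumed as the CPU enumerator):

    #include <type_traits>

    using namespace Mila::Dnn;
    using namespace Mila::Dnn::Compute;

    static_assert( std::is_same_v<Linear<DeviceType::Cuda>::MR, CudaMemoryResource> );
    static_assert( std::is_same_v<Linear<DeviceType::Cpu>::MR,  CpuMemoryResource> );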

Constructor & Destructor Documentation

◆ Linear() [1/2]

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::Linear ( const std::string & device_name,
const LinearConfig & config
)
inline explicit export

Constructs a new Linear module with a device name.

Creates a new DeviceContext internally using the provided device name. This constructor is useful for creating standalone modules without pre-existing device contexts.

Parameters
    device_name  The name of the device to use (e.g., "CPU", "CUDA:0").
    config       Configuration parameters for the Linear module.

Exceptions
    std::invalid_argument  If the device name is invalid or the configuration is invalid.
    std::runtime_error     If the device type doesn't match the template parameter TDeviceType.
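For example (a sketch; the LinearConfig constructor arguments are an assumption):

    LinearConfig config( /*input_features=*/128, /*output_features=*/64 );

    // Standalone CPU module; the context is created internally from the name.
    Linear<DeviceType::Cpu, float> cpu_linear( "CPU", config );

    // Would throw std::runtime_error: the name selects a CUDA device while
    // the template parameter is DeviceType::Cpu.
    // Linear<DeviceType::Cpu, float> mismatched( "CUDA:0", config );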

◆ Linear() [2/2]

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::Linear ( std::shared_ptr< DeviceContext > device_context,
const LinearConfig & config
)
inline explicit export

Constructs a new Linear module with a provided device context.

Uses a pre-existing DeviceContext instance. This constructor is useful when integrating the module into a larger network that shares device contexts across modules.

Parameters
    device_context  The device context to use for this module.
    config          Configuration parameters for the Linear module.

Exceptions
    std::invalid_argument  If device_context is null or the configuration is invalid.
    std::runtime_error     If the device context type doesn't match the template parameter TDeviceType.
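For example, sharing one context across layers (a sketch; the DeviceContext constructor signature is an assumption):

    auto ctx = std::make_shared<Compute::DeviceContext>( "CUDA:0" );

    // Both modules run on the same device and share the same context.
    Linear<DeviceType::Cuda, float> proj( ctx, LinearConfig( 768, 768 ) );
    Linear<DeviceType::Cuda, float> mlp ( ctx, LinearConfig( 768, 3072 ) );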

Member Function Documentation

◆ backward()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
void Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::backward ( const Tensor< TInput, MR > &  input,
const Tensor< TOutput, MR > &  output_grad,
Tensor< TInput, MR > &  input_grad 
)
inline export

Performs the backward pass of the Linear operation.

Computes gradients for the input tensor and parameters based on the output gradients.

Parameters
    input        The input tensor from the forward pass.
    output_grad  The gradient of the loss with respect to the output.
    input_grad   The tensor to store gradients with respect to the input.
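A sketch of one gradient step through the layer (tensor construction assumed; the comments state the standard dense-layer gradients, which this operation is expected to compute):

    // Forward pass first: it caches the intermediates backward() needs.
    linear.forward( input, output );

    // d(loss)/d(output), produced by the layers above this one.
    Tensor<float, CudaMemoryResource> output_grad( { 64, 3072 } );
    Tensor<float, CudaMemoryResource> input_grad ( { 64, 768 } );

    // Standard dense-layer gradients, with W stored as [out, in]:
    //   input_grad = output_grad * W          -> [batch, in]
    //   dW         = output_grad^T * input    -> [out, in]
    //   db         = sum of output_grad over the batch
    linear.backward( input, output_grad, input_grad );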

◆ createOperation()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
void Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::createOperation ( )
inline export private

Creates the appropriate Linear operation based on the current device context.

This method initializes the operation_ member with the appropriate implementation of Linear for either CPU or CUDA, based on the current device context. It also passes the compute precision policy to the operation.

Exceptions
    std::runtime_error  If the operation creation fails.

◆ forward()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
void Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::forward ( const Tensor< TInput, MR > &  input,
Tensor< TOutput, MR > &  output 
)
inline export

Performs the forward pass of the Linear operation.

Applies the linear transformation to the input tensor: output = input * weight^T + bias (if bias is enabled). The weight is stored as [output_features, input_features], so the operation multiplies by its transpose.

Parameters
    input   The input tensor to be transformed.
    output  The output tensor where the results will be stored.
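Shape contract in brief (a sketch with assumed Tensor construction), for input_features = 768 and output_features = 3072:

    Tensor<float, CudaMemoryResource> x( { 64, 768 } );   // [batch, input_features]
    Tensor<float, CudaMemoryResource> y( { 64, 3072 } );  // [batch, output_features]

    linear.forward( x, y );  // y = x * W^T (+ b if bias is enabled)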

◆ getBias()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::optional< std::shared_ptr< Tensor< TOutput, MR > > > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::getBias ( )
inline export

Retrieves the bias tensor if present.

The bias tensor has shape [output_features] and is initialized to zeros if bias is enabled in the layer configuration.

Returns
std::optional<std::shared_ptr<Tensor<TOutput, MR>>> An optional containing the bias tensor if bias is enabled, otherwise std::nullopt.
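Because the return type is std::optional, callers should test it before use; for example:

    if ( auto bias = linear.getBias() )
    {
        // *bias is a std::shared_ptr<Tensor<float, MR>> of shape [output_features].
        auto bias_tensor = *bias;
    }
    else
    {
        // The layer was configured without bias.
    }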

◆ getWeight()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::shared_ptr< Tensor< TOutput, MR > > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::getWeight ( )
inline export

Retrieves the weight tensor for this linear layer.

The weight tensor has shape [output_features, input_features] and is initialized with Xavier/Glorot uniform distribution.

Returns
std::shared_ptr<Tensor<TOutput, MR>> The weight tensor used in the linear transformation.

◆ hasBias()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
bool Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::hasBias ( ) const
inline export

Checks whether the module has a bias tensor.

Returns
bool True if the module has a bias tensor, false otherwise.

◆ initializeParameterGradients()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
void Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::initializeParameterGradients ( )
inline export private

Initializes gradient tensors for parameters.

Creates tensors to store gradients for weights and biases (if present). These tensors will be populated during backpropagation.


◆ initializeParameters()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
void Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::initializeParameters ( )
inline export private

Initializes the tensors needed for the Linear operation.

Creates and initializes:

  • weight tensor (initialized with Xavier/Glorot uniform distribution)
  • bias tensor (initialized to zeros if has_bias_ is true)

The tensors are created on the appropriate device (CPU or CUDA) based on the current device context.
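For reference, Xavier/Glorot uniform initialization draws each weight from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). A standalone sketch of that scheme (not Mila's internal code):

    #include <cmath>
    #include <random>
    #include <vector>

    std::vector<float> xavierUniform( std::size_t fan_in, std::size_t fan_out )
    {
        const float limit = std::sqrt( 6.0f / static_cast<float>( fan_in + fan_out ) );

        std::mt19937 gen{ std::random_device{}() };
        std::uniform_real_distribution<float> dist( -limit, limit );

        std::vector<float> weights( fan_in * fan_out );
        for ( auto& w : weights )
            w = dist( gen );

        return weights;
    }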


◆ load()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
void Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::load ( ModelArchive & archive)
inline override export virtual

Deserializes the module state from a ZIP archive.

Loads the trainable parameters (weight, bias) from the provided archive. Note: This method is currently a placeholder and needs implementation.

Parameters
    archive  The ZIP archive to load the module state from.

Exceptions
    std::runtime_error  If the deserialization fails.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.


◆ parameterCount()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
size_t Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::parameterCount ( ) const
inline override export virtual

Gets the number of trainable parameters in this module.

Counts the total number of trainable parameters, which includes the weight tensor and, if present, the bias tensor.

Returns
size_t The total number of parameters.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.

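For input_features = I and output_features = O the count is I*O for the weight plus O when bias is enabled; a quick consistency check (shapes taken from the weight_ and bias_ documentation below):

    #include <cassert>

    const std::size_t I = 768, O = 3072;
    const std::size_t expected = I * O + ( linear.hasBias() ? O : std::size_t{ 0 } );

    assert( linear.parameterCount() == expected );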

◆ save()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
void Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::save ( ModelArchive & zip) const
inline override export virtual

Serializes the module state to a ZIP archive.

Saves the trainable parameters (weight, bias) to the provided archive. Note: This method is currently a placeholder and needs implementation.

Parameters
    zip  The ZIP archive to save the module state to.

Exceptions
    std::runtime_error  If the serialization fails.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.


◆ toString()

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::string Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::toString ( ) const
inline override export virtual

Converts the module information to a human-readable string.

Includes detailed information about the module configuration including:

  • Module name
  • Input/output features
  • Device type
  • Precision policy
  • Parameter information
Returns
std::string A string representation of the module information.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.


Member Data Documentation

◆ bias_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::shared_ptr<Tensor<TOutput, MR> > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::bias_ { nullptr }
export private

The bias tensor added after the matrix multiplication.

Shape is [output_features]. This tensor is only used if has_bias_ is true.

◆ config_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
LinearConfig Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::config_
export private

Configuration for the Linear module.

◆ operation_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::shared_ptr<UnaryOperation<TDeviceType, TInput, TOutput> > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::operation_ { nullptr }
export private

The underlying operation that implements the Linear transformation.

This operation performs the actual computation for the linear layer, with different implementations for CPU and CUDA devices.

◆ output_state_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::vector<std::shared_ptr<Tensor<TOutput, MR> > > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::output_state_
export private

Cache of intermediate tensors needed for backward pass.

Stores tensors that are computed during the forward pass and are needed for gradient computation during backpropagation.

◆ parameter_grads_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::vector<std::shared_ptr<Tensor<TOutput, MR> > > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::parameter_grads_
export private

Gradients for the parameters of this module.

Contains gradients for the weight tensor and optionally the bias tensor. These are computed during the backward pass.

◆ parameters_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::vector<std::shared_ptr<Tensor<TOutput, MR> > > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::parameters_
export private

Collection of trainable parameters for this module.

Contains the weight tensor and optionally the bias tensor if has_bias_ is true. These parameters are passed to the underlying operation during forward pass.

◆ properties_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
OperationAttributes Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::properties_
export private

Additional configuration options for the linear operation.

These attributes can modify the behavior of the underlying operation implementation without changing the API.

◆ weight_

template<DeviceType TDeviceType = DeviceType::Cuda, typename TInput = float, typename TOutput = TInput>
std::shared_ptr<Tensor<TOutput, MR> > Mila::Dnn::Linear< TDeviceType, TInput, TOutput >::weight_ { nullptr }
export private

The weight tensor for the linear transformation.

Shape is [output_features, input_features] to transform input features to output features through matrix multiplication.

