Mila
Deep Neural Network Library

Mila::Dnn::Linear Class Reference

A class representing a linear transformation module.


Public Types | |
| using | ModuleBase = Module< TDeviceType, TInput, TOutput > |
| Alias for base module type. | |
| using | MR = std::conditional_t< TDeviceType==DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource > |
| Memory resource type used for tensors, selected based on device type. | |
Public Types inherited from Mila::Dnn::Module< TDeviceType, TInput, TOutput > | |
| using | MR = std::conditional_t< TDeviceType==DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource > |
Public Member Functions

    Linear(const std::string& device_name, const LinearConfig& config)
        Constructs a new Linear module with a device name.

    Linear(std::shared_ptr<DeviceContext> device_context, const LinearConfig& config)
        Constructs a new Linear module with a provided device context.

    void backward(const Tensor<TInput, MR>& input, const Tensor<TOutput, MR>& output_grad, Tensor<TInput, MR>& input_grad)
        Performs the backward pass of the Linear operation.

    void forward(const Tensor<TInput, MR>& input, Tensor<TOutput, MR>& output)
        Performs the forward pass of the Linear operation.

    std::optional<std::shared_ptr<Tensor<TOutput, MR>>> getBias()
        Retrieves the bias tensor if present.

    std::shared_ptr<Tensor<TOutput, MR>> getWeight()
        Retrieves the weight tensor for this linear layer.

    bool hasBias() const
        Checks whether the module has a bias tensor.

    void load(ModelArchive& archive) override
        Deserializes the module state from a ZIP archive.

    size_t parameterCount() const override
        Gets the number of trainable parameters in this module.

    void save(ModelArchive& zip) const override
        Serializes the module state to a ZIP archive.

    std::string toString() const override
        Converts the module information to a human-readable string.
Public Member Functions inherited from Mila::Dnn::Module<TDeviceType, TInput, TOutput>

    Module(const std::string& device_name, const ComponentConfig& config)
        Constructor with a device name.

    Module(std::shared_ptr<DeviceContext> context, const ComponentConfig& config)
        Constructor with a specific device context.

    virtual ~Module() = default
        Virtual destructor for proper cleanup in derived classes.

    std::shared_ptr<Compute::DeviceContext> getDeviceContext() const
        Gets the device context for this module.

    Compute::DeviceType getDeviceType() const
        Gets the device type of the current device context.

    std::string getName() const
        Gets the name of the module.

    const auto& getParameterTensors() const
        Gets the parameter tensors of this module.

    const ComputePrecision::Policy& getPrecision() const
        Gets the compute precision policy of this module.

    const auto& getStateTensors() const
        Gets the state tensors of this module.

    bool isTraining() const
        Checks whether the module is in training mode.

    virtual void setTraining(bool is_training)
        Sets the training mode of this module.
Private Member Functions

    void createOperation()
        Creates the appropriate Linear operation based on the current device context.

    void initializeParameterGradients()
        Initializes gradient tensors for parameters.

    void initializeParameters()
        Initializes the tensors needed for the Linear operation.
Private Attributes

    std::shared_ptr<Tensor<TOutput, MR>> bias_ { nullptr }
        The bias tensor added after the matrix multiplication.

    LinearConfig config_
        Configuration for the Linear module.

    std::shared_ptr<UnaryOperation<TDeviceType, TInput, TOutput>> operation_ { nullptr }
        The underlying operation that implements the Linear transformation.

    std::vector<std::shared_ptr<Tensor<TOutput, MR>>> output_state_
        Cache of intermediate tensors needed for the backward pass.

    std::vector<std::shared_ptr<Tensor<TOutput, MR>>> parameter_grads_
        Gradients for the parameters of this module.

    std::vector<std::shared_ptr<Tensor<TOutput, MR>>> parameters_
        Collection of trainable parameters for this module.

    OperationAttributes properties_
        Additional configuration options for the linear operation.

    std::shared_ptr<Tensor<TOutput, MR>> weight_ { nullptr }
        The weight tensor for the linear transformation.
Additional Inherited Members

Protected Member Functions inherited from Mila::Dnn::Module<TDeviceType, TInput, TOutput>

    const std::string parametersToString() const
        Helper method to convert parameters to a string representation.

    const std::string stateToString() const
        Helper method to convert state tensors to a string representation.

Protected Attributes inherited from Mila::Dnn::Module<TDeviceType, TInput, TOutput>

    std::unordered_map<std::string, std::shared_ptr<Tensor<TOutput, MR>>> parameter_map_ = {}
        Map of parameter names to parameter tensors.

    std::unordered_map<std::string, std::shared_ptr<Tensor<TOutput, MR>>> state_map_ = {}
        Map of state names to state tensors.
Detailed Description

A class representing a linear transformation module.

The linear module (also known as a fully-connected or dense layer) applies a linear transformation to the input data. With the weight stored as [output_features, input_features], the operation is: output = input * weight^T + bias.

This is a fundamental building block in neural networks: it connects every input neuron to every output neuron through learnable weights.

Template Parameters

    TDeviceType    The device type (CPU or CUDA) on which to perform computations.
    TInput         The data type of the input tensor elements.
    TOutput        The data type of the output tensor elements; defaults to TInput.
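A minimal usage sketch, assuming a LinearConfig(input_features, output_features) constructor and a shape-taking Tensor constructor (both assumptions; only the Linear constructors and forward() documented below come from this page; the DeviceType::Cpu enumerator name is also assumed):

    #include <memory>

    using namespace Mila::Dnn;

    void runLinear()
    {
        // Hypothetical LinearConfig arguments: (input_features, output_features).
        LinearConfig config( 768, 3072 );
        Linear<DeviceType::Cpu, float> fc( "CPU", config );

        using MR = Linear<DeviceType::Cpu, float>::MR;
        Tensor<float, MR> input( { 1, 768 } );     // [batch, input_features] (assumed ctor)
        Tensor<float, MR> output( { 1, 3072 } );   // [batch, output_features]

        fc.forward( input, output );               // output = input * weight^T + bias
    }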
ModuleBase    [export]

using ModuleBase = Module<TDeviceType, TInput, TOutput>

Alias for the base module type.
MR    [export]

using MR = std::conditional_t<TDeviceType == DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource>

Memory resource type used for tensors, selected based on device type.

Uses CudaMemoryResource for CUDA devices and CpuMemoryResource for CPU devices.
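A sketch of how the alias resolves at compile time (std::conditional_t comes from <type_traits>; the DeviceType::Cpu enumerator name is an assumption):

    #include <type_traits>

    // MR resolves to CudaMemoryResource when TDeviceType == DeviceType::Cuda,
    // and to CpuMemoryResource otherwise.
    static_assert( std::is_same_v<Linear<DeviceType::Cuda, float>::MR, CudaMemoryResource> );
    static_assert( std::is_same_v<Linear<DeviceType::Cpu,  float>::MR, CpuMemoryResource> );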
Linear() [1/2]

Linear( const std::string& device_name, const LinearConfig& config )    [inline, explicit, export]

Constructs a new Linear module with a device name.

Creates a new DeviceContext internally using the provided device name. This constructor is useful for creating standalone modules without pre-existing device contexts.

Parameters
    device_name    The name of the device to use (e.g., "CPU", "CUDA:0").
    config         Configuration parameters for the Linear module.

Exceptions
    std::invalid_argument    If the device name or the configuration is invalid.
    std::runtime_error       If the device type does not match the template parameter TDeviceType.
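A hedged construction sketch showing the documented failure modes (the LinearConfig constructor arguments are an assumption):

    #include <stdexcept>

    try {
        LinearConfig config( 512, 256 );  // hypothetical (input_features, output_features)
        Linear<DeviceType::Cuda, float> fc( "CUDA:0", config );
    }
    catch ( const std::invalid_argument& e ) {
        // Invalid device name or invalid configuration.
    }
    catch ( const std::runtime_error& e ) {
        // Device type mismatch, e.g. "CPU" passed while TDeviceType is DeviceType::Cuda.
    }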

Linear() [2/2]

Linear( std::shared_ptr<DeviceContext> device_context, const LinearConfig& config )    [inline, explicit, export]

Constructs a new Linear module with a provided device context.

Uses a pre-existing DeviceContext instance. This constructor is useful when integrating the module into a larger network that shares device contexts across modules.

Parameters
    device_context    The device context to use for this module.
    config            Configuration parameters for the Linear module.

Exceptions
    std::invalid_argument    If device_context is null or the configuration is invalid.
    std::runtime_error       If the device context type does not match the template parameter TDeviceType.
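A sketch of context sharing across modules (the DeviceContext and LinearConfig constructor arguments are assumptions):

    #include <memory>

    // One context shared by two layers so they run on the same device.
    auto ctx = std::make_shared<DeviceContext>( "CUDA:0" );   // hypothetical constructor

    Linear<DeviceType::Cuda, float> fc1( ctx, LinearConfig( 768, 3072 ) );
    Linear<DeviceType::Cuda, float> fc2( ctx, LinearConfig( 3072, 768 ) );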

backward()

void backward( const Tensor<TInput, MR>& input, const Tensor<TOutput, MR>& output_grad, Tensor<TInput, MR>& input_grad )    [inline, export]

Performs the backward pass of the Linear operation.

Computes gradients for the input tensor and for the parameters based on the output gradients.

Parameters
    input          The input tensor from the forward pass.
    output_grad    The gradient of the loss with respect to the output.
    input_grad     The tensor that receives the gradients with respect to the input.
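A sketch of one forward/backward step, continuing the construction sketches above (the shape-taking Tensor constructor is an assumption; forward() and backward() are documented on this page):

    const size_t batch = 32, input_features = 768, output_features = 3072;

    Tensor<float, MR> input( { batch, input_features } );
    Tensor<float, MR> output( { batch, output_features } );
    Tensor<float, MR> output_grad( { batch, output_features } );
    Tensor<float, MR> input_grad( { batch, input_features } );

    fc.forward( input, output );
    // ... compute the loss and fill output_grad with dLoss/dOutput ...
    fc.backward( input, output_grad, input_grad );
    // Parameter gradients (weight, bias) are stored in parameter_grads_;
    // input_grad can be passed to the previous layer's backward pass.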
createOperation()

void createOperation()    [inline, export, private]

Creates the appropriate Linear operation based on the current device context.

Initializes the operation_ member with the CPU or CUDA implementation of Linear, based on the current device context, and passes the compute precision policy to the operation.

Exceptions
    std::runtime_error    If the operation creation fails.
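A minimal sketch of the dispatch this method describes (the factory functions are hypothetical; operation_, getDeviceType(), getPrecision(), and the exception come from this page):

    void createOperation()
    {
        // Select the device-specific implementation of the Linear operation.
        if ( getDeviceType() == DeviceType::Cuda )
            operation_ = makeCudaLinearOp<TInput, TOutput>( config_, getPrecision() );  // hypothetical factory
        else
            operation_ = makeCpuLinearOp<TInput, TOutput>( config_, getPrecision() );   // hypothetical factory

        if ( !operation_ )
            throw std::runtime_error( "Failed to create Linear operation" );
    }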


forward()

void forward( const Tensor<TInput, MR>& input, Tensor<TOutput, MR>& output )    [inline, export]

Performs the forward pass of the Linear operation.

Applies the linear transformation to the input tensor: output = input * weight^T + bias (when bias is enabled).

Parameters
    input     The input tensor to be transformed.
    output    The output tensor where the results are stored.
getBias()

std::optional<std::shared_ptr<Tensor<TOutput, MR>>> getBias()    [inline, export]

Retrieves the bias tensor if present.

The bias tensor has shape [output_features] and is initialized to zeros when bias is enabled in the layer configuration. Returns std::nullopt when the layer has no bias.

getWeight()

std::shared_ptr<Tensor<TOutput, MR>> getWeight()    [inline, export]

Retrieves the weight tensor for this linear layer.

The weight tensor has shape [output_features, input_features] and is initialized with the Xavier/Glorot uniform distribution.
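For reference, Xavier/Glorot uniform initialization draws each weight from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)); for this layer, fan_in = input_features and fan_out = output_features.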
hasBias()

bool hasBias() const    [inline, export]

Checks whether the module has a bias tensor. Returns true when the layer was configured with a bias term.
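A short sketch using the accessors above (fc is a Linear instance as in the earlier sketches):

    auto weight = fc.getWeight();            // shape: [output_features, input_features]

    if ( fc.hasBias() )
    {
        auto bias = fc.getBias().value();    // shape: [output_features]
    }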

initializeParameterGradients()

void initializeParameterGradients()    [inline, export, private]

Initializes gradient tensors for parameters.

Creates tensors to store gradients for the weights and, if present, the biases. These tensors are populated during backpropagation.


initializeParameters()

void initializeParameters()    [inline, export, private]

Initializes the tensors needed for the Linear operation.

Creates and initializes the weight tensor (Xavier/Glorot uniform) and, when bias is enabled, the bias tensor (zero-initialized). The tensors are created on the appropriate device (CPU or CUDA) based on the current device context.


load()

void load( ModelArchive& archive ) override    [inline, virtual, override, export]

Deserializes the module state from a ZIP archive.

Loads the trainable parameters (weight, bias) from the provided archive. Note: this method is currently a placeholder and needs implementation.

Parameters
    archive    The ZIP archive to load the module state from.

Exceptions
    std::runtime_error    If deserialization fails.

Implements Mila::Dnn::Module<TDeviceType, TInput, TOutput>.

parameterCount()

size_t parameterCount() const override    [inline, virtual, override, export]

Gets the number of trainable parameters in this module.

Counts the total number of trainable parameters: the weight tensor plus, if present, the bias tensor.

Implements Mila::Dnn::Module<TDeviceType, TInput, TOutput>.
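Given the parameter shapes documented above, the count is output_features * input_features for the weight, plus output_features when bias is enabled. For example, a layer with input_features = 768, output_features = 3072, and bias has 3072 * 768 + 3072 = 2,362,368 parameters.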


save()

void save( ModelArchive& zip ) const override    [inline, virtual, override, export]

Serializes the module state to a ZIP archive.

Saves the trainable parameters (weight, bias) to the provided archive. Note: this method is currently a placeholder and needs implementation.

Parameters
    zip    The ZIP archive to save the module state to.

Exceptions
    std::runtime_error    If serialization fails.

Implements Mila::Dnn::Module<TDeviceType, TInput, TOutput>.
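A hedged round-trip sketch (the ModelArchive constructor is an assumption, and both methods are documented above as placeholders):

    ModelArchive archive( "linear.zip" );   // hypothetical constructor
    fc.save( archive );
    // ... later, restore the trained parameters ...
    fc.load( archive );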

toString()

std::string toString() const override    [inline, virtual, override, export]

Converts the module information to a human-readable string.

Includes detailed information about the module configuration, such as the module name, device, and parameter details.

Implements Mila::Dnn::Module<TDeviceType, TInput, TOutput>.

bias_

std::shared_ptr<Tensor<TOutput, MR>> bias_ { nullptr }    [export, private]

The bias tensor added after the matrix multiplication.

Shape is [output_features]. This tensor is only used when the layer is configured with a bias term.
config_

LinearConfig config_    [export, private]

Configuration for the Linear module.
operation_

std::shared_ptr<UnaryOperation<TDeviceType, TInput, TOutput>> operation_ { nullptr }    [export, private]

The underlying operation that implements the Linear transformation.

This operation performs the actual computation for the linear layer, with different implementations for CPU and CUDA devices.
output_state_

std::vector<std::shared_ptr<Tensor<TOutput, MR>>> output_state_    [export, private]

Cache of intermediate tensors needed for the backward pass.

Stores tensors that are computed during the forward pass and are needed for gradient computation during backpropagation.
parameter_grads_

std::vector<std::shared_ptr<Tensor<TOutput, MR>>> parameter_grads_    [export, private]

Gradients for the parameters of this module.

Contains gradients for the weight tensor and, optionally, the bias tensor. These are computed during the backward pass.
parameters_

std::vector<std::shared_ptr<Tensor<TOutput, MR>>> parameters_    [export, private]

Collection of trainable parameters for this module.

Contains the weight tensor and, when the layer has a bias term, the bias tensor. These parameters are passed to the underlying operation during the forward pass.
properties_

OperationAttributes properties_    [export, private]

Additional configuration options for the linear operation.

These attributes can modify the behavior of the underlying operation implementation without changing the API.
weight_

std::shared_ptr<Tensor<TOutput, MR>> weight_ { nullptr }    [export, private]

The weight tensor for the linear transformation.

Shape is [output_features, input_features], transforming input features to output features through matrix multiplication.