Mila
Deep Neural Network Library

A class representing a linear transformation module.
Public Types

using ModuleBase = Module< TDeviceType, TInput, TOutput >
    Alias for base module type.

using MR = std::conditional_t< TDeviceType == DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource >
    Memory resource type used for tensors, selected based on device type.

Public Types inherited from Module< TDeviceType, TInput, TOutput >

using MR = std::conditional_t< TDeviceType == DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource >
Public Member Functions

Linear (const std::string &device_name, const LinearConfig &config)
    Constructs a new Linear module with a device name.

Linear (std::shared_ptr< DeviceContext > device_context, const LinearConfig &config)
    Constructs a new Linear module with a provided device context.

void backward (const Tensor< TInput, MR > &input, const Tensor< TOutput, MR > &output_grad, Tensor< TInput, MR > &input_grad)
    Performs the backward pass of the Linear operation.

void forward (const Tensor< TInput, MR > &input, Tensor< TOutput, MR > &output)
    Performs the forward pass of the Linear operation.

std::optional< std::shared_ptr< Tensor< TOutput, MR > > > getBias ()
    Retrieves the bias tensor if present.

std::shared_ptr< Tensor< TOutput, MR > > getWeight ()
    Retrieves the weight tensor for this linear layer.

bool hasBias () const
    Checks whether the module has a bias tensor.

void load (ModelArchive &archive) override
    Deserializes the module state from a ZIP archive.

size_t parameterCount () const override
    Gets the number of trainable parameters in this module.

void save (ModelArchive &zip) const override
    Serializes the module state to a ZIP archive.

std::string toString () const override
    Converts the module information to a human-readable string.
Public Member Functions inherited from Module< TDeviceType, TInput, TOutput >

Module (const std::string &device_name, const ComponentConfig &config)
    Constructor with device name.

Module (std::shared_ptr< DeviceContext > context, const ComponentConfig &config)
    Constructor with a specific device context.

virtual ~Module () = default
    Virtual destructor for proper cleanup in derived classes.

std::shared_ptr< Compute::DeviceContext > getDeviceContext () const
    Get the device context for this module.

Compute::DeviceType getDeviceType () const
    Get the device type of the current device context.

std::string getName () const
    Get the name of the module.

const auto & getParameterTensors () const
    Get the parameter tensors of this module.

const ComputePrecision::Policy & getPrecision () const
    Get the compute precision policy for this module.

const auto & getStateTensors () const
    Get the state tensors of this module.

bool isTraining () const
    Check if the module is in training mode.

virtual void setTraining (bool is_training)
    Set the training mode of this module.
Private Member Functions

void createOperation ()
    Creates the appropriate Linear operation based on the current device context.

void initializeParameterGradients ()
    Initializes gradient tensors for parameters.

void initializeParameters ()
    Initializes the tensors needed for the Linear operation.
Private Attributes

std::shared_ptr< Tensor< TOutput, MR > > bias_ { nullptr }
    The bias tensor added after the matrix multiplication.

LinearConfig config_
    Configuration for the Linear module.

std::shared_ptr< UnaryOperation< TDeviceType, TInput, TOutput > > operation_ { nullptr }
    The underlying operation that implements the Linear transformation.

std::vector< std::shared_ptr< Tensor< TOutput, MR > > > output_state_
    Cache of intermediate tensors needed for the backward pass.

std::vector< std::shared_ptr< Tensor< TOutput, MR > > > parameter_grads_
    Gradients for the parameters of this module.

std::vector< std::shared_ptr< Tensor< TOutput, MR > > > parameters_
    Collection of trainable parameters for this module.

OperationAttributes properties_
    Additional configuration options for the linear operation.

std::shared_ptr< Tensor< TOutput, MR > > weight_ { nullptr }
    The weight tensor for the linear transformation.
Additional Inherited Members

Member Functions inherited from Module< TDeviceType, TInput, TOutput >

const std::string parametersToString () const
    Helper method to convert parameters to a string representation.

const std::string stateToString () const
    Helper method to convert state tensors to a string representation.

Attributes inherited from Module< TDeviceType, TInput, TOutput >

std::unordered_map< std::string, std::shared_ptr< Tensor< TOutput, MR > > > parameter_map_ = {}
    Map of parameter names to parameter tensors.

std::unordered_map< std::string, std::shared_ptr< Tensor< TOutput, MR > > > state_map_ = {}
    Map of state names to state tensors.
Detailed Description

A class representing a linear transformation module.

The linear module (also known as a fully connected or dense layer) performs a linear transformation of the input data. The operation is defined as: output = input * weight + bias.

This is a fundamental building block in neural networks that connects every input neuron to every output neuron through learnable weights.

Template Parameters
    TDeviceType: The device type (CPU or CUDA) on which to perform computations.
    TInput: The data type of the input tensor elements.
    TOutput: The data type of the output tensor elements; defaults to TInput.
ModuleBase  [export]

Alias for base module type.

MR  [export]

Memory resource type used for tensors, selected based on device type.

Uses CudaMemoryResource for CUDA devices and CpuMemoryResource for CPU devices.
Linear (const std::string &device_name, const LinearConfig &config)  [inline, explicit, export]

Constructs a new Linear module with a device name.

Creates a new DeviceContext internally using the provided device name. This constructor is useful for creating standalone modules without pre-existing device contexts.

Parameters
    device_name: The name of the device to use (e.g., "CPU", "CUDA:0").
    config: Configuration parameters for the Linear module.

Exceptions
    std::invalid_argument: If the device name or the configuration is invalid.
    std::runtime_error: If the device type doesn't match the template parameter TDeviceType.

Linear (std::shared_ptr< DeviceContext > device_context, const LinearConfig &config)  [inline, explicit, export]

Constructs a new Linear module with a provided device context.

Uses a pre-existing DeviceContext instance. This constructor is useful when integrating the module into a larger network that shares device contexts across modules.

Parameters
    device_context: The device context to use for this module.
    config: Configuration parameters for the Linear module.

Exceptions
    std::invalid_argument: If device_context is null or the configuration is invalid.
    std::runtime_error: If the device context type doesn't match the template parameter TDeviceType.
void backward (const Tensor< TInput, MR > &input, const Tensor< TOutput, MR > &output_grad, Tensor< TInput, MR > &input_grad)  [inline, export]

Performs the backward pass of the Linear operation.

Computes gradients for the input tensor and parameters based on the output gradients.

Parameters
    input: The input tensor from the forward pass.
    output_grad: The gradient of the loss with respect to the output.
    input_grad: The tensor to store gradients with respect to the input.
void createOperation ()  [inline, export, private]

Creates the appropriate Linear operation based on the current device context.

This method initializes the operation_ member with the appropriate implementation of Linear for either CPU or CUDA, based on the current device context. It also passes the compute precision policy to the operation.

Exceptions
    std::runtime_error: If the operation creation fails.

void forward (const Tensor< TInput, MR > &input, Tensor< TOutput, MR > &output)  [inline, export]

Performs the forward pass of the Linear operation.

Applies the linear transformation to the input tensor: output = input * weight + bias (if bias is enabled).

Parameters
    input: The input tensor to be transformed.
    output: The output tensor where the results will be stored.
std::optional< std::shared_ptr< Tensor< TOutput, MR > > > getBias ()  [inline, export]

Retrieves the bias tensor if present.

The bias tensor has shape [output_features] and is initialized to zeros if bias is enabled in the layer configuration.

std::shared_ptr< Tensor< TOutput, MR > > getWeight ()  [inline, export]

Retrieves the weight tensor for this linear layer.

The weight tensor has shape [output_features, input_features] and is initialized with a Xavier/Glorot uniform distribution.
bool hasBias () const  [inline, export]

Checks whether the module has a bias tensor.

void initializeParameterGradients ()  [inline, export, private]

Initializes gradient tensors for parameters.

Creates tensors to store gradients for weights and biases (if present). These tensors will be populated during backpropagation.

void initializeParameters ()  [inline, export, private]

Initializes the tensors needed for the Linear operation.

Creates and initializes the weight tensor and, if bias is enabled, the bias tensor. The tensors are created on the appropriate device (CPU or CUDA) based on the current device context.
void load (ModelArchive &archive) override  [inline, override, export, virtual]

Deserializes the module state from a ZIP archive.

Loads the trainable parameters (weight, bias) from the provided archive. Note: this method is currently a placeholder and needs implementation.

Parameters
    archive: The ZIP archive to load the module state from.

Exceptions
    std::runtime_error: If the deserialization fails.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.

size_t parameterCount () const override  [inline, override, export, virtual]

Gets the number of trainable parameters in this module.

Counts the total number of trainable parameters, which includes the weight tensor and, if present, the bias tensor.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.
void save (ModelArchive &zip) const override  [inline, override, export, virtual]

Serializes the module state to a ZIP archive.

Saves the trainable parameters (weight, bias) to the provided archive. Note: this method is currently a placeholder and needs implementation.

Parameters
    zip: The ZIP archive to save the module state to.

Exceptions
    std::runtime_error: If the serialization fails.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.

std::string toString () const override  [inline, override, export, virtual]

Converts the module information to a human-readable string.

Includes detailed information about the module configuration.

Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.
bias_  [export, private]

The bias tensor added after the matrix multiplication.

Shape is [output_features]. This tensor is only used if has_bias_ is true.

config_  [export, private]

Configuration for the Linear module.

operation_  [export, private]

The underlying operation that implements the Linear transformation.

This operation performs the actual computation for the linear layer, with different implementations for CPU and CUDA devices.

output_state_  [export, private]

Cache of intermediate tensors needed for the backward pass.

Stores tensors that are computed during the forward pass and are needed for gradient computation during backpropagation.

parameter_grads_  [export, private]

Gradients for the parameters of this module.

Contains gradients for the weight tensor and optionally the bias tensor. These are computed during the backward pass.

parameters_  [export, private]

Collection of trainable parameters for this module.

Contains the weight tensor and optionally the bias tensor if has_bias_ is true. These parameters are passed to the underlying operation during the forward pass.

properties_  [export, private]

Additional configuration options for the linear operation.

These attributes can modify the behavior of the underlying operation implementation without changing the API.

weight_  [export, private]

The weight tensor for the linear transformation.

Shape is [output_features, input_features] to transform input features to output features through matrix multiplication.