Mila
Deep Neural Network Library
Softmax module for neural networks.
Public Types
using ModuleBase = Module< TDeviceType, TInput, TOutput >
Alias for the base module type.
using MR = std::conditional_t< TDeviceType==DeviceType::Cuda, CudaMemoryResource, CpuMemoryResource >
Memory resource type used for tensors, selected based on device type.
Public Member Functions
Softmax (const std::string &device_name, const SoftmaxConfig &config)
Constructs a new Softmax module with a device name.
Softmax (std::shared_ptr< DeviceContext > device_context, const SoftmaxConfig &config)
Constructs a new Softmax module with a provided device context.
void backward (const Tensor< TInput, MR > &input, const Tensor< TOutput, MR > &output_grad, Tensor< TInput, MR > &input_grad)
Performs the backward pass of the Softmax operation.
void forward (const Tensor< TInput, MR > &input, Tensor< TOutput, MR > &output)
Performs the forward pass of the softmax operation.
int64_t getAxis () const
Gets the axis used for softmax computation.
void load (ModelArchive &archive) override
Deserializes the module state from a ZIP archive.
size_t parameterCount () const override
Gets the number of trainable parameters in this module.
void save (ModelArchive &zip) const override
Serializes the module state to a ZIP archive.
std::string toString () const override
Generates a string representation of this module's configuration.
Public Member Functions inherited from Module
Module (const std::string &device_name, const ComponentConfig &config)
Constructor with device name.
Module (std::shared_ptr< DeviceContext > context, const ComponentConfig &config)
Constructor with a specific device context.
virtual ~Module ()=default
Virtual destructor for proper cleanup in derived classes.
std::shared_ptr< Compute::DeviceContext > getDeviceContext () const
Get the device context for this module.
Compute::DeviceType getDeviceType () const
Get the device type of the current device context.
std::string getName () const
Get the name of the module.
const auto & getParameterTensors () const
Get the parameter tensors of this module.
const ComputePrecision::Policy & getPrecision () const
Get the compute precision policy for this module.
const auto & getStateTensors () const
Get the state tensors of this module.
bool isTraining () const
Check if the module is in training mode.
virtual void setTraining (bool is_training)
Set the training mode of this module.
Private Member Functions
void createOperation ()
Creates the appropriate softmax operation for the current device.
Private Attributes
OperationAttributes attributes_
Operation attributes and configuration.
SoftmaxConfig config_
Configuration for the Softmax module.
std::shared_ptr< UnaryOperation< TDeviceType, TInput, TOutput > > operation_ { nullptr }
The operation that implements the softmax calculation.
std::vector< std::shared_ptr< Tensor< TOutput, MR > > > output_state_
Collection of output state tensors for caching.
std::vector< std::shared_ptr< Tensor< TInput, MR > > > parameters_
Collection of parameters for this module (empty for Softmax).
Additional Inherited Members
Protected Member Functions inherited from Module
const std::string parametersToString () const
Helper method to convert parameters to string representation.
const std::string stateToString () const
Helper method to convert state tensors to string representation.
Protected Attributes inherited from Module
std::unordered_map< std::string, std::shared_ptr< Tensor< TOutput, MR > > > parameter_map_ = {}
Map of parameter names to parameter tensors.
std::unordered_map< std::string, std::shared_ptr< Tensor< TOutput, MR > > > state_map_ = {}
Map of state names to state tensors.
Softmax module for neural networks.
This class implements the softmax function, which is often used in the final layer of a neural network to convert raw scores into probabilities. The softmax operation normalizes the input values by applying:
softmax(x_i) = exp(x_i) / sum(exp(x_j)) for all j
where the sum is computed over the specified axis. This normalization ensures all values sum to 1, allowing them to be interpreted as probabilities for classification tasks.
TDeviceType: The device type (CPU or CUDA) on which to perform computations.
TInput: The data type of the input tensor elements.
TOutput: The data type of the output tensor elements; defaults to TInput.
ModuleBase
export
Alias for base module type.
MR
export
Memory resource type used for tensors, selected based on device type.
Softmax() [1/2]
inline explicit export
Constructs a new Softmax module with a device name.
Creates a new DeviceContext internally using the provided device name. This constructor is useful for creating standalone modules without pre-existing device contexts.
device_name: The name of the device to use (e.g., "CPU", "CUDA:0").
config: Configuration parameters for the Softmax module.
Throws std::invalid_argument if the device name is invalid or the configuration is invalid.
Throws std::runtime_error if the device type doesn't match the template parameter TDeviceType.
Softmax() [2/2]
inline explicit export
Constructs a new Softmax module with a provided device context.
Uses a pre-existing DeviceContext instance. This constructor is useful when integrating the module into a larger network that shares device contexts across modules.
device_context: The device context to use for this module.
config: Configuration parameters for the Softmax module.
Throws std::invalid_argument if device_context is null or the configuration is invalid.
Throws std::runtime_error if the device context type doesn't match the template parameter TDeviceType.
backward()
inline export
Performs the backward pass of the Softmax operation.
Computes the gradient of the softmax function with respect to its inputs. The gradient of softmax is more complex than most activations because each output depends on all inputs in the same dimension.
input: The input tensor from the forward pass.
output_grad: The gradient of the loss with respect to the output.
input_grad: The tensor to store gradients with respect to the input.
createOperation()
inline export private
Creates the appropriate softmax operation for the current device.
Instantiates either a CPU or CUDA softmax operation based on the device type. Sets the axis attribute needed by the operation to properly apply softmax along the specified dimension.
forward()
inline export
Performs the forward pass of the softmax operation.
Computes the softmax of the input tensor along the specified axis and writes the result to the output tensor. The operation exponentiates each element and then normalizes by the sum of all exponentiated values along the specified axis.
input: The input tensor to apply softmax to.
output: The tensor where softmax results will be stored.
getAxis()
inline export
Gets the axis used for softmax computation.
load()
inline override export virtual
Deserializes the module state from a ZIP archive.
Implementation of the Module interface for deserialization. Since Softmax has no learnable parameters, this is a no-op implementation.
archive: ZIP archive for deserialization.
Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.
parameterCount()
inline override export virtual
Gets the number of trainable parameters in this module.
The Softmax module has no trainable parameters as it's a fixed mathematical operation.
Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.
save()
inline override export virtual
Serializes the module state to a ZIP archive.
Implementation of the Module interface for serialization. Since Softmax has no learnable parameters, this is a no-op implementation.
zip: ZIP archive for serialization.
Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.
toString()
inline override export virtual
Generates a string representation of this module's configuration.
Implements Mila::Dnn::Module< TDeviceType, TInput, TOutput >.
attributes_
export private
Operation attributes and configuration.
config_
export private
Configuration for the Softmax module.
operation_
export private
The operation that implements the softmax calculation.
output_state_
export private
Collection of output state tensors for caching.
parameters_
export private
Collection of parameters for this module (empty for Softmax).