|
Mila 0.13.48
Deep Neural Network Library
|
Softmax activation module (device-templated).


Public Types | |
| using | ComponentBase = Component<TDeviceType, TPrecision> |
| using | MR = typename DeviceTypeTraits<TDeviceType>::memory_resource |
| using | TensorType = Tensor<TPrecision, MR> |
Public Member Functions | |
| Softmax (const std::string &name, const SoftmaxConfig &config, std::optional< DeviceId > device_id=std::nullopt) | |
| ~Softmax () override=default | |
| void | backward (const ITensor &input, const ITensor &output_grad, ITensor &input_grad) |
| Backward pass - delegates to backend operation. | |
| void | forward (const ITensor &input, ITensor &output) |
| Forward pass - delegates to backend operation. | |
| int64_t | getAxis () const noexcept |
| Get the softmax axis. | |
| DeviceId | getDeviceId () const override |
| Get the device identifier for this module. | |
| std::vector< ITensor * > | getGradients () const override |
| Get parameter gradient tensors. | |
| MemoryStats | getMemoryStats () const override |
| Return the current memory allocation breakdown for this component. | |
| std::vector< ITensor * > | getParameters () const override |
| Get trainable parameter tensors. | |
| const ComponentType | getType () const override |
| Get the component type identifier. | |
| size_t | parameterCount () const override |
| Number of trainable parameters. | |
| void | save_ (ModelArchive &archive, SerializationMode mode) const override |
| Persist module state to archive. | |
| void | synchronize () override |
| Wait for all asynchronous work submitted by this module to complete. | |
| std::string | toString () const override |
| Generate human-readable description of the module. | |
| Public Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| Component (const std::string &name) | |
| Construct component with required name identifier. | |
| virtual | ~Component ()=default |
| virtual void | build (const BuildContext &context) final |
| Build the component with the provided BuildContext (canonical overload). | |
| const std::string | getName () const |
| Get the component's name identifier. | |
| virtual std::vector< std::string > | getParameterNames () const |
| List all available parameter names for this component. | |
| RuntimeMode | getRuntimeMode () const noexcept |
| Convenience accessor — true if currently in Eval mode. | |
| TrainingMode | getTrainingMode () const noexcept |
| The current runtime behavioral mode of this Component. | |
| virtual bool | isBuilt () const final |
| Returns true if build() has completed successfully. | |
| bool | isInferenceMode () const noexcept |
| bool | isTrainingMode () const noexcept |
| virtual void | loadParameter (const std::string &name, const Serialization::ITensorBlob &blob) |
| Load a parameter from serialized tensor data. | |
| void | setTrainingMode (TrainingMode mode) |
| Set the runtime behavioral mode for this Component. | |
| virtual void | zeroGradients () |
| Clear all model-owned gradients for this component. | |
Protected Member Functions | |
| void | onBuilding (const BuildContext &build_config) override |
| Hook invoked during build() to initialize component with input shape. | |
| void | onExecutionContextSet () override |
| Hook invoked after ExecutionContext is set. | |
| void | onTrainingModeChanging (TrainingMode training_mode) override |
| Hook invoked when training mode changes. | |
| Protected Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| IExecutionContext * | getExecutionContext () const |
| Get the shared execution context. | |
| bool | hasExecutionContext () const noexcept |
| Check if execution context has been set. | |
| template<TensorDataType TParameterPrecision, typename TMemoryResource> | |
| void | loadParameterFromBlob (const std::string &param_name, const Serialization::ITensorBlob &blob, Tensor< TParameterPrecision, TMemoryResource > &target, const shape_t &expected_shape) |
| Load a tensor blob into a parameter tensor with validation. | |
| void | setExecutionContext (IExecutionContext *context) |
| Set the execution context for this component. | |
Private Types | |
| using | OpType = typename OperationTraits<OperationType::SoftmaxOp, TDeviceType, TPrecision>::type |
Private Member Functions | |
| void | createOperation () |
| Create the backend compute operation. | |
| void | validateInputShape (const ITensor &input) const |
| Validate input shape for softmax operation. | |
| void | validateInputShape (const shape_t &input_shape) const |
| Validate input shape for softmax operation. | |
Private Attributes | |
| SoftmaxConfig | config_ |
| std::shared_ptr< OpType > | operation_ { nullptr } |
| std::unique_ptr< IExecutionContext > | owned_exec_context_ { nullptr } |
Additional Inherited Members | |
| Static Public Member Functions inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| static constexpr DeviceType | getDeviceType () |
| Compile-time device type for this component instance. | |
| static constexpr TensorDataType | getPrecision () noexcept |
| Compile-time tensor precision for this component instance. | |
| Protected Attributes inherited from Mila::Dnn::Component< TDeviceType, TPrecision > | |
| BuildContext | build_context_ { shape_t{ 1 }, RuntimeMode::Training } |
| The BuildContext stored at build time. | |
Softmax activation module (device-templated).
Delegates computation to a device-specific UnaryOperation implementation registered in the OperationRegistry.
Softmax is a stateless activation function with no trainable parameters. The operation computes: softmax(x) = exp(x - max(x)) / sum(exp(x - max(x))) across a specified axis.
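The formula above can be illustrated with a minimal reference sketch over a single 1-D slice. This is not Mila's backend implementation: `softmax_reference` is a hypothetical helper name, and plain `std::vector<float>` stands in for the library's tensor types. It only shows why subtracting max(x) before exponentiating is safe and useful.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of Mila): numerically stable
// softmax over one 1-D slice, i.e. the values along the softmax axis.
std::vector<float> softmax_reference(const std::vector<float>& x) {
    std::vector<float> y(x.size());
    if (x.empty()) {
        return y;
    }

    // Subtracting the maximum keeps exp() from overflowing. The shift
    // cancels between numerator and denominator, so the result is unchanged.
    const float x_max = *std::max_element(x.begin(), x.end());

    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = std::exp(x[i] - x_max);
        sum += y[i];
    }

    // Normalize so the outputs sum to 1.
    for (float& v : y) {
        v /= sum;
    }
    return y;
}
```

For a multi-dimensional tensor, the same computation would be applied independently to every slice along the configured axis.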
Construction Modes:
Ownership:
| TDeviceType | Device type (DeviceType::Cpu or DeviceType::Cuda) |
| TPrecision | Abstract tensor precision (TensorDataType) |
|
inline explicit export |

|
override export default |
|
inline export |
Backward pass - delegates to backend operation.
Computes gradient: dX = Y * (dY - dot(Y, dY)) where Y is the softmax output.

|
inline export private |
Create the backend compute operation.
Uses the shared ExecutionContext from the base class to request a device-specific UnaryOperation from the OperationRegistry.


|
inline export |
Forward pass - delegates to backend operation.
Computes softmax activation across the configured axis.

|
inline export noexcept |
Get the softmax axis.
|
inline override export virtual |
Get the device identifier for this module.
Returns the DeviceId from the ExecutionContext. In standalone mode, this is the device specified at construction. In shared mode, this is the parent's device.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export virtual |
Get parameter gradient tensors.
Softmax has no trainable parameters, therefore no gradients.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Return the current memory allocation breakdown for this component.
Reflects allocations at the moment of the call. The returned stats naturally track the component lifecycle:
After construction: parameters only
After build( Inference ): parameters + T=1 state buffers
After build( Training ): parameters + T=full state buffers
After setEvaluation( false ): parameters + state + gradients
For CompositeComponent and Network, the returned stats are the recursive aggregate of all child components.
May be called at any time — no lifecycle preconditions.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Get trainable parameter tensors.
Softmax has no trainable parameters.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Get the component type identifier.
Used for serialization and runtime type identification.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export protected virtual |
Hook invoked during build() to initialize component with input shape.
Softmax is stateless and has no parameters to allocate. This method validates the input shape and delegates to the backend operation's build method to cache dimension computations.
| input_shape | Expected shape for input tensors. |
| std::invalid_argument | if input_shape is invalid or axis out of bounds. |
| std::runtime_error | if backend build fails. |
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export protected virtual |
Hook invoked after ExecutionContext is set.
Called by Component::setExecutionContext() after the context is registered. Creates the backend UnaryOperation using the OperationRegistry.
This hook is triggered in two scenarios:
| std::runtime_error | if operation creation fails. |
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export protected virtual |
Hook invoked when training mode changes.
Propagates training mode to the backend operation. Called by Component::setTrainingMode() with the training mutex held.
| training_mode | New training mode state. |
Reimplemented from Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Number of trainable parameters.
Softmax is stateless and exposes no trainable parameters.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Persist module state to archive.
Softmax is stateless (no trainable tensors) but persists:
| archive | Archive to write to. |
| mode | Serialization mode (currently unused for stateless components). |
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.
|
inline override export virtual |
Wait for all asynchronous work submitted by this module to complete.
Synchronizes the underlying ExecutionContext. On CPU implementations this may be a no-op. Use to ensure results are visible on the host or to measure synchronous timings.
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline override export virtual |
Generate human-readable description of the module.
Produces a multi-line string showing:
Implements Mila::Dnn::Component< TDeviceType, TPrecision >.

|
inline export private |
Validate input shape for softmax operation.
Ensures the input has valid rank and the configured axis is within bounds.


|
inline export private |
Validate input shape for softmax operation.
Ensures the input has valid rank and the configured axis is within bounds.
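The kind of check described here might look like the sketch below. It is an illustration only: `resolve_softmax_axis` is a hypothetical helper, the shape is modeled as a plain `std::vector<std::int64_t>`, and the handling of negative axes (counting back from the last dimension) is an assumption that this page does not confirm for Mila. The use of `std::invalid_argument` follows the exception listed for onBuilding().

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Mila): validates rank and axis, then
// returns the axis as a non-negative index.
// Assumption: negative axes count from the last dimension (-1 == last).
std::size_t resolve_softmax_axis(const std::vector<std::int64_t>& shape,
                                 std::int64_t axis) {
    if (shape.empty()) {
        throw std::invalid_argument("softmax: input must have rank >= 1");
    }

    const auto rank = static_cast<std::int64_t>(shape.size());
    if (axis < -rank || axis >= rank) {
        throw std::invalid_argument(
            "softmax: axis " + std::to_string(axis) +
            " is out of bounds for rank " + std::to_string(rank));
    }

    // Map a negative axis onto its non-negative equivalent.
    return static_cast<std::size_t>(axis < 0 ? axis + rank : axis);
}
```

Resolving the axis once, at build time, would let the forward and backward passes work with an already validated index.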
