|
Mila 0.13.48
Deep Neural Network Library
|
CPU implementation of the cross entropy loss operation for neural networks.


Public Types | |
| using | MR = typename CpuDevice::MR |
| using | OperationBase = UnaryOperation<DeviceType::Cpu, int, float> |
| Public Types inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, int, float > | |
| using | MR |
| using | TensorInputType |
| using | TensorOutputType |
| Public Types inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| using | DataTypeTraits |
Public Member Functions | |
| CpuCrossEntropyOp (const CrossEntropyConfig &config) | |
| Constructs a new CPU Cross Entropy operation with the default device context. | |
| void | backward (const Tensor< int, MR > &input, const Tensor< float, MR > &output, const Tensor< float, MR > &output_gradient, const std::vector< std::shared_ptr< ITensor > > &parameters, std::vector< std::shared_ptr< Tensor< float, MR > > > &parameter_gradients, Tensor< int, MR > &input_gradient, const std::vector< std::shared_ptr< Tensor< float, MR > > > &output_state) const |
| Performs the backward pass of the cross entropy operation. | |
| void | backward_impl (float *dlogits, const float *dlosses, const float *probs, const Tensor< int, CpuMemoryResource > &targets, int B, int T, int V, int Vp) const |
| Helper method for the backward pass implementation. | |
| void | forward (const Tensor< int, MR > &input, const std::vector< std::shared_ptr< ITensor > > &parameters, Tensor< float, MR > &output, std::vector< std::shared_ptr< Tensor< float, MR > > > &output_state) const override |
| Performs the forward pass of the cross entropy operation. | |
| std::string | getName () const override |
| Gets the name of this operation. | |
| Public Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, int, float > | |
| virtual | ~UnaryOperation ()=default |
| virtual void | backward (const ITensor &input, const ITensor &output_grad, ITensor &input_grad) const=0 |
| Backward pass: compute gradient wrt input given output gradient. | |
| virtual void | forward (const ITensor &input, ITensor &output) const=0 |
| Forward pass: compute output = f(input). | |
| Public Member Functions inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| virtual | ~Operation ()=default |
| virtual void | build (const BuildContext &build_context) |
| Prepare the operation for a concrete input shape. | |
| virtual void | clearGradients () noexcept |
| Clear any cached gradient pointers held by the operation. | |
| virtual TensorDataType | getDataType () const |
| Tensor data type for this operation. | |
| virtual DeviceType | getDeviceType () const |
| Device type for this operation. | |
| virtual OperationType | getOperationType () const=0 |
| Operation type identifier. | |
| virtual std::size_t | getStateMemorySize () const |
| Returns the number of bytes of state memory allocated by this operation. | |
| virtual bool | isBuilt () const |
| Whether build() completed successfully for a concrete input shape. | |
| virtual bool | isEvalMode () const |
| Query whether the operation is configured for evaluation (non-training) mode. | |
| virtual void | setGradients (ITensor *weight_grad, ITensor *bias_grad) |
| Bind module-owned gradient tensors to the operation. | |
| virtual void | setParameters (ITensor *weight, ITensor *bias) |
| Bind module-owned parameter tensors to the operation. | |
| virtual void | setTrainingMode (TrainingMode training_mode) |
| Configure operation training-mode behavior. | |
Private Attributes | |
| CrossEntropyConfig | config_ |
| Configuration for the CrossEntropy operation. | |
Additional Inherited Members | |
| Static Public Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| static constexpr TensorDataType | data_type |
| static constexpr DeviceType | device_type |
| Static Protected Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, int, float > | |
| static const TensorInputType & | asInputTensor (const ITensor &t) |
| static TensorOutputType & | asOutputTensor (ITensor &t) |
| Protected Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TPrecision > | |
| bool | is_built_ |
| TrainingMode | training_mode_ |
CPU implementation of the cross entropy loss operation for neural networks.
This class provides a CPU-based implementation of the cross entropy loss function, which is commonly used in classification tasks. It computes the negative log likelihood of the correct class given the predicted probabilities.
| TInput | The data type of the input tensor elements (typically int for class indices). |
| TDataType | The data type used for computation and output (typically float). |
| using Mila::Dnn::Compute::CpuCrossEntropyOp::MR = typename CpuDevice::MR |
| using Mila::Dnn::Compute::CpuCrossEntropyOp::OperationBase = UnaryOperation<DeviceType::Cpu, int, float> |
|
inline |
Constructs a new CPU Cross Entropy operation with the default device context.
Initializes the operation with a CPU device context.
|
inline |
Performs the backward pass of the cross entropy operation.
Computes gradients with respect to inputs and probabilities.
| input | Input tensor from the forward pass (target indices). |
| output | Output tensor from the forward pass (loss values). |
| output_gradient | Gradient of the loss with respect to the output. |
| parameters | Parameters tensor from forward pass (probabilities). |
| parameter_gradients | Gradients for parameters (probabilities). |
| input_gradient | Gradient of the loss with respect to the input (unused for integer targets). |
| attributes | Additional attributes for the operation. |
| output_state | Cache tensors from forward pass. |

|
inline |
Helper method for the backward pass implementation.
Computes gradients for the combined softmax and cross entropy operation.
| dlogits | Gradient buffer for logits/probabilities. |
| dlosses | Gradient buffer from output loss. |
| probs | Original probability values. |
| targets | Target class indices. |
| B | Batch size. |
| T | Sequence length. |
| V | Vocabulary size (without padding). |
| Vp | Padded vocabulary size. |


|
inline override |
Performs the forward pass of the cross entropy operation.
Computes the negative log likelihood of the correct class for each sample.
| input | Input tensor containing target class indices of shape [B, T]. |
| parameters | Parameters tensor containing probabilities of shape [B, T, V]. |
| attributes | Additional attributes for the operation. |
| output | Output tensor to store the cross entropy losses of shape [B, T]. |
| output_state | Cache for storing intermediate results (used in backward pass). |
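
As a rough illustration of the forward computation described above, the following sketch computes the per-position loss `-log(p[target])` from a flat probability buffer. It is a sketch under stated assumptions, not the library's code: the function name `cross_entropy_forward_sketch` is hypothetical, plain `std::vector` buffers stand in for `Tensor`, and the row stride `Vp` (which may equal `V` when there is no padding) is an assumption borrowed from the `backward_impl` parameters.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not part of the Mila API.
// probs:   B*T rows of Vp floats (assumed layout), one row per position
// targets: B*T class indices, each expected to be a valid class index
// returns: B*T losses, one per (b, t) position
std::vector<float> cross_entropy_forward_sketch(
    const std::vector<float>& probs,
    const std::vector<int>& targets,
    int B, int T, int Vp)
{
    std::vector<float> losses(static_cast<std::size_t>(B) * T);
    for (int b = 0; b < B; ++b) {
        for (int t = 0; t < T; ++t) {
            const std::size_t row = static_cast<std::size_t>(b) * T + t;
            const float* probs_bt = probs.data() + row * Vp;
            // Negative log likelihood of the correct class.
            losses[row] = -std::log(probs_bt[targets[row]]);
        }
    }
    return losses;
}
```

For a position whose target class has probability 0.5, the resulting loss is `-log(0.5)`, roughly 0.693. The sketch does no bounds checking on the target indices and does not guard against zero probabilities.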

|
inline override virtual |
Gets the name of this operation.
Implements Mila::Dnn::Compute::Operation< TDeviceType, TPrecision >.
|
private |
Configuration for the CrossEntropy operation.