Mila 0.13.48
Deep Neural Network Library
CudaSoftmaxCrossEntropyOp.ixx File Reference

Fused CUDA implementation of Softmax + CrossEntropy loss operation.

#include <cuda_runtime.h>
#include <cuda_fp16.h>
#include <vector>
#include <memory>
#include <string>
#include <stdexcept>
#include <cstdint>
#include <type_traits>
#include "Kernels/SoftmaxCrossEntropy.cuh"
import Compute.CudaDeviceMemoryResource;
import Compute.MemoryResource;
import Dnn.Components.CrossEntropyConfig;
import Dnn.Component;
import Dnn.ComponentConfig;
import Dnn.Tensor;
import Dnn.TensorDataType;
import Compute.BinaryOperation;
import Compute.CudaTensorDataType;
import Dnn.TensorDataTypeTraits;
import Dnn.ITensor;
import Compute.OperationBase;
import Dnn.TensorTypes;
import Compute.OperationRegistry;
import Compute.DeviceType;
import Compute.ExecutionContext;
import Compute.OperationType;

Classes

struct  Mila::Dnn::Compute::Cuda::SoftmaxCrossEntropy::Detail::cuda_softmax_crossentropy_impl< float >
struct  Mila::Dnn::Compute::Cuda::SoftmaxCrossEntropy::Detail::cuda_softmax_crossentropy_impl< half >
class  Mila::Dnn::Compute::Cuda::SoftmaxCrossEntropy::CudaSoftmaxCrossEntropyOp< TPrecision, TLogits, TTargets >
 Fused CUDA implementation of Softmax + CrossEntropy using abstract TensorDataType API.
class  Mila::Dnn::Compute::Cuda::SoftmaxCrossEntropy::CudaSoftmaxCrossEntropyOpRegistrar
 Registrar for fused Softmax+CrossEntropy CUDA operation.

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn
namespace  Mila::Dnn::Compute
namespace  Mila::Dnn::Compute::Cuda
namespace  Mila::Dnn::Compute::Cuda::SoftmaxCrossEntropy
namespace  Mila::Dnn::Compute::Cuda::SoftmaxCrossEntropy::Detail
 Namespace for CUDA fused softmax cross entropy implementation details.

Detailed Description

Fused CUDA implementation of Softmax + CrossEntropy loss operation.

Combines softmax and cross-entropy into a single numerically stable operation that follows the BinaryOperation interface built on ExecutionContext and TensorDataType.

Key advantages over separate Softmax + CrossEntropy:

  • Numerical stability: Uses log-sum-exp trick throughout
  • Performance: Single GPU kernel pass, no materialized probability distribution
  • Simplified gradient: dL/dlogits = softmax(logits) - one_hot(targets)
  • Memory efficiency: No intermediate probability tensor exposed in API
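
The points above can be illustrated with a minimal CPU reference sketch for a single sample. This is not Mila's CUDA kernel: the function names softmax_cross_entropy_loss and softmax_cross_entropy_grad are hypothetical, and the code assumes float logits with one integer class index as the target. It computes the loss as logsumexp(logits) - logits[target] without storing a probability vector, and the gradient as softmax(logits) - one_hot(target).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for one sample; names are illustrative only.
// Loss = logsumexp(logits) - logits[target], evaluated with the max-shift
// (log-sum-exp) trick so that exp() never overflows.
float softmax_cross_entropy_loss(const std::vector<float>& logits, std::size_t target) {
    assert(!logits.empty() && target < logits.size());
    const float max_logit = *std::max_element(logits.begin(), logits.end());

    float sum_exp = 0.0f;
    for (float z : logits) {
        sum_exp += std::exp(z - max_logit);  // every exponent is <= 0
    }
    const float log_sum_exp = max_logit + std::log(sum_exp);

    // No probability vector is stored; only two scalars are needed.
    return log_sum_exp - logits[target];
}

// Gradient of the per-sample loss with respect to the logits:
// dL/dlogits = softmax(logits) - one_hot(target).
std::vector<float> softmax_cross_entropy_grad(const std::vector<float>& logits, std::size_t target) {
    assert(!logits.empty() && target < logits.size());
    const float max_logit = *std::max_element(logits.begin(), logits.end());

    float sum_exp = 0.0f;
    for (float z : logits) {
        sum_exp += std::exp(z - max_logit);
    }

    std::vector<float> grad(logits.size());
    for (std::size_t i = 0; i < logits.size(); ++i) {
        grad[i] = std::exp(logits[i] - max_logit) / sum_exp;  // softmax term
    }
    grad[target] -= 1.0f;  // subtract the one-hot target
    return grad;
}
```

Because the softmax terms sum to 1 and the one-hot vector sums to 1, the gradient entries sum to approximately zero, which is a convenient sanity check when comparing a GPU kernel against a reference like this one. A batched version would typically also divide by the batch size when the loss is mean-reduced; that scaling is omitted here.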