Device-templated fused SoftmaxCrossEntropy loss module.

#include <memory>
#include <vector>
#include <string>
#include <iostream>
#include <sstream>
#include <type_traits>
#include <cstdint>
#include <stdexcept>
import Serialization.Mode;
import Serialization.ModelArchive;
import Compute.DeviceTypeTraits;
import Compute.CpuMemoryResource;
import Compute.MemoryResource;
import Dnn.Components.CrossEntropyConfig;
import Dnn.Component;
import Dnn.Tensor;
import Dnn.ITensor;
import Dnn.TensorDataTypeTraits;
import Compute.ExecutionContext;
import Dnn.TensorTypes;
import Compute.OperationRegistry;
import Dnn.TensorDataType;
import Compute.DeviceId;
import Compute.DeviceType;
import Compute.Device;
import Compute.BinaryOperation;

Classes
class	Mila::Dnn::SoftmaxCrossEntropy< TDeviceType, TLogits, TTargets, TPrecision >
	Fused SoftmaxCrossEntropy loss module (device-templated).

Namespaces
namespace	Mila
	Mila main API namespace.
namespace	Mila::Dnn

Detailed Description

Device-templated fused SoftmaxCrossEntropy loss module.

Delegates compute to a BinaryOperation backend that implements the fused softmax + cross-entropy operation for numerical stability and performance.
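
To illustrate why fusing the two steps helps numerical stability, here is a minimal single-sample sketch. It is not the Mila backend: `fusedSoftmaxCrossEntropy` is a hypothetical free function, and the use of `std::vector<float>` logits with a `std::size_t` class index is an assumption made only for this example. The loss is computed directly as log-sum-exp minus the target logit, with the maximum logit subtracted first, so no softmax probability is ever materialized and then passed to a logarithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration only; not part of the Mila API.
// Returns -log(softmax(logits)[target]) for one sample, evaluated as
//   log(sum_j exp(z_j - m)) - (z_target - m),  where m = max_j z_j.
inline double fusedSoftmaxCrossEntropy(const std::vector<float>& logits,
                                       std::size_t target)
{
    assert(!logits.empty());
    if (target >= logits.size())
        throw std::out_of_range("target index out of range");

    // Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    const double maxLogit = *std::max_element(logits.begin(), logits.end());

    double sumExp = 0.0;
    for (float z : logits)
        sumExp += std::exp(static_cast<double>(z) - maxLogit);

    // The sum is at least 1 (the maximum contributes exp(0)), so log() is safe.
    return std::log(sumExp) - (static_cast<double>(logits[target]) - maxLogit);
}
```

With logits `{1000.0f, 0.0f}` and target `1`, a separate softmax followed by a logarithm would underflow the target probability to zero and yield an infinite loss, whereas this fused form returns a finite value.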

STATUS: Work in progress. The component shell and the CPU/CUDA operation stubs exist, but they are not yet wired into the build: CpuSoftmaxCrossEntropyOp and CudaSoftmaxCrossEntropyOp are excluded from CpuOperations.ixx / CudaOperations.ixx. Completion is targeted for Llama training support. Until then, the GPT reference implementation uses the host-based CpuCrossEntropyOp instead.
