Mila 0.13.48
Deep Neural Network Library
CPU implementation of the GELU activation operation using abstract TensorDataType.


Public Types | |
| using | CpuExecutionContext = ExecutionContext<DeviceType::Cpu> |
| using | MR = CpuMemoryResource |
| using | TensorType = Tensor<TensorDataType::FP32, MR> |
| using | UnaryOperationBase = UnaryOperation<DeviceType::Cpu, TensorDataType::FP32> |
| Public Types inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 > | |
| using | MR |
| using | TensorInputType |
| using | TensorOutputType |
| Public Types inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| using | DataTypeTraits |
Public Member Functions | |
| CpuGeluOp (IExecutionContext *context, const GeluConfig &config) | |
| Constructs a new CpuGeluOp with a specific execution context. | |
| void | backward (const ITensor &input, const ITensor &output_grad, ITensor &input_grad) const override |
| Performs the backward pass of the GELU activation function. | |
| void | build (const BuildContext &config) override |
| Prepare the operation for a concrete input shape. | |
| void | forward (const ITensor &input, ITensor &output) const override |
| Performs the forward pass of the GELU activation function. | |
| std::string | getName () const override |
| Gets the name of this operation. | |
| OperationType | getOperationType () const override |
| Operation type identifier. | |
| Public Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 > | |
| virtual | ~UnaryOperation ()=default |
| Public Member Functions inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| virtual | ~Operation ()=default |
| virtual void | clearGradients () noexcept |
| Clear any cached gradient pointers held by the operation. | |
| virtual TensorDataType | getDataType () const |
| Tensor data type for this operation. | |
| virtual DeviceType | getDeviceType () const |
| Device type for this operation. | |
| virtual std::size_t | getStateMemorySize () const |
| Returns the number of bytes of state memory allocated by this operation. | |
| virtual bool | isBuilt () const |
| Whether build() completed successfully for a concrete input shape. | |
| virtual bool | isEvalMode () const |
| Query whether operation is configured for training. | |
| virtual void | setGradients (ITensor *weight_grad, ITensor *bias_grad) |
| Bind module-owned gradient tensors to the operation. | |
| virtual void | setParameters (ITensor *weight, ITensor *bias) |
| Bind module-owned parameter tensors to the operation. | |
| virtual void | setTrainingMode (TrainingMode training_mode) |
| Configure operation training-mode behavior. | |
Private Attributes | |
| GeluConfig | config_ |
| Configuration for the GELU operation (approximation method, etc.). | |
| IExecutionContext * | context_ { nullptr } |
Additional Inherited Members | |
| Static Public Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| static constexpr TensorDataType | data_type |
| static constexpr DeviceType | device_type |
| Static Protected Member Functions inherited from Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 > | |
| static const TensorInputType & | asInputTensor (const ITensor &t) |
| static TensorOutputType & | asOutputTensor (ITensor &t) |
| Protected Attributes inherited from Mila::Dnn::Compute::Operation< TDeviceType, TInput > | |
| bool | is_built_ |
| TrainingMode | training_mode_ |
CPU implementation of GELU activation operation using abstract TensorDataType.
Implements the Gaussian Error Linear Unit (GELU) activation function for CPU devices. Supports multiple approximation methods, selected via GeluConfig.
Template Parameters
| TPrecision | Abstract compute precision (TensorDataType enum) |
| using Mila::Dnn::Compute::CpuGeluOp::UnaryOperationBase = UnaryOperation<DeviceType::Cpu, TensorDataType::FP32> |
| Mila::Dnn::Compute::CpuGeluOp::CpuGeluOp (IExecutionContext *context, const GeluConfig &config) | inline |
Constructs a new CpuGeluOp with a specific execution context.
Parameters
| context | The execution context to use for this operation. |
| config | Configuration for the GELU operation. |
Exceptions
| std::runtime_error | If the context is not for a CPU device. |
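| void Mila::Dnn::Compute::CpuGeluOp::backward (const ITensor &input, const ITensor &output_grad, ITensor &input_grad) const | inline override virtual |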
Performs the backward pass of the GELU activation function.
Computes the gradient of the GELU function with respect to its input.
Parameters
| input | Original input tensor from the forward pass (may be scalar). |
| output_grad | Gradient from the next layer (dL/doutput, may be scalar). |
| input_grad | Gradient to propagate to the previous layer (dL/dinput, may be scalar). |
| parameters | Parameter tensors (not used in GELU; not present in the current signature). |
| parameter_grads | Parameter gradients (not used in GELU; not present in the current signature). |
| output_state | Cached tensors from the forward pass (not used currently; not present in the current signature). |
Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >.
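The gradient described above can be illustrated for the tanh approximation. The following is a minimal sketch, not Mila's actual implementation: the names `gelu_tanh_grad` and `gelu_backward` are hypothetical, plain `std::vector<float>` buffers stand in for `ITensor`, and the sketch assumes `input_grad` is overwritten rather than accumulated. With u = sqrt(2/pi) * (x + 0.044715 * x^3), the chain rule gives dGELU/dx = 0.5 * (1 + tanh(u)) + 0.5 * x * (1 - tanh(u)^2) * sqrt(2/pi) * (1 + 3 * 0.044715 * x^2).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Mila): derivative of the tanh-approximated
// GELU with respect to its input x.
inline float gelu_tanh_grad(float x) {
    constexpr float kSqrt2OverPi = 0.7978845608f;  // sqrt(2 / pi)
    constexpr float kCubicCoeff = 0.044715f;
    const float u = kSqrt2OverPi * (x + kCubicCoeff * x * x * x);
    const float t = std::tanh(u);
    // du/dx = sqrt(2/pi) * (1 + 3 * 0.044715 * x^2)
    const float du_dx = kSqrt2OverPi * (1.0f + 3.0f * kCubicCoeff * x * x);
    // Product rule on 0.5 * x * (1 + tanh(u)); d tanh(u)/du = 1 - tanh(u)^2.
    return 0.5f * (1.0f + t) + 0.5f * x * (1.0f - t * t) * du_dx;
}

// Hypothetical element-wise backward pass over flat FP32 buffers.
// Assumption: input_grad is overwritten, not accumulated.
inline void gelu_backward(const std::vector<float>& input,
                          const std::vector<float>& output_grad,
                          std::vector<float>& input_grad) {
    assert(input.size() == output_grad.size());
    input_grad.resize(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        // Chain rule: dL/dinput = dL/doutput * dGELU/dinput.
        input_grad[i] = output_grad[i] * gelu_tanh_grad(input[i]);
    }
}
```

Only the original input and the incoming gradient are needed, so nothing has to be cached from the forward pass in this sketch.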

| void Mila::Dnn::Compute::CpuGeluOp::build (const BuildContext &config) | inline override virtual |
Prepare the operation for a concrete input shape.
Default implementation is a no-op. Operations requiring shape-dependent setup should override this method.
Reimplemented from Mila::Dnn::Compute::Operation< TDeviceType, TInput >.
| void Mila::Dnn::Compute::CpuGeluOp::forward (const ITensor &input, ITensor &output) const | inline override virtual |
Performs the forward pass of the GELU activation function.
Implements the Gaussian Error Linear Unit (GELU) activation function using the tanh approximation:
GELU(x) = 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
Tensor shape handling:
Parameters
| input | The input tensor (may be scalar, rank 0). |
| output | The output tensor (resized to match the input shape). |
| parameters | Parameter tensors (not used in GELU; not present in the current signature). |
| output_state | Cache for intermediate results (not used in the current implementation; not present in the current signature). |
Implements Mila::Dnn::Compute::UnaryOperation< DeviceType::Cpu, TensorDataType::FP32 >.
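To make the tanh approximation concrete, here is a minimal sketch of an element-wise forward pass. It is not taken from Mila's source: `gelu_tanh` and `gelu_forward` are hypothetical names, and raw FP32 buffers stand in for the `ITensor` arguments.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not part of Mila): scalar GELU, tanh approximation.
// GELU(x) = 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
inline float gelu_tanh(float x) {
    constexpr float kSqrt2OverPi = 0.7978845608f;  // sqrt(2 / pi)
    constexpr float kCubicCoeff = 0.044715f;
    const float inner = kSqrt2OverPi * (x + kCubicCoeff * x * x * x);
    return 0.5f * x * (1.0f + std::tanh(inner));
}

// Hypothetical element-wise forward pass over `count` contiguous FP32 values.
// A rank-0 (scalar) tensor corresponds to count == 1 in this sketch.
inline void gelu_forward(const float* input, float* output, std::size_t count) {
    assert(count == 0 || (input != nullptr && output != nullptr));
    for (std::size_t i = 0; i < count; ++i) {
        output[i] = gelu_tanh(input[i]);
    }
}
```

Because each output element depends only on the matching input element, the loop can run in place (`input == output`) and is straightforward to vectorize or parallelize.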

| std::string Mila::Dnn::Compute::CpuGeluOp::getName () const | inline override virtual |
Gets the name of this operation.
Implements Mila::Dnn::Compute::Operation< TDeviceType, TInput >.
| OperationType Mila::Dnn::Compute::CpuGeluOp::getOperationType () const | inline override virtual |
Operation type identifier.
Implements Mila::Dnn::Compute::Operation< TDeviceType, TInput >.
| GeluConfig Mila::Dnn::Compute::CpuGeluOp::config_ | private |
Configuration for the GELU operation (approximation method, etc.).
| IExecutionContext * Mila::Dnn::Compute::CpuGeluOp::context_ { nullptr } | private |