Mila: Deep Neural Network Library
Implementation of the Gaussian Error Linear Unit (GELU) activation function.
#include <memory>
#include <vector>
#include <string>
#include <iostream>
#include <sstream>
#include <type_traits>

import Serialization.ModelArchive;
import Compute.CudaMemoryResource;
import Compute.CpuMemoryResource;
import Dnn.Modules.Gelu:Config;
import Dnn.Module;
import Compute.ComputeDevice;
import Compute.OperationAttributes;
import Compute.DeviceType;
import Compute.Precision;
import Compute.DeviceContext;
import Compute.OperationRegistry;
import Dnn.TensorTraits;
import Compute.UnaryOperation;
import Dnn.Tensor;
import Compute.OperationBase;
import Compute.MemoryResource;

Classes | |
| class | Mila::Dnn::Gelu< TDeviceType, TDataType > |
| Gaussian Error Linear Unit (GELU) activation function module. | |
Namespaces | |
| namespace | Mila |
| namespace | Mila::Dnn |
Typedefs | |
| template<typename TDataType = float> | |
| using | Mila::Dnn::CpuGelu = Gelu< DeviceType::Cpu, TDataType > |
| Type alias for CPU-specific GELU module. | |
| template<typename TDataType = float> | |
| using | Mila::Dnn::CudaGelu = Gelu< DeviceType::Cuda, TDataType > |
| Type alias for CUDA-specific GELU module (see the device-alias sketch below). |
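Both CpuGelu and CudaGelu follow a common device-tag pattern: a single class template parameterized on a device enum, with per-device aliases that fix that parameter. The standalone sketch below illustrates the pattern only; the names and the empty class body are placeholders, not Mila's actual API.

```cpp
// Standalone illustration of the device-alias pattern behind CpuGelu /
// CudaGelu. Placeholder names; this is not Mila's implementation.
enum class DeviceType { Cpu, Cuda };

template<DeviceType TDeviceType, typename TDataType = float>
class Gelu {
    // A real module would dispatch device-specific kernels on TDeviceType.
};

// Per-device aliases fix the device parameter, mirroring the typedefs above.
template<typename TDataType = float>
using CpuGelu = Gelu<DeviceType::Cpu, TDataType>;

template<typename TDataType = float>
using CudaGelu = Gelu<DeviceType::Cuda, TDataType>;

int main() {
    CpuGelu<float>  cpuGelu{};   // same type as Gelu<DeviceType::Cpu,  float>
    CudaGelu<float> cudaGelu{};  // same type as Gelu<DeviceType::Cuda, float>
    (void)cpuGelu;
    (void)cudaGelu;
    return 0;
}
```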
Implementation of the Gaussian Error Linear Unit (GELU) activation function.
This module implements the GELU activation function as described in: "Gaussian Error Linear Units (GELUs)" by Hendrycks and Gimpel (2016). https://arxiv.org/abs/1606.08415
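For reference, the activation defined in that paper scales the input by the standard normal CDF and is commonly computed with a tanh approximation:

\[
\operatorname{GELU}(x) = x\,\Phi(x) = \frac{x}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right]
\approx \frac{x}{2}\left[1 + \tanh\!\left(\sqrt{\tfrac{2}{\pi}}\,\bigl(x + 0.044715\,x^{3}\bigr)\right)\right]
\]

where \(\Phi\) is the cumulative distribution function of the standard normal distribution.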
GELU activation has become a standard component in transformer architectures like BERT, GPT, and their derivatives, often replacing traditional ReLU activations.
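A standalone scalar reference of the tanh approximation is sketched below. It is an illustrative CPU snippet for checking values, not the Mila kernel or its operation interface.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Reference implementation of the tanh approximation of GELU
// (Hendrycks & Gimpel, 2016). Illustrative only; not Mila's kernel.
float gelu(float x) {
    constexpr float k = 0.7978845608f;  // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(k * (x + 0.044715f * x * x * x)));
}

int main() {
    const std::vector<float> inputs{ -2.0f, -1.0f, 0.0f, 1.0f, 2.0f };
    for (float x : inputs) {
        std::printf("gelu(%+.1f) = %+.6f\n", x, gelu(x));
    }
    return 0;
}
```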