Mila
Deep Neural Network Library
Gelu.ixx File Reference

Implementation of the Gaussian Error Linear Unit (GELU) activation function.

#include <memory>
#include <vector>
#include <string>
#include <iostream>
#include <sstream>
#include <type_traits>
import Serialization.ModelArchive;
import Compute.CudaMemoryResource;
import Compute.CpuMemoryResource;
import Dnn.Modules.Gelu:Config;
import Dnn.Module;
import Compute.ComputeDevice;
import Compute.OperationAttributes;
import Compute.DeviceType;
import Compute.Precision;
import Compute.DeviceContext;
import Compute.OperationRegistry;
import Dnn.TensorTraits;
import Compute.UnaryOperation;
import Dnn.Tensor;
import Compute.OperationBase;
import Compute.MemoryResource;

Classes

class  Mila::Dnn::Gelu< TDeviceType, TDataType >
 Gaussian Error Linear Unit (GELU) activation function module.
 

Namespaces

namespace  Mila
 
namespace  Mila::Dnn
 

Typedefs

template<typename TDataType = float>
using Mila::Dnn::CpuGelu = Gelu< DeviceType::Cpu, TDataType >
 Type alias for CPU-specific GELU module.
 
template<typename TDataType = float>
using Mila::Dnn::CudaGelu = Gelu< DeviceType::Cuda, TDataType >
 Type alias for CUDA-specific GELU module.
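The aliases simply pin the device parameter of Gelu. A compile-time sketch of what they expand to (the primary module name Dnn.Modules.Gelu and the namespace placement of DeviceType are inferred from the imports above and should be treated as assumptions, not the library's confirmed layout):

    #include <type_traits>

    import Dnn.Modules.Gelu;    // assumed primary module name (see the :Config partition above)
    import Compute.DeviceType;

    using namespace Mila::Dnn;
    using Compute::DeviceType;  // assumed namespace placement

    // Each alias is just Gelu with DeviceType fixed to Cpu or Cuda.
    static_assert( std::is_same_v<CpuGelu<float>,  Gelu<DeviceType::Cpu,  float>> );
    static_assert( std::is_same_v<CudaGelu<float>, Gelu<DeviceType::Cuda, float>> );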
 

Detailed Description

Implementation of the Gaussian Error Linear Unit (GELU) activation function.

This module implements the GELU activation function as described in "Gaussian Error Linear Units (GELUs)" by Hendrycks and Gimpel (2016), https://arxiv.org/abs/1606.08415.
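
For reference, the function computed is

    \mathrm{GELU}(x) = x \, \Phi(x) = \frac{x}{2}\left(1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right)
                     \approx \frac{x}{2}\left(1 + \tanh\!\left(\sqrt{2/\pi}\,\bigl(x + 0.044715\,x^{3}\bigr)\right)\right),

where \Phi is the standard normal CDF; the tanh form is the approximation given in the paper.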

GELU activation has become a standard component in transformer architectures like BERT, GPT, and their derivatives, often replacing traditional ReLU activations.
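
As a minimal, self-contained sketch of the computation (illustrative only; judging by the Compute.OperationRegistry and Compute.UnaryOperation imports, the Mila module itself appears to dispatch to registered device-specific operations rather than using this scalar form):

    #include <cmath>
    #include <cstdio>
    #include <numbers>

    // Scalar GELU, tanh approximation from Hendrycks & Gimpel (2016).
    float gelu_tanh( float x )
    {
        const float k = std::sqrt( 2.0f / std::numbers::pi_v<float> );
        return 0.5f * x * (1.0f + std::tanh( k * (x + 0.044715f * x * x * x) ));
    }

    // Exact form via the Gaussian CDF, for comparison.
    float gelu_exact( float x )
    {
        return 0.5f * x * (1.0f + std::erf( x / std::sqrt( 2.0f ) ));
    }

    int main()
    {
        for ( float x : { -2.0f, -1.0f, 0.0f, 1.0f, 2.0f } )
            std::printf( "gelu(%+.1f): tanh %.6f, exact %.6f\n",
                         x, gelu_tanh( x ), gelu_exact( x ) );
    }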