Configuration class for GPT transformer blocks.

Inherits Mila::Dnn::ComponentConfig.

Public Member Functions
	GptBlockConfig (dim_t model_dim, dim_t num_heads)
	Construct a GPT block configuration.
void	fromMetadata (const SerializationMetadata &meta)
	Populate configuration from provided metadata.
ActivationType	getActivationType () const noexcept
dim_t	getEffectiveHiddenDimension () const noexcept
	Get effective hidden dimension with default fallback.
dim_t	getHiddenSize () const noexcept
dim_t	getModelDim () const noexcept
dim_t	getNumHeads () const noexcept
float	getResidualScale () const noexcept
SerializationMetadata	toMetadata () const
	Convert configuration into a SerializationMetadata object.
std::string	toString () const override
	Produce a short, human-readable summary of the configuration.
bool	useBias () const noexcept
void	validate () const override
	Validate configuration parameters.
template<typename Self>
decltype(auto)	withActivation (this Self &&self, ActivationType activation_type)
template<typename Self>
decltype(auto)	withBias (this Self &&self, bool use_bias)
template<typename Self>
decltype(auto)	withHiddenSize (this Self &&self, dim_t hidden_dim)
template<typename Self>
decltype(auto)	withMaxSequenceLength (this Self &&self, dim_t max_seq_len)
	Set maximum sequence length for block-level positional handling.
template<typename Self>
decltype(auto)	withResidualScale (this Self &&self, float scale)
Public Member Functions inherited from Mila::Dnn::ComponentConfig
virtual	~ComponentConfig ()=default
	Virtual destructor for polymorphic base.

Private Attributes
ActivationType	activation_type_ = ActivationType::Gelu
dim_t	hidden_dim_ = 0
dim_t	max_seq_len_ = 2048
dim_t	model_dim_
dim_t	num_heads_
float	residual_scale_ = 1.0f
bool	use_bias_ = false

Detailed Description

Configuration class for GPT transformer blocks.

Holds the model dimension, attention head count, MLP/attention options, and basic activation/residual settings.

Constructor & Destructor Documentation

◆ GptBlockConfig()

Mila::Dnn::GptBlockConfig::GptBlockConfig	(	dim_t	model_dim,
		dim_t	num_heads )

inline

Construct a GPT block configuration.

Parameters

model_dim	Model dimension. Must be > 0.
num_heads	Number of query attention heads. Must be > 0 and must divide model_dim evenly.
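The two constraints above imply a per-head width of model_dim / num_heads. The following is a minimal sketch of that arithmetic, not the class's actual code: `headDimension` is a hypothetical helper, and `dim_t` is assumed here to be a signed 64-bit integer because its real definition is not shown on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Stand-in for Mila's dim_t (assumption; the real alias is not shown here).
using dim_t = std::int64_t;

// Hypothetical helper: returns the per-head width implied by the two
// constructor arguments and rejects combinations the documentation forbids.
inline dim_t headDimension(dim_t model_dim, dim_t num_heads)
{
    if (model_dim <= 0)
        throw std::invalid_argument("model_dim must be > 0");
    if (num_heads <= 0)
        throw std::invalid_argument("num_heads must be > 0");
    if (model_dim % num_heads != 0)
        throw std::invalid_argument("num_heads must divide model_dim evenly");
    return model_dim / num_heads;
}

// Usage: a 768-wide model split across 12 heads gives 64 channels per head.
inline void headDimensionExample()
{
    assert(headDimension(768, 12) == 64);
}
```

Whether the real constructor checks these constraints eagerly or defers them to validate() is not stated on this page.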

Member Function Documentation

◆ fromMetadata()

void Mila::Dnn::GptBlockConfig::fromMetadata ( const SerializationMetadata & meta )

inline, virtual

Populate configuration from provided metadata.

Implementations should read available keys and leave missing keys at their current/default values to preserve forward/backward compatibility.

Parameters

meta	Metadata to read configuration values from.

Implements Mila::Dnn::ComponentConfig.
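The "leave missing keys alone" rule can be sketched as follows. This is an illustration under stated assumptions rather than Mila's implementation: `DemoMetadata` is a hypothetical stand-in for SerializationMetadata (whose interface is not shown on this page), and the key names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for SerializationMetadata: a flat key/value map.
using DemoMetadata = std::unordered_map<std::string, std::int64_t>;

// Hypothetical subset of the block settings, with the documented defaults.
struct DemoBlockSettings
{
    std::int64_t hidden_dim = 0;
    std::int64_t max_seq_len = 2048;
    bool use_bias = false;

    // Overwrites only the fields whose keys are present in meta;
    // every other field keeps its current value.
    void fromMetadata(const DemoMetadata& meta)
    {
        if (auto it = meta.find("hidden_dim"); it != meta.end())
            hidden_dim = it->second;
        if (auto it = meta.find("max_seq_len"); it != meta.end())
            max_seq_len = it->second;
        if (auto it = meta.find("use_bias"); it != meta.end())
            use_bias = (it->second != 0);
    }
};

// Usage: metadata that carries only one key changes only that field.
inline void fromMetadataExample()
{
    DemoMetadata meta;
    meta["hidden_dim"] = 3072;

    DemoBlockSettings settings;
    settings.fromMetadata(meta);
    assert(settings.hidden_dim == 3072);
    assert(settings.max_seq_len == 2048); // untouched default
}
```

Reading keys this way lets metadata written by an older version, which lacks newer keys, still load cleanly.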

◆ getActivationType()

ActivationType Mila::Dnn::GptBlockConfig::getActivationType ( ) const

inline, noexcept

◆ getEffectiveHiddenDimension()

dim_t Mila::Dnn::GptBlockConfig::getEffectiveHiddenDimension ( ) const

inline, noexcept

Get effective hidden dimension with default fallback.

Returns: The configured hidden dimension, or 4 × model_dim if none has been set.
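A minimal sketch of the fallback rule, written as a free function for illustration. It assumes that a value of 0 means "not set", which matches the default of hidden_dim_ listed under Private Attributes but is not confirmed by this page; `dim_t` is again assumed to be a signed 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for Mila's dim_t (assumption; the real alias is not shown here).
using dim_t = std::int64_t;

// Hypothetical free-function version of the fallback rule.
// Assumes 0 is the "not set" sentinel, matching hidden_dim_'s default.
constexpr dim_t effectiveHiddenDimension(dim_t hidden_dim, dim_t model_dim) noexcept
{
    return hidden_dim != 0 ? hidden_dim : 4 * model_dim;
}

// Usage: an unset hidden size falls back to 4 x 768 = 3072,
// while an explicit hidden size is returned unchanged.
inline void effectiveHiddenDimensionExample()
{
    assert(effectiveHiddenDimension(0, 768) == 3072);
    assert(effectiveHiddenDimension(2048, 768) == 2048);
}
```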

◆ getHiddenSize()

dim_t Mila::Dnn::GptBlockConfig::getHiddenSize ( ) const

inline, noexcept

◆ getModelDim()

dim_t Mila::Dnn::GptBlockConfig::getModelDim ( ) const

inline, noexcept

◆ getNumHeads()

dim_t Mila::Dnn::GptBlockConfig::getNumHeads ( ) const

inline, noexcept

◆ getResidualScale()

float Mila::Dnn::GptBlockConfig::getResidualScale ( ) const

inline, noexcept

◆ toMetadata()

SerializationMetadata Mila::Dnn::GptBlockConfig::toMetadata ( ) const

inline, virtual

Convert configuration into a SerializationMetadata object.

Implementations should include any fields required to fully reconstruct the configuration via fromMetadata.

Returns: Metadata representation of the configuration.

Implements Mila::Dnn::ComponentConfig.

◆ toString()

std::string Mila::Dnn::GptBlockConfig::toString ( ) const

inline, override, virtual

Produce a short, human-readable summary of the configuration.

Implementations should return a compact, single-line description suitable for logging and debugging.

Returns: Human-readable summary of the configuration.

Implements Mila::Dnn::ComponentConfig.

◆ useBias()

bool Mila::Dnn::GptBlockConfig::useBias ( ) const

inline, noexcept

◆ validate()

void Mila::Dnn::GptBlockConfig::validate ( ) const

inline, override, virtual

Validate configuration parameters.

Call this before building a component to ensure the configuration represents a valid, constructible component. Implementations must throw std::invalid_argument (or a derived exception) when validation fails.

Exceptions

std::invalid_argument If the configuration is invalid.

Implements Mila::Dnn::ComponentConfig.
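One way a caller might consume this contract is to run validate() up front and turn a failure into an error message. The sketch below assumes only what is documented above (a const validate() that throws std::invalid_argument); `DemoConfig` and `validationError` are hypothetical names, and the checks inside `DemoConfig::validate` mirror the constructor constraints rather than the real implementation.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a configuration type exposing validate().
struct DemoConfig
{
    long model_dim = 0;
    long num_heads = 0;

    // Throws std::invalid_argument on failure, as the contract requires.
    void validate() const
    {
        if (model_dim <= 0 || num_heads <= 0)
            throw std::invalid_argument("model_dim and num_heads must be > 0");
        if (model_dim % num_heads != 0)
            throw std::invalid_argument("num_heads must divide model_dim evenly");
    }
};

// Runs validate() and converts a failure into an optional error message.
// Catching by const reference also catches exceptions derived from
// std::invalid_argument.
template <typename Config>
std::optional<std::string> validationError(const Config& config)
{
    try {
        config.validate();
        return std::nullopt;
    } catch (const std::invalid_argument& e) {
        return std::string(e.what());
    }
}

// Usage: 12 heads divide 768 evenly, 10 heads do not.
inline void validationExample()
{
    assert(!validationError(DemoConfig{768, 12}).has_value());
    assert(validationError(DemoConfig{768, 10}).has_value());
}
```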

◆ withActivation()

template<typename Self>

decltype(auto) Mila::Dnn::GptBlockConfig::withActivation	(	this Self &&	self,
		ActivationType	activation_type )

inline

◆ withBias()

template<typename Self>

decltype(auto) Mila::Dnn::GptBlockConfig::withBias	(	this Self &&	self,
		bool	use_bias )

inline

◆ withHiddenSize()

template<typename Self>

decltype(auto) Mila::Dnn::GptBlockConfig::withHiddenSize	(	this Self &&	self,
		dim_t	hidden_dim )

inline

◆ withMaxSequenceLength()

template<typename Self>

decltype(auto) Mila::Dnn::GptBlockConfig::withMaxSequenceLength	(	this Self &&	self,
		dim_t	max_seq_len )

inline

Set maximum sequence length for block-level positional handling.

Parameters

max_seq_len Maximum sequence length (must be > 0).

◆ withResidualScale()

template<typename Self>

decltype(auto) Mila::Dnn::GptBlockConfig::withResidualScale	(	this Self &&	self,
		float	scale )

inline

Member Data Documentation

◆ activation_type_

ActivationType Mila::Dnn::GptBlockConfig::activation_type_ = ActivationType::Gelu

private

◆ hidden_dim_

dim_t Mila::Dnn::GptBlockConfig::hidden_dim_ = 0

private

◆ max_seq_len_

dim_t Mila::Dnn::GptBlockConfig::max_seq_len_ = 2048

private

◆ model_dim_

dim_t Mila::Dnn::GptBlockConfig::model_dim_

private

◆ num_heads_

dim_t Mila::Dnn::GptBlockConfig::num_heads_

private

◆ residual_scale_

float Mila::Dnn::GptBlockConfig::residual_scale_ = 1.0f

private

◆ use_bias_

bool Mila::Dnn::GptBlockConfig::use_bias_ = false

private

The documentation for this class was generated from the following file:

/__w/Mila/Mila/Mila/Src/Dnn/Components/Transformers/Gpt/GptBlock.Config.ixx
