Network-level configuration for LLaMA-style transformer networks.

Public Member Functions
	LlamaConfig (dim_t embedding_dim, dim_t num_layers)
	Construct a LLaMA network configuration.
void	fromMetadata (const SerializationMetadata &meta)
	Populate configuration from provided metadata.
dim_t	getHiddenDimension () const noexcept
dim_t	getMaxSequenceLength () const noexcept
dim_t	getModelDim () const noexcept
dim_t	getNumHeads () const noexcept
dim_t	getNumKVHeads () const noexcept
dim_t	getNumLayers () const noexcept
float	getRMSNormEpsilon () const noexcept
float	getRoPEScalingFactor () const noexcept
float	getRoPETheta () const noexcept
dim_t	getVocabSize () const noexcept
SerializationMetadata	toMetadata () const
	Convert configuration into a SerializationMetadata object.
std::string	toString () const override
	Produce a short, human-readable summary of the configuration.
bool	useBias () const noexcept
void	validate () const override
	Validate configuration parameters.
template<typename Self>
decltype(auto)	withBias (this Self &&self, bool use_bias)
template<typename Self>
decltype(auto)	withHiddenDimension (this Self &&self, dim_t hidden_dim)
template<typename Self>
decltype(auto)	withMaxSequenceLength (this Self &&self, dim_t max_seq_len)
	Sets the trained maximum sequence length for this model.
template<typename Self>
decltype(auto)	withNumHeads (this Self &&self, dim_t num_heads)
template<typename Self>
decltype(auto)	withNumKVHeads (this Self &&self, dim_t num_kv_heads)
template<typename Self>
decltype(auto)	withRoPEScalingFactor (this Self &&self, float scale_factor)
template<typename Self>
decltype(auto)	withRoPETheta (this Self &&self, float theta)
template<typename Self>
decltype(auto)	withVocabularyLength (this Self &&self, dim_t vocab_size)
Public Member Functions inherited from Mila::Dnn::ComponentConfig
virtual	~ComponentConfig ()=default
	Virtual destructor for polymorphic base.

Private Attributes
dim_t	embedding_dim_ = 4096
dim_t	hidden_dim_ = 14336
dim_t	max_seq_len_ = 8192
dim_t	num_heads_ = 32
dim_t	num_kv_heads_ = 8
dim_t	num_layers_ = 32
float	rms_norm_eps_ = 1e-5f
float	rope_scaling_factor_ = 1.0f
float	rope_theta_ = 500000.0f
bool	use_bias_ = false
dim_t	vocab_size_ = 128256
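
The defaults above imply a couple of derived quantities that are useful when reasoning about attention shapes. The sketch below works them out. The struct and function names are hypothetical, `std::int64_t` stands in for `dim_t`, and it assumes the per-head dimension is `embedding_dim / num_heads`, which the source does not state.

```cpp
#include <cstdint>

// Hypothetical stand-in holding three of the documented default values.
struct LlamaDefaults {
    std::int64_t embedding_dim = 4096;
    std::int64_t num_heads     = 32;
    std::int64_t num_kv_heads  = 8;
};

// Per-head dimension, assuming the heads split the embedding evenly.
constexpr std::int64_t headDim(const LlamaDefaults& d) {
    return d.embedding_dim / d.num_heads;
}

// Number of query heads that share one key/value head.
constexpr std::int64_t queriesPerKVHead(const LlamaDefaults& d) {
    return d.num_heads / d.num_kv_heads;
}
```

With the defaults, `headDim` gives 4096 / 32 = 128 and `queriesPerKVHead` gives 32 / 8 = 4.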

Detailed Description

Network-level configuration for LLaMA-style transformer networks.

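Exposes the settings needed at network scope: vocabulary size, number of layers, embedding dimension, hidden dimension, attention and key/value head counts, maximum sequence length, RoPE theta and scaling factor, RMSNorm epsilon, and whether bias terms are used.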
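
The `with*` setters are designed for chained, builder-style use. The following is a minimal sketch of that pattern, not the actual class: `MiniLlamaConfig` and `makeExampleConfig` are hypothetical, `dim_t` is assumed to be a signed 64-bit integer, and ref-qualified overloads are used in place of the deduced `this Self&&` parameter so that the sketch compiles without C++23 support.

```cpp
#include <cstdint>
#include <utility>

// Assumption: dim_t is a signed 64-bit integer in this sketch.
using dim_t = std::int64_t;

// Hypothetical stand-in for LlamaConfig with two chainable setters.
class MiniLlamaConfig {
public:
    MiniLlamaConfig(dim_t embedding_dim, dim_t num_layers)
        : embedding_dim_(embedding_dim), num_layers_(num_layers) {}

    // Lvalue chain: returns a reference to the same named object.
    MiniLlamaConfig& withNumHeads(dim_t n) & { num_heads_ = n; return *this; }
    MiniLlamaConfig& withNumKVHeads(dim_t n) & { num_kv_heads_ = n; return *this; }

    // Rvalue chain: passes the temporary along so it can be moved out.
    MiniLlamaConfig&& withNumHeads(dim_t n) && { num_heads_ = n; return std::move(*this); }
    MiniLlamaConfig&& withNumKVHeads(dim_t n) && { num_kv_heads_ = n; return std::move(*this); }

    dim_t getNumHeads() const noexcept { return num_heads_; }
    dim_t getNumKVHeads() const noexcept { return num_kv_heads_; }
    dim_t getNumLayers() const noexcept { return num_layers_; }

private:
    dim_t embedding_dim_;
    dim_t num_layers_;
    dim_t num_heads_ = 32;
    dim_t num_kv_heads_ = 8;
};

// Builds a configuration in a single expression (rvalue chain).
inline MiniLlamaConfig makeExampleConfig() {
    return MiniLlamaConfig(4096, 32).withNumHeads(32).withNumKVHeads(8);
}
```

A named object can be updated the same way, for example `cfg.withNumHeads(16).withNumKVHeads(4);`, which goes through the lvalue overloads.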

Constructor & Destructor Documentation

◆ LlamaConfig()

Mila::Dnn::LlamaConfig::LlamaConfig	(	dim_t	embedding_dim,
		dim_t	num_layers )

inline

Construct a LLaMA network configuration.

Parameters

embedding_dim	Model embedding dimension. Must be > 0.
num_layers	Number of transformer layers. Must be > 0.

Member Function Documentation

◆ fromMetadata()

void Mila::Dnn::LlamaConfig::fromMetadata ( const SerializationMetadata & meta )

inline virtual

Populate configuration from provided metadata.

Implementations should read available keys and leave missing keys at their current/default values to preserve forward/backward compatibility.

Parameters

meta	Metadata to read configuration values from.

Implements Mila::Dnn::ComponentConfig.
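
The "leave missing keys at their current values" rule can be sketched as follows. The real `SerializationMetadata` interface is not shown in this page, so a plain string-to-integer map stands in for it. `MetaMap`, `PartialConfig`, `readIfPresent`, and the key names are all hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for SerializationMetadata: a flat key -> integer map.
using MetaMap = std::unordered_map<std::string, std::int64_t>;

// Overwrite `field` only when `key` is present in the metadata.
inline void readIfPresent(const MetaMap& meta, const std::string& key,
                          std::int64_t& field) {
    auto it = meta.find(key);
    if (it != meta.end()) {
        field = it->second;
    }
}

// Hypothetical subset of the configuration, using the documented defaults.
struct PartialConfig {
    std::int64_t num_layers  = 32;
    std::int64_t vocab_size  = 128256;
    std::int64_t max_seq_len = 8192;

    // Keys absent from `meta` leave the corresponding field untouched, so
    // metadata written by an older or newer version still loads.
    void fromMetadata(const MetaMap& meta) {
        readIfPresent(meta, "num_layers", num_layers);
        readIfPresent(meta, "vocab_size", vocab_size);
        readIfPresent(meta, "max_seq_len", max_seq_len);
    }
};
```

Loading metadata that contains only `num_layers` changes that one field and keeps the other two at their defaults.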

◆ getHiddenDimension()

dim_t Mila::Dnn::LlamaConfig::getHiddenDimension ( ) const

inline noexcept

◆ getMaxSequenceLength()

dim_t Mila::Dnn::LlamaConfig::getMaxSequenceLength ( ) const

inline noexcept

◆ getModelDim()

dim_t Mila::Dnn::LlamaConfig::getModelDim ( ) const

inline noexcept

◆ getNumHeads()

dim_t Mila::Dnn::LlamaConfig::getNumHeads ( ) const

inline noexcept

◆ getNumKVHeads()

dim_t Mila::Dnn::LlamaConfig::getNumKVHeads ( ) const

inline noexcept

◆ getNumLayers()

dim_t Mila::Dnn::LlamaConfig::getNumLayers ( ) const

inline noexcept

◆ getRMSNormEpsilon()

float Mila::Dnn::LlamaConfig::getRMSNormEpsilon ( ) const

inline noexcept

◆ getRoPEScalingFactor()

float Mila::Dnn::LlamaConfig::getRoPEScalingFactor ( ) const

inline noexcept

◆ getRoPETheta()

float Mila::Dnn::LlamaConfig::getRoPETheta ( ) const

inline noexcept

◆ getVocabSize()

dim_t Mila::Dnn::LlamaConfig::getVocabSize ( ) const

inline noexcept

◆ toMetadata()

SerializationMetadata Mila::Dnn::LlamaConfig::toMetadata ( ) const

inline virtual

Convert configuration into a SerializationMetadata object.

Implementations should include any fields required to fully reconstruct the configuration via fromMetadata.

Returns: SerializationMetadata Metadata representation of the config.

Implements Mila::Dnn::ComponentConfig.
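
The requirement that `toMetadata` emit everything `fromMetadata` needs amounts to a round-trip property. A minimal sketch, again with a hypothetical map type standing in for `SerializationMetadata` and hypothetical key names:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for SerializationMetadata.
using RoundTripMeta = std::unordered_map<std::string, std::int64_t>;

// Hypothetical two-field configuration used to illustrate the round trip.
struct RoundTripConfig {
    std::int64_t num_heads    = 32;
    std::int64_t num_kv_heads = 8;

    // Write every field that fromMetadata knows how to read back.
    RoundTripMeta toMetadata() const {
        return { {"num_heads", num_heads}, {"num_kv_heads", num_kv_heads} };
    }

    // Read back the same keys; absent keys keep their current value.
    void fromMetadata(const RoundTripMeta& meta) {
        auto heads = meta.find("num_heads");
        if (heads != meta.end()) num_heads = heads->second;
        auto kv = meta.find("num_kv_heads");
        if (kv != meta.end()) num_kv_heads = kv->second;
    }
};
```

A fresh object populated from another object's metadata should end up with the same field values as the original.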

◆ toString()

std::string Mila::Dnn::LlamaConfig::toString ( ) const

inline override virtual

Produce a short, human-readable summary of the configuration.

Implementations should return a compact, single-line description suitable for logging and debugging.

Returns: std::string Human-readable summary of the configuration.

Implements Mila::Dnn::ComponentConfig.
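
One possible shape for such a single-line summary is sketched below. The actual output format of `toString` is not documented here, so the layout, the `SummaryFields` struct, and the `summarize` function are illustrative assumptions only.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical subset of fields to summarize, with the documented defaults.
struct SummaryFields {
    std::int64_t embedding_dim = 4096;
    std::int64_t num_layers    = 32;
    std::int64_t num_heads     = 32;
    std::int64_t num_kv_heads  = 8;
};

// Build a compact one-line description with no embedded newlines,
// suitable for a log message.
inline std::string summarize(const SummaryFields& f) {
    std::ostringstream os;
    os << "LlamaConfig(dim=" << f.embedding_dim
       << ", layers=" << f.num_layers
       << ", heads=" << f.num_heads
       << ", kv_heads=" << f.num_kv_heads << ")";
    return os.str();
}
```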

◆ useBias()

bool Mila::Dnn::LlamaConfig::useBias ( ) const

inline noexcept

◆ validate()

void Mila::Dnn::LlamaConfig::validate ( ) const

inline override virtual

Validate configuration parameters.

Call this to ensure the configuration represents a valid, constructible component. Implementations must throw std::invalid_argument (or a derived exception) when validation fails.

Exceptions

std::invalid_argument If the configuration is invalid.

Implements Mila::Dnn::ComponentConfig.
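
The contract above (throw `std::invalid_argument` on failure) can be sketched as follows. The positivity checks follow the "Must be > 0" notes elsewhere in this page. The two divisibility checks are assumptions typical of grouped-query attention and are not confirmed by the source. `CheckedFields` and `validateFields` are hypothetical names.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical subset of fields to validate, with the documented defaults.
struct CheckedFields {
    std::int64_t embedding_dim = 4096;
    std::int64_t num_layers    = 32;
    std::int64_t max_seq_len   = 8192;
    std::int64_t num_heads     = 32;
    std::int64_t num_kv_heads  = 8;
};

// Throws std::invalid_argument on the first violated constraint.
inline void validateFields(const CheckedFields& f) {
    if (f.embedding_dim <= 0 || f.num_layers <= 0 || f.max_seq_len <= 0) {
        throw std::invalid_argument("dimensions must be > 0");
    }
    if (f.num_heads <= 0 || f.num_kv_heads <= 0) {
        throw std::invalid_argument("head counts must be > 0");
    }
    // Assumed check: query heads are grouped evenly over key/value heads.
    if (f.num_heads % f.num_kv_heads != 0) {
        throw std::invalid_argument("num_heads must be a multiple of num_kv_heads");
    }
    // Assumed check: the embedding splits evenly across heads.
    if (f.embedding_dim % f.num_heads != 0) {
        throw std::invalid_argument("embedding_dim must be a multiple of num_heads");
    }
}
```

The default values pass these checks, while a setting such as `num_kv_heads = 5` with 32 query heads is rejected.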

◆ withBias()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withBias	(	this Self &&	self,
		bool	use_bias )

inline

◆ withHiddenDimension()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withHiddenDimension	(	this Self &&	self,
		dim_t	hidden_dim )

inline

◆ withMaxSequenceLength()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withMaxSequenceLength	(	this Self &&	self,
		dim_t	max_seq_len )

inline

Sets the trained maximum sequence length for this model.

This value is sourced from the pretrained model metadata (max_position_embeddings in Hugging Face configs) and represents the architectural ceiling on context length: the furthest position at which the model was trained with RoPE.

This is not a deployment parameter. The runtime context length is a deployment decision carried by BuildContext, and must not exceed this value. LlamaModel::fromPretrained() enforces that invariant.

Template Parameters

Self	Deduced type of the builder (supports both lvalue and rvalue chains).

Parameters

max_seq_len The trained maximum sequence length. Must be > 0.

Exceptions

std::invalid_argument if max_seq_len is zero or negative.
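
The relationship between the trained maximum and the runtime context length can be sketched as a simple guard. How `LlamaModel::fromPretrained()` actually enforces the invariant is not shown in this page. The helper below is hypothetical and assumes that a violation is reported by throwing rather than by clamping.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the requested runtime context length if it
// fits within the trained maximum, and throws otherwise.
inline std::int64_t checkedContextLength(std::int64_t requested,
                                         std::int64_t trained_max) {
    if (trained_max <= 0) {
        throw std::invalid_argument("trained max sequence length must be > 0");
    }
    if (requested <= 0) {
        throw std::invalid_argument("context length must be > 0");
    }
    if (requested > trained_max) {
        throw std::invalid_argument(
            "context length " + std::to_string(requested) +
            " exceeds trained maximum " + std::to_string(trained_max));
    }
    return requested;
}
```

With the default trained maximum of 8192, a requested context of 4096 is accepted and a request for 16384 is rejected.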

◆ withNumHeads()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withNumHeads	(	this Self &&	self,
		dim_t	num_heads )

inline

◆ withNumKVHeads()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withNumKVHeads	(	this Self &&	self,
		dim_t	num_kv_heads )

inline

◆ withRoPEScalingFactor()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withRoPEScalingFactor	(	this Self &&	self,
		float	scale_factor )

inline

◆ withRoPETheta()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withRoPETheta	(	this Self &&	self,
		float	theta )

inline

◆ withVocabularyLength()

template<typename Self>

decltype(auto) Mila::Dnn::LlamaConfig::withVocabularyLength	(	this Self &&	self,
		dim_t	vocab_size )

inline

Member Data Documentation

◆ embedding_dim_

dim_t Mila::Dnn::LlamaConfig::embedding_dim_ = 4096

private

◆ hidden_dim_

dim_t Mila::Dnn::LlamaConfig::hidden_dim_ = 14336

private

◆ max_seq_len_

dim_t Mila::Dnn::LlamaConfig::max_seq_len_ = 8192

private

◆ num_heads_

dim_t Mila::Dnn::LlamaConfig::num_heads_ = 32

private

◆ num_kv_heads_

dim_t Mila::Dnn::LlamaConfig::num_kv_heads_ = 8

private

◆ num_layers_

dim_t Mila::Dnn::LlamaConfig::num_layers_ = 32

private

◆ rms_norm_eps_

float Mila::Dnn::LlamaConfig::rms_norm_eps_ = 1e-5f

private

◆ rope_scaling_factor_

float Mila::Dnn::LlamaConfig::rope_scaling_factor_ = 1.0f

private

◆ rope_theta_

float Mila::Dnn::LlamaConfig::rope_theta_ = 500000.0f

private

◆ use_bias_

bool Mila::Dnn::LlamaConfig::use_bias_ = false

private

◆ vocab_size_

dim_t Mila::Dnn::LlamaConfig::vocab_size_ = 128256

private

The documentation for this class was generated from the following file:

/__w/Mila/Mila/Mila/Src/Dnn/Components/Transformers/LlaMa/Llama.Config.ixx
