Configuration class for TransformerBlock.
Provides a type-safe fluent interface for configuring TransformerBlock modules.
◆ TransformerBlockConfig()
Mila::Dnn::TransformerBlockConfig::TransformerBlockConfig ( const std::vector< size_t > & input_shape, size_t num_heads )  [inline]
Constructor with required parameters.
- Parameters
  - input_shape: The shape of the input tensor [batch_size, sequence_length, embedding_dim]
  - num_heads: The number of attention heads
◆ getActivationType()
ActivationType Mila::Dnn::TransformerBlockConfig::getActivationType ( ) const  [inline]
Get the activation type for the MLP.
◆ getDropout()
float Mila::Dnn::TransformerBlockConfig::getDropout ( ) const  [inline]
Get the dropout rate.
◆ getHiddenDimension()
size_t Mila::Dnn::TransformerBlockConfig::getHiddenDimension ( ) const  [inline]
Get the hidden dimension for the feed-forward network.
◆ getInputShape()
const std::vector< size_t > & Mila::Dnn::TransformerBlockConfig::getInputShape ( ) const  [inline]
Get the input tensor shape.
◆ getNumHeads()
size_t Mila::Dnn::TransformerBlockConfig::getNumHeads ( ) const  [inline]
Get the number of attention heads.
◆ useBias()
bool Mila::Dnn::TransformerBlockConfig::useBias ( ) const  [inline]
Check if bias is enabled.
◆ usePreLayerNorm()
bool Mila::Dnn::TransformerBlockConfig::usePreLayerNorm ( ) const  [inline]
Check if using pre-layer normalization.
◆ validate()
void Mila::Dnn::TransformerBlockConfig::validate ( ) const  [inline, virtual]
Validate configuration parameters.
- Exceptions
  - std::invalid_argument: If validation fails
Reimplemented from Mila::Dnn::ComponentConfig.
◆ withActivation()
TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withActivation ( ActivationType activation_type )
Configure the activation function for the MLP.
- Parameters
  - activation_type: The activation function type
- Returns
  - TransformerBlockConfig&: Reference to this for method chaining
◆ withBias()
TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withBias ( bool use_bias )
Configure whether to use bias in attention and feedforward layers.
- Parameters
  - use_bias: Whether to use bias
- Returns
  - TransformerBlockConfig&: Reference to this for method chaining
◆ withDropout()
TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withDropout ( float dropout )
Configure the dropout rate.
- Parameters
  - dropout: Dropout probability (0.0 to 1.0)
- Returns
  - TransformerBlockConfig&: Reference to this for method chaining
◆ withHiddenDimension()
TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withHiddenDimension ( size_t hidden_dim )
Configure the hidden dimension for the feed-forward network.
- Parameters
  - hidden_dim: Size of the hidden layer in the feed-forward network
- Returns
  - TransformerBlockConfig&: Reference to this for method chaining
◆ withPreLayerNorm()
TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withPreLayerNorm ( bool use_pre_ln )
Configure whether to use pre-layer normalization architecture.
- Parameters
  - use_pre_ln: Whether to use pre-layer normalization
- Returns
  - TransformerBlockConfig&: Reference to this for method chaining
◆ activation_type_
ActivationType Mila::Dnn::TransformerBlockConfig::activation_type_  [private]
◆ dropout_
float Mila::Dnn::TransformerBlockConfig::dropout_ = 0.0f  [private]
◆ hidden_dim_
size_t Mila::Dnn::TransformerBlockConfig::hidden_dim_ = 0  [private]
◆ input_shape_
std::vector<size_t> Mila::Dnn::TransformerBlockConfig::input_shape_  [private]
◆ num_heads_
size_t Mila::Dnn::TransformerBlockConfig::num_heads_  [private]
◆ use_bias_
bool Mila::Dnn::TransformerBlockConfig::use_bias_ = true  [private]
◆ use_pre_ln_
bool Mila::Dnn::TransformerBlockConfig::use_pre_ln_ = true  [private]