Mila
Deep Neural Network Library
Mila::Dnn::TransformerBlockConfig Class Reference

Configuration class for TransformerBlock. More...

Inheritance diagram for Mila::Dnn::TransformerBlockConfig:
Collaboration diagram for Mila::Dnn::TransformerBlockConfig:

Public Member Functions

 TransformerBlockConfig (const std::vector< size_t > &input_shape, size_t num_heads)
 Constructor with required parameters.
 
ActivationType getActivationType () const
 Get the activation type for the MLP.
 
float getDropout () const
 Get the dropout rate.
 
size_t getHiddenDimension () const
 Get the hidden dimension for the feed-forward network.
 
const std::vector< size_t > & getInputShape () const
 Get the input shape.
 
size_t getNumHeads () const
 Get the number of attention heads.
 
bool useBias () const
 Check if bias is enabled.
 
bool usePreLayerNorm () const
 Check if using pre-layer normalization.
 
void validate () const
 Validate configuration parameters.
 
TransformerBlockConfig & withActivation (ActivationType activation_type)
 Configure the activation function for the MLP.
 
TransformerBlockConfig & withBias (bool use_bias)
 Configure whether to use bias in attention and feedforward layers.
 
TransformerBlockConfig & withDropout (float dropout)
 Configure the dropout rate.
 
TransformerBlockConfig & withHiddenDimension (size_t hidden_dim)
 Configure the hidden dimension for the feed-forward network.
 
TransformerBlockConfig & withPreLayerNorm (bool use_pre_ln)
 Configure whether to use pre-layer normalization architecture.
 
- Public Member Functions inherited from Mila::Dnn::ComponentConfig
virtual ~ComponentConfig ()=default
 Virtual destructor to support proper polymorphic destruction.
 
const std::string & getName () const
 Gets the configured component name.
 
ComputePrecision::Policy getPrecision () const
 Gets the configured precision policy.
 
bool isTraining () const
 Gets the configured training mode.
 
template<typename Self >
auto & withName (this Self &&self, std::string name)
 Sets the name of the component with fluent interface.
 
template<typename Self >
auto & withPrecision (this Self &&self, ComputePrecision::Policy policy)
 Sets the compute precision policy with fluent interface.
 
template<typename Self >
auto & withTraining (this Self &&self, bool is_training)
 Sets the training mode with fluent interface.
 

Private Attributes

ActivationType activation_type_ = ActivationType::Gelu
 
float dropout_ = 0.0f
 
size_t hidden_dim_ = 0
 
std::vector< size_t > input_shape_
 
size_t num_heads_
 
bool use_bias_ = true
 
bool use_pre_ln_ = true
 

Additional Inherited Members

- Protected Attributes inherited from Mila::Dnn::ComponentConfig
bool is_training_ = false
 Training mode flag, defaults to false (inference mode)
 
std::string name_ = "unnamed"
 Component name, defaults to "unnamed" if not explicitly set.
 
ComputePrecision::Policy precision_ = ComputePrecision::Policy::Auto
 Precision policy for computation, defaults to Auto.
 

Detailed Description

Configuration class for TransformerBlock.

Provides a type-safe fluent interface for configuring TransformerBlock modules.

Constructor & Destructor Documentation

◆ TransformerBlockConfig()

Mila::Dnn::TransformerBlockConfig::TransformerBlockConfig ( const std::vector< size_t > &  input_shape,
size_t  num_heads 
)
inline

Constructor with required parameters.

Parameters
input_shape: The shape of the input tensor [batch_size, sequence_length, embedding_dim]
num_heads: The number of attention heads

Member Function Documentation

◆ getActivationType()

ActivationType Mila::Dnn::TransformerBlockConfig::getActivationType ( ) const
inline

Get the activation type for the MLP.


◆ getDropout()

float Mila::Dnn::TransformerBlockConfig::getDropout ( ) const
inline

Get the dropout rate.


◆ getHiddenDimension()

size_t Mila::Dnn::TransformerBlockConfig::getHiddenDimension ( ) const
inline

Get the hidden dimension for the feed-forward network.


◆ getInputShape()

const std::vector< size_t > & Mila::Dnn::TransformerBlockConfig::getInputShape ( ) const
inline

Get the input shape.


◆ getNumHeads()

size_t Mila::Dnn::TransformerBlockConfig::getNumHeads ( ) const
inline

Get the number of attention heads.


◆ useBias()

bool Mila::Dnn::TransformerBlockConfig::useBias ( ) const
inline

Check if bias is enabled.


◆ usePreLayerNorm()

bool Mila::Dnn::TransformerBlockConfig::usePreLayerNorm ( ) const
inline

Check if using pre-layer normalization.


◆ validate()

void Mila::Dnn::TransformerBlockConfig::validate ( ) const
inlinevirtual

Validate configuration parameters.

Exceptions
std::invalid_argument: If validation fails

Reimplemented from Mila::Dnn::ComponentConfig.


◆ withActivation()

TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withActivation ( ActivationType  activation_type)
inline

Configure the activation function for the MLP.

Parameters
activation_type: The activation function type
Returns
TransformerBlockConfig& Reference to this for method chaining

◆ withBias()

TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withBias ( bool  use_bias)
inline

Configure whether to use bias in attention and feedforward layers.

Parameters
use_bias: Whether to use bias
Returns
TransformerBlockConfig& Reference to this for method chaining

◆ withDropout()

TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withDropout ( float  dropout)
inline

Configure the dropout rate.

Parameters
dropout: Dropout probability (0.0 to 1.0)
Returns
TransformerBlockConfig& Reference to this for method chaining

◆ withHiddenDimension()

TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withHiddenDimension ( size_t  hidden_dim)
inline

Configure the hidden dimension for the feed-forward network.

Parameters
hidden_dim: Size of the hidden layer in the feed-forward network
Returns
TransformerBlockConfig& Reference to this for method chaining

◆ withPreLayerNorm()

TransformerBlockConfig & Mila::Dnn::TransformerBlockConfig::withPreLayerNorm ( bool  use_pre_ln)
inline

Configure whether to use pre-layer normalization architecture.

Parameters
use_pre_ln: Whether to use pre-layer normalization
Returns
TransformerBlockConfig& Reference to this for method chaining

Member Data Documentation

◆ activation_type_

ActivationType Mila::Dnn::TransformerBlockConfig::activation_type_ = ActivationType::Gelu
private

◆ dropout_

float Mila::Dnn::TransformerBlockConfig::dropout_ = 0.0f
private

◆ hidden_dim_

size_t Mila::Dnn::TransformerBlockConfig::hidden_dim_ = 0
private

◆ input_shape_

std::vector<size_t> Mila::Dnn::TransformerBlockConfig::input_shape_
private

◆ num_heads_

size_t Mila::Dnn::TransformerBlockConfig::num_heads_
private

◆ use_bias_

bool Mila::Dnn::TransformerBlockConfig::use_bias_ = true
private

◆ use_pre_ln_

bool Mila::Dnn::TransformerBlockConfig::use_pre_ln_ = true
private
