Mila 0.13.48
Deep Neural Network Library
Network-level configuration for GPT-style transformer networks.


Public Member Functions | |
| GptConfig (dim_t embedding_size, dim_t num_layers) | |
| void | fromMetadata (const SerializationMetadata &meta) |
| Populate configuration from provided metadata. | |
| dim_t | getEmbeddingSize () const noexcept |
| dim_t | getHiddenSize () const noexcept |
| dim_t | getMaxSequenceLength () const noexcept |
| dim_t | getNumHeads () const noexcept |
| dim_t | getNumLayers () const noexcept |
| bool | getUseBias () const noexcept |
| dim_t | getVocabSize () const noexcept |
| SerializationMetadata | toMetadata () const |
| Convert configuration into a SerializationMetadata object. | |
| std::string | toString () const override |
| Produce a short, human-readable summary of the configuration. | |
| void | validate () const override |
| Validate configuration parameters. | |
| template<typename Self> | |
| decltype(auto) | withBias (this Self &&self, bool use_bias) |
| template<typename Self> | |
| decltype(auto) | withHiddenSize (this Self &&self, dim_t hidden_size) |
| template<typename Self> | |
| decltype(auto) | withMaxSequenceLength (this Self &&self, dim_t max_seq_len) |
| template<typename Self> | |
| decltype(auto) | withNumHeads (this Self &&self, dim_t num_heads) |
| template<typename Self> | |
| decltype(auto) | withNumLayers (this Self &&self, dim_t num_layers) |
| template<typename Self> | |
| decltype(auto) | withVocabSize (this Self &&self, dim_t vocab_size) |
| Public Member Functions inherited from Mila::Dnn::ComponentConfig | |
| virtual | ~ComponentConfig ()=default |
| Virtual destructor for polymorphic base. | |
Private Attributes | |
| dim_t | embedding_size_ = 768 |
| dim_t | hidden_size_ = 768 |
| dim_t | max_seq_len_ = 1024 |
| dim_t | num_heads_ = 12 |
| dim_t | num_layers_ = 12 |
| bool | use_bias_ = true |
| dim_t | vocab_size_ = 50257 |
Network-level configuration for GPT-style transformer networks.
Contains only the minimal network-level settings required by GPT networks: embedding dimension, number of layers, number of attention heads, vocabulary size, and maximum sequence length.
void fromMetadata (const SerializationMetadata &meta) [inline, virtual]
Populate configuration from provided metadata.
Implementations should read available keys and leave missing keys at their current/default values to preserve forward/backward compatibility.
| meta | Metadata to read configuration values from. |
Implements Mila::Dnn::ComponentConfig.
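The compatibility rule above (read the keys that are present, keep the current value for any that are missing) can be sketched as follows. This is a minimal illustration, not Mila's implementation: `SketchMetadata`, `MetaSketchConfig`, `readIfPresent`, and the key names are all hypothetical, and the alias for `dim_t` is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

using dim_t = std::int64_t;  // assumption: stand-in for Mila's dim_t

// Hypothetical stand-in for SerializationMetadata: a flat key -> integer map.
using SketchMetadata = std::unordered_map<std::string, std::int64_t>;

// Hypothetical stand-in for GptConfig, reduced to three fields.
struct MetaSketchConfig {
    dim_t embedding_size = 768;
    dim_t num_layers = 12;
    dim_t vocab_size = 50257;

    // Overwrite a field only when its key is present. Missing keys leave the
    // current value untouched, so older or partial metadata still loads.
    void fromMetadata(const SketchMetadata& meta) {
        readIfPresent(meta, "embedding_size", embedding_size);
        readIfPresent(meta, "num_layers", num_layers);
        readIfPresent(meta, "vocab_size", vocab_size);
    }

    static void readIfPresent(const SketchMetadata& meta,
                              const std::string& key, dim_t& field) {
        auto it = meta.find(key);
        if (it != meta.end()) {
            field = it->second;
        }
    }
};
```

Loading metadata that holds only `num_layers` would change that one field and leave `embedding_size` and `vocab_size` at their defaults.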

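dim_t getEmbeddingSize () const [inline, noexcept]
dim_t getHiddenSize () const [inline, noexcept]
dim_t getMaxSequenceLength () const [inline, noexcept]
dim_t getNumHeads () const [inline, noexcept]
dim_t getNumLayers () const [inline, noexcept]
bool getUseBias () const [inline, noexcept]
dim_t getVocabSize () const [inline, noexcept]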
SerializationMetadata toMetadata () const [inline, virtual]
Convert configuration into a SerializationMetadata object.
Implementations should include any fields required to fully reconstruct the configuration via fromMetadata.
Implements Mila::Dnn::ComponentConfig.
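One way to check the "fully reconstruct" requirement is a round-trip test: serialize, load into a fresh object, and compare fields. The sketch below assumes a simple key/value store in place of the real `SerializationMetadata`; `RoundTripMetadata`, `RoundTripConfig`, `roundTrips`, and the key names are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <string>

using dim_t = std::int64_t;  // assumption: stand-in for Mila's dim_t

// Hypothetical stand-in for SerializationMetadata.
using RoundTripMetadata = std::map<std::string, std::int64_t>;

// Hypothetical stand-in for GptConfig, reduced to three fields.
struct RoundTripConfig {
    dim_t embedding_size = 768;
    dim_t num_heads = 12;
    bool use_bias = true;

    // Write every field, including those still at their defaults, so the
    // metadata alone is enough to rebuild the configuration.
    RoundTripMetadata toMetadata() const {
        RoundTripMetadata meta;
        meta["embedding_size"] = embedding_size;
        meta["num_heads"] = num_heads;
        meta["use_bias"] = use_bias ? 1 : 0;
        return meta;
    }

    // Read back whichever keys are present.
    void fromMetadata(const RoundTripMetadata& meta) {
        auto it = meta.find("embedding_size");
        if (it != meta.end()) embedding_size = it->second;
        it = meta.find("num_heads");
        if (it != meta.end()) num_heads = it->second;
        it = meta.find("use_bias");
        if (it != meta.end()) use_bias = (it->second != 0);
    }
};

// Returns true when serializing and reloading reproduces every field.
inline bool roundTrips(const RoundTripConfig& original) {
    RoundTripConfig restored;
    restored.fromMetadata(original.toMetadata());
    return restored.embedding_size == original.embedding_size &&
           restored.num_heads == original.num_heads &&
           restored.use_bias == original.use_bias;
}
```

If `toMetadata` skipped a field, `roundTrips` would return false for any configuration where that field differs from its default.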

std::string toString () const [inline, override, virtual]
Produce a short, human-readable summary of the configuration.
Implementations should return a compact, single-line description suitable for logging and debugging.
Implements Mila::Dnn::ComponentConfig.
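A compact single-line summary of this kind might look like the sketch below. The output format, field order, and the `SummaryConfig` type are assumptions made for illustration; the actual string produced by `GptConfig::toString` is not documented here.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

using dim_t = std::int64_t;  // assumption: stand-in for Mila's dim_t

// Hypothetical stand-in for GptConfig; defaults mirror the documented ones.
struct SummaryConfig {
    dim_t embedding_size = 768;
    dim_t num_layers = 12;
    dim_t num_heads = 12;
    dim_t vocab_size = 50257;
    dim_t max_seq_len = 1024;
    bool use_bias = true;

    // Build a one-line "name=value" summary with no trailing newline, so it
    // can be embedded directly in a log message.
    std::string toString() const {
        std::ostringstream out;
        out << "GptConfig(embedding_size=" << embedding_size
            << ", num_layers=" << num_layers
            << ", num_heads=" << num_heads
            << ", vocab_size=" << vocab_size
            << ", max_seq_len=" << max_seq_len
            << ", use_bias=" << (use_bias ? "true" : "false") << ")";
        return out.str();
    }
};
```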
void validate () const [inline, override, virtual]
Validate configuration parameters.
Call this to confirm that the configuration describes a valid, constructible component. Implementations must throw std::invalid_argument (or a derived exception) when validation fails.
| std::invalid_argument | If the configuration is invalid. |
Implements Mila::Dnn::ComponentConfig.
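The throw-on-failure contract can be illustrated with a small stand-in. The specific rules checked here (positive sizes, embedding size divisible by the head count) are assumptions based on common transformer constraints, not the documented rules of `GptConfig::validate`; `CheckedConfig` and `isValid` are hypothetical names.

```cpp
#include <cstdint>
#include <stdexcept>

using dim_t = std::int64_t;  // assumption: stand-in for Mila's dim_t

// Hypothetical stand-in for GptConfig, reduced to three fields.
struct CheckedConfig {
    dim_t embedding_size = 768;
    dim_t num_heads = 12;
    dim_t num_layers = 12;

    // Throw std::invalid_argument on the first rule that fails.
    // The rules themselves are assumed for this sketch.
    void validate() const {
        if (embedding_size <= 0) {
            throw std::invalid_argument("embedding_size must be positive");
        }
        if (num_heads <= 0) {
            throw std::invalid_argument("num_heads must be positive");
        }
        if (num_layers <= 0) {
            throw std::invalid_argument("num_layers must be positive");
        }
        if (embedding_size % num_heads != 0) {
            throw std::invalid_argument(
                "embedding_size must be divisible by num_heads");
        }
    }
};

// Convenience wrapper: true when validate() accepts the configuration.
inline bool isValid(const CheckedConfig& cfg) {
    try {
        cfg.validate();
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

Catching `const std::invalid_argument&` also catches derived exception types, which matches the "or a derived exception" wording above.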
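template<typename Self> decltype(auto) withBias (this Self &&self, bool use_bias) [inline]
template<typename Self> decltype(auto) withHiddenSize (this Self &&self, dim_t hidden_size) [inline]
template<typename Self> decltype(auto) withMaxSequenceLength (this Self &&self, dim_t max_seq_len) [inline]
template<typename Self> decltype(auto) withNumHeads (this Self &&self, dim_t num_heads) [inline]
template<typename Self> decltype(auto) withNumLayers (this Self &&self, dim_t num_layers) [inline]
template<typename Self> decltype(auto) withVocabSize (this Self &&self, dim_t vocab_size) [inline]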
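The `with*` setters are declared with an explicit object parameter (`this Self &&self`), a C++23 feature often called "deducing this". The sketch below shows how such a setter can support chained calls on both named objects and temporaries. `FluentConfig` and `makeSmallGpt` are hypothetical stand-ins, the setter bodies are an assumption about how the pattern is typically written, and the `#else` branch is a fallback added only so the example compiles on pre-C++23 compilers.

```cpp
#include <cstdint>
#include <utility>

using dim_t = std::int64_t;  // assumption: stand-in for Mila's dim_t

// Hypothetical stand-in for GptConfig with two of the fluent setters.
class FluentConfig {
public:
    FluentConfig(dim_t embedding_size, dim_t num_layers)
        : embedding_size_(embedding_size), num_layers_(num_layers) {}

#if defined(__cpp_explicit_this_parameter)
    // C++23 explicit object parameter. Self deduces to FluentConfig& for an
    // lvalue and to FluentConfig for an rvalue; std::forward hands the object
    // back with the same value category, so chains work on both.
    template <typename Self>
    decltype(auto) withNumHeads(this Self&& self, dim_t num_heads) {
        self.num_heads_ = num_heads;
        return std::forward<Self>(self);
    }

    template <typename Self>
    decltype(auto) withVocabSize(this Self&& self, dim_t vocab_size) {
        self.vocab_size_ = vocab_size;
        return std::forward<Self>(self);
    }
#else
    // Fallback without C++23 support: always return an lvalue reference.
    FluentConfig& withNumHeads(dim_t num_heads) {
        num_heads_ = num_heads;
        return *this;
    }
    FluentConfig& withVocabSize(dim_t vocab_size) {
        vocab_size_ = vocab_size;
        return *this;
    }
#endif

    dim_t getEmbeddingSize() const noexcept { return embedding_size_; }
    dim_t getNumLayers() const noexcept { return num_layers_; }
    dim_t getNumHeads() const noexcept { return num_heads_; }
    dim_t getVocabSize() const noexcept { return vocab_size_; }

private:
    dim_t embedding_size_ = 768;
    dim_t num_layers_ = 12;
    dim_t num_heads_ = 12;
    dim_t vocab_size_ = 50257;
};

// Build a configuration in a single chained expression on a temporary.
inline FluentConfig makeSmallGpt() {
    return FluentConfig(512, 6).withNumHeads(8).withVocabSize(32000);
}
```

With the explicit object parameter, a single template covers what would otherwise need separate `&` and `&&` overloads, and a derived configuration class would get its own type back from the chain rather than the base type.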
