Mila 0.13.48
Deep Neural Network Library
Abstract base configuration for all deployable Mila models.
Public Member Functions | |
| ModelConfig (const ModelConfig &)=delete | |
| virtual | ~ModelConfig ()=default |
| dim_t | getContextLength () const noexcept |
| ModelConfig & | operator= (const ModelConfig &)=delete |
| virtual std::string | toString () const =0 |
| Produce a human-readable summary of the model configuration. | |
| template<typename Self> | |
| Self & | withContextLength (this Self &self, dim_t context_length) |
| Set the maximum sequence length. | |
Protected Member Functions | |
| ModelConfig ()=default | |
| Default constructor for subclasses that set context_length via withContextLength(). | |
| ModelConfig (dim_t context_length) | |
| Construct with required context_length. | |
| std::string | baseToString () const |
| Produce the base fields portion of toString(). | |
Protected Attributes | |
| dim_t | context_length_ { 0 } |
Abstract base configuration for all deployable Mila models.
Subclasses add architecture-specific deployment concerns where they have any. LlamaModelConfig, for example, adds nothing beyond what ModelConfig already owns, because all Llama architectural parameters come from checkpoint metadata.
Non-copyable by design — model configs are constructed once and passed by const reference into fromPretrained().
| virtual ~ModelConfig () | virtual default |
| ModelConfig (const ModelConfig &) | delete |


| ModelConfig (dim_t context_length) | inline explicit protected |
Construct with required context_length.
Protected — construction is via concrete subclass only.
| context_length | Maximum sequence length. Must be > 0. |
| ModelConfig () | protected default |
Default constructor for subclasses that set context_length via withContextLength().
context_length_ is initialised to zero. Subclasses or fromPretrained() must call withContextLength() before passing the config to build().
| std::string baseToString () const | inline protected |
Produce the base fields portion of toString().
Subclasses call this and append their own fields.
| dim_t getContextLength () const noexcept | inline noexcept |
| ModelConfig & operator= (const ModelConfig &) | delete |

| virtual std::string toString () const | pure virtual |
Produce a human-readable summary of the model configuration.
Implementations should include base fields by calling baseToString() and appending subclass-specific fields.
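The toString() contract described above can be sketched as follows. This is an illustration under stated assumptions, not Mila's implementation: the ModelConfig class here is a stand-in that mirrors only the members documented on this page, dim_t is assumed to be an integer alias, the "context_length=..." format returned by baseToString() is invented for the example, and ExampleModelConfig with its use_kv_cache field is hypothetical.

```cpp
#include <string>

using dim_t = long long;  // assumption: stand-in for Mila's dim_t

// Stand-in that mirrors only the members documented on this page.
class ModelConfig {
public:
    ModelConfig(const ModelConfig&) = delete;
    ModelConfig& operator=(const ModelConfig&) = delete;
    virtual ~ModelConfig() = default;

    dim_t getContextLength() const noexcept { return context_length_; }
    virtual std::string toString() const = 0;

protected:
    explicit ModelConfig(dim_t context_length)
        : context_length_(context_length) {}

    // Assumed output format; the real baseToString() may differ.
    std::string baseToString() const {
        return "context_length=" + std::to_string(context_length_);
    }

    dim_t context_length_{0};
};

// Hypothetical subclass with one extra field, for illustration only.
class ExampleModelConfig final : public ModelConfig {
public:
    ExampleModelConfig(dim_t context_length, bool use_kv_cache)
        : ModelConfig(context_length), use_kv_cache_(use_kv_cache) {}

    std::string toString() const override {
        // Base fields first, then the subclass-specific fields.
        return "ExampleModelConfig{" + baseToString() +
               ", use_kv_cache=" + (use_kv_cache_ ? "true" : "false") + "}";
    }

private:
    bool use_kv_cache_;
};

// Configs are non-copyable, so consumers take them by const reference.
std::string describe(const ModelConfig& config) {
    return config.toString();
}
```

Calling describe() on an ExampleModelConfig constructed with (2048, true) yields a single string that starts with the base fields and ends with the subclass field. Because the copy constructor is deleted, attempting to pass the config by value would fail to compile.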
| template<typename Self> Self & withContextLength (this Self &self, dim_t context_length) | inline |
Set the maximum sequence length.
Required. The model is built at this context length so that RoPE embeddings and KV cache buffers cover the full range.
| context_length | Maximum sequence length in tokens. |
| dim_t context_length_ { 0 } | protected |