Mila 0.13.48
Deep Neural Network Library
CRTP base configuration for all deployable Mila language models.
Classes | |
| struct | Mila::Dnn::LanguageModelConfig< TDerived > |
| CRTP base configuration for all deployable Mila language models. | |
Namespaces | |
| namespace | Mila |
| Mila main API namespace. | |
| namespace | Mila::Dnn |
Enumerations | |
| enum class | Mila::Dnn::KvCacheCompression { None, FP8 } |
| KV cache storage and compression strategy for GroupedQueryAttention. | |
| enum class | Mila::Dnn::WeightQuantization { None, FP8, FP4 } |
| Weight storage and matmul strategy for Linear components. | |
CRTP base configuration for all deployable Mila language models.
LanguageModelConfig<TDerived> owns the deployment concerns that are universal across all language model architectures:
All fluent setters return TDerived& so that chains work correctly across both base and derived methods without casting at the call site:
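The following is a minimal sketch of the CRTP fluent-setter pattern described above, not Mila's actual implementation. The setter and accessor names (withKvCacheCompression, withWeightQuantization, withNumLayers), the ToyModelConfig type, and the crtp_sketch namespace are all hypothetical; only the enum names and their enumerators come from this page.

```cpp
#include <cstddef>

namespace crtp_sketch {

enum class KvCacheCompression { None, FP8 };
enum class WeightQuantization { None, FP8, FP4 };

// CRTP base: every setter returns TDerived&, so a chain that starts on a
// derived config keeps the derived type after calling a base setter.
template <typename TDerived>
struct LanguageModelConfig {
    // Hypothetical setter names; the real Mila setters may differ.
    TDerived& withKvCacheCompression(KvCacheCompression c) {
        kv_cache_compression_ = c;
        return self();
    }
    TDerived& withWeightQuantization(WeightQuantization q) {
        weight_quantization_ = q;
        return self();
    }

    KvCacheCompression kvCacheCompression() const { return kv_cache_compression_; }
    WeightQuantization weightQuantization() const { return weight_quantization_; }

private:
    // The downcast is safe as long as TDerived really derives from this base.
    TDerived& self() { return static_cast<TDerived&>(*this); }

    KvCacheCompression kv_cache_compression_ = KvCacheCompression::None;
    WeightQuantization weight_quantization_ = WeightQuantization::None;
};

// Hypothetical derived configuration with one architecture-specific setter.
struct ToyModelConfig : LanguageModelConfig<ToyModelConfig> {
    ToyModelConfig& withNumLayers(std::size_t n) {
        num_layers_ = n;
        return *this;
    }
    std::size_t numLayers() const { return num_layers_; }

private:
    std::size_t num_layers_ = 0;
};

// Base and derived setters interleave in one chain with no cast at the call site.
inline ToyModelConfig makeToyConfig() {
    ToyModelConfig config;
    config.withKvCacheCompression(KvCacheCompression::FP8)   // base setter
          .withNumLayers(12)                                 // derived setter
          .withWeightQuantization(WeightQuantization::FP4);  // base setter again
    return config;
}

}  // namespace crtp_sketch
```

If the base setters returned LanguageModelConfig& instead, the call to withNumLayers in the middle of the chain would not compile, because the base type has no such member.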
ModelConfig<TDevice, TPrecision> is the structural base for all Mila models. LanguageModelConfig is the deployment configuration counterpart for the language model branch of that hierarchy. Vision model configurations would derive from a sibling VisionModelConfig<TDerived>, not from this class.
LanguageModelConfig is the public API surface for deployment configuration. BuildContext is the internal carrier that passes those settings through the component tree. fromPretrained() projects LanguageModelConfig into BuildContext exactly once; the two are never the same object.
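The separation between the public configuration and the internal carrier could look roughly like the sketch below. It is an illustration under assumptions, not Mila's code: DeploymentConfig is a simplified, non-CRTP stand-in for LanguageModelConfig, the BuildContext field names are invented, and project() is a hypothetical helper standing in for the projection step that fromPretrained() is said to perform.

```cpp
namespace projection_sketch {

enum class KvCacheCompression { None, FP8 };
enum class WeightQuantization { None, FP8, FP4 };

// Simplified stand-in for the public-facing configuration (assumption).
struct DeploymentConfig {
    KvCacheCompression kv_cache_compression = KvCacheCompression::None;
    WeightQuantization weight_quantization = WeightQuantization::None;
};

// Internal carrier handed down the component tree. Field names are assumptions.
struct BuildContext {
    KvCacheCompression kv_cache_compression;
    WeightQuantization weight_quantization;
};

// One-way projection: values are copied, and no reference to the
// configuration is retained, so the two objects stay independent.
inline BuildContext project(const DeploymentConfig& config) {
    return BuildContext{config.kv_cache_compression, config.weight_quantization};
}

}  // namespace projection_sketch
```

Because the context holds copies, changing the configuration after the projection has no effect on components that were built from the context.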
Convenience preset methods express common deployment decisions in user vocabulary. Fine-grained setters are available for atypical configurations:
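As a rough illustration of how a preset can sit on top of fine-grained setters, consider the sketch below. The preset name withMemoryConstrainedPreset, the values it chooses, the setter names, and ToyConfig are all hypothetical; this page does not list the actual preset methods.

```cpp
namespace preset_sketch {

enum class KvCacheCompression { None, FP8 };
enum class WeightQuantization { None, FP8, FP4 };

template <typename TDerived>
struct LanguageModelConfig {
    // Hypothetical preset: bundles several low-level choices under one
    // name. The specific values picked here are an assumption.
    TDerived& withMemoryConstrainedPreset() {
        return withKvCacheCompression(KvCacheCompression::FP8)
            .withWeightQuantization(WeightQuantization::FP4);
    }

    // Hypothetical fine-grained setters for atypical configurations.
    TDerived& withKvCacheCompression(KvCacheCompression c) {
        kv_cache_compression = c;
        return static_cast<TDerived&>(*this);
    }
    TDerived& withWeightQuantization(WeightQuantization q) {
        weight_quantization = q;
        return static_cast<TDerived&>(*this);
    }

    KvCacheCompression kv_cache_compression = KvCacheCompression::None;
    WeightQuantization weight_quantization = WeightQuantization::None;
};

// Hypothetical derived configuration with no extra fields.
struct ToyConfig : LanguageModelConfig<ToyConfig> {};

}  // namespace preset_sketch
```

In this arrangement a later fine-grained call overrides what the preset chose, so a caller can start from a preset and adjust a single setting afterwards.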