Base configuration for all deployable Mila models.

#include <string>
#include <cstdint>
#include <stdexcept>
import Dnn.TensorTypes;

Classes
class	Mila::Dnn::ModelConfig
	Abstract base configuration for all deployable Mila models.

Namespaces
namespace	Mila
	Mila main API namespace.
namespace	Mila::Dnn

Detailed Description

Base configuration for all deployable Mila models.

ModelConfig carries model-wide deployment concerns that are common to all models and independent of network architecture. It is intentionally decoupled from ComponentConfig, which is purely structural (dimensions, features, flags).

Deployment concerns owned here:

ComputePrecision::Policy: cuBLASLt algorithm selection heuristic applied uniformly to all compute components (Linear, GQA, MHA). Replaces the precision_ field previously on ComponentConfig.
QuantizationConfig: weight storage dtype and scale allocation policy. Currently consumed by Linear only. Defaults to QuantizationConfig::none().
context_length: maximum sequence length the model is built for. Universal across all sequence models.
strict: whether unrecognized parameter names throw on load. Universal across all pretrained model loading.

Construction is via fluent setters on the concrete subclass. context_length is required and has no default; subclasses must enforce this.
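
The following is a minimal sketch of that pattern, not the actual Mila declarations. The names ModelConfigSketch, ExampleModelConfig, withContextLength, withStrict, and validate are hypothetical, as are the use of 0 as the "not set" marker and the default value of strict. It only illustrates how a concrete subclass might expose chainable setters and reject a missing context_length.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the abstract base; member names are assumptions.
class ModelConfigSketch {
public:
    virtual ~ModelConfigSketch() = default;

    std::int64_t contextLength() const { return context_length_; }
    bool strict() const { return strict_; }

    // Each concrete subclass enforces its own required fields here.
    virtual void validate() const = 0;

protected:
    std::int64_t context_length_ = 0;  // 0 marks "not set"; no usable default
    bool strict_ = true;               // assumed default, not taken from the source
};

// Hypothetical concrete subclass with fluent setters returning its own type.
class ExampleModelConfig final : public ModelConfigSketch {
public:
    ExampleModelConfig& withContextLength(std::int64_t n) {
        if (n <= 0) {
            throw std::invalid_argument("context_length must be positive");
        }
        context_length_ = n;
        return *this;
    }

    ExampleModelConfig& withStrict(bool s) {
        strict_ = s;
        return *this;
    }

    // Throws if the required context_length was never supplied.
    void validate() const override {
        if (context_length_ <= 0) {
            throw std::invalid_argument("context_length is required");
        }
    }
};
```

Under these assumptions, a call such as ExampleModelConfig{}.withContextLength(1024).withStrict(false) chains on the subclass type, and validate() throws std::invalid_argument when context_length was never set.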

Relationship to BuildContext

ModelConfig is the public API surface for deployment configuration. BuildContext is the internal carrier through the component tree. fromPretrained() projects ModelConfig into BuildContext once; they are never the same object.
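
A rough illustration of the projection step follows. PublicConfigSketch, BuildContextSketch, and projectToBuildContext are hypothetical names, and the fields carried across are assumptions; the real BuildContext layout is not described here. The point is only that the internal carrier is a separate value copied from the public configuration, so later changes to one do not affect the other.

```cpp
#include <cstdint>

// Hypothetical public-facing configuration (stand-in for ModelConfig).
struct PublicConfigSketch {
    std::int64_t context_length = 0;
    bool strict = true;
};

// Hypothetical internal carrier (stand-in for BuildContext).
// Which fields it really holds is an assumption made for this sketch.
struct BuildContextSketch {
    std::int64_t context_length;
    bool strict;
};

// Copies values out of the public config once.
// The returned object shares no state with its source.
inline BuildContextSketch projectToBuildContext(const PublicConfigSketch& cfg) {
    return BuildContextSketch{cfg.context_length, cfg.strict};
}
```

Because the projection returns a copy, mutating the public configuration after the call leaves the already-built context unchanged in this sketch.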
