Mila 0.13.48
Deep Neural Network Library
ModelConfig.ixx File Reference

Base configuration for all deployable Mila models.

#include <string>
#include <cstdint>
#include <stdexcept>
import Dnn.TensorTypes;

Classes

class  Mila::Dnn::ModelConfig
 Abstract base configuration for all deployable Mila models.

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn

Detailed Description

Base configuration for all deployable Mila models.

ModelConfig carries model-wide deployment concerns that are common to all models and independent of network architecture. It is intentionally decoupled from ComponentConfig, which is purely structural (dimensions, features, flags).

Deployment concerns owned here:

  1. ComputePrecision::Policy — cuBLASLt algorithm selection heuristic applied uniformly to all compute components (Linear, GQA, MHA). Replaces the precision_ field previously on ComponentConfig.
  2. QuantizationConfig — weight storage dtype and scale allocation policy. Currently consumed by Linear only. Defaults to QuantizationConfig::none().
  3. context_length — maximum sequence length the model is built for. Universal across all sequence models.
  4. strict — whether unrecognized parameter names throw on load. Universal across all pretrained model loading.

A configuration is built through fluent setters on the concrete subclass. context_length is required and has no default, so each subclass must enforce that it has been set.
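The following is a minimal sketch of how the four deployment concerns and the fluent-setter pattern could fit together. It is not the actual Mila declaration: the `sketch` namespace, `PrecisionPolicy` and its enumerators, the `weight_dtype` field, `ExampleModelConfig`, the `with...` setter names, `validate()`, the use of 0 as a "not set" sentinel, and the default value of `strict` are all assumptions made for illustration. Only `ModelConfig`, `QuantizationConfig::none()`, `context_length`, and `strict` are names taken from the description above.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical stand-in for ComputePrecision::Policy; enumerators are assumed.
enum class PrecisionPolicy { Auto, Performance, Accuracy };

// Hypothetical stand-in for QuantizationConfig. Only none() is named in the docs.
struct QuantizationConfig {
    std::string weight_dtype = "none";
    static QuantizationConfig none() { return QuantizationConfig{}; }
};

// Abstract base: owns the deployment concerns, exposes read-only accessors,
// and leaves enforcement of required fields to the concrete subclass.
class ModelConfig {
public:
    virtual ~ModelConfig() = default;

    // Assumed hook through which a subclass enforces required fields.
    virtual void validate() const = 0;

    PrecisionPolicy precisionPolicy() const { return precision_; }
    const QuantizationConfig& quantization() const { return quantization_; }
    std::int64_t contextLength() const { return context_length_; }
    bool strict() const { return strict_; }

protected:
    PrecisionPolicy precision_ = PrecisionPolicy::Auto;
    QuantizationConfig quantization_ = QuantizationConfig::none();
    std::int64_t context_length_ = 0;  // 0 is used here to mean "not set"
    bool strict_ = true;               // assumed default
};

// Hypothetical concrete subclass: each setter returns *this so calls chain.
class ExampleModelConfig final : public ModelConfig {
public:
    ExampleModelConfig& withPrecisionPolicy(PrecisionPolicy policy) {
        precision_ = policy;
        return *this;
    }
    ExampleModelConfig& withQuantization(QuantizationConfig config) {
        quantization_ = config;
        return *this;
    }
    ExampleModelConfig& withContextLength(std::int64_t length) {
        context_length_ = length;
        return *this;
    }
    ExampleModelConfig& withStrict(bool strict) {
        strict_ = strict;
        return *this;
    }

    // Rejects a configuration whose context_length was never set.
    void validate() const override {
        if (context_length_ <= 0) {
            throw std::invalid_argument("context_length is required and must be positive");
        }
    }
};

}  // namespace sketch
```

In this sketch a caller would write something like `sketch::ExampleModelConfig{}.withContextLength(1024).withStrict(false)` and then call `validate()` before use. Placing the setters on the concrete subclass lets each one return the subclass type, so model-specific setters can be chained with the common ones.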

Relationship to BuildContext

ModelConfig is the public API surface for deployment configuration. BuildContext is the internal carrier through the component tree. fromPretrained() projects ModelConfig into BuildContext once — they are never the same object.
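The one-way projection described above can be illustrated with a small sketch. Everything here is hypothetical: the `projection_sketch` namespace, the reduced two-field `ModelConfig` and `BuildContext` structs, and the `project()` helper stand in for whatever fromPretrained() does internally, which this page does not show. The point is only that the context is a separate value copied from the configuration once, so later changes to the configuration do not reach it.

```cpp
#include <cstdint>
#include <stdexcept>

namespace projection_sketch {

// Reduced, hypothetical stand-in for the public configuration type.
struct ModelConfig {
    std::int64_t context_length = 0;  // 0 is used here to mean "not set"
    bool strict = true;
};

// Hypothetical internal carrier handed down the component tree.
struct BuildContext {
    std::int64_t context_length;
    bool strict;
};

// Copies the relevant fields into a new BuildContext. The result is returned
// by value, so it never aliases the ModelConfig it was projected from.
inline BuildContext project(const ModelConfig& config) {
    if (config.context_length <= 0) {
        throw std::invalid_argument("context_length is required and must be positive");
    }
    return BuildContext{config.context_length, config.strict};
}

}  // namespace projection_sketch
```

Because `project()` returns a copy, mutating the `ModelConfig` afterwards leaves the `BuildContext` unchanged, which mirrors the statement that the two are never the same object.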