Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::LlamaModelConfig Struct Reference
export

Deployment configuration for Llama language models.

Public Member Functions

 LlamaModelConfig ()=default
 Default constructor.
 LlamaModelConfig (dim_t context_length)
 Construct with a required context length.
std::string toString () const
 Produce a human-readable summary of the Llama model configuration.
Public Member Functions inherited from Mila::Dnn::LanguageModelConfig< LlamaModelConfig >
 LanguageModelConfig ()=default
std::string baseToString () const
 Produce the base fields portion of a toString() summary.
dim_t getContextLength () const noexcept
KvCacheCompression getKvCacheCompression () const noexcept
WeightQuantization getWeightQuantization () const noexcept
LlamaModelConfig withContextLength (dim_t context_length)
 Set the maximum sequence length.
LlamaModelConfig withFP4Quantization ()
 FP4 quantization: FP4 weights, FP8 KV cache.
LlamaModelConfig withFP8Quantization ()
 FP8 quantization: FP8 weights, FP8 KV cache.
LlamaModelConfig withFullPrecision ()
 Full precision: BF16 weights, BF16 KV cache.
LlamaModelConfig withKvCacheCompression (KvCacheCompression kv)
 Set the KV cache compression mode independently.
LlamaModelConfig withWeightQuantization (WeightQuantization wq)
 Set the weight quantization mode independently.

Additional Inherited Members

Protected Attributes inherited from Mila::Dnn::LanguageModelConfig< LlamaModelConfig >
dim_t context_length_
KvCacheCompression kv_cache_compression_
WeightQuantization weight_quantization_

Detailed Description

Deployment configuration for Llama language models.

Inherits all fluent setters and accessors from LanguageModelConfig<LlamaModelConfig>. Because the inherited setters return LlamaModelConfig rather than the base type, call chains can mix base and derived methods without casting.
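
The pattern described above can be illustrated with a minimal, self-contained sketch. It does not use Mila itself: `FluentConfigSketch`, `LlamaConfigSketch`, `makeFp4Config`, the enumerator names, the BF16 defaults, the `std::size_t` alias for `dim_t`, and the choice to return the derived type by reference are all assumptions made for illustration. Only the setter names and the FP4/FP8 pairing come from this page.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

using dim_t = std::size_t;  // assumption: stand-in for Mila's dim_t

// Enumerator names are assumptions for illustration only.
enum class WeightQuantization { BF16, FP8, FP4 };
enum class KvCacheCompression { BF16, FP8 };

// CRTP base: each setter returns Derived&, so a chain keeps the derived type.
template <typename Derived>
struct FluentConfigSketch {
    Derived& withContextLength(dim_t n) { context_length_ = n; return self(); }
    Derived& withWeightQuantization(WeightQuantization wq) { weight_quantization_ = wq; return self(); }
    Derived& withKvCacheCompression(KvCacheCompression kv) { kv_cache_compression_ = kv; return self(); }

    // Preset following the documented pairing: FP4 weights, FP8 KV cache.
    Derived& withFP4Quantization() {
        weight_quantization_ = WeightQuantization::FP4;
        kv_cache_compression_ = KvCacheCompression::FP8;
        return self();
    }

    dim_t getContextLength() const noexcept { return context_length_; }
    WeightQuantization getWeightQuantization() const noexcept { return weight_quantization_; }
    KvCacheCompression getKvCacheCompression() const noexcept { return kv_cache_compression_; }

protected:
    dim_t context_length_ = 0;
    // Assumed defaults; the real defaults are not stated on this page.
    WeightQuantization weight_quantization_ = WeightQuantization::BF16;
    KvCacheCompression kv_cache_compression_ = KvCacheCompression::BF16;

private:
    Derived& self() { return static_cast<Derived&>(*this); }
};

// Hypothetical derived config; it adds nothing beyond the inherited interface.
struct LlamaConfigSketch : FluentConfigSketch<LlamaConfigSketch> {};

// The inherited setter yields the derived type, so no cast is needed mid-chain.
static_assert(
    std::is_same<decltype(std::declval<LlamaConfigSketch&>().withContextLength(1)),
                 LlamaConfigSketch&>::value,
    "base setters must return the derived type");

// Builds a config through one chained expression.
LlamaConfigSketch makeFp4Config(dim_t context_length) {
    LlamaConfigSketch cfg;
    cfg.withContextLength(context_length).withFP4Quantization();
    return cfg;
}
```

Returning `Derived&` from the base is what lets a chain continue with a method declared only on the derived struct. A base that returned its own type would force a cast at that point.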

Constructor & Destructor Documentation

◆ LlamaModelConfig() [1/2]

Mila::Dnn::LlamaModelConfig::LlamaModelConfig ( )
default

Default constructor.

The context length defaults to zero. Call withContextLength() before passing the configuration to fromPretrained(), or use the explicit constructor instead.

◆ LlamaModelConfig() [2/2]

Mila::Dnn::LlamaModelConfig::LlamaModelConfig ( dim_t context_length)
inline explicit

Construct with a required context length.

Parameters
context_length  Maximum sequence length in tokens. Must be > 0.
Exceptions
std::invalid_argument  if context_length is zero.
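
The documented contract (an explicit constructor that rejects a zero length) can be sketched as follows. This is a stand-in, not Mila's implementation: `ContextConfigSketch`, `rejectsContextLength`, the error message, and the `std::size_t` alias for `dim_t` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

using dim_t = std::size_t;  // assumption: Mila's dim_t may be a different integer type

// Stand-in mirroring the two documented constructors.
struct ContextConfigSketch {
    ContextConfigSketch() = default;  // leaves the context length at zero

    // explicit prevents an accidental implicit conversion from an integer.
    explicit ContextConfigSketch(dim_t context_length)
        : context_length_(context_length) {
        if (context_length == 0) {
            throw std::invalid_argument("context_length must be > 0");
        }
    }

    dim_t getContextLength() const noexcept { return context_length_; }

private:
    dim_t context_length_ = 0;
};

// Returns true when construction with the given length throws std::invalid_argument.
bool rejectsContextLength(dim_t context_length) {
    try {
        ContextConfigSketch cfg(context_length);
        (void)cfg;
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}

// Small self-check showing both outcomes.
void demoConstructorValidation() {
    assert(rejectsContextLength(0));
    assert(!rejectsContextLength(4096));
}
```

Validating in the constructor means a configuration built this way never carries a zero length. A default-constructed one still does, until withContextLength() is called.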

Member Function Documentation

◆ toString()

std::string Mila::Dnn::LlamaModelConfig::toString ( ) const
inline

Produce a human-readable summary of the Llama model configuration.
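
One plausible way a derived toString() could build on the inherited baseToString() is sketched below. The output layout, the header text, and the names `BaseSummarySketch` and `LlamaSummarySketch` are assumptions. This page does not specify the actual summary format.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Stand-in base that formats only the shared fields (assumed layout).
template <typename Derived>
struct BaseSummarySketch {
    std::string baseToString() const {
        std::ostringstream out;
        out << "  context_length: " << context_length_ << "\n";
        return out.str();
    }

protected:
    std::size_t context_length_ = 0;
};

// Hypothetical derived config: prepends a model-specific header, then
// appends the base fields.
struct LlamaSummarySketch : BaseSummarySketch<LlamaSummarySketch> {
    explicit LlamaSummarySketch(std::size_t context_length) {
        context_length_ = context_length;
    }

    std::string toString() const {
        return std::string("LlamaModelConfig:\n") + baseToString();
    }
};

// Small self-check of the assumed layout.
void demoToString() {
    assert(LlamaSummarySketch(4096).toString() ==
           "LlamaModelConfig:\n  context_length: 4096\n");
}
```

Keeping the shared fields in baseToString() lets each model-specific config add its own header or extra fields without duplicating the base formatting.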

The documentation for this struct was generated from the following file: