Mila 0.13.48
Deep Neural Network Library
Dnn.LanguageModelConfig Module Reference

Classes

struct  Mila::Dnn::LanguageModelConfig< TDerived >
 CRTP base configuration for all deployable Mila language models.
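
As an illustration of the CRTP pattern named above, here is a minimal sketch of a configuration base whose setters return the derived type, so chained calls do not lose derived-only setters. It does not reproduce the Mila API: the `sketch` namespace, `ConfigBase`, `ToyModelConfig`, and every field and setter name are hypothetical and chosen only for this example.

```cpp
#include <cstddef>

namespace sketch {

// Hypothetical CRTP base (not Mila::Dnn::LanguageModelConfig itself).
// Each setter returns TDerived&, so a chain started on the base
// can continue with setters that exist only on the derived config.
template <typename TDerived>
struct ConfigBase {
    std::size_t vocab_size = 0;   // hypothetical field
    std::size_t max_seq_len = 0;  // hypothetical field

    TDerived& withVocabSize(std::size_t n) { vocab_size = n; return self(); }
    TDerived& withMaxSeqLen(std::size_t n) { max_seq_len = n; return self(); }

protected:
    // Safe downcast: TDerived is required to inherit from ConfigBase<TDerived>.
    TDerived& self() { return static_cast<TDerived&>(*this); }
};

// Hypothetical model-specific config adding one extra option.
struct ToyModelConfig : ConfigBase<ToyModelConfig> {
    std::size_t num_layers = 0;

    ToyModelConfig& withNumLayers(std::size_t n) { num_layers = n; return *this; }
};

}  // namespace sketch
```

With this shape, `cfg.withVocabSize(32000).withNumLayers(12).withMaxSeqLen(2048)` compiles because each base setter hands back a `ToyModelConfig&` rather than a reference to the base.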

Enumerations

enum class  Mila::Dnn::KvCacheCompression { None , FP8 }
 KV cache storage and compression strategy for GroupedQueryAttention.
enum class  Mila::Dnn::WeightQuantization { None , FP8 , FP4 }
 Weight storage and matmul strategy for Linear components.
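
To make the effect of these storage options concrete, the following sketch estimates the nominal bits per stored element for each enumerator. The enumerator names mirror the listing above, but the enums are redeclared in a hypothetical `sketch` namespace, and the helpers `weightBits`, `kvCacheBits`, and `weightBytes` are assumptions made for this example, not Mila functions. The estimate assumes `None` keeps the caller-supplied base precision and ignores any scale factors or metadata a real quantization scheme may store.

```cpp
#include <cstddef>

namespace sketch {

// Local mirrors of the enumerators listed above (not the Mila declarations).
enum class KvCacheCompression { None, FP8 };
enum class WeightQuantization { None, FP8, FP4 };

// Nominal bits per stored weight. base_bits is the unquantized element
// width supplied by the caller (for example 16 for a half-precision model).
constexpr std::size_t weightBits(WeightQuantization q, std::size_t base_bits) {
    switch (q) {
        case WeightQuantization::None: return base_bits;
        case WeightQuantization::FP8:  return 8;
        case WeightQuantization::FP4:  return 4;
    }
    return base_bits;  // unreachable for valid enumerators
}

// Nominal bits per stored KV cache element.
constexpr std::size_t kvCacheBits(KvCacheCompression c, std::size_t base_bits) {
    switch (c) {
        case KvCacheCompression::None: return base_bits;
        case KvCacheCompression::FP8:  return 8;
    }
    return base_bits;  // unreachable for valid enumerators
}

// Payload bytes for `count` weights, rounded up to a whole byte.
constexpr std::size_t weightBytes(std::size_t count, WeightQuantization q,
                                  std::size_t base_bits) {
    return (count * weightBits(q, base_bits) + 7) / 8;
}

}  // namespace sketch
```

Under these assumptions, 1000 weights with a 16-bit base occupy 2000 bytes unquantized, 1000 bytes at FP8, and 500 bytes at FP4.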

Files

file  /__w/Mila/Mila/Mila/Src/Dnn/Core/LanguageModelConfig.ixx
 CRTP base configuration for all deployable Mila language models.