Mila 0.13.48
Deep Neural Network Library
Concepts | |
| concept | Mila::Dnn::Quant::KvCache::QuantKvPolicy |
| Concept for quantization-based KV cache compression policies. | |
Classes | |
| struct | Mila::Dnn::Quant::KvCache::PerChannelKvFp8< TStorage > |
| Symmetric per-head per-token FP8 KV cache compression policy. | |
Files | |
| file | /__w/Mila/Mila/Mila/Src/Dnn/Quantization/KvCache/QuantPolicy.ixx |
| Quantization-specific KV cache compression policies. | |