Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::Quant::KvCache Namespace Reference

Classes

struct  NoKvCompression
struct  PerChannelKvFp8
 Symmetric per-head per-token FP8 KV cache compression policy.
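
Illustration (not part of the Mila API): the brief description of PerChannelKvFp8 mentions symmetric per-head per-token scaling. The sketch below shows what such a scheme commonly looks like: one scale per (head, token) row, chosen so the largest magnitude maps onto the FP8 maximum, with no zero point. The names `symmetric_scale`, `scale_to_fp8_range`, and `kFp8E4M3Max` are hypothetical, and the E4M3 format (largest finite value 448) is an assumption; this page does not say which FP8 variant the policy uses.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Assumption: FP8 E4M3, whose largest finite magnitude is 448.
constexpr float kFp8E4M3Max = 448.0f;

// Hypothetical helper: compute one symmetric scale for a single
// (head, token) row of K or V values.
inline float symmetric_scale(const std::vector<float>& row) {
    float max_abs = 0.0f;
    for (float v : row) {
        max_abs = std::max(max_abs, std::fabs(v));
    }
    // All-zero rows get scale 1 so dequantization stays well defined.
    return max_abs > 0.0f ? max_abs / kFp8E4M3Max : 1.0f;
}

// Hypothetical helper: map a row into the FP8 value range and clamp.
// Rounding to an actual 8-bit encoding is intentionally left out.
inline std::vector<float> scale_to_fp8_range(const std::vector<float>& row,
                                             float scale) {
    std::vector<float> out;
    out.reserve(row.size());
    for (float v : row) {
        out.push_back(std::clamp(v / scale, -kFp8E4M3Max, kFp8E4M3Max));
    }
    return out;
}
```

Dequantization in this scheme is a single multiply by the stored scale, which is the usual reason for preferring a symmetric layout over one that also stores a zero point.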

Concepts

concept  KvCachePolicy
concept  QuantKvPolicy
 Concept for quantization-based KV cache compression policies.