Mila 0.13.48
Deep Neural Network Library
Grouped-Query Attention module (concatenated QKV input).
#include <memory>
#include <vector>
#include <string>
#include <sstream>
#include <type_traits>
#include <stdexcept>
#include <cstdint>
#include <optional>
import Dnn.Quantization.KvCache.Policy;
import Serialization.Mode;
import Serialization.ModelArchive;
import Compute.IKvInference;
import Compute.MemoryResource;
import Compute.GqaState;
import Dnn.Components.GqaConfig;
import Dnn.Component;
import Dnn.Tensor;
import Compute.ExecutionContext;
import Compute.ExecutionContextFactory;
import Dnn.ComponentType;
import Dnn.ITensor;
import Dnn.TensorTypes;
import Compute.IKvCacheLifecycle;
import Compute.DeviceTypeTraits;
import Dnn.TensorDataTypeTraits;
import Compute.CpuMemoryResource;
import Compute.OperationTraits;
import Compute.Device;
import Dnn.TensorDataType;
import Compute.DeviceId;
import Compute.DeviceType;

Classes
class Mila::Dnn::GroupedQueryAttention< TDeviceType, TComputePrecision, TKvPolicy >
    Grouped-Query Attention module that accepts concatenated QKV input.

Namespaces
namespace Mila
    Mila main API namespace.
namespace Mila::Dnn
Grouped-Query Attention module (concatenated QKV input).
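
To make "concatenated QKV input" concrete, the following is a minimal sketch of how such an input is commonly laid out for grouped-query attention. It is not taken from Mila's source: the `[Q | K | V]` ordering along the last dimension, the `QkvLayout` struct, and both helper functions are hypothetical names introduced here for illustration, and the actual layout used by `GroupedQueryAttention` may differ.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical description of one row of a concatenated QKV tensor.
// Assumes the last dimension is packed as [Q | K | V]; this is an
// assumption for illustration, not Mila's documented layout.
struct QkvLayout {
    std::size_t q_width;   // num_heads * head_dim
    std::size_t kv_width;  // num_kv_heads * head_dim (same for K and V)
    std::size_t total;     // q_width + 2 * kv_width
    std::size_t q_offset;  // start of Q within a row
    std::size_t k_offset;  // start of K within a row
    std::size_t v_offset;  // start of V within a row
};

// Computes the widths and offsets of the Q, K and V slices.
// Throws if the head counts are zero or not evenly groupable.
inline QkvLayout make_qkv_layout(std::size_t num_heads,
                                 std::size_t num_kv_heads,
                                 std::size_t head_dim) {
    if (num_heads == 0 || num_kv_heads == 0 || head_dim == 0) {
        throw std::invalid_argument("head counts and head_dim must be non-zero");
    }
    if (num_heads % num_kv_heads != 0) {
        throw std::invalid_argument("num_heads must be divisible by num_kv_heads");
    }
    QkvLayout layout{};
    layout.q_width  = num_heads * head_dim;
    layout.kv_width = num_kv_heads * head_dim;
    layout.total    = layout.q_width + 2 * layout.kv_width;
    layout.q_offset = 0;
    layout.k_offset = layout.q_width;
    layout.v_offset = layout.q_width + layout.kv_width;
    return layout;
}

// Maps a query head index to the key/value head its group shares.
// Assumes num_heads is a non-zero multiple of num_kv_heads.
inline std::size_t kv_head_for_query_head(std::size_t q_head,
                                          std::size_t num_heads,
                                          std::size_t num_kv_heads) {
    const std::size_t group_size = num_heads / num_kv_heads;
    return q_head / group_size;
}
```

Under these assumptions, a model with 32 query heads, 8 key/value heads and a head dimension of 128 would expect a last dimension of 6144 elements, with each group of four consecutive query heads attending against the same key/value head.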