Mila 0.13.48
Deep Neural Network Library
GroupedQueryAttention.ixx File Reference

Grouped-Query Attention module (concatenated QKV input).

#include <memory>
#include <vector>
#include <string>
#include <sstream>
#include <type_traits>
#include <stdexcept>
#include <cstdint>
#include <optional>
import Dnn.Quantization.KvCache.Policy;
import Serialization.Mode;
import Serialization.ModelArchive;
import Compute.IKvInference;
import Compute.MemoryResource;
import Compute.GqaState;
import Dnn.Components.GqaConfig;
import Dnn.Component;
import Dnn.Tensor;
import Compute.ExecutionContext;
import Compute.ExecutionContextFactory;
import Dnn.ComponentType;
import Dnn.ITensor;
import Dnn.TensorTypes;
import Compute.IKvCacheLifecycle;
import Compute.DeviceTypeTraits;
import Dnn.TensorDataTypeTraits;
import Compute.CpuMemoryResource;
import Compute.OperationTraits;
import Compute.Device;
import Dnn.TensorDataType;
import Compute.DeviceId;
import Compute.DeviceType;

Classes

class  Mila::Dnn::GroupedQueryAttention< TDeviceType, TComputePrecision, TKvPolicy >
 Grouped-Query Attention module that accepts concatenated QKV input.

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn

Detailed Description

Grouped-Query Attention module (concatenated QKV input).
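
The sketch below illustrates what a "concatenated QKV input" can look like for grouped-query attention in general. It is not taken from Mila: the names GqaShape, QkvLayout, make_qkv_layout and kv_head_for_query_head are hypothetical, and the [Q | K | V] ordering along the feature axis is an assumption that this page does not confirm. It only shows the usual arithmetic: queries use num_heads heads, keys and values share a smaller number of heads, and each group of consecutive query heads reads the same key/value head.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical shape descriptor for illustration only (not Mila's GqaConfig).
struct GqaShape {
    std::size_t num_heads;     // number of query heads
    std::size_t num_kv_heads;  // number of shared key/value heads
    std::size_t head_dim;      // features per head
};

// Offsets of Q, K and V inside one token's concatenated feature vector,
// assuming the order [Q | K | V] (an assumption, not confirmed by the source).
struct QkvLayout {
    std::size_t q_offset;
    std::size_t k_offset;
    std::size_t v_offset;
    std::size_t total_width;
};

// Compute the per-token layout of the concatenated QKV vector.
inline QkvLayout make_qkv_layout(const GqaShape& s) {
    if (s.num_heads == 0 || s.num_kv_heads == 0 || s.head_dim == 0) {
        throw std::invalid_argument("all GQA dimensions must be non-zero");
    }
    if (s.num_heads % s.num_kv_heads != 0) {
        throw std::invalid_argument("num_heads must be a multiple of num_kv_heads");
    }
    const std::size_t q_width  = s.num_heads * s.head_dim;     // all query heads
    const std::size_t kv_width = s.num_kv_heads * s.head_dim;  // K (or V) heads
    return QkvLayout{0, q_width, q_width + kv_width, q_width + 2 * kv_width};
}

// Map a query head to the key/value head shared by its group.
inline std::size_t kv_head_for_query_head(const GqaShape& s, std::size_t query_head) {
    assert(query_head < s.num_heads);
    const std::size_t group_size = s.num_heads / s.num_kv_heads;
    return query_head / group_size;
}
```

Under these assumptions, a shape of 32 query heads, 8 key/value heads and a head dimension of 128 gives a per-token width of 6144 features (4096 for Q, 1024 each for K and V), and query heads 0 through 3 all read key/value head 0.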