Mila 0.13.48
Deep Neural Network Library
Compute.CudaGqaOp Module Reference

Classes

class  Mila::Dnn::Compute::Cuda::Gqa::CudaGqaOp< TPrecision >
 CUDA Grouped-Query Attention operation.
class  Mila::Dnn::Compute::Cuda::Gqa::CudaGroupedQueryAttentionOpRegistrar

Files

file  /__w/Mila/Mila/Mila/Src/Dnn/Compute/Devices/Cuda/Operations/Attention/GQA/CudaGqaOp.ixx
 CUDA Grouped-Query Attention (GQA) operation using cuBLASLt.
file  /__w/Mila/Mila/Mila/Src/Dnn/Compute/Devices/Cuda/Operations/Attention/GQA/CudaGqa.Dispatch.ixx
file  /__w/Mila/Mila/Mila/Src/Dnn/Compute/Devices/Cuda/Operations/Attention/GQA/CudaGqa.Plans.ixx
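
For orientation, the defining idea of grouped-query attention is that several query heads share one key/value head. The sketch below shows only that head mapping on the host side. It is not taken from CudaGqaOp: the function name and parameters are hypothetical, and it assumes the number of query heads is an exact multiple of the number of key/value heads.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the Mila API.
// Returns the index of the key/value head that a given query head reads from.
// Assumption: num_q_heads is an exact multiple of num_kv_heads, so the query
// heads split into num_kv_heads contiguous groups of equal size.
inline std::size_t kv_head_for_query_head(std::size_t q_head,
                                          std::size_t num_q_heads,
                                          std::size_t num_kv_heads)
{
    assert(num_kv_heads > 0);
    assert(num_q_heads % num_kv_heads == 0);
    assert(q_head < num_q_heads);

    // Number of query heads that share a single key/value head.
    const std::size_t group_size = num_q_heads / num_kv_heads;

    // Query heads [0, group_size) use KV head 0, the next group uses 1, etc.
    return q_head / group_size;
}
```

With 8 query heads and 2 key/value heads, query heads 0 to 3 map to key/value head 0 and query heads 4 to 7 map to key/value head 1. Setting num_kv_heads equal to num_q_heads gives ordinary multi-head attention, and setting it to 1 gives multi-query attention.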