Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::Compute::GqaState Struct Reference [export]

Non-owning pointers to shared transient GQA scratch buffers.

Public Attributes

ITensor* att { nullptr }
ITensor* att_decode { nullptr }
ITensor* preatt { nullptr }
ITensor* preatt_decode { nullptr }
ITensor* q_permute { nullptr }
ITensor* v_out { nullptr }
ITensor* v_out_decode { nullptr }

Detailed Description

Non-owning pointers to shared transient GQA scratch buffers.

All slots are nullable. CudaGqaOp::setState() accepts a partially populated state: only non-null slots replace the previously wired pointers, and null slots leave the existing wiring unchanged.
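The partial-update rule can be sketched as follows. This is a minimal illustration, not Mila's implementation: `ITensor` here is a local stand-in for the library interface, and `DummyTensor`, `mergeState`, and `mergeStateExample` are hypothetical names introduced only for this example. The body of CudaGqaOp::setState() is not shown on this page.

```cpp
#include <cassert>

// Stand-in for Mila's ITensor interface (assumption: the real type is not shown here).
struct ITensor { virtual ~ITensor() = default; };
struct DummyTensor : ITensor {};

// Same slots as the documented struct, all defaulting to nullptr.
struct GqaState {
    ITensor* att{ nullptr };
    ITensor* att_decode{ nullptr };
    ITensor* preatt{ nullptr };
    ITensor* preatt_decode{ nullptr };
    ITensor* q_permute{ nullptr };
    ITensor* v_out{ nullptr };
    ITensor* v_out_decode{ nullptr };
};

// Hypothetical helper illustrating the merge rule: a non-null slot in
// `update` overwrites the wired pointer; a null slot leaves it untouched.
inline void mergeState(GqaState& wired, const GqaState& update) {
    auto take = [](ITensor*& dst, ITensor* src) { if (src) dst = src; };
    take(wired.att, update.att);
    take(wired.att_decode, update.att_decode);
    take(wired.preatt, update.preatt);
    take(wired.preatt_decode, update.preatt_decode);
    take(wired.q_permute, update.q_permute);
    take(wired.v_out, update.v_out);
    take(wired.v_out_decode, update.v_out_decode);
}

// Usage: wire two prefill buffers first, then add a decode buffer later.
inline void mergeStateExample() {
    DummyTensor att, preatt, att_decode;

    GqaState wired;
    GqaState prefill;
    prefill.att = &att;
    prefill.preatt = &preatt;
    mergeState(wired, prefill);

    GqaState decode;
    decode.att_decode = &att_decode;   // prefill slots stay null in this update
    mergeState(wired, decode);

    assert(wired.att == &att);               // kept from the first call
    assert(wired.att_decode == &att_decode); // added by the second call
    assert(wired.v_out == nullptr);          // never wired
}
```

Under this rule a slot cannot be reset to null through a partial update, since a null entry is read as "no change".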

Prefill slots:
  • q_permute [B, NH, chunk, HS]
  • preatt [B, NH, chunk, T]
  • att [B, NH, chunk, T]
  • v_out [B, NH, chunk, HS]

Decode slots:
  • preatt_decode [B, NH, 1, T]
  • att_decode [B, NH, 1, T]
  • v_out_decode [B, NH, 1, HS]
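To size the scratch buffers, the shapes listed above can be turned into element counts. The sketch below is illustrative only: `GqaDims`, `Shape`, `numel`, and the shape helpers are hypothetical names, not part of Mila. The dimension symbols are used exactly as written above; this page does not define them, so reading B as batch size, NH as head count, HS as head size, and T as sequence length is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical container for the dimension symbols used in the slot shapes.
struct GqaDims {
    std::size_t B;
    std::size_t NH;
    std::size_t chunk;
    std::size_t T;
    std::size_t HS;
};

using Shape = std::array<std::size_t, 4>;

// Prefill shapes, one per documented slot.
inline Shape qPermuteShape(const GqaDims& d) { return { d.B, d.NH, d.chunk, d.HS }; }
inline Shape preattShape(const GqaDims& d)   { return { d.B, d.NH, d.chunk, d.T }; }
inline Shape attShape(const GqaDims& d)      { return { d.B, d.NH, d.chunk, d.T }; }
inline Shape vOutShape(const GqaDims& d)     { return { d.B, d.NH, d.chunk, d.HS }; }

// Decode shapes: same layout with the chunk dimension fixed to 1.
inline Shape preattDecodeShape(const GqaDims& d) { return { d.B, d.NH, 1, d.T }; }
inline Shape attDecodeShape(const GqaDims& d)    { return { d.B, d.NH, 1, d.T }; }
inline Shape vOutDecodeShape(const GqaDims& d)   { return { d.B, d.NH, 1, d.HS }; }

// Number of elements in a 4-D shape.
inline std::size_t numel(const Shape& s) { return s[0] * s[1] * s[2] * s[3]; }

// Usage: the decode buffers are `chunk` times smaller than their prefill twins.
inline void shapeExample() {
    const GqaDims d{ 2, 8, 16, 128, 64 };
    assert(numel(preattShape(d)) == d.chunk * numel(preattDecodeShape(d)));
    assert(numel(vOutShape(d)) == d.chunk * numel(vOutDecodeShape(d)));
}
```

Multiplying an element count by the element size of the tensor's data type gives the byte size to allocate for that slot.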

Ownership: the caller retains ownership and must ensure the tensors outlive all GQA layers that reference them.
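One way a caller might satisfy this lifetime rule is to keep the buffers in a single owning object and hand out GqaState views of it. The sketch below assumes a stand-in `ITensor` and uses hypothetical names (`ScratchTensor`, `GqaScratchPool`, `prefillState`, `decodeState`); it is not how Mila itself manages these buffers.

```cpp
#include <cassert>
#include <memory>

// Stand-in for Mila's ITensor interface (assumption: the real type is not shown here).
struct ITensor { virtual ~ITensor() = default; };
struct ScratchTensor : ITensor {};

struct GqaState {
    ITensor* att{ nullptr };
    ITensor* att_decode{ nullptr };
    ITensor* preatt{ nullptr };
    ITensor* preatt_decode{ nullptr };
    ITensor* q_permute{ nullptr };
    ITensor* v_out{ nullptr };
    ITensor* v_out_decode{ nullptr };
};

// Hypothetical owner: holds the tensors and exposes non-owning views.
// It must be destroyed only after every layer wired to its views.
class GqaScratchPool {
public:
    GqaScratchPool()
        : q_permute_(std::make_unique<ScratchTensor>()),
          preatt_(std::make_unique<ScratchTensor>()),
          att_(std::make_unique<ScratchTensor>()),
          v_out_(std::make_unique<ScratchTensor>()),
          preatt_decode_(std::make_unique<ScratchTensor>()),
          att_decode_(std::make_unique<ScratchTensor>()),
          v_out_decode_(std::make_unique<ScratchTensor>()) {}

    // View with only the prefill slots populated.
    GqaState prefillState() const {
        GqaState s;
        s.q_permute = q_permute_.get();
        s.preatt = preatt_.get();
        s.att = att_.get();
        s.v_out = v_out_.get();
        return s;
    }

    // View with only the decode slots populated.
    GqaState decodeState() const {
        GqaState s;
        s.preatt_decode = preatt_decode_.get();
        s.att_decode = att_decode_.get();
        s.v_out_decode = v_out_decode_.get();
        return s;
    }

private:
    std::unique_ptr<ITensor> q_permute_, preatt_, att_, v_out_;
    std::unique_ptr<ITensor> preatt_decode_, att_decode_, v_out_decode_;
};

// Usage: views never own anything, so copying or dropping them is free.
inline void poolExample() {
    GqaScratchPool pool;                 // declared first, destroyed last
    GqaState prefill = pool.prefillState();
    GqaState decode = pool.decodeState();
    assert(prefill.q_permute != nullptr && prefill.att_decode == nullptr);
    assert(decode.att_decode != nullptr && decode.q_permute == nullptr);
}
```

Declaring the owner before the layers in the same scope means C++ destroys it after them, which keeps the raw pointers valid for the layers' whole lifetime.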

Member Data Documentation

◆ att

ITensor* Mila::Dnn::Compute::GqaState::att { nullptr }

◆ att_decode

ITensor* Mila::Dnn::Compute::GqaState::att_decode { nullptr }

◆ preatt

ITensor* Mila::Dnn::Compute::GqaState::preatt { nullptr }

◆ preatt_decode

ITensor* Mila::Dnn::Compute::GqaState::preatt_decode { nullptr }

◆ q_permute

ITensor* Mila::Dnn::Compute::GqaState::q_permute { nullptr }

◆ v_out

ITensor* Mila::Dnn::Compute::GqaState::v_out { nullptr }

◆ v_out_decode

ITensor* Mila::Dnn::Compute::GqaState::v_out_decode { nullptr }

The documentation for this struct was generated from the following file:
  • /__w/Mila/Mila/Mila/Src/Dnn/Compute/Operations/GqaState.ixx