Mila 0.13.48
Deep Neural Network Library
Compute.CublasLtPlanCache Module Reference

Classes

class  Mila::Dnn::Compute::Cuda::CublasLtPlanCache< TPlan >
 Generic plan cache keyed on batch size bucket.
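
As an illustration of what a plan cache keyed on a batch size bucket could look like, the sketch below stores one plan per bucket and builds it lazily on first use. It is not the Mila implementation: the class name SketchPlanCache, the getOrCreate method, and the factory callback are hypothetical, and the real CublasLtPlanCache< TPlan > interface may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical sketch only; names and interface are assumptions,
// not the actual CublasLtPlanCache API.
template <typename TPlan>
class SketchPlanCache {
public:
    // Returns the plan cached for `bucket`. If none exists yet,
    // calls `make_plan` once and stores the result.
    TPlan& getOrCreate(int bucket, const std::function<TPlan()>& make_plan) {
        assert(static_cast<bool>(make_plan));  // a factory must be supplied
        auto it = plans_.find(bucket);
        if (it == plans_.end()) {
            it = plans_.emplace(bucket, make_plan()).first;
        }
        return it->second;
    }

    // Number of buckets that currently hold a plan.
    std::size_t size() const { return plans_.size(); }

private:
    std::unordered_map<int, TPlan> plans_;  // bucket -> cached plan
};
```

Keying on the bucket rather than the exact batch size means that nearby batch sizes share one plan, so the number of plans created stays bounded by the number of buckets.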

Functions

std::vector< int > Mila::Dnn::Compute::Cuda::computeArchitectureBuckets (int max_batch_size)
 Computes optimal bucket boundaries for cuBLASLt plan caching based on CUDA device architecture.
int Mila::Dnn::Compute::Cuda::getBucket (const std::vector< int > &buckets, int batch_size)
 Fast O(log N) bucket lookup.
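
The two functions above can be pictured with the following minimal sketch. It assumes sorted, ascending bucket boundaries and a binary search over them, which gives the O(log N) lookup cost in the number of buckets. The helper names, the power-of-two boundary scheme, and the choice to return a bucket index (clamped to the last bucket) are all assumptions made for illustration; the actual computeArchitectureBuckets derives boundaries from the CUDA device architecture, which this sketch does not attempt.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for bucket computation: power-of-two boundaries
// up to max_batch_size, with max_batch_size itself as the final boundary.
// This is an assumption, not Mila's architecture-aware scheme.
std::vector<int> sketchPowerOfTwoBuckets(int max_batch_size) {
    assert(max_batch_size >= 1);
    std::vector<int> buckets;
    for (int b = 1; b < max_batch_size;) {
        buckets.push_back(b);
        if (b > max_batch_size / 2) break;  // avoid overflow when doubling
        b *= 2;
    }
    buckets.push_back(max_batch_size);
    return buckets;
}

// Hypothetical lookup: index of the first boundary >= batch_size,
// clamped to the last bucket. Requires `buckets` sorted ascending.
int sketchGetBucketIndex(const std::vector<int>& buckets, int batch_size) {
    assert(!buckets.empty());
    auto it = std::lower_bound(buckets.begin(), buckets.end(), batch_size);
    if (it == buckets.end()) --it;  // oversized batches use the last bucket
    return static_cast<int>(it - buckets.begin());
}
```

With these assumptions, a batch size is rounded up to the nearest boundary, so a plan built for a bucket is large enough for every batch size that maps to it.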

Files

file  /__w/Mila/Mila/Mila/Src/Dnn/Compute/Devices/Cuda/Operations/Common/CublasLtPlanCache.ixx