Mila 0.13.48
Deep Neural Network Library
CublasLtMatMulBias.ixx File Reference

CUDA-accelerated matrix multiplication with bias addition using cuBLASLt.

#include <cublasLt.h>
#include <cuda_fp16.h>
#include <cuda_bf16.h>
#include <cuda_fp8.h>
#include <type_traits>
import Logging.Logger;
import CublasLt.Error;
import Dnn.TensorDataTypeTraits;
import Dnn.ITensor;

Detailed Description

CUDA-accelerated matrix multiplication with bias addition using cuBLASLt.

This module provides high-performance matrix multiplication operations optimized for neural network linear layers. It leverages NVIDIA's cuBLASLt library to efficiently execute matrix operations on GPU tensor cores with configurable precision modes.

Key features:

  • Mixed-precision computation (FP32, FP16, BF16, FP8)
  • Optimized matrix multiplication with fused bias addition
  • Support for adaptive precision based on computation policy
  • Automatic algorithm selection via cuBLASLt heuristics
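As a point of reference for the fused operation listed above, the following is a minimal CPU sketch of a linear-layer product with bias, Y = X * W^T + b. It is not the module's implementation and does not call cuBLASLt. The function name `matmul_bias_reference`, the row-major layout, and the [out_features x in_features] weight shape are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference (CPU) version of Y = X * W^T + b, with the bias added in the same
// pass that produces each output element.
// Assumed layout (not taken from the module): row-major storage,
//   X    is [rows x in_features]
//   W    is [out_features x in_features]
//   bias has out_features entries
//   Y    is [rows x out_features]
std::vector<float> matmul_bias_reference(const std::vector<float>& x,
                                         const std::vector<float>& w,
                                         const std::vector<float>& bias,
                                         std::size_t rows,
                                         std::size_t in_features,
                                         std::size_t out_features) {
    assert(x.size() == rows * in_features);
    assert(w.size() == out_features * in_features);
    assert(bias.size() == out_features);

    std::vector<float> y(rows * out_features);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t o = 0; o < out_features; ++o) {
            // Seed the accumulator with the bias so no second pass is needed.
            float acc = bias[o];
            for (std::size_t k = 0; k < in_features; ++k) {
                acc += x[r * in_features + k] * w[o * in_features + k];
            }
            y[r * out_features + o] = acc;
        }
    }
    return y;
}
```

A reference like this can be useful for checking GPU results on small inputs, with a tolerance chosen to suit the reduced-precision type under test.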

The implementation selects a compute precision for each supported data type according to the provided ComputePrecision policy, which lets callers trade accuracy against performance.
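The sketch below illustrates one way a precision policy could map a storage type to a cuBLASLt compute type. The enums `StorageType` and `PrecisionPolicy`, their members, and the function `choose_compute_type` are hypothetical stand-ins and do not reflect the module's actual ComputePrecision interface. The returned strings are the names of real `cublasComputeType_t` values, returned as text so the example compiles without the CUDA toolkit.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; the module's real types may differ.
enum class StorageType { FP32, FP16, BF16, FP8 };
enum class PrecisionPolicy { Accuracy, Performance };

// Returns the name of a cuBLASLt compute type that a policy might select.
// The mapping is an assumption, not the module's documented behavior.
std::string choose_compute_type(StorageType storage, PrecisionPolicy policy) {
    switch (storage) {
        case StorageType::FP32:
            // TF32 tensor-core math trades mantissa bits for throughput.
            return policy == PrecisionPolicy::Performance
                       ? "CUBLAS_COMPUTE_32F_FAST_TF32"
                       : "CUBLAS_COMPUTE_32F";
        case StorageType::FP16:
            // FP16 accumulation is faster; FP32 accumulation is more accurate.
            return policy == PrecisionPolicy::Performance
                       ? "CUBLAS_COMPUTE_16F"
                       : "CUBLAS_COMPUTE_32F";
        case StorageType::BF16:
        case StorageType::FP8:
            // Assumed here to always accumulate in FP32.
            return "CUBLAS_COMPUTE_32F";
    }
    return "CUBLAS_COMPUTE_32F";  // defensive fallback for invalid enum values
}
```

Keeping the choice in a single function like this makes the accuracy and performance consequences of each policy easy to audit in one place.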