Mila
Deep Neural Network Library
Namespace for CUDA layer normalization implementation details.
Typedefs
using BackwardFp16Func = void(*)(half *, const half *, const half *, int, cudaStream_t)
using BackwardFp32Func = void(*)(float *, const float *, const float *, int, cudaStream_t)
using ForwardFp16Func = void(*)(half *, const half *, int, cudaStream_t)
using ForwardFp32Func = void(*)(float *, const float *, int, cudaStream_t)
Namespace for CUDA layer normalization implementation details.
Namespace for CUDA fused softmax cross entropy implementation details.
Namespace for CUDA softmax implementation details.
Namespace for CUDA residual implementation details.
Namespace for CUDA matrix multiplication implementation details.
This namespace contains the implementation details for the CUDA layer normalization operation, including specialized templates for different data types (float, half).
This namespace contains the implementation details for the CUDA matrix multiplication operation, including specialized templates for different data types (float, half).
This namespace contains the implementation details for the CUDA residual operation, including specialized templates for different data types (float, half).
This namespace contains the implementation details for the CUDA softmax operation, including specialized templates for different data types (float, half).
This namespace contains the implementation details for the CUDA fused softmax cross entropy operation, including specialized templates for different data types (float, half).
using Mila::Dnn::Compute::Detail::BackwardFp16Func = void (*)(half*, const half*, const half*, int, cudaStream_t)
using Mila::Dnn::Compute::Detail::BackwardFp32Func = void (*)(float*, const float*, const float*, int, cudaStream_t)
using Mila::Dnn::Compute::Detail::ForwardFp16Func = void (*)(half*, const half*, int, cudaStream_t)
using Mila::Dnn::Compute::Detail::ForwardFp32Func = void (*)(float*, const float*, int, cudaStream_t)
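The aliases above are plain function-pointer types, so any free function with a matching signature can be stored in one and called through it. The following is a minimal host-only sketch of that pattern, not Mila's actual dispatch code. The names scale_by_two and run_forward are hypothetical, the parameter order (output, input, element count, stream) is an assumption that the documentation above does not confirm, and cudaStream_t is replaced by a stand-in so the example compiles without the CUDA toolkit.

```cpp
#include <cassert>
#include <vector>

// Stand-in so this sketch builds without CUDA. Real code would include
// <cuda_runtime.h> and use the cudaStream_t defined there.
using cudaStream_t = void*;

// Same shape as the documented ForwardFp32Func alias.
using ForwardFp32Func = void (*)(float*, const float*, int, cudaStream_t);

// Hypothetical host-side function with a matching signature.
// Assumed parameter order: output, input, element count, stream.
void scale_by_two(float* out, const float* in, int n, cudaStream_t /*stream*/) {
    for (int i = 0; i < n; ++i) {
        out[i] = 2.0f * in[i];
    }
}

// Hypothetical helper: invokes whatever function the pointer refers to
// and returns the filled output buffer.
std::vector<float> run_forward(ForwardFp32Func fn, const std::vector<float>& input) {
    assert(fn != nullptr);
    std::vector<float> output(input.size());
    // nullptr is passed for the stream because the stand-in has no real streams.
    fn(output.data(), input.data(), static_cast<int>(input.size()), nullptr);
    return output;
}
```

Calling run_forward(&scale_by_two, {1.0f, 2.5f, -3.0f}) returns a vector holding 2, 5 and -6. A half-precision variant would follow the same pattern with ForwardFp16Func, but needs the half type from the CUDA headers.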