Mila 0.13.48
Deep Neural Network Library
Structural.h File Reference

Host-callable launcher declarations for CUDA structural tensor operations.

#include <cuda_runtime.h>


Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn
namespace  Mila::Dnn::Compute
namespace  Mila::Dnn::Compute::Cuda

Functions

void Mila::Dnn::Compute::Cuda::cuda_split3_bf16 (const __nv_bfloat16 *__restrict__ src, __nv_bfloat16 *__restrict__ out0, __nv_bfloat16 *__restrict__ out1, __nv_bfloat16 *__restrict__ out2, int rows, int D0, int D1, int D2, cudaStream_t stream)
void Mila::Dnn::Compute::Cuda::cuda_split3_fp32 (const float *__restrict__ src, float *__restrict__ out_a, float *__restrict__ out_b, float *__restrict__ out_c, int src_rows, int dim_a, int dim_b, int dim_c, cudaStream_t stream)
 Vectorized 3-way last-dimension split, float32.
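
A minimal host-side sketch of what a 3-way last-dimension split computes, useful as a reference when checking the launcher's output. It is not part of Mila: the name split3_reference is hypothetical, and the layout is an assumption (src is a contiguous row-major buffer of shape [src_rows, dim_a + dim_b + dim_c], and each output is a contiguous row-major buffer of shape [src_rows, dim_x]). The header does not document the layout, alignment, or vector-width requirements of the CUDA version, so none are modeled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not part of Mila.
// Assumes src is row-major [rows, dim_a + dim_b + dim_c] and each output
// is row-major [rows, dim_x]; all buffers are contiguous.
void split3_reference(const float* src,
                      float* out_a, float* out_b, float* out_c,
                      int rows, int dim_a, int dim_b, int dim_c)
{
    assert(rows >= 0 && dim_a >= 0 && dim_b >= 0 && dim_c >= 0);

    const std::size_t a = static_cast<std::size_t>(dim_a);
    const std::size_t b = static_cast<std::size_t>(dim_b);
    const std::size_t c = static_cast<std::size_t>(dim_c);
    const std::size_t width = a + b + c;  // length of one source row

    for (std::size_t r = 0; r < static_cast<std::size_t>(rows); ++r) {
        const float* row = src + r * width;

        // The first dim_a elements of the row go to out_a,
        // the next dim_b to out_b, and the last dim_c to out_c.
        std::copy(row,         row + a,     out_a + r * a);
        std::copy(row + a,     row + a + b, out_b + r * b);
        std::copy(row + a + b, row + width, out_c + r * c);
    }
}
```

Under these assumptions, a result copied back from the device after cuda_split3_fp32 completes on its stream could be compared element by element against the three buffers this function fills.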

Detailed Description

Host-callable launcher declarations for CUDA structural tensor operations.