|
| CpuLinearOp (const LinearConfig &config) |
| Constructs a new CPU Fully Connected operation with the default device context.
|
|
| CpuLinearOp (std::shared_ptr< DeviceContext > context, const LinearConfig &config) |
| Constructs a new CPU Fully Connected operation with a specific device context.
|
|
void | backward (Tensor< float, MR > &input_grad, const std::vector< std::shared_ptr< Tensor< float, MR > > > &parameter_grads, const Tensor< float, MR > &output_grad, const Tensor< float, MR > input, const Tensor< float, MR > weight, int B, int T, int C, int OC) |
| Performs the backward pass of the Fully Connected operation.
|
|
void | forward (const Tensor< float, MR > &input, const std::vector< std::shared_ptr< Tensor< float, MR > > > &parameters, const OperationAttributes &properties, Tensor< float, MR > &output, std::vector< std::shared_ptr< Tensor< float, MR > > > &output_state) const override |
| Performs the forward pass of the Linear operation.
|
|
std::string | getName () const override |
| Gets the name of this operation.
|
|
| UnaryOperation (OperationType operation_type) |
| Constructs a UnaryOperation with the specified operation type.
|
|
| UnaryOperation (OperationType operation_type, std::shared_ptr< DeviceContext > context) |
| Constructs a UnaryOperation with the specified operation type and device context.
|
|
virtual | ~UnaryOperation ()=default |
| Virtual destructor for proper cleanup of derived classes.
|
|
virtual void | backward (const Tensor< float, MR > &grad, const std::vector< std::shared_ptr< Tensor< float, MR > > > &parameters, std::vector< std::shared_ptr< Tensor< float, MR > > > &output_grads) const |
| Executes the backward pass of a unary operation.
|
|
virtual void | backward (const Tensor< float, MR > &input, const Tensor< float, MR > &output_grad, const std::vector< std::shared_ptr< Tensor< float, MR > > > &parameters, std::vector< std::shared_ptr< Tensor< float, MR > > > &parameter_grads, Tensor< float, MR > &input_grad, const OperationAttributes &properties, const std::vector< std::shared_ptr< Tensor< float, MR > > > &output_state) const |
| Executes the comprehensive backward pass of a unary operation.
|
|
virtual void | forward (const Tensor< float, MR > &input, const std::vector< std::shared_ptr< Tensor< float, MR > > > &parameters, const OperationAttributes &properties, Tensor< float, MR > &output, std::vector< std::shared_ptr< Tensor< float, MR > > > &output_state) const=0 |
| Executes the forward pass of a unary operation.
|
|
| OperationBase (OperationType operation_type, std::shared_ptr< DeviceContext > context) |
| Constructs an OperationBase object with the specified operation type and device context.
|
|
virtual | ~OperationBase ()=default |
| Virtual destructor for the OperationBase class.
|
|
std::shared_ptr< DeviceContext > | getDeviceContext () const |
| Gets the device context associated with this operation.
|
|
DeviceType | getDeviceType () const |
| Gets the device type for this operation.
|
|
OperationType | getOperationType () const |
| Gets the operation type enumeration value.
|
|
CPU implementation of the Fully Connected operation for neural networks.
This class provides a CPU-based implementation of the Fully Connected operation, which performs a matrix multiplication between the input and a weight matrix, optionally adds a bias, and produces an output. This operation implements the standard linear layer commonly used in neural networks.
The implementation includes both a performance-optimized version with loop unrolling and a naive fallback implementation for special cases.
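The computation described above can be sketched in plain C++. This is a minimal illustration, not the library's actual code: the function names `linear_forward_naive` and `linear_forward_unrolled`, the constant `kUnroll`, the flat row-major `std::vector<float>` buffers, the `(OC x C)` weight layout, and the choice to unroll over four rows are all assumptions made for the example. The `rows` argument stands for the flattened leading dimensions (for instance `B * T`).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive reference: output[r, o] = bias[o] + sum_c input[r, c] * weight[o, c].
// Assumed layout (hypothetical): input is (rows x C), weight is (OC x C),
// output is (rows x OC), all row-major. bias may be nullptr (no bias).
void linear_forward_naive(const std::vector<float>& input,
                          const std::vector<float>& weight,
                          const float* bias,
                          std::vector<float>& output,
                          std::size_t rows, std::size_t C, std::size_t OC) {
    assert(output.size() >= rows * OC);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t o = 0; o < OC; ++o) {
            float acc = bias ? bias[o] : 0.0f;
            for (std::size_t c = 0; c < C; ++c) {
                acc += input[r * C + c] * weight[o * C + c];
            }
            output[r * OC + o] = acc;
        }
    }
}

// Number of rows processed together in the unrolled variant (assumed value).
constexpr std::size_t kUnroll = 4;

// Unrolled variant: handles kUnroll rows per step so that each weight value
// is loaded once and reused across those rows. When rows is not a multiple
// of kUnroll, it falls back to the naive version.
void linear_forward_unrolled(const std::vector<float>& input,
                             const std::vector<float>& weight,
                             const float* bias,
                             std::vector<float>& output,
                             std::size_t rows, std::size_t C, std::size_t OC) {
    if (rows % kUnroll != 0) {
        linear_forward_naive(input, weight, bias, output, rows, C, OC);
        return;
    }
    assert(output.size() >= rows * OC);
    for (std::size_t r0 = 0; r0 < rows; r0 += kUnroll) {
        for (std::size_t o = 0; o < OC; ++o) {
            float acc[kUnroll];
            for (std::size_t k = 0; k < kUnroll; ++k) {
                acc[k] = bias ? bias[o] : 0.0f;
            }
            for (std::size_t c = 0; c < C; ++c) {
                const float w = weight[o * C + c];  // reused for all kUnroll rows
                for (std::size_t k = 0; k < kUnroll; ++k) {
                    acc[k] += input[(r0 + k) * C + c] * w;
                }
            }
            for (std::size_t k = 0; k < kUnroll; ++k) {
                output[(r0 + k) * OC + o] = acc[k];
            }
        }
    }
}
```

Both functions accumulate over `c` in the same order for each output element, so under these assumptions they should produce matching results; the unrolled form only changes how many rows share each weight load.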
Template Parameters
float | The data type of the input tensor elements. |
TDataType | The data type used for computation and output (defaults to the input type). |