Mila
Deep Neural Network Library
|
The DeviceContext class manages device contexts for module and tensor computations.
Public Member Functions | |
DeviceContext (const DeviceContext &)=delete | |
Copy constructor (deleted). | |
DeviceContext (const std::string &device_name) | |
Constructor with a specific device. | |
DeviceContext (DeviceContext &&other) noexcept | |
Move constructor. | |
~DeviceContext () | |
Destructor. | |
std::pair< int, int > | getComputeCapability () const |
Gets the compute capability of the current CUDA device. | |
cublasLtHandle_t | getCublasLtHandle () |
Gets the cuBLASLt handle, initializing it if necessary. | |
std::shared_ptr< ComputeDevice > | getDevice () const |
Gets the current device. | |
int | getDeviceId () const |
Gets the ID of the current CUDA device. | |
cudaStream_t | getStream () const |
Gets the current CUDA stream. | |
bool | isCudaDevice () const |
Checks if the current device is a CUDA device. | |
bool | isDeviceType (DeviceType type) const |
Checks if the current device is of a specific type. | |
void | makeCurrent () const |
Sets the current device as active in the current thread. | |
DeviceContext & | operator= (const DeviceContext &)=delete |
Copy assignment operator (deleted). | |
DeviceContext & | operator= (DeviceContext &&other) noexcept |
Move assignment operator. | |
void | synchronize () |
Synchronizes the device, waiting for all operations to complete. | |
Private Member Functions | |
void | initializeDeviceResources () |
Initializes resources specific to the current device. | |
void | moveFrom (DeviceContext &&other) |
Moves resources from another DeviceContext. | |
void | releaseResources () |
Releases all device-specific resources. | |
void | setDevice (const std::string &device_name) |
Sets the current device by name. | |
Private Attributes | |
cublasLtHandle_t | cublasLtHandle_ = nullptr |
Handle for cuBLASLt operations. | |
std::shared_ptr< ComputeDevice > | device_ |
The compute device used by this context. | |
int | device_id_ = -1 |
The CUDA device ID; -1 indicates uninitialized. | |
std::mutex | handle_mutex_ |
Mutex for thread-safe handle initialization. | |
cudaStream_t | stream_ = nullptr |
The CUDA stream for asynchronous operations. | |
bool | stream_created_ = false |
Indicates if the stream was created by this context and needs to be destroyed. | |
The DeviceContext class manages device contexts for module and tensor computations.
This class provides functionality for managing compute devices and their associated resources, such as CUDA streams and optional cuBLASLt and cuDNN handles. Multiple instances can be created to manage different devices.
|
inline explicit |
Constructor with a specific device.
device_name | The name of the device to use (e.g., "CUDA:0", "CPU"). |
std::runtime_error | If the device name is invalid or device initialization fails. |
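The constructor accepts names such as "CUDA:0" and "CPU" and throws std::runtime_error for invalid ones. How Mila parses the name is not shown on this page, so the following is only a minimal sketch of one possible scheme. `DeviceKind`, `ParsedDevice`, and `parseDeviceName` are hypothetical names introduced for illustration, and the four-digit limit on the ordinal is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical types for illustration; not Mila's actual declarations.
enum class DeviceKind { Cpu, Cuda };

struct ParsedDevice {
    DeviceKind kind;
    int index;  // -1 for CPU, the ordinal after ':' for CUDA
};

// Parses "CPU" or "CUDA:<n>"; throws std::runtime_error for anything else.
inline ParsedDevice parseDeviceName(const std::string& name) {
    if (name == "CPU") {
        return {DeviceKind::Cpu, -1};
    }
    const std::string prefix = "CUDA:";
    if (name.size() > prefix.size() &&
        name.compare(0, prefix.size(), prefix) == 0) {
        const std::size_t digits = name.size() - prefix.size();
        if (digits > 4) {  // assumed limit; keeps the ordinal far from overflow
            throw std::runtime_error("Invalid device name: " + name);
        }
        int index = 0;
        for (std::size_t i = prefix.size(); i < name.size(); ++i) {
            const unsigned char c = static_cast<unsigned char>(name[i]);
            if (!std::isdigit(c)) {
                throw std::runtime_error("Invalid device name: " + name);
            }
            index = index * 10 + (c - '0');
        }
        assert(index >= 0);
        return {DeviceKind::Cuda, index};
    }
    throw std::runtime_error("Invalid device name: " + name);
}
```

With this sketch, `parseDeviceName("CUDA:1")` yields the CUDA kind with index 1, while a string such as "GPU0" raises std::runtime_error, mirroring the exception documented above.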
|
inline |
Destructor.
Cleans up any associated resources.
|
delete |
Copy constructor (deleted).
|
inline noexcept |
Move constructor.
other | The source DeviceContext to move from. |
|
inline |
Gets the compute capability of the current CUDA device.
|
inline |
Gets the cuBLASLt handle, initializing it if necessary.
std::runtime_error | If creating the cuBLASLt handle fails. |
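The "initializing it if necessary" behaviour, together with the handle_mutex_ attribute, suggests lazy initialization guarded by a mutex. The sketch below shows that pattern without requiring CUDA. `FakeHandle`, `createFakeHandle`, and `LazyHandleOwner` are hypothetical stand-ins for cublasLtHandle_t, the library's creation call, and DeviceContext; the actual implementation may differ.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Stand-in for an opaque library handle such as cublasLtHandle_t.
struct FakeHandle {};

// Stand-in for a handle-creation call; returns false on failure.
inline bool createFakeHandle(FakeHandle** out) {
    *out = new FakeHandle{};
    return true;
}

class LazyHandleOwner {
public:
    LazyHandleOwner() = default;
    LazyHandleOwner(const LazyHandleOwner&) = delete;
    LazyHandleOwner& operator=(const LazyHandleOwner&) = delete;
    ~LazyHandleOwner() { delete handle_; }

    // Creates the handle on first use; later calls return the cached one.
    FakeHandle* getHandle() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (handle_ == nullptr) {
            if (!createFakeHandle(&handle_)) {
                throw std::runtime_error("Failed to create handle");
            }
            ++create_count_;
        }
        assert(handle_ != nullptr);
        return handle_;
    }

    // Number of times the handle was actually created (for inspection only).
    int createCount() const { return create_count_; }

private:
    std::mutex mutex_;
    FakeHandle* handle_ = nullptr;
    int create_count_ = 0;
};
```

Holding the lock for the whole check-and-create step means two threads calling `getHandle()` at the same time cannot both create a handle; the cost is one uncontended lock acquisition per call.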
|
inline |
Gets the current device.
|
inline |
Gets the ID of the current CUDA device.
|
inline |
Gets the current CUDA stream.
|
inline private |
Initializes resources specific to the current device.
For CUDA devices, this retrieves the device ID, sets the device as current, and creates a CUDA stream.
|
inline |
Checks if the current device is a CUDA device.
|
inline |
Checks if the current device is of a specific type.
type | The device type to check against. |
|
inline |
Sets the current device as active in the current thread.
This method ensures that subsequent CUDA operations execute on the correct device: it sets the device for the calling thread only if it differs from the device that thread last made active. Tracking the active device per thread avoids unnecessary device switches.
std::runtime_error | If setting the CUDA device fails. |
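Per-thread tracking of the active device can be sketched with a thread_local variable. The example below is an illustration only: `fakeSetDevice` stands in for the real device switch (for example cudaSetDevice), and `makeDeviceCurrent` and `switchCount` are hypothetical names, not part of Mila.

```cpp
#include <atomic>
#include <cassert>

// Counts how often the stubbed device switch actually runs.
inline std::atomic<int>& switchCount() {
    static std::atomic<int> count{0};
    return count;
}

// Stand-in for the real device switch; only records that it happened.
inline void fakeSetDevice(int /*device_id*/) { ++switchCount(); }

// Switches the calling thread to device_id only if that thread's
// last-activated device is different.
inline void makeDeviceCurrent(int device_id) {
    assert(device_id >= 0);
    thread_local int active_device = -1;  // -1: nothing set in this thread yet
    if (active_device != device_id) {
        fakeSetDevice(device_id);
        active_device = device_id;
    }
}
```

Calling `makeDeviceCurrent(0)` twice in a row from the same thread triggers a single switch; a different thread keeps its own `active_device` value and switches independently.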
|
inline private |
Moves resources from another DeviceContext.
other | The DeviceContext to move resources from. |
|
delete |
Copy assignment operator (deleted).
|
inline noexcept |
Move assignment operator.
other | The source DeviceContext to move from. |
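The deleted copy operations, the noexcept move operations, and the private moveFrom and releaseResources helpers together describe a move-only resource owner. The sketch below shows how such pieces commonly fit together. `StreamOwner` and `liveStreams` are hypothetical, and an `int*` stands in for cudaStream_t so the example runs without a GPU; it is not Mila's implementation.

```cpp
#include <cassert>
#include <utility>

// Counts live fake streams so the sketch can be checked without a GPU.
inline int& liveStreams() {
    static int n = 0;
    return n;
}

class StreamOwner {
public:
    StreamOwner() : stream_(new int(0)), stream_created_(true) { ++liveStreams(); }
    ~StreamOwner() { releaseResources(); }

    StreamOwner(const StreamOwner&) = delete;
    StreamOwner& operator=(const StreamOwner&) = delete;

    StreamOwner(StreamOwner&& other) noexcept { moveFrom(std::move(other)); }

    StreamOwner& operator=(StreamOwner&& other) noexcept {
        if (this != &other) {
            releaseResources();          // drop what this object currently owns
            moveFrom(std::move(other));  // then take over the source's state
        }
        return *this;
    }

    bool ownsStream() const { return stream_created_; }

private:
    void moveFrom(StreamOwner&& other) noexcept {
        stream_ = other.stream_;
        stream_created_ = other.stream_created_;
        other.stream_ = nullptr;        // leave the source empty so its
        other.stream_created_ = false;  // destructor does not free twice
    }

    void releaseResources() noexcept {
        if (stream_created_) {
            assert(stream_ != nullptr);
            delete stream_;
            --liveStreams();
        }
        stream_ = nullptr;
        stream_created_ = false;
    }

    int* stream_ = nullptr;        // stand-in for cudaStream_t
    bool stream_created_ = false;  // true only if this object must free stream_
};
```

The ownership flag plays the role of stream_created_ above: after a move, the source reports that it owns nothing, so exactly one object frees each stream.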
|
inline private |
Releases all device-specific resources.
Frees CUDA streams and library handles when applicable.
|
inline private |
Sets the current device by name.
device_name | The name of the device to set. |
std::runtime_error | If the device name is invalid or device initialization fails. |
|
inline |
Synchronizes the device, waiting for all operations to complete.
When using a CUDA device, this method ensures the current device is active and then synchronizes the CUDA stream, waiting for all enqueued operations to complete.
|
mutable private |
Handle for cuBLASLt operations.
|
private |
The compute device used by this context.
|
private |
The CUDA device ID; -1 indicates uninitialized.
|
mutable private |
Mutex for thread-safe handle initialization.
|
private |
The CUDA stream for asynchronous operations.
|
private |
Indicates if the stream was created by this context and needs to be destroyed.