GPU-accurate interval timer using a CUDA event pair.
Records start and stop events on a given CUDA stream. Reading the elapsed time via elapsedMilliseconds() synchronizes on the stop event, ensuring the GPU has reached that point before the measurement is read back.
This approach avoids inserting a full cudaStreamSynchronize() at every measurement boundary, so it is suitable for fine-grained per-operation profiling during the beta.1 pass.
Example usage:
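The documented call order (construct, start, enqueue work, stop, read) can be sketched as below. This is a minimal illustration, not the project's original example: `timeOnStream`, `FakeTimer`, and `demoCallOrder` are hypothetical names, and the helper is templated so that the snippet compiles without the CUDA toolkit. With the real class, `Timer` would be `CudaTimer` and `Stream` would be `cudaStream_t`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Mila): times whatever `enqueue` submits
// to `stream`. Intended for Timer = CudaTimer and Stream = cudaStream_t.
template <typename Timer, typename Stream, typename Enqueue>
float timeOnStream(Timer& timer, Stream stream, Enqueue&& enqueue) {
    timer.start(stream);                     // start event recorded into the stream
    std::forward<Enqueue>(enqueue)(stream);  // work is submitted to the same stream
    timer.stop(stream);                      // stop event recorded after that work
    return timer.elapsedMilliseconds();      // blocks until the stop event is reached
}

// Host-only stand-in that mirrors the documented interface and logs calls.
struct FakeTimer {
    std::vector<std::string> calls;
    void start(int) { calls.push_back("start"); }
    void stop(int) { calls.push_back("stop"); }
    float elapsedMilliseconds() const { return 1.5f; }
};

// Runs the helper against the stand-in and returns the observed call order.
inline std::vector<std::string> demoCallOrder() {
    FakeTimer timer;
    std::vector<std::string>* log = &timer.calls;
    float ms = timeOnStream(timer, 0, [log](int) { log->push_back("enqueue"); });
    assert(ms == 1.5f);  // the stand-in always reports 1.5 ms
    (void)ms;
    return timer.calls;
}
```

The start and stop events must be recorded on the same stream as the work being measured; otherwise the interval does not bracket that work.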
Thread safety: A CudaTimer instance must not be used concurrently from multiple threads. Each CUDA stream should have its own CudaTimer.
◆ CudaTimer() [1/3]
Mila::Dnn::Compute::CudaTimer::CudaTimer() [inline]
Construct a CudaTimer and allocate its CUDA event pair.
Exceptions:
- std::runtime_error: if either CUDA event cannot be created.
◆ ~CudaTimer()
Mila::Dnn::Compute::CudaTimer::~CudaTimer() [inline]
Destructor.
Releases both CUDA events.
◆ CudaTimer() [2/3]
Mila::Dnn::Compute::CudaTimer::CudaTimer(const CudaTimer &) [delete]
◆ CudaTimer() [3/3]
Mila::Dnn::Compute::CudaTimer::CudaTimer(CudaTimer &&) [delete]
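Because both the copy and move constructors are deleted, a timer cannot be relocated after construction, so operations that require a movable element type (for example `std::vector::push_back`) are unavailable for it. One common way to keep a timer per stream is to own each timer through `std::unique_ptr`. The sketch below illustrates this with a stand-in type; `PinnedTimer` and `makeTimers` are hypothetical names, not part of Mila.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Stand-in with the same copy/move restrictions as the documented class.
// Hypothetical: the real CudaTimer also owns two CUDA events.
struct PinnedTimer {
    PinnedTimer() = default;
    PinnedTimer(const PinnedTimer&) = delete;
    PinnedTimer(PinnedTimer&&) = delete;
};

static_assert(!std::is_copy_constructible_v<PinnedTimer>);
static_assert(!std::is_move_constructible_v<PinnedTimer>);

// One heap-allocated timer per stream; the vector moves only the pointers,
// never the timers themselves.
template <typename Timer>
std::vector<std::unique_ptr<Timer>> makeTimers(std::size_t streamCount) {
    std::vector<std::unique_ptr<Timer>> timers;
    timers.reserve(streamCount);
    for (std::size_t i = 0; i < streamCount; ++i) {
        timers.push_back(std::make_unique<Timer>());
    }
    return timers;
}
```

With the real class, `makeTimers<CudaTimer>(n)` would allocate `n` event pairs, and any of the constructions could throw `std::runtime_error` as documented for the default constructor.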
◆ elapsedMilliseconds()
float Mila::Dnn::Compute::CudaTimer::elapsedMilliseconds() const [inline, nodiscard]
Returns the elapsed GPU time in milliseconds between start and stop.
Synchronizes on the stop event (cudaEventSynchronize) before reading back the elapsed time. The caller's thread blocks until the GPU has reached the stop event.
Returns:
- Elapsed time in milliseconds as reported by the CUDA event subsystem.
Exceptions:
- std::runtime_error: if synchronization or elapsed time query fails.
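Since a failed measurement is reported by throwing `std::runtime_error`, profiling code that should not abort the surrounding workload may prefer to turn a failure into an empty result. The following is a minimal sketch under that assumption; `tryElapsed`, `WorkingTimer`, and `FailingTimer` are hypothetical names used only for illustration, and the helper accepts any type exposing `elapsedMilliseconds()`.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical helper: returns the measurement, or std::nullopt if the
// timer throws std::runtime_error (the documented exception type).
// Any other exception type propagates to the caller unchanged.
template <typename Timer>
std::optional<float> tryElapsed(const Timer& timer) {
    try {
        return timer.elapsedMilliseconds();
    } catch (const std::runtime_error&) {
        return std::nullopt;
    }
}

// Host-only stand-ins for demonstration.
struct WorkingTimer {
    float elapsedMilliseconds() const { return 2.0f; }
};

struct FailingTimer {
    float elapsedMilliseconds() const {
        throw std::runtime_error("elapsed time query failed");
    }
};
```

Whether to swallow the error is a design choice; in practice it is usually worth logging the exception message before discarding the sample.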
◆ operator=() [1/2]
◆ operator=() [2/2]
◆ start()
void Mila::Dnn::Compute::CudaTimer::start(cudaStream_t stream) [inline]
Record the start event into the given CUDA stream.
The GPU timestamp is recorded at the point the stream reaches this event, which may be later than the host-side call.
Parameters:
- stream: CUDA stream on which to record the start event.
Exceptions:
- std::runtime_error: if event recording fails.
◆ stop()
void Mila::Dnn::Compute::CudaTimer::stop(cudaStream_t stream) [inline]
Record the stop event into the given CUDA stream.
The GPU timestamp is recorded at the point the stream reaches this event. Call elapsedMilliseconds() afterwards to read the measured interval.
Parameters:
- stream: CUDA stream on which to record the stop event.
Exceptions:
- std::runtime_error: if event recording fails.
◆ start_event_
cudaEvent_t Mila::Dnn::Compute::CudaTimer::start_event_ { nullptr } [private]
◆ stop_event_
cudaEvent_t Mila::Dnn::Compute::CudaTimer::stop_event_ { nullptr } [private]
The documentation for this class was generated from the following file:
- /__w/Mila/Mila/Mila/Src/Dnn/Compute/Devices/Cuda/Profiling/CudaTimer.ixx