Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::GenerationStatistics Struct Reference [export]

Statistics captured during a single generateStreaming() call.

Public Member Functions

bool valid () const noexcept
 Returns true when at least one generation run has been recorded.

Public Attributes

float decode_time_ms { 0.0f }
 Total time spent in the autoregressive decode loop (ms); 0 when only one token was generated.
float decode_tokens_per_second { 0.0f }
 Decode throughput in tokens per second; 0 when the decode loop produced no tokens.
float prefill_time_ms { 0.0f }
 Time to first token: prefill forward pass + synchronization + first token sampling (ms).
std::size_t prompt_tokens { 0 }
 Number of input prompt tokens processed during prefill.
std::size_t tokens_generated { 0 }
 Total tokens generated including the first token produced by prefill.

Detailed Description

Statistics captured during a single generateStreaming() call.

Populated by the derived model's onGenerating() implementation after each generation run. Retrieve via getLastGenerationStatistics() once generateStreaming() returns.
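As a rough illustration of how these fields might be consumed after a run, the sketch below formats them into a one-line summary. It does not use the Mila API: `GenerationStatisticsSketch` is a local stand-in that only mirrors the documented field names and defaults, `summarize()` is a hypothetical helper, and the body of `valid()` (checking `tokens_generated > 0`) is an assumption, since this page does not state how the real `valid()` decides.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Local stand-in mirroring the documented fields. The real type is
// Mila::Dnn::GenerationStatistics; this copy exists only for illustration.
struct GenerationStatisticsSketch {
    float decode_time_ms { 0.0f };
    float decode_tokens_per_second { 0.0f };
    float prefill_time_ms { 0.0f };
    std::size_t prompt_tokens { 0 };
    std::size_t tokens_generated { 0 };

    // ASSUMPTION: the real criterion is not documented on this page.
    [[nodiscard]] bool valid() const noexcept { return tokens_generated > 0; }
};

// Hypothetical helper: builds a one-line summary, or a placeholder
// when no generation run has been recorded.
inline std::string summarize(const GenerationStatisticsSketch& s) {
    if (!s.valid()) {
        return "no generation recorded";
    }
    std::ostringstream out;
    out << "prompt=" << s.prompt_tokens
        << " generated=" << s.tokens_generated
        << " ttft_ms=" << s.prefill_time_ms
        << " decode_ms=" << s.decode_time_ms
        << " decode_tok_per_s=" << s.decode_tokens_per_second;
    return out.str();
}
```

In real code, the struct would presumably come from getLastGenerationStatistics() after generateStreaming() returns, and checking valid() first avoids reporting the all-zero defaults as if they were measurements.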

Member Function Documentation

◆ valid()

bool Mila::Dnn::GenerationStatistics::valid ( ) const
[inline, nodiscard, noexcept]

Returns true when at least one generation run has been recorded.

Member Data Documentation

◆ decode_time_ms

float Mila::Dnn::GenerationStatistics::decode_time_ms { 0.0f }

Total time spent in the autoregressive decode loop (ms); 0 when only one token was generated.

◆ decode_tokens_per_second

float Mila::Dnn::GenerationStatistics::decode_tokens_per_second { 0.0f }

Decode throughput in tokens per second; 0 when the decode loop produced no tokens.
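One plausible way to relate this value to the other fields is sketched below. It assumes that the decode loop produces every token except the first (which the field descriptions attribute to prefill), so the decode token count is `tokens_generated - 1`. That relationship is inferred from the descriptions on this page, not confirmed by the library source, and `decodeThroughput()` is a hypothetical helper rather than a Mila function.

```cpp
#include <cstddef>

// Hypothetical helper, not part of Mila.
// ASSUMPTION: decode-loop tokens = tokens_generated - 1, because the first
// token is documented as being produced by prefill.
inline float decodeThroughput(std::size_t tokens_generated,
                              float decode_time_ms) noexcept {
    // Mirror the documented convention: report 0 when the decode loop
    // produced no tokens (or no time was recorded).
    if (tokens_generated <= 1 || decode_time_ms <= 0.0f) {
        return 0.0f;
    }
    const float decode_tokens = static_cast<float>(tokens_generated - 1);
    // Convert milliseconds to seconds: tokens / (ms / 1000).
    return decode_tokens * 1000.0f / decode_time_ms;
}
```

Under that assumption, a run with 5 generated tokens and 100 ms of decode time yields 4 decode tokens over 0.1 s, or 40 tokens per second.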

◆ prefill_time_ms

float Mila::Dnn::GenerationStatistics::prefill_time_ms { 0.0f }

Time to first token: prefill forward pass + synchronization + first token sampling (ms).

◆ prompt_tokens

std::size_t Mila::Dnn::GenerationStatistics::prompt_tokens { 0 }

Number of input prompt tokens processed during prefill.

◆ tokens_generated

std::size_t Mila::Dnn::GenerationStatistics::tokens_generated { 0 }

Total tokens generated including the first token produced by prefill.


The documentation for this struct was generated from the following file: