Statistics captured during a single generateStreaming() call.

Public Member Functions
bool	valid () const noexcept
	Returns true when at least one generation run has been recorded.

Public Attributes
float	decode_time_ms { 0.0f }
	Total time spent in the autoregressive decode loop (ms); 0 when only one token was generated, since that token is produced by prefill.
float	decode_tokens_per_second { 0.0f }
	Decode throughput in tokens per second; 0 when the decode loop produced no tokens.
float	prefill_time_ms { 0.0f }
	Time to first token (ms): prefill forward pass, synchronization, and first-token sampling.
std::size_t	prompt_tokens { 0 }
	Number of input prompt tokens processed during prefill.
std::size_t	tokens_generated { 0 }
	Total tokens generated including the first token produced by prefill.

Detailed Description

Statistics captured during a single generateStreaming() call.

Populated by the derived model's onGenerating() implementation after each generation run. Retrieve via getLastGenerationStatistics() once generateStreaming() returns.
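The retrieval pattern described above can be sketched as follows. This is a minimal illustration, not the library's code: `GenerationStatisticsSketch` is a stand-in that only mirrors the documented fields so the example compiles on its own, its `valid()` criterion (`tokens_generated > 0`) is an assumption, and `summarize()` is a hypothetical helper. In real code the object would presumably come from getLastGenerationStatistics() after generateStreaming() returns.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Stand-in mirroring the documented fields of Mila::Dnn::GenerationStatistics.
// It exists only so this sketch compiles without the Mila module.
struct GenerationStatisticsSketch
{
    float decode_time_ms{ 0.0f };
    float decode_tokens_per_second{ 0.0f };
    float prefill_time_ms{ 0.0f };
    std::size_t prompt_tokens{ 0 };
    std::size_t tokens_generated{ 0 };

    // ASSUMPTION: the real valid() may use a different criterion.
    [[nodiscard]] bool valid() const noexcept { return tokens_generated > 0; }
};

// Hypothetical helper: one-line report, or a placeholder when no run was recorded.
inline std::string summarize( const GenerationStatisticsSketch& stats )
{
    if ( !stats.valid() )
        return "no generation recorded";

    std::ostringstream out;
    out << "prompt=" << stats.prompt_tokens
        << " generated=" << stats.tokens_generated
        << " prefill_ms=" << stats.prefill_time_ms
        << " decode_ms=" << stats.decode_time_ms
        << " decode_tok_per_s=" << stats.decode_tokens_per_second;
    return out.str();
}
```

Checking valid() before reading the fields avoids reporting the zero-initialized defaults as if they were measurements.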

Member Function Documentation

◆ valid()

bool Mila::Dnn::GenerationStatistics::valid ( ) const

inline nodiscard noexcept

Returns true when at least one generation run has been recorded.

Member Data Documentation

◆ decode_time_ms

float Mila::Dnn::GenerationStatistics::decode_time_ms { 0.0f }

Total time spent in the autoregressive decode loop (ms); 0 when only one token was generated, since that token is produced by prefill.

◆ decode_tokens_per_second

float Mila::Dnn::GenerationStatistics::decode_tokens_per_second { 0.0f }

Decode throughput in tokens per second; 0 when the decode loop produced no tokens.
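One plausible way this value relates to the other fields is sketched below. It assumes the decode loop produces `tokens_generated - 1` tokens, which is inferred from tokens_generated counting the first token produced by prefill; the library's actual computation is not shown on this page and may differ. `decodeTokensPerSecond` is a hypothetical helper name.

```cpp
#include <cstddef>

// Hypothetical helper: decode throughput in tokens per second.
// ASSUMPTION: the first generated token belongs to prefill, so the decode
// loop accounts for tokens_generated - 1 tokens.
inline float decodeTokensPerSecond( std::size_t tokens_generated,
                                    float decode_time_ms ) noexcept
{
    // No decode tokens, or no measurable time: report 0 as the field documents.
    if ( tokens_generated <= 1 || decode_time_ms <= 0.0f )
        return 0.0f;

    const float decode_tokens = static_cast<float>( tokens_generated - 1 );

    // Convert milliseconds to seconds by scaling the numerator by 1000.
    return decode_tokens * 1000.0f / decode_time_ms;
}
```

The guard also prevents a division by zero when only the prefill token was produced.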

◆ prefill_time_ms

float Mila::Dnn::GenerationStatistics::prefill_time_ms { 0.0f }

Time to first token (ms): prefill forward pass, synchronization, and first-token sampling.
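Two derived figures that callers may want are sketched here; neither is a member of the struct, and both helper names are hypothetical. The prefill rate slightly understates pure forward-pass throughput, because prefill_time_ms also covers synchronization and sampling of the first token. The total assumes the prefill and decode intervals do not overlap and together span the whole call, which this page does not confirm.

```cpp
#include <cstddef>

// Hypothetical helper: prompt tokens processed per second during prefill.
inline float prefillTokensPerSecond( std::size_t prompt_tokens,
                                     float prefill_time_ms ) noexcept
{
    if ( prompt_tokens == 0 || prefill_time_ms <= 0.0f )
        return 0.0f;

    return static_cast<float>( prompt_tokens ) * 1000.0f / prefill_time_ms;
}

// Hypothetical helper: approximate wall time of one generation run (ms).
// ASSUMPTION: prefill and decode timings are disjoint and cover the whole run.
inline float totalGenerationTimeMs( float prefill_time_ms,
                                    float decode_time_ms ) noexcept
{
    return prefill_time_ms + decode_time_ms;
}
```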

◆ prompt_tokens

std::size_t Mila::Dnn::GenerationStatistics::prompt_tokens { 0 }

Number of input prompt tokens processed during prefill.

◆ tokens_generated

std::size_t Mila::Dnn::GenerationStatistics::tokens_generated { 0 }

Total tokens generated including the first token produced by prefill.

The documentation for this struct was generated from the following file:

/__w/Mila/Mila/Mila/Src/Dnn/Core/LanguageModel.ixx
