Mila 0.13.48
Deep Neural Network Library
Contract for SamplingOp: in-place token sampling from a logits tensor.
temperature and top_k are per-call parameters; there is no separate configure() step. token_out is a device INT32 tensor that is written in place. The caller provides the buffer (typically decode_token_device_ in LlamaModel).
The op is non-const because CpuSamplingOp holds an mt19937 rng_ whose state is updated on each call.
| TOp | Candidate op type. |
| TLogits | Logits tensor type (model compute precision). |
| TToken | Output tensor type (INT32 device tensor). |
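The contract described above can be illustrated with a minimal sketch. It is not Mila's actual definition: the member name `sample`, the parameter order, and the types `ToyLogits`, `ToyToken` and `ToyCpuSamplingOp` are hypothetical stand-ins, and the toy tensors live in host memory rather than on a device. The real contract may well be a C++20 concept; this sketch uses a C++17 detection trait so it compiles without concepts support. Treating `temperature <= 0` as greedy selection and `top_k <= 0` as "no truncation" are also assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the logits and token tensors (host memory only).
struct ToyLogits { std::vector<float> data; };
struct ToyToken  { std::int32_t value = -1; };

// Detection trait: true when a non-const TOp exposes
// sample(logits, token_out, temperature, top_k). The name `sample` is assumed.
template <typename TOp, typename TLogits, typename TToken, typename = void>
struct is_sampling_op_like : std::false_type {};

template <typename TOp, typename TLogits, typename TToken>
struct is_sampling_op_like<TOp, TLogits, TToken,
    std::void_t<decltype(std::declval<TOp&>().sample(
        std::declval<const TLogits&>(), std::declval<TToken&>(),
        std::declval<float>(), std::declval<int>()))>> : std::true_type {};

// Toy CPU op: owns an mt19937 whose state advances on every call,
// so sample() cannot be a const member.
class ToyCpuSamplingOp {
public:
    explicit ToyCpuSamplingOp(std::uint32_t seed) : rng_(seed) {}

    void sample(const ToyLogits& logits, ToyToken& token_out,
                float temperature, int top_k) {
        const std::size_t n = logits.data.size();
        if (n == 0) return;  // nothing to sample; token_out is left untouched

        // Keep the k highest logits (all of them when top_k <= 0 or top_k >= n).
        const std::size_t k =
            (top_k > 0 && static_cast<std::size_t>(top_k) < n)
                ? static_cast<std::size_t>(top_k) : n;
        std::vector<std::size_t> idx(n);
        std::iota(idx.begin(), idx.end(), std::size_t{0});
        std::partial_sort(idx.begin(),
                          idx.begin() + static_cast<std::ptrdiff_t>(k),
                          idx.end(),
                          [&](std::size_t a, std::size_t b) {
                              return logits.data[a] > logits.data[b];
                          });

        // Assumption: non-positive temperature means greedy (argmax) selection.
        if (temperature <= 0.0f) {
            token_out.value = static_cast<std::int32_t>(idx[0]);
            return;
        }

        // Temperature-scaled softmax weights over the kept candidates.
        std::vector<double> weights(k);
        const double max_logit = logits.data[idx[0]];
        for (std::size_t i = 0; i < k; ++i) {
            weights[i] = std::exp((logits.data[idx[i]] - max_logit) / temperature);
        }
        std::discrete_distribution<int> dist(weights.begin(), weights.end());
        const std::size_t pick = static_cast<std::size_t>(dist(rng_));

        // Write the result into the caller-provided buffer.
        token_out.value = static_cast<std::int32_t>(idx[pick]);
    }

private:
    std::mt19937 rng_;
};

// A mutable op satisfies the sketch contract; a const op does not.
static_assert(is_sampling_op_like<ToyCpuSamplingOp, ToyLogits, ToyToken>::value,
              "mutable op should satisfy the contract");
static_assert(!is_sampling_op_like<const ToyCpuSamplingOp, ToyLogits, ToyToken>::value,
              "const op should be rejected");
```

The second `static_assert` shows why the op has to be non-const: a `const ToyCpuSamplingOp` cannot call `sample()` because the call mutates `rng_`. Passing `temperature` and `top_k` on each call keeps the op free of sampling configuration state, which matches the "no separate configure() step" note above.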