Mila 0.13.48
Deep Neural Network Library
BpeVocabularyConfig.ixx File Reference

Unified configuration for BPE vocabulary construction and runtime properties.

#include <cstddef>
#include <cstdint>
#include <string>
#include <stdexcept>
#include <sstream>
import Data.SpecialTokens;
import Serialization.Metadata;
import Data.BpePreTokenizationMode;

Classes

class  Mila::Data::BpeVocabularyConfig
 Configuration for the BPE vocabulary.

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Data

Detailed Description

Unified configuration for BPE vocabulary construction and runtime properties.

Covers training hyperparameters (used only by BpeTrainer) and vocabulary properties shared across the GPT-2, Llama 3.x, and Mistral BPE families. Training fields (min_frequency, max_merges, enable_merge_caching) are ignored when loading pretrained vocabularies; validate() enforces them only when called by BpeTrainer.
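
The split between training-only fields and shared vocabulary properties can be illustrated with a minimal sketch. This is not the actual Mila::Data::BpeVocabularyConfig interface: the struct name SketchBpeConfig, the field types and defaults, the for_training flag, and the specific range checks are all assumptions made for illustration. Only the field names (min_frequency, max_merges, enable_merge_caching) and the existence of validate() come from the description above.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for illustration only; not the real Mila class.
struct SketchBpeConfig {
    // Training-only fields. Names come from the description above;
    // the types and default values are assumed.
    std::size_t min_frequency = 2;
    std::size_t max_merges = 50000;
    bool enable_merge_caching = true;

    // Assumed mechanism: the caller states whether it is a trainer.
    // How the real validate() detects a BpeTrainer caller is not documented here.
    void validate(bool for_training) const {
        if (!for_training) {
            return;  // training fields are ignored for pretrained vocabularies
        }
        // Assumed constraints: both counts must be at least 1.
        if (min_frequency == 0) {
            throw std::invalid_argument("min_frequency must be at least 1");
        }
        if (max_merges == 0) {
            throw std::invalid_argument("max_merges must be at least 1");
        }
    }
};
```

Under these assumptions, a configuration with max_merges set to 0 passes validate(false), as it would when loading a pretrained vocabulary, but throws std::invalid_argument from validate(true), as it would when used for training.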