Mila 0.13.48
Deep Neural Network Library
TrainerFactory.ixx File Reference

Factory helpers to construct tokenizer trainers and load vocabularies.

#include <memory>
#include <filesystem>
#include <string>
#include <span>
#include <stdexcept>
import Data.BpeVocabulary;
import Data.BpeTrainer;
import Data.CharVocabularyConfig;
import Data.CharTokenizer;
import Data.CharVocabulary;
import Data.BpeVocabularyConfig;
import Data.CharTrainer;
import Data.BpeTokenizer;
import Data.TokenizerVocabulary;
import Data.TokenizerTrainer;
import Data.TokenizerType;
import Data.Tokenizer;

Classes

class  Mila::Data::TrainerFactory
 Factory for creating tokenizer trainers and loading vocabularies.

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Data

Enumerations

enum class  Mila::Data::TokenizerType
 Tokenizer type discriminator used across tokenizer and vocabulary types.

Detailed Description

Factory helpers to construct tokenizer trainers and load vocabularies.

Provides simple factory functions used by preprocessing tools to obtain TokenizerTrainer and TokenizerVocabulary implementations for a given TokenizerType. Currently supports character-level (char) and BPE tokenizers.
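
The dispatch pattern described above can be illustrated with a minimal, self-contained sketch. This is not Mila's actual API: the `sketch` namespace, the enumerator names `Char` and `Bpe`, the `name()` member, and the `createTrainer` function are all hypothetical stand-ins chosen for illustration, and the real TrainerFactory members and signatures may differ.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace; not part of Mila

// Assumed enumerators; the real Mila::Data::TokenizerType may differ.
enum class TokenizerType { Char, Bpe };

// Hypothetical trainer interface standing in for TokenizerTrainer.
struct TokenizerTrainer {
    virtual ~TokenizerTrainer() = default;
    virtual std::string name() const = 0;
};

struct CharTrainer final : TokenizerTrainer {
    std::string name() const override { return "char"; }
};

struct BpeTrainer final : TokenizerTrainer {
    std::string name() const override { return "bpe"; }
};

// Maps a type discriminator to a concrete trainer.
// Throws std::invalid_argument for values outside the supported set.
inline std::unique_ptr<TokenizerTrainer> createTrainer(TokenizerType type) {
    switch (type) {
        case TokenizerType::Char: return std::make_unique<CharTrainer>();
        case TokenizerType::Bpe:  return std::make_unique<BpeTrainer>();
    }
    throw std::invalid_argument("unsupported TokenizerType");
}

}  // namespace sketch
```

Under these assumptions, a preprocessing tool would call `sketch::createTrainer(sketch::TokenizerType::Bpe)` and work only with the returned interface pointer, so supporting another tokenizer type would mean adding one enumerator and one `case`.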