Mila 0.13.48
Deep Neural Network Library
Mila::Dnn::Serialization::PretrainedModelReader Class Reference

Reader for Mila pretrained binary format.

Public Member Functions

 PretrainedModelReader (const std::filesystem::path &filepath)
 Open a Mila model file for reading.
 ~PretrainedModelReader ()
bool close ()
const std::string & getFilename () const noexcept
size_t getMaxTensorSizeBytes () const
 Get the maximum byte size across all tensors in the index.
const PretrainedMetadata & getPretrainedMetadata () const
 Get pretrained model metadata.
std::vector< std::string > getTensorNames () const
 Get list of all tensor names in the model.
size_t getTensorSizeBytes (const std::string &name) const
 Get the raw byte size of a named tensor.
bool hasTensor (const std::string &name) const
 Check if tensor exists.
bool isOpen () const noexcept
template<typename MR = Compute::CpuMemoryResource>
requires isValidTensor<dtype_t::UINT8, MR>
TensorBlob< MR > readTensorBlob (const std::string &name, int device_id=0)
 Read raw tensor bytes by name into a memory-resource-typed blob.

Private Member Functions

const TensorBlobMetadata & getTensorBlobMetadata (const std::string &name) const
void parseMetadataJSON (const std::string &json)
void readHeader ()
void readMetadata ()
void readTensorIndex ()

Private Attributes

std::ifstream file_
std::string filename_
std::filesystem::path filepath_
PretrainedMetadata metadata_
uint32_t num_tensors_ { 0 }
std::unordered_map< std::string, TensorBlobMetadata > tensor_index_

Static Private Attributes

static constexpr uint32_t MAGIC = 0x4D494C41
static constexpr uint32_t VERSION = 1

Detailed Description

Reader for Mila pretrained binary format.

File format:

  • Header: MILA magic (0x4D494C41), version, num_tensors
  • Metadata: JSON string with model configuration
  • Tensor index: for each tensor: name, dtype, shape, offset, nbytes
  • Tensor data: concatenated binary blobs

Provides flat key-value access to tensors by name:

  • "lenc.wte.weight"
  • "tf_layer_0.ln_1.bias"
  • "ln_final.weight"

Usage:

PretrainedModelReader reader( "gpt2_small.bin" );
const auto& metadata = reader.getPretrainedMetadata();
auto names = reader.getTensorNames();
// device_id and network are assumed to be provided by the calling code.
for (const auto& name : names)
{
    auto blob = reader.readTensorBlob<CudaPinnedMemoryResource>( name, device_id );
    network->loadTensorByFlatName( name, blob );
}

Constructor & Destructor Documentation

◆ PretrainedModelReader()

Mila::Dnn::Serialization::PretrainedModelReader::PretrainedModelReader ( const std::filesystem::path & filepath)
inline explicit

Open a Mila model file for reading.

Parameters
filepath  Path to the .bin model file.
Exceptions
std::runtime_error  if the file cannot be opened or the format is invalid.

◆ ~PretrainedModelReader()

Mila::Dnn::Serialization::PretrainedModelReader::~PretrainedModelReader ( )
inline

Member Function Documentation

◆ close()

bool Mila::Dnn::Serialization::PretrainedModelReader::close ( )
inline

◆ getFilename()

const std::string & Mila::Dnn::Serialization::PretrainedModelReader::getFilename ( ) const
inline noexcept

◆ getMaxTensorSizeBytes()

size_t Mila::Dnn::Serialization::PretrainedModelReader::getMaxTensorSizeBytes ( ) const
inline

Get the maximum byte size across all tensors in the index.

Returns the largest nbytes value in the tensor index. All sizes are known at construction time. No I/O is performed.

Returns
size_t Maximum tensor byte count, or 0 if the index is empty.
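
The behaviour described above (a scan of sizes already held in memory, with 0 for an empty index) can be sketched as follows. BlobInfo and TensorIndex are hypothetical stand-ins for TensorBlobMetadata and the private tensor_index_ member; the real metadata type has more fields than shown, and the actual implementation may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for TensorBlobMetadata; only the byte size is modelled.
struct BlobInfo {
    std::size_t nbytes = 0;
};

using TensorIndex = std::unordered_map<std::string, BlobInfo>;

// Largest nbytes in the index, or 0 when the index is empty.
// Pure in-memory scan: no file I/O is involved.
inline std::size_t maxTensorSizeBytes(const TensorIndex& index) {
    std::size_t max_bytes = 0;
    for (const auto& [name, info] : index) {
        max_bytes = std::max(max_bytes, info.nbytes);
    }
    return max_bytes;
}
```

One plausible use of such a value is sizing a single reusable staging buffer once, so that it is large enough for every tensor in the file.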

◆ getPretrainedMetadata()

const PretrainedMetadata & Mila::Dnn::Serialization::PretrainedModelReader::getPretrainedMetadata ( ) const
inline

Get pretrained model metadata.

◆ getTensorBlobMetadata()

const TensorBlobMetadata & Mila::Dnn::Serialization::PretrainedModelReader::getTensorBlobMetadata ( const std::string & name) const
inline private

◆ getTensorNames()

std::vector< std::string > Mila::Dnn::Serialization::PretrainedModelReader::getTensorNames ( ) const
inline

Get list of all tensor names in the model.

◆ getTensorSizeBytes()

size_t Mila::Dnn::Serialization::PretrainedModelReader::getTensorSizeBytes ( const std::string & name) const
inline

Get the raw byte size of a named tensor.

All sizes are known at construction time from the tensor index. No I/O is performed.

Parameters
name  Tensor name.
Returns
size_t Byte count of the tensor data.
Exceptions
std::runtime_error  if name is not found.
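
The lookup-or-throw contract documented here can be illustrated with a small sketch over a plain map. IndexEntry, SizeIndex, and tensorSizeBytes are hypothetical names used only for this example; the error message text is also an assumption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for one tensor-index entry.
struct IndexEntry {
    std::size_t nbytes = 0;
};

using SizeIndex = std::unordered_map<std::string, IndexEntry>;

// Return the byte size recorded for `name`, or throw if it is absent.
inline std::size_t tensorSizeBytes(const SizeIndex& index, const std::string& name) {
    const auto it = index.find(name);
    if (it == index.end()) {
        throw std::runtime_error("Tensor not found: " + name);
    }
    return it->second.nbytes;
}
```

Callers that treat a missing tensor as a normal condition rather than an error can test with hasTensor first and avoid the exception path.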

◆ hasTensor()

bool Mila::Dnn::Serialization::PretrainedModelReader::hasTensor ( const std::string & name) const
inline

Check if tensor exists.

◆ isOpen()

bool Mila::Dnn::Serialization::PretrainedModelReader::isOpen ( ) const
inline noexcept

◆ parseMetadataJSON()

void Mila::Dnn::Serialization::PretrainedModelReader::parseMetadataJSON ( const std::string & json)
inline private

◆ readHeader()

void Mila::Dnn::Serialization::PretrainedModelReader::readHeader ( )
inline private

◆ readMetadata()

void Mila::Dnn::Serialization::PretrainedModelReader::readMetadata ( )
inline private

◆ readTensorBlob()

template<typename MR = Compute::CpuMemoryResource>
requires isValidTensor<dtype_t::UINT8, MR>
TensorBlob< MR > Mila::Dnn::Serialization::PretrainedModelReader::readTensorBlob ( const std::string & name,
int device_id = 0 )
inline

Read raw tensor bytes by name into a memory-resource-typed blob.

Allocates a TensorBuffer<UINT8, MR> of the exact tensor byte size and reads directly from the file into it, with no intermediate buffer. When MR is CudaPinnedMemoryResource, the returned blob data is page-locked, which enables direct DMA to the device in copyFromBlob without a staging copy.

Template Parameters
MR  Memory resource for the blob data buffer. Defaults to CpuMemoryResource.
Parameters
name  Tensor name.
device_id  Device index passed to the memory resource constructor.
Returns
TensorBlob<MR> owning the metadata and raw byte buffer.
Exceptions
std::runtime_error  if the tensor is not found or the read fails.

◆ readTensorIndex()

void Mila::Dnn::Serialization::PretrainedModelReader::readTensorIndex ( )
inline private

Member Data Documentation

◆ file_

std::ifstream Mila::Dnn::Serialization::PretrainedModelReader::file_
private

◆ filename_

std::string Mila::Dnn::Serialization::PretrainedModelReader::filename_
private

◆ filepath_

std::filesystem::path Mila::Dnn::Serialization::PretrainedModelReader::filepath_
private

◆ MAGIC

uint32_t Mila::Dnn::Serialization::PretrainedModelReader::MAGIC = 0x4D494C41
static constexpr private

◆ metadata_

PretrainedMetadata Mila::Dnn::Serialization::PretrainedModelReader::metadata_
private

◆ num_tensors_

uint32_t Mila::Dnn::Serialization::PretrainedModelReader::num_tensors_ { 0 }
private

◆ tensor_index_

std::unordered_map<std::string, TensorBlobMetadata> Mila::Dnn::Serialization::PretrainedModelReader::tensor_index_
private

◆ VERSION

uint32_t Mila::Dnn::Serialization::PretrainedModelReader::VERSION = 1
static constexpr private

The documentation for this class was generated from the following file: