Reader for Mila pretrained binary format.

Public Member Functions
	PretrainedModelReader (const std::filesystem::path &filepath)
	Open a Mila model file for reading.
	~PretrainedModelReader ()
bool	close ()
const std::string &	getFilename () const noexcept
size_t	getMaxTensorSizeBytes () const
	Get the maximum byte size across all tensors in the index.
const PretrainedMetadata &	getPretrainedMetadata () const
	Get pretrained model metadata.
std::vector< std::string >	getTensorNames () const
	Get list of all tensor names in the model.
size_t	getTensorSizeBytes (const std::string &name) const
	Get the raw byte size of a named tensor.
bool	hasTensor (const std::string &name) const
	Check whether a tensor with the given name exists.
bool	isOpen () const noexcept
template<typename MR = Compute::CpuMemoryResource> requires isValidTensor<dtype_t::UINT8, MR>
TensorBlob< MR >	readTensorBlob (const std::string &name, int device_id=0)
	Read raw tensor bytes by name into a memory-resource-typed blob.

Private Member Functions
const TensorBlobMetadata &	getTensorBlobMetadata (const std::string &name) const
void	parseMetadataJSON (const std::string &json)
void	readHeader ()
void	readMetadata ()
void	readTensorIndex ()

Private Attributes
std::ifstream	file_
std::string	filename_
std::filesystem::path	filepath_
PretrainedMetadata	metadata_
uint32_t	num_tensors_ { 0 }
std::unordered_map< std::string, TensorBlobMetadata >	tensor_index_

Static Private Attributes
static constexpr uint32_t	MAGIC = 0x4D494C41
static constexpr uint32_t	VERSION = 1

Detailed Description

Reader for Mila pretrained binary format.

File format:

Header: MILA magic (0x4D494C41), version, num_tensors
Metadata: JSON string with model configuration
Tensor index: one entry per tensor (name, dtype, shape, offset, nbytes)
Tensor data: concatenated binary blobs
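
The header fields listed above can be decoded with a few lines of stream code. The following is a minimal sketch, not the Mila implementation: it assumes the three header fields are consecutive 32-bit unsigned integers stored little-endian, which the format description does not confirm. HeaderSketch, readU32LE, and readHeaderSketch are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical header layout: three consecutive 32-bit fields (an assumption).
struct HeaderSketch {
    std::uint32_t magic;
    std::uint32_t version;
    std::uint32_t num_tensors;
};

// 0x4D 0x49 0x4C 0x41 are the ASCII codes for 'M', 'I', 'L', 'A'.
constexpr std::uint32_t kMagic = 0x4D494C41;

// Read one 32-bit value, assuming little-endian byte order on disk.
inline std::uint32_t readU32LE(std::istream& in) {
    unsigned char b[4];
    in.read(reinterpret_cast<char*>(b), 4);
    if (in.gcount() != 4) {
        throw std::runtime_error("truncated header");
    }
    return std::uint32_t(b[0]) | (std::uint32_t(b[1]) << 8) |
           (std::uint32_t(b[2]) << 16) | (std::uint32_t(b[3]) << 24);
}

// Read and validate the header; throws if the magic value does not match.
inline HeaderSketch readHeaderSketch(std::istream& in) {
    HeaderSketch h{};
    h.magic = readU32LE(in);
    if (h.magic != kMagic) {
        throw std::runtime_error("bad magic");
    }
    h.version = readU32LE(in);
    h.num_tensors = readU32LE(in);
    return h;
}
```

Checking the magic value before reading any further fields lets a reader reject a non-Mila file after only four bytes.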

Provides flat key-value access to tensors by name:

"lenc.wte.weight"
"tf_layer_0.ln_1.bias"
"ln_final.weight"
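
Flat names like these are plain strings, so a caller can compose them when it needs a specific per-layer tensor. The helper below is a hypothetical sketch: it assumes per-layer names follow the "tf_layer_<index>.<suffix>" pattern seen in the example above, which may not hold for every model file. layerTensorName is not part of the Mila API.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: builds "tf_layer_<index>.<suffix>".
// The naming pattern is inferred from a single example and is an assumption.
inline std::string layerTensorName(std::size_t layer, std::string_view suffix) {
    std::string name = "tf_layer_" + std::to_string(layer);
    name += '.';
    name.append(suffix);
    return name;
}
```

A name built this way would typically be checked with hasTensor before it is passed to readTensorBlob.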

Usage:

PretrainedModelReader reader( "gpt2_small.bin" );

// Bind by reference: getPretrainedMetadata() returns a const reference.
const auto& metadata = reader.getPretrainedMetadata();
const auto names = reader.getTensorNames();

// device_id and network are supplied by the calling code.
for (const auto& name : names)
{
    auto blob = reader.readTensorBlob<CudaPinnedMemoryResource>( name, device_id );
    network->loadTensorByFlatName( name, blob );
}

Constructor & Destructor Documentation

◆ PretrainedModelReader()

Mila::Dnn::Serialization::PretrainedModelReader::PretrainedModelReader ( const std::filesystem::path & filepath )

inline explicit

Open a Mila model file for reading.

Parameters

filepath Path to .bin model file.

Exceptions

std::runtime_error if file cannot be opened or format is invalid.

◆ ~PretrainedModelReader()

Mila::Dnn::Serialization::PretrainedModelReader::~PretrainedModelReader ( )

inline

Member Function Documentation

◆ close()

bool Mila::Dnn::Serialization::PretrainedModelReader::close ( )

inline

◆ getFilename()

const std::string & Mila::Dnn::Serialization::PretrainedModelReader::getFilename ( ) const

inline noexcept

◆ getMaxTensorSizeBytes()

size_t Mila::Dnn::Serialization::PretrainedModelReader::getMaxTensorSizeBytes ( ) const

inline

Get the maximum byte size across all tensors in the index.

Returns the largest nbytes value in the tensor index. All sizes are known at construction time. No I/O is performed.

Returns: size_t Maximum tensor byte count, or 0 if the index is empty.
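
The documented behaviour (largest nbytes in the index, 0 when the index is empty, no I/O) amounts to a single pass over the in-memory index. A minimal sketch follows, using a stand-in BlobMetaSketch type rather than the real TensorBlobMetadata, whose full definition is not shown here; maxTensorSizeBytes is a hypothetical free function.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>

// Stand-in for TensorBlobMetadata; only the byte size is modelled (assumption).
struct BlobMetaSketch {
    std::size_t nbytes = 0;
};

// Largest nbytes across the index, or 0 if the index is empty.
inline std::size_t maxTensorSizeBytes(
    const std::unordered_map<std::string, BlobMetaSketch>& index) {
    std::size_t max_bytes = 0;
    for (const auto& entry : index) {
        max_bytes = std::max(max_bytes, entry.second.nbytes);
    }
    return max_bytes;
}
```

One plausible use of this value is sizing a single reusable staging buffer that is large enough for any tensor in the file.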

◆ getPretrainedMetadata()

const PretrainedMetadata & Mila::Dnn::Serialization::PretrainedModelReader::getPretrainedMetadata ( ) const

inline

Get pretrained model metadata.

◆ getTensorBlobMetadata()

const TensorBlobMetadata & Mila::Dnn::Serialization::PretrainedModelReader::getTensorBlobMetadata ( const std::string & name ) const

inline private

◆ getTensorNames()

std::vector< std::string > Mila::Dnn::Serialization::PretrainedModelReader::getTensorNames ( ) const

inline

Get list of all tensor names in the model.

◆ getTensorSizeBytes()

size_t Mila::Dnn::Serialization::PretrainedModelReader::getTensorSizeBytes ( const std::string & name ) const

inline

Get the raw byte size of a named tensor.

All sizes are known at construction time from the tensor index. No I/O is performed.

Parameters

name	Tensor name.

Returns: size_t Byte count of the tensor data.

Exceptions

std::runtime_error if name is not found.
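
Because an unknown name raises std::runtime_error, callers that are unsure whether a tensor is present can test for it first. The sketch below models that contract on a stand-in index; SizeEntrySketch, IndexSketch, hasTensorSketch, and tensorSizeBytesSketch are hypothetical names, and the exception message is invented for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Stand-in for one tensor index entry; only the byte size is modelled (assumption).
struct SizeEntrySketch {
    std::size_t nbytes = 0;
};
using IndexSketch = std::unordered_map<std::string, SizeEntrySketch>;

// Non-throwing existence check.
inline bool hasTensorSketch(const IndexSketch& index, const std::string& name) {
    return index.find(name) != index.end();
}

// Size lookup that throws std::runtime_error for an unknown name.
inline std::size_t tensorSizeBytesSketch(const IndexSketch& index,
                                         const std::string& name) {
    const auto it = index.find(name);
    if (it == index.end()) {
        throw std::runtime_error("Tensor not found: " + name);
    }
    return it->second.nbytes;
}
```

With the real class, the equivalent guard would be a call to hasTensor before getTensorSizeBytes.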

◆ hasTensor()

bool Mila::Dnn::Serialization::PretrainedModelReader::hasTensor ( const std::string & name ) const

inline

Check whether a tensor with the given name exists.

◆ isOpen()

bool Mila::Dnn::Serialization::PretrainedModelReader::isOpen ( ) const

inline noexcept

◆ parseMetadataJSON()

void Mila::Dnn::Serialization::PretrainedModelReader::parseMetadataJSON ( const std::string & json )

inline private

◆ readHeader()

void Mila::Dnn::Serialization::PretrainedModelReader::readHeader ( )

inline private

◆ readMetadata()

void Mila::Dnn::Serialization::PretrainedModelReader::readMetadata ( )

inline private

◆ readTensorBlob()

template<typename MR = Compute::CpuMemoryResource>
requires isValidTensor<dtype_t::UINT8, MR>

TensorBlob< MR > Mila::Dnn::Serialization::PretrainedModelReader::readTensorBlob	(	const std::string &	name,
		int	device_id = 0 )

inline

Read raw tensor bytes by name into a memory-resource-typed blob.

Allocates a TensorBuffer<UINT8, MR> of the exact tensor byte size and reads directly from the file into it. No intermediate buffer is used. When MR is CudaPinnedMemoryResource the returned blob data is page-locked, enabling direct DMA to device in copyFromBlob without a staging copy.

Template Parameters

MR	Memory resource for the blob data buffer. Defaults to CpuMemoryResource.

Parameters

name	Tensor name.
device_id	Device index passed to the memory resource constructor.

Returns: TensorBlob<MR> owning the metadata and raw byte buffer.

Exceptions

std::runtime_error if the tensor is not found or the read fails.
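
The read path described above (allocate exactly nbytes, then read straight from the file into that allocation) can be illustrated with standard streams. This is a simplified sketch, not the Mila implementation: it uses std::vector instead of TensorBuffer<UINT8, MR>, so it shows no pinned-memory behaviour, and it assumes the indexed offset is an absolute position from the start of the stream. readBytesAt is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Read exactly nbytes starting at an absolute offset, directly into the
// destination buffer (no intermediate copy). Throws on seek or read failure.
inline std::vector<std::uint8_t> readBytesAt(std::istream& in,
                                             std::uint64_t offset,
                                             std::size_t nbytes) {
    std::vector<std::uint8_t> buffer(nbytes);  // exact-size destination
    in.clear();                                // reset any earlier EOF/fail state
    in.seekg(static_cast<std::streamoff>(offset), std::ios::beg);
    if (!in) {
        throw std::runtime_error("seek failed");
    }
    in.read(reinterpret_cast<char*>(buffer.data()),
            static_cast<std::streamsize>(nbytes));
    if (static_cast<std::size_t>(in.gcount()) != nbytes) {
        throw std::runtime_error("short read");
    }
    return buffer;
}
```

Comparing gcount() against the requested size catches truncated files, which would otherwise leave part of the buffer zero-filled.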

◆ readTensorIndex()

void Mila::Dnn::Serialization::PretrainedModelReader::readTensorIndex ( )

inline private

Member Data Documentation

◆ file_

std::ifstream Mila::Dnn::Serialization::PretrainedModelReader::file_

private

◆ filename_

std::string Mila::Dnn::Serialization::PretrainedModelReader::filename_

private

◆ filepath_

std::filesystem::path Mila::Dnn::Serialization::PretrainedModelReader::filepath_

private

◆ MAGIC

uint32_t Mila::Dnn::Serialization::PretrainedModelReader::MAGIC = 0x4D494C41

static constexpr private

◆ metadata_

PretrainedMetadata Mila::Dnn::Serialization::PretrainedModelReader::metadata_

private

◆ num_tensors_

uint32_t Mila::Dnn::Serialization::PretrainedModelReader::num_tensors_ { 0 }

private

◆ tensor_index_

std::unordered_map<std::string, TensorBlobMetadata> Mila::Dnn::Serialization::PretrainedModelReader::tensor_index_

private

◆ VERSION

uint32_t Mila::Dnn::Serialization::PretrainedModelReader::VERSION = 1

static constexpr private

The documentation for this class was generated from the following file:

/__w/Mila/Mila/Mila/Src/Dnn/Serialization/PretrainedReader.ixx
