Mila 0.13.48
Deep Neural Network Library
PretrainedReader.ixx File Reference

Reader for Mila pretrained binary format.

#include <filesystem>
#include <fstream>
#include <string>
#include <unordered_map>
#include <vector>
#include <cstdint>
#include <stdexcept>
#include <format>
#include <algorithm>
import Compute.CpuMemoryResource;
import Compute.Device;
import Dnn.TensorTypes;
import Dnn.TensorDataTypeTraits;
import Dnn.TensorBuffer;
import Dnn.TensorDataType;
import Serialization.Tensor;
import Serialization.OpenMode;
import Dnn.ITensor;
import Dnn.Tensor;
import Serialization.Serializer;

Classes

 Metadata for pretrained model.
class  Mila::Dnn::Serialization::PretrainedModelReader
 Reader for Mila pretrained binary format.
 Metadata for a tensor blob in pretrained model format.

Namespaces

namespace  Mila
 Mila main API namespace.
namespace  Mila::Dnn
namespace  Mila::Dnn::Serialization

Enumerations

enum class  Mila::Dnn::Serialization::DType : uint32_t { Mila::Dnn::Serialization::Float32 = 0 , Mila::Dnn::Serialization::Float16 = 1 , Mila::Dnn::Serialization::BFloat16 = 2 , Mila::Dnn::Serialization::Int32 = 3 }

Functions

TensorDataType Mila::Dnn::Serialization::dtypeToTensorDataType (uint32_t dtype)
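
The enumeration above fixes the numeric codes stored in the file, and dtypeToTensorDataType() takes the raw uint32_t code. The following is a minimal sketch of how such a raw code might be validated and used, not Mila's implementation. The enumerators of TensorDataType are not shown on this page, so the sketch reports element sizes instead of mapping to that type; checkedDType and dtypeElementSize are hypothetical helper names introduced for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Local copy of the on-disk codes listed in the enumeration above.
enum class DType : std::uint32_t { Float32 = 0, Float16 = 1, BFloat16 = 2, Int32 = 3 };

// Hypothetical helper (not part of Mila): rejects codes outside the known range
// before the raw value is converted to the enum.
inline DType checkedDType(std::uint32_t raw) {
    if (raw > static_cast<std::uint32_t>(DType::Int32)) {
        throw std::invalid_argument("unknown dtype code: " + std::to_string(raw));
    }
    return static_cast<DType>(raw);
}

// Hypothetical helper (not part of Mila): size in bytes of one element.
inline std::size_t dtypeElementSize(std::uint32_t raw) {
    switch (checkedDType(raw)) {
        case DType::Float32:  return 4;
        case DType::Float16:  return 2;
        case DType::BFloat16: return 2;
        case DType::Int32:    return 4;
    }
    // Unreachable: checkedDType has already rejected every other code.
    assert(false && "unhandled DType");
    return 0;
}
```

Validating the raw code before the cast keeps a corrupted or newer-format file from producing an out-of-range enum value; whether the real function throws or reports the error another way is not stated here.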

Detailed Description

Reader for Mila pretrained binary format.

Provides direct access to pretrained model weights stored in Mila's flat binary format. Used by fromPretrained() factory methods.

TODO (Alpha.5 Phase 6): Replace the per-tensor fstream read loop with a memory-mapped file (CreateFileMapping/MapViewOfFile). The current approach issues one read per tensor blob (224+ for Llama 3.1 8B), which caps throughput at ~2 GB/s, against the ~7 GB/s that a PCIe 4.0 NVMe drive can sustain (~8 s load vs. a ~2 s target). TensorBlob::data() should return a pointer into the mapped view; the ITensorBlob interface is stable.
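
The mapping approach named in the TODO can be sketched as follows. This is an illustration under stated assumptions, not the planned Mila code: MappedFile, size() and blob() are hypothetical names, and the POSIX mmap branch is included only so the sketch also builds outside Windows (the TODO itself names only the Win32 calls). The file is mapped once, and each blob becomes a bounds-checked pointer into the view instead of a separate read.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#else
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#endif

// Hypothetical RAII wrapper (not part of Mila): maps a whole file read-only.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
#ifdef _WIN32
        HANDLE file = CreateFileA(path.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr,
                                  OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
        if (file == INVALID_HANDLE_VALUE) throw std::runtime_error("open failed: " + path);
        LARGE_INTEGER size{};
        if (!GetFileSizeEx(file, &size) || size.QuadPart == 0) {
            CloseHandle(file);
            throw std::runtime_error("empty or unreadable file: " + path);
        }
        HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
        CloseHandle(file);  // the mapping object keeps its own reference to the file
        if (!mapping) throw std::runtime_error("CreateFileMapping failed: " + path);
        data_ = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
        CloseHandle(mapping);  // the view keeps the mapping object alive
        if (!data_) throw std::runtime_error("MapViewOfFile failed: " + path);
        size_ = static_cast<std::size_t>(size.QuadPart);
#else
        int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);
        struct stat st{};
        if (::fstat(fd, &st) != 0 || st.st_size == 0) {
            ::close(fd);
            throw std::runtime_error("empty or unreadable file: " + path);
        }
        void* p = ::mmap(nullptr, static_cast<std::size_t>(st.st_size),
                         PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        data_ = p;
        size_ = static_cast<std::size_t>(st.st_size);
#endif
    }

    ~MappedFile() {
#ifdef _WIN32
        UnmapViewOfFile(data_);
#else
        ::munmap(data_, size_);
#endif
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    std::size_t size() const { return size_; }

    // Returns a pointer to `length` bytes at `offset`, with no copy and no read call.
    const std::byte* blob(std::size_t offset, std::size_t length) const {
        assert(data_ != nullptr);  // guaranteed by the constructor
        if (offset > size_ || length > size_ - offset) {
            throw std::out_of_range("blob range exceeds mapped file");
        }
        return static_cast<const std::byte*>(data_) + offset;
    }

private:
    void* data_ = nullptr;
    std::size_t size_ = 0;
};
```

In a design like this, a blob's data accessor would hold an offset and length and return `file.blob(offset, length)`, so the pointer is only valid while the MappedFile is alive. How Mila will manage that lifetime is not specified in the TODO.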