HRM-Text is a one-billion-parameter text generation model and pretraining framework built on the Hierarchical Reasoning Model (HRM) architecture. It aims to make foundation model pretraining more accessible by requiring less compute and data than traditional scaling-heavy approaches. The system combines a hierarchical recurrent design, task-completion strengthening, and latent-space reasoning. Its training stack includes PrefixLM sequence packing, FlashAttention 3 kernels, PyTorch FSDP2, evaluation scripts, and checkpoint conversion tools. The repository supports reference pretraining runs for smaller and larger configurations; the FlashAttention 3 attention path expects Hopper-class GPUs. It is useful for researchers and engineers exploring efficient language model pretraining, reasoning-focused architectures, and reproducible foundation model experiments.

Features

  • Hierarchical recurrent model architecture
  • One-billion-parameter text generation model
  • Efficient pretraining framework
  • PrefixLM sequence packing
  • FlashAttention 3 training path
  • Evaluation and checkpoint conversion tools
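
To make the PrefixLM sequence packing feature more concrete, here is a minimal sketch of the attention mask that the general technique implies. It is an illustration only and not taken from the HRM-Text repository: the function name `prefix_lm_packed_mask` and the `(prefix_len, target_len)` segment format are hypothetical. The sketch assumes the usual PrefixLM rule (prefix tokens attend to each other bidirectionally, target tokens attend causally) and that documents packed into one sequence must not attend to each other.

```python
from typing import List, Tuple


def prefix_lm_packed_mask(segments: List[Tuple[int, int]]) -> List[List[bool]]:
    """Build a boolean attention mask for several documents packed into one sequence.

    segments: one (prefix_len, target_len) pair per packed document (hypothetical format).
    Returns mask[q][k], True when query position q may attend to key position k.
    """
    seg_id: List[int] = []      # which document each position belongs to
    is_prefix: List[bool] = []  # whether each position is part of a prefix
    for i, (prefix_len, target_len) in enumerate(segments):
        seg_id += [i] * (prefix_len + target_len)
        is_prefix += [True] * prefix_len + [False] * target_len

    n = len(seg_id)
    mask = [[False] * n for _ in range(n)]
    for q in range(n):
        for k in range(n):
            if seg_id[q] != seg_id[k]:
                continue  # never attend across packed documents
            # Causal attention everywhere, plus full visibility of the prefix.
            mask[q][k] = k <= q or is_prefix[k]
    return mask


if __name__ == "__main__":
    # Two packed documents: (2 prefix + 1 target) and (1 prefix + 2 target).
    for row in prefix_lm_packed_mask([(2, 1), (1, 2)]):
        print("".join("x" if allowed else "." for allowed in row))
```

A production training path would typically express the same rule through the attention kernel's variable-length or masking interface rather than a dense Python list; the dense form is used here only because it is easy to inspect.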
