HRM-Text

1B text generation model based on the HRM architecture

This is an exact mirror of the HRM-Text project, hosted at https://github.com/sapientinc/HRM-Text. SourceForge is not affiliated with HRM-Text.

Last Update: 2026-06-17

Windows Mac Linux BSD ChromeOS

HRM-Text is a one-billion-parameter text generation model and pretraining framework built on the Hierarchical Reasoning Model (HRM) architecture. It aims to make foundation model pretraining more accessible by requiring less compute and data than traditional scaling-heavy approaches. The system combines a hierarchical recurrent design, task-completion strengthening, and latent-space reasoning. Its training stack includes PrefixLM sequence packing, FlashAttention 3 kernels, PyTorch FSDP2, evaluation scripts, and checkpoint conversion tools. The repository supports reference pretraining runs for smaller and larger configurations; the attention path expects Hopper-class GPUs. It is aimed at researchers and engineers exploring efficient language model pretraining, reasoning-focused architectures, and reproducible foundation model experiments.
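To make "hierarchical recurrent design" more concrete, here is a toy sketch of a two-timescale recurrence: a fast low-level state is updated several times for every single update of a slow high-level state. This is an illustration only, not code from the HRM-Text repository. The function name, the scalar states, the update rules, and the step counts are all assumptions made up for this example.

```python
import math


def hierarchical_recurrence(x, high_steps=2, low_steps=3):
    """Toy two-timescale recurrence (hypothetical, scalar states only).

    The low-level state is refined `low_steps` times per high-level step,
    and the high-level state is updated once after each inner loop.
    """
    z_low, z_high = 0.0, 0.0
    trace = []  # records the order in which the two states are updated
    for _ in range(high_steps):
        for _ in range(low_steps):
            # Fast update: conditioned on the input and the current slow state.
            z_low = math.tanh(0.5 * z_low + 0.5 * z_high + x)
            trace.append("low")
        # Slow update: consumes the result of the inner loop.
        z_high = math.tanh(0.5 * z_high + z_low)
        trace.append("high")
    return z_high, trace


z_high, trace = hierarchical_recurrence(0.3)
print(trace)
```

In a real model the states would be tensors and the updates would be learned network blocks; the sketch only shows the nesting of the two loops.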

Features

Hierarchical recurrent model architecture
One-billion-parameter text generation model
Efficient pretraining framework
PrefixLM sequence packing
FlashAttention 3 training path
Evaluation and checkpoint conversion tools
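
As an illustration of the "PrefixLM sequence packing" feature, the following minimal sketch builds the attention mask for several documents packed into one sequence. It assumes the usual PrefixLM convention: prefix tokens are visible to every token of the same document, the remaining tokens attend causally, and documents never attend to each other. The function name and the `(prefix_len, total_len)` input format are hypothetical and are not taken from the HRM-Text code.

```python
def prefix_lm_packed_mask(segments):
    """Build a boolean attention mask for packed PrefixLM documents.

    segments: list of (prefix_len, total_len) pairs, one per packed document.
    Returns mask[q][k], True when query position q may attend to key position k.
    """
    seg_id, is_prefix = [], []
    for doc, (prefix_len, total_len) in enumerate(segments):
        if not 0 <= prefix_len <= total_len:
            raise ValueError("prefix_len must lie between 0 and total_len")
        seg_id += [doc] * total_len
        is_prefix += [True] * prefix_len + [False] * (total_len - prefix_len)

    n = len(seg_id)
    return [
        [
            # Same document, and either causal (k <= q) or a prefix key.
            seg_id[q] == seg_id[k] and (k <= q or is_prefix[k])
            for k in range(n)
        ]
        for q in range(n)
    ]


# Two packed documents: lengths 3 and 2, with prefixes of 2 and 1 tokens.
mask = prefix_lm_packed_mask([(2, 3), (1, 2)])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

A production training path would typically express the same rule through per-document sequence lengths passed to a fused attention kernel rather than a dense mask; the dense form is used here only because it is easy to inspect.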

License

Apache License V2.0

Additional Project Details

Programming Language

Related Categories

Python AI Models

Registered

2026-06-16

Recommended Projects

Kimi K2.5
Moonshot's most powerful AI model
GLM-5.1
GLM-5: From Vibe Coding to Agentic Engineering
Qwen3-Omni
Qwen3-omni is a natively end-to-end, omni-modal LLM
Megatron-LM
Ongoing research training transformer models at scale
VibeThinker
Diversity-driven optimization and large-model reasoning ability