Coconut

Training Large Language Models to Reason in a Continuous Latent Space

This is an exact mirror of the Coconut project, hosted at https://github.com/facebookresearch/coconut. SourceForge is not affiliated with Coconut.

Coconut is the official PyTorch implementation of the research paper “Training Large Language Models to Reason in a Continuous Latent Space.” The framework adds continuous latent reasoning steps to large language models (LLMs), so that a model can generate and refine a reasoning chain within a learned latent space rather than relying solely on discrete, token-level reasoning. It supports training under several reasoning paradigms, including standard Chain-of-Thought (CoT), no-thought, and hybrid configurations, with configurable training stages and latent representations. The repository is built on Hugging Face Transformers and PyTorch Distributed, with Weights & Biases (wandb) for logging, and supports large-scale experiments on mathematical and logical reasoning datasets such as GSM8K, ProntoQA, and ProsQA.
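
The contrast between discrete and latent reasoning steps can be illustrated with a toy sketch. This is not the repository's code: the two-dimensional "model", the weights, and every function name below are hypothetical, chosen only to show the difference between collapsing each step to a token and feeding the hidden state straight back in as the next input.

```python
import math

# Toy stand-in for a language model. All names and numbers are hypothetical
# and exist only for illustration; this is not the Coconut implementation.
EMBED = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # one 2-d embedding per token id
W = [[0.6, -0.4], [0.3, 0.8]]                 # fixed toy weights


def forward(x):
    """Map an input embedding to a 'last hidden state' (tanh of W @ x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def decode(hidden):
    """Return the token id whose embedding scores highest against hidden."""
    scores = [sum(e * h for e, h in zip(emb, hidden)) for emb in EMBED]
    return scores.index(max(scores))


def discrete_steps(token_id, n_steps):
    """CoT-style loop: every step is collapsed to a token id, then re-embedded."""
    tokens = []
    x = EMBED[token_id]
    for _ in range(n_steps):
        token_id = decode(forward(x))
        tokens.append(token_id)
        x = EMBED[token_id]
    return tokens


def latent_steps(token_id, n_steps):
    """Latent-style loop: the hidden state is fed back directly as the next input."""
    x = EMBED[token_id]
    for _ in range(n_steps):
        x = forward(x)  # no decoding between steps
    return x
```

Calling `discrete_steps(0, 3)` returns a list of three token ids, while `latent_steps(0, 3)` returns a continuous vector that is decoded only if and when a token is needed, for example with `decode(latent_steps(0, 3))`.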

Features

Reproducible experiment scripts matching the paper’s benchmark protocols
Supports distributed multi-GPU training with torchrun and mixed-precision (bf16)
Dataset preprocessing tools for GSM8K, ProntoQA, and ProsQA
Integrated wandb logging and checkpoint management across training stages
Modular YAML-based configuration for multi-stage training and evaluation
Implements continuous latent reasoning for LLMs beyond discrete CoT prompting
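
A minimal sketch of how a multi-stage schedule of this kind might be expressed, assuming a curriculum in which each later stage replaces one more written reasoning step with a fixed number of latent placeholders. The field names, the `<latent>` marker, and the helper functions are hypothetical and do not reflect the repository's actual YAML schema.

```python
from dataclasses import dataclass


@dataclass
class StageConfig:
    # Hypothetical field names for illustration; not the repository's schema.
    epochs_per_stage: int = 3   # epochs spent in each training stage
    latents_per_step: int = 2   # latent placeholders per replaced CoT step
    max_latent_stage: int = 3   # stage index at which the schedule stops growing


def stage_for_epoch(epoch, cfg):
    """Return the training stage for a zero-based epoch, capped at the maximum."""
    return min(epoch // cfg.epochs_per_stage, cfg.max_latent_stage)


def build_example(question, steps, answer, stage, cfg):
    """Replace the first `stage` reasoning steps with latent placeholders."""
    n_replaced = min(stage, len(steps))
    latents = ["<latent>"] * (n_replaced * cfg.latents_per_step)
    return [question] + latents + steps[n_replaced:] + [answer]
```

With the defaults above, stage 0 leaves the chain of thought untouched, and `build_example("Q", ["s1", "s2", "s3"], "A", 1, StageConfig())` yields `["Q", "<latent>", "<latent>", "s2", "s3", "A"]`.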

Categories

Large Language Models (LLM)

License

MIT License

Additional Project Details

Operating Systems

Linux

Programming Language

Python, Unix Shell

Related Categories

Unix Shell Large Language Models (LLM), Python Large Language Models (LLM)

Registered

2025-10-08
