GPT-2

Code for the paper "Language Models are Unsupervised Multitask Learners"

This is an exact mirror of the GPT-2 project, hosted at https://github.com/openai/gpt-2. SourceForge is not affiliated with GPT-2.

Last Update: 2025-09-26

This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” It is intended as a starting point for researchers and engineers who want to experiment with GPT-2: generate text, fine-tune on custom datasets, explore model behavior, or study its internals. The repository includes scripts for sampling (conditional, unconditional, and interactive), training, and downloading pre-trained models, along with utilities for tokenization and model handling. Training supports memory-saving gradient techniques.
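As a rough illustration of what conditional and unconditional generation involve, the sketch below samples tokens one at a time using top-k truncation with temperature. It is not taken from the repository's scripts: `top_k_sample`, `generate`, and the `next_logits` callable (a stand-in for a model's forward pass) are hypothetical names, and top-k with temperature is assumed here only because it is a common sampling scheme.

```python
import math
import random


def top_k_sample(logits, k=40, temperature=1.0, rng=random):
    """Sample a token id from `logits` after top-k truncation and temperature scaling."""
    if k <= 0 or temperature <= 0:
        raise ValueError("k and temperature must be positive")
    # Keep only the k highest-scoring token ids.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the kept logits; subtract the max for numerical stability.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


def generate(next_logits, prompt_ids, length, **kwargs):
    """Extend `prompt_ids` by `length` sampled tokens.

    `next_logits` is a hypothetical callable mapping the ids so far to a list
    of logits. Conditional sampling passes a real prompt; unconditional
    sampling would pass only a start token.
    """
    ids = list(prompt_ids)
    for _ in range(length):
        ids.append(top_k_sample(next_logits(ids), **kwargs))
    return ids
```

With `k=1` the procedure reduces to greedy decoding, which makes it easy to check against a toy `next_logits` before swapping in a real model.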

Features

Pretrained model weights for multiple GPT-2 sizes (e.g. 117M, 345M, up to 1.5B parameters)
Sampling / generation scripts (conditional, unconditional, interactive)
Tokenizer and encoding / decoding utilities
Training / fine-tuning script support (for smaller models)
Support for memory-saving gradient techniques / optimizations during training
Utilities to download / manage model checkpoints via script
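
To make the tokenizer entry above more concrete, here is a minimal sketch of byte-pair-encoding style merging, the general family of technique GPT-2's tokenizer belongs to. It is an illustration under stated assumptions, not the repository's encoder: `bpe_encode`, `bpe_decode`, and the tiny merge table in the usage note are hypothetical, and the sketch works on characters rather than bytes.

```python
def bpe_encode(word, merges):
    """Split `word` into symbols by repeatedly applying the best-ranked merge.

    `merges` is an ordered list of symbol pairs; earlier pairs have priority.
    """
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no adjacent pair is mergeable
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


def bpe_decode(symbols):
    """Invert `bpe_encode` by concatenating the symbols."""
    return "".join(symbols)
```

For example, with the toy merge table `[("l", "o"), ("lo", "w"), ("e", "r")]`, `bpe_encode("lower", ...)` yields `["low", "er"]`, and `bpe_decode` restores `"lower"`.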

Categories

Artificial Intelligence

License

MIT License

Additional Project Details

Programming Language

Related Categories

Python Artificial Intelligence Software

Registered

2025-09-26