This repository contains the code and model weights for GPT-2, a large-scale unsupervised language model described in the OpenAI paper “Language Models are Unsupervised Multitask Learners.” It is intended as a starting point for researchers and engineers who want to experiment with GPT-2: generate text, fine-tune on custom datasets, explore model behavior, or study the model's internals. The repository includes scripts for sampling, training, and downloading pre-trained models, along with utilities for tokenization and model handling.

Features

  • Pretrained model weights for multiple GPT-2 sizes (e.g. 117M, 345M, up to 1.5B parameters)
  • Sampling / generation scripts (conditional, unconditional, interactive)
  • Tokenizer and encoding / decoding utilities
  • Training / fine-tuning script support (for smaller models)
  • Support for memory-saving gradient techniques / optimizations during training
  • Utilities to download / manage model checkpoints via script
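To make the sampling feature more concrete, here is a minimal sketch of temperature scaling combined with top-k truncation, a common way to pick the next token from a language model's output logits. It is an illustration in plain Python, not the repository's own sampling code: the function name top_k_sample and its parameters are hypothetical, and the logits are a hand-written list rather than real model output.

```python
import math
import random


def top_k_sample(logits, k=40, temperature=1.0, rng=None):
    """Sample one token id from `logits` (hypothetical helper, not the repo's API).

    logits      -- list of unnormalized scores, one per vocabulary entry
    k           -- keep only the k highest-scoring tokens (k <= 0 keeps all)
    temperature -- values below 1.0 sharpen the distribution, above 1.0 flatten it
    rng         -- optional random.Random instance for reproducible draws
    """
    if not logits:
        raise ValueError("logits must not be empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Treat a non-positive or oversized k as "no truncation".
    if k <= 0 or k > len(logits):
        k = len(logits)

    # Indices of the k largest logits, highest first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature-scaled softmax weights, shifted by the max for numerical stability.
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]

    # random.choices normalizes the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [0.1, 5.0, 2.0, -1.0]  # made-up scores for a 4-token vocabulary
    print(top_k_sample(toy_logits, k=2, temperature=0.7, rng=random.Random(0)))
```

In a generation loop, the chosen token id would be appended to the context and the model queried again. Conditional generation would start that loop from an encoded prompt, while unconditional generation would start from a start-of-text token.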
