DiffRhythm is an open-source, diffusion-based model for generating full-length songs. It uses a latent diffusion architecture to produce coherent, high-quality, long-form music. The model is available on Hugging Face, where users can try a demo or download the model for their own use. DiffRhythm provides tools for both training and inference, which makes it suitable for AI-based music production and for research in music generation.

Features

  • Diffusion-based model for full-length song generation.
  • Open source.
  • Supports fast and simple end-to-end song creation.
  • Focuses on rhythm and musicality with advanced audio processing.
  • Includes models such as DiffRhythm-base and DiffRhythm-vae.
  • Compatible with Hugging Face for model deployment.
  • Easy environment setup with installation scripts for dependencies.
  • Provides a demo and online serving through Hugging Face Space.
  • Future plans include local deployment, Colab support, and Docker integration.
