pyVideoTrans is an open-source multimedia processing project that combines speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly in a single pipeline for converting a video from one language to another, with dubbing and captions embedded. It runs speech-to-text models to transcribe the audio track, translates the resulting text into the target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and merges that speech back into the video to produce a fully localized media file. The tool offers both command-line and GUI modes, so it suits developers and creatives who need batch or automated processing.
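
The four stages described above (transcribe, translate, synthesize, merge) can be pictured as a chain of functions. The sketch below is only an illustration of that flow and does not use pyVideoTrans's actual API: `Segment`, `translate_video`, and the four stage callables are hypothetical names, and each stage is passed in as a plain function so any local or cloud backend could stand behind it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """One timed piece of speech; start and end are in seconds."""
    start: float
    end: float
    text: str


# Hypothetical stage signatures. None of these names come from pyVideoTrans.
Transcriber = Callable[[str], List[Segment]]   # video path -> source-language segments
Translator = Callable[[str, str], str]         # (text, target language) -> translated text
Synthesizer = Callable[[Segment], str]         # translated segment -> path of an audio clip
Muxer = Callable[[str, List[Segment], List[str]], str]  # (video, subtitles, clips) -> output path


def translate_video(
    video_path: str,
    target_lang: str,
    transcribe: Transcriber,
    translate: Translator,
    synthesize: Synthesizer,
    mux: Muxer,
) -> str:
    """Run the four stages in order and return the path of the localized file."""
    # 1. Speech-to-text: timed segments in the source language.
    segments = transcribe(video_path)
    # 2. Translation: keep the timing, replace the text.
    translated = [
        Segment(s.start, s.end, translate(s.text, target_lang)) for s in segments
    ]
    # 3. Voice synthesis: one audio clip per translated segment.
    clips = [synthesize(s) for s in translated]
    # 4. Merge the dubbed audio and subtitles back into the video.
    return mux(video_path, translated, clips)
```

Keeping the stages as injected callables is a design choice of this sketch, not a claim about how the project is organized; it simply makes it easy to swap one backend for another or to insert a review step between stages.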

Features

  • Automatic end-to-end video translation pipeline
  • Speech transcription and subtitle generation
  • AI voice synthesis and dubbing
  • Interactive proofreading support
  • Speaker diarization and role assignment
  • Local and cloud model backends
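
As a small illustration of the subtitle generation feature, the sketch below turns timed text into SubRip (SRT) cues. It assumes SRT as the output format, which the description above does not state, and `srt_timestamp` and `to_srt` are hypothetical helper names rather than functions from pyVideoTrans.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)
```

For example, `to_srt([(0.0, 1.5, "Bonjour")])` yields a single cue numbered 1 that runs from `00:00:00,000` to `00:00:01,500`.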
