Automatically fix metadata and rename audio files in your music library. Runs nightly as a systemd service or on-demand from the command line.
For every audio file in your music folder (recursively):
- Skip if already processed — tracks files by checksum, so re-runs are fast
- AcoustID fingerprint — identifies the song from the audio waveform itself (most accurate)
- MusicBrainz search — looks up artist + title extracted from the filename
- Filename parser fallback — uses the filename if nothing is found online
Then it writes correct metadata (title, artist, album, year) and renames the file to Artist-Title.ext.
xXx_radiohead_creep_OFFICIAL_2023_[HD].mp3 --> Radiohead-Creep.mp3
01 - unknown artist - no title.flac --> Radiohead-Creep.flac
some_random_hash_a8f3e2.ogg --> Massive Attack-Teardrop.ogg
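As a rough illustration of the final renaming step, the sketch below builds an Artist-Title.ext name from looked-up metadata. The function name `target_name`, the comma used to join multiple artists, and the set of stripped characters are assumptions made for this example, not the project's actual code.

```python
from pathlib import Path

# Characters that are invalid in filenames on at least one common filesystem.
# This particular set is an assumption for the sketch.
_UNSAFE = set('<>:"/\\|?*')


def _clean(text: str) -> str:
    """Drop unsafe characters and surrounding whitespace."""
    return "".join(ch for ch in text if ch not in _UNSAFE).strip()


def target_name(path: Path, artists: list[str], title: str) -> str:
    """Return an Artist-Title.ext filename, keeping the original extension."""
    artist = _clean(", ".join(artists))
    return f"{artist}-{_clean(title)}{path.suffix.lower()}"


print(target_name(Path("some_random_hash_a8f3e2.ogg"), ["Massive Attack"], "Teardrop"))
# Massive Attack-Teardrop.ogg
```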
%%{init: {'theme': 'default'}}%%
graph LR
tagger["music_tagger.py<br/>Orchestrator"]:::core
subgraph modules["Source Modules"]
direction TB
config["config.py<br/>Configuration"]:::data
parser["parser.py<br/>Filename Parsing"]:::engine
lookup["lookup.py<br/>API Lookup"]:::engine
tags["tags.py<br/>Metadata R/W"]:::engine
state["state.py<br/>Checksum Tracking"]:::data
end
subgraph external["External Services"]
direction TB
acoustid["AcoustID API"]:::ext
musicbrainz["MusicBrainz API"]:::ext
audio_files[("Audio Files")]:::ext
state_file[("processed.json")]:::ext
end
tagger --> config
tagger --> parser
tagger --> lookup
tagger --> tags
tagger --> state
lookup -->|"fingerprint"| acoustid
lookup -->|"metadata"| musicbrainz
tags -->|"read/write tags"| audio_files
tags -->|"rename"| audio_files
state -->|"load/save"| state_file
classDef core fill:#2563eb,stroke:#1d4ed8,color:#fff
classDef data fill:#d97706,stroke:#b45309,color:#fff
classDef ext fill:#6b7280,stroke:#4b5563,color:#fff
classDef engine fill:#059669,stroke:#047857,color:#fff
MP3, FLAC, M4A, AAC, OGG, Opus, WMA
- Linux (tested on Ubuntu/Xubuntu 22.04+)
- Python 3.11+
- ffmpeg and chromaprint-tools (installed automatically by install.sh)
Note: The Python code is cross-platform. The automated install.sh is Linux-specific, but step-by-step guides for macOS and Windows are provided below.
git clone https://github.com/AndreaBonn/audio-filename-fixer.git
cd audio-filename-fixer
# Install everything — requires sudo for apt packages
bash install.sh /path/to/your/music
The installer handles everything:
- Installs system dependencies (ffmpeg, chromaprint-tools)
- Installs uv if not present, then syncs the Python environment
- Creates config.env with your music directory
- Sets up a systemd user service with a nightly timer (see Scheduling)
- Runs a dry-run test to verify the setup
If you prefer to set things up yourself on Linux:
cd audio-filename-fixer
# Install system dependencies
sudo apt-get install -y ffmpeg chromaprint-tools
# Create Python environment
uv sync
# Create config file
cp .env.example config.env
# Edit config.env with your settings
macOS step-by-step guide
Homebrew is a package manager for macOS — think of it as an "app store for developer tools". Open Terminal (you can find it in Applications > Utilities, or search for it with Spotlight) and paste:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Follow the on-screen instructions. When it finishes, close and reopen Terminal.
To verify it worked, type:
brew --version
You should see something like Homebrew 4.x.x.
Still in Terminal, run:
brew install ffmpeg chromaprint
This installs ffmpeg (audio decoder) and fpcalc (audio fingerprinting tool). It may take a few minutes.
Verify both are installed:
ffmpeg -version
fpcalc -version
Both commands should print version information (not "command not found").
curl -LsSf https://astral.sh/uv/install.sh | sh
Close and reopen Terminal, then verify:
uv --version
git clone https://github.com/AndreaBonn/audio-filename-fixer.git
cd audio-filename-fixer
uv sync
cp .env.example config.env
Now open config.env with any text editor (TextEdit, VS Code, nano...) and set your music folder path:
nano config.env
Change MUSIC_DIR to point to your music folder, for example:
MUSIC_DIR=/Users/yourname/Music
Save and close (in nano: Ctrl+O, Enter, Ctrl+X).
uv run python music_tagger.py --dry-run --music-dir ~/Music
This runs in preview mode: it shows what would change without touching any file. If you see output listing your audio files, everything works.
When you're satisfied with the dry-run output:
uv run python music_tagger.py
macOS uses launchd instead of systemd. To run the tagger every night at 3:00 AM:
- Create the file ~/Library/LaunchAgents/com.music-tagger.plist:
mkdir -p ~/Library/LaunchAgents
cat > ~/Library/LaunchAgents/com.music-tagger.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.music-tagger</string>
<key>ProgramArguments</key>
<array>
<string>/bin/bash</string>
<string>-c</string>
<string>cd "$HOME/audio-filename-fixer" &amp;&amp; source config.env &amp;&amp; .venv/bin/python music_tagger.py</string>
</array>
<key>StartCalendarInterval</key>
<dict>
<key>Hour</key>
<integer>3</integer>
<key>Minute</key>
<integer>0</integer>
</dict>
<key>StandardOutPath</key>
<string>/tmp/music-tagger.log</string>
<key>StandardErrorPath</key>
<string>/tmp/music-tagger.log</string>
</dict>
</plist>
EOF
- Enable it:
launchctl load ~/Library/LaunchAgents/com.music-tagger.plist
- To disable it later:
launchctl unload ~/Library/LaunchAgents/com.music-tagger.plist
Windows step-by-step guide
- Go to python.org/downloads and download the latest Python installer
- Run the installer
- Important: check the box "Add Python to PATH" at the bottom of the first screen
- Click "Install Now"
To verify, open PowerShell (search for it in the Start menu) and type:
python --version
You should see Python 3.11.x or higher.
- Go to git-scm.com/downloads/win and download the installer
- Run it with default settings (click "Next" through all screens)
Verify in PowerShell:
git --version
- Go to gyan.dev/ffmpeg/builds and download "ffmpeg-release-essentials.zip"
- Extract the zip file to C:\ffmpeg (create this folder if it doesn't exist)
- Inside you'll find a folder like ffmpeg-7.x-essentials_build; open it and go into the bin folder
- Copy the full path to the bin folder (e.g., C:\ffmpeg\ffmpeg-7.1-essentials_build\bin)
- Add it to your PATH:
  - Press Win + R, type sysdm.cpl, press Enter
  - Go to the "Advanced" tab, click "Environment Variables"
  - Under "User variables", find "Path", select it, click "Edit"
  - Click "New" and paste the path to the bin folder
  - Click "OK" on all windows
Close and reopen PowerShell, then verify:
ffmpeg -version
- Go to acoustid.org/chromaprint and download the Windows package
- Extract the zip file
- Copy fpcalc.exe to the same bin folder where you put ffmpeg (e.g., C:\ffmpeg\ffmpeg-7.1-essentials_build\bin), so it's already in your PATH
Verify:
fpcalc -version
In PowerShell, run:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
Close and reopen PowerShell, then verify:
uv --version
git clone https://github.com/AndreaBonn/audio-filename-fixer.git
cd audio-filename-fixer
uv sync
copy .env.example config.env
Open config.env with Notepad:
notepad config.env
Change MUSIC_DIR to point to your music folder, for example:
MUSIC_DIR=C:\Users\YourName\Music
Save and close Notepad.
In PowerShell, you need to load the environment variables from config.env before running:
Get-Content config.env | ForEach-Object {
if ($_ -match '^([^#][^=]+)=(.*)$') {
[Environment]::SetEnvironmentVariable($Matches[1], $Matches[2], 'Process')
}
}
uv run python music_tagger.py --dry-run
This runs in preview mode: it shows what would change without touching any file.
Load the config and run (same two commands, without --dry-run):
Get-Content config.env | ForEach-Object {
if ($_ -match '^([^#][^=]+)=(.*)$') {
[Environment]::SetEnvironmentVariable($Matches[1], $Matches[2], 'Process')
}
}
uv run python music_tagger.py
Tip: To avoid typing the config loading command every time, you can create a shortcut file. Save this as run.ps1 in the project folder:
# run.ps1 — Run the music tagger on Windows
Get-Content "$PSScriptRoot\config.env" | ForEach-Object {
if ($_ -match '^([^#][^=]+)=(.*)$') {
[Environment]::SetEnvironmentVariable($Matches[1], $Matches[2], 'Process')
}
}
uv run python "$PSScriptRoot\music_tagger.py" @args
Then run it with: powershell -File run.ps1 or powershell -File run.ps1 --dry-run.
- Open Task Scheduler (search for it in the Start menu)
- Click "Create Basic Task" in the right panel
- Name: Music Filename-Fixer & Auto-Tagger, click Next
- Trigger: Daily, click Next
- Set the time to 3:00 AM, click Next
- Action: Start a program, click Next
- Program: powershell
- Arguments: -ExecutionPolicy Bypass -File "C:\Users\YourName\audio-filename-fixer\run.ps1"
- Click Finish
To test it immediately: right-click the task and select "Run".
Edit config.env after installation:
MUSIC_DIR=/home/user/Music
ACOUSTID_API_KEY=your-key-here

| Variable | Required | Description |
|---|---|---|
| MUSIC_DIR | Yes | Path to your music folder (scanned recursively) |
| ACOUSTID_API_KEY | No | AcoustID API key for audio fingerprinting |
| STATE_FILE | No | Path to state file (default: state/processed.json) |
| LOG_FILE | No | Path to log file (default: logs/tagger.log) |
AcoustID identifies songs from the audio waveform — it works even when the filename is completely wrong or meaningless. Without it, the tagger relies only on filename parsing and MusicBrainz text search.
- Go to acoustid.org and create a free account
- Register a new application
- Copy the API key into config.env
# Dry run — preview changes without modifying any file
bash run.sh --dry-run
# Run normally — fix tags and rename files
bash run.sh
# Force reprocessing of all files (ignores state)
bash run.sh --reset-state
# Process a different folder (temporary override)
bash run.sh --music-dir /other/path
- The tagger scans MUSIC_DIR recursively for audio files
- Already-processed files (tracked by SHA-1 checksum) are skipped
- Files with complete tags and a clean filename are marked as done
- For each remaining file, it tries the 3-step lookup pipeline
- On success: writes metadata tags and renames the file
- On failure: logs a warning and moves to the next file
- State is saved to processed.json for future runs
Always run with --dry-run first on a new music folder. It shows exactly what would change without touching any file:
2024-03-15 10:30:01 [INFO] → xXx_radiohead_creep_HD.mp3
2024-03-15 10:30:02 [INFO] AcoustID match (score=0.95): Radiohead - Creep
2024-03-15 10:30:02 [INFO] [DRY RUN] Tag: {'title': 'Creep', 'artists': ['Radiohead'], ...}
2024-03-15 10:30:02 [INFO] [DRY RUN] Rename: xXx_radiohead_creep_HD.mp3 -> Radiohead-Creep.mp3
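One common way to get this behaviour is to gate every write behind a flag, so a dry run logs what would happen and leaves the file alone. The sketch below shows the idea for the rename step only; `rename_file` and the logger name are hypothetical, and the real tagger may structure this differently.

```python
import logging
from pathlib import Path

log = logging.getLogger("tagger")


def rename_file(path: Path, new_name: str, dry_run: bool) -> Path:
    """Rename path to new_name in the same folder, or only log it in dry-run mode."""
    target = path.with_name(new_name)
    if dry_run:
        # Nothing on disk changes; the caller gets the original path back.
        log.info("[DRY RUN] Rename: %s -> %s", path.name, target.name)
        return path
    path.rename(target)
    return target
```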
The install.sh script sets up a systemd user timer that runs the tagger every night at 03:00. If the machine was off at that time, it runs as soon as it boots (thanks to Persistent=true).
# Check timer status
systemctl --user status music-tagger.timer
# View next scheduled run
systemctl --user list-timers music-tagger.timer
# Manually trigger the service
systemctl --user start music-tagger.service
# Disable automatic runs
systemctl --user disable --now music-tagger.timer
# Re-enable automatic runs
systemctl --user enable --now music-tagger.timer
If you prefer cron over systemd:
# Edit your crontab
crontab -e
# Add this line to run every night at 3:00 AM
0 3 * * * cd /path/to/audio-filename-fixer && bash run.sh >> logs/cron.log 2>&1
audio-filename-fixer/
├── music_tagger.py # Entry point — orchestrates the pipeline
├── src/
│ ├── config.py # Centralized configuration (dataclass + env vars)
│ ├── lookup.py # AcoustID + MusicBrainz API integration
│ ├── parser.py # Filename parsing, slugify, artist splitting
│ ├── state.py # Checksum tracking + atomic JSON persistence
│ └── tags.py # Read/write audio metadata (mutagen) + rename
├── tests/ # Mirrors src/ — 65 tests
│ ├── test_config.py
│ ├── test_lookup.py
│ ├── test_parser.py
│ ├── test_state.py
│ └── test_tags.py
├── install.sh # One-command setup (deps + venv + systemd)
├── run.sh # Manual run wrapper
├── config.env # Your configuration (gitignored)
├── .env.example # Configuration template
├── pyproject.toml # Project config (uv, ruff, pytest)
├── logs/
│ └── tagger.log # All operations logged here
└── state/
└── processed.json # Tracks processed files by checksum
%%{init: {'theme': 'default'}}%%
graph TD
scan(["Scan audio file"]):::core
check_state{"Already processed?<br/>checksum match"}
skip_done(["Skip"]):::data
read_tags["Read existing tags"]:::engine
check_tags{"Tags complete AND<br/>filename OK?"}
mark_done(["Mark done, skip"]):::data
acoustid{"AcoustID fingerprint<br/>score #gt;= 0.5?"}:::engine
mb_search{"MusicBrainz search<br/>score #gt;= 70?"}:::engine
fallback["Filename parser<br/>fallback"]:::engine
write_tags["Write tags + rename"]:::core
log_warn(["Log warning, skip"]):::ext
save_state["Update state"]:::data
scan --> check_state
check_state -->|"Yes"| skip_done
check_state -->|"No"| read_tags
read_tags --> check_tags
check_tags -->|"Yes"| mark_done
check_tags -->|"No"| acoustid
acoustid -->|"Yes"| write_tags
acoustid -->|"No"| mb_search
mb_search -->|"Yes"| write_tags
mb_search -->|"No"| fallback
fallback -->|"Found"| write_tags
fallback -->|"Failed"| log_warn
write_tags --> save_state
classDef core fill:#2563eb,stroke:#1d4ed8,color:#fff
classDef data fill:#d97706,stroke:#b45309,color:#fff
classDef ext fill:#6b7280,stroke:#4b5563,color:#fff
classDef engine fill:#059669,stroke:#047857,color:#fff
sequenceDiagram
participant mt as music_tagger
participant lk as lookup
participant fp as fpcalc
participant ac as AcoustID API
participant mb as MusicBrainz API
participant ps as parser
mt->>+lk: acoustid_lookup(path)
lk->>+fp: calculate fingerprint
fp-->>-lk: fingerprint data
lk->>+ac: fingerprint + api_key
ac-->>-lk: recording_id or error
alt score >= 0.5
lk->>+mb: get recording details
mb-->>-lk: title, artists, album, year
lk-->>-mt: metadata found
else AcoustID failed
lk-->>mt: no result
mt->>+lk: mb_search(artists, title)
lk->>+mb: text search query
alt score >= 70
mb-->>-lk: title, artists, album, year
lk-->>-mt: metadata found
else MusicBrainz failed
mb-->>lk: no match
lk-->>mt: no result
mt->>+ps: parse_filename(stem)
ps-->>-mt: parsed artists + title
end
end
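The fallback order in the two diagrams can be condensed into one function. This is a sketch under assumptions: the names `acoustid_lookup`, `mb_search` and `parse_filename` come from the sequence diagram, but their signatures, the dictionary shape of a match, and the `resolve` helper itself are invented here for illustration. Only the thresholds (0.5 and 70) are taken from the diagrams.

```python
from pathlib import Path
from typing import Callable, Optional

ACOUSTID_MIN_SCORE = 0.5    # threshold shown in the diagrams
MUSICBRAINZ_MIN_SCORE = 70  # threshold shown in the diagrams

Match = dict  # assumed shape: {"artists": [...], "title": str, "score": number}
Parsed = tuple[list[str], str]  # assumed shape: (artists, title)


def resolve(
    path: Path,
    acoustid_lookup: Callable[[Path], Optional[Match]],
    mb_search: Callable[[list[str], str], Optional[Match]],
    parse_filename: Callable[[str], Optional[Parsed]],
) -> tuple[Optional[Match], str]:
    """Return (metadata, source), trying each lookup in order of accuracy."""
    hit = acoustid_lookup(path)
    if hit and hit["score"] >= ACOUSTID_MIN_SCORE:
        return hit, "acoustid"

    # The text search needs an artist and title guessed from the filename.
    parsed = parse_filename(path.stem)
    if parsed is None:
        return None, "failed"
    artists, title = parsed

    hit = mb_search(artists, title)
    if hit and hit["score"] >= MUSICBRAINZ_MIN_SCORE:
        return hit, "musicbrainz"

    # Last resort: use what the filename parser produced.
    return {"artists": artists, "title": title}, "filename"
```

Passing the three lookups in as callables keeps the ordering logic testable without network access.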
The filename parser handles common patterns from YouTube downloads, ripped CDs, and messy libraries:
| Input | Parsed Artist | Parsed Title |
|---|---|---|
| Radiohead - Creep | Radiohead | Creep |
| Drake feat. Rihanna - Take Care | Drake, Rihanna | Take Care |
| Simon & Garfunkel - The Sound of Silence | Simon & Garfunkel | The Sound of Silence |
| 01. Radiohead - Creep [Official Video] | Radiohead | Creep |
The parser splits feat./ft. collaborations into separate artists while keeping band names that contain & intact (e.g., Simon & Garfunkel stays as one artist).
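A parser with this behaviour can be approximated with a few regular expressions. The sketch below reproduces the four rows of the table; the regexes and the return shape are assumptions for illustration and are not taken from parser.py, which likely covers more patterns.

```python
import re
from typing import Optional

# Leading track numbers such as "01. " or "7 - " (assumed pattern).
_TRACK_NUMBER = re.compile(r"^\s*\d{1,3}\s*[.\-)]\s*")
# Bracketed noise such as "[Official Video]" or "(HD)".
_BRACKETED = re.compile(r"\s*[\[(][^\])]*[\])]")
# Collaboration markers; "&" is deliberately not a separator.
_FEAT = re.compile(r"\s+(?:feat\.?|ft\.?)\s+", re.IGNORECASE)


def parse_filename(stem: str) -> Optional[tuple[list[str], str]]:
    """Return (artists, title) for an 'Artist - Title' stem, or None if it does not fit."""
    text = _TRACK_NUMBER.sub("", stem)
    text = _BRACKETED.sub("", text).strip()
    if " - " not in text:
        return None
    artist_part, title = (part.strip() for part in text.split(" - ", 1))
    artists = [a.strip() for a in _FEAT.split(artist_part) if a.strip()]
    if not artists or not title:
        return None
    return artists, title
```

A stem with no " - " separator returns None, which is the case where the tagger would log a warning and move on.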
- Each processed file is tracked by its path and a SHA-1 checksum (first 64KB)
- If a file is modified externally, the checksum changes and it gets reprocessed
- State is written atomically (write to .tmp, then rename) to prevent corruption
- --reset-state clears the state and reprocesses everything
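The two mechanisms described above, a checksum over the first 64KB and a write-then-rename save, could look roughly like this. The function names and the flat path-to-checksum dictionary are assumptions for the sketch; state.py may store more than this.

```python
import hashlib
import json
import os
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # only the first 64KB is hashed, which keeps re-runs fast


def quick_checksum(path: Path) -> str:
    """SHA-1 of the first 64KB of the file (change detection, not security)."""
    with path.open("rb") as fh:
        return hashlib.sha1(fh.read(CHUNK_SIZE)).hexdigest()


def save_state(state: dict[str, str], state_file: Path) -> None:
    """Write the state to a .tmp sibling, then swap it into place."""
    state_file.parent.mkdir(parents=True, exist_ok=True)
    tmp = state_file.with_name(state_file.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    # os.replace is atomic when source and target are on the same filesystem,
    # so a crash leaves either the old file or the new one, never a partial write.
    os.replace(tmp, state_file)
```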
| Problem | Solution |
|---|---|
| fpcalc not found | Install chromaprint: sudo apt-get install chromaprint-tools |
| AcoustID not matching | Check your API key in config.env. The tool still works without it (text search fallback). |
| Permission errors | Ensure you own the music files: ls -la /path/to/music |
| Slow first run | Normal — MusicBrainz rate limits to ~1 request/second. Subsequent runs skip already-processed files. |
| Wrong metadata written | Run --reset-state to reprocess. Check logs/tagger.log for details. |
| Timer not running | Check: systemctl --user status music-tagger.timer and loginctl show-user $USER \| grep Linger |
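The slow first run follows from spacing requests about one second apart. A client can enforce that spacing with a small limiter like the one below; this is an illustrative sketch, and `RateLimiter` is a hypothetical helper rather than something known to exist in lookup.py.

```python
import time
from typing import Callable


class RateLimiter:
    """Block until at least min_interval seconds have passed since the last call."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = float("-inf")  # the first call never waits

    def wait(self) -> None:
        remaining = self._min_interval - (self._clock() - self._last_call)
        if remaining > 0:
            self._sleep(remaining)
        self._last_call = self._clock()
```

Calling `wait()` before each request means a library of N unmatched files takes at least N seconds on the first pass, which matches the behaviour described in the table.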
# Run tests
uv run pytest -v
# Lint
uv run ruff check .
# Format
uv run ruff format .
# Security audit
uv run bandit -r src/
uv run pip-audit
If you find this project useful, consider giving it a star on GitHub; it helps others discover it and motivates further development.
Copyright 2025 Andrea Bonacci
Licensed under the Apache License, Version 2.0. See LICENSE for details.