A PyTorch-based room impulse response (RIR) simulation toolkit with a clean API and GPU support. This project has been developed with substantial assistance from Codex.
Warning
TorchRIR is under active development and may contain bugs or breaking changes. Please validate results for your use case. If you find bugs or have feature requests, please open an issue. Contributions are welcome.

`pip install torchrir`

| Feature | torchrir | gpuRIR | pyroomacoustics | rir-generator |
|---|---|---|---|---|
| 🎯 Dynamic Sources | ✅ | 🟡 Single moving source | 🟡 Manual loop | ❌ |
| 🎤 Dynamic Microphones | ✅ | ❌ | 🟡 Manual loop | ❌ |
| 🖥️ CPU | ✅ | ❌ | ✅ | ✅ |
| 🧮 CUDA | ✅ | ✅ | ❌ | ❌ |
| 🍎 MPS | ✅ | ❌ | ❌ | ❌ |
| 📊 Scene Plot | ✅ | ❌ | ✅ | ❌ |
| 🎞️ Dynamic Scene GIF | ✅ | ❌ | 🟡 Manual animation script | ❌ |
| 🗂️ Dataset Build | ✅ | ❌ | ✅ | ❌ |
| 🎛️ Signal Processing | ❌ Out of scope | ❌ | ✅ | ❌ |
| 🧱 Non-shoebox Geometry | 🚧 Candidate | ❌ | ✅ | ❌ |
| 🌐 Geometric Acoustics | 🚧 Candidate | ❌ | ✅ | ❌ |
Legend: ✅ native support, 🟡 manual setup, 🚧 candidate (not yet implemented), ❌ unavailable
For detailed notes and equations, see Read the Docs: Library Comparisons.
- CUDA tests run in `.github/workflows/cuda-ci.yml` on a self-hosted runner with labels: `self-hosted`, `linux`, `x64`, `cuda`.
- The workflow validates installation via `uv sync --group test`, checks `torch.cuda.is_available()`, runs `tests/test_device_parity.py` with `-k cuda`, and then tries to install `gpuRIR` from GitHub.
- If `gpuRIR` installs successfully, the workflow runs `tests/test_compare_gpurir.py` (static + dynamic RIR comparisons). If installation fails, those comparison tests are skipped without failing the whole CUDA CI job.
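
The "skip instead of fail" behavior described above can be illustrated with a small standard-library sketch. This is not the project's actual test code: the class name, the `HAS_GPURIR` flag, and the `run_suite` helper are hypothetical, and the real workflow may implement the skip differently (for example at the CI level).

```python
import importlib.util
import unittest

# Hypothetical flag: True only when the optional gpuRIR package is importable.
HAS_GPURIR = importlib.util.find_spec("gpuRIR") is not None


@unittest.skipUnless(HAS_GPURIR, "gpuRIR is not installed; skipping comparison tests")
class CompareWithGpuRir(unittest.TestCase):
    def test_placeholder(self):
        # A real comparison would simulate the same scene with both libraries
        # and check that the resulting RIRs agree within a tolerance.
        self.assertTrue(HAS_GPURIR)


def run_suite():
    """Run the comparison tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CompareWithGpuRir)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

When `gpuRIR` is missing, the test is recorded as skipped and `run_suite().wasSuccessful()` still returns `True`, so the surrounding job does not fail.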
- `examples/static.py`: fixed sources and microphones with configurable mic count (default: binaural).
  `uv run python examples/static.py --plot`
- `examples/dynamic_src.py`: moving sources, fixed microphones.
  `uv run python examples/dynamic_src.py --plot`
- `examples/dynamic_mic.py`: fixed sources, moving microphones.
  `uv run python examples/dynamic_mic.py --plot`
- `examples/cli.py`: unified CLI for static/dynamic scenes with JSON/YAML configs.
  `uv run python examples/cli.py --mode static --plot`
- `examples/build_dynamic_dataset.py`: small dynamic dataset generation script (CMU ARCTIC / LibriSpeech; fixed room/mics, randomized source motion).
  `uv run python examples/build_dynamic_dataset.py --dataset cmu_arctic --num-scenes 4 --num-sources 2`
- `torchrir.datasets.dynamic_cmu_arctic`: oobss-compatible dynamic CMU ARCTIC builder CLI.
  `python -m torchrir.datasets.dynamic_cmu_arctic --cmu-root datasets/cmu_arctic --n-scenes 2 --overwrite-dataset`
- `examples/benchmark_device.py`: CPU/GPU benchmark for RIR simulation.
  `uv run python examples/benchmark_device.py --dynamic`
- For dataset attribution and redistribution notes, see THIRD_PARTY_DATASETS.md.
- `torchrir.datasets.CmuArcticDataset(root, speaker=..., download=...)`
  - Accepted `speaker`: `aew`, `ahw`, `aup`, `awb`, `axb`, `bdl`, `clb`, `eey`, `fem`, `gka`, `jmk`, `ksp`, `ljm`, `lnh`, `rms`, `rxr`, `slp`, `slt`
  - Invalid `speaker` raises `ValueError`.
  - Missing local files with `download=False` raises `FileNotFoundError`.
- `torchrir.datasets.LibriSpeechDataset(root, subset=..., speaker=..., download=...)`
  - Accepted `subset`: `dev-clean`, `dev-other`, `test-clean`, `test-other`, `train-clean-100`, `train-clean-360`, `train-other-500`
  - Invalid `subset` raises `ValueError`.
  - Missing subset/speaker paths with `download=False` raise `FileNotFoundError`.
- `torchrir.datasets.build_dynamic_cmu_arctic_dataset(...)`
  - Builds oobss-compatible scene folders with `mixture.wav`, `source_XX.wav`, `metadata.json`, and `source_info.json`.
  - Static layout images (`room_layout_2d.png`, `room_layout_3d.png`) and optional layout videos (`room_layout_2d.mp4`, `room_layout_3d.mp4`) are generated, with source-index annotations by default.
  - Default behavior includes `n_sources=3`, moving speed range `0.3-0.8 m/s`, and motion profile ratios `0-35%`, `35-65%`, `65-100%`.
- Local-only (no download) example:

      from pathlib import Path

      from torchrir.datasets import CmuArcticDataset, LibriSpeechDataset

      cmu = CmuArcticDataset(Path("datasets/cmu_arctic"), speaker="bdl", download=False)
      libri = LibriSpeechDataset(
          Path("datasets/librispeech"),
          subset="train-clean-100",
          speaker="103",
          download=False,
      )

- Full dataset usage details, expected directory layout, and invalid-input handling: Read the Docs: Datasets
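
The documented error behavior (invalid `speaker` gives `ValueError`, missing local files with `download=False` give `FileNotFoundError`) can be sketched with the standard library alone. `resolve_speaker_dir` and the `root/speaker` folder layout are hypothetical and only illustrate the order of the checks; they are not TorchRIR's implementation or its real directory layout.

```python
from pathlib import Path

# Speaker IDs copied from the accepted list above.
CMU_ARCTIC_SPEAKERS = frozenset(
    "aew ahw aup awb axb bdl clb eey fem gka jmk ksp ljm lnh rms rxr slp slt".split()
)


def resolve_speaker_dir(root, speaker, download=False):
    """Hypothetical helper that mirrors the documented error behavior."""
    # Validate the argument first, so a typo never triggers disk or network access.
    if speaker not in CMU_ARCTIC_SPEAKERS:
        raise ValueError(
            f"unknown speaker {speaker!r}; expected one of {sorted(CMU_ARCTIC_SPEAKERS)}"
        )
    speaker_dir = Path(root) / speaker  # assumed layout, for illustration only
    if not speaker_dir.is_dir():
        if not download:
            raise FileNotFoundError(f"{speaker_dir} is missing and download=False")
        # A real loader would fetch and unpack the archive here;
        # this sketch only creates the empty folder.
        speaker_dir.mkdir(parents=True)
    return speaker_dir
```

Checking the argument before touching the filesystem means that callers can tell a bad speaker ID apart from a missing dataset by the exception type.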
- Geometry: `Room`, `Source`, `MicrophoneArray`
- Scene models: `StaticScene`, `DynamicScene` (`Scene` is deprecated)
- Static RIR: `torchrir.sim.simulate_rir`
- Dynamic RIR: `torchrir.sim.simulate_dynamic_rir`
- Simulator object: `torchrir.sim.ISMSimulator(max_order=..., tmax=... | nsample=...)`
- Dynamic convolution: `torchrir.signal.DynamicConvolver`
- Audio I/O:
  - wav-specific: `torchrir.io.load_wav`, `torchrir.io.save_wav`, `torchrir.io.info_wav`
  - backend-supported formats: `torchrir.io.load_audio`, `torchrir.io.save_audio`, `torchrir.io.info_audio`
  - metadata-preserving: `torchrir.io.AudioData`, `torchrir.io.load_audio_data`
- Metadata export: `torchrir.io.build_metadata`, `torchrir.io.save_metadata_json`
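
To give a feel for what "dynamic convolution" means, here is a minimal pure-Python sketch of one common formulation: each input sample is spread through the impulse response that is active at the moment it is emitted. The function name, the list-based inputs, and the sample-to-step mapping are assumptions made for illustration; `DynamicConvolver` may use a different scheme (for example block-wise processing or interpolation between RIRs).

```python
def dynamic_convolve(signal, rirs):
    """Convolve `signal` with a sequence of RIRs that changes over time.

    signal: list of N floats.
    rirs:   list of T impulse responses (each a list of L floats), one per
            trajectory step. Sample n uses step floor(n * T / N).
    Returns a list of N + L - 1 floats.
    """
    n_samples, n_steps, rir_len = len(signal), len(rirs), len(rirs[0])
    out = [0.0] * (n_samples + rir_len - 1)
    for n, x in enumerate(signal):
        h = rirs[n * n_steps // n_samples]  # RIR active when sample n is emitted
        for k, tap in enumerate(h):
            out[n + k] += x * tap
    return out
```

With a single RIR (`T = 1`) this reduces to ordinary convolution, which is a handy sanity check when comparing static and dynamic results.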
- `torchrir.sim`: simulation backends (ISM implementation lives under `torchrir.sim.ism`)
- `torchrir.signal`: convolution utilities and dynamic convolver
- `torchrir.geometry`: array geometries, sampling, trajectories
- `torchrir.viz`: plotting and GIF/MP4 animation helpers
  - Default plot style follows SciencePlots Grid (`science` + `grid`).
- `torchrir.models`: room/scene/result data models
- `torchrir.io`: audio I/O and metadata serialization (`*_wav` for wav-only, `*_audio` for backend-supported formats)
- `torchrir.util`: shared math/tensor/device helpers
- `torchrir.logging`: logging utilities
- `torchrir.config`: simulation configuration objects
- Scene typing is explicit: use `StaticScene` for fixed geometry and `DynamicScene` for trajectory-based simulation.
- `DynamicScene` accepts tensor-like trajectories (e.g., lists) and normalizes them to tensors internally.
- `Scene` remains as a backward-compatibility wrapper and emits `DeprecationWarning`.
- `Scene.validate()` performs validation without emitting additional deprecation warnings.
- `ISMSimulator` fails fast when `max_order` or `tmax` conflicts with the provided `SimulationConfig`.
- Model dataclasses are frozen, but tensor payloads remain mutable (shallow immutability).

- `torchrir.load` / `torchrir.save` and `torchrir.io.load/save/info` are deprecated compatibility aliases.
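
The "shallow immutability" note is a general property of frozen Python dataclasses, and the standard library is enough to show it. `SourceModel` below is a hypothetical stand-in, not a TorchRIR class, and a nested list plays the role of the tensor payload.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class SourceModel:  # hypothetical stand-in, not a torchrir class
    positions: list  # plays the role of a tensor payload


src = SourceModel(positions=[[1.0, 2.0, 1.5]])

try:
    src.positions = [[0.0, 0.0, 0.0]]  # rebinding a field is blocked
    rebinding_blocked = False
except FrozenInstanceError:
    rebinding_blocked = True

src.positions[0][0] = 9.0  # in-place mutation of the payload still works
```

The practical consequence is that code which needs a truly independent copy of a model should copy the payload (for tensors, something like `.clone()`) rather than rely on the dataclass being frozen.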
from torchrir import MicrophoneArray, Room, Source
from torchrir.sim import simulate_rir
from torchrir.signal import DynamicConvolver
room = Room.shoebox(size=[6.0, 4.0, 3.0], fs=16000, beta=[0.9] * 6)
sources = Source.from_positions([[1.0, 2.0, 1.5]])
mics = MicrophoneArray.from_positions([[2.0, 2.0, 1.5]])
rir = simulate_rir(room=room, sources=sources, mics=mics, max_order=6, tmax=0.3)
# For dynamic scenes, compute rirs with torchrir.sim.simulate_dynamic_rir and convolve:
# y = DynamicConvolver(mode="trajectory").convolve(signal, rirs)

For detailed documentation: Read the Docs
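
For readers new to the image source method (ISM), the following pure-Python sketch computes the direct path and the six first-order reflections for the quick-start geometry. It is a textbook illustration, not TorchRIR's implementation: the speed of sound, the `1 / (4 * pi * d)` spreading term, and the reading of `beta` as a per-wall reflection coefficient are assumptions, and the helper names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for this sketch


def first_order_images(source, room_size):
    """Mirror the source across each of the six walls of a shoebox room."""
    images = []
    for axis, length in enumerate(room_size):
        for wall in (0.0, length):
            image = list(source)
            image[axis] = 2.0 * wall - source[axis]  # reflect across the wall plane
            images.append(image)
    return images


def arrival(path_source, mic, fs, gain=1.0):
    """Return (delay in samples, amplitude) for one propagation path."""
    dist = math.dist(path_source, mic)
    return dist / SPEED_OF_SOUND * fs, gain / (4.0 * math.pi * dist)


room_size, fs, beta = [6.0, 4.0, 3.0], 16000, 0.9
source, mic = [1.0, 2.0, 1.5], [2.0, 2.0, 1.5]

direct = arrival(source, mic, fs)
reflections = [
    arrival(img, mic, fs, gain=beta) for img in first_order_images(source, room_size)
]
```

Higher reflection orders repeat the mirroring on the image sources themselves, which is why the number of paths, and the cost of the simulation, grows quickly with `max_order`.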
- Advanced room geometry pipeline beyond shoebox rooms (e.g., irregular polygons/meshes and boundary handling).
  Motivation: pyroomacoustics#393, pyroomacoustics#405
- General reflection/path capping controls (e.g., first-K, strongest-K, or energy-threshold-based path selection).
  Motivation: pyroomacoustics#338
- Microphone hardware response modeling (frequency response, sensitivity, and self-noise).
  Motivation: pyroomacoustics#394
- Near-field speech source modeling for more realistic close-talk scenarios.
  Motivation: pyroomacoustics#417
- Integrated 3D spatial response visualization (e.g., array/directivity beam-pattern rendering).
  Motivation: pyroomacoustics#397