MambaGlue 🐍 @ICRA2025
Fast and Robust Local Feature Matching With Mamba
Kihwan Ryoo · Hyungtae Lim · Hyun Myung

MambaGlue is a hybrid neural network combining the Mamba and the Transformer architectures to match local features.
- Overview
- Tested Environment
- Install
- Quickstart
- Training (experimental reproduction)
- Visualization with hloc (coming soon)
- FAQ
- To Do
- Citation
- License
The main branch contains:
- the standard MambaGlue model and inference utilities (mambaglue/); and
- an experimental training adapter for Glue Factory under mambaglue/training/ (see Training).
SfM and visual-localization integration is planned for a separate hloc branch, built on Hierarchical-Localization.
Note: the hloc branch is not yet pushed. The training pipeline is functional but has not yet been verified to reproduce the paper's reported numbers end-to-end; defaults are inherited from LightGlue. See #8.
- Linux (Ubuntu 20.04)
- NVIDIA GPU (Volta-generation TITAN V, Ampere-generation RTX 3080, or a newer architecture)
- CUDA 11.8 + cuDNN 8
- PyTorch 2.1.0
- Python 3.10+ (3.8 is end-of-life and no longer supported)
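A local setup can be compared against the list above with a short script. This is a minimal sketch and not part of the repository: the function name environment_report is hypothetical, and it only reports what it finds rather than enforcing the tested versions.

```python
import importlib.util
import sys


def environment_report():
    """Collect the facts relevant to the tested environment listed above."""
    report = {"python_ok": sys.version_info >= (3, 10)}

    # PyTorch may not be installed yet; report that instead of failing.
    if importlib.util.find_spec("torch") is None:
        report["torch"] = None
        return report

    import torch

    report["torch"] = torch.__version__            # tested with 2.1.0
    report["cuda_build"] = torch.version.cuda      # tested with 11.8; None on CPU builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
        # (major, minor) compute capability, e.g. (8, 6) for an RTX 3080.
        report["compute_capability"] = torch.cuda.get_device_capability(0)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it before building Mamba makes version mismatches visible early, since the selective-scan kernels are compiled against the installed PyTorch and CUDA toolchain.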
Build Mamba's selective-scan kernels first, then install MambaGlue itself:

    # 1) Install Mamba (state-spaces/mamba)
    git clone https://github.com/state-spaces/mamba && cd mamba
    pip install .
    cd ..
    # 2) Install MambaGlue
    git clone https://github.com/url-kaist/MambaGlue.git && cd MambaGlue
    python -m pip install -e .

To skip CUDA/toolchain headaches, start from a known-good environment, such as the provided Docker image.
The inference API mirrors LightGlue's, so existing LightGlue pipelines drop in with a one-line swap of the matcher.

    import torch
    from mambaglue import MambaGlue, SuperPoint, match_pair
    from mambaglue.utils import load_image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    extractor = SuperPoint(max_num_keypoints=2048).eval().to(device)
    matcher = MambaGlue(features="superpoint").eval().to(device)

    image0 = load_image("path/to/image0.jpg").to(device)
    image1 = load_image("path/to/image1.jpg").to(device)

    feats0, feats1, matches01 = match_pair(extractor, matcher, image0, image1)
    matches = matches01["matches"]  # indices into kpts0/kpts1
    points0 = feats0["keypoints"][matches[..., 0]]  # matched keypoints in image0
    points1 = feats1["keypoints"][matches[..., 1]]  # matched keypoints in image1

Supported front-end extractors: superpoint, disk, aliked, sift (passed via the features= argument). To visualize matches, see mambaglue.viz2d.
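The matches entry in the Quickstart is a list of index pairs: column 0 indexes the keypoints of image0 and column 1 those of image1. The sketch below reproduces that indexing with plain Python lists so it runs without a GPU or the model. The toy coordinates and the helper name gather_matched are hypothetical.

```python
def gather_matched(kpts0, kpts1, matches):
    """Pair up keypoint coordinates using (index0, index1) match rows."""
    return [(kpts0[i], kpts1[j]) for i, j in matches]


# Toy data (hypothetical): three keypoints in image0, two in image1.
kpts0 = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
kpts1 = [(12.0, 21.0), (55.0, 61.0)]

# Each row says: keypoint i of image0 corresponds to keypoint j of image1.
matches = [(0, 0), (2, 1)]

pairs = gather_matched(kpts0, kpts1, matches)
for (x0, y0), (x1, y1) in pairs:
    print(f"({x0}, {y0}) -> ({x1}, {y1})")

# Keypoints of image0 that did not receive a match.
unmatched0 = sorted(set(range(len(kpts0))) - {i for i, _ in matches})
print("unmatched in image0:", unmatched0)
```

With tensors, the same selection is what feats0["keypoints"][matches[..., 0]] performs in a single indexing operation.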
The mambaglue/training/ subpackage adds a Glue Factory adapter (MambaGlueMatcher) and two YAML configs that follow the paper's two-stage recipe (synthetic homographies → MegaDepth) without modifying glue-factory itself.

    # 1) Install glue-factory (not on PyPI)
    pip install "git+https://github.com/cvg/glue-factory.git"
    # 2) Install MambaGlue with training extras
    pip install -e ".[train]"
    # 3) Run both stages (SuperPoint + MambaGlue)
    bash mambaglue/training/run.sh

The configs are tuned for GPUs with 10-12 GB of memory (batch size 32 for homographies, batch size 4 for MegaDepth, bfloat16 autocast, gradient checkpointing). End-to-end training on a single RTX 3080 takes roughly a week. Target numbers from the paper (SuperPoint + MambaGlue):
| Benchmark | Metric | Paper |
|---|---|---|
| HPatches | PR@3px | 94.6 |
| HPatches (LO-RANSAC) | AUC@1 / 5 px | 39.0 / 79.3 |
| MegaDepth-1500 (LO-RANSAC) | AUC@5° / 10° / 20° | 67.5 / 80.3 / 87.6 |
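The AUC columns summarize how much of the evaluation set falls under an error threshold. One common convention in SuperGlue-style evaluation code is the area under the cumulative error curve up to the threshold, divided by the threshold. The sketch below implements that convention in pure Python. Whether the paper's numbers were produced with exactly this procedure is an assumption, and the function name error_auc is hypothetical.

```python
from bisect import bisect_left


def error_auc(errors, thresholds):
    """Area under the cumulative error curve, normalized by each threshold.

    errors: per-pair errors (e.g. pose error in degrees or corner error in px).
    thresholds: positive cut-offs such as [5, 10, 20].
    Returns one value in [0, 1] per threshold (multiply by 100 for percent).
    """
    if not errors:
        raise ValueError("errors must not be empty")
    if any(t <= 0 for t in thresholds):
        raise ValueError("thresholds must be positive")

    n = len(errors)
    errs = [0.0] + sorted(float(e) for e in errors)  # curve starts at (0, 0)
    recall = [k / n for k in range(n + 1)]           # fraction of pairs <= errs[k]

    aucs = []
    for t in thresholds:
        last = bisect_left(errs, t)                  # points strictly below t
        xs = errs[:last] + [float(t)]                # clip the curve at t
        ys = recall[:last] + [recall[last - 1]]
        # Trapezoidal integration of recall over error.
        area = sum(
            (xs[k + 1] - xs[k]) * (ys[k + 1] + ys[k]) / 2
            for k in range(len(xs) - 1)
        )
        aucs.append(area / t)
    return aucs


if __name__ == "__main__":
    # Toy errors (hypothetical): three accurate pairs and one failure.
    print(error_auc([1.0, 2.0, 3.0, 10.0], [5, 10, 20]))
```

A pair whose error exceeds the threshold contributes nothing to that threshold's AUC, so a few gross failures lower the tight-threshold columns more than the loose ones.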
The paper does not disclose optimizer, learning rate, batch size, layer count, or Mamba SSM dimensions, so defaults are inherited from LightGlue (lr=1e-4, AdamW, 9 layers) and from the released MambaGlue checkpoint's architecture. Treat the first run as exploratory. Mamba kernel hyperparameters (d_state, d_conv, expand) are hard-coded in mambaglue.mambaglue.MambaMixer — edit the source to sweep them.
⚠️ The hloc branch has not been pushed yet; this is tracked in the To Do list below.
When released, the branch will integrate MambaGlue as a matcher in Hierarchical-Localization for end-to-end Structure-from-Motion and visual localization.
Q. The released checkpoint scores below LightGlue on MegaDepth-1500. Is the weight wrong? (#6)
The currently published weight is a pre-publication version, and the runtime environment used for the paper differs from a fresh install. To match the numbers reported in the paper, train from scratch on your target front-end and tune the inference hyperparameters (e.g. filter_threshold, depth_confidence, width_confidence) on a held-out split.
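Tuning the inference hyperparameters on a held-out split can be organized as a small grid search. The sketch below is generic: the candidate values are illustrative only, and toy_evaluate is a stand-in for a real evaluation that would build the matcher with each configuration and score it on held-out pairs. Passing these keys as keyword arguments to MambaGlue is an assumption based on the LightGlue convention.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Return (best_config, best_score) over the Cartesian product of grid values."""
    keys = list(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        score = evaluate(config)
        if score > best_score:  # the first configuration wins ties
            best_config, best_score = config, score
    return best_config, best_score


# Illustrative candidate values (hypothetical, not recommended settings).
grid = {
    "filter_threshold": [0.05, 0.1, 0.2],
    "depth_confidence": [-1, 0.9, 0.95],
    "width_confidence": [-1, 0.95, 0.99],
}


def toy_evaluate(config):
    # Stand-in metric. In practice this would run the matcher with `config`
    # on held-out pairs and return, for example, a pose AUC.
    return -abs(config["filter_threshold"] - 0.1)


best_config, best_score = grid_search(toy_evaluate, grid)
print(best_config, best_score)
```

Keeping the evaluation split separate from the benchmark split avoids selecting hyperparameters on the numbers that are later reported.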
Q. How is MambaGlue trained? (#8)
The mambaglue/training/ subpackage in this branch plugs MambaGlue into Glue Factory with the standard two-stage protocol used by SuperGlue/LightGlue (correspondence head first, then the confidence regressor used for point pruning). See Training for setup and the reproduction caveats — the recipe defaults are inherited from LightGlue because the paper does not disclose them.
Q. Does MambaGlue support point pruning? (#5)
Yes. It is enabled with the width_confidence and depth_confidence config keys (set to a positive value to activate, -1 to disable), the same convention as LightGlue. Pruning is auto-skipped on CPU and on small keypoint counts, where the gather overhead outweighs the savings.
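The rules in this answer can be written as a small decision function. This is a sketch of the described behavior and not the repository's code. It assumes that width_confidence is the key that gates point pruning, as in LightGlue, where depth_confidence gates early stopping. The min_keypoints value is a placeholder, not taken from the MambaGlue source.

```python
def pruning_active(width_confidence, device_type, num_keypoints, min_keypoints=1024):
    """Decide whether point pruning would run, following the rules described above.

    min_keypoints is a placeholder threshold (assumption); the value used by
    MambaGlue is not stated in this README.
    """
    if width_confidence <= 0:          # -1 disables pruning
        return False
    if device_type == "cpu":           # pruning is skipped on CPU
        return False
    return num_keypoints > min_keypoints  # skipped for small keypoint sets


print(pruning_active(0.99, "cuda", 2048))  # enabled, GPU, many keypoints
print(pruning_active(-1, "cuda", 2048))    # disabled through the config
print(pruning_active(0.99, "cpu", 2048))   # skipped on CPU
print(pruning_active(0.99, "cuda", 512))   # skipped for few keypoints
```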
Q. Why does pip install fail to build Mamba on macOS?
Mamba's selective-scan CUDA kernels do not build on macOS. Use the provided Docker image or a Linux machine with a CUDA toolchain.
- Push the hloc branch (SfM/visual-localization integration)
- Push the published-version checkpoint (currently the released weight is a pre-publication version, see #6)
- Release demo code (notebook)
- ONNX export
If MambaGlue is useful for your research, please cite:

    @article{ryoo2025mambaglue,
      title={{MambaGlue: Fast and Robust Local Feature Matching With Mamba}},
      author={Ryoo, Kihwan and
              Lim, Hyungtae and
              Myung, Hyun},
      journal={arXiv preprint arXiv:2502.00462},
      year={2025}
    }

The MambaGlue code in this repository is released under the Apache-2.0 license.